name-rev: prefer shorter names over following merges

name-rev has a MERGE_TRAVERSAL_WEIGHT to say that traversing a second or
later parent of a merge should be 65535 times more expensive than a
first-parent traversal, as per ac076c29ae8d (name-rev: Fix non-shortest
description, 2007-08-27).  The point of this weight is to prefer names
like

    v2.32.0~1471^2

over names like

    v2.32.0~43^2~15^2~11^2~20^2~31^2

which are two equally valid names in git.git for the same commit.  Note
that the first follows 1472 parent traversals compared to a mere 125 for
the second.  Weighting all traversals equally would clearly prefer the
second name since it has fewer parent traversals, but humans aren't
going to be traversing commits and they tend to have an easier time
digesting names with fewer segments.  The fact that the former only has
two segments (~1471, ^2) makes it much simpler than the latter, which
has ten segments (~43, ^2, ~15, etc.).  Since name-rev is meant to "find
symbolic names suitable for human digestion", we prefer fewer segments.
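
As a rough illustration of the two measures being compared, the sketch
below counts segments and parent traversals for a name.  The helper
describe() is hypothetical and not part of name-rev; it assumes every
"~N" and every "^N" counts as one segment, that "~N" costs N
traversals, and that "^N" costs one.

```python
import re

# One suffix segment: "~1471", "~", "^2" or "^" (hypothetical helper, not git code).
SEGMENT = re.compile(r"([~^])(\d*)")

def describe(name):
    """Return (segments, parent_traversals) for a name such as 'v2.32.0~43^2'."""
    segments = 0
    traversals = 0
    # Ref names cannot contain "~" or "^", so every match belongs to the suffix.
    for op, num in SEGMENT.findall(name):
        segments += 1
        if op == "~":
            # "~N" walks N first parents; a bare "~" means "~1".
            traversals += int(num) if num else 1
        else:
            # "^N" is a single hop to the Nth parent.
            traversals += 1
    return segments, traversals

for name in ("v2.32.0~1471^2", "v2.32.0~43^2~15^2~11^2~20^2~31^2"):
    print(name, describe(name))
```

The first name has far more traversals but far fewer segments, which is
the trade-off described above.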

However, the particular rule implemented in name-rev would actually
prefer

    v2.33.0-rc0~11^2~1

over

    v2.33.0-rc0~20^2

because both have precisely one second-parent traversal, and the
tie-breaker then goes to the smaller total number of parent traversals
(13 versus 21).  Fewer segments matter more for human consumption than
fewer hops, so we'd rather see the latter, which has one fewer segment.

Include the generation in is_better_name() and use a new
effective_distance() calculation so that we prefer fewer segments in
the printed name over fewer total parent traversals performed to get the
answer.
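
A minimal sketch of the idea, written in Python rather than the C of
name-rev and not taken from the patch itself.  It assumes that each
first-parent hop adds 1 to the distance, that each later-parent hop adds
MERGE_TRAVERSAL_WEIGHT and resets the generation to 0, and that
effective_distance() charges one extra weight whenever the name ends in
a "~N" segment.  The comparison shown ignores everything else the real
is_better_name() may consider (such as tag dates).

```python
import re

MERGE_TRAVERSAL_WEIGHT = 65535  # value named in the commit message

def walk(name):
    """Return (distance, generation) accumulated along a name's suffix (assumed model)."""
    distance = generation = 0
    for op, num in re.findall(r"([~^])(\d*)", name):
        n = int(num) if num else 1
        if op == "~" or n == 1:
            # First-parent hops: "~N" is N hops, "^" or "^1" is one hop.
            hops = n if op == "~" else 1
            distance += hops
            generation += hops
        else:
            # Second or later parent: expensive, and starts a new segment.
            distance += MERGE_TRAVERSAL_WEIGHT
            generation = 0
    return distance, generation

def effective_distance(distance, generation):
    # Assumption: a trailing "~N" (generation > 0) is one more printed
    # segment, so it is charged like one more merge traversal.
    return distance + (MERGE_TRAVERSAL_WEIGHT if generation > 0 else 0)

def is_better_name(old, new):
    """Simplified: True if `new` has a smaller effective distance than `old`."""
    return effective_distance(*walk(new)) < effective_distance(*walk(old))

print(is_better_name("v2.33.0-rc0~11^2~1", "v2.33.0-rc0~20^2"))
```

Under these assumptions the raw distances are 65547 and 65555, so the
old rule picks the three-segment name, while the effective distances
are 131082 and 65555, so the two-segment name wins.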

== Side-note on tie-breakers ==

When two different names have the same number of segments, we
actually use the name of a descendant commit as a tie-breaker as well.
For example, for the commit cbdca289fb in the git.git repository, we
prefer the name v2.33.0-rc0~112^2~1 over v2.33.0-rc0~57^2~5.  This is
because:

  * cbdca289fb is the parent of 25e65b6dd5, which implies the name for
    cbdca289fb should be the first parent of the preferred name for
    25e65b6dd5
  * 25e65b6dd5 could be named either v2.33.0-rc0~112^2 or
    v2.33.0-rc0~57^2~4, but the former is preferred over the latter due
    to fewer segments
  * combine the two previous facts, and the name we get for cbdca289fb
    is "v2.33.0-rc0~112^2~1" rather than "v2.33.0-rc0~57^2~5".

Technically, we get this for free out of the implementation since we
only keep track of one name for each commit as we walk history (and
re-add parents to the queue if we find a better name for those parents),
but the first bullet point above ensures users get results that feel
more consistent.
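
The three bullet points can be replayed with a small sketch.  Both
helpers are hypothetical and only manipulate name strings; name-rev
itself tracks this while walking history.

```python
import re

def segment_count(name):
    """Number of "~N" / "^N" segments in a name (hypothetical helper)."""
    return len(re.findall(r"[~^]", name))

def first_parent_name(name):
    """Name of the first parent of the commit called `name`."""
    m = re.fullmatch(r"(.*)~(\d+)", name)
    if m:
        # Already ends in "~N": extend that segment instead of adding one.
        return f"{m.group(1)}~{int(m.group(2)) + 1}"
    return name + "~1"

# Candidate names for 25e65b6dd5, as listed in the second bullet.
candidates = ["v2.33.0-rc0~112^2", "v2.33.0-rc0~57^2~4"]
preferred_child = min(candidates, key=segment_count)

# The name for its parent, cbdca289fb, follows from the preferred child name.
print(first_parent_name(preferred_child))
```

Note that both possible parent names have three segments; the choice is
inherited from the child, where the segment counts differ.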

== Alternative Ideas and Meanings Discussed ==

One suggestion that came up during review was that shortest
string-length might be easiest for users to consume.  However, such a
scheme would be rather computationally expensive (we'd have to track all
names for each commit as we traversed the graph) and would additionally
come with the possibly perplexing result that on a linear segment of
history we could rapidly swap back and forth on names:

    MYTAG~3^2     would     be preferred over   MYTAG~9998
    MYTAG~3^2~1   would NOT be preferred over   MYTAG~9999
    MYTAG~3^2~2   might     be preferred over   MYTAG~10000
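
The flip-flopping can be checked by comparing string lengths directly.
This is only an illustration of the rejected shortest-string rule; the
function name is made up for the example.

```python
def by_string_length(candidate, linear):
    """Return the name a shortest-string rule would pick, or None on a tie."""
    if len(candidate) < len(linear):
        return candidate
    if len(candidate) > len(linear):
        return linear
    return None  # equal length: the rule alone cannot decide

pairs = [
    ("MYTAG~3^2", "MYTAG~9998"),
    ("MYTAG~3^2~1", "MYTAG~9999"),
    ("MYTAG~3^2~2", "MYTAG~10000"),
]
for candidate, linear in pairs:
    print(candidate, "vs", linear, "->", by_string_length(candidate, linear))
```

Three consecutive commits on the same linear stretch of history get
three different outcomes, which is the perplexing behavior described
above.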

Another item that came up was possible auxiliary semantic meanings for
name-rev results either before or after this patch.  The basic answer
was that the previous implementation had no known useful auxiliary
semantics, but that for many repositories (most in my experience), the
new scheme does.  In particular, the new name-rev output can often be
used to answer the question, "How or when did this commit get merged?"
Since that usefulness depends on how merges happen within the repository
and thus isn't universally applicable, details are omitted here but you
can see them at [1].

[1] https://lore.kernel.org/git/CABPp-BEeUM+3NLKDVdak90_UUeNghYCx=Dgir6=8ixvYmvyq3Q@mail.gmail.com/

Finally, it was noted that the algorithm could be improved by just
explicitly tracking the number of segments and using both it and
distance in the comparison, instead of giving a magic number that tries
to blend the two (and which therefore might give suboptimal results in
repositories with really huge numbers of commits that periodically merge
older code).  However, "[this patch] seems to give us a much better
results than the current code, so let's take it and leave further
futzing outside the scope."
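
To make the difference concrete, here is a hypothetical comparison of
the two approaches.  Neither function is from the patch: blended_key()
assumes the weighting model sketched from the description above (1 per
first-parent hop, MERGE_TRAVERSAL_WEIGHT per later-parent hop, plus one
more weight for a trailing "~N"), and explicit_key() is one possible way
to track segments and distance separately.  The tag name and numbers
are invented.

```python
import re

MERGE_TRAVERSAL_WEIGHT = 65535  # value named in the commit message

def parse(name):
    """Split a name's suffix into (operator, number) pairs."""
    return [(op, int(num) if num else 1)
            for op, num in re.findall(r"([~^])(\d*)", name)]

def blended_key(name):
    """Single magic-number score (assumed model; lower is better)."""
    distance = generation = 0
    for op, n in parse(name):
        if op == "~" or n == 1:
            hops = n if op == "~" else 1
            distance += hops
            generation += hops
        else:
            distance += MERGE_TRAVERSAL_WEIGHT
            generation = 0
    return distance + (MERGE_TRAVERSAL_WEIGHT if generation > 0 else 0)

def explicit_key(name):
    """Compare segment count first, total parent traversals second."""
    segments = parse(name)
    hops = sum(n if op == "~" else 1 for op, n in segments)
    return (len(segments), hops)

names = ["TAG~70000", "TAG~3^2~1"]
print(min(names, key=blended_key))   # blended score: 135535 vs 131074
print(min(names, key=explicit_key))  # explicit keys: (1, 70000) vs (3, 5)
```

With more than 65535 commits on a first-parent chain, the blended score
can favor a three-segment name over a one-segment name, whereas the
explicit comparison keeps preferring fewer segments.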

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
1 file changed