commit-reach: add tips_reachable_from_bases()

Both 'git for-each-ref --merged=<X>' and 'git branch --merged=<X>' use
the ref-filter machinery to select references or branches (respectively)
that are reachable from a set of commits presented by one or more
--merged arguments. This happens within reach_filter(), which uses the
revision-walk machinery to walk history in a standard way.

However, the commit-reach.c file is full of custom searches that are
more efficient, especially for reachability queries that can terminate
early when reachability is discovered. Add a new
tips_reachable_from_bases() method to commit-reach.c and call it from
within reach_filter() in ref-filter.c. This affects both 'git branch'
and 'git for-each-ref' as tested in p1500-graph-walks.sh.

For the Linux kernel repository, we take an already-fast algorithm and
make it even faster:

Test                                            HEAD~1  HEAD
-------------------------------------------------------------------
1500.5: contains: git for-each-ref --merged     0.13    0.02 -84.6%
1500.6: contains: git branch --merged           0.14    0.02 -85.7%
1500.7: contains: git tag --merged              0.15    0.03 -80.0%

(Note that we remove the iterative 'git rev-list' test from p1500
because it no longer makes sense as a comparison to 'git for-each-ref',
and running it would only waste time.)

The algorithm is implemented in commit-reach.c in the method
tips_reachable_from_bases(). This method takes a string_list of tips and
sets the 'util' of each item to 1 if a base commit can reach that tip.

Like other reachability queries in commit-reach.c, the fastest way to
search for "can A reach B?" is to do a depth-first search up to the
generation number of B, preferring to explore first parents before later
parents. While we must walk all reachable commits up to that generation
number when the answer is "no", the depth-first search can answer "yes"
much faster than other approaches in most cases.
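
As a rough illustration of the single-target case, here is a minimal
sketch of a first-parent-first depth-first search with a generation
cutoff. It is not the code in commit-reach.c: 'struct toy_commit',
toy_can_reach() and toy_demo() are hypothetical names over a tiny
in-memory graph, and the fixed-size stack is an assumption made to keep
the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_PARENTS 2
#define TOY_MAX_DEPTH 16

/* Hypothetical stand-in for a commit; not git's struct commit. */
struct toy_commit {
	unsigned generation;	/* 1 + the largest generation among parents */
	struct toy_commit *parents[TOY_MAX_PARENTS];
	int seen;
};

/* Return 1 if 'to' is reachable from 'from', otherwise 0. */
int toy_can_reach(struct toy_commit *from, const struct toy_commit *to)
{
	struct toy_commit *stack[TOY_MAX_DEPTH];
	size_t depth = 0;

	if (from->generation < to->generation)
		return 0;
	from->seen = 1;
	stack[depth++] = from;

	while (depth) {
		struct toy_commit *c = stack[depth - 1];
		int pushed = 0;
		size_t i;

		if (c == to)
			return 1;	/* a "yes" answer terminates early */

		/* Descend into the first unexplored parent, in parent order. */
		for (i = 0; i < TOY_MAX_PARENTS; i++) {
			struct toy_commit *p = c->parents[i];

			/* Commits below the target's generation cannot reach it. */
			if (!p || p->seen || p->generation < to->generation)
				continue;
			p->seen = 1;
			assert(depth < TOY_MAX_DEPTH);
			stack[depth++] = p;
			pushed = 1;
			break;
		}
		if (!pushed)
			depth--;	/* all parents explored; backtrack */
	}
	return 0;
}

enum { TOY_A, TOY_B, TOY_C, TOY_D, TOY_X, TOY_NR };

/*
 * Toy history: D has parents B and C; B, C and X each have parent A.
 * Ask whether D can reach the commit at index 'target'.
 */
int toy_demo(int target)
{
	struct toy_commit g[TOY_NR];

	memset(g, 0, sizeof(g));
	g[TOY_A].generation = 1;
	g[TOY_B].generation = 2; g[TOY_B].parents[0] = &g[TOY_A];
	g[TOY_C].generation = 2; g[TOY_C].parents[0] = &g[TOY_A];
	g[TOY_X].generation = 2; g[TOY_X].parents[0] = &g[TOY_A];
	g[TOY_D].generation = 3; g[TOY_D].parents[0] = &g[TOY_B];
	g[TOY_D].parents[1] = &g[TOY_C];

	return toy_can_reach(&g[TOY_D], &g[target]);
}
```

In this toy history, toy_demo(TOY_C) returns 1 and toy_demo(TOY_X)
returns 0. The "no" answer for X visits D, B and C but never A, because
A's generation is below that of X.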

This search becomes trickier when there are multiple targets for the
depth-first search. The commits with lower generation number are more
likely to be within the history of the start commit, but we don't want
to waste time searching commits of low generation number if the commit
target with lowest generation number has already been found.

The trick here is to take the input commits and sort them by generation
number in ascending order. Track the index within this order as
min_generation_index. When we find a commit, if its index in the list is
equal to min_generation_index, then we can increase the generation
number boundary of our search to the next-lowest value in the list.

With this mechanism, the number of commits to search is minimized with
respect to the depth-first search heuristic. We will walk all commits up
to the minimum generation number of a commit that is _not_ reachable
from the start, but we will walk only the necessary portion of the
depth-first search for the reachable commits of lower generation.
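
The boundary-raising idea can be sketched as follows, under some
simplifying assumptions: a single base, a 'found' flag on each commit in
place of the string_list 'util', and a fixed-size stack. All names here
(mt_commit, mt_tips_reachable(), mt_demo()) are hypothetical and this is
not the code added to commit-reach.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MT_MAX_PARENTS 2
#define MT_MAX_DEPTH 16

/* Hypothetical stand-in for a commit; not git's struct commit. */
struct mt_commit {
	unsigned generation;
	struct mt_commit *parents[MT_MAX_PARENTS];
	int seen;
	int found;
};

int mt_cmp_generation(const void *a, const void *b)
{
	const struct mt_commit *x = *(struct mt_commit *const *)a;
	const struct mt_commit *y = *(struct mt_commit *const *)b;

	return (x->generation > y->generation) - (x->generation < y->generation);
}

/*
 * Set ->found on every tip reachable from 'base'. Sorts 'tips' in place
 * and returns the number of commits pushed onto the search stack.
 */
size_t mt_tips_reachable(struct mt_commit *base, struct mt_commit **tips,
			 size_t nr)
{
	struct mt_commit *stack[MT_MAX_DEPTH];
	size_t depth = 0, min_index = 0, walked = 0;

	if (!nr)
		return 0;
	qsort(tips, nr, sizeof(*tips), mt_cmp_generation);

	base->seen = 1;
	stack[depth++] = base;
	walked++;

	while (depth) {
		struct mt_commit *c = stack[depth - 1];
		int pushed = 0;
		size_t i, j;

		/* Compare only against tips at or below c's generation. */
		for (j = min_index; j < nr && tips[j]->generation <= c->generation; j++)
			if (tips[j] == c)
				c->found = 1;

		/* Raise the boundary past tips that are already found. */
		while (min_index < nr && tips[min_index]->found)
			min_index++;
		if (min_index == nr)
			return walked;	/* every tip was found */

		for (i = 0; i < MT_MAX_PARENTS; i++) {
			struct mt_commit *p = c->parents[i];

			if (!p || p->seen ||
			    p->generation < tips[min_index]->generation)
				continue;
			p->seen = 1;
			assert(depth < MT_MAX_DEPTH);
			stack[depth++] = p;
			walked++;
			pushed = 1;
			break;
		}
		if (!pushed)
			depth--;
	}
	return walked;
}

enum { MT_A, MT_B, MT_C, MT_D, MT_N, MT_M, MT_E, MT_Y, MT_NR };

/*
 * Toy history: A <- B <- C <- D <- E on the first-parent chain,
 * A <- N <- M as the second parent of E, and Y (child of C) outside
 * the history of E. Tips are A and Y; the base is E.
 */
size_t mt_demo(int *a_found, int *y_found)
{
	struct mt_commit g[MT_NR];
	struct mt_commit *tips[2];
	size_t walked;

	memset(g, 0, sizeof(g));
	g[MT_A].generation = 1;
	g[MT_B].generation = 2; g[MT_B].parents[0] = &g[MT_A];
	g[MT_C].generation = 3; g[MT_C].parents[0] = &g[MT_B];
	g[MT_D].generation = 4; g[MT_D].parents[0] = &g[MT_C];
	g[MT_N].generation = 2; g[MT_N].parents[0] = &g[MT_A];
	g[MT_M].generation = 3; g[MT_M].parents[0] = &g[MT_N];
	g[MT_E].generation = 5; g[MT_E].parents[0] = &g[MT_D];
	g[MT_E].parents[1] = &g[MT_M];
	g[MT_Y].generation = 4; g[MT_Y].parents[0] = &g[MT_C];

	tips[0] = &g[MT_Y];
	tips[1] = &g[MT_A];
	walked = mt_tips_reachable(&g[MT_E], tips, 2);
	*a_found = g[MT_A].found;
	*y_found = g[MT_Y].found;
	return walked;
}
```

In this toy history the search walks E, D, C, B and A. Once A is found,
the boundary rises to the generation of Y, so the side branch through M
and N is never explored even though it lies above A's generation.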

Add extra tests for this behavior in t6600-test-reach.sh as the
interesting data shape of that repository can sometimes demonstrate
corner case bugs.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
5 files changed
tree: b82889d57afa4033b855a2f58e69c709efc3f0bc
README.md

Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with the help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from https://git-scm.com/ including full documentation and Git related tools.

See Documentation/gittutorial.txt to get started, then see Documentation/giteveryday.txt for a useful minimum set of commands, and Documentation/git-<commandname>.txt for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.txt (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission and Documentation/CodingGuidelines).

Those wishing to help with error message, usage and informational message string translations (localization, l10n) should see po/README.md (a po file is a Portable Object file that holds the translations).

To subscribe to the list, send an email with just “subscribe git” in the body to majordomo@vger.kernel.org (not the Git list). The mailing list archives are available at https://lore.kernel.org/git/, http://marc.info/?l=git and other archival sites.

Issues which are security relevant should be disclosed privately to the Git Security mailing list git-security@googlegroups.com.

The maintainer frequently sends the “What's cooking” reports that list the current status of various development topics to the mailing list. The discussion following them gives a good reference for project status, development direction and remaining tasks.

The name “git” was given by Linus Torvalds when he wrote the very first version. He described the tool as “the stupid content tracker” and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of “get” may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • “global information tracker”: you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • “goddamn idiotic truckload of sh*t”: when it breaks