reflog expire: don't lock reflogs using previously seen OID

During reflog expiry, the cmd_reflog_expire() function first iterates
over all reflogs in logs/*, and then one-by-one acquires the lock for
each one and expires it. This behavior has been with us since this
command was implemented in 4264dc15e1 ("git reflog expire",
2006-12-19).

Change this to stop calling lock_ref_oid_basic() with the OID we saw
when we looped over the logs; instead have it pass the OID it managed
to lock.

This mostly mitigates a race condition where e.g. "git gc" will fail
in a concurrently updated repository because the branch moved since
"git reflog expire --all" was started. I.e. with:

    error: cannot lock ref '<refname>': ref '<refname>' is at <OID-A> but expected <OID-B>

This behavior of passing in an "oid" was needed for an edge-case that
I've untangled in this and preceding commits though, namely that we
needed this OID because we'd:

 1. Look up the reflog name/OID via dwim_log()
 2. With that OID, lock the reflog
 3. Later in builtin/reflog.c we use the OID we looked up as input to
    lookup_commit_reference_gently(), assured that it's equal to the
    OID we got from dwim_log().

We can be sure that this change is safe to make because between
dwim_log (step #1) and lock_ref_oid_basic (step #2) there was no other
logic relevant to the OID or expiry run in the cmd_reflog_expire()
caller.

We can thus treat that code as a black box: before and after this
change it would get an OID that's been locked. The only difference is
that now we mostly won't be failing to get the lock due to the TOCTOU
race[0]. That failure was purely an implementation detail of how the
"current OID" was looked up; it was divorced from the locking
mechanism.
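
The difference between the two orderings can be illustrated with a toy
model. This is only a sketch, not git's implementation: the "ref" is a
plain temporary file, the lock is a "<ref>.lock" file created with
noclobber, and the lock_ref/unlock_ref helpers are hypothetical names
made up for this illustration.

```shell
#!/bin/sh
# Toy model only: a "ref" is a plain file holding a value, and the lock is a
# "<ref>.lock" file created atomically via noclobber. This is not git's code.
ref=$(mktemp)
echo A >"$ref"

lock_ref () {
    # Fails if "$ref.lock" already exists, i.e. somebody else holds the lock.
    ( set -o noclobber; : >"$ref.lock" ) 2>/dev/null
}

unlock_ref () {
    rm -f "$ref.lock"
}

# Old pattern: remember the value, lock later, insist on the remembered value.
seen=$(cat "$ref")
echo B >"$ref"    # stands in for a concurrent writer moving the ref
if lock_ref
then
    now=$(cat "$ref")
    if test "$now" = "$seen"
    then
        old_result="locked at $now"
    else
        old_result="error: ref is at $now but expected $seen"
    fi
    unlock_ref
fi

# New pattern: lock first, then use whatever value is held under the lock.
if lock_ref
then
    new_result="locked at $(cat "$ref")"
    unlock_ref
fi

echo "$old_result"
echo "$new_result"
rm -f "$ref"
```

In this model the old ordering fails whenever the value changes between
the read and the lock, while the new ordering can only fail if the lock
itself is contended.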

What do we mean by "mostly"? It mostly mitigates it because we'll
still run into cases where the ref is locked and being updated as we
want to expire it, and other git processes wanting to update the refs
will in turn race with us as we expire the reflog.

That remaining race can in turn be mitigated with the
core.filesRefLockTimeout setting, see 4ff0f01cb7 ("refs: retry
acquiring reference locks for 100ms", 2017-08-21). In practice if that
value is high enough we'll probably never have ref updates or reflog
expiry failing, since the clients involved will retry for far longer
than the time any of those operations could take.

See [1] for an initial report of how this impacted "git gc" and a
large discussion about this change in early 2019. In particular, the
patch looked good to Michael Haggerty, see his [2]. That message seems
not to have made it to the ML archive; its content is quoted in full
in my [3].

I'm leaving behind now-unused code in the refs API etc. that takes the
now-NULL "unused_oid" argument, and other code that can be simplified
now that we never have an OID in that context. That'll be cleaned up
in subsequent commits, but for now let's narrowly focus on fixing the
"git gc" issue. As the modified assert() shows we always pass a NULL
oid to reflog_expire() now.

Unfortunately this sort of probabilistic contention is hard to turn
into a test. I've tested this by running the following three subshells
in concurrent terminals:

    (
        rm -rf /tmp/git &&
        git init /tmp/git &&
        while true
        do
            head -c 10 /dev/urandom | hexdump >/tmp/git/out &&
            git -C /tmp/git add out &&
            git -C /tmp/git commit -m"out"
        done
    )

    (
        rm -rf /tmp/git-clone &&
        git clone file:///tmp/git /tmp/git-clone &&
        while git -C /tmp/git-clone pull
        do
            date
        done
    )

    (
        while git -C /tmp/git-clone reflog expire --all
        do
            date
        done
    )

Before this change the "reflog expire" would fail really quickly with
the "but expected" error noted above.

After this change both the "pull" and "reflog expire" will run for a
while, but eventually fail because I get unlucky with
core.filesRefLockTimeout (the "reflog expire" is in a really tight
loop). As noted above that can in turn be mitigated with higher values
of core.filesRefLockTimeout than the 100ms default.

As noted in the commentary added in the preceding commit there's also
the case of branches being racily deleted, that can be tested by
adding this to the above:

    (
        while git -C /tmp/git-clone branch topic master &&
              git -C /tmp/git-clone branch -D topic
        do
            date
        done
    )

With core.filesRefLockTimeout set to 10 seconds (it can probably be a
lot lower) I managed to run all four of these concurrently for about
an hour, and accumulated ~125k commits, auto-gc's and all, and didn't
have a single failure. The loops visibly stall while waiting for the
lock, but that's expected and desired behavior.
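
For reference, a 10-second timeout like the one used in this test could
be set on the clone as sketched below. core.filesRefLockTimeout takes a
value in milliseconds, so 10 seconds is 10000; the path is the
/tmp/git-clone from the loops above.

```shell
# Retry ref lock acquisition for up to 10 seconds instead of the 100ms default.
git -C /tmp/git-clone config core.filesRefLockTimeout 10000
```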

0. https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use
1. https://lore.kernel.org/git/87tvg7brlm.fsf@evledraar.gmail.com/
2. http://lore.kernel.org/git/b870a17d-2103-41b8-3cbc-7389d5fff33a@alum.mit.edu
3. https://lore.kernel.org/git/87pnqkco8v.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
3 files changed
tree: e34ba4d69147a0c9d981b5568242ff2c46a7395c
README.md

Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with the help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from https://git-scm.com/ including full documentation and Git related tools.

See Documentation/gittutorial.txt to get started, then see Documentation/giteveryday.txt for a useful minimum set of commands, and Documentation/git-<commandname>.txt for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.txt (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission). To subscribe to the list, send an email with just “subscribe git” in the body to majordomo@vger.kernel.org. The mailing list archives are available at https://lore.kernel.org/git/, http://marc.info/?l=git and other archival sites.

Issues which are security relevant should be disclosed privately to the Git Security mailing list git-security@googlegroups.com.

The maintainer frequently sends the “What's cooking” reports that list the current status of various development topics to the mailing list. The discussion following them gives a good reference for project status, development direction and remaining tasks.

The name “git” was given by Linus Torvalds when he wrote the very first version. He described the tool as “the stupid content tracker” and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of “get” may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • “global information tracker”: you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • “goddamn idiotic truckload of sh*t”: when it breaks