pack-objects: walk tag chains for --include-tag

When pack-objects is given --include-tag, it peels each tag
ref down to a non-tag object, and if that non-tag object is
going to be packed, we include the tag, too. But what
happens if we have a chain of tags (e.g., tag "A" points to
tag "B", which points to commit "C")?

We'll peel down to "C" and realize that we want to include
tag "A", but we do not ever consider tag "B", leading to a
broken pack (assuming "B" was not otherwise selected).
Instead, we have to walk the whole chain, adding any tags we
find to the pack.
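
As a rough illustration of the difference, here is a toy model
in C. It is a sketch, not the actual patch: the struct and all
toy_* names are hypothetical, and real pack-objects works on
parsed tag objects and a packing list rather than an in_pack
flag.

```c
#include <stddef.h>

enum toy_type { TOY_COMMIT, TOY_TAG };

/* Hypothetical stand-in for an object; not git's real struct. */
struct toy_object {
	const char *name;
	enum toy_type type;
	struct toy_object *tagged; /* target of a tag; NULL for a commit */
	int in_pack;               /* nonzero once selected for the pack */
};

/* Follow tag links until we reach a non-tag object. */
struct toy_object *toy_peel(struct toy_object *obj)
{
	while (obj->type == TOY_TAG && obj->tagged)
		obj = obj->tagged;
	return obj;
}

/* Add every tag in the chain starting at `tag`; return how many were new. */
int toy_add_tag_chain(struct toy_object *tag)
{
	int added = 0;

	for (; tag && tag->type == TOY_TAG; tag = tag->tagged) {
		if (!tag->in_pack) {
			tag->in_pack = 1;
			added++;
		}
	}
	return added;
}

/*
 * Model of --include-tag for a single tag ref. With walk_chain == 0
 * only the ref's own tag is added (the behavior described as broken
 * above); with walk_chain != 0 the whole chain is added.
 * Returns the number of tags newly added to the pack.
 */
int toy_include_tag(struct toy_object *ref_tip, int walk_chain)
{
	if (ref_tip->type != TOY_TAG || !toy_peel(ref_tip)->in_pack)
		return 0;
	if (walk_chain)
		return toy_add_tag_chain(ref_tip);
	if (ref_tip->in_pack)
		return 0;
	ref_tip->in_pack = 1;
	return 1;
}
```

With "A" -> "B" -> "C" and only "C" selected, the walk_chain == 0
path marks "A" alone and leaves "B" out of the pack, while the
walk_chain != 0 path marks both tags.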

Interestingly, it doesn't seem possible to trigger this
problem with "git fetch", but you can with "git clone
--single-branch". The reason is that we generate the correct
pack when the client explicitly asks for "A" (because we do
a real reachability analysis there), and "fetch" is more
willing to do so. There are basically two cases:

  1. If "C" is already a ref tip, then the client can deduce
     that it needs "A" itself (via find_non_local_tags), and
     will ask for it explicitly rather than relying on the
     include-tag capability. Everything works.

  2. If "C" is not already a ref tip, then we hope for
     include-tag to send us the correct tag. But it doesn't;
     it generates a broken pack. However, the next step is
     to do a follow-up run of find_non_local_tags(),
     followed by fetch_refs() to backfill any tags we
     learned about.

     In the normal case, fetch_refs() calls quickfetch(),
     which does a connectivity check and sees we have no
     new objects to fetch. We just write the refs.

     But for the broken-pack case, the connectivity check
     fails, and quickfetch() will follow up with the remote,
     asking explicitly for each of the ref tips. This picks
     up the missing object in a new pack.
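
The backfill in case 2 can be pictured with another small toy
model in C. This is only a sketch under simplifying assumptions:
objects form a single chain, the toy_* names are hypothetical,
and the "explicit fetch" simply marks every reachable object as
present, standing in for the server's real reachability
analysis.

```c
#include <stddef.h>

/* Hypothetical stand-in for an object in the local repository. */
struct toy_object {
	const char *name;
	struct toy_object *points_to; /* NULL at the end of the chain */
	int have;                     /* nonzero if present locally */
};

/* Return 0 if everything reachable from `tip` is present, -1 if not. */
int toy_check_connected(const struct toy_object *tip)
{
	for (; tip; tip = tip->points_to)
		if (!tip->have)
			return -1;
	return 0;
}

/*
 * Model of asking the remote explicitly for `tip`: every missing
 * object reachable from it arrives. Returns how many were received.
 */
int toy_fetch_explicit(struct toy_object *tip)
{
	int received = 0;

	for (; tip; tip = tip->points_to) {
		if (!tip->have) {
			tip->have = 1;
			received++;
		}
	}
	return received;
}

/*
 * Model of the backfill step: skip the fetch when the connectivity
 * check passes, otherwise fall back to an explicit request.
 * Returns the number of objects fetched.
 */
int toy_backfill_tag(struct toy_object *tip)
{
	if (!toy_check_connected(tip))
		return 0;
	return toy_fetch_explicit(tip);
}
```

If the first pack delivered "A" and "C" but not "B", the check
fails at "B" and the explicit request brings it in; if all three
are already present, nothing is fetched.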

For a regular "git clone", we are similarly OK, because we
explicitly request all of the tag refs, and get a correct
pack. But with "--single-branch", we kick in tag
auto-following via "include-tag", but do _not_ do a
follow-up backfill. We just take whatever the server sent us
via include-tag and write out tag refs for any tag objects
we were sent. So prior to c6807a4 (clone: open a shortcut
for connectivity check, 2013-05-26), we actually claimed the
clone was a success, but the result was silently
corrupted! Since c6807a4, index-pack's connectivity
check catches this case, and we correctly complain.

The included test directly checks that pack-objects does not
generate a broken pack, but also confirms that "clone
--single-branch" does not hit the bug.

Note that tag chains introduce another interesting question:
if we are packing the tag "B" but not the commit "C", should
"A" be included?

Both before and after this patch, we do not include "A",
because the initial peel_ref() check only knows about the
bottom-most level, "C". To realize that "B" is involved at
all, we would have to switch to an incremental peel, in
which we examine each tagged object, asking if it is being
packed (and including the outer tag if so).
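
To make the contrast concrete, here is a toy sketch in C of the
two selection strategies. It is an assumption-laden model, not
git's code: the struct and the toy_* functions are hypothetical,
and the incremental version has to look at every tag object in
the chain.

```c
#include <stddef.h>

enum toy_type { TOY_COMMIT, TOY_TAG };

/* Hypothetical stand-in for an object; not git's real struct. */
struct toy_object {
	const char *name;
	enum toy_type type;
	struct toy_object *tagged; /* target of a tag; NULL for a commit */
	int in_pack;               /* nonzero once selected for the pack */
};

/* Bottom-only check: 1 if the fully peeled object is being packed. */
int toy_wants_tag_bottom_only(struct toy_object *tip)
{
	struct toy_object *obj = tip;

	if (tip->type != TOY_TAG)
		return 0;
	while (obj->type == TOY_TAG && obj->tagged)
		obj = obj->tagged;
	return obj->in_pack != 0;
}

/*
 * Incremental peel: examine each tagged object in turn. If one is
 * being packed, add every tag from the tip down to the tag that
 * points at it. Returns the number of tags newly added.
 */
int toy_include_tag_incremental(struct toy_object *tip)
{
	struct toy_object *t;
	int added = 0;

	/* Find the first tag whose target is already in the pack. */
	for (t = tip; t->type == TOY_TAG && t->tagged; t = t->tagged)
		if (t->tagged->in_pack)
			break;
	if (t->type != TOY_TAG || !t->tagged)
		return 0; /* nothing below the tip is being packed */

	/* Add the outer tags, from the tip down to and including t. */
	for (;;) {
		if (!tip->in_pack) {
			tip->in_pack = 1;
			added++;
		}
		if (tip == t)
			break;
		tip = tip->tagged;
	}
	return added;
}
```

With "B" selected but "C" not, toy_wants_tag_bottom_only() says
no for "A", whereas toy_include_tag_incremental() adds it.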

But that runs contrary to the optimizations in peel_ref(),
which avoid accessing the objects at all, in favor of using
the value we pull from packed-refs. It's OK to walk the
whole chain once we know we're going to include the tag (we
have to access it anyway, so the effort is proportional to
the pack we're generating). But for the initial selection,
we have to look at every ref. If we're only packing a few
objects, we'd still have to parse every single referenced
tag object just to confirm that it isn't part of a tag
chain.

This could be addressed if packed-refs stored the complete
tag chain for each peeled ref (in most cases, this would be
the same cost as now, as each "chain" is only a single
link). But given the size of that project, it's out of scope
for this fix (and probably nobody cares enough anyway, as
it's such an obscure situation). This commit limits itself
to just avoiding the creation of a broken pack.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2 files changed
tree: b160ecb85a806cca4b71aaa5c979f53f5d49a92b
README.md

Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with the help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from http://git-scm.com/ including full documentation and Git related tools.

See Documentation/gittutorial.txt to get started, then see Documentation/giteveryday.txt for a useful minimum set of commands, and Documentation/git-.txt for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.txt (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission). To subscribe to the list, send an email with just “subscribe git” in the body to majordomo@vger.kernel.org. The mailing list archives are available at http://news.gmane.org/gmane.comp.version-control.git/, http://marc.info/?l=git and other archival sites.

The maintainer frequently sends the “What's cooking” reports that list the current status of various development topics to the mailing list. The discussion following them gives a good reference for project status, development direction and remaining tasks.

The name “git” was given by Linus Torvalds when he wrote the very first version. He described the tool as “the stupid content tracker” and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of “get” may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • “global information tracker”: you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • “goddamn idiotic truckload of sh*t”: when it breaks