sparse-index: convert from full to sparse

If we have a full index, then we can convert it to a sparse index by
replacing directories outside of the sparse cone with sparse directory
entries. The convert_to_sparse() method does this, when the situation is
appropriate.

For now, we avoid converting the index to a sparse index if:

 1. the index is split.
 2. the index is already sparse.
 3. sparse-checkout is disabled.
 4. sparse-checkout does not use cone mode.

Finally, we currently limit the conversion to when the
GIT_TEST_SPARSE_INDEX environment variable is enabled. A mode using Git
config will be added in a later change.
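
The conditions above can be read as a single guard. The following is
only a sketch, not the code in this change: 'struct index_summary' and
may_convert_to_sparse() are hypothetical names, and each boolean stands
in for state that the real code would read from the index, the config,
or the environment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the state the guard consults. */
struct index_summary {
	bool split_index;      /* the index is split */
	bool sparse_index;     /* the index is already sparse */
	bool sparse_checkout;  /* sparse-checkout is enabled */
	bool cone_mode;        /* sparse-checkout uses cone mode */
	bool test_env_enabled; /* GIT_TEST_SPARSE_INDEX is enabled */
};

/* Returns true only when none of the listed conditions blocks conversion. */
bool may_convert_to_sparse(const struct index_summary *s)
{
	assert(s != NULL);

	if (!s->test_env_enabled)
		return false; /* conversion is opt-in for now */
	if (s->split_index)
		return false; /* 1. the index is split */
	if (s->sparse_index)
		return false; /* 2. nothing left to convert */
	if (!s->sparse_checkout)
		return false; /* 3. sparse-checkout is disabled */
	if (!s->cone_mode)
		return false; /* 4. not in cone mode */
	return true;
}
```

Ordering the checks as early returns keeps each skipped case visible on
its own line, which matches how the list above is phrased.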

The trickiest thing about this conversion is that we might not be able
to mark a directory as a sparse directory just because it is outside the
sparse cone. There might be unmerged files within that directory, so we
need to look for those. Also, if there is some strange reason why a file
is not marked with CE_SKIP_WORKTREE, then we should give up on
converting that directory. There is still hope that some of its
subdirectories might be able to convert to sparse, so we keep looking
deeper.
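
A toy model may help illustrate the "give up here, keep looking deeper"
rule. This is a sketch under simplifying assumptions and not the
implementation in this change: the real conversion works on the flat
list of index entries with help from the cache-tree, while 'struct
toy_dir', 'struct toy_file', all_entries_sparse() and
collapse_sparse_dirs() below are hypothetical and use an explicit
directory tree instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one index entry. */
struct toy_file {
	bool skip_worktree; /* models the CE_SKIP_WORKTREE bit */
	bool unmerged;      /* models an entry at a nonzero merge stage */
};

/* Hypothetical stand-in for a directory and everything below it. */
struct toy_dir {
	bool in_cone;                 /* directory is inside the sparse cone */
	const struct toy_file *files; /* files directly in this directory */
	size_t nr_files;
	struct toy_dir *subdirs;      /* immediate subdirectories */
	size_t nr_subdirs;
	bool collapsed;               /* output: became a sparse directory */
};

/* True if every file at or below 'dir' is merged and skip-worktree. */
bool all_entries_sparse(const struct toy_dir *dir)
{
	size_t i;

	for (i = 0; i < dir->nr_files; i++)
		if (dir->files[i].unmerged || !dir->files[i].skip_worktree)
			return false;
	for (i = 0; i < dir->nr_subdirs; i++)
		if (!all_entries_sparse(&dir->subdirs[i]))
			return false;
	return true;
}

/*
 * Collapse 'dir' if it is outside the cone and fully sparse. Otherwise
 * leave it expanded and try each subdirectory on its own. Returns the
 * number of directories collapsed.
 */
size_t collapse_sparse_dirs(struct toy_dir *dir)
{
	size_t i, count = 0;

	assert(dir != NULL);
	if (!dir->in_cone && all_entries_sparse(dir)) {
		dir->collapsed = true;
		return 1;
	}
	dir->collapsed = false;
	for (i = 0; i < dir->nr_subdirs; i++)
		count += collapse_sparse_dirs(&dir->subdirs[i]);
	return count;
}
```

For example, an out-of-cone directory with one clean subdirectory and
one subdirectory holding an unmerged file stays expanded itself, while
the clean subdirectory is collapsed. The sketch rescans subtrees on each
level for clarity; it makes no claim about how the real code avoids that
cost.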

The conversion process is assisted by the cache-tree extension. This is
calculated from the full index if it does not already exist. We then
abandon the cache-tree as it no longer applies to the newly-sparse
index. Thus, this cache-tree will be recalculated in every
sparse-full-sparse round-trip until we integrate the cache-tree
extension with the sparse index.

Some Git commands use the index after writing it. For example, 'git add'
will update the index, then write it to disk, then read its entries to
report information. To keep the in-memory index in a full state after
writing, we re-expand it to a full one after the write. This is wasteful
for commands that only write the index and do not read from it again,
but that is only the case until we make those commands "sparse aware."
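
The write-then-re-expand sequence can be sketched as follows. This is an
illustration only: 'struct toy_index' and toy_write_index() are
hypothetical, and the two enum fields stand in for the real in-memory
index and the file written to disk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_state { TOY_FULL, TOY_SPARSE };

/* Hypothetical model: the shape of the index in memory and on disk. */
struct toy_index {
	enum toy_state in_memory;
	enum toy_state on_disk;
};

/*
 * Write the index. If conversion is allowed, the on-disk copy is sparse,
 * but an index that was full before the write is full again afterwards,
 * so callers that keep reading entries see no sparse directory entries.
 */
void toy_write_index(struct toy_index *idx, bool may_convert)
{
	bool was_full;

	assert(idx != NULL);
	was_full = (idx->in_memory == TOY_FULL);

	if (may_convert)
		idx->in_memory = TOY_SPARSE; /* convert before writing */

	idx->on_disk = idx->in_memory;       /* the write itself */

	if (was_full)
		idx->in_memory = TOY_FULL;   /* re-expand after the write */
}
```

The re-expansion is the wasteful step for write-only commands; a
"sparse aware" command would be one that can skip it safely.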

We can compare the behavior of the sparse-index in
t1092-sparse-checkout-compatibility.sh by using GIT_TEST_SPARSE_INDEX=1
when operating on the 'sparse-index' repo. We can also compare the two
sparse repos directly, such as comparing their indexes (when expanded to
full in the case of the 'sparse-index' repo). We also verify that the
index is actually populated with sparse directory entries.

The 'checkout and reset (mixed)' test is marked for failure when
comparing a sparse repo to a full repo, but we can compare the two
sparse-checkout cases directly to ensure that we are not changing the
behavior when using a sparse index.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
6 files changed
tree: 529b8c4ed147a7b63b08091c20fd37c1173ec152
README.md

Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with the help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from https://git-scm.com/ including full documentation and Git related tools.

See Documentation/gittutorial.txt to get started, then see Documentation/giteveryday.txt for a useful minimum set of commands, and Documentation/git-<commandname>.txt for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.txt (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission). To subscribe to the list, send an email with just “subscribe git” in the body to majordomo@vger.kernel.org. The mailing list archives are available at https://lore.kernel.org/git/, http://marc.info/?l=git and other archival sites.

Issues which are security relevant should be disclosed privately to the Git Security mailing list git-security@googlegroups.com.

The maintainer frequently sends the “What's cooking” reports that list the current status of various development topics to the mailing list. The discussion following them gives a good reference for project status, development direction and remaining tasks.

The name “git” was given by Linus Torvalds when he wrote the very first version. He described the tool as “the stupid content tracker” and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of “get” may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • “global information tracker”: you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • “goddamn idiotic truckload of sh*t”: when it breaks