is_ntfs_dotgit(): only verify the leading segment

The config setting `core.protectNTFS` is specifically designed to work
not only on Windows, but anywhere, to allow for repositories hosted on,
say, Linux servers to be protected against NTFS-specific attack vectors.

As a consequence, `is_ntfs_dotgit()` manually splits backslash-separated
paths (but does not do the same for paths separated by forward slashes),
under the assumption that the backslash might not be a valid directory
separator on the _current_ Operating System.

However, the two callers, `verify_path()` and `fsck_tree()`, are
supposed to feed only individual path segments to the `is_ntfs_dotgit()`
function.

This causes a lot of duplicate scanning (and very inefficient scanning,
too, as the inner loop of `is_ntfs_dotgit()` was optimized for
readability rather than for speed).

Let's simplify the design of `is_ntfs_dotgit()` by putting the burden of
splitting the paths by backslashes as directory separators on the
callers of said function.
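
As a rough illustration of what "only verify the leading segment" means,
here is a minimal sketch. It is not the code from this patch: the names
`sketch_is_ntfs_dotgit()` and `only_spaces_and_periods()` are
hypothetical, and the matching rules (case-insensitive `.git` or the
short name `git~1`, optionally followed by spaces and periods) are a
simplified assumption about what NTFS would treat as `.git`.

```c
#include <stddef.h>
#include <strings.h>	/* strncasecmp() (POSIX) */

/* Hypothetical helper: 1 if name[skip..len) holds only spaces and periods. */
static int only_spaces_and_periods(const char *name, size_t len, size_t skip)
{
	if (len < skip)
		return 0;
	for (size_t i = skip; i < len; i++)
		if (name[i] != ' ' && name[i] != '.')
			return 0;
	return 1;
}

/*
 * Sketch (assumption, not the real function): look only at the leading
 * segment, i.e. everything before the first NUL, backslash or slash.
 * Splitting the rest of the path is left to the caller.
 */
int sketch_is_ntfs_dotgit(const char *name)
{
	size_t len = 0;

	while (name[len] && name[len] != '\\' && name[len] != '/')
		len++;

	if (only_spaces_and_periods(name, len, 4) &&
	    !strncasecmp(name, ".git", 4))
		return 1;
	if (only_spaces_and_periods(name, len, 5) &&
	    !strncasecmp(name, "git~1", 5))
		return 1;
	return 0;
}
```

With this shape, `sketch_is_ntfs_dotgit("foo\\.git")` does not flag
anything by itself; a caller that cares about the second segment has to
call the function again on the part after the backslash.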

Consequently, the `verify_path()` function, which already splits the
path by directory separators, now treats backslashes as directory
separators _explicitly_ when `core.protectNTFS` is turned on, even on
platforms where the backslash is _not_ a directory separator.

Note that we have to repeat some code in `verify_path()`: if the
backslash is not a directory separator on the current Operating System,
we want to allow file names like `\`, but we _do_ want to disallow paths
that are clearly intended to cause harm when the repository is cloned on
Windows.
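
The caller-side splitting described above might look roughly like the
following sketch. It models only the NTFS-related part of the check; the
names `sketch_verify_path()` and `leading_segment_is_dotgit()` are
hypothetical, and the helper is a simplified stand-in for
`is_ntfs_dotgit()`, not the real function.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>	/* strncasecmp() (POSIX) */

/* Simplified stand-in for is_ntfs_dotgit(): leading segment only. */
static int leading_segment_is_dotgit(const char *name)
{
	static const char *const forms[] = { ".git", "git~1" };
	size_t len = strcspn(name, "\\/");

	for (size_t i = 0; i < 2; i++) {
		size_t n = strlen(forms[i]);
		/* Prefix must match; the rest may only be spaces/periods. */
		if (len >= n && !strncasecmp(name, forms[i], n) &&
		    strspn(name + n, " .") >= len - n)
			return 1;
	}
	return 0;
}

/*
 * Hypothetical model of the caller: returns 1 if the path is acceptable,
 * 0 if some segment would look like `.git` on NTFS.
 */
int sketch_verify_path(const char *path, int protect_ntfs)
{
	int at_segment_start = 1;

	if (!protect_ntfs)
		return 1;	/* this sketch models only the NTFS check */

	for (const char *p = path; *p; p++) {
		if (at_segment_start && leading_segment_is_dotgit(p))
			return 0;
		/* Both separators start a new segment, on every platform. */
		at_segment_start = (*p == '/' || *p == '\\');
	}
	return 1;
}
```

Note that a file literally named `\` passes in this sketch, because an
empty segment never matches, while `dir\.git\config` is rejected when
`protect_ntfs` is set.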

The `fsck_tree()` function (the other caller of `is_ntfs_dotgit()`) now
needs to look for backslashes in tree entries' names specifically when
`core.protectNTFS` is turned on. While it would be tempting to
completely disallow backslashes in that case (much like `fsck` reports
names containing forward slashes as "full paths"), this would be
overzealous: when `core.protectNTFS` is turned on in a non-Windows
setup, backslashes are perfectly valid characters in file names while we
_still_ want to disallow tree entries that are clearly designed to
exploit NTFS-specific behavior.
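
A sketch of that distinction, for a single tree entry name: backslashes
by themselves are tolerated, and only a segment that NTFS would resolve
to `.git` is flagged. The function names here are hypothetical and the
matching helper is a simplified assumption, not the code in `fsck.c`.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>	/* strncasecmp() (POSIX) */

/* Simplified stand-in for is_ntfs_dotgit(): leading segment only. */
static int leading_segment_is_dotgit(const char *name)
{
	static const char *const forms[] = { ".git", "git~1" };
	size_t len = strcspn(name, "\\/");

	for (size_t i = 0; i < 2; i++) {
		size_t n = strlen(forms[i]);
		if (len >= n && !strncasecmp(name, forms[i], n) &&
		    strspn(name + n, " .") >= len - n)
			return 1;
	}
	return 0;
}

/*
 * Hypothetical per-entry check: returns 1 if the entry name, split at
 * backslashes, contains a segment that looks like `.git` on NTFS.
 */
int sketch_fsck_name_has_dotgit(const char *name, int protect_ntfs)
{
	int has_dotgit = 0;

	if (!protect_ntfs)
		return 0;

	has_dotgit |= leading_segment_is_dotgit(name);
	for (const char *p = strchr(name, '\\'); p; p = strchr(p, '\\')) {
		p++;	/* step past the backslash to the next segment */
		has_dotgit |= leading_segment_is_dotgit(p);
	}
	return has_dotgit;
}
```

Under these assumptions, an entry named `back\slash` is accepted, while
`dir\.git` is reported when `protect_ntfs` is set.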

This simplification will make subsequent changes easier to implement,
such as turning `core.protectNTFS` on by default (not only on Windows)
or protecting against attack vectors involving NTFS Alternate Data
Streams.

Incidentally, this change allows for catching malicious repositories
that contain tree entries of the form `dir\.gitmodules` already on the
server side rather than only on the client side (and previously only on
Windows): in contrast to `is_ntfs_dotgit()`, the
`is_ntfs_dotgitmodules()` function already expects the caller to split
the paths by directory separators.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
3 files changed
tree: 86984af16137a62e6062503f5d3d2278eeed6519
README.md

Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with the help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from https://git-scm.com/ including full documentation and Git related tools.

See Documentation/gittutorial.txt to get started, then see Documentation/giteveryday.txt for a useful minimum set of commands, and Documentation/git-<commandname>.txt for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.txt (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission). To subscribe to the list, send an email with just “subscribe git” in the body to majordomo@vger.kernel.org. The mailing list archives are available at https://public-inbox.org/git/, http://marc.info/?l=git and other archival sites.

The maintainer frequently sends the “What's cooking” reports that list the current status of various development topics to the mailing list. The discussion following them gives a good reference for project status, development direction and remaining tasks.

The name “git” was given by Linus Torvalds when he wrote the very first version. He described the tool as “the stupid content tracker” and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of “get” may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • “global information tracker”: you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • “goddamn idiotic truckload of sh*t”: when it breaks