reencode_string: use st_add/st_mult helpers

When converting a string with iconv, if the output buffer
isn't big enough, we grow it. But we compute the new size
without any check for integer overflow. So when we compute:

  outalloc = sofar + insz * 2 + 32;

we may end up wrapping outalloc (which is a size_t), and
allocating a too-small buffer. We then manipulate it
further:

  outsz = outalloc - sofar - 1;

and feed outsz back to iconv. If outalloc is wrapped and
smaller than sofar, we'll end up with a small allocation but
feed a very large outsz to iconv, which could result in it
overflowing the buffer.
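
To make the wraparound concrete, here is a minimal sketch
that models a 32-bit size_t with uint32_t, so the effect is
visible even on a 64-bit host. The size32_t typedef and both
function names are made up for illustration; they are not
names from the code being fixed.

```c
#include <stdint.h>

/*
 * Stand-in for a 32-bit size_t (hypothetical name). Unsigned
 * arithmetic on it wraps modulo 2^32.
 */
typedef uint32_t size32_t;

/* The unchecked growth computation quoted above. */
size32_t grow_unchecked(size32_t sofar, size32_t insz)
{
	return sofar + insz * 2 + 32;
}

/* The amount of free space we would then report to iconv. */
size32_t remaining_space(size32_t outalloc, size32_t sofar)
{
	return outalloc - sofar - 1;
}
```

With sofar = 0x80000000 and insz = 0x40000000 (illustrative
values), grow_unchecked() returns 32, and remaining_space()
on that result returns 0x8000001f: a 32-byte buffer that we
would claim has roughly 2GB of room.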

Can we use this to construct an attack wherein the victim
clones a repository with a very large commit object with an
encoding header, and running "git log" reencodes it into
utf8, causing an overflow?

An attack of this sort is likely impossible in practice.
"sofar" is how many output bytes we've written total, and
"insz" is the number of input bytes remaining. Imagine our
input doubles in size as we output it (which is easy to do
by converting latin1 to utf8, for example), and that we
start with N input bytes. Our initial output buffer also
starts at N bytes, so after the first call we'd have N/2
input bytes remaining (insz), and have written N bytes
(sofar). That means our next allocation will be
(N + N/2 * 2 + 32) bytes, or (2N + 32).

We can therefore overflow a 32-bit size_t with a commit
message that's just under 2^31 bytes, assuming it consists
mostly of "doubling" sequences (e.g., latin1 0xe1 which
becomes utf8 0xc3 0xa1).
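
As a sanity check on the "doubling" claim, here is a small
hand-rolled latin1-to-utf8 converter. It is only a sketch for
illustration (the function name is hypothetical, and the real
conversion is done by iconv); it relies on latin1 bytes
mapping directly to code points U+0000 through U+00FF.

```c
#include <stddef.h>

/*
 * Convert inlen latin1 bytes to utf8. The caller must provide
 * room for 2 * inlen bytes in out. Returns the number of bytes
 * written.
 */
size_t latin1_to_utf8(const unsigned char *in, size_t inlen,
		      unsigned char *out)
{
	size_t n = 0;
	size_t i;

	for (i = 0; i < inlen; i++) {
		unsigned char c = in[i];

		if (c < 0x80) {
			/* ASCII passes through as a single byte */
			out[n++] = c;
		} else {
			/* U+0080..U+00FF takes two bytes: 110xxxxx 10xxxxxx */
			out[n++] = 0xc0 | (c >> 6);
			out[n++] = 0x80 | (c & 0x3f);
		}
	}
	return n;
}
```

Feeding it the single byte 0xe1 yields the two bytes 0xc3
0xa1, so input made up entirely of bytes in the 0x80-0xff
range doubles in size.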

But we'll never make it that far with such a message. We'll
be spending 2^31 bytes on the original string. And our
initial output buffer will also be 2^31 bytes. Which is not
going to succeed on a system with a 32-bit size_t, since
there will be other things using the address space, too. The
initial malloc will fail.

If we imagine instead that we can triple the size when
converting, then the first call consumes only N/3 input
bytes, and our second allocation becomes
(N + 2/3N * 2 + 32), or (7/3N + 32). That still requires two
allocations of 3/7 of our address space each (6/7 of the
total) to succeed.

If we imagine we can quadruple, it becomes
(N + 3/4N * 2 + 32), or (5/2N + 32); we need to be able to
allocate 4/5 of the address space (two allocations of 2/5
each) to succeed.
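
The three cases above follow one pattern, which can be
sketched for a general expansion factor k. The function names
are hypothetical, and the model assumes (as the text does)
that every input byte expands to exactly k output bytes and
that the first output buffer is N bytes.

```c
#include <stdint.h>

/*
 * Size requested on the second pass, given n input bytes that
 * each expand to k output bytes. The first output buffer holds
 * n bytes, so the first pass consumes n/k input bytes and
 * leaves n - n/k.
 */
uint64_t next_alloc(uint64_t n, uint64_t k)
{
	uint64_t sofar = n;        /* output bytes written so far */
	uint64_t insz = n - n / k; /* input bytes remaining */

	return sofar + insz * 2 + 32;
}

/*
 * Fraction of the address space held by the input string plus
 * the equally sized first output buffer at the point where
 * next_alloc() reaches the size_t limit. Ignoring the +32,
 * next_alloc is n * (3k - 2) / k, so n is k / (3k - 2) of the
 * limit, and the two buffers together are twice that.
 */
double address_space_share(unsigned k)
{
	return 2.0 * k / (3.0 * k - 2.0);
}
```

For k = 2, 3, and 4 the share works out to 1, 6/7, and 4/5
respectively, matching the figures above.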

This might start to get plausible. But is it possible to get
a 4-to-1 increase in size? Probably if you're converting to
some obscure encoding. But since git defaults to utf8 for
its output, that's the likely destination encoding for an
attack. And while there are 4-byte utf8 sequences, it's
unlikely that you'd be able to find a single-byte source
sequence for one in any encoding.

So this is certainly buggy code which should be fixed, but
it is probably not a useful attack vector.
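
For illustration, here is roughly what an overflow-checked
version of the growth computation looks like. This is a
sketch, not the patch itself: checked_add(), checked_mult(),
and grow_checked() are hypothetical stand-ins for the
st_add/st_mult helpers named in the subject, and they report
failure to the caller (an assumption made here so the
behavior is easy to test).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 0 on success, -1 if a + b would wrap. */
int checked_add(size_t a, size_t b, size_t *out)
{
	if (a > SIZE_MAX - b)
		return -1;
	*out = a + b;
	return 0;
}

/* Hypothetical helper: 0 on success, -1 if a * b would wrap. */
int checked_mult(size_t a, size_t b, size_t *out)
{
	if (a && b > SIZE_MAX / a)
		return -1;
	*out = a * b;
	return 0;
}

/* Compute sofar + insz * 2 + 32, refusing to wrap. */
int grow_checked(size_t sofar, size_t insz, size_t *outalloc)
{
	size_t doubled, sum;

	if (checked_mult(insz, 2, &doubled) ||
	    checked_add(sofar, doubled, &sum) ||
	    checked_add(sum, 32, outalloc))
		return -1;
	return 0;
}
```

With every intermediate step checked, a wrapped outalloc can
never reach the allocator, so the oversized outsz described
above cannot be produced.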

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
1 file changed
tree: 6ac1f8e8124f5878aaf228d84bc04d5357b4d310
README.md

Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with the help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from https://git-scm.com/ including full documentation and Git related tools.

See Documentation/gittutorial.txt to get started, then see Documentation/giteveryday.txt for a useful minimum set of commands, and Documentation/git-.txt for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.txt (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission). To subscribe to the list, send an email with just “subscribe git” in the body to majordomo@vger.kernel.org. The mailing list archives are available at https://public-inbox.org/git/, http://marc.info/?l=git and other archival sites.

Issues which are security relevant should be disclosed privately to the Git Security mailing list git-security@googlegroups.com.

The maintainer frequently sends the “What's cooking” reports that list the current status of various development topics to the mailing list. The discussion following them gives a good reference for project status, development direction and remaining tasks.

The name “git” was given by Linus Torvalds when he wrote the very first version. He described the tool as “the stupid content tracker” and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of “get” may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • “global information tracker”: you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • “goddamn idiotic truckload of sh*t”: when it breaks