Racy GIT
This fixes the longstanding "Racy GIT" problem, which was pretty
much there from the beginning of time, but was first
demonstrated by Pasky in this message on October 24, 2005:
http://marc.theaimsgroup.com/?l=git&m=113014629716878
If you run the following sequence of commands:
echo frotz >infocom
git update-index --add infocom
echo xyzzy >infocom
so that the second update to file "infocom" does not change
st_mtime, what is recorded as the stat information for the cache
entry "infocom" exactly matches what is on the filesystem
(owner, group, inum, mtime, ctime, mode, length). After this
sequence, we incorrectly think "infocom" file still has string
"frotz" in it, and get really confused. E.g. git-diff-files
would say there is no change, git-update-index --refresh would
not even look at the filesystem to correct the situation.
Some ways of working around this issue were already suggested by
Linus in the same thread on the same day, including waiting
until the next second before returning from update-index if a
cache entry written out has the current timestamp, but that
means we can make at most one commit per second, and given that
the e-mail patch workflow used by Linus needs to process at
least 5 commits per second, it is not an acceptable solution.
Linus notes that git-apply is primarily used to update the index
while processing e-mailed patches, which is true, and
git-apply's up-to-date check is fooled by the same problem but
luckily in the other direction, so it is not really a big issue,
but still it is disturbing.
The function ce_match_stat() is called to bypass the comparison
against filesystem data when the stat data recorded in the cache
entry matches what stat() returns from the filesystem. This
patch tackles the problem by changing it to actually go to the
filesystem data for cache entries that have the same mtime as
the index file itself. This works as long as the index file and
working tree files are on the filesystems that share the same
monotonic clock. Files on network mounted filesystems sometimes
get skewed timestamps compared to "date" output, but as long as
working tree files' timestamps are skewed the same way as the
index file's, this approach still works. The only problematic
files are the ones that have the same timestamp as the index
file's, because two file updates that sandwitch the index file
update must happen within the same second to trigger the
problem.
Signed-off-by: Junio C Hamano <junkio@cox.net>
diff --git a/t/t0010-racy-git.sh b/t/t0010-racy-git.sh
new file mode 100755
index 0000000..eb175b7
--- /dev/null
+++ b/t/t0010-racy-git.sh
@@ -0,0 +1,24 @@
+#!/bin/sh
+
+test_description='racy GIT'
+
+. ./test-lib.sh
+
+# This test can give false success if your machine is sufficiently
+# slow or your trial happened to happen on second boundary.
+
+for trial in 0 1 2 3 4 5 6 7 8 9
+do
+ rm -f .git/index
+ echo frotz >infocom
+ echo xyzzy >activision
+ git update-index --add infocom activision
+ echo xyzzy >infocom
+
+ files=`git diff-files -p`
+ test_expect_success \
+ "Racy GIT trial #$trial" \
+ 'test "" != "$files"'
+done
+
+test_done