submodule: process conflicting submodules only once

During a merge module_list returns conflicting submodules several times
(stage 1,2,3) which caused the submodules to be used multiple times in
git submodule init, sync, update and status command.

There are 5 callers of module_list; they all read (mode, sha1, stage,
path) tuple, and most of them care only about path.  As a first level
approximation, it should be Ok (in the sense that it does not make things
worse than it currently is) to filter the duplicate paths from module_list
output, but some callers should change their behaviour when the merge in
the superproject still has conflicts.

Notice the higher-stage entries, and emit only one record from
module_list, but while doing so, mark the entry with "U" (not [0-3]) in
the $stage field and null out the SHA-1 part, as the object name for the
lowest stage does not give any useful information to the caller, and this
way any caller that uses the object name would hopefully barf.  Then
update the codepaths for each subcommands this way:

 - "update" should not touch the submodule repository, because we do not
   know what commit should be checked out yet.

 - "status" reports the conflicting submodules as 'U000...000' and does
   not recurse into them (we might later want to make it recurse).

 - The command called by "foreach" may want to do whatever it wants to do
   by noticing the merged status in the superproject itself, so feed the
   path to it from module_list as before, but only once per submodule.

 - "init" and "sync" are unlikely things to do while the superproject is
   still not merged, but as long as a submodule is there in $path, there
   is no point skipping it. It might however want to take the merged
   status of .gitmodules into account, but that is outside of the scope of
   this topic.

Acked-by: Jens Lehmann <Jens.Lehmann@web.de>
Thanks-to: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nicolas Morey-Chaisemartin <nicolas@morey-chaisemartin.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff --git a/git-submodule.sh b/git-submodule.sh
index 3a13397..7f6b3cf 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -72,7 +72,24 @@
 #
 module_list()
 {
-	git ls-files --error-unmatch --stage -- "$@" | sane_grep '^160000 '
+	git ls-files --error-unmatch --stage -- "$@" |
+	perl -e '
+	my %unmerged = ();
+	my ($null_sha1) = ("0" x 40);
+	while (<STDIN>) {
+		chomp;
+		my ($mode, $sha1, $stage, $path) =
+			/^([0-7]+) ([0-9a-f]{40}) ([0-3])\t(.*)$/;
+		next unless $mode eq "160000";
+		if ($stage ne "0") {
+			if (!$unmerged{$path}++) {
+				print "$mode $null_sha1 U\t$path\n";
+			}
+			next;
+		}
+		print "$_\n";
+	}
+	'
 }
 
 #
@@ -427,6 +444,11 @@
 	module_list "$@" |
 	while read mode sha1 stage path
 	do
+		if test "$stage" = U
+		then
+			echo >&2 "Skipping unmerged submodule $path"
+			continue
+		fi
 		name=$(module_name "$path") || exit
 		url=$(git config submodule."$name".url)
 		update_module=$(git config submodule."$name".update)
@@ -770,6 +792,11 @@
 		name=$(module_name "$path") || exit
 		url=$(git config submodule."$name".url)
 		displaypath="$prefix$path"
+		if test "$stage" = U
+		then
+			say "U$sha1 $displaypath"
+			continue
+		fi
 		if test -z "$url" || ! test -d "$path"/.git -o -f "$path"/.git
 		then
 			say "-$sha1 $displaypath"