 Date: Fri, 19 Dec 2008 00:45:19 -0800
 From: Linus Torvalds <torvalds@linux-foundation.org>, Junio C Hamano <gitster@pobox.com>
 Subject: Re: Odd merge behaviour involving reverts
 Abstract: Sometimes a branch that was already merged to the mainline
  is later found to be faulty.  Linus and Junio give guidance on
  recovering from such a premature merge and continuing development
  after the offending branch is fixed.
 Message-ID: <7vocz8a6zk.fsf@gitster.siamese.dyndns.org>
 References: <alpine.LFD.2.00.0812181949450.14014@localhost.localdomain>
 Content-type: text/asciidoc

 How to revert a faulty merge
 ============================

 Alan <alan@clueserver.org> said:

     I have a master branch.  We have a branch off of that that some
     developers are doing work on.  They claim it is ready. We merge it
     into the master branch.  It breaks something so we revert the merge.
     They make changes to the code.  they get it to a point where they say
     it is ok and we merge again.

     When examined, we find that code changes made before the revert are
     not in the master branch, but code changes after are in the master
     branch.

 and asked for help recovering from this situation.

 The history immediately after the "revert of the merge" would look like
 this:

  ---o---o---o---M---x---x---W
                /
        ---A---B

 where A and B are on the side development that was not so good, M is the
 merge that brings these premature changes into the mainline, x are changes
 unrelated to what the side branch did and already made on the mainline,
 and W is the "revert of the merge M" (doesn't W look like M upside down?).
 IOW, "diff W^..W" is similar to "diff -R M^..M".

 Such a "revert" of a merge can be made with:

     $ git revert -m 1 M

 After the developers of the side branch fix their mistakes, the history
 may look like this:

  ---o---o---o---M---x---x---W---x
                /
        ---A---B-------------------C---D

 where C and D are to fix what was broken in A and B, and you may already
 have some other changes on the mainline after W.

 If you merge the updated side branch (with D at its tip), none of the
 changes made in A or B will be in the result, because they were reverted
 by W.  That is what Alan saw.
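
 What Alan saw can be reproduced with a small script.  This is a sketch
 under assumptions: the branch names, the helper "put" and the rule that
 every commit adds its own file (so that no textual conflicts occur) are
 all made up for the example.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/mainline        # assumed branch name
git config user.name "Example"; git config user.email "example@example.com"

# put <name>: hypothetical helper that commits a new file <name>.txt
put() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

put o
git checkout -q -b side
put A; put B
git checkout -q mainline
git merge -q --no-ff -m M side                   # M
git revert --no-edit -m 1 HEAD >/dev/null        # W
put x
git checkout -q side
put C; put D                                     # fixes on top of the old branch
git checkout -q mainline

# The merge base is B, so only C and D count as "new" on the side branch.
git merge -q --no-ff -m "merge side again" side
ls                                               # A.txt and B.txt are missing
```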

 Linus explains the situation:

     Reverting a regular commit just effectively undoes what that commit
     did, and is fairly straightforward. But reverting a merge commit also
     undoes the _data_ that the commit changed, but it does absolutely
     nothing to the effects on _history_ that the merge had.

     So the merge will still exist, and it will still be seen as joining
     the two branches together, and future merges will see that merge as
     the last shared state - and the revert that reverted the merge brought
     in will not affect that at all.

     So a "revert" undoes the data changes, but it's very much _not_ an
     "undo" in the sense that it doesn't undo the effects of a commit on
     the repository history.

     So if you think of "revert" as "undo", then you're going to always
     miss this part of reverts. Yes, it undoes the data, but no, it doesn't
     undo history.

 In such a situation, you would want to first revert the previous revert,
 which would make the history look like this:

  ---o---o---o---M---x---x---W---x---Y
                /
        ---A---B-------------------C---D

 where Y is the revert of W.  Such a "revert of the revert" can be done
 with:

     $ git revert W

 This history would (ignoring possible conflicts between what W and W..Y
 changed) be equivalent to having neither W nor Y in the history:

  ---o---o---o---M---x---x-------x----
                /
        ---A---B-------------------C---D

 and merging the side branch again will not cause conflicts arising from an
 earlier revert and revert of the revert.

  ---o---o---o---M---x---x-------x-------*
                /                       /
        ---A---B-------------------C---D

 Of course the changes made in C and D still can conflict with what was
 done by any of the x, but that is just a normal merge conflict.
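
 The "revert the revert, then merge" sequence can be tried out as
 follows.  As before, this is a sketch: the branch names, the helper
 "put" and the one-file-per-commit layout are assumptions chosen so that
 the example stays free of conflicts.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/mainline        # assumed branch name
git config user.name "Example"; git config user.email "example@example.com"

# put <name>: hypothetical helper that commits a new file <name>.txt
put() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

put o
git checkout -q -b side
put A; put B
git checkout -q mainline
git merge -q --no-ff -m M side                   # M
git revert --no-edit -m 1 HEAD >/dev/null        # W
W=$(git rev-parse HEAD)
put x
git checkout -q side
put C; put D                                     # fixes on top of the old branch
git checkout -q mainline

git revert --no-edit "$W" >/dev/null             # Y: brings A and B back
git merge -q --no-ff -m "merge side again" side  # brings in C and D
ls                                               # A, B, C and D are all present
```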

 On the other hand, if the developers of the side branch had discarded their
 faulty A and B, and redone the changes on top of the updated mainline
 after the revert, the history would have looked like this:

  ---o---o---o---M---x---x---W---x---x
                /                 \
        ---A---B                   A'--B'--C'

 If you reverted the revert in such a case as in the previous example:

  ---o---o---o---M---x---x---W---x---x---Y---*
                /                 \         /
        ---A---B                   A'--B'--C'

 where Y is the revert of W, A' and B' are rerolled A and B, and there may
 also be a further fix-up C' on the side branch.  "diff Y^..Y" is similar
 to "diff -R W^..W" (which in turn means it is similar to "diff M^..M"),
 and "diff A'^..C'" by definition would be similar to, but different from, that,
 because it is a rerolled series of the earlier change.  There will be a
 lot of overlapping changes that result in conflicts.  So do not do "revert
 of revert" blindly without thinking.

  ---o---o---o---M---x---x---W---x---x
                /                 \
        ---A---B                   A'--B'--C'

 In the history with the rebased side branch, W (and M) are behind the merge
 base of the updated branch and the tip of the mainline, and they should
 merge without the past faulty merge and its revert getting in the way.
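
 That W lies behind the merge base can be confirmed with "git
 merge-base".  The script below is a sketch: the branch names "side" and
 "side-v2", the helper "put" and the file layout are assumptions, and the
 rerolled commits simply re-add the files that W removed.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/mainline        # assumed branch name
git config user.name "Example"; git config user.email "example@example.com"

# put <name>: hypothetical helper that commits a new file <name>.txt
put() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

put o
git checkout -q -b side
put A; put B
git checkout -q mainline
git merge -q --no-ff -m M side                   # M
put x1
git revert --no-edit -m 1 HEAD~1 >/dev/null      # W (HEAD~1 is M here)
W=$(git rev-parse HEAD)
put x2

git checkout -q -b side-v2                       # redo the work on top of mainline
put A; put B; put C                              # A', B' and the fix-up C'
git checkout -q mainline
put x3

base=$(git merge-base mainline side-v2)
if git merge-base --is-ancestor "$W" "$base"; then echo "W is behind the merge base"; fi
git merge -q --no-ff -m "merge side-v2" side-v2  # no revert of W needed
ls
```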

 To recap, these are two very different scenarios, and they want two very
 different resolution strategies:

  - If the faulty side branch was fixed by adding corrections on top, then
    doing a revert of the previous revert would be the right thing to do.

  - If the faulty side branch whose effects were discarded by an earlier
    revert of a merge was rebuilt from scratch (i.e. rebasing and fixing,
    as you seem to have interpreted), then re-merging the result without
    doing anything else fancy would be the right thing to do.
    (See the ADDENDUM below for how to rebuild a branch from scratch
    without changing its original branching-off point.)

 However, there are things to keep in mind when reverting a merge (and
 reverting such a revert).

 For example, think about what reverting a merge (and then reverting the
 revert) does to bisectability. Ignore the fact that the revert of a revert
 is undoing it - just think of it as a "single commit that does a lot".
 Because that is what it does.

 When you have a problem you are chasing down, and you hit a "revert this
 merge", what you're hitting is essentially a single commit that contains
 all the changes (but obviously in reverse) of all the commits that got
 merged. So it's debugging hell, because now you don't have lots of small
 changes among which you can try to pinpoint the _part_ that broke things.

 But does it all work? Sure it does. You can revert a merge, and from a
 purely technical angle, Git did it very naturally and had no real
 troubles. It just considered it a change from "state before merge" to
 "state after merge", and that was it. Nothing complicated, nothing odd,
 nothing really dangerous. Git will do it without even thinking about it.

 So from a technical angle, there's nothing wrong with reverting a merge,
 but from a workflow angle it's something that you generally should try to
 avoid.

 If at all possible, for example, if you find a problem that got merged
 into the main tree, rather than revert the merge, try _really_ hard to
 bisect the problem down into the branch you merged, and just fix it, or
 try to revert the individual commit that caused it.
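
 One possible way to do that is "git bisect run" followed by a revert of
 the single commit it finds.  This is a sketch under assumptions: the
 names are made up, and the stand-in test (the tree counts as broken
 whenever B.txt exists) replaces whatever real check your project has.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/mainline        # assumed branch name
git config user.name "Example"; git config user.email "example@example.com"

# put <name>: hypothetical helper that commits a new file <name>.txt
put() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

put o
good=$(git rev-parse HEAD)                       # last state known to work
git checkout -q -b side
put A; put B; put C                              # B is the commit that breaks things
git checkout -q mainline
git merge -q --no-ff -m M side                   # M: the breakage arrives here

# Bisect between the merge (bad) and the pre-merge mainline (good).
git bisect start HEAD "$good" >/dev/null
git bisect run sh -c 'test ! -e B.txt' >/dev/null
culprit=$(git rev-parse refs/bisect/bad)         # first bad commit found
git bisect reset >/dev/null 2>&1

git revert --no-edit "$culprit" >/dev/null       # undo only the offending commit
git log -1 --format=%s "$culprit"
```

 Because M itself is left alone, the side branch can later be merged
 again in the normal way.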

 Yes, it's more complex, and no, it's not always going to work (sometimes
 the answer is: "oops, I really shouldn't have merged it, because it wasn't
 ready yet, and I really need to undo _all_ of the merge"). So then you
 really should revert the merge, but when you want to re-do the merge, you
 now need to do it by reverting the revert.

 ADDENDUM

 Sometimes you have to rewrite one of a topic branch's commits *and* you can't
 change the topic's branching-off point.  Consider the following situation:

  P---o---o---M---x---x---W---x
   \         /
    A---B---C

 where commit W reverted commit M because it turned out that commit B was wrong
 and needs to be rewritten, but you need the rewritten topic to still branch
 from commit P (perhaps P is a branching-off point for yet another branch, and
 you want to be able to merge the topic into both branches).

 The natural thing to do in this case is to check out the A-B-C branch and use
 "rebase -i P" to change commit B.  However, this does not rewrite commit A,
 because "rebase -i" by default fast-forwards over any initial commits selected
 with the "pick" command.  So you end up with this:

  P---o---o---M---x---x---W---x
   \         /
    A---B---C   <-- old branch
     \
      B'---C'   <-- naively rewritten branch

 To merge A-B'-C' into the mainline branch you would still have to first revert
 commit W in order to pick up the changes in A, but then it's likely that the
 changes in B' will conflict with the original B changes re-introduced by the
 reversion of W.

 However, you can avoid these problems if you recreate the entire branch,
 including commit A:

    A'---B'---C'  <-- completely rewritten branch
   /
  P---o---o---M---x---x---W---x
   \         /
    A---B---C

 You can merge A'-B'-C' into the mainline branch without worrying about first
 reverting W.  Mainline's history would look like this:

    A'---B'---C'------------------
   /                              \
  P---o---o---M---x---x---W---x---M2
   \         /
    A---B---C

 But if you don't actually need to change commit A, then you need some way to
 recreate it as a new commit with the same changes in it.  The rebase command's
 --no-ff option provides a way to do this:

     $ git rebase [-i] --no-ff P

 The --no-ff option creates a new branch A'-B'-C' with all-new commits (all the
 SHA IDs will be different) even if in the interactive case you only actually
 modify commit B.  You can then merge this new branch directly into the mainline
 branch and be sure you'll get all of the branch's changes.
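
 The difference between the two rebases can be observed by comparing
 commit IDs.  The following is a sketch: the tag "P", the branch "topic"
 and the helper "put" are assumptions, and GIT_SEQUENCE_EDITOR=true is
 used only to accept the todo list unchanged without opening an editor.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/mainline        # assumed branch name
git config user.name "Example"; git config user.email "example@example.com"

# put <name>: hypothetical helper that commits a new file <name>.txt
put() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

# Backdate the original commits; otherwise a commit recreated within the
# same second could be byte-for-byte identical and keep its old ID.
export GIT_COMMITTER_DATE="2008-12-19 00:45:19 -0800"
put P; git tag P
git checkout -q -b topic
put A; put B; put C
unset GIT_COMMITTER_DATE
orig_A=$(git rev-parse topic~2)

# Plain "rebase -i" with an untouched todo list fast-forwards over the picks.
GIT_SEQUENCE_EDITOR=true git rebase -i P >/dev/null 2>&1
ff_A=$(git rev-parse topic~2)

# With --no-ff every commit is replayed, so A becomes a new commit A'.
GIT_SEQUENCE_EDITOR=true git rebase -i --no-ff P >/dev/null 2>&1
new_A=$(git rev-parse topic~2)

echo "original A: $orig_A"
echo "rebase -i:  $ff_A"
echo "--no-ff:    $new_A"
```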

 You can also use --no-ff in cases where you just add extra commits to the topic
 to fix it up.  Let's revisit the situation discussed at the start of this howto:

  P---o---o---M---x---x---W---x
   \         /
    A---B---C----------------D---E   <-- fixed-up topic branch

 At this point, you can use --no-ff to recreate the topic branch:

     $ git checkout E
     $ git rebase --no-ff P

 yielding

    A'---B'---C'------------D'---E'  <-- recreated topic branch
   /
  P---o---o---M---x---x---W---x
   \         /
    A---B---C----------------D---E

 You can merge the recreated branch into the mainline without reverting commit W,
 and mainline's history will look like this:

    A'---B'---C'------------D'---E'
   /                              \
  P---o---o---M---x---x---W---x---M2
   \         /
    A---B---C
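
 The whole sequence can be replayed in a scratch repository.  This is a
 sketch under assumptions: the tag "P", the branch names, the helper
 "put" and the one-file-per-commit layout are invented for the example.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/mainline        # assumed branch name
git config user.name "Example"; git config user.email "example@example.com"

# put <name>: hypothetical helper that commits a new file <name>.txt
put() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

# Backdate the original commits so that the recreated ones are guaranteed
# to get new IDs even when the script runs within a single second.
export GIT_COMMITTER_DATE="2008-12-19 00:45:19 -0800"
put P; git tag P
git checkout -q -b topic
put A; put B; put C
git checkout -q mainline
put o
git merge -q --no-ff -m M topic                  # M
M=$(git rev-parse HEAD)
put x
git revert --no-edit -m 1 "$M" >/dev/null        # W
W=$(git rev-parse HEAD)
git checkout -q topic
put D; put E                                     # fixes added on top
unset GIT_COMMITTER_DATE

git rebase --no-ff P >/dev/null 2>&1             # recreate A'..E' on the same base
git checkout -q mainline
git merge -q --no-ff -m M2 topic                 # M2: no revert of W needed
ls                                               # A through E are all present
```

 W stays in the history untouched; the recreated commits are simply new
 to the mainline, so the merge brings in all of them.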