| Directory rename detection |
| ========================== |
| |
| The rename detection logic in diffcore-rename checks for renames of |
| individual files. Its results are aggregated and analyzed in |
| merge-recursive for cases where combinations of renames indicate that a |
| full directory has been renamed. |
| |
| Scope of abilities |
| ------------------ |
| |
| It is perhaps easiest to start with an example: |
| |
| * When all of x/a, x/b and x/c have moved to z/a, z/b and z/c, it is |
| likely that x/d added in the meantime would also want to move to z/d by |
| taking the hint that the entire directory 'x' moved to 'z'. |
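
The basic case above can be illustrated with a minimal Python sketch. It is
not git's implementation (which lives in C in merge-recursive); the function
names `infer_directory_renames` and `apply_directory_rename` are hypothetical,
and the sketch assumes every rename out of a directory agrees on one target.

```python
import posixpath

def infer_directory_renames(file_renames):
    """Map each old directory to the directory its files moved to.

    Simplifying assumption: all renames out of one directory share a target.
    """
    dir_renames = {}
    for old, new in file_renames.items():
        dir_renames[posixpath.dirname(old)] = posixpath.dirname(new)
    return dir_renames

def apply_directory_rename(path, dir_renames):
    """Move a newly added path along with its directory, if that directory moved."""
    old_dir, name = posixpath.split(path)
    new_dir = dir_renames.get(old_dir)
    return posixpath.join(new_dir, name) if new_dir is not None else path

# One side renamed x/{a,b,c} to z/{a,b,c}; the other side added x/d.
dir_renames = infer_directory_renames({"x/a": "z/a", "x/b": "z/b", "x/c": "z/c"})
print(apply_directory_rename("x/d", dir_renames))  # prints z/d
```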
| |
| More interesting possibilities exist, though, such as: |
| |
| * one side of history renames x -> z, and the other renames some file to |
| x/e, causing the need for the merge to do a transitive rename so that |
| the rename ends up at z/e. |
| |
| * one side of history renames x -> z, but also renames all files within x. |
| For example, x/a -> z/alpha, x/b -> z/bravo, etc. |
| |
| * both 'x' and 'y' being merged into a single directory 'z', with a |
| directory rename being detected for both x->z and y->z. |
| |
| * not all files in a directory being renamed to the same location; |
| i.e. perhaps most of the files in 'x' are now found under 'z', but a few |
| are found under 'w'. |
| |
| * a directory being renamed, which also contained a subdirectory that was |
| renamed to some entirely different location. (And perhaps the inner |
| directory itself contained inner directories that were renamed to yet |
| other locations). |
| |
| * combinations of the above; see t/t6423-merge-rename-directories.sh for |
| various interesting cases. |
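
As a rough illustration of the transitive rename case from the list above, the
following Python sketch redirects a file rename whose destination falls inside
a directory that the other side renamed. The helper name `transitive_rename`
and the source path `old/e` are hypothetical; git's actual logic handles many
more cases.

```python
def transitive_rename(file_rename, dir_renames):
    """Redirect a rename whose destination lies in a renamed directory."""
    src, dst = file_rename
    for old_dir, new_dir in dir_renames.items():
        prefix = old_dir + "/"
        if dst.startswith(prefix):
            # The destination directory moved on the other side; follow it.
            return src, new_dir + "/" + dst[len(prefix):]
    return src, dst

# One side renamed x -> z; the other side renamed a file to x/e.
result = transitive_rename(("old/e", "x/e"), {"x": "z"})
print(result)  # prints ('old/e', 'z/e')
```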
| |
| Limitations -- applicability of directory renames |
| ------------------------------------------------- |
| |
| To prevent edge and corner cases from producing conflicts that either |
| cannot be represented in the index or might be too complex for users to |
| understand and resolve, a few basic rules limit when directory rename |
| detection applies: |
| |
| 1) If a given directory still exists on both sides of a merge, we do |
| not consider it to have been renamed. |
| |
| 2) If a subset of to-be-renamed files have a file or directory in the |
| way (or would be in the way of each other), "turn off" the directory |
| rename for those specific sub-paths and report the conflict to the |
| user. |
| |
| 3) If the other side of history did a directory rename to a path that |
| your side of history renamed away, then ignore that particular |
| rename from the other side of history for any implicit directory |
| renames (but warn the user). |
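
Rules 1 and 2 can be sketched in Python as follows. This is only an
illustration under simplifying assumptions: trees are modeled as sets of file
paths, only exact path collisions are checked (not a directory in the way),
and all function names are hypothetical rather than taken from git.

```python
from collections import Counter

def directory_exists(tree_paths, directory):
    """A directory exists in a tree if any path lives under it."""
    prefix = directory + "/"
    return any(p.startswith(prefix) for p in tree_paths)

def applicable_dir_renames(candidates, ours, theirs):
    """Rule 1: a directory still present on both sides was not renamed."""
    return {old: new for old, new in candidates.items()
            if not (directory_exists(ours, old) and directory_exists(theirs, old))}

def plan_implicit_renames(new_paths, dir_renames, occupied):
    """Rule 2: turn off the rename for paths whose target is taken."""
    targets = {}
    for path in new_paths:
        old_dir, _, name = path.rpartition("/")
        if old_dir in dir_renames:
            targets[path] = dir_renames[old_dir] + "/" + name
    counts = Counter(targets.values())
    moved, conflicts = {}, []
    for path, target in targets.items():
        if target in occupied or counts[target] > 1:
            conflicts.append(path)  # would be reported to the user
        else:
            moved[path] = target
    return moved, conflicts

ours = {"z/a", "y/keep"}
theirs = {"x/a", "x/d", "x/e", "y/keep"}
renames = applicable_dir_renames({"x": "z", "y": "w"}, ours, theirs)
moved, conflicts = plan_implicit_renames(["x/d", "x/e"], renames, {"z/a", "z/e"})
```

Here 'y' exists on both sides, so only x -> z survives rule 1; x/d moves to
z/d, while x/e is left in place and flagged because z/e is already occupied.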
| |
| Limitations -- detailed rules and testcases |
| ------------------------------------------- |
| |
| t/t6423-merge-rename-directories.sh contains extensive tests and commentary |
| which generate and explore the rules listed above. It also lists a few |
| additional rules: |
| |
| a) If renames split a directory into two or more others, the directory |
| with the most renames "wins". |
| |
| b) Only apply implicit directory renames to directories if the other side |
| of history is the one doing the renaming. |
| |
| c) Do not perform directory rename detection for directories which had no |
| new paths added to them. |
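
Rule (a) amounts to a majority vote over the target directories of the file
renames. A minimal Python sketch, with the hypothetical name
`pick_directory_renames`, might look like this. How ties are handled is an
assumption of this sketch (no directory rename is chosen); the text above does
not specify it.

```python
from collections import Counter, defaultdict
import posixpath

def pick_directory_renames(file_renames):
    """For each old directory, pick the target directory with the most renames."""
    votes = defaultdict(Counter)
    for old, new in file_renames.items():
        old_dir, new_dir = posixpath.dirname(old), posixpath.dirname(new)
        if old_dir != new_dir:
            votes[old_dir][new_dir] += 1
    winners = {}
    for old_dir, counter in votes.items():
        ranked = counter.most_common(2)
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            continue  # assumption: a tie yields no directory rename
        winners[old_dir] = ranked[0][0]
    return winners

# Two files from 'x' went to 'z', one went to 'w': 'z' wins.
winners = pick_directory_renames({"x/a": "z/a", "x/b": "z/b", "x/c": "w/c"})
print(winners)  # prints {'x': 'z'}
```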
| |
| Limitations -- support in different commands |
| -------------------------------------------- |
| |
| Directory rename detection is supported by 'merge' and 'cherry-pick'. |
| Other git commands in which users might be surprised to find limited or |
| no directory rename detection support: |
| |
| * diff |
| |
| Folks have requested in the past that `git diff` detect directory |
| renames and somehow simplify its output. It is not clear whether this |
| would be desirable or how the output should be simplified, so this was |
| simply not implemented. Further, to implement this, directory rename |
| detection logic would need to move from merge-recursive to |
| diffcore-rename. |
| |
| * am |
| |
| git-am tries to avoid a full three-way merge, instead calling |
| git-apply. That prevents us from detecting renames at all, which may |
| defeat the directory rename detection. There is a fallback, though: if |
| the initial git-apply fails and the user has specified the -3 option, |
| git-am will fall back to a three-way merge. However, git-am lacks the |
| necessary information to do a "real" three-way merge. Instead, it has |
| to use build_fake_ancestor() to get a merge base, and that merge base |
| is missing files whose rename may have been important to detect for |
| directory rename detection to function. |
| |
| * rebase |
| |
| am-based rebases work by first generating a bunch of patches (which no |
| longer record what the original commits were and thus don't have the |
| necessary info from which we can find a real merge-base), and then |
| calling git-am. As a result, am-based rebases will not always |
| successfully detect directory renames either (see the 'am' section |
| above). Merge-based rebases (rebase -m) and cherry-pick-based rebases |
| (rebase -i) are not affected by this shortcoming, and fully support |
| directory rename detection. |