git-filter-branch(1)
====================

NAME
----
git-filter-branch - Rewrite branches

SYNOPSIS
--------
[verse]
'git-filter-branch' [--env-filter <command>] [--tree-filter <command>]
	[--index-filter <command>] [--parent-filter <command>]
	[--msg-filter <command>] [--commit-filter <command>]
	[--tag-name-filter <command>] [--subdirectory-filter <directory>]
	[--original <namespace>] [-d <directory>] [-f | --force]
	[<rev-list options>...]

DESCRIPTION
-----------
Lets you rewrite git revision history by rewriting the branches mentioned
in the <rev-list options>, applying custom filters on each revision.
Those filters can modify each tree (e.g. removing a file or running
a perl rewrite on all files) or information about each commit.
Otherwise, all information (including original commit times or merge
information) will be preserved.

The command will only rewrite the _positive_ refs mentioned in the
command line (i.e. if you pass 'a..b', only 'b' will be rewritten).
If you specify no filters, the commits will be recommitted without any
changes, which would normally have no effect. Nevertheless, this may be
useful in the future for compensating for some git bugs or such,
therefore such a usage is permitted.

*WARNING*! The rewritten history will have different object names for all
the objects and will not converge with the original branch. You will not
be able to easily push and distribute the rewritten branch on top of the
original branch. Please do not use this command if you do not know the
full implications, and avoid using it anyway, if a simple single commit
would suffice to fix your problem.

Always verify that the rewritten version is correct: The original refs,
if different from the rewritten ones, will be stored in the namespace
'refs/original/'.

Note that since this operation is very I/O expensive, it might
be a good idea to redirect the temporary directory off-disk with the
'-d' option, e.g. on tmpfs. Reportedly the speedup is very noticeable.


Filters
~~~~~~~

The filters are applied in the order listed below. The <command>
argument is always evaluated in the shell using the 'eval' command (with the
notable exception of the commit filter, for technical reasons).
Prior to that, the $GIT_COMMIT environment variable will be set to contain
the id of the commit being rewritten. Also, GIT_AUTHOR_NAME,
GIT_AUTHOR_EMAIL, GIT_AUTHOR_DATE, GIT_COMMITTER_NAME, GIT_COMMITTER_EMAIL,
and GIT_COMMITTER_DATE are set according to the current commit.

A 'map' function is available that takes an "original sha1 id" argument
and outputs a "rewritten sha1 id" if the commit has already been
rewritten, and "original sha1 id" otherwise; the 'map' function can
return several ids on separate lines if your commit filter emitted
multiple commits.

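
The behaviour of 'map' described above can be sketched as follows. This is
an illustration only, not the script's actual source: it assumes that the
rewritten ids are recorded one file per original id in a made-up directory
`$map_dir`, and the ids are shortened placeholders.

```shell
# Illustration only: $map_dir and the ids below are assumptions made for
# this sketch, not names taken from git-filter-branch itself.
map_dir=$(mktemp -d)

map()
{
	# Print the recorded rewritten id(s), or fall back to the original id.
	if test -r "$map_dir/$1"
	then
		cat "$map_dir/$1"
	else
		echo "$1"
	fi
}

# Pretend commit "aaaa" was rewritten to "bbbb"; "cccc" was never rewritten.
echo bbbb > "$map_dir/aaaa"

rewritten=$(map aaaa)
untouched=$(map cccc)
echo "$rewritten $untouched"

# Remove the scratch directory again.
rm -rf "$map_dir"
```

A file holding several lines would make this sketch print several ids, which
mirrors the case of a commit filter that emitted multiple commits.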

OPTIONS
-------

--env-filter <command>::
	This is the filter for modifying the environment in which
	the commit will be performed. Specifically, you might want
	to rewrite the author/committer name/email/time environment
	variables (see gitlink:git-commit[1] for details). Do not forget
	to re-export the variables.

--tree-filter <command>::
	This is the filter for rewriting the tree and its contents.
	The argument is evaluated in shell with the working
	directory set to the root of the checked out tree. The new tree
	is then used as-is (new files are auto-added, disappeared files
	are auto-removed - neither .gitignore files nor any other ignore
	rules *HAVE ANY EFFECT*!).

--index-filter <command>::
	This is the filter for rewriting the index. It is similar to the
	tree filter but does not check out the tree, which makes it much
	faster. For hairy cases, see gitlink:git-update-index[1].

--parent-filter <command>::
	This is the filter for rewriting the commit's parent list.
	It will receive the parent string on stdin and shall output
	the new parent string on stdout. The parent string is in
	a format accepted by gitlink:git-commit-tree[1]: empty for
	the initial commit, "-p parent" for a normal commit and
	"-p parent1 -p parent2 -p parent3 ..." for a merge commit.

--msg-filter <command>::
	This is the filter for rewriting the commit messages.
	The argument is evaluated in the shell with the original
	commit message on standard input; its standard output is
	used as the new commit message.

--commit-filter <command>::
	This is the filter for performing the commit.
	If this filter is specified, it will be called instead of the
	gitlink:git-commit-tree[1] command, with arguments of the form
	"<TREE_ID> [-p <PARENT_COMMIT_ID>]..." and the log message on
	stdin. The commit id is expected on stdout.
+
As a special extension, the commit filter may emit multiple
commit ids; in that case, the rewritten children of the original
commit will have all of them as parents.
+
You can use the 'map' convenience function in this filter, and other
convenience functions, too. For example, calling 'skip_commit "$@"'
will leave out the current commit (but not its changes! If you want
that, use gitlink:git-rebase[1] instead).

--tag-name-filter <command>::
	This is the filter for rewriting tag names. When passed,
	it will be called for every tag ref that points to a rewritten
	object (or to a tag object which points to a rewritten object).
	The original tag name is passed via standard input, and the new
	tag name is expected on standard output.
+
The original tags are not deleted, but can be overwritten;
use "--tag-name-filter cat" to simply update the tags. In this
case, be very careful and make sure you have the old tags
backed up in case the conversion has run afoul.
+
Note that there is currently no support for proper rewriting of
tag objects; in layman's terms, if the tag has a message or signature
attached, the rewritten tag won't have it. Sorry. (It is by
definition impossible to preserve signatures at any rate.)

--subdirectory-filter <directory>::
	Only look at the history which touches the given subdirectory.
	The result will contain that directory (and only that) as its
	project root.

--original <namespace>::
	Use this option to set the namespace where the original refs
	will be stored. The default value is 'refs/original'.

-d <directory>::
	Use this option to set the path to the temporary directory used for
	rewriting. When applying a tree filter, the command needs to
	temporarily check out the tree to some directory, which may consume
	considerable space in case of large projects. By default it
	does this in the '.git-rewrite/' directory but you can override
	that choice by this parameter.

-f|--force::
	`git filter-branch` refuses to start with an existing temporary
	directory or when there are already refs starting with
	'refs/original/', unless forced.

<rev-list options>...::
	Any remaining arguments are passed to gitlink:git-rev-list[1].
	Only commits in the resulting output will be filtered, although
	the filtered commits can still reference parents which are
	outside of that set.

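
Filters that read standard input and write standard output, such as the
message filter and the tag name filter, can be tried without a repository
by feeding them sample input through 'eval'. The sketch below assumes two
made-up filters (strip a leading "WIP: " from the subject line, and rename
tags from 'v*' to 'release-*'); the sample message and tag name are
likewise invented.

```shell
# Hypothetical filter strings, as they would be passed on the command line.
msg_filter='sed "1s/^WIP: //"'
tag_name_filter='sed "s/^v/release-/"'

# Feed sample input on stdin and capture what the filter prints.
new_msg=$(printf 'WIP: fix typo\n\nLonger explanation.\n' | eval "$msg_filter")
new_tag=$(printf '%s\n' "v1.0" | eval "$tag_name_filter")

printf '%s\n' "$new_msg"
printf '%s\n' "$new_tag"

# In a repository the same strings would be used like this (not run here):
#   git filter-branch --msg-filter "$msg_filter" \
#       --tag-name-filter "$tag_name_filter" HEAD
```

Trying a filter on sample input first is cheap compared to rewriting a
whole branch and then inspecting the result.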

Examples
--------

Suppose you want to remove a file (containing confidential information
or copyright violation) from all commits:

-------------------------------------------------------
git filter-branch --tree-filter 'rm -f filename' HEAD
-------------------------------------------------------

The '-f' keeps the filter from failing on commits that do not contain
the file. A significantly faster version:

--------------------------------------------------------------------------
git filter-branch --index-filter 'git update-index --remove filename' HEAD
--------------------------------------------------------------------------

Now, you will get the rewritten history saved in HEAD.
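
To rewrite author information, an environment filter can be used. As a
sketch, assuming an author whose address changed from the made-up
'old@example.com' to 'new@example.com', the filter below can be checked
outside a repository by setting the variable by hand and evaluating the
filter string:

```shell
# Hypothetical filter string, as it would be passed to --env-filter.
# Both addresses are placeholders invented for this sketch.
env_filter='
if [ "$GIT_AUTHOR_EMAIL" = "old@example.com" ]
then
	GIT_AUTHOR_EMAIL="new@example.com"
	export GIT_AUTHOR_EMAIL
fi'

# Simulate the environment of one commit, then evaluate the filter.
GIT_AUTHOR_EMAIL="old@example.com"
export GIT_AUTHOR_EMAIL
eval "$env_filter"
echo "$GIT_AUTHOR_EMAIL"

# In a repository the same string would be used like this (not run here):
#   git filter-branch --env-filter "$env_filter" HEAD
```

Note the explicit 'export' after the assignment, as the description of the
environment filter asks for.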

To set a commit (which typically is at the tip of another
history) to be the parent of the current initial commit, in
order to paste the other history behind the current history:

-------------------------------------------------------------------
git filter-branch --parent-filter 'sed "s/^\$/-p <graft-id>/"' HEAD
-------------------------------------------------------------------

(if the parent string is empty - which happens when we are dealing with
the initial commit - add <graft-id> as a parent). Note that this assumes
history with a single root (that is, no merge without common ancestors
happened). If this is not the case, use:

--------------------------------------------------------------------------
git filter-branch --parent-filter \
	'test $GIT_COMMIT = <commit-id> && echo "-p <graft-id>" || cat' HEAD
--------------------------------------------------------------------------

or even simpler:

-----------------------------------------------
echo "$commit-id $graft-id" >> .git/info/grafts
git filter-branch $graft-id..HEAD
-----------------------------------------------
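
To see what the 'sed' parent filter from the first of these examples does
to a parent string, it can be run by hand. In this sketch the ids are
shortened placeholders and `parent_filter` is a helper name invented for
the illustration:

```shell
# Placeholder id for the commit to graft in (an assumption for this sketch).
graft_id=1111111

# The same sed expression as in the --parent-filter example, wrapped in a
# helper function so that it can be fed sample input.
parent_filter()
{
	sed "s/^\$/-p $graft_id/"
}

# An initial commit has an empty parent string and gains the graft parent.
root_parents=$(printf '\n' | parent_filter)

# A normal commit already has a parent and passes through unchanged.
other_parents=$(printf '%s\n' "-p 2222222" | parent_filter)

echo "root:  $root_parents"
echo "other: $other_parents"
```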

To remove commits authored by "Darl McBribe" from the history:

------------------------------------------------------------------------------
git filter-branch --commit-filter '
	if [ "$GIT_AUTHOR_NAME" = "Darl McBribe" ];
	then
		skip_commit "$@";
	else
		git commit-tree "$@";
	fi' HEAD
------------------------------------------------------------------------------

The function 'skip_commit' is defined as follows:

--------------------------
skip_commit()
{
	shift;
	while [ -n "$1" ];
	do
		shift;
		map "$1";
		shift;
	done;
}
--------------------------

The shift magic first throws away the tree id and then the -p
parameters. Note that this handles merges properly! In case Darl
committed a merge between P1 and P2, it will be propagated properly
and all children of the merge will become merge commits with P1,P2
as their parents instead of the merge commit.

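
The argument walk in 'skip_commit' can be traced by calling it with a stub
in place of 'map'. In this sketch the stub simply prefixes each id with
"new-", and TREE, P1 and P2 stand in for real object ids:

```shell
# Stub standing in for the real 'map' (an assumption for this sketch).
map()
{
	echo "new-$1"
}

# Same definition as in the text.
skip_commit()
{
	shift;
	while [ -n "$1" ];
	do
		shift;
		map "$1";
		shift;
	done;
}

# Arguments as a commit filter receives them for a merge of P1 and P2.
result=$(skip_commit TREE -p P1 -p P2)
echo "$result"
```

The function prints one mapped parent per line and never creates a commit
of its own, which is why children of the skipped commit end up with its
parents.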

To restrict rewriting to only part of the history, specify a revision
range on the command line. Only the commits that 'git rev-list' prints
for this range are rewritten, and the positive refs in the range are
updated to point to the rewritten top-most revision.

*NOTE* the changes introduced by the commits, and which are not reverted
by subsequent commits, will still be in the rewritten branch. If you want
to throw out _changes_ together with the commits, you should use the
interactive mode of gitlink:git-rebase[1].


Consider this history:

------------------
     D--E--F--G--H
    /     /
A--B-----C
------------------

To rewrite only commits D,E,F,G,H, but leave A, B and C alone, use:

--------------------------------
git filter-branch ... C..H
--------------------------------

To rewrite commits E,F,G,H, use one of these:

----------------------------------------
git filter-branch ... C..H --not D
git filter-branch ... D..H --not C
----------------------------------------

To move the whole tree into a subdirectory, or remove it from there:

---------------------------------------------------------------
git filter-branch --index-filter \
	'git ls-files -s | sed "s-\t-&newsubdir/-" |
		GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
			git update-index --index-info &&
	 mv $GIT_INDEX_FILE.new $GIT_INDEX_FILE' HEAD
---------------------------------------------------------------

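
The 'sed' step in the filter above can be checked on a sample line of
'git ls-files -s' output. The blob id and path below are made up, and the
sketch builds the tab character with 'printf' instead of relying on `\t`
inside the sed expression, which not every sed understands:

```shell
# Sample index entry in 'git ls-files -s' format: mode, id, stage, TAB, path.
# The id is a made-up placeholder.
tab=$(printf '\t')
entry="100644 1234567890123456789012345678901234567890 0${tab}README"

# Insert the new directory name right after the tab, as the filter does.
moved=$(printf '%s\n' "$entry" | sed "s-${tab}-&newsubdir/-")

printf '%s\n' "$moved"
```

Only the path after the tab changes; mode, id and stage are passed through
untouched, so 'git update-index --index-info' can read the result back.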

Author
------
Written by Petr "Pasky" Baudis <pasky@suse.cz>,
and the git list <git@vger.kernel.org>

Documentation
--------------
Documentation by Petr Baudis and the git list.

GIT
---
Part of the gitlink:git[7] suite