| diff API |
| ======== |
| |
| The diff API is for programs that compare two sets of files (e.g. two |
| trees, one tree and the index) and present the found difference in |
| various ways. The calling program is responsible for feeding the API |
| pairs of files, one from the "old" set and the corresponding one from |
| "new" set, that are different. The library called through this API is |
| called diffcore, and is responsible for two things. |
| |
| * finding total rewrites (`-B`), renames (`-M`) and copies (`-C`), and |
| changes that touch a string (`-S`), as specified by the caller. |
| |
| * outputting the differences in various formats, as specified by the |
| caller. |
| |
| Calling sequence |
| ---------------- |
| |
| * Prepare `struct diff_options` to record the set of diff options, and |
| then call `diff_setup()` to initialize this structure. This sets up |
| the vanilla default. |
| |
| * Fill in the options structure to specify desired output format, rename |
| detection, etc. `diff_opt_parse()` can be used to parse options given |
| from the command line in a way consistent with existing git-diff |
| family of programs. |
| |
| * Call `diff_setup_done()`; this inspects the options set up so far for |
| internal consistency and make necessary tweaking to it (e.g. if |
| textual patch output was asked, recursive behaviour is turned on). |
| |
| * As you find different pairs of files, call `diff_change()` to feed |
| modified files, `diff_addremove()` to feed created or deleted files, |
| or `diff_unmerged()` to feed a file whose state is 'unmerged' to the |
| API. These are thin wrappers to a lower-level `diff_queue()` function |
| that is flexible enough to record any of these kinds of changes. |
| |
| * Once you finish feeding the pairs of files, call `diffcore_std()`. |
| This will tell the diffcore library to go ahead and do its work. |
| |
| * Calling `diff_flush()` will produce the output. |
| |
| |
| Data structures |
| --------------- |
| |
| * `struct diff_filespec` |
| |
| This is the internal representation for a single file (blob). It |
| records the blob object name (if known -- for a work tree file it |
| typically is a NUL SHA-1), filemode and pathname. This is what the |
| `diff_addremove()`, `diff_change()` and `diff_unmerged()` synthesize and |
| feed `diff_queue()` function with. |
| |
| * `struct diff_filepair` |
| |
| This records a pair of `struct diff_filespec`; the filespec for a file |
| in the "old" set (i.e. preimage) is called `one`, and the filespec for a |
| file in the "new" set (i.e. postimage) is called `two`. A change that |
| represents file creation has NULL in `one`, and file deletion has NULL |
| in `two`. |
| |
| A `filepair` starts pointing at `one` and `two` that are from the same |
| filename, but `diffcore_std()` can break pairs and match component |
| filespecs with other filespecs from a different filepair to form new |
| filepair. This is called 'rename detection'. |
| |
| * `struct diff_queue` |
| |
| This is a collection of filepairs. Notable members are: |
| |
| `queue`:: |
| |
| An array of pointers to `struct diff_filepair`. This |
| dynamically grows as you add filepairs; |
| |
| `alloc`:: |
| |
| The allocated size of the `queue` array; |
| |
| `nr`:: |
| |
| The number of elements in the `queue` array. |
| |
| |
| * `struct diff_options` |
| |
| This describes the set of options the calling program wants to affect |
| the operation of diffcore library with. |
| |
| Notable members are: |
| |
| `output_format`:: |
| The output format used when `diff_flush()` is run. |
| |
| `context`:: |
| Number of context lines to generate in patch output. |
| |
| `break_opt`, `detect_rename`, `rename-score`, `rename_limit`:: |
| Affects the way detection logic for complete rewrites, renames |
| and copies. |
| |
| `abbrev`:: |
| Number of hexdigits to abbreviate raw format output to. |
| |
| `pickaxe`:: |
| A constant string (can and typically does contain newlines to |
| look for a block of text, not just a single line) to filter out |
| the filepairs that do not change the number of strings contained |
| in its preimage and postimage of the diff_queue. |
| |
| `flags`:: |
| This is mostly a collection of boolean options that affects the |
| operation, but some do not have anything to do with the diffcore |
| library. |
| |
| BINARY, TEXT;; |
| Affects the way how a file that is seemingly binary is treated. |
| |
| FULL_INDEX;; |
| Tells the patch output format not to use abbreviated object |
| names on the "index" lines. |
| |
| FIND_COPIES_HARDER;; |
| Tells the diffcore library that the caller is feeding unchanged |
| filepairs to allow copies from unmodified files be detected. |
| |
| COLOR_DIFF;; |
| Output should be colored. |
| |
| COLOR_DIFF_WORDS;; |
| Output is a colored word-diff. |
| |
| NO_INDEX;; |
| Tells diff-files that the input is not tracked files but files |
| in random locations on the filesystem. |
| |
| ALLOW_EXTERNAL;; |
| Tells output routine that it is Ok to call user specified patch |
| output routine. Plumbing disables this to ensure stable output. |
| |
| QUIET;; |
| Do not show any output. |
| |
| REVERSE_DIFF;; |
| Tells the library that the calling program is feeding the |
| filepairs reversed; `one` is two, and `two` is one. |
| |
| EXIT_WITH_STATUS;; |
| For communication between the calling program and the options |
| parser; tell the calling program to signal the presence of |
| difference using program exit code. |
| |
| HAS_CHANGES;; |
| Internal; used for optimization to see if there is any change. |
| |
| SILENT_ON_REMOVE;; |
| Affects if diff-files shows removed files. |
| |
| RECURSIVE, TREE_IN_RECURSIVE;; |
| Tells if tree traversal done by tree-diff should recursively |
| descend into a tree object pair that are different in preimage |
| and postimage set. |
| |
| (JC) |