| Scalar |
| ====== |
| |
| Scalar is a repository management tool that optimizes Git for use in large |
| repositories. It accomplishes this by helping users to take advantage of |
| advanced performance features in Git. Unlike most other Git built-in commands, |
| Scalar is not executed as a subcommand of 'git'; rather, it is built as a |
| separate executable containing its own series of subcommands. |
| |
| Background |
| ---------- |
| |
| Scalar was originally designed as an add-on to Git and implemented as a .NET |
| Core application. It was created based on the learnings from the VFS for Git |
| project (another application aimed at improving the experience of working with |
| large repositories). As part of its initial implementation, Scalar relied on |
| custom features in the Microsoft fork of Git that have since been integrated |
| into core Git: |
| |
| * partial clone, |
| * commit graphs, |
| * multi-pack index, |
| * sparse checkout (cone mode), |
| * scheduled background maintenance, |
| * etc |
| |
| With the requisite Git functionality in place and a desire to bring the benefits |
| of Scalar to the larger Git community, the Scalar application itself was ported |
| from C# to C and integrated upstream. |
| |
| Features |
| -------- |
| |
| Scalar is comprised of two major pieces of functionality: automatically |
| configuring built-in Git performance features and managing repository |
| enlistments. |
| |
| The Git performance features configured by Scalar (see "Background" for |
| examples) confer substantial performance benefits to large repositories, but are |
| either too experimental to enable for all of Git yet, or only benefit large |
| repositories. As new features are introduced, Scalar should be updated |
| accordingly to incorporate them. This will prevent the tool from becoming stale |
| while also providing a path for more easily bringing features to the appropriate |
| users. |
| |
| Enlistments are how Scalar knows which repositories on a user's system should |
| utilize Scalar-configured features. This allows it to update performance |
| settings when new ones are added to the tool, as well as centrally manage |
| repository maintenance. The enlistment structure - a root directory with a |
| `src/` subdirectory containing the cloned repository itself - is designed to |
| encourage users to route build outputs outside of the repository to avoid the |
| performance-limiting overhead of ignoring those files in Git. |
| |
| Design |
| ------ |
| |
| Scalar is implemented in C and interacts with Git via a mix of child process |
| invocations of Git and direct usage of `libgit.a`. Internally, it is structured |
| much like other built-ins with subcommands (e.g., `git stash`), containing a |
| `cmd_<subcommand>()` function for each subcommand, routed through a `cmd_main()` |
| function. Most options are unique to each subcommand, with `scalar` respecting |
| some "global" `git` options (e.g., `-c` and `-C`). |
| |
| Because `scalar` is not invoked as a Git subcommand (like `git scalar`), it is |
| built and installed as its own executable in the `bin/` directory, alongside |
| `git`, `git-gui`, etc. |
| |
| Roadmap |
| ------- |
| |
| NOTE: this section will be removed once the remaining tasks outlined in this |
| roadmap are complete. |
| |
| Scalar is a large enough project that it is being upstreamed incrementally, |
| living in `contrib/` until it is feature-complete. So far, the following patch |
| series have been accepted: |
| |
| - `scalar-the-beginning`: The initial patch series which sets up |
| `contrib/scalar/` and populates it with a minimal `scalar` command that |
| demonstrates the fundamental ideas. |
| |
| - `scalar-c-and-C`: The `scalar` command learns about two options that can be |
| specified before the command, `-c <key>=<value>` and `-C <directory>`. |
| |
| - `scalar-diagnose`: The `scalar` command is taught the `diagnose` subcommand. |
| |
| - `scalar-generalize-diagnose`: Move the functionality of `scalar diagnose` |
| into `git diagnose` and `git bugreport --diagnose`. |
| |
| - 'scalar-add-fsmonitor: Enable the built-in FSMonitor in Scalar |
| enlistments. At the end of this series, Scalar should be feature-complete |
| from the perspective of a user. |
| |
| Roughly speaking (and subject to change), the following series are needed to |
| "finish" this initial version of Scalar: |
| |
| - Move Scalar to toplevel: Move Scalar out of `contrib/` and into the root of |
| `git`. This includes a variety of related updates, including: |
| - building & installing Scalar in the Git root-level 'make [install]'. |
| - builing & testing Scalar as part of CI. |
| - moving and expanding test coverage of Scalar (including perf tests). |
| - implementing 'scalar help'/'git help scalar' to display scalar |
| documentation. |
| |
| Finally, there are two additional patch series that exist in Microsoft's fork of |
| Git, but there is no current plan to upstream them. There are some interesting |
| ideas there, but the implementation is too specific to Azure Repos and/or VFS |
| for Git to be of much help in general. |
| |
| These still exist mainly because the GVFS protocol is what Azure Repos has |
| instead of partial clone, while Git is focused on improving partial clone: |
| |
| - `scalar-with-gvfs`: The primary purpose of this patch series is to support |
| existing Scalar users whose repositories are hosted in Azure Repos (which does |
| not support Git's partial clones, but supports its predecessor, the GVFS |
| protocol, which is used by Scalar to emulate the partial clone). |
| |
| Since the GVFS protocol will never be supported by core Git, this patch series |
| will remain in Microsoft's fork of Git. |
| |
| - `run-scalar-functional-tests`: The Scalar project developed a quite |
| comprehensive set of integration tests (or, "Functional Tests"). They are the |
| sole remaining part of the original C#-based Scalar project, and this patch |
| adds a GitHub workflow that runs them all. |
| |
| Since the tests partially depend on features that are only provided in the |
| `scalar-with-gvfs` patch series, this patch cannot be upstreamed. |