commit | a949ebd342440049a1ac77ca675f66884eae4187 | [log] [tgz] |
---|---|---|
author | Justin Tobler <jltobler@gmail.com> | Mon Apr 08 16:16:55 2024 +0000 |
committer | Junio C Hamano <gitster@pobox.com> | Mon Apr 08 12:11:10 2024 -0700 |
tree | ce902fd70128c76ad2387baefb108c7f1bc36433 | |
parent | 7c8eb5928f3ae504f9ce92b67a1eb41db82d81f7 [diff] |
reftable/stack: use geometric table compaction To reduce the number of on-disk reftables, compaction is performed. Contiguous tables with the same binary log value of size are grouped into segments. The segment that has both the lowest binary log value and contains more than one table is set as the starting point when identifying the compaction segment. Since segments containing a single table are not initially considered for compaction, if the table appended to the list does not match the previous table log value, no compaction occurs for the new table. It is therefore possible for unbounded growth of the table list. This can be demonstrated by repeating the following sequence: git branch -f foo git branch -d foo Each operation results in a new table being written with no compaction occurring until a separate operation produces a table matching the previous table log value. Instead, to avoid unbounded growth of the table list, the compaction strategy is updated to ensure tables follow a geometric sequence after each operation by individually evaluating each table in reverse index order. This strategy results in a much simpler and more robust algorithm compared to the previous one while also maintaining a minimal ordered set of tables on-disk. When creating 10 thousand references, the new strategy has no performance impact: Benchmark 1: update-ref: create refs sequentially (revision = HEAD~) Time (mean ± σ): 26.516 s ± 0.047 s [User: 17.864 s, System: 8.491 s] Range (min … max): 26.447 s … 26.569 s 10 runs Benchmark 2: update-ref: create refs sequentially (revision = HEAD) Time (mean ± σ): 26.417 s ± 0.028 s [User: 17.738 s, System: 8.500 s] Range (min … max): 26.366 s … 26.444 s 10 runs Summary update-ref: create refs sequentially (revision = HEAD) ran 1.00 ± 0.00 times faster than update-ref: create refs sequentially (revision = HEAD~) Some tests in `t0610-reftable-basics.sh` assert the on-disk state of tables and are therefore updated to specify the correct new table count. Since compaction is more aggressive in ensuring tables maintain a geometric sequence, the expected table count is reduced in these tests. In `reftable/stack_test.c` tests related to `sizes_to_segments()` are removed because the function is no longer needed. Also, the `test_suggest_compaction_segment()` test is updated to better showcase and reflect the new geometric compaction behavior. Signed-off-by: Justin Tobler <jltobler@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.
Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with help of a group of hackers around the net.
Please read the file INSTALL for installation instructions.
Many Git online resources are accessible from https://git-scm.com/ including full documentation and Git related tools.
See Documentation/gittutorial.txt to get started, then see Documentation/giteveryday.txt for a useful minimum set of commands, and Documentation/git-<commandname>.txt
for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial
or git help tutorial
, and the documentation of each command with man git-<commandname>
or git help <commandname>
.
CVS users may also want to read Documentation/gitcvs-migration.txt (man gitcvs-migration
or git help cvs-migration
if git is installed).
The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission and Documentation/CodingGuidelines).
Those wishing to help with error message, usage and informational message string translations (localization l10) should see po/README.md (a po
file is a Portable Object file that holds the translations).
To subscribe to the list, send an email to git+subscribe@vger.kernel.org (see https://subspace.kernel.org/subscribing.html for details). The mailing list archives are available at https://lore.kernel.org/git/, https://marc.info/?l=git and other archival sites.
Issues which are security relevant should be disclosed privately to the Git Security mailing list git-security@googlegroups.com.
The maintainer frequently sends the “What's cooking” reports that list the current status of various development topics to the mailing list. The discussion following them give a good reference for project status, development direction and remaining tasks.
The name “git” was given by Linus Torvalds when he wrote the very first version. He described the tool as “the stupid content tracker” and the name as (depending on your mood):