| Concerning Git's Packing Heuristics |
| =================================== |
| |
| Oh, here's a really stupid question: |
| |
| Where do I go |
| to learn the details |
| of git's packing heuristics? |
| |
| Be careful what you ask! |
| |
| Followers of the git, please open the git IRC Log and turn to |
| February 10, 2006. |
| |
| It's a rare occasion, and we are joined by the King Git Himself, |
| Linus Torvalds (linus). Nathaniel Smith, (njs`), has the floor |
| and seeks enlightenment. Others are present, but silent. |
| |
| Let's listen in! |
| |
| <njs`> Oh, here's a really stupid question -- where do I go to |
| learn the details of git's packing heuristics? google avails |
| me not, reading the source didn't help a lot, and wading |
| through the whole mailing list seems less efficient than any |
| of that. |
| |
| It is a bold start! A plea for help combined with a simultaneous |
| tri-part attack on some of the tried and true mainstays in the quest |
| for enlightenment. Brash accusations of google being useless. Hubris! |
| Maligning the source. Heresy! Disdain for the mailing list archives. |
| Woe. |
| |
| <pasky> yes, the packing-related delta stuff is somewhat |
| mysterious even for me ;) |
| |
| Ah! Modesty after all. |
| |
| <linus> njs, I don't think the docs exist. That's something where |
| I don't think anybody else than me even really got involved. |
| Most of the rest of git others have been busy with (especially |
| Junio), but packing nobody touched after I did it. |
| |
| It's cryptic, yet vague. Linus in style for sure. Wise men |
| interpret this as an apology. A few argue it is merely a |
| statement of fact. |
| |
| <njs`> I guess the next step is "read the source again", but I |
| have to build up a certain level of gumption first :-) |
| |
| Indeed! On both points. |
| |
| <linus> The packing heuristic is actually really really simple. |
| |
| Bait... |
| |
| <linus> But strange. |
| |
| And switch. That ought to do it! |
| |
| <linus> Remember: git really doesn't follow files. So what it does is |
| - generate a list of all objects |
| - sort the list according to magic heuristics |
| - walk the list, using a sliding window, seeing if an object |
| can be diffed against another object in the window |
| - write out the list in recency order |
| |
| The traditional understatement: |
| |
| <njs`> I suspect that what I'm missing is the precise definition of |
| the word "magic" |
| |
| The traditional insight: |
| |
| <pasky> yes |
| |
| And Bable-like confusion flowed. |
| |
| <njs`> oh, hmm, and I'm not sure what this sliding window means either |
| |
| <pasky> iirc, it appeared to me to be just the sha1 of the object |
| when reading the code casually ... |
| |
| ... which simply doesn't sound as a very good heuristics, though ;) |
| |
| <njs`> .....and recency order. okay, I think it's clear I didn't |
| even realize how much I wasn't realizing :-) |
| |
| Ah, grasshopper! And thus the enlightenment begins anew. |
| |
| <linus> The "magic" is actually in theory totally arbitrary. |
| ANY order will give you a working pack, but no, it's not |
| ordered by SHA1. |
| |
| Before talking about the ordering for the sliding delta |
| window, let's talk about the recency order. That's more |
| important in one way. |
| |
| <njs`> Right, but if all you want is a working way to pack things |
| together, you could just use cat and save yourself some |
| trouble... |
| |
| Waaait for it.... |
| |
| <linus> The recency ordering (which is basically: put objects |
| _physically_ into the pack in the order that they are |
| "reachable" from the head) is important. |
| |
| <njs`> okay |
| |
| <linus> It's important because that's the thing that gives packs |
| good locality. It keeps the objects close to the head (whether |
| they are old or new, but they are _reachable_ from the head) |
| at the head of the pack. So packs actually have absolutely |
| _wonderful_ IO patterns. |
| |
| Read that again, because it is important. |
| |
| <linus> But recency ordering is totally useless for deciding how |
| to actually generate the deltas, so the delta ordering is |
| something else. |
| |
| The delta ordering is (wait for it): |
| - first sort by the "basename" of the object, as defined by |
| the name the object was _first_ reached through when |
| generating the object list |
| - within the same basename, sort by size of the object |
| - but always sort different types separately (commits first). |
| |
| That's not exactly it, but it's very close. |
| |
| <njs`> The "_first_ reached" thing is not too important, just you |
| need some way to break ties since the same objects may be |
| reachable many ways, yes? |
| |
| And as if to clarify: |
| |
| <linus> The point is that it's all really just any random |
| heuristic, and the ordering is totally unimportant for |
| correctness, but it helps a lot if the heuristic gives |
| "clumping" for things that are likely to delta well against |
| each other. |
| |
| It is an important point, so secretly, I did my own research and have |
| included my results below. To be fair, it has changed some over time. |
| And through the magic of Revisionistic History, I draw upon this entry |
| from The Git IRC Logs on my father's birthday, March 1: |
| |
| <gitster> The quote from the above linus should be rewritten a |
| bit (wait for it): |
| - first sort by type. Different objects never delta with |
| each other. |
| - then sort by filename/dirname. hash of the basename |
| occupies the top BITS_PER_INT-DIR_BITS bits, and bottom |
| DIR_BITS are for the hash of leading path elements. |
| - then if we are doing "thin" pack, the objects we are _not_ |
| going to pack but we know about are sorted earlier than |
| other objects. |
| - and finally sort by size, larger to smaller. |
| |
| In one swell-foop, clarification and obscurification! Nonetheless, |
| authoritative. Cryptic, yet concise. It even solicits notions of |
| quotes from The Source Code. Clearly, more study is needed. |
| |
| <gitster> That's the sort order. What this means is: |
| - we do not delta different object types. |
| - we prefer to delta the objects with the same full path, but |
| allow files with the same name from different directories. |
| - we always prefer to delta against objects we are not going |
| to send, if there are some. |
| - we prefer to delta against larger objects, so that we have |
| lots of removals. |
| |
| The penultimate rule is for "thin" packs. It is used when |
| the other side is known to have such objects. |
| |
| There it is again. "Thin" packs. I'm thinking to myself, "What |
| is a 'thin' pack?" So I ask: |
| |
| <jdl> What is a "thin" pack? |
| |
| <gitster> Use of --objects-edge to rev-list as the upstream of |
| pack-objects. The pack transfer protocol negotiates that. |
| |
| Woo hoo! Cleared that _right_ up! |
| |
| <gitster> There are two directions - push and fetch. |
| |
| There! Did you see it? It is not '"push" and "pull"'! How often the |
| confusion has started here. So casually mentioned, too! |
| |
| <gitster> For push, git-send-pack invokes git-receive-pack on the |
| other end. The receive-pack says "I have up to these commits". |
| send-pack looks at them, and computes what are missing from |
| the other end. So "thin" could be the default there. |
| |
| In the other direction, fetch, git-fetch-pack and |
| git-clone-pack invokes git-upload-pack on the other end |
| (via ssh or by talking to the daemon). |
| |
| There are two cases: fetch-pack with -k and clone-pack is one, |
| fetch-pack without -k is the other. clone-pack and fetch-pack |
| with -k will keep the downloaded packfile without expanded, so |
| we do not use thin pack transfer. Otherwise, the generated |
| pack will have delta without base object in the same pack. |
| |
| But fetch-pack without -k will explode the received pack into |
| individual objects, so we automatically ask upload-pack to |
| give us a thin pack if upload-pack supports it. |
| |
| OK then. |
| |
| Uh. |
| |
| Let's return to the previous conversation still in progress. |
| |
| <njs`> and "basename" means something like "the tail of end of |
| path of file objects and dir objects, as per basename(3), and |
| we just declare all commit and tag objects to have the same |
| basename" or something? |
| |
| Luckily, that too is a point that gitster clarified for us! |
| |
| If I might add, the trick is to make files that _might_ be similar be |
| located close to each other in the hash buckets based on their file |
| names. It used to be that "foo/Makefile", "bar/baz/quux/Makefile" and |
| "Makefile" all landed in the same bucket due to their common basename, |
| "Makefile". However, now they land in "close" buckets. |
| |
| The algorithm allows not just for the _same_ bucket, but for _close_ |
| buckets to be considered delta candidates. The rationale is |
| essentially that files, like Makefiles, often have very similar |
| content no matter what directory they live in. |
| |
| <linus> I played around with different delta algorithms, and with |
| making the "delta window" bigger, but having too big of a |
| sliding window makes it very expensive to generate the pack: |
| you need to compare every object with a _ton_ of other objects. |
| |
| There are a number of other trivial heuristics too, which |
| basically boil down to "don't bother even trying to delta this |
| pair" if we can tell before-hand that the delta isn't worth it |
| (due to size differences, where we can take a previous delta |
| result into account to decide that "ok, no point in trying |
| that one, it will be worse"). |
| |
| End result: packing is actually very size efficient. It's |
| somewhat CPU-wasteful, but on the other hand, since you're |
| really only supposed to do it maybe once a month (and you can |
| do it during the night), nobody really seems to care. |
| |
| Nice Engineering Touch, there. Find when it doesn't matter, and |
| proclaim it a non-issue. Good style too! |
| |
| <njs`> So, just to repeat to see if I'm following, we start by |
| getting a list of the objects we want to pack, we sort it by |
| this heuristic (basically lexicographically on the tuple |
| (type, basename, size)). |
| |
| Then we walk through this list, and calculate a delta of |
| each object against the last n (tunable paramater) objects, |
| and pick the smallest of these deltas. |
| |
| Vastly simplified, but the essence is there! |
| |
| <linus> Correct. |
| |
| <njs`> And then once we have picked a delta or fulltext to |
| represent each object, we re-sort by recency, and write them |
| out in that order. |
| |
| <linus> Yup. Some other small details: |
| |
| And of course there is the "Other Shoe" Factor too. |
| |
| <linus> - We limit the delta depth to another magic value (right |
| now both the window and delta depth magic values are just "10") |
| |
| <njs`> Hrm, my intuition is that you'd end up with really _bad_ IO |
| patterns, because the things you want are near by, but to |
| actually reconstruct them you may have to jump all over in |
| random ways. |
| |
| <linus> - When we write out a delta, and we haven't yet written |
| out the object it is a delta against, we write out the base |
| object first. And no, when we reconstruct them, we actually |
| get nice IO patterns, because: |
| - larger objects tend to be "more recent" (Linus' law: files grow) |
| - we actively try to generate deltas from a larger object to a |
| smaller one |
| - this means that the top-of-tree very seldom has deltas |
| (i.e. deltas in _practice_ are "backwards deltas") |
| |
| Again, we should reread that whole paragraph. Not just because |
| Linus has slipped Linus's Law in there on us, but because it is |
| important. Let's make sure we clarify some of the points here: |
| |
| <njs`> So the point is just that in practice, delta order and |
| recency order match each other quite well. |
| |
| <linus> Yes. There's another nice side to this (and yes, it was |
| designed that way ;): |
| - the reason we generate deltas against the larger object is |
| actually a big space saver too! |
| |
| <njs`> Hmm, but your last comment (if "we haven't yet written out |
| the object it is a delta against, we write out the base object |
| first"), seems like it would make these facts mostly |
| irrelevant because even if in practice you would not have to |
| wander around much, in fact you just brute-force say that in |
| the cases where you might have to wander, don't do that :-) |
| |
| <linus> Yes and no. Notice the rule: we only write out the base |
| object first if the delta against it was more recent. That |
| means that you can actually have deltas that refer to a base |
| object that is _not_ close to the delta object, but that only |
| happens when the delta is needed to generate an _old_ object. |
| |
| <linus> See? |
| |
| Yeah, no. I missed that on the first two or three readings myself. |
| |
| <linus> This keeps the front of the pack dense. The front of the |
| pack never contains data that isn't relevant to a "recent" |
| object. The size optimization comes from our use of xdelta |
| (but is true for many other delta algorithms): removing data |
| is cheaper (in size) than adding data. |
| |
| When you remove data, you only need to say "copy bytes n--m". |
| In contrast, in a delta that _adds_ data, you have to say "add |
| these bytes: 'actual data goes here'" |
| |
| *** njs` has quit: Read error: 104 (Connection reset by peer) |
| |
| <linus> Uhhuh. I hope I didn't blow njs` mind. |
| |
| *** njs` has joined channel #git |
| |
| <pasky> :) |
| |
| The silent observers are amused. Of course. |
| |
| And as if njs` was expected to be omniscient: |
| |
| <linus> njs - did you miss anything? |
| |
| OK, I'll spell it out. That's Geek Humor. If njs` was not actually |
| connected for a little bit there, how would he know if missed anything |
| while he was disconnected? He's a benevolent dictator with a sense of |
| humor! Well noted! |
| |
| <njs`> Stupid router. Or gremlins, or whatever. |
| |
| It's a cheap shot at Cisco. Take 'em when you can. |
| |
| <njs`> Yes and no. Notice the rule: we only write out the base |
| object first if the delta against it was more recent. |
| |
| I'm getting lost in all these orders, let me re-read :-) |
| So the write-out order is from most recent to least recent? |
| (Conceivably it could be the opposite way too, I'm not sure if |
| we've said) though my connection back at home is logging, so I |
| can just read what you said there :-) |
| |
| And for those of you paying attention, the Omniscient Trick has just |
| been detailed! |
| |
| <linus> Yes, we always write out most recent first |
| |
| For the other record: |
| |
| <pasky> njs`: http://pastebin.com/547965 |
| |
| The 'net never forgets, so that should be good until the end of time. |
| |
| <njs`> And, yeah, I got the part about deeper-in-history stuff |
| having worse IO characteristics, one sort of doesn't care. |
| |
| <linus> With the caveat that if the "most recent" needs an older |
| object to delta against (hey, shrinking sometimes does |
| happen), we write out the old object with the delta. |
| |
| <njs`> (if only it happened more...) |
| |
| <linus> Anyway, the pack-file could easily be denser still, but |
| because it's used both for streaming (the git protocol) and |
| for on-disk, it has a few pessimizations. |
| |
| Actually, it is a made-up word. But it is a made-up word being |
| used as setup for a later optimization, which is a real word: |
| |
| <linus> In particular, while the pack-file is then compressed, |
| it's compressed just one object at a time, so the actual |
| compression factor is less than it could be in theory. But it |
| means that it's all nice random-access with a simple index to |
| do "object name->location in packfile" translation. |
| |
| <njs`> I'm assuming the real win for delta-ing large->small is |
| more homogenous statistics for gzip to run over? |
| |
| (You have to put the bytes in one place or another, but |
| putting them in a larger blob wins on compression) |
| |
| Actually, what is the compression strategy -- each delta |
| individually gzipped, the whole file gzipped, somewhere in |
| between, no compression at all, ....? |
| |
| Right. |
| |
| Reality IRC sets in. For example: |
| |
| <pasky> I'll read the rest in the morning, I really have to go |
| sleep or there's no hope whatsoever for me at the today's |
| exam... g'nite all. |
| |
| Heh. |
| |
| <linus> pasky: g'nite |
| |
| <njs`> pasky: 'luck |
| |
| <linus> Right: large->small matters exactly because of compression |
| behaviour. If it was non-compressed, it probably wouldn't make |
| any difference. |
| |
| <njs`> yeah |
| |
| <linus> Anyway: I'm not even trying to claim that the pack-files |
| are perfect, but they do tend to have a nice balance of |
| density vs ease-of use. |
| |
| Gasp! OK, saved. That's a fair Engineering trade off. Close call! |
| In fact, Linus reflects on some Basic Engineering Fundamentals, |
| design options, etc. |
| |
| <linus> More importantly, they allow git to still _conceptually_ |
| never deal with deltas at all, and be a "whole object" store. |
| |
| Which has some problems (we discussed bad huge-file |
| behaviour on the git lists the other day), but it does mean |
| that the basic git concepts are really really simple and |
| straightforward. |
| |
| It's all been quite stable. |
| |
| Which I think is very much a result of having very simple |
| basic ideas, so that there's never any confusion about what's |
| going on. |
| |
| Bugs happen, but they are "simple" bugs. And bugs that |
| actually get some object store detail wrong are almost always |
| so obious that they never go anywhere. |
| |
| <njs`> Yeah. |
| |
| Nuff said. |
| |
| <linus> Anyway. I'm off for bed. It's not 6AM here, but I've got |
| three kids, and have to get up early in the morning to send |
| them off. I need my beauty sleep. |
| |
| <njs`> :-) |
| |
| <njs`> appreciate the infodump, I really was failing to find the |
| details on git packs :-) |
| |
| And now you know the rest of the story. |