Re: RFC: Flat directory for notes, or fan-out? Both!

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Tue, 10 Feb 2009 18:35:52 -0800 (PST)

On Tue, 10 Feb 2009, Boyd Stephen Smith Jr. wrote:
> 
> Yes, this would require a custom merge strategy for notes to flatten -> merge 
> -> canonicalize.

That sounds unnecessarily complicated. It also really sucks for the case 
you want to optimize: small differences between trees, where you don't 
need to even linearize the common parts.

Why not make it just a straight fixed 12-bit prefix, single-level trie?

Sure, if you have less than 4k objects, it's going to add an unnecessary 
indirection, and close to an extra tree object for each object. But it 
should scale pretty well to a fairly huge number of notes. IOW, if you have 
less than 2^24 notes (16 million), you'll never have a tree object with 
more than 4k entries.
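
A rough sketch of the layout described above, not code from git itself.
It assumes the 12-bit prefix is taken as the first three hex digits of
the 40-character object name, and that the remaining 37 digits become
the entry name inside that subtree. The function names are hypothetical.
The 4k figure is an average: it assumes SHA1 names spread evenly over
the 4096 prefixes, so individual subtrees will land a bit above or
below it.

```python
import string

PREFIX_BITS = 12                 # fixed prefix size from the proposal
PREFIX_HEX = PREFIX_BITS // 4    # 12 bits = 3 hex digits
HEX_DIGITS = set(string.hexdigits)


def note_path(sha1_hex):
    """Map a 40-character hex object name to 'prefix/rest' (assumed layout)."""
    if len(sha1_hex) != 40 or not all(c in HEX_DIGITS for c in sha1_hex):
        raise ValueError("expected a 40-character hex SHA1")
    name = sha1_hex.lower()
    return name[:PREFIX_HEX] + "/" + name[PREFIX_HEX:]


def average_entries_per_subtree(num_notes):
    """Average subtree size if names are spread evenly over all prefixes."""
    return num_notes / (1 << PREFIX_BITS)


if __name__ == "__main__":
    print(note_path("0123456789abcdef0123456789abcdef01234567"))
    print(average_entries_per_subtree(1 << 24))   # 2^24 notes
```

With 2^24 notes the average comes out at 4096 entries per subtree, which
is where the "4k entries" bound comes from.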

And with each tree entry being ~70 bytes (40 bytes name, 20 bytes SHA1 + 
overhead), the individual tree objects will still be a reasonable(ish) 
size. And the fixed depth and prefix size means that merging is trivial 
and can use the normal tree merge that avoids touching common subtrees.
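
To illustrate why a fixed depth keeps merging cheap, here is a toy
model, not git's merge code. It assumes a notes tree is a dict mapping
each 3-hex-digit prefix to a dict of entry name to note id, and it uses
a simple SHA1 over the sorted entries as a stand-in for a real tree
object id. All names are hypothetical.

```python
import hashlib


def subtree_id(entries):
    """Stand-in for a tree object id: hash the entries in sorted order."""
    h = hashlib.sha1()
    for name in sorted(entries):
        h.update(name.encode() + b"\0" + entries[name].encode() + b"\n")
    return h.hexdigest()


def changed_prefixes(ours, theirs):
    """Return the prefixes whose subtrees differ between two notes trees.

    Subtrees with equal ids are skipped without looking at their entries,
    which is the point of keeping the prefix size and depth fixed.
    """
    changed = []
    for prefix in sorted(set(ours) | set(theirs)):
        a = ours.get(prefix)
        b = theirs.get(prefix)
        if a is None or b is None or subtree_id(a) != subtree_id(b):
            changed.append(prefix)
    return changed


if __name__ == "__main__":
    ours = {"abc": {"1" * 37: "note-1"}, "def": {"2" * 37: "note-2"}}
    theirs = {"abc": {"1" * 37: "note-1"}, "def": {"2" * 37: "note-2b"}}
    print(changed_prefixes(ours, theirs))   # only the 'def' subtree differs
```

Because both sides always split at the same place, a note changed on
one side shows up as exactly one differing subtree, and the rest of the
top-level tree compares equal by id.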

The default .git/objects fan-out of just 8 bits might work too, but if 
we're thinking millions of notes (which is not entirely unreasonable), it 
gets ugly pretty fast. The reason it works ok for git is the repacking.
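
A back-of-the-envelope comparison of the two fan-outs at 2^24 notes,
as a sketch under stated assumptions. It assumes each entry is stored
in git's tree format (ASCII mode, space, name, NUL, 20-byte binary
SHA1), that every note is a regular "100644" blob entry, that names
are spread evenly over the prefixes, and it counts uncompressed bytes
only. The function names are hypothetical.

```python
MODE_LEN = len("100644")   # ASCII mode of a regular blob entry
SHA1_HEX_LEN = 40          # full hex object name
SHA1_RAW_LEN = 20          # binary SHA1 stored in the tree entry
TOTAL_NOTES = 1 << 24      # 16 million notes


def entry_bytes(prefix_bits):
    """Bytes per tree entry: mode + ' ' + name + NUL + raw SHA1."""
    name_len = SHA1_HEX_LEN - prefix_bits // 4
    return MODE_LEN + 1 + name_len + 1 + SHA1_RAW_LEN


def subtree_bytes(prefix_bits, total_notes=TOTAL_NOTES):
    """Uncompressed size of one average subtree for a given fan-out."""
    entries = total_notes // (1 << prefix_bits)
    return entries * entry_bytes(prefix_bits)


if __name__ == "__main__":
    for bits in (8, 12):
        print(bits, "bit fan-out:", subtree_bytes(bits), "bytes per subtree")
```

Under these assumptions the 12-bit fan-out gives 4096 entries of 65
bytes, about 260 KiB per subtree, while the 8-bit fan-out gives 65536
entries of 66 bytes, a bit over 4 MiB per subtree, and every note added
rewrites one such tree.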

			Linus