On Mon, Jan 22, 2018 at 07:47:10PM -0500, Jeff King wrote:
>
> I think Ævar is talking about the case of:
>
>   1. You make 100 objects that aren't referenced. They're loose.
>
>   2. You run git-gc. They're still too recent to be deleted.
>
> Right now those recent loose objects sit loose, and have zero cost at
> the time of gc. In a "cruft pack" world, you'd pay some I/O to copy
> them into the cruft pack, and some CPU to zlib and delta-compress them.
> I think that's probably fine, though.

I wasn't assuming that git-gc would create a cruft pack, although I
guess it could. As you say, recent loose objects have close to zero
cost at the time of gc. To the extent that the gc has to read lots of
loose files, there may be more seeks in the cold cache case, so there
is actually *some* cost to having the loose objects, but it's not
large.

What I was thinking about instead is that in cases where we know we
are likely to be creating a large number of loose objects (whether
they are referenced or not), in a world where we will be calling
fsync(2) after every single loose object is created, pack files start
looking *way* more efficient. So in general, if you know you will be
creating N loose objects, where N is probably around 50 or so, you'll
want to create a pack instead.

One of those cases is "repack -A", and in that case the loose objects
are all going to be unreferenced, so it would be a "cruft pack". But
there are many other cases, such as importing from another DVCS, where
doing an fsync(2) after every loose object creation is going to get
extremely slow and painful. (I have sometimes seen an import create
them *all* loose, and not use a pack at all.)

> So if we pack all the loose objects into a cruft pack, the mtime of the
> cruft pack becomes the new gauge for "recent". And if we migrate objects
> from old cruft pack to new cruft pack at each gc, then they'll keep
> getting their mtimes refreshed, and we'll never drop them.

Well, I was assuming that gc would be a special case which doesn't
touch the mtime of the old cruft pack. (Or more generally, any time an
object gets copied out of the cruft pack, either to a loose object or
to another pack, the mtime on the source pack should not be touched.)

					- Ted
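
For illustration only, here is a rough Python sketch of the fsync(2)
arithmetic discussed above. It is not git code and does not use git's
object or pack formats: the file names, the length-prefixed container,
and the helper names (write_loose, write_packed, demo) are all
hypothetical. It only counts how many fsync calls each strategy needs
for N objects.

```python
import os
import tempfile


def write_loose(directory, blobs):
    """Write each blob to its own file and fsync every one.

    Returns the number of fsync calls issued (one per object).
    """
    syncs = 0
    for i, blob in enumerate(blobs):
        path = os.path.join(directory, f"obj-{i:04d}")  # hypothetical name
        with open(path, "wb") as f:
            f.write(blob)
            f.flush()
            os.fsync(f.fileno())
        syncs += 1
    return syncs


def write_packed(directory, blobs):
    """Append all blobs to a single file and fsync once at the end.

    The 4-byte length prefix is a stand-in container format, NOT the
    real git pack format.
    """
    path = os.path.join(directory, "objects.pack-sketch")  # hypothetical
    with open(path, "wb") as f:
        for blob in blobs:
            f.write(len(blob).to_bytes(4, "big"))
            f.write(blob)
        f.flush()
        os.fsync(f.fileno())
    return 1


def demo(n=50):
    """Return (loose fsyncs, packed fsyncs, loose files, pack files)."""
    blobs = [f"object {i}\n".encode() for i in range(n)]
    with tempfile.TemporaryDirectory() as loose_dir, \
            tempfile.TemporaryDirectory() as pack_dir:
        loose_syncs = write_loose(loose_dir, blobs)
        packed_syncs = write_packed(pack_dir, blobs)
        loose_files = len(os.listdir(loose_dir))
        pack_files = len(os.listdir(pack_dir))
    return loose_syncs, packed_syncs, loose_files, pack_files


if __name__ == "__main__":
    print(demo(50))
```

With N = 50 the loose strategy issues 50 fsync calls against 50 files,
while the single-file strategy issues one. How much wall-clock time that
saves depends on the filesystem and storage, which this sketch does not
measure.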