On Thu, Feb 13, 2025 at 10:27:39AM +0100, Christian Couder wrote:
> On Thu, Feb 13, 2025 at 8:13 AM Patrick Steinhardt <ps@xxxxxx> wrote:
>
> > We end up with two tables: the first one has been created when cloning
> > the repository and contains all references. The second one has been
> > created when deleting all references, so it only contains ref deletions.
> > Because deletions don't have to carry an object ID, the resulting table
> > is also much smaller. This has the effect that auto-compaction does not
> > kick in, because we see that the geometric sequence is still intact.
>
> Not that I think we should work on this right now, but theoretically,
> could we "just" count the number of entries in each file and base the
> geometric sequence on the number of entries in each file instead of
> file size?

In theory we could, and that may indeed lead to better results in edge
cases like these. If either the header or the footer of a reftable
contained a total count of the records it holds, that might have been a
viable thing to do. But they don't, so we'd have to open and parse every
reftable in full to get those counts.

Because of that I think the cost would ultimately outweigh the benefit.
After all, this logic runs on every write to determine whether we need
to auto-compact, so it needs to remain cheap.

Patrick
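
To make the edge case discussed above concrete, here is a rough sketch of
a geometric-sequence check. It is a simplified illustration, not the actual
reftable code: the helper name `is_geometric`, the factor of 2, and all of
the numbers are assumptions made up for this example. It only shows how the
same two tables can look "intact" when measured in bytes but not when
measured in entries.

```python
def is_geometric(sizes, factor=2):
    """Check a simplified geometric-sequence property.

    `sizes` is ordered from the oldest table to the newest one. The
    sequence counts as intact when every table is at least `factor`
    times as large as the next newer table. (Hypothetical helper; the
    real heuristic may differ in its details.)
    """
    return all(older >= factor * newer
               for older, newer in zip(sizes, sizes[1:]))


# Made-up numbers for the scenario in the thread: one table written by
# the clone, one table that only holds deletions of the same refs.
byte_sizes = [400_000, 60_000]    # deletions carry no object ID, so fewer bytes
entry_counts = [10_000, 10_000]   # but the same number of records

print(is_geometric(byte_sizes))    # True: no compaction would be triggered
print(is_geometric(entry_counts))  # False: compaction would be triggered
```

The catch is the one raised in the reply: the byte sizes can be obtained
without reading table contents, whereas the entry counts would require
parsing each table on every write.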