On Wed, Jul 14, 2021 at 08:19:15PM +0200, Ævar Arnfjörð Bjarmason wrote:

> >> The reason why we want to avoid freshen the mtime of ".pack" file is
> >> to improve the reading speed of Git Servers.
> >
> > That's surprising behavior to me. Are you saying that calling utime(2)
> > causes the *page* cache to be invalidated and that most reads are
> > cache-misses lowering overall IOPS?
>
> I think you may be right narrowly, but wrong in this context :)
>
> I.e. my understanding of this problem is that they have some incremental
> backup job, e.g. rsync without --checksum (not that doing that would
> help, chicken & egg issue)..

Ah, thanks for explaining. That's helpful, and it changes my thinking.

Ideally, Sun would be able to use --checksum (if they are using rsync) or
some equivalent (if they are not). In other words, this seems like a
problem that Git shouldn't be bending over backwards for.

But if that isn't possible, then I find introducing a new file to redefine
the pack's mtime, just to accommodate a backup system that doesn't know
better, to be a poor justification for adding this complexity. Especially
since we agree that rsync-ing live Git repositories is a bad idea in the
first place ;).

If it were me, I would probably stop here and avoid pursuing this further.
But an OK middle ground might be a core.freshenPackfiles=<bool> setting to
indicate whether packs can be freshened, or whether the objects contained
within them should just be rewritten loose. Sun could then set this
configuration to "false", implying:

  - That they would have more random loose objects, leading to some
    redundant work by their backup system.

  - But they wouldn't have to resync their huge packfiles.

...and we wouldn't have to introduce any new formats/file types to do it.
To me, that seems like a net-positive outcome.

Thanks,
Taylor
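
P.S. For anyone following along, here is a minimal sketch of the behavior
under discussion (the pack name and paths are made up for illustration):
freshening only bumps a pack's mtime, so an mtime+size comparison like
rsync's default sees it as "changed" and re-transfers the whole file, while
a content comparison (rsync --checksum) would not.

```shell
# Sketch with hypothetical filenames: a freshened pack looks "changed"
# to an mtime-based sync even though its bytes are identical.
set -e
tmp=$(mktemp -d)
cd "$tmp"
mkdir repo backup

printf 'pack-data' > repo/pack-1234.pack   # stand-in for a real packfile
cp -p repo/pack-1234.pack backup/          # backup, preserving the mtime

sleep 1
touch repo/pack-1234.pack                  # what freshening does: mtime bump only

# mtime comparison (roughly what rsync without --checksum relies on):
if [ repo/pack-1234.pack -nt backup/pack-1234.pack ]; then
    echo "mtime check: would re-transfer"
fi

# content comparison (what rsync --checksum would conclude):
if cmp -s repo/pack-1234.pack backup/pack-1234.pack; then
    echo "checksum check: contents identical, skip"
fi
```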