Re: [PATCH] packfile: enhance the mtime of packfile by idx file

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Mon, 12 Jul 2021 01:44:15 +0200

On Sat, Jul 10 2021, Sun Chao via GitGitGadget wrote:

> From: Sun Chao <16657101987@xxxxxxx>
>
> Commit 33d4221c79 (write_sha1_file: freshen existing objects,
> 2014-10-15) avoids writing existing objects by freshening their
> mtime (especially that of the packfiles containing them) in order
> to aid correct caching, and so that processes like find_lru_pack
> can make good decisions. However, this is unfriendly to
> incremental backup jobs or services that rely on the file system
> cache when large '.pack' files exist.
>
> For example, after packing all objects, use 'write-tree' to
> create the same commit with the same tree and the same environment
> variables, such as GIT_COMMITTER_DATE and GIT_AUTHOR_DATE; we can
> then see that the '.pack' file's mtime changed, but the '.idx'
> file's did not.
>
> So if we update the mtime of a packfile by touching the '.idx'
> file instead of the '.pack' file, then when we check the mtime
> of the packfile we can get it from the '.idx' file instead. A large
> git repository may contain large '.pack' files, but the '.idx'
> files are much smaller, so this avoids making the file system
> cache reload the large files and speeds up git commands.
>
> Signed-off-by: Sun Chao <16657101987@xxxxxxx>

Does this have the unstated trade-off that in a mixed-version
environment (say two git versions coordinating writes to an NFS share)
where one is old and thinks *.pack needs updating, and the other is new
and thinks *.idx is what should be checked, until both are upgraded
we're effectively back to pre-33d4221c79?

I don't think it's a dealbreaker, just wondering if I've got that right
& if it's a trade-off you thought about. Maybe we should check the
mtime of both. The stat() is cheap, it's the re-sync that matters for
you.
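
To make the "check the mtime of both" idea concrete, here is a rough
sketch. It is Python rather than git's C, and the helper name
pack_freshness_mtime is made up for illustration; nothing like it exists
in git. It just takes the newer of the two timestamps, so a pack
freshened by either an old or a new git still looks fresh:

```python
import os

def pack_freshness_mtime(pack_path):
    """Return the newer mtime of a *.pack and its sibling *.idx.

    Hypothetical helper: an old git would have touched the *.pack,
    a patched git the *.idx, so we consult both and take the maximum.
    """
    base, _ext = os.path.splitext(pack_path)
    mtimes = []
    for path in (base + ".pack", base + ".idx"):
        try:
            mtimes.append(os.stat(path).st_mtime)
        except FileNotFoundError:
            # One of the pair may be missing; use whatever is there.
            pass
    if not mtimes:
        raise FileNotFoundError(pack_path)
    return max(mtimes)
```

That costs one extra stat() per pack, and never reads the pack contents.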

But just to run with that thought, wouldn't it be even more helpful to
you to have say a config setting to create a *.bump file next to the
*.{idx,pack}?

Then you'd have an empty file (the *.idx is smaller, but still not
empty), and as a patch it seems relatively simple, i.e. some core.* or
gc.* or pack.* setting changing what we touch/stat().
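
A rough sketch of what I mean, again in Python rather than git's C. The
use_bump flag stands in for whatever config setting we'd pick (no such
setting exists today), and the function names are hypothetical:

```python
import os

def freshen_pack(pack_path, use_bump=False):
    """Freshen a pack by touching either the *.pack or an empty *.bump.

    use_bump is a stand-in for a hypothetical config setting.
    Returns the path that was touched.
    """
    target = pack_path
    if use_bump:
        target = os.path.splitext(pack_path)[0] + ".bump"
        # Create the empty marker file if it does not exist yet.
        with open(target, "a"):
            pass
    os.utime(target, None)  # set atime and mtime to the current time
    return target

def pack_mtime(pack_path, use_bump=False):
    """Return the freshness mtime, preferring *.bump when enabled."""
    if use_bump:
        bump = os.path.splitext(pack_path)[0] + ".bump"
        try:
            return os.stat(bump).st_mtime
        except FileNotFoundError:
            pass  # never bumped: fall back to the *.pack itself
    return os.stat(pack_path).st_mtime
```

With the setting on, neither the *.pack nor the *.idx is ever modified,
so a backup job would only ever see the zero-byte *.bump change.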