Re: [PATCH v3] packfile: freshen the mtime of packfile by configuration

> On Jul 20, 2021, at 14:19, Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote:
> 
> 
> On Mon, Jul 19 2021, Taylor Blau wrote:
> 
>> On Mon, Jul 19, 2021 at 07:53:19PM +0000, Sun Chao via GitGitGadget wrote:
>>> From: Sun Chao <16657101987@xxxxxxx>
>>> 
>>> Commit 33d4221c79 (write_sha1_file: freshen existing objects,
>>> 2014-10-15) avoids writing existing objects by freshening their
>>> mtimes (especially those of the packfiles that contain them) in
>>> order to aid correct caching, so that processes like find_lru_pack
>>> can make good decisions. However, this is unfriendly to
>>> incremental backup jobs or services that rely on a cached file
>>> system when large '.pack' files exist.
>>> 
>>> For example, after packing all objects, if we use 'write-tree' to
>>> create the same commit with the same tree and the same environment
>>> variables such as GIT_COMMITTER_DATE and GIT_AUTHOR_DATE, we can
>>> see that the '.pack' file's mtime has changed. Git servers that
>>> mount the same NFS disk will re-sync the '.pack' files to the
>>> cached file system, which slows down git commands.
>>> 
>>> So add core.freshenPackfiles to indicate whether or not packs can
>>> be freshened; turning this option off can speed up the execution
>>> of some commands on servers that use an NFS disk instead of a
>>> local disk.
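(For reference, the freshening I describe can be reproduced with
something like the following; the blob content and the GNU stat
invocation are only for illustration:)

  blob=$(echo demo | git hash-object -w --stdin)         # a loose object
  echo $blob | git pack-objects .git/objects/pack/pack   # pack it
  git prune-packed                                       # drop the loose copy
  stat -c %Y .git/objects/pack/pack-*.pack               # note the pack mtime
  sleep 1
  echo demo | git hash-object -w --stdin                 # "write" the same object
  stat -c %Y .git/objects/pack/pack-*.pack               # mtime has been freshened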
>> 
>> Hmm. I'm still quite unconvinced that we should be taking this direction
>> without better motivation. We talked about your assumption that NFS
>> seems to be invalidating the block cache when updating the inodes that
>> point at those blocks, but I don't recall seeing further evidence.
> 
> I don't know about Sun's setup, but what he's describing is consistent
> with how NFS works, or can commonly be made to work.
> 
> See e.g. "lookupcache" in nfs(5) on Linux, but also a lot of people use
> some random vendor's proprietary NFS implementation, and commonly tweak
> various options that make it anywhere between "I guess that's not too
> crazy" and "are you kidding me?" levels of non-POSIX compliant.
> 
>> Regardless, a couple of idle thoughts:
>> 
>>> +	if (!core_freshen_packfiles)
>>> +		return 1;
>> 
>> It is important to still freshen the object mtimes even when we cannot
>> update the pack mtimes. That's why we return 0 when "freshen_file"
>> returned 0: even if there was an error calling utime, we should still
>> freshen the object. This is important because it impacts when
>> unreachable objects are pruned.
>> 
>> So I would have assumed that if a user set "core.freshenPackfiles=false"
>> that they would still want their object mtimes updated, in which case
>> the only option we have is to write those objects out loose.
>> 
>> ...and that happens by the caller of freshen_packed_object (either
>> write_object_file() or hash_object_file_literally()) then calling
>> write_loose_object() if freshen_packed_object() failed. So I would have
>> expected to see a "return 0" in the case that packfile freshening was
>> disabled.
>> 
>> But that leads us to an interesting problem: how many redundant objects
>> do we expect to see on the server? It may be a lot, in which case you
>> may end up having the same IO problems for a different reason. Peff
>> mentioned to me off-list that he suspected write-tree was overeager in
>> how many trees it would try to write out. I'm not sure.
> 
> In my experience with NFS the thing that kills you is anything that
> needs to do iterations, i.e. recursive readdir() and the like; for
> reading a lot of data the throughput was excellent. It's why I hacked
> up that core.checkCollisions patch.
I have read your patch; it looks like a good idea to reduce expensive operations
like readdir(). In my production environment, it is the IOPS pressure on the NFS
server that makes the git commands slow.

> 
> Jeff improved the situation I was mainly trying to fix with the
> loose objects cache. I never got around to benchmarking the two in
> production, and now that setup is at an ex-job...
> 
>>> +test_expect_success 'do not bother loosening old objects without freshen pack time' '
>>> +	obj1=$(echo three | git hash-object -w --stdin) &&
>>> +	obj2=$(echo four | git hash-object -w --stdin) &&
>>> +	pack1=$(echo $obj1 | git -c core.freshenPackFiles=false pack-objects .git/objects/pack/pack) &&
>>> +	pack2=$(echo $obj2 | git -c core.freshenPackFiles=false pack-objects .git/objects/pack/pack) &&
>>> +	git -c core.freshenPackFiles=false prune-packed &&
>>> +	git cat-file -p $obj1 &&
>>> +	git cat-file -p $obj2 &&
>>> +	test-tool chmtime =-86400 .git/objects/pack/pack-$pack2.pack &&
>>> +	git -c core.freshenPackFiles=false repack -A -d --unpack-unreachable=1.hour.ago &&
>>> +	git cat-file -p $obj1 &&
>>> +	test_must_fail git cat-file -p $obj2
>>> +'
>> 
>> I had a little bit of a hard time following this test. AFAICT, it
>> proceeds as follows:
>> 
>>  - Write two packs, each containing a unique unreachable blob.
>>  - Call 'git prune-packed' with packfile freshening disabled, then
>>    check that the object survived.
>>  - Then repack while in a state where one of the pack's contents would
>>    be pruned.
>>  - Make sure that one object survives and the other doesn't.
>> 
>> This doesn't really seem to be testing the behavior of disabling
>> packfile freshening so much as it's testing prune-packed, and repack's
>> `--unpack-unreachable` option. I would probably have expected to see
>> something more along the lines of:
>> 
>>  - Write an unreachable object, pack it, and then remove the loose copy
>>    you wrote in the first place.
>>  - Then roll the pack's mtime to some fixed value in the past.
>>  - Try to write the same object again with packfile freshening
>>    disabled, and verify that:
>>    - the pack's mtime was unchanged,
>>    - the object exists loose again
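That is clearer than what I wrote. A rough sketch of the test you
describe might look like this (assuming the "write it loose again"
semantics discussed above):

	test_expect_success 'disabled freshening keeps pack mtime, writes loose' '
		obj=$(echo five | git hash-object -w --stdin) &&
		pack=$(echo $obj | git pack-objects .git/objects/pack/pack) &&
		git prune-packed &&
		test-tool chmtime =-86400 .git/objects/pack/pack-$pack.pack &&
		test-tool chmtime --get .git/objects/pack/pack-$pack.pack >expect &&
		echo five | git -c core.freshenPackFiles=false hash-object -w --stdin &&
		test-tool chmtime --get .git/objects/pack/pack-$pack.pack >actual &&
		test_cmp expect actual &&
		test_path_is_file .git/objects/$(test_oid_to_path $obj)
	'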
>> 
>> But I'd really like to get some other opinions (especially from Peff,
>> who brought up the potential concerns with write-tree) as to whether or
>> not this is a direction worth pursuing.
>> 
>> My opinion is that it is not, and that the bizarre caching behavior you
>> are seeing is out of Git's control.
> 
> Thanks for this, I found the test hard to follow too, but didn't have
> time to really think about it, this makes sense.
> 
> Back to the topic: I share your sentiment of trying to avoid complexity
> in this area.
> 
> Sun: Have you considered --keep-unreachable to "git repack"? It's
> orthogonal to what you're trying here, but I wonder if being more
> aggressive about keeping objects, plus some improvements to perhaps
> skip this "touch" entirely when that is in effect, wouldn't be more
> viable & something e.g. Taylor would be more comfortable having as
> part of git.git.

Yes, I will try to create some more useful test cases with both `--keep-unreachable`
and `--unpack-unreachable` if I still believe I need to do something with
the mtime; either way I need to get my test reports first, thanks ;)
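
For the record, the two repack modes I plan to compare are roughly:

  # keep unreachable objects in the new pack, no loose objects written:
  git repack -a -d --keep-unreachable

  # eject recent unreachable objects as loose, as the test above does:
  git repack -A -d --unpack-unreachable=1.hour.ago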
