Re: [PATCH v2] packfile: freshen the mtime of packfile by configuration

> On Jul 14, 2021, at 10:52, Taylor Blau <me@xxxxxxxxxxxx> wrote:
> 
> On Wed, Jul 14, 2021 at 03:39:18AM +0200, Ævar Arnfjörð Bjarmason wrote:
>> Hrm, per my v1 feedback (and I'm not sure if my suggestion is even good
>> here, there's others more familiar with this area than I am), I was
>> thinking of something like a *.bump file written via:
>> 
>>    core.packUseBumpFiles=bool
>> 
>> Or something like that, anyway, the edge case in allowing the user to
>> pick arbitrary suffixes is that we'd get in-the-wild user arbitrary
>> configuration squatting on a relatively sensitive part of the object
>> store.
>> 
>> E.g. we recently added *.rev files to go with
>> *.{pack,idx,bitmap,keep,promisor} (and I'm probably forgetting some
>> suffix). What if before that a user had set:
>> 
>>    core.packMtimeSuffix=rev
> 
> I think making the suffix configurable is probably a mistake. It seems
> like an unnecessary detail to expose, but it also forces us to think
> about cases like these where the configured suffix is already used for
> some other purpose.

Thanks, I agree with you and will fix it. As with the *.keep file, where
we do not use a configurable suffix to create keep files, the suffix
will be fixed.

> 
> I don't think that a new ".bump" file is a bad idea, but it does seem
> like we have a lot of files that represent a relatively little amount of
> the state that a pack can be in. The ".promisor" and ".keep" files both
> come to mind here. Some thoughts in this direction:
> 
>  - Combining *all* of the pack-related files (including the index,
>    reverse-index, bitmap, and so on) into a single "pack-meta" file
>    seems like a mistake for caching reasons.
> 
>  - But a meta file that contains just the small state (like promisor
>    information and whether or not the pack is "kept") seems like it
>    could be OK. On the other hand, being able to tweak the kept state
>    by touching or deleting a file is convenient (and having to rewrite
>    a meta file containing other information is much less so).

Yes, reading and rewriting a meta file also means we need to lock and
unlock it, which makes these operations inconvenient.

> 
> But a ".bump" file does seem like an awkward way to not rely on the
> mtime of the pack itself. And I do think it runs into compatibility
> issues like Ævar mentioned. Any version of Git that includes a
> hypothetical .bump file (or something like it) needs to also update the
> pack's mtime, too, so that old versions of Git can understand it. (Of
> course, that could be configurable, but that seems far too obscure to
> me).

Here we will have a configuration option whose default keeps backward
compatibility. If a user decides to use the '.bump' file, that repository
can no longer be used with old versions of Git. This is a restriction
similar to the one that comes with the Repository Format Version.
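To make the trade-off concrete, here is a rough Python sketch of the
opt-in behaviour discussed above. It is not the patch and not Git code:
the helpers freshen_pack(), pack_freshness(), and bump_path() and the
use_bump_file flag are hypothetical names chosen for illustration, and
the '.bump' suffix is the one proposed in this thread.

```python
import os
import tempfile


def bump_path(pack_path):
    """Return the hypothetical sibling '.bump' path for a '.pack' file."""
    root, _ = os.path.splitext(pack_path)
    return root + ".bump"


def freshen_pack(pack_path, use_bump_file=False, when=None):
    """Freshen a pack; return the path whose mtime was updated.

    Default (use_bump_file=False): update the pack's own mtime, which is
    what old versions of Git look at.
    Opt-in (use_bump_file=True): leave the pack alone and update a small
    '.bump' file next to it instead.
    """
    target = pack_path
    if use_bump_file:
        target = bump_path(pack_path)
        # Create the bump file if it does not exist yet.
        with open(target, "ab"):
            pass
    times = None if when is None else (when, when)  # None means "now"
    os.utime(target, times)
    return target


def pack_freshness(pack_path, use_bump_file=False):
    """Return the mtime a reader would treat as the pack's freshness."""
    bump = bump_path(pack_path)
    if use_bump_file and os.path.exists(bump):
        return os.stat(bump).st_mtime
    return os.stat(pack_path).st_mtime


with tempfile.TemporaryDirectory() as tmp:
    pack = os.path.join(tmp, "pack-example.pack")
    with open(pack, "wb") as f:
        f.write(b"PACK")
    old = 1_000_000_000
    os.utime(pack, (old, old))  # pretend the pack is old

    touched = os.path.basename(
        freshen_pack(pack, use_bump_file=True, when=old + 60))
    pack_mtime = int(os.stat(pack).st_mtime)
    fresh = int(pack_freshness(pack, use_bump_file=True))
    legacy_view = int(pack_freshness(pack, use_bump_file=False))

print("touched:", touched)
print("pack mtime:", pack_mtime)
print("freshness with bump file:", fresh)
print("freshness seen without bump support:", legacy_view)
```

The last value is the compatibility concern raised in this thread: a
reader that does not know about the '.bump' file still sees the old
pack mtime.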

> 
> Stepping back, I'm not sure I understand why freshening a pack is so
> slow for you. freshen_file() just calls utime(2), and any sync back to
> the disk shouldn't need to update the pack itself, just a couple of
> fields in its inode. Maybe you could help explain further.
> 
> In any case, I couldn't find a spot in your patch that updates the
> packed_git's 'mtime' field, which is used to (a) sort packs in the
> linked list of packs, and (b) for determining the least-recently used
> pack if it has individual windows mmapped.

We want to avoid freshening the mtime of the ".pack" file in order to
improve read performance on our Git servers.

Some repositories on our Git servers are large (some are bigger than 10GB),
and we created '.keep' files for their large ".pack" files. We want these
big files to stay unchanged to speed up git upload-pack, because we expect
the file system cache to reduce disk I/O for a file that does not change.

However, we find that the mtime of the ".pack" files changes over time, which
makes the file system reload the big files again and again. That takes a lot
of I/O time, slows down git upload-pack, and can even exhaust the disk's IOPS.
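
For reference, a minimal Python sketch of what freshening does at the
file level, using os.utime() as a stand-in for the utime(2) call that
freshen_file() makes. The helper name freshen() and the file name are
made up for illustration; this is not Git code. It only shows that the
timestamp moves while the data stays the same. It says nothing about
how a particular file system or cache reacts to that change.

```python
import os
import tempfile
import time


def freshen(path, when=None):
    """Set atime and mtime of `path` to `when` (default: now).

    The file's contents are not opened or rewritten.
    """
    if when is None:
        when = time.time()
    os.utime(path, (when, when))


with tempfile.TemporaryDirectory() as tmp:
    pack = os.path.join(tmp, "pack-example.pack")
    payload = b"PACK" + b"\0" * 1024
    with open(pack, "wb") as f:
        f.write(payload)

    old = 1_000_000_000
    os.utime(pack, (old, old))  # pretend the pack is old
    before = os.stat(pack)

    freshen(pack, when=old + 3600)
    after = os.stat(pack)
    with open(pack, "rb") as f:
        data_after = f.read()

print("mtime changed:", after.st_mtime != before.st_mtime)
print("size unchanged:", after.st_size == before.st_size)
print("data unchanged:", data_after == payload)
```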

> 
> Thanks,
> Taylor
> 





