Re: [PATCH v2 0/4] [RFC] repack: add --filter=

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Taylor,

On 21 Feb 2022, at 16:42, Taylor Blau wrote:

> On Mon, Feb 21, 2022 at 10:10:15PM +0100, Christian Couder wrote:
>>>> Also, to have more protection we can either
>>>>
>>>> 1. add a config value that needs to be set to true for repack to remove
>>>> objects (repack.allowDestroyFilter).
>>
>> I don't think it's of much value. We don't have such config values for
>> other possibly destructive operations.
>>
>>>> 2. --filter is dry-run by default and prints out objects that would have been removed,
>>>> and it has to be combined with another flag --destroy in order for it to actually remove
>>>> objects from the odb.
>>
>> I am not sure it's of much value either compared to naming it
>> --filter-destroy. It's likely to just make things more difficult for
>> users to understand.
>
> On this and the above, I agree with Christian.
>
>>> I share the same concern as Robert and Stolee do. But I think this issue
>>> goes deeper than just naming.
>>>
>>> Even if we called this `git repack --delete-filter` and only ran it with
>>> `--i-know-what-im-doing` flag, we would still be leaving repository
>>> corruption on the table, just making it marginally more difficult to
>>> achieve.
>>
>> My opinion on this is that the promisor object mechanism assumes by
>> design that some objects are outside a repo, and that this repo
>> shouldn't care much about these objects possibly being corrupted.
>
> For what it's worth, I am fine having a mode of repack which allows us
> to remove objects that we know are stored by a promisor remote. But this
> series doesn't do that, so users could easily run `git repack -d
> --filter=...` and find that they have irrecoverably corrupted their
> repository.
>
> I think that there are some other reasonable directions, though. One
> which Robert and I discussed was making it possible to split a
> repository into two packs, one which holds objects that match some
> `--filter` criteria, and one which holds the objects that don't match
> that filter.
>
> Another option would be to prune the repository according to objects
> that are already made available by a promisor remote.

Thanks for the discussion around the two packfile idea. Definitely interesting.
However, I'm leaning towards the second option here where we ensure that objects that
are about to be deleted can be retrieved via a promisor remote. That way we have an
easy path to recovery.

>
> An appealing quality about the above two directions is that the first
> doesn't actually remove any objects, just makes it easier to push a
> whole pack of unwanted objects off to a promsior remote. The second
> prunes the repository according to objects that are already made
> available by the promisor remote. (Yes, there is a TOCTOU race there,
> too, but it's the same prune-while-pushing race that Git already has
> today).
>
>> I am not against a name and some docs that strongly state that users
>> should be very careful when using such a command, but otherwise I
>> think such a command is perfectly ok. We have other commands that by
>> design could lead to some objects or data being lost.
>
> I can think of a handful of ways to remove objects which are unreachable
> from a repository, but I am not sure we have any ways to remove objects
> which are reachable.
>
>>> But as it stands right now, I worry that this feature is too easily
>>> misused and could result in unintended repository corruption.
>>
>> Are you worrying about the UI or about what it does?
>>
>> I am ok with improving the UI, but I think what it does is reasonable.
>
> I am more worried about the proposal's functionality than its UI,
> hopefully my concerns there are summarized above.
>
> Thanks,
> Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux