Re: [PATCH v2 2/8] revision: mark commit parents as NOT_USER_GIVEN

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Apr 06, 2021 at 01:30:57PM -0400, Jeff King wrote:
> On Mon, Mar 15, 2021 at 02:14:36PM +0100, Patrick Steinhardt wrote:
> 
> > The NOT_USER_GIVEN flag of an object marks whether a flag was explicitly
> > provided by the user or not. The most important use case for this is
> > when filtering objects: only objects that were not explicitly requested
> > will get filtered.
> > 
> > The flag is currently only set for blobs and trees, which has been fine
> > given that there are no filters for tags or commits currently. We're
> > about to extend filtering capabilities to add object type filter though,
> > which requires us to set up the NOT_USER_GIVEN flag correctly -- if it's
> > not set, the object wouldn't get filtered at all.
> > 
> > Mark unseen commit parents as NOT_USER_GIVEN when processing parents.
> > Like this, explicitly provided parents stay user-given and thus
> > unfiltered, while parents which get loaded as part of the graph walk
> > can be filtered.
> > 
> > This commit shouldn't have any user-visible impact yet as there is no
> > logic to filter commits yet.
> 
> I'm still scratching my head a bit to understand how NOT_USER_GIVEN can
> possibly be correct (as opposed to USER_GIVEN). If we visit the commit
> in a not-user-given context and add the flag, how do we know it wasn't
> _also_ visited in a user-given context?
> 
> Just guessing, but perhaps the SEEN flag is saving us here? If we visit
> the user-given commit itself first, then we give it the SEEN flag. Then
> if we try to visit it again via parent traversal, we've already
> processed it and don't add the NOT_USER_GIVEN flag here.

Yes, I think that's mostly it.

> That seems the opposite of the order we'd usually traverse, but I think
> we set SEEN on each commit in prepare_revision_walk(), before we do any
> traversing.
> 
> So I _think_ it all works even with your changes here, but I have to say
> this NOT_USER_GIVEN thing seems really fragile to me. Not new in your
> series, of course, but something we may want to look at.
> 
> Just grepping around, "rev-list -g" will happily remove SEEN flags, so I
> suspect it interacts badly with --filter. Just trying "rev-list -g
> --objects --filter=object:type=blob HEAD" shows that it produces quite a
> lot of commits (which I think is a more fundamental problem: it is not
> walking the parent chain at all to assign these NOT_USER_GIVEN flags).

I totally agree that this feels fragile, and developing this series with
NOT_USER_GIVEN wasn't the most enjoyable experience either. I wouldn't
love to be doing the conversion back to USER_GIVEN as part of this
series, but I wouldn't oppose doing that job either. Right now I don't
feel like I'm sufficiently sure that it's working for all cases, and
indeed your example with "rev-list -g" already shows one case where it's
breaking.

So let me know whether I should add the conversion as preparatory step.

Patrick

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux