Re: [PATCH 03/13] list-objects: filter objects in traverse_commit_list

On Wed, 27 Sep 2017 13:04:42 -0400
Jeff Hostetler <git@xxxxxxxxxxxxxxxxx> wrote:

> The sparse filter is looking at pathnames and using the same rules
> as sparse-checkout to decide which to *include* in the result.  This
> is essentially backwards from the other filters which are looking for
> reasons to *exclude* a blob.  If I see a {pathname, sha} pair and the
> pathname is not wanted (by the sparse-checkout rules), I still don't
> know anything about the object -- since the same SHA may appear later
> in the treewalk but with a different pathname that *does* match the
> patterns and *is* wanted.
> 
> The net-net is that I have to mark the blob as "provisionally omitted"
> until I've seen all the pathnames associated with it.  Only then can I
> say it should be omitted.

How is this different from refraining from marking the blob as
LOFR_MARK_SEEN? When you would provisionally omit the blob, return
LOFR_ZERO so that a future iteration will revisit the blob, and
when you would include it in the output, return
LOFR_MARK_SEEN|LOFR_SHOW.
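
To make that concrete, here is a rough sketch of what I have in mind.
It is not taken from the patch: the enum values, struct blob,
path_is_wanted() and visit_blob() are all made up for illustration,
and the "src/" prefix test merely stands in for the real
sparse-checkout pattern match.

```c
#include <string.h>

/* Hypothetical values; the patch series defines the real ones. */
enum list_objects_filter_result {
	LOFR_ZERO      = 0,
	LOFR_MARK_SEEN = 1 << 0,
	LOFR_SHOW      = 1 << 1,
};

/* Toy stand-in for a blob object; only tracks the "seen" bit. */
struct blob {
	int seen;
};

/* Assumption: a prefix test stands in for the sparse-checkout rules. */
static int path_is_wanted(const char *path)
{
	return strncmp(path, "src/", 4) == 0;
}

/*
 * An unwanted path says nothing final about the blob, so return
 * LOFR_ZERO and leave it unmarked. A wanted path shows the blob and
 * marks it seen so that it is not visited again.
 */
static enum list_objects_filter_result filter_blob(const char *path)
{
	if (!path_is_wanted(path))
		return LOFR_ZERO;
	return LOFR_MARK_SEEN | LOFR_SHOW;
}

/* Toy version of the traversal step; returns 1 if the blob is shown. */
static int visit_blob(struct blob *b, const char *path)
{
	enum list_objects_filter_result r;

	if (b->seen)
		return 0;
	r = filter_blob(path);
	if (r & LOFR_MARK_SEEN)
		b->seen = 1;
	return (r & LOFR_SHOW) != 0;
}
```

With this, a blob first reached through an unwanted path stays
unmarked, and is shown exactly once when a wanted path reaches it
later.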

> Likewise, there are things about the tree object that we cannot
> decide until we've seen all possible directory paths that reference it.
> For example, if you rename a tree/directory between 2 commits, but make no
> other changes within the directory, it will/should have the same SHA in the
> second commit.  If one of those paths is included in the sparse-checkout
> and one is not, then you need to include those blobs (and the tree object)
> in the result.  If the treewalk visits the excluded case first, you don't
> want to discard the tree object (and shortcut future treewalks) because
> the filter won't get a chance to see the included directory path case.

For trees, I guess it's slightly different in that you do need an extra
flag to keep track of whether the tree has been shown. So mark SHOWN and
return LOFR_SHOW on the first time the tree is shown, and LOFR_ZERO
otherwise. And trees must never be marked as LOFR_MARK_SEEN.

(This SHOWN flag might play a similar role to your FILTER_REVISIT.)
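
A rough sketch of the tree case, under the same caveats: the SHOWN
bit, struct tree and filter_tree() are hypothetical names, and the
enum values are placeholders rather than the ones in the patch.

```c
/* Hypothetical values; the patch series defines the real ones. */
enum list_objects_filter_result {
	LOFR_ZERO      = 0,
	LOFR_MARK_SEEN = 1 << 0,
	LOFR_SHOW      = 1 << 1,
};

/* Hypothetical per-object flag bit, separate from the usual SEEN bit. */
#define SHOWN (1u << 0)

/* Toy stand-in for a tree object; only carries the flag word. */
struct tree {
	unsigned flags;
};

/*
 * Show the tree the first time it is visited and remember that with
 * SHOWN. Never return LOFR_MARK_SEEN, so the traversal keeps
 * descending into the tree when it is reached through another path.
 */
static enum list_objects_filter_result filter_tree(struct tree *t)
{
	if (t->flags & SHOWN)
		return LOFR_ZERO;
	t->flags |= SHOWN;
	return LOFR_SHOW;
}
```

The tree is emitted once, but because it is never marked seen, the
blobs below it are still offered to the filter under every path that
leads to them.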

So far, it seems to me that the _END event is not required.

> Also, the current code does not attempt to omit tree objects, but a
> future version may.  And having the _BEGIN_ and _END_ events means the
> filter can keep track of the nesting without the expense of having to
> discover it by parsing the pathname looking for slashes as we do elsewhere.

A feature that omits tree objects would need _END, true. But I
personally don't think we should add such infrastructure until we have
a feature that needs it.
