Re: [PATCH 1/3] list-objects: add filter_blob to traverse_commit_list

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 22 Jun 2017 14:45:26 -0700
Jonathan Tan <jonathantanmy@xxxxxxxxxx> wrote:

> On Thu, 22 Jun 2017 20:36:13 +0000
> Jeff Hostetler <git@xxxxxxxxxxxxxxxxx> wrote:
> 
> > From: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx>
> > 
> > In preparation for partial/sparse clone/fetch where the
> > server is allowed to omit large/all blobs from the packfile,
> > teach traverse_commit_list() to take a blob filter-proc that
> > controls when blobs are shown and marked as SEEN.
> > 
> > Normally, traverse_commit_list() always marks visited blobs
> > as SEEN, so that the show_object() callback will never see
> > duplicates.  Since a single blob OID may be referenced by
> > multiple pathnames, we may not be able to decide if a blob
> > should be excluded until later pathnames have been traversed.
> > With the filter-proc, the automatic deduping is disabled.
> 
> Comparing this approach (this patch and the next one) against mine [1],
> I see that this has the advantage of (in pack-objects) avoiding the
> invocation of add_preferred_base_object() on blobs that are filtered
> out, avoiding polluting the "to_pack" data structure with information
> that we do not need.
> 
> But it does add an extra place where blobs are filtered out (besides
> add_object_entry()), which makes it harder for the reader to keep track
> of what's going on. I took a brief look to see if filtering could be
> eliminated from add_object_entry(), but that function is called from
> many places, so I couldn't tell.
> 
> Anyway, I think both approaches will work (this patch's and mine [1]).
> 
> [1] https://public-inbox.org/git/6f7934621717141ce3bb6bc05cf1d59c7900ccc5.1496432147.git.jonathantanmy@xxxxxxxxxx/

Also it should be mentioned somewhere that this does not cover the
bitmap case - a similar discussion should be included in one of the
patches, like I did in [1].

And looking back at my original cover letter [2], I wrote:

> This is similar to [1] except that this
[...]
> is slightly more comprehensive in
> that the read_object_list_from_stdin() codepath is also covered in
> addition to the get_object_list() codepath. (Although, to be clear,
> upload-pack always passes "--revs" and thus only needs the
> get_object_list() codepath).

(The [1] in the quote above refers to one of Jeff Hostetler's patches,
[QUOTE 1].)

The same issue applies to this patch (since
read_object_list_from_stdin() invokes add_object_entry() directly
without going through the traversal mechanism) and probably warrants at
least some description in one of the commit messages.

[1] https://public-inbox.org/git/6f7934621717141ce3bb6bc05cf1d59c7900ccc5.1496432147.git.jonathantanmy@xxxxxxxxxx/
[2] https://public-inbox.org/git/cover.1496361873.git.jonathantanmy@xxxxxxxxxx/

[QUOTE 1] https://public-inbox.org/git/1488994685-37403-3-git-send-email-jeffhost@xxxxxxxxxxxxx/



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux