Re: RFC: New reference iteration paradigm

Jeff King <peff@xxxxxxxx> writes:

> On Thu, Mar 31, 2016 at 11:01:44AM -0700, Junio C Hamano wrote:
>
>> Michael Haggerty <mhagger@xxxxxxxxxxxx> writes:
>> 
>> > the backend now has to implement
>> >
>> >> struct ref_iterator *ref_iterator_begin_fn(const char *submodule,
>> >>                                            const char *prefix,
>> >>                                            unsigned int flags);
>> >
>> > The ref_iterator itself has to implement two main methods:
>> >
>> >> int iterator_advance_fn(struct ref_iterator *ref_iterator);
>> >> void iterator_free_fn(struct ref_iterator *ref_iterator);
>> >
>> > A loop over references now looks something like
>> >
>> >> struct ref_iterator *iter = each_ref_in_iterator("refs/tags/");
>> >> while (ref_iterator_advance(iter)) {
>> >>         /* code using iter->refname, iter->oid, iter->flags */
>> >> }
>> 
>> We'd want to take advantage of the tree-like organization of the
>> refs (i.e. refs/tags/a and refs/tags/b sit next to each other and
>> they are closer to each other than they are to refs/heads/a) so that
>> a request "I want to iterate only over tags, even though I may have
>> millions of other kinds of refs" can be done with cost that is
>> proportional to how many tags you have.
>> 
>> The current implementation of for_each_tag_ref() that goes down to
>> do_for_each_entry() in files-backend.c has that property, and the
>> new iteration mechanism with the above design seems to keep it,
>> which is very nice.
>
> Actually, that is a slight fiction. :)

I know.  My first draft had "(at least for the loose ref side)"
there, but I omitted it for brevity.

> We traverse only the loose ref directories we need, but we populate the
> entire packed-refs tree in one go.
> ...
> 800MB packed-refs file, as looking up one tiny subset of the entries
> wastes a lot of RAM and CPU pulling that into our internal
> representation[1].

Yes, that is an important use case that needs to be kept in mind for
any restructure of this machinery.
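
To make the cost argument concrete, here is a minimal sketch of a
prefix-limited iterator over an already-sorted array of refnames. It
is not git's code: struct sorted_ref_iter and the sorted_ref_iter_*()
functions are hypothetical names invented for this illustration, and
the sorted in-memory array is an assumption standing in for whatever
representation a backend keeps. It only shows that, once names are in
strcmp() order, a binary search finds the start of "refs/tags/" and
the loop then touches only the matching entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a ref iterator; not git's actual struct. */
struct sorted_ref_iter {
	const char *const *refs;	/* refnames, sorted in strcmp() order */
	size_t pos, end;		/* next index to yield, total count */
	const char *prefix;
	size_t prefix_len;
	const char *refname;		/* valid after a successful advance */
};

/* Binary search: first index whose name compares >= prefix. */
static size_t lower_bound(const char *const *refs, size_t n,
			  const char *prefix)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (strcmp(refs[mid], prefix) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Returns NULL only if allocation fails. */
struct sorted_ref_iter *sorted_ref_iter_begin(const char *const *refs,
					      size_t n, const char *prefix)
{
	struct sorted_ref_iter *iter;

	assert(refs || !n);
	iter = malloc(sizeof(*iter));
	if (!iter)
		return NULL;
	iter->refs = refs;
	iter->end = n;
	iter->prefix = prefix;
	iter->prefix_len = strlen(prefix);
	iter->refname = NULL;
	/* Names sharing the prefix are contiguous and start here. */
	iter->pos = lower_bound(refs, n, prefix);
	return iter;
}

/* Returns 1 and sets iter->refname, or 0 once the prefix range ends. */
int sorted_ref_iter_advance(struct sorted_ref_iter *iter)
{
	if (iter->pos < iter->end &&
	    !strncmp(iter->refs[iter->pos], iter->prefix, iter->prefix_len)) {
		iter->refname = iter->refs[iter->pos++];
		return 1;
	}
	iter->refname = NULL;
	return 0;
}

void sorted_ref_iter_free(struct sorted_ref_iter *iter)
{
	free(iter);
}
```

A caller would use it in the same shape as the loop quoted above:
call sorted_ref_iter_begin(refs, n, "refs/tags/"), then loop while
sorted_ref_iter_advance(iter) returns nonzero, reading iter->refname,
and finally call sorted_ref_iter_free(iter). The sketch says nothing
about the cost of building the sorted array in the first place, which
is exactly the packed-refs concern raised above.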