RFC: New reference iteration paradigm

Currently the way to iterate over references is via a family of
for_each_ref()-style functions. You pass some arguments plus a callback
function and cb_data to the function, and your callback is called for
each reference that is selected.

This works, but it has two big disadvantages:

1. It is cumbersome for callers. The caller's logic has to be split
   into two functions, the one that calls for_each_ref() and the
   callback function. Any data that has to be passed between the
   two functions has to be packed into a separate data structure.

2. This interface is not composable. For example, you can't write a
   single function that iterates over references from two sources,
   which would be useful for combining packed plus loose references,
   shared plus worktree-specific references, symbolic plus normal
   references, etc. The current code for combining packed and loose
   references needs to walk the two reference trees in lockstep,
   using intimate knowledge about how references are stored [1,2,3].
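
To make point 1 concrete, here is a minimal sketch of the callback style
using toy stand-ins. Everything in it (toy_for_each_ref, toy_each_ref_fn,
struct count_data, the function names) is hypothetical and much simpler
than Git's real for_each_ref() family; it only shows how the caller's
logic and state end up split across two functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified callback type; not Git's real each_ref_fn. */
typedef int (*toy_each_ref_fn)(const char *refname, void *cb_data);

/* Toy stand-in for a for_each_ref()-style function over a fixed list. */
int toy_for_each_ref(const char *const *refs, size_t nr,
                     toy_each_ref_fn fn, void *cb_data)
{
        size_t i;

        assert(fn != NULL);
        for (i = 0; i < nr; i++) {
                int ret = fn(refs[i], cb_data);
                if (ret)
                        return ret; /* in this toy, nonzero stops the walk */
        }
        return 0;
}

/* State that the caller and the callback have to share. */
struct count_data {
        const char *prefix;
        int count;
};

/* Second half of the caller's logic: runs once per reference. */
int count_cb(const char *refname, void *cb_data)
{
        struct count_data *data = cb_data;

        if (!strncmp(refname, data->prefix, strlen(data->prefix)))
                data->count++;
        return 0;
}

/* First half of the caller's logic: set up the state, start the walk. */
int count_refs_with_prefix(const char *const *refs, size_t nr,
                           const char *prefix)
{
        struct count_data data = { prefix, 0 };

        toy_for_each_ref(refs, nr, count_cb, &data);
        return data.count;
}
```

Even for something as trivial as counting, the prefix and the counter
have to travel through a struct and a void pointer.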

I'm currently working on a patch series to transition the reference code
from using for_each_ref()-style iteration to using proper iterators.

The main point of this series is to change the base iteration paradigm
that has to be supported by reference backends. So instead of

> int do_for_each_ref_fn(const char *submodule, const char *base,
>                        each_ref_fn fn, int trim, int flags,
>                        void *cb_data);

the backend now has to implement

> struct ref_iterator *ref_iterator_begin_fn(const char *submodule,
>                                            const char *prefix,
>                                            unsigned int flags);

The ref_iterator itself has to implement two main methods:

> int iterator_advance_fn(struct ref_iterator *ref_iterator);
> void iterator_free_fn(struct ref_iterator *ref_iterator);

A loop over references now looks something like

> struct ref_iterator *iter = each_ref_in_iterator("refs/tags/");
> while (ref_iterator_advance(iter)) {
>         /* code using iter->refname, iter->oid, iter->flags */
> }
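
For readers who want to see the shape of such an interface without
reading the branch, here is a self-contained toy sketch. It is not the
code from my series: the toy_* and array_* names are made up, the
iterator only carries a refname, and I am assuming that advance returns
nonzero while a reference is available and that the caller frees the
iterator explicitly when done.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_iterator;

/* The two methods each iterator implementation supplies. */
struct toy_iterator_vtable {
        int (*advance_fn)(struct toy_iterator *iter);
        void (*free_fn)(struct toy_iterator *iter);
};

/* Fields that callers may read after a successful advance. */
struct toy_iterator {
        const struct toy_iterator_vtable *vtable;
        const char *refname;
};

/* Assumption: nonzero means iter->refname is valid, zero means exhausted. */
int toy_iterator_advance(struct toy_iterator *iter)
{
        return iter->vtable->advance_fn(iter);
}

void toy_iterator_free(struct toy_iterator *iter)
{
        iter->vtable->free_fn(iter);
}

/* One concrete implementation: walk a fixed array, keeping one prefix. */
struct array_iterator {
        struct toy_iterator base; /* must be first so the casts are valid */
        const char *const *refs;
        size_t nr, pos;
        const char *prefix;
};

static int array_advance(struct toy_iterator *base)
{
        struct array_iterator *iter = (struct array_iterator *)base;

        while (iter->pos < iter->nr) {
                const char *refname = iter->refs[iter->pos++];
                if (!strncmp(refname, iter->prefix, strlen(iter->prefix))) {
                        base->refname = refname;
                        return 1;
                }
        }
        return 0;
}

static void array_free(struct toy_iterator *base)
{
        free(base);
}

static const struct toy_iterator_vtable array_vtable = {
        array_advance, array_free
};

struct toy_iterator *array_iterator_begin(const char *const *refs, size_t nr,
                                          const char *prefix)
{
        struct array_iterator *iter = calloc(1, sizeof(*iter));

        assert(iter != NULL);
        iter->base.vtable = &array_vtable;
        iter->refs = refs;
        iter->nr = nr;
        iter->prefix = prefix;
        return &iter->base;
}

/* The whole caller fits in one function, with its state in locals. */
int count_refs_via_iterator(const char *const *refs, size_t nr,
                            const char *prefix)
{
        struct toy_iterator *iter = array_iterator_begin(refs, nr, prefix);
        int count = 0;

        while (toy_iterator_advance(iter))
                count++;
        toy_iterator_free(iter);
        return count;
}
```

The counter is now an ordinary local variable, and no callback or
cb_data struct is needed.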

I built quite a bit of ref_iterator infrastructure to make it easy to
plug things together flexibly. For example, there is an
overlay_ref_iterator which takes two other iterators (e.g., one for
packed and one for loose refs) and overlays them, presenting the result
via the same iterator interface. But the same overlay_ref_iterator can
be used to overlay any two other iterators on top of each other.
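
As a rough illustration of the overlay idea (not the actual
overlay_ref_iterator from the branch), the toy sketch below merges two
iterators. It assumes that both inputs yield references sorted by name
and that, when both contain the same name, the entry from the "front"
iterator hides the one from the "back" iterator. All names here
(sketch_iter, list_iter, overlay_iter) are hypothetical, a string stands
in for the object ID, and the iterators live in caller-owned structs so
that no freeing is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sketch_iter {
        /* Returns 1 and sets refname/value, or returns 0 when exhausted. */
        int (*advance)(struct sketch_iter *iter);
        const char *refname;
        const char *value; /* stand-in for the object ID */
};

/* Leaf iterator over parallel arrays that are sorted by name. */
struct list_iter {
        struct sketch_iter base; /* first member, so casts are valid */
        const char *const *names;
        const char *const *values;
        size_t nr, pos;
};

static int list_advance(struct sketch_iter *base)
{
        struct list_iter *iter = (struct list_iter *)base;

        if (iter->pos >= iter->nr)
                return 0;
        base->refname = iter->names[iter->pos];
        base->value = iter->values[iter->pos];
        iter->pos++;
        return 1;
}

void list_iter_init(struct list_iter *iter, const char *const *names,
                    const char *const *values, size_t nr)
{
        memset(iter, 0, sizeof(*iter));
        iter->base.advance = list_advance;
        iter->names = names;
        iter->values = values;
        iter->nr = nr;
}

/* Overlay: entries of "front" hide same-named entries of "back". */
struct overlay_iter {
        struct sketch_iter base;
        struct sketch_iter *front, *back;
        int front_ok, back_ok; /* does each input hold an unconsumed entry? */
};

static int overlay_advance(struct sketch_iter *base)
{
        struct overlay_iter *iter = (struct overlay_iter *)base;
        struct sketch_iter *pick;
        int cmp;

        if (!iter->front_ok && !iter->back_ok)
                return 0;
        if (!iter->back_ok)
                cmp = -1;
        else if (!iter->front_ok)
                cmp = 1;
        else
                cmp = strcmp(iter->front->refname, iter->back->refname);

        pick = cmp <= 0 ? iter->front : iter->back;
        base->refname = pick->refname;
        base->value = pick->value;

        /* Refill whichever inputs were consumed; on a tie, both. */
        if (cmp <= 0)
                iter->front_ok = iter->front->advance(iter->front);
        if (cmp >= 0)
                iter->back_ok = iter->back->advance(iter->back);
        return 1;
}

void overlay_iter_init(struct overlay_iter *iter, struct sketch_iter *front,
                       struct sketch_iter *back)
{
        assert(front != NULL && back != NULL);
        memset(iter, 0, sizeof(*iter));
        iter->base.advance = overlay_advance;
        iter->front = front;
        iter->back = back;
        /* Prime both inputs so their current entries can be compared. */
        iter->front_ok = front->advance(front);
        iter->back_ok = back->advance(back);
}
```

Because the overlay only talks to its inputs through the advance method,
either input could itself be another overlay, which is the kind of
composability that the callback interface lacks.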

If you are interested, check out my branch wip/ref-iterators on my
GitHub repo [4]. That branch is based off of a version of David Turner's
patch series (i.e., it will have to be rebased at some point). But it
all works, and I think the early part of the patch series is pretty well
polished. In fact, the later patches are optional; there is no special
reason to rewrite client code wholesale to use the new reference
iteration API, because the old API continues to be supported (but is now
built on the new API).

Feedback is welcome!

Michael

[1] https://github.com/git/git/blob/90f7b16b3adc78d4bbabbd426fb69aa78c714f71/refs/files-backend.c#L1665-L1719
[2] https://github.com/git/git/blob/90f7b16b3adc78d4bbabbd426fb69aa78c714f71/refs/files-backend.c#L582-L608
[3] https://github.com/git/git/blob/90f7b16b3adc78d4bbabbd426fb69aa78c714f71/refs/files-backend.c#L610-L680
[4] https://github.com/mhagger/git