On 12/13/2011 01:45 AM, Junio C Hamano wrote: > mhagger@xxxxxxxxxxxx writes: > >> From: Michael Haggerty <mhagger@xxxxxxxxxxxx> >> >> This purely textual change is in preparation for storing references >> hierarchically, when the old ref_array structure will represent one >> "directory" of references. Rename functions that deal with this >> structure analogously, and also rename the structure's "refs" member >> to "entries". >> >> Signed-off-by: Michael Haggerty <mhagger@xxxxxxxxxxxx> >> --- >> refs.c | 166 ++++++++++++++++++++++++++++++++-------------------------------- >> 1 files changed, 83 insertions(+), 83 deletions(-) >> >> diff --git a/refs.c b/refs.c >> index fe6d657..b74ef80 100644 >> --- a/refs.c >> +++ b/refs.c >> @@ -106,9 +106,9 @@ struct ref_value { >> unsigned char peeled[20]; >> }; >> >> -struct ref_array { >> +struct ref_dir { >> int nr, alloc; >> - struct ref_entry **refs; >> + struct ref_entry **entries; >> }; > > The s/refs/entries/ renaming is a sane thing to do; on the other hand, I > somehow find the s/ref_array/ref_dir/ renaming is a short-sighted change > and undesirable, as you are essentially declaring that "if you use this > structure, the contents you store there are expected to be named > hierarchically", forbidding users that want to use a simple flat array. This data structure is not exposed outside of the module, so it only describes the de facto current storage scheme rather than making any promises to the outside world. > BUT. That was an observation before I continued reading the remainder of > the series. > I think the above observation primarily come from my worries around the > extra ref stuff, which by nature does not fit well with the hierarchical > naming (they do not even have any meaningful names). Sorting or requiring > uniqueness among them do not make any sense, let alone cutting their names > in hierarchical boundaries. > > As an alternative, it _might_ make sense to get rid of "add_extra_ref()" > API from refs.c and make it the responsibility for its users to add their > extra anchor points where they use for_each_ref() to find out all anchor > points in the history from the refs.c subsystem. If we go that route, I > fully agree that "s/ref_array/ref_dir/" renaming is the right thing to do, > as refs.c subsystem will _only_ handle the hierarchical ref namespace and > nothing else after such a change. I absolutely agree; the fact that extra refs are part of the refs module has the foul smell of expedience. And your suggestion for changing the situation makes sense to me as far as I can follow it. But I don't think I have the gumption to attack another big part of the code base before I've even finished the changes that I still have planned for the refs API. If somebody else wants to volunteer to make extra_refs redundant, I would be happy to support that person on the refs side. Otherwise, I propose to just avoid de-duping extra refs so that my patch series doesn't make things worse. If it is acceptable, I would prefer to add the fix as a patch at the end of the series, because after struct ref_dir: store a reference to the enclosing ref_cache ref_dir carries around information that can be used to distinguish between the extra_refs and other ref caches. I think it is as easy as the attached patch (untested). Apropos testing, it is unsettling that our test suite doesn't show any failures after my changes. The dir entries in extra_refs are now always sorted and de-duped when do_for_each_ref() is called. Could it be that duplicate ".have" references never come up in the test suite? It sounds like an important code path is not being tested. In particular, I won't be able to test whether my fix works :-/ Michael -- Michael Haggerty mhagger@xxxxxxxxxxxx http://softwareswirl.blogspot.com/
diff --git a/refs.c b/refs.c index 2d5c827..2fd3db2 100644 --- a/refs.c +++ b/refs.c @@ -457,7 +457,6 @@ static int is_dup_ref(const struct ref_entry *ref1, const struct ref_entry *ref2 */ static void sort_ref_dir(struct ref_entry *direntry) { - int i, j; struct ref_entry *last = NULL; struct ref_dir *dir; assert(direntry->flag & REF_DIR); @@ -468,19 +467,24 @@ static void sort_ref_dir(struct ref_entry *direntry) qsort(dir->entries, dir->nr, sizeof(*dir->entries), ref_entry_cmp); - /* Remove any duplicates: */ - for (i = 0, j = 0; j < dir->nr; j++) { - struct ref_entry *entry = dir->entries[j]; - if (last && is_dup_ref(last, entry)) { - free_ref_entry(entry); - } else if (entry->flag & REF_DIR) { - dir->entries[i++] = entry; - last = NULL; - } else { - last = dir->entries[i++] = entry; + /* Remove any duplicates, but not for extra_refs: */ + if (dir->ref_cache != NULL) { + int i, j; + for (i = 0, j = 0; j < dir->nr; j++) { + struct ref_entry *entry = dir->entries[j]; + if (last && is_dup_ref(last, entry)) { + free_ref_entry(entry); + } else if (entry->flag & REF_DIR) { + dir->entries[i++] = entry; + last = NULL; + } else { + last = dir->entries[i++] = entry; + } } + dir->nr = i; } - dir->sorted = dir->nr = i; + + dir->sorted = dir->nr; } #define DO_FOR_EACH_INCLUDE_BROKEN 01