Duy Nguyen <pclouds@xxxxxxxxx> writes:

> On Sun, Jul 7, 2013 at 3:11 PM, Thomas Gummerer <t.gummerer@xxxxxxxxx> wrote:
>> Add an api for access to the index file.  Currently there is only a very
>> basic api for accessing the index file, which only allows a full read of
>> the index, and lets the users of the data filter it.  The new index api
>> gives the users the possibility to use only part of the index and
>> provides functions for iterating over and accessing cache entries.
>>
>> This simplifies future improvements to the in-memory format, as changes
>> will be concentrated on one file, instead of the whole git source code.
>>
>> Signed-off-by: Thomas Gummerer <t.gummerer@xxxxxxxxx>
>> ---
>>  cache.h         |  57 +++++++++++++++++++++++++++++-
>>  read-cache-v2.c |  96 +++++++++++++++++++++++++++++++++++++++++++++++--
>>  read-cache.c    | 108 ++++++++++++++++++++++++++++++++++++++++++++++++++++----
>>  read-cache.h    |  12 ++++++-
>>  4 files changed, 263 insertions(+), 10 deletions(-)
>>
>> diff --git a/cache.h b/cache.h
>> index 5082b34..d38dfbd 100644
>> --- a/cache.h
>> +++ b/cache.h
>> @@ -127,7 +127,8 @@ struct cache_entry {
>>      unsigned int ce_flags;
>>      unsigned int ce_namelen;
>>      unsigned char sha1[20];
>> -    struct cache_entry *next;
>> +    struct cache_entry *next; /* used by name_hash */
>> +    struct cache_entry *next_ce; /* used to keep a list of cache entries */
>>      char name[FLEX_ARRAY]; /* more */
>>  };
>
> From what I read, doing
>
>     ce = start;
>     while (ce) { do(something); ce = next_cache_entry(ce); }
>
> is the same as
>
>     i = start_index;
>     while (i < active_nr) { ce = active_cache[i]; do(something); i++; }
>
> What's the advantage of using the former over the latter? Do you plan
> to eliminate the latter loop (by hiding "struct cache_entry **cache;"
> from public index_state structure?

Yes, I wanted to eliminate the latter loop, because it depends on the
in-memory format of the index.
Moving all direct accesses to index_state->cache behind an api makes it
easier to change the in-memory format.  I experimented with a tree-based
in-memory format [1], which mirrors the on-disk format of index-v5 more
closely and makes modifications and partial loading simpler.

I tried replacing each of those loops with a dedicated api call, but
there are so many of them that the api became too bloated.  It seemed
more sensible to keep the loops structured as they are and only change
how they step from one entry to the next, which still makes future
changes to the in-memory format a lot simpler.

[1] https://github.com/tgummerer/git/blob/index-v5api/read-cache-v5.c#L17
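
To make the comparison concrete, here is a minimal sketch of the two loop
styles side by side.  It is not the code from the patch: the structs are
cut down to the fields the loops need, the body of next_cache_entry() is
an assumption (the thread only shows how it is called), and link_entries()
and the two namelen_* functions are hypothetical helpers added for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for git's struct cache_entry (illustration only). */
struct cache_entry {
	struct cache_entry *next_ce; /* used to keep a list of cache entries */
	const char *name;
};

/* Simplified stand-in for struct index_state (illustration only). */
struct index_state {
	struct cache_entry **cache; /* array-based in-memory format */
	unsigned int cache_nr;
};

/* Assumed implementation: step to the next entry via the next_ce link. */
static struct cache_entry *next_cache_entry(struct cache_entry *ce)
{
	assert(ce != NULL);
	return ce->next_ce;
}

/* Hypothetical helper: chain the array entries together through next_ce. */
static void link_entries(struct index_state *istate)
{
	unsigned int i;

	for (i = 0; i < istate->cache_nr; i++)
		istate->cache[i]->next_ce =
			(i + 1 < istate->cache_nr) ? istate->cache[i + 1] : NULL;
}

/* Array-style loop: the caller must know that the entries live in an array. */
static size_t namelen_by_index(const struct index_state *istate,
			       unsigned int start_index)
{
	size_t total = 0;
	unsigned int i = start_index;

	while (i < istate->cache_nr) {
		struct cache_entry *ce = istate->cache[i];
		total += strlen(ce->name);
		i++;
	}
	return total;
}

/* Iterator-style loop: the caller only needs a starting entry. */
static size_t namelen_by_list(struct cache_entry *start)
{
	size_t total = 0;
	struct cache_entry *ce = start;

	while (ce) {
		total += strlen(ce->name);
		ce = next_cache_entry(ce);
	}
	return total;
}
```

Both functions visit the same entries in the same order, so in this
sketch they return the same total for the same starting point.  The
difference is that only namelen_by_index() would have to change if the
entries stopped being stored in a flat array.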