Document the new index api and add examples of how it should be used
instead of the old functions directly accessing the index.

Helped-by: Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx>
Signed-off-by: Thomas Gummerer <t.gummerer@xxxxxxxxx>
---

Duy Nguyen <pclouds@xxxxxxxxx> writes:

> Hmm.. I was confused actually (documentation on the api would help
> greatly).

As promised, here is a draft of the documentation for the index api as
it is in this series.

 Documentation/technical/api-in-core-index.txt | 108 +++++++++++++++++++++++++-
 1 file changed, 106 insertions(+), 2 deletions(-)

diff --git a/Documentation/technical/api-in-core-index.txt b/Documentation/technical/api-in-core-index.txt
index adbdbf5..5269bb1 100644
--- a/Documentation/technical/api-in-core-index.txt
+++ b/Documentation/technical/api-in-core-index.txt
@@ -1,14 +1,116 @@
 in-core index API
 =================
 
+Reading API
+-----------
+
+`read_index()`::
+	Read the whole index file from disk.
+
+`index_name_pos(name, namelen)`::
+	Find a cache_entry with name in the index.  Returns pos if an
+	entry is matched exactly and -pos-1 if an entry is matched
+	partially.
+	e.g.
+	index:
+	  file1
+	  file2
+	  path/file1
+	  zzz
+
+	index_name_pos("path/file1", 10) returns 2, while
+	index_name_pos("path", 4) returns -1
+
+`read_index_filtered(opts)`::
+	This function behaves differently for index-v2 and index-v5.
+
+	For index-v2 it simply reads the whole index as read_index()
+	does, so we are sure we don't have to reload anything if the
+	user wants a different filter.  It also sets the filter_opts
+	in the index_state, which is used to limit the results when
+	iterating over the index with for_each_index_entry().
+
+	The whole index is read to avoid the need to eventually
+	re-read the index later, because the performance is no
+	different when reading it partially.
+
+	For index-v5 it creates an adjusted_pathspec to filter the
+	reading.  First all the directory entries are read and then
+	the cache_entries in the directories that match the adjusted
+	pathspec are read.  The filter_opts in the index_state are set
+	to filter out the rest of the cache_entries that are matched
+	by the adjusted pathspec but not by the pathspec given.  The
+	rest of the index entries are filtered out when iterating over
+	the cache with for_each_index_entry().
+
+`get_index_entry_by_name(name, namelen, &ce)`::
+	Returns a cache_entry matched by the name, returned via the
+	&ce parameter.  If a cache entry is matched exactly, 1 is
+	returned, otherwise 0.  For an example see index_name_pos().
+	This function should be used instead of the index_name_pos()
+	function to retrieve cache entries.
+
+`for_each_index_entry(fn, cb_data)`::
+	Iterates over all cache_entries in the index filtered by
+	filter_opts in the index_state.  For each cache entry fn is
+	executed with cb_data as callback data.  From within the loop
+	do `return 0` to continue, or `return 1` to break the loop.
+
+`next_index_entry(ce)`::
+	Returns the cache_entry that follows ce.
+
+`index_change_filter_opts(opts)`::
+	This function again behaves slightly differently for
+	index-v2 and index-v5.
+
+	For index-v2 it simply changes the filter_opts, so
+	for_each_index_entry() uses the changed filter_opts to iterate
+	over a different set of cache entries.
+
+	For index-v5 it refreshes the index if the filter_opts have
+	changed and sets the new filter_opts in the index_state, again
+	to iterate over a different set of cache entries as with
+	index-v2.
+
+	This has some optimization potential: when the opts get
+	stricter (less of the index should be read) nothing has to
+	be reloaded, but currently the index is refreshed anyway.
+
+Using the new index api
+-----------------------
+
+Currently, loops over a specific set of index entries are written as:
+	i = start_index;
+	while (i < active_nr) { ce = active_cache[i]; do(something); i++; }
+
+they should be rewritten as:
+	ce = start;
+	while (ce) { do(something); ce = next_cache_entry(ce); }
+
+which is the equivalent operation but hides the in-memory format of
+the index from the user.
+
+For getting a cache entry get_cache_entry_by_name() should be used
+instead of cache_name_pos(). e.g.:
+	int pos = cache_name_pos(name, namelen);
+	struct cache_entry *ce = NULL;
+	if (pos < 0) { do(something); }
+	else { ce = active_cache[pos]; do(somethingelse); }
+
+should be written as:
+	struct cache_entry *ce;
+	int ret = get_cache_entry_by_name(name, namelen, &ce);
+	if (!ret) { do(something); }
+	else { do(somethingelse); }
+
+TODO
+----
 Talk about <read-cache.c> and <cache-tree.c>, things like:
 
 * cache -> the_index macros
-* read_index()
 * write_index()
 * ie_match_stat() and ie_modified(); how they are different and when to
   use which.
-* index_name_pos()
 * remove_index_entry_at()
 * remove_file_from_index()
 * add_file_to_index()
@@ -18,4 +120,6 @@ Talk about <read-cache.c> and <cache-tree.c>, things like:
 * cache_tree_invalidate_path()
 * cache_tree_update()
 
+
+
 (JC, Linus)
-- 
1.8.3.453.g1dfc63d
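
The callback convention described for for_each_index_entry() in the patch (`return 0` to continue, `return 1` to break the loop) can be sketched outside of git roughly as below. This is only an illustration under assumptions: `toy_entry`, `toy_for_each_entry()`, `count_data` and `count_until()` are hypothetical names that do not exist in git, the filter_opts handling is left out entirely, and the return value of the iterator is a guess, since the patch does not say what for_each_index_entry() itself returns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct cache_entry; NOT git's real structure. */
struct toy_entry {
	const char *name;
	struct toy_entry *next;	/* callers never see how entries are stored */
};

/* Callback type: return 0 to continue, non-zero to stop iterating. */
typedef int (*toy_each_fn)(struct toy_entry *ce, void *cb_data);

/*
 * Hypothetical model of for_each_index_entry(): call fn for every
 * entry and stop as soon as fn returns non-zero.  Returning the
 * callback's value (0 when the walk completed) is an assumption.
 */
static int toy_for_each_entry(struct toy_entry *first, toy_each_fn fn,
			      void *cb_data)
{
	struct toy_entry *ce;

	assert(fn != NULL);
	for (ce = first; ce; ce = ce->next) {
		int ret = fn(ce, cb_data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Callback data: how many entries were visited, and where to stop. */
struct count_data {
	int seen;
	const char *stop_at;
};

/* Example callback: count entries, break once stop_at is reached. */
static int count_until(struct toy_entry *ce, void *cb_data)
{
	struct count_data *d = cb_data;

	d->seen++;
	return !strcmp(ce->name, d->stop_at);
}
```

A caller would build the entries, fill in a `struct count_data`, and pass `count_until` together with a pointer to that struct. The point of the design is the same as in the patch: the loop body lives in the callback, so the caller never touches the in-memory layout of the index.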