Hi,

previous rounds (without api) are at $gmane/202752, $gmane/202923,
$gmane/203088 and $gmane/203517, the previous rounds with api were at
$gmane/229732, $gmane/230210 and $gmane/232488.  Thanks to Duy for
reviewing the last round and Junio, Ramsay and Eric for additional
comments.

Since the last round I've added a POC for partial writing, resulting
in the following performance improvements for update-index:

Test                                       1063432           HEAD
------------------------------------------------------------------------------------
0003.2: v[23]: update-index                0.60(0.38+0.20)   0.76(0.36+0.17) +26.7%
0003.3: v[23]: grep nonexistent -- subdir  0.28(0.17+0.11)   0.28(0.18+0.09) +0.0%
0003.4: v[23]: ls-files -- subdir          0.26(0.15+0.10)   0.24(0.14+0.09) -7.7%
0003.7: v[23] update-index                 0.59(0.36+0.22)   0.58(0.36+0.20) -1.7%
0003.9: v4: update-index                   0.46(0.28+0.17)   0.45(0.30+0.11) -2.2%
0003.10: v4: grep nonexistent -- subdir    0.26(0.14+0.11)   0.21(0.14+0.07) -19.2%
0003.11: v4: ls-files -- subdir            0.24(0.14+0.10)   0.20(0.12+0.08) -16.7%
0003.14: v4 update-index                   0.49(0.31+0.18)   0.65(0.34+0.17) +32.7%
0003.16: v5: update-index                  0.53(0.30+0.22)   0.50(0.28+0.20) -5.7%
0003.17: v5: ls-files                      0.27(0.15+0.12)   0.27(0.17+0.10) +0.0%
0003.18: v5: grep nonexistent -- subdir    0.02(0.01+0.01)   0.03(0.01+0.01) +50.0%
0003.19: v5: ls-files -- subdir            0.02(0.00+0.02)   0.02(0.01+0.01) +0.0%
0003.22: v5 update-index                   0.53(0.29+0.23)   0.02(0.01+0.01) -96.2%

Given this, I don't think a complete change of the in-core format for
the cache-entries is necessary to take full advantage of the new index
file format.  Some changes to the current in-core format are enough to
work well with the new on-disk format.  The current in-memory format
fits the internal needs of git fairly well, so changing it just to
match the index file format would not make a lot of sense.

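For anyone reading the table above: each timing is printed as
elapsed(user+sys) in seconds, and the last column appears to be the
relative change in elapsed time of HEAD against 1063432.  A minimal
sketch of that arithmetic, in Python purely for illustration (the
function names are made up for this example and are not part of the
series or of the perf library):

```python
def relative_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    if base == 0:
        raise ValueError("base timing must be non-zero")
    return (new - base) / base * 100.0


def format_change(base: float, new: float) -> str:
    """Format the change like the last column of the table, e.g. '+26.7%'."""
    return f"{relative_change(base, new):+.1f}%"


if __name__ == "__main__":
    # Elapsed times (seconds) copied from three rows of the table above.
    rows = [
        ("0003.2: v[23]: update-index", 0.60, 0.76),
        ("0003.14: v4 update-index", 0.49, 0.65),
        ("0003.22: v5 update-index", 0.53, 0.02),
    ]
    for name, base, new in rows:
        print(f"{name:32} {format_change(base, new)}")
```

With the elapsed times from the table this gives +26.7% for 0003.2,
+32.7% for 0003.14 and -96.2% for 0003.22, matching the last column.
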
This series doesn't use kb/fast-hashmap yet, but that should be fairly
simple to change if the series is deemed a good change.  The performance
tests for update-index require tg/perf-lib-test-perf-cleanup.

Other changes, made following the review comments, are:

documentation: add documentation of the index-v5 file format
  - Update documentation that directory flags are now 32-bits.  That
    makes aligned access simpler
  - offset_to_offset is no longer included in the checksum for files.
    It's unnecessary.

read-cache: read index-v5
  - Add fix for reading with different level pathspecs given
  - Use init_directory_entry to initialize all fields in a new
    directory entry
  - use memset to simplify the create_new_conflict function
  - Add comments to explain -5 when reading directories and files
  - Add comments for the more complex functions
  - Add name flex_array to the end of ondisk_directory_entry for
    simplified reading
  - Add name flex_array to the end of ondisk_cache_entry for
    simplified reading
  - Move conflict reading functions to next patch
  - mark functions as static when they are

read-cache: read resolve-undo data
  - Add comments for the more complex function
  - Read conflicts + resolve undo data as extension

read-cache: read cache-tree in index-v5
  - Add comments for the more complex function
  - Instead of sorting the directory entries, sort the cache-tree
    directly.  This also required changing the algorithms with which
    the cache entries are extracted from the directory tree.

read-cache: write index-v5
  - Free pointers allocated by super_directory
  - Rewrite condition as suggested by Duy
  - Don't check for CE_REMOVE'd entries in the writing code, they are
    already checked in the compile_directory_data code
  - Remove overly complicated directory size calculation since flags
    are now 32-bits

read-cache: write resolve-undo data for index-v5
  - Free pointers allocated by super_directory
  - Write conflicts + resolve undo data as extension

introduce GIT_INDEX_VERSION environment variable
  - Add documentation for GIT_INDEX_VERSION

test-lib: allow setting the index format version

Removed commits:
  - read-cache: don't check uid, gid, ino
  - read-cache: use fixed width integer types (independently in pu)
  - read-cache: clear version in discard_index()

Typos fixed as suggested by Eric Sunshine

Thomas Gummerer (22):
  read-cache: split index file version specific functionality
  read-cache: move index v2 specific functions to their own file
  read-cache: Re-read index if index file changed
  add documentation for the index api
  read-cache: add index reading api
  make sure partially read index is not changed
  grep.c: use index api
  ls-files.c: use index api
  documentation: add documentation of the index-v5 file format
  read-cache: make in-memory format aware of stat_crc
  read-cache: read index-v5
  read-cache: read resolve-undo data
  read-cache: read cache-tree in index-v5
  read-cache: write index-v5
  read-cache: write index-v5 cache-tree data
  read-cache: write resolve-undo data for index-v5
  update-index.c: rewrite index when index-version is given
  introduce GIT_INDEX_VERSION environment variable
  test-lib: allow setting the index format version
  t1600: add index v5 specific tests
  POC for partial writing
  perf: add partial writing test

Thomas Rast (1):
  p0003-index.sh: add perf test for the index formats

 Documentation/git.txt                            |    5 +
 Documentation/technical/api-in-core-index.txt    |   56 +-
 Documentation/technical/index-file-format-v5.txt |  294 +++++
 Makefile                                         |   10 +
 builtin/apply.c                                  |    2 +
 builtin/grep.c                                   |   69 +-
 builtin/ls-files.c                               |   36 +-
 builtin/update-index.c                           |   50 +-
 cache-tree.c                                     |   15 +-
 cache-tree.h                                     |    2 +
 cache.h                                          |  115 +-
 lockfile.c                                       |    2 +-
 read-cache-v2.c                                  |  561 +++++++++
 read-cache-v5.c                                  | 1406 ++++++++++++++++++++++
 read-cache.c                                     |  691 +++--------
 read-cache.h                                     |   67 ++
 resolve-undo.c                                   |    1 +
 t/perf/p0003-index.sh                            |   74 ++
 t/t1600-index-v5.sh                              |   25 +
 t/t2101-update-index-reupdate.sh                 |   12 +-
 t/test-lib-functions.sh                          |    5 +
 t/test-lib.sh                                    |    3 +
 test-index-version.c                             |    6 +
 unpack-trees.c                                   |    3 +-
 24 files changed, 2921 insertions(+), 589 deletions(-)
 create mode 100644 Documentation/technical/index-file-format-v5.txt
 create mode 100644 read-cache-v2.c
 create mode 100644 read-cache-v5.c
 create mode 100644 read-cache.h
 create mode 100755 t/perf/p0003-index.sh
 create mode 100755 t/t1600-index-v5.sh

-- 
1.8.4.2