Thomas Gummerer <t.gummerer@xxxxxxxxx> writes:

> Hi,
>
> previous rounds (without api) are at $gmane/202752, $gmane/202923,
> $gmane/203088 and $gmane/203517, the previous rounds with api were at
> $gmane/229732, $gmane/230210 and $gmane/232488.  Thanks to Duy for
> reviewing the last round and Junio, Ramsay and Eric for additional
> comments.
>
> Since the last round I've added a POC for partial writing, resulting
> in the following performance improvements for update-index:
>
> Test                                        1063432           HEAD
> ------------------------------------------------------------------------------------
> 0003.2: v[23]: update-index                 0.60(0.38+0.20)   0.76(0.36+0.17) +26.7%
> 0003.3: v[23]: grep nonexistent -- subdir   0.28(0.17+0.11)   0.28(0.18+0.09) +0.0%
> 0003.4: v[23]: ls-files -- subdir           0.26(0.15+0.10)   0.24(0.14+0.09) -7.7%
> 0003.7: v[23] update-index                  0.59(0.36+0.22)   0.58(0.36+0.20) -1.7%
> 0003.9: v4: update-index                    0.46(0.28+0.17)   0.45(0.30+0.11) -2.2%
> 0003.10: v4: grep nonexistent -- subdir     0.26(0.14+0.11)   0.21(0.14+0.07) -19.2%
> 0003.11: v4: ls-files -- subdir             0.24(0.14+0.10)   0.20(0.12+0.08) -16.7%
> 0003.14: v4 update-index                    0.49(0.31+0.18)   0.65(0.34+0.17) +32.7%
> 0003.16: v5: update-index                   0.53(0.30+0.22)   0.50(0.28+0.20) -5.7%
> 0003.17: v5: ls-files                       0.27(0.15+0.12)   0.27(0.17+0.10) +0.0%
> 0003.18: v5: grep nonexistent -- subdir     0.02(0.01+0.01)   0.03(0.01+0.01) +50.0%
> 0003.19: v5: ls-files -- subdir             0.02(0.00+0.02)   0.02(0.01+0.01) +0.0%
> 0003.22: v5 update-index                    0.53(0.29+0.23)   0.02(0.01+0.01) -96.2%
>
> Given this, I don't think a complete change of the in-core format for
> the cache-entries is necessary to take full advantage of the new index
> file format.  Instead some changes to the current in-core format would
> work well with the new on-disk format.
>
> The current in-memory format fits the internal needs of git fairly well,
> so I don't think changing it to fit a better index file format would
> make a lot of sense, given that we can take advantage of the new format
> with the existing in-memory format.

Any more opinions on this series?  I've applied the changes suggested by
Duy, Antoine and Eric locally, but I wouldn't want to spam the list with
the whole series without a chance of this being applied.  How do you
want me to proceed?

> This series doesn't use kb/fast-hashmap yet, but that should be fairly
> simple to change if the series is deemed a good change.  The
> performance tests for update-index require
> tg/perf-lib-test-perf-cleanup.
>
> Other changes, made following the review comments are:
>
> documentation: add documentation of the index-v5 file format
>   - Update documentation that directory flags are now 32-bits.  That
>     makes aligned access simpler
>   - offset_to_offset is no longer included in the checksum for files.
>     It's unnecessary.
>
> read-cache: read index-v5
>   - Add fix for reading with different level pathspecs given
>   - Use init_directory_entry to initialize all fields in a new
>     directory entry
>   - use memset to simplify the create_new_conflict function
>   - Add comments to explain -5 when reading directories and files
>   - Add comments for the more complex functions
>   - Add name flex_array to the end of ondisk_directory_entry for
>     simplified reading
>   - Add name flex_array to the end of ondisk_cache_entry for
>     simplified reading
>   - Move conflict reading functions to next patch
>   - mark functions as static when they are
>
> read-cache: read resolve-undo data
>   - Add comments for the more complex function
>   - Read conflicts + resolve undo data as extension
>
> read-cache: read cache-tree in index-v5
>   - Add comments for the more complex function
>   - Instead of sorting the directory entries, sort the cache-tree
>     directly.
>     This also required changing the algorithms with which
>     the cache entries are extracted from the directory tree.
>
> read-cache: write index-v5
>   - Free pointers allocated by super_directory
>   - Rewrite condition as suggested by Duy
>   - Don't check for CE_REMOVE'd entries in the writing code, they are
>     already checked in the compile_directory_data code
>   - Remove overly complicated directory size calculation since flags
>     are now 32-bits
>
> read-cache: write resolve-undo data for index-v5
>   - Free pointers allocated by super_directory
>   - Write conflicts + resolve undo data as extension
>
> introduce GIT_INDEX_VERSION environment variable
>   - Add documentation for GIT_INDEX_VERSION
>
> test-lib: allow setting the index format version
>
> Removed commits:
>   - read-cache: don't check uid, gid, ino
>   - read-cache: use fixed width integer types (independently in pu)
>   - read-cache: clear version in discard_index()
>
> Typos fixed as suggested by Eric Sunshine
>
> Thomas Gummerer (22):
>   read-cache: split index file version specific functionality
>   read-cache: move index v2 specific functions to their own file
>   read-cache: Re-read index if index file changed
>   add documentation for the index api
>   read-cache: add index reading api
>   make sure partially read index is not changed
>   grep.c: use index api
>   ls-files.c: use index api
>   documentation: add documentation of the index-v5 file format
>   read-cache: make in-memory format aware of stat_crc
>   read-cache: read index-v5
>   read-cache: read resolve-undo data
>   read-cache: read cache-tree in index-v5
>   read-cache: write index-v5
>   read-cache: write index-v5 cache-tree data
>   read-cache: write resolve-undo data for index-v5
>   update-index.c: rewrite index when index-version is given
>   introduce GIT_INDEX_VERSION environment variable
>   test-lib: allow setting the index format version
>   t1600: add index v5 specific tests
>   POC for partial writing
>   perf: add partial writing test
>
> Thomas Rast (1):
>   p0003-index.sh: add perf test for the index formats
>
>  Documentation/git.txt                            |    5 +
>  Documentation/technical/api-in-core-index.txt    |   56 +-
>  Documentation/technical/index-file-format-v5.txt |  294 +++++
>  Makefile                                         |   10 +
>  builtin/apply.c                                  |    2 +
>  builtin/grep.c                                   |   69 +-
>  builtin/ls-files.c                               |   36 +-
>  builtin/update-index.c                           |   50 +-
>  cache-tree.c                                     |   15 +-
>  cache-tree.h                                     |    2 +
>  cache.h                                          |  115 +-
>  lockfile.c                                       |    2 +-
>  read-cache-v2.c                                  |  561 +++++++++
>  read-cache-v5.c                                  | 1406 ++++++++++++++++++++++
>  read-cache.c                                     |  691 +++-------
>  read-cache.h                                     |   67 ++
>  resolve-undo.c                                   |    1 +
>  t/perf/p0003-index.sh                            |   74 ++
>  t/t1600-index-v5.sh                              |   25 +
>  t/t2101-update-index-reupdate.sh                 |   12 +-
>  t/test-lib-functions.sh                          |    5 +
>  t/test-lib.sh                                    |    3 +
>  test-index-version.c                             |    6 +
>  unpack-trees.c                                   |    3 +-
>  24 files changed, 2921 insertions(+), 589 deletions(-)
>  create mode 100644 Documentation/technical/index-file-format-v5.txt
>  create mode 100644 read-cache-v2.c
>  create mode 100644 read-cache-v5.c
>  create mode 100644 read-cache.h
>  create mode 100755 t/perf/p0003-index.sh
>  create mode 100755 t/t1600-index-v5.sh
>
> --
> 1.8.4.2
>

--
Thomas