Am 15.06.2017 um 15:25 schrieb Jeff King: > On Thu, Jun 15, 2017 at 01:33:34PM +0200, René Scharfe wrote: >> Can we trust object directory time stamps for cache invalidation? > > I think it would work on POSIX-ish systems, since loose object changes > always involve new files, not modifications of existing ones. I don't > know if there are systems that don't update directory mtimes even for > newly added files. > > That would still be a stat() per request. I'd be curious how the timing > compares to the opendir/readdir that happens now. Modification times wouldn't be exact, as you mentioned above: An object could be added just after checking the time stamp. So perhaps we don't need that, or a time-based invalidation suffices, e.g. we discard the cache after five minutes or so? Anyway, here's a patch for stat-based invalidation, on top of the other one. Array removal is really slow (hope I didn't sneak a bug in there, but my confidence in this code isn't very high). No locking is done; parallel threads removing and adding entries could make a mess, but that's not an issue for log. Timings for "time git log --pretty=%h >/dev/null" in my git repository with 5094 loose objects on Debian: master first patch this patch real 0m1.065s 0m0.581s 0m0.633s user 0m0.648s 0m0.564s 0m0.580s sys 0m0.412s 0m0.016s 0m0.052s And on mingw with 227 loose objects: master first patch this patch real 0m1.756s 0m0.546s 0m1.659s user 0m0.000s 0m0.000s 0m0.000s sys 0m0.000s 0m0.000s 0m0.000s So at least for Windows it would be really nice if we could avoid calling stat.. --- cache.h | 1 + sha1_name.c | 32 ++++++++++++++++++++++++++++---- 2 files changed, 29 insertions(+), 4 deletions(-) diff --git a/cache.h b/cache.h index 8c6748907b..386b09710d 100644 --- a/cache.h +++ b/cache.h @@ -1589,6 +1589,7 @@ extern struct alternate_object_database { uint32_t loose_objects_subdir_bitmap[8]; struct oid_array loose_objects_cache; + time_t loose_objects_subdir_mtime[256]; char path[FLEX_ARRAY]; } *alt_odb_list; diff --git a/sha1_name.c b/sha1_name.c index ae6a5c3abe..baecb10454 100644 --- a/sha1_name.c +++ b/sha1_name.c @@ -77,6 +77,24 @@ static void update_candidates(struct disambiguate_state *ds, const struct object /* otherwise, current can be discarded and candidate is still good */ } +static void remove_subdir_entries(struct oid_array *array, int subdir_nr) +{ + struct object_id oid; + int start, end; + + memset(&oid, 0, sizeof(oid)); + oid.hash[0] = subdir_nr; + start = oid_array_lookup(array, &oid); + if (start < 0) + start = -1 - start; + end = start; + while (end < array->nr && array->oid[end].hash[0] == subdir_nr) + end++; + memmove(array->oid + start, array->oid + end, + (array->nr - end) * sizeof(array->oid[0])); + array->nr -= end - start; +} + static void read_loose_object_subdir(struct alternate_object_database *alt, int subdir_nr) { @@ -86,15 +104,21 @@ static void read_loose_object_subdir(struct alternate_object_database *alt, struct dirent *de; size_t bitmap_index = subdir_nr / 32; uint32_t bitmap_mask = 1 << (subdir_nr % 32); - - if (alt->loose_objects_subdir_bitmap[bitmap_index] & bitmap_mask) - return; - alt->loose_objects_subdir_bitmap[bitmap_index] |= bitmap_mask; + struct stat st; buf = alt_scratch_buf(alt); strbuf_addf(buf, "%02x/", subdir_nr); xsnprintf(hex, sizeof(hex), "%02x", subdir_nr); + stat(buf->buf, &st); + if (alt->loose_objects_subdir_bitmap[bitmap_index] & bitmap_mask) { + if (alt->loose_objects_subdir_mtime[subdir_nr] == st.st_mtime) + return; + remove_subdir_entries(&alt->loose_objects_cache, subdir_nr); + } + alt->loose_objects_subdir_mtime[subdir_nr] = st.st_mtime; + alt->loose_objects_subdir_bitmap[bitmap_index] |= bitmap_mask; + dir = opendir(buf->buf); if (!dir) return; -- 2.13.0