On Mon, Aug 27, 2018 at 12:36:27PM -0700, Junio C Hamano wrote:
> > PS. I notice that v4 does not pad to align entries at 4 byte boundary
> > like v2/v3. This could cause a slight slow down on x86 and segfault on
> > some other platforms.
>
> Care to elaborate?
>
> Long time ago, we used to mmap and read directly from the index file
> contents, requiring either an unaligned read or padded entries. But
> that was eons ago and we first read and convert from on-disk using
> get_be32() etc. to in-core structure, so I am not sure what you mean
> by "segfault" here.

To conclude this unaligned-access thing (since I plan more changes in
the index to keep its size down, which may increase unaligned access),
I ran with the following patch on amd64 (still webkit.git, 275k files,
100 runs). The index version that avoids unaligned access does not
show a noticeable difference; it is still roughly around 4.2s.

Running with NO_UNALIGNED_LOADS defined is clearly slower, in the 4.3s
range. So in theory, if we avoid unaligned access in the index and
avoid the slow get_beXX() versions, we could bring performance back to
the 4.2s range for those platforms.

But on the other hand, padding the index increases the index size by
~1MB (the v4 index before padding is 21MB), and this may add more cost
at update time because of the trailer hash.

So, yeah, it's probably OK to keep living with unaligned access and
not pad more. At least until those on "no unaligned access" platforms
yell.

diff --git a/read-cache.c b/read-cache.c
index 8628d0f3a8..33ee35fb81 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1794,7 +1794,7 @@ static struct cache_entry *create_from_disk(struct index_state *istate,
 		if (copy_len)
 			memcpy(ce->name, previous_ce->name, copy_len);
 		memcpy(ce->name + copy_len, name, len + 1 - copy_len);
-		*ent_size = (name - ((char *)ondisk)) + len + 1 - copy_len;
+		*ent_size = ((name - ((char *)ondisk)) + len - copy_len + 8) & ~7;
 	} else {
 		memcpy(ce->name, name, len + 1);
 		*ent_size = ondisk_ce_size(ce);
@@ -2345,8 +2345,10 @@ static int ce_write_entry(git_hash_ctx *c, int fd, struct cache_entry *ce,
 			result = ce_write(c, fd, to_remove_vi, prefix_size);
 		if (!result)
 			result = ce_write(c, fd, ce->name + common, ce_namelen(ce) - common);
-		if (!result)
-			result = ce_write(c, fd, padding, 1);
+		if (!result) {
+			int len = prefix_size + ce_namelen(ce) - common;
+			result = ce_write(c, fd, padding, align_padding_size(size, len));
+		}
 
 		strbuf_splice(previous_name, common, to_remove,
 			      ce->name + common, ce_namelen(ce) - common);
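For context, the slow path that NO_UNALIGNED_LOADS forces lives in
compat/bswap.h. Paraphrasing from memory (the real #if covers more
architectures than shown here, so check the actual header):

#include <stdint.h>
#include <arpa/inet.h>	/* ntohl() */

/*
 * On platforms we trust to handle unaligned loads (and without
 * NO_UNALIGNED_LOADS defined), read the whole word in one go and
 * byte-swap it. Everywhere else, assemble the value one byte at a
 * time, which is safe at any alignment but slower; that is the
 * 4.2s vs. 4.3s difference measured above.
 */
#if !defined(NO_UNALIGNED_LOADS) && \
	(defined(__i386__) || defined(__x86_64__))
static inline uint32_t get_be32(const void *ptr)
{
	return ntohl(*(const uint32_t *)ptr);	/* one, possibly unaligned, load */
}
#else
static inline uint32_t get_be32(const void *ptr)
{
	const unsigned char *p = ptr;
	return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
	       (uint32_t)p[2] <<  8 | (uint32_t)p[3];
}
#endif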
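And a note on the write side of the patch: align_padding_size() is the
existing macro in read-cache.c that v2/v3 entries already use (the
first hunk open-codes the same rounding on the read side). If I read
it right:

/*
 * From read-cache.c: given the fixed-size part of the entry ("size")
 * and the number of name bytes written ("len"), return how many NUL
 * bytes round the total up to the next multiple of 8. Using "+ 8"
 * rather than "+ 7" guarantees at least one NUL, which doubles as the
 * name terminator. E.g. size + len == 62 -> 2 NULs; == 64 -> 8 NULs.
 */
#define align_padding_size(size, len) \
	((size + (len) + 8) & ~7) - (size + len)

So each entry grows from exactly one trailing NUL to between one and
eight; 275k entries times ~3.5 extra bytes on average is roughly the
~1MB quoted above.

--
Duy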