David Turner <dturner@xxxxxxxxxxxxxxxx> writes:

> From: Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx>
>
> Instead of reading the index from disk and worrying about disk
> corruption, the index is cached in memory (memory bit-flips happen
> too, but hopefully less often). The result is faster read. Read time
> is reduced by 70%.
>
> The biggest gain is not having to verify the trailing SHA-1, which
> takes lots of time especially on large index files. But this also
> opens doors for further optimizations:
>
>  - we could create an in-memory format that's essentially the memory
>    dump of the index to eliminate most of parsing/allocation
>    overhead. The mmap'd memory can be used straight away. Experiment
>    [1] shows we could reduce read time by 88%.
>
>  - we could cache non-index info such as name hash
>
> The shared memory's name follows the template "git-<object>-<SHA1>"
> where <SHA1> is the trailing SHA-1 of the index file. <object> is
> "index" for cached index files (and may be "name-hash" for name-hash
> cache). If such shared memory exists, it contains the same index
> content as on disk. The content is already validated by the daemon and
> git won't validate it again (except comparing the trailing SHA-1s).

This is indeed an interesting approach. What is not explained, but
must be, is when the on-disk index is updated to reflect reality
(if I am reading the explanation and the code right, while the
daemon is running, its in-core cache becomes the source of truth by
forcing everybody's read-index-from() to go to the daemon).

The explanation could be "this is only for the read side, and updating
the index happens via the traditional 'write a new file and rename
it to the final place' codepath, at which time the daemon must be
told to re-read it."
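
To make the naming scheme in the quoted message concrete, here is a
minimal sketch, not taken from the patch. It derives the
"git-<object>-<SHA1>" name from the last 20 bytes of an index image.
The names shm_name(), trailing_sha1() and SHA1_RAWSZ are hypothetical
and chosen only for illustration. The message does not say whether the
real name carries a leading "/" as shm_open() usually expects, so the
sketch follows the template literally.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SHA1_RAWSZ 20 /* raw SHA-1 length in bytes (hypothetical constant name) */

/*
 * Copy the trailing SHA-1 (the last 20 bytes) out of an in-memory
 * image of an index file. Returns 0 on success, -1 if the image is
 * too short to contain one.
 */
int trailing_sha1(const unsigned char *buf, size_t len,
		  unsigned char sha1[SHA1_RAWSZ])
{
	if (len < SHA1_RAWSZ)
		return -1;
	memcpy(sha1, buf + len - SHA1_RAWSZ, SHA1_RAWSZ);
	return 0;
}

/*
 * Format "git-<object>-<SHA1>" into out, with <SHA1> as 40 lowercase
 * hex digits. Returns 0 on success, -1 if out is too small.
 */
int shm_name(char *out, size_t outlen, const char *object,
	     const unsigned char sha1[SHA1_RAWSZ])
{
	static const char hex[] = "0123456789abcdef";
	char hexbuf[2 * SHA1_RAWSZ + 1];
	int i, n;

	assert(out && object && sha1);
	for (i = 0; i < SHA1_RAWSZ; i++) {
		hexbuf[2 * i] = hex[sha1[i] >> 4];
		hexbuf[2 * i + 1] = hex[sha1[i] & 0x0f];
	}
	hexbuf[2 * SHA1_RAWSZ] = '\0';

	n = snprintf(out, outlen, "git-%s-%s", object, hexbuf);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

A reader would call trailing_sha1() on the on-disk index, pass the
result to shm_name() with "index" as the object, and try to open shared
memory under that name. Under this scheme a rewritten index gets a new
trailing SHA-1 and therefore a new name, which is one reason the
question of when the daemon is told to re-read matters.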