On Tue, 2016-03-29 at 09:31 +0700, Duy Nguyen wrote:
> On Sat, Mar 19, 2016 at 8:04 AM, David Turner <dturner@xxxxxxxxxxxxxxxx> wrote:
> > From: Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx>
> >
> > Instead of reading the index from disk and worrying about disk
> > corruption, the index is cached in memory (memory bit-flips happen
> > too, but hopefully less often). The result is faster read. Read time
> > is reduced by 70%.
> >
> > The biggest gain is not having to verify the trailing SHA-1, which
> > takes lots of time especially on large index files. But this also
> > opens doors for further optimizations:
> >
> >  - we could create an in-memory format that's essentially the memory
> >    dump of the index to eliminate most of parsing/allocation
> >    overhead. The mmap'd memory can be used straight away. Experiment
> >    [1] shows we could reduce read time by 88%.
>
> This reference [1] is missing (even in my old version). I believe it's
> http://thread.gmane.org/gmane.comp.version-control.git/247268/focus=248771,
> comparing 256.442ms in that mail with v4 number, 2245.113ms in 0/8
> mail from the same thread.
>
> > Git can poke the daemon via named pipes to tell it to refresh the
> > index cache, or to keep it alive some more minutes. It can't give any
> > real index data directly to the daemon. Real data goes to disk first,
> > then the daemon reads and verifies it from there. Poking only happens
> > for $GIT_DIR/index, not temporary index files.
>
> I think we should go with unix socket on *nix platform instead of
> named pipe. UNIX named pipe only allows one communication channel at a
> time. Windows named pipe is different and allows multiple clients,
> which is the same as unix socket.
>
> > $GIT_DIR/index-helper.pipe is the named pipe for daemon process. The
> > daemon reads from the pipe and executes commands.
> > Commands that need replies from the daemon will have to open their
> > own pipe, since a named pipe should only have one reader. Unix
> > domain sockets don't have this problem, but are less portable.
>
> Hm..NO_UNIX_SOCKETS is only set for Windows in config.mak.uname and
> Windows will need to be specially tailored anyway, I think unix socket
> would be more elegant.

One annoyance with unix sockets is that they must have short paths
(UNIX_PATH_MAX -- about a hundred characters). This basically means
they should be in $TMPDIR. I guess we could go back to pid files in
$GIT_DIR, and then have a socket named after the pid.

There are also some security issues, but it actually looks like there's
a simple enough workaround for them.

I'll change this, but it might take a bit as I'm busy with other things
this week.

> > +static void share_index(struct index_state *istate, struct shm *is)
> > +{
> > +	void *new_mmap;
> > +	if (istate->mmap_size <= 20 ||
> > +	    hashcmp(istate->sha1,
> > +		    (unsigned char *)istate->mmap + istate->mmap_size - 20) ||
> > +	    !hashcmp(istate->sha1, is->sha1) ||
> > +	    git_shm_map(O_CREAT | O_EXCL | O_RDWR, 0700, istate->mmap_size,
> > +			&new_mmap, PROT_READ | PROT_WRITE, MAP_SHARED,
> > +			"git-index-%s", sha1_to_hex(istate->sha1)) < 0)
> > +		return;
> > +
> > +	release_index_shm(is);
> > +	is->size = istate->mmap_size;
> > +	is->shm = new_mmap;
> > +	hashcpy(is->sha1, istate->sha1);
> > +	memcpy(new_mmap, istate->mmap, istate->mmap_size - 20);
> > +
> > +	/*
> > +	 * The trailing hash must be written last after everything is
> > +	 * written. It's the indication that the shared memory is now
> > +	 * ready.
> > +	 */
> > +	hashcpy((unsigned char *)new_mmap + istate->mmap_size - 20, is->sha1);
>
> You commented here [1] a long time ago about memory barrier. I'm not
> entirely sure if compilers dare to reorder function calls, but when
> hashcpy is inlined and memcpy is builtin, I suppose that's
> possible...
>
> [1] http://article.gmane.org/gmane.comp.version-control.git/280729

Oh, right. Will fix.
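
For illustration, the ordering concern above (body first, trailing hash
last) could be expressed with C11 fences roughly as follows. This is only
a sketch under stated assumptions, not necessarily the fix the series
will use: publish_index() and index_is_ready() are hypothetical names,
HASH_LEN stands in for the literal 20 in the patch, and the C11 memory
model formally covers threads within one program and synchronization
through atomic objects, so for plain memcpy() across processes the fences
are best read as a compiler barrier plus the matching hardware fence.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define HASH_LEN 20 /* SHA-1 size; stands in for the literal 20 in the patch */

/*
 * Hypothetical writer: copy `size` bytes of index data into `dst`,
 * storing the trailing HASH_LEN bytes last. Returns 0 on success,
 * -1 if the buffer is too small to carry a trailer.
 */
int publish_index(unsigned char *dst, const unsigned char *src, size_t size)
{
	assert(dst != NULL && src != NULL);
	if (size <= HASH_LEN)
		return -1;

	/* 1. Copy everything except the trailing hash. */
	memcpy(dst, src, size - HASH_LEN);

	/*
	 * 2. Release fence: intended to keep the body stores above from
	 * being reordered after the trailer stores below, even when
	 * memcpy() is expanded inline by the compiler.
	 */
	atomic_thread_fence(memory_order_release);

	/* 3. Store the trailer, which marks the buffer as ready. */
	memcpy(dst + size - HASH_LEN, src + size - HASH_LEN, HASH_LEN);
	return 0;
}

/*
 * Hypothetical reader: returns 1 if the trailer in `shm` matches
 * `expected`, 0 otherwise.
 */
int index_is_ready(const unsigned char *shm, size_t size,
		   const unsigned char *expected)
{
	assert(shm != NULL && expected != NULL);
	if (size <= HASH_LEN)
		return 0;
	if (memcmp(shm + size - HASH_LEN, expected, HASH_LEN) != 0)
		return 0;

	/* Acquire fence: pairs with the writer's release fence. */
	atomic_thread_fence(memory_order_acquire);
	return 1;
}
```

A reader would call index_is_ready() with the hash it expects and only
parse the body after it returns 1; the fence placement is the point here,
the helper shapes are arbitrary.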