Jeff King wrote:
>
> A quick perf run shows most of the time is spent inflating objects.  The
> diff code has a sneaky trick to re-use worktree files when we know they
> are stat-clean (in diff's case it is to avoid writing a tempfile). I
> wonder if we should use the same trick here.
>
> It would hurt the cold cache case, though, as the compressed versions
> require fewer disk accesses, of course.

I just found out that on Linux, there's mincore() that can tell us
(racily, but who cares) whether a given file mapping is in memory.  If
you would like to try it, see the source at the end, but I'm getting
things such as

  # in a random collection of files, none of which I have accessed lately
  $ ls -l
  -rw-r--r-- 1 thomas users  116534 Jul  4  2010 IMG_4884.JPG
  -rw-r--r-- 1 thomas users 7278081 Aug 25  2010 remoteserverrepo.zip
  $ ./mincore IMG_4884.JPG
  00000000000000000000000000000
  $ cat IMG_4884.JPG > /dev/null
  $ ./mincore IMG_4884.JPG
  11111111111111111111111111111
  $ ./mincore remoteserverrepo.zip
  0000000000000000000000[...]
  $ head -10 remoteserverrepo.zip >/dev/null
  $ ./mincore remoteserverrepo.zip
  1111000000000000000000[...]

So that looks fairly promising, and the order would then be:

- if stat-clean, and we have mincore(), and it tells us we can do it
  cheaply: grab file from tree
- if it's a loose object: decompress it
- if stat-clean: grab file from tree
- access packs as usual

> PS I suspect your timings are somewhat affected by the simplicity of the
> regex you are asking for. The time to inflate the blobs dominates,
> because the search is just a memmem().
> On my quad-core w/ hyperthreading (i.e., 8 apparent cores):
>
>   $ /usr/bin/time git grep INITRAMFS_ROOT_UID >/dev/null
>   0.42user 0.45system 0:00.15elapsed 578%CPU
>   $ /usr/bin/time git grep 'a.*b' >/dev/null
>   14.68user 0.50system 0:02.00elapsed 758%CPU
>   $ /usr/bin/time git grep --cached INITRAMFS_ROOT_UID >/dev/null
>   7.64user 0.41system 0:07.61elapsed 105%CPU
>   $ /usr/bin/time git grep --cached 'a.*b' >/dev/null
>   23.46user 0.47system 0:08.42elapsed 284%CPU
>
> So I think there is value in parallelizing even --cached greps. But
> we could do so much better if blob inflation could be done in
> parallel.

Ok, I see, I missed that part.  Perhaps the heuristic should then be "if
the regex boils down to memmem, disable threading", but let's see what
loose object decompression in parallel can give us.

---- 8< ---- mincore.c ---- 8< ----
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <fcntl.h>

/* Print the failing call together with strerror(errno) and bail out. */
void die(const char *s)
{
	perror(s);
	exit(1);
}

int main (int argc, char *argv[])
{
	void *mem;
	struct stat st;
	int fd;
	unsigned char *vec;
	int vsize;
	int i;
	size_t page = sysconf(_SC_PAGESIZE);

	if (argc != 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		exit(2);
	}

	fd = open(argv[1], O_RDONLY);
	if (fd == -1)
		die("open failed");
	if (fstat(fd, &st) == -1)
		die("fstat failed");
	/* Mapping the file does not read it; it only gives mincore() a range. */
	mem = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	if (mem == MAP_FAILED)
		die("mmap failed");
	/* mincore() fills in one byte per page, so round up to whole pages. */
	vsize = (st.st_size+page-1)/page;
	vec = malloc(vsize);
	if (!vec)
		die("malloc failed");
	if (mincore(mem, st.st_size, vec) == -1)
		die("mincore failed");
	/* Only the low bit of each byte is defined: 1 means resident. */
	for (i = 0; i < vsize; i++)
		printf("%d", vec[i] & 1);
	printf("\n");
	return 0;
}

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
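
For illustration, the lookup order listed above could be sketched in C
roughly as follows.  This is only a sketch under stated assumptions, not
git code: the names blob_source, file_fully_cached() and pick_source()
are made up here, and "cheaply" is assumed to mean that mincore()
reports every page of the worktree file as resident.

```c
#define _DEFAULT_SOURCE		/* glibc hides mincore() in strict modes */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names for illustration; nothing here exists in git. */
enum blob_source { SRC_WORKTREE, SRC_LOOSE, SRC_PACK };

/*
 * Returns 1 if all pages of the file are resident in memory, 0 if at
 * least one is not, and -1 on error.  The answer is racy: pages can be
 * evicted right after mincore() returns.
 */
static int file_fully_cached(const char *path)
{
	struct stat st;
	void *mem = MAP_FAILED;
	unsigned char *vec = NULL;
	size_t len = 0, npages, i;
	long page = sysconf(_SC_PAGESIZE);
	int ret = -1;
	int fd;

	if (page <= 0)
		return -1;
	fd = open(path, O_RDONLY);
	if (fd == -1)
		return -1;
	if (fstat(fd, &st) == -1)
		goto out;
	len = (size_t)st.st_size;
	if (len == 0) {
		ret = 1;	/* empty file: nothing to page in */
		goto out;
	}
	mem = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (mem == MAP_FAILED)
		goto out;
	npages = (len + (size_t)page - 1) / (size_t)page;
	vec = malloc(npages);
	if (!vec || mincore(mem, len, vec) == -1)
		goto out;
	ret = 1;
	for (i = 0; i < npages; i++) {
		if (!(vec[i] & 1)) {	/* only the low bit is defined */
			ret = 0;
			break;
		}
	}
out:
	free(vec);
	if (mem != MAP_FAILED)
		munmap(mem, len);
	close(fd);
	return ret;
}

/*
 * Apply the proposed order.  'cached' is the result of
 * file_fully_cached(), or -1 if mincore() is unavailable or failed.
 */
static enum blob_source pick_source(int stat_clean, int cached, int is_loose)
{
	if (stat_clean && cached == 1)
		return SRC_WORKTREE;	/* worktree copy is already in memory */
	if (is_loose)
		return SRC_LOOSE;	/* decompress the loose object */
	if (stat_clean)
		return SRC_WORKTREE;	/* read the worktree file from disk */
	return SRC_PACK;		/* access packs as usual */
}
```

A caller would compute stat_clean and is_loose elsewhere (both are
assumed inputs here) and then call something like
pick_source(stat_clean, stat_clean ? file_fully_cached(path) : -1,
is_loose), so that the mincore() probe is skipped for entries that are
not stat-clean.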