2012/3/12 Thomas Rast <trast@xxxxxxxxxxx>:
> Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx> writes:
>
>> This puts delta resolving on each base on a separate thread, one base
>> cache per thread. Per-thread data is grouped in struct thread_local.
>> When running with nr_threads == 1, no pthreads calls are made. The
>> system essentially runs in non-thread mode.
>
> As discussed when we took the git-grep measurements, it may be
> interesting to have a way to run 1 thread. Can you put in such an
> option?

Sorry I wasn't clear: nr_threads == 1 is what you get with --threads=1.
So yes, it supports running in non-thread mode.

>> An experiment on a Xeon 24 core machine with linux-2.6.git shows that
>> performance does not increase proportional to the number of cores. So
>> by default, we use maximum 3 cores. Some numbers with --threads from 1
>> to 16:
>>
>> 1..4
>> real    1m16.310s  0m48.183s  0m37.866s  0m32.834s
>> user    1m13.773s  1m15.537s  1m15.781s  1m16.233s
>> sys     0m2.480s   0m3.936s   0m4.448s   0m4.852s
>>
>> 5..8
>> real    0m33.170s  0m30.369s  0m28.406s  0m26.968s
>> user    1m31.474s  1m30.322s  1m29.562s  1m28.694s
>> sys     0m6.096s   0m6.268s   0m6.684s   0m7.172s
>
> Interesting. Is this a real 24-core machine or 12*2 hyperthreaded?
> Does it use Turbo Boost and how far (how fast and on how many cores
> simultaneously) does that go?

I'll check on that later.

> I'm asking because if Turbo Boost starts to wear off around 4 cores,
> like these measurements suggest, then it may not be beneficial to spawn
> threads on 2*2HT CPUs (found in many laptops) where Turbo Boost only
> really works if you only use a single core.

That might explain why it performs poorly beyond 4 threads on my
two-core (probably HT) laptop. I was worried there was some contention
in the code (I looked and failed to find any) that made it perform
worse as more threads were spawned. Any pointers for identifying CPU
features on Linux?

> Oh, and could you write a perf test for this? :-)

Yeah, about that: index-pack is mostly used as part of git-fetch or
git-clone. Maybe we need to add --threads to those commands too, then
we can see how clone/fetch performs. I'll need such tests anyway if
I'm going to push for a cheaper connectivity check in git-fetch in
another thread.

I guess one test with --threads=1, one with --threads=2 and one without
--threads. Any ideas? We could try testing with half the available
cores, all cores, and double the available cores, but that would
require exporting online_cpus(), perhaps via a test command. I didn't
see a grep --threads perf test either (I wanted to use it as a
template).
-- 
Duy
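
As a rough illustration of the two open questions above (spotting hyperthreading on Linux, and picking thread counts relative to the number of online CPUs), here is a minimal Python sketch. It is not part of the patch or of git. The helper names parse_cpuinfo, has_hyperthreading and thread_counts are hypothetical, and it assumes the x86 layout of /proc/cpuinfo, where "siblings" counts logical CPUs per package and "cpu cores" counts physical cores. It says nothing about Turbo Boost.

```python
import os


def parse_cpuinfo(text):
    """Return one dict of fields per logical CPU from /proc/cpuinfo-style text."""
    cpus = []
    # Entries for each logical CPU are separated by a blank line.
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if fields:
            cpus.append(fields)
    return cpus


def has_hyperthreading(cpus):
    """True if any entry reports more logical siblings than physical cores."""
    for cpu in cpus:
        try:
            if int(cpu["siblings"]) > int(cpu["cpu cores"]):
                return True
        except (KeyError, ValueError):
            continue  # assumption: fields may be absent on non-x86 kernels
    return False


def thread_counts(online):
    """Thread counts to try in a perf test: half, all, double the online CPUs."""
    return [max(1, online // 2), online, online * 2]


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            cpus = parse_cpuinfo(f.read())
    except OSError:
        cpus = []  # not on Linux, or /proc is unavailable
    print("logical CPUs seen:", len(cpus))
    print("hyperthreading:", has_hyperthreading(cpus))
    print("--threads values to try:", thread_counts(os.cpu_count() or 1))
```

Inside git's test suite the CPU count would more naturally come from online_cpus() exposed through a test helper, as suggested above. os.cpu_count() only stands in for it here.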