On Tue, Feb 07, 2012 at 09:55:40AM -0500, Jeff Darcy wrote:
> > I have been digging through source code. If I am reading it right, it
> > seems that xlators/performance/io-threads only sets the number of
> > threads to be log(base2) of the number of outstanding I/O entries in
> > the queue. That is, to get 8 concurrent threads you need 256
> > outstanding requests in the queue.
>
> FWIW, I've also wondered sometimes whether we increase this count
> aggressively enough. At first blush, it seems that we should be
> increasing the thread count whenever the queue is staying full and the
> last increment improved our iowait/idle ratio. I'd be very curious to
> hear about your results.

Is there any particular reason to hold back on the number of threads?
Given that you could have an array with many spindles, and NCQ allows
drives to re-order outstanding requests, I'd have thought it would
always be good to have plenty.

Anyway, I've made a very trivial change:

--- xlators/performance/io-threads/src/io-threads.c.orig	2012-02-07 15:24:18.494130539 +0000
+++ xlators/performance/io-threads/src/io-threads.c	2012-02-07 15:24:29.922130383 +0000
@@ -1992,7 +1992,7 @@
         pthread_t thread;
         int ret = 0;
 
-        log2 = log_base2 (conf->queue_size);
+        log2 = conf->queue_size / 2;
 
         scale = log2;

And this seems to have the desired effect:

#p   files/sec
 1      38.58
 2      70.68
 5     141.74
10     209.76   # was 157.48
20     277.99   # was 179.75
30     316.24   # was 206.34

Performance with glusterfs over the wire is now within about 5% of
direct local access to the filesystem.

Incidentally this is with

  performance.io-thread-count: 32
  performance.client-io-threads: off

(which I'm fairly sure is the default)

Regards,

Brian.
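
P.S. In case it helps anyone reading along, here is a quick throwaway
program showing how the two formulas scale the worker count against
queue depth. This is a sketch, not the actual GlusterFS code: the
log_base2() behaviour, the max_count cap (standing in for
performance.io-thread-count) and the clamping are my assumptions about
what the translator does around the line I patched.

    /* Standalone comparison of the old (log2) and new (queue/2) thread
     * scaling policies.  Not GlusterFS code; max_count is assumed to
     * play the role of performance.io-thread-count. */
    #include <stdio.h>

    /* Integer log base 2: index of the highest set bit, which is how I
     * assume the translator's log_base2() behaves. */
    static int log_base2(int n)
    {
            int result = 0;
            while (n > 1) {
                    n >>= 1;
                    result++;
            }
            return result;
    }

    static int clamp(int v, int lo, int hi)
    {
            return v < lo ? lo : (v > hi ? hi : v);
    }

    int main(void)
    {
            int max_count = 32;  /* assumed thread cap */
            int queue_size;

            printf("%-10s %-12s %-12s\n", "queue", "old (log2)", "new (qs/2)");
            for (queue_size = 1; queue_size <= 256; queue_size *= 2) {
                    int old_scale = clamp(log_base2(queue_size), 1, max_count);
                    int new_scale = clamp(queue_size / 2, 1, max_count);
                    printf("%-10d %-12d %-12d\n",
                           queue_size, old_scale, new_scale);
            }
            return 0;
    }

The output makes the difference obvious: with the old formula you need
256 queued requests before you get 8 threads, while the new one hits
the 32-thread cap with a queue of only 64, which matches the speedup I
saw at 10+ concurrent processes above.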