Configuration: global-dhp__io-metadata-ext3 Benchmarks: dbench3, fsmark-single, fsmark-threaded Summary ======= While the fsmark figures look ok, fsmark in single threaded mode has a small number of large outliers towards the min end of the scale. The resulting standard deviation fuzzes results but filtering is not necessarily the best answer. A similar effect is visible when running in threaded mode except that there is clustering around some values that could be presented better. Arithmetic mean is unsuitable for this sort of data. Benchmark notes =============== mkfs was run on system startup. No attempt was made to age it. No special mkfs or mount options were used. dbench3 was chosen as it's metadata intensive. o Duration was 180 seconds o OSYNC, OSYNC_DIRECTORY and FSYNC were all off As noted in the MMTests, dbench3 can be a random number generator particularly when run in asynchronous mode. Even with the limitations, it can be useful as an early warning system and as it's still used by QA teams it's still worth keeping an eye on. FSMark o Parallel directories were used o 1 Thread per CPU o 0 Filesize o 225 directories o 22500 files per directory o 50000 files per iteration o 15 iterations Single: ./fs_mark -d /tmp/fsmark-9227/1 -D 225 -N 22500 -n 50000 -L 15 -S0 -s 0 Thread: ./fs_mark -d /tmp/fsmark-9407/1 -d /tmp/fsmark-9407/2 -D 225 -N 22500 -n 25000 -L 15 -S0 -s 0 FSMark is a more realistic indicator of metadata intensive workloads. =========================================================== Machine: arnold Result: http://www.csn.ul.ie/~mel/postings/mmtests-20120424/global-dhp__io-metadata-ext3/arnold/comparison.html Arch: x86 CPUs: 1 socket, 2 threads Model: Pentium 4 Disk: Single Rotary Disk Status: Fine =========================================================== dbench ------ For single clients, we're doing reasonably well. There was a big spike for large number of clients in 2.6.34 and to a lesser extent in 2.6.39.4 but much of this is due to the operations taking place in memory without reaching disk. There were also fairness issues and the indicated throughput figures are far higher than the disks capabilities so I do not consider this to be a regression. There was a mild dip for 3.2.x and 3.3.x that has been recovered somewhat in 3.4. As this was when IO-Less Dirty Throttling got merged it is hardly a surprise and a dip in dbench is not worth backing that out for. Recent kernels appear to deal worse for large number of clients. However, I very strongly suspect this is due to improved fairness in IO. The high throughput figures are due to one client making an unfair amount of progress while other clients stall. fsmark-single ------------- Again, this is looking good. Files/sec has improved slightly with the exception of a dip in 3.2 and 3.3 which again may be due to IO-Less dirty throttling. I have a slight concern with the overhead measurements. Somewhere between 3.0.23 and 3.1.10 the overhead started deviating a lot more. Ideally this should be bisected because it difficult to blame IO-Less throttling with any certainity. IO-Less Throttling. fsmark-threaded --------------- Looks better but due to high deviations it's hard to be 100% sure. If this is of interest then the thing to do is do a proper measurement of whether the results are significant or not although with 15 samples it still will be fuzzy. Right now, there is little to be concerned about. ========================================================== Machine: hydra Result: http://www.csn.ul.ie/~mel/postings/mmtests-20120424/global-dhp__io-metadata-ext3/hydra/comparison.html Arch: x86-64 CPUs: 1 socket, 4 threads Model: AMD Phenom II X4 940 Disk: Single Rotary Disk Status: Fine ========================================================== dbench ------ Very similar story to the arnold machine. Big spike in 2.6.34 and then drops off. fsmark-single ------------- Nothing really notable here. Deviations are too high to draw reasonable conclusions from. Looking at the raw results it's due to a small number of low outliers. These could be filtered but it would mask the fact that throughput is not consistent so strong justification would be required. fsmark-threaded --------------- Similar to fsmark-single. Figures look ok but large deviations are a problem Unlike the single-threaded case the raw data shows that we cluster around two points that are very far apart from each other. It is worth investigating if this can be presented in some sensible manner such as k-means clustering because arithmetic mean with this sort of data is crap. ========================================================== Machine: sandy Result: http://www.csn.ul.ie/~mel/postings/mmtests-20120424/global-dhp__io-metadata-ext3/sandy/comparison.html Arch: x86-64 CPUs: 1 socket, 8 threads Model: Intel Core i7-2600 Disk: Single Rotary Disk Status: Fine ========================================================== dbench ------ Same as the other two complete with spikes. fsmark-single ------------- Other than overhead going crazy in 3.2 there is nothing notable either. As with hydra, there are a small number of outliers that result in large deviations fsmark-threaded --------------- Similar to hydra. Figures look basically ok but deviations are high with some clustering going on. -- Mel Gorman SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html