* Waiman Long <waiman.long@xxxxxx> wrote:

> I had run some performance tests using the fserver and new_fserver
> benchmarks (on ext4 filesystems) of the AIM7 test suite on an 80-core
> DL980 with HT on. The following kernels were used:
>
> 1. Modified 3.10.1 kernel with mb_cache_spinlock in fs/mbcache.c
>    replaced by a rwlock
> 2. Modified 3.10.1 kernel + modified __read_lock_failed code as
>    suggested by Ingo
> 3. Modified 3.10.1 kernel + queue read/write lock
> 4. Modified 3.10.1 kernel + queue read/write lock in classic
>    read/write lock behavior
>
> The last one has the read lock stealing flag set in the qrwlock
> structure to give priority to readers and behave more like the classic
> read/write lock, with less fairness.
>
> The following table shows the averaged results in the 200-1000
> user range:
>
> +-----------------+--------+--------+--------+--------+
> |     Kernel      |    1   |    2   |    3   |    4   |
> +-----------------+--------+--------+--------+--------+
> | fserver JPM     | 245598 | 274457 | 403348 | 411941 |
> | % change from 1 |    0%  | +11.8% | +64.2% | +67.7% |
> +-----------------+--------+--------+--------+--------+
> | new-fserver JPM | 231549 | 269807 | 399093 | 399418 |
> | % change from 1 |    0%  | +16.5% | +72.4% | +72.5% |
> +-----------------+--------+--------+--------+--------+

So it's not just herding that is a problem.

I'm wondering, how sensitive is this particular benchmark to fairness?
I.e. do the 200-1000 simulated users each perform the same number of
ops, so that any smearing of execution time via unfairness gets
amplified?

I.e. does steady-state throughput go up by 60%+ too with your changes?

Thanks,

	Ingo
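For readers who have not looked at the qrwlock patches, the "read lock
stealing" behavior mentioned above can be illustrated with a minimal
user-space sketch. This is not the kernel's qrwlock code (which queues
waiters fairly rather than spinning on a single word); the names below
(sketch_rwlock, rsteal, WRITER_BIT, ...) are made up purely for
illustration, using C11 atomics. With rsteal set, new readers ignore a
waiting writer, approximating classic reader-preference behavior
(kernel 4 above); with it clear, new readers back off as soon as a
writer announces itself, which is the fairer behavior (kernel 3).

	/*
	 * Minimal user-space sketch of the "read stealing" idea --
	 * NOT the kernel's qrwlock implementation.  All names here
	 * are hypothetical.
	 */
	#include <stdatomic.h>
	#include <stdbool.h>
	#include <stdio.h>

	#define WRITER_BIT	(1u << 31)	/* a writer holds the lock  */
	#define WAITING_BIT	(1u << 30)	/* a writer is waiting      */

	struct sketch_rwlock {
		atomic_uint cnts;	/* writer bits + reader count */
		bool rsteal;		/* readers may steal past a waiting writer */
	};

	static void sketch_read_lock(struct sketch_rwlock *l)
	{
		for (;;) {
			unsigned int c = atomic_load_explicit(&l->cnts,
							      memory_order_relaxed);

			if (c & WRITER_BIT)		/* active writer: wait */
				continue;
			if (!l->rsteal && (c & WAITING_BIT))
				continue;		/* fair mode: defer to writer */
			if (atomic_compare_exchange_weak_explicit(&l->cnts, &c, c + 1,
								  memory_order_acquire,
								  memory_order_relaxed))
				return;
		}
	}

	static void sketch_read_unlock(struct sketch_rwlock *l)
	{
		atomic_fetch_sub_explicit(&l->cnts, 1, memory_order_release);
	}

	static void sketch_write_lock(struct sketch_rwlock *l)
	{
		for (;;) {
			/* Announce intent so that fair-mode readers back off. */
			unsigned int c = atomic_fetch_or_explicit(&l->cnts, WAITING_BIT,
								  memory_order_relaxed);
			c |= WAITING_BIT;
			/* Proceed only once no readers and no other writer remain. */
			if (c == WAITING_BIT &&
			    atomic_compare_exchange_strong_explicit(&l->cnts, &c,
								    WRITER_BIT,
								    memory_order_acquire,
								    memory_order_relaxed))
				return;
		}
	}

	static void sketch_write_unlock(struct sketch_rwlock *l)
	{
		atomic_store_explicit(&l->cnts, 0, memory_order_release);
	}

	int main(void)
	{
		struct sketch_rwlock l = { .cnts = 0, .rsteal = true };

		sketch_read_lock(&l);
		printf("reader in (rsteal=%d)\n", l.rsteal);
		sketch_read_unlock(&l);

		sketch_write_lock(&l);
		printf("writer in\n");
		sketch_write_unlock(&l);
		return 0;
	}

The fairness question above is exactly about the rsteal=true case: if
readers can keep streaming past a waiting writer, individual writers
(or slow readers) see their completion times smeared out, even though
aggregate throughput may look better.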