> On Jun 22, 2023, at 3:23 PM, Tejun Heo <tj@xxxxxxxxxx> wrote:
>
> Hello,
>
> On Thu, Jun 22, 2023 at 03:45:18PM +0000, Chuck Lever III wrote:
>> The good news:
>>
>> On stock 6.4-rc7:
>>
>> fio 8k [r=108k,w=46.9k IOPS]
>>
>> On the affinity-scopes-v2 branch (with no other tuning):
>>
>> fio 8k [r=130k,w=55.9k IOPS]
>
> Ah, okay, that's probably coming from per-cpu pwq. Didn't expect that to
> make that much difference but that's nice.

"cpu" and "smt" work equally well on this system.

"cache", "numa", and "system" work equally poorly.

I have HT disabled, and there's only one NUMA node, so
the difference here is plausible.

>> The bad news:
>>
>> pool->lock is still the hottest lock on the system during the test.
>>
>> I'll try some of the alternate scope settings this afternoon.
>
> Yeah, in your system, there's still gonna be one pool shared across all
> CPUs. SMT or CPU may behave better but it might make sense to add a way to
> further segment the scope so that e.g. one can split a cache domain N-ways.

If there could be more than one pool to choose from, then these
WQs would not be hitting the same lock. Alternately, finding a
lockless way to queue the work on a pool would be a huge win.

--
Chuck Lever
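
To make the "split a cache domain N-ways" idea concrete, here is a rough
userspace sketch of how a CPU might be mapped to one of several pools
inside a single cache domain, so that work queued from different CPUs
would contend on different locks. Nothing here comes from the workqueue
code: struct cache_domain and split_pool_index() are hypothetical names,
and the sketch assumes the CPU ids in a domain are contiguous.

```c
#include <assert.h>

/*
 * Hypothetical description of one cache domain. This is an illustration
 * only, not a kernel structure. Assumes CPU ids in the domain are
 * contiguous, starting at first_cpu.
 */
struct cache_domain {
	unsigned int first_cpu;	/* lowest CPU id in the domain */
	unsigned int nr_cpus;	/* number of CPUs in the domain */
};

/*
 * Hypothetical helper: pick which of nr_splits pools a CPU would queue
 * to. Neighbouring CPUs share a pool; distant ones get different pools
 * and therefore different locks.
 */
static unsigned int split_pool_index(const struct cache_domain *dom,
				     unsigned int cpu,
				     unsigned int nr_splits)
{
	unsigned int offset, per_split;

	assert(nr_splits > 0 && nr_splits <= dom->nr_cpus);
	assert(cpu >= dom->first_cpu &&
	       cpu < dom->first_cpu + dom->nr_cpus);

	offset = cpu - dom->first_cpu;
	/*
	 * Round up so every CPU maps to an index below nr_splits. When
	 * nr_cpus is not a multiple of nr_splits, the last split(s) may
	 * end up smaller or unused.
	 */
	per_split = (dom->nr_cpus + nr_splits - 1) / nr_splits;
	return offset / per_split;
}
```

With nr_splits set to 1 this degenerates to the single shared pool
described above; raising it trades cache locality for fewer CPUs per
lock.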