Re: contention on pwq->pool->lock under heavy NFS workload

Chuck Lever III <chuck.lever@xxxxxxxxxx> · Thu, 22 Jun 2023 19:39:05 +0000

> On Jun 22, 2023, at 3:23 PM, Tejun Heo <tj@xxxxxxxxxx> wrote:
> 
> Hello,
> 
> On Thu, Jun 22, 2023 at 03:45:18PM +0000, Chuck Lever III wrote:
>> The good news:
>> 
>> On stock 6.4-rc7:
>> 
>> fio 8k [r=108k,w=46.9k IOPS]
>> 
>> On the affinity-scopes-v2 branch (with no other tuning):
>> 
>> fio 8k [r=130k,w=55.9k IOPS]
> 
> Ah, okay, that's probably coming from per-cpu pwq. Didn't expect that to
> make that much difference but that's nice.

"cpu" and "smt" work equally well on this system.

"cache", "numa", and "system" work equally poorly.

I have HT disabled, so "smt" groups CPUs the same way "cpu"
does, and there's only one NUMA node, so "numa" covers the
same CPUs as "system". The split into two groups is plausible.
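
To illustrate why the scopes fall into two groups, here is a
toy model (not kernel code) that counts how many CPU groups
each scope produces. The topology is hypothetical: 4 CPUs, HT
off, one NUMA node, and a single last-level cache shared by
all CPUs. The shared cache is my assumption; TOPOLOGY and
pods_for_scope() are made-up names.

```python
# Toy model, not kernel code: group CPUs by affinity scope.
# Hypothetical topology: 4 CPUs, HT off, one shared LLC, one NUMA node.
from collections import defaultdict

# cpu -> id of the topology domain that CPU belongs to at each scope
TOPOLOGY = {
    cpu: {"cpu": cpu, "smt": cpu, "cache": 0, "numa": 0, "system": 0}
    for cpu in range(4)
}


def pods_for_scope(topology, scope):
    """Return the CPU groups that would share one pool under `scope`."""
    groups = defaultdict(list)
    for cpu, domains in sorted(topology.items()):
        groups[domains[scope]].append(cpu)
    return [groups[key] for key in sorted(groups)]


for scope in ("cpu", "smt", "cache", "numa", "system"):
    pods = pods_for_scope(TOPOLOGY, scope)
    print(f"{scope:6s} -> {len(pods)} group(s): {pods}")
```

With this topology "cpu" and "smt" each give one group per CPU,
while the other three scopes put every CPU in a single group.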

>> The bad news:
>> 
>> pool->lock is still the hottest lock on the system during the test.
>> 
>> I'll try some of the alternate scope settings this afternoon.
> 
> Yeah, in your system, there's still gonna be one pool shared across all
> CPUs. SMT or CPU may behave better but it might make sense to add a way to
> further segment the scope so that e.g. one can split a cache domain N-ways.

If there could be more than one pool to choose from, then these
WQs would not be hitting the same lock. Alternatively, finding a
lockless way to queue the work on a pool would be a huge win.
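
A rough userspace sketch of the first idea, only to show the
shape of it: one pool split N ways, each shard with its own
lock, and the queueing CPU picking its shard. ShardedPool,
queue_work(), and the modulo mapping are all hypothetical and
say nothing about how the workqueue code would actually do it.

```python
# Toy illustration, not the kernel's design: one pool split into
# N shards, each guarded by its own lock instead of one shared lock.
import threading
from collections import deque


class ShardedPool:
    def __init__(self, nr_shards):
        self.locks = [threading.Lock() for _ in range(nr_shards)]
        self.queues = [deque() for _ in range(nr_shards)]

    def shard_of(self, cpu):
        # Hypothetical mapping: spread CPUs round-robin across shards.
        return cpu % len(self.locks)

    def queue_work(self, cpu, item):
        # Only the chosen shard's lock is taken, so CPUs that map to
        # different shards never contend with each other here.
        shard = self.shard_of(cpu)
        with self.locks[shard]:
            self.queues[shard].append(item)
        return shard


pool = ShardedPool(nr_shards=4)
shards_hit = {pool.queue_work(cpu, f"work-{cpu}") for cpu in range(8)}
print(sorted(shards_hit))
```

Eight CPUs queueing into four shards leave two CPUs per lock
rather than eight on one.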

--
Chuck Lever