Re: knfsd performance

On Tue, 2024-06-18 at 19:54 +0000, Chuck Lever III wrote:
> 
> 
> > On Jun 18, 2024, at 3:50 PM, Trond Myklebust
> > <trondmy@xxxxxxxxxxxxxxx> wrote:
> > 
> > On Tue, 2024-06-18 at 19:39 +0000, Chuck Lever III wrote:
> > > 
> > > 
> > > > On Jun 18, 2024, at 3:29 PM, Trond Myklebust
> > > > <trondmy@xxxxxxxxxxxxxxx> wrote:
> > > > 
> > > > On Tue, 2024-06-18 at 18:40 +0000, Chuck Lever III wrote:
> > > > > 
> > > > > 
> > > > > > On Jun 18, 2024, at 2:32 PM, Trond Myklebust
> > > > > > <trondmy@xxxxxxxxxxxxxxx> wrote:
> > > > > > 
> > > > > > > I recently back ported Neil's lwq code and sunrpc server
> > > > > > > changes to our 5.15.130 based kernel in the hope of
> > > > > > > improving the performance for our data servers.
> > > > > > > 
> > > > > > > Our performance team recently ran a fio workload on a
> > > > > > > client that was doing 100% NFSv3 reads in O_DIRECT mode
> > > > > > > over an RDMA connection (infiniband) against that
> > > > > > > resulting server. I've attached the resulting flame graph
> > > > > > > from a perf profile run on the server side.
> > > > > > > 
> > > > > > > Is anyone else seeing this massive contention for the spin
> > > > > > > lock in __lwq_dequeue? As you can see, it appears to be
> > > > > > > dwarfing all the other nfsd activity on the system in
> > > > > > > question here, being responsible for 45% of all the perf
> > > > > > > hits.
> > > > > 
> > > > > I haven't seen that, but I've been working on other issues.
> > > > > 
> > > > > What's the nfsd thread count on your test server? Have you
> > > > > seen a similar impact on 6.10 kernels ?
> > > > > 
> > > > 
> > > > 640 knfsd threads. The machine was a supermicro 2029BT-HNR with
> > > > 2xIntel 6150, 384GB of memory and 6xWDC SN840.
> > > > 
> > > > Unfortunately, the machine was a loaner, so cannot compare to
> > > > 6.10. That's why I was asking if anyone has seen anything similar.
> > > 
> > > If this system had more than one NUMA node, then using
> > > svc's "numa pool" mode might have helped.
> > > 
> > 
> > Interesting. I had forgotten about that setting.
> > 
> > Just out of curiosity, is there any reason why we might not want to
> > default to that mode on a NUMA enabled system?
> 
> Can't think of one off hand. Maybe back in the day it was
> hard to tell when you were actually /on/ a NUMA system.
> 
> Copying Dave to see if he has any recollection.
> 

It's at least partly because of the klunkiness of the old pool_threads
interface: You have to bring up the server first using the "threads"
procfile, and then you can actually bring up threads in the various
pools using pool_threads.

Same for shutdown. You have to bring down the pool_threads first and
then you can bring down the final thread and the rest of the server
with it. Why it was designed this way, I have NFC.
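
To make the ordering concrete, here is a rough Python sketch of the two
sequences described above. It is only an illustration under some
assumptions: the nfsd filesystem is mounted at /proc/fs/nfsd, the pool
mode has already been selected, and the initial thread count of 1 is an
arbitrary choice. The helper names (startup_steps, shutdown_steps,
apply_steps) are made up for this example and are not part of any
existing tool.

```python
from pathlib import Path

# Assumption: the nfsd filesystem is mounted at its usual location.
NFSD_PROC = Path("/proc/fs/nfsd")


def startup_steps(pool_counts):
    """Return the (file, value) writes for bring-up, in order.

    The server is started via "threads" first; only then are the
    per-pool counts written to "pool_threads".
    """
    if not pool_counts or any(n < 0 for n in pool_counts):
        raise ValueError("need at least one non-negative per-pool count")
    return [
        ("threads", "1"),  # arbitrary initial count, just to start the server
        ("pool_threads", " ".join(str(n) for n in pool_counts)),
    ]


def shutdown_steps(num_pools):
    """Return the (file, value) writes for shutdown, in order.

    The pools are brought down first; the final write to "threads"
    stops the last thread and the rest of the server with it.
    """
    if num_pools < 1:
        raise ValueError("need at least one pool")
    return [
        ("pool_threads", " ".join(["0"] * num_pools)),
        ("threads", "0"),
    ]


def apply_steps(steps, base=NFSD_PROC):
    """Perform the writes in order (needs root on a real server)."""
    for name, value in steps:
        (base / name).write_text(value + "\n")
```

On a two-pool server, apply_steps(startup_steps([8, 8])) would attempt
the bring-up and apply_steps(shutdown_steps(2)) the teardown. Passing a
scratch directory as base lets you inspect the writes without touching
a live server.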

The new nfsdctl tool and netlink interfaces should make this simpler in
the future. You'll be able to set the pool-mode in /etc/nfs.conf and
configure a list of per-pool thread counts in there too. Once we have
that, I think we'll be in a better position to consider doing it by
default.

Eventually we'd like to make the thread pools dynamic, at which point
making that the default becomes much simpler from an administrative
standpoint.
-- 
Jeff Layton <jlayton@xxxxxxxxxx>