On Sat, Sep 03, 2022 at 10:59:17AM -0700, dai.ngo@xxxxxxxxxx wrote:
> On 9/3/22 10:29 AM, Chuck Lever III wrote:
> >What I was suggesting was a longer term strategy for improving the
> >laundromat. In order to scale well in the number of clients, it
> >needs to schedule client expiry and deletion without serializing.
> >
> >(ie, the laundromat itself can identify a set of clients to clean,
> >but then it should pass that list to other workers so it can run
> >again as soon as it needs to -- and that also means it can use more
> >than just one CPU at a time to do its work).
>
> I see. Currently on my lowly 1-CPU VM it takes about ~35 secs to
> destroy 128 clients, each with only few states (generated by pynfs's
> CID5 test). We can improve on this.

Careful--it's not the CPU that's the issue, it's waiting for disk.

If you're on a hard drive, for example, it's going to take at least one
seek (probably at least 10ms) to expire a single client, so you're never
going to destroy more than 100 per second.

That's what you need to parallelize. See item 3 from

    https://lore.kernel.org/linux-nfs/20220523154026.GD24163@xxxxxxxxxxxx/

Also, looks like current nfs-utils is still doing 3 commits per expiry.
Steve, for some reason I think "nfsdcld: use WAL journal for faster
commits" never got applied:

    https://lore.kernel.org/linux-nfs/20220104222445.GF12040@xxxxxxxxxxxx/

--b.
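
For reference, the arithmetic behind the numbers in this message can be
sketched as below. This is only a back-of-the-envelope model, not a
measurement: the function names are made up for illustration, and it
assumes every commit costs exactly one disk seek and that expiries are
fully serialized with nothing overlapping.

```python
def max_expiries_per_sec(seek_ms, commits_per_expiry=1):
    """Upper bound on serialized client expiries per second.

    Assumption (not from nfsd or nfsdcld code): each commit costs one
    disk seek of seek_ms milliseconds, and no two commits overlap.
    """
    return 1000.0 / (seek_ms * commits_per_expiry)


def observed_ms_per_client(total_secs, clients):
    """Average wall-clock milliseconds spent per destroyed client."""
    return total_secs * 1000.0 / clients


if __name__ == "__main__":
    # One 10ms seek per expiry: the "100 per second" ceiling.
    print(max_expiries_per_sec(10))
    # Same disk, but 3 commits per expiry.
    print(max_expiries_per_sec(10, 3))
    # The reported 1-CPU VM run: about 35 secs for 128 clients.
    print(observed_ms_per_client(35, 128))
```

Under these assumptions the ceiling is 100 expiries per second with one
commit each and roughly 33 per second with three, while the reported run
works out to about 273ms per client.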