On 9/6/22 6:00 AM, J. Bruce Fields wrote:
On Sat, Sep 03, 2022 at 10:59:17AM -0700, dai.ngo@xxxxxxxxxx wrote:
On 9/3/22 10:29 AM, Chuck Lever III wrote:
What I was suggesting was a longer term strategy for improving the
laundromat. In order to scale well in the number of clients, it
needs to schedule client expiry and deletion without serializing.
(i.e., the laundromat itself can identify a set of clients to clean,
but then it should pass that list to other workers so it can run
again as soon as it needs to -- and that also means it can use more
than just one CPU at a time to do its work).
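
A minimal user-space sketch of the hand-off described above, not kernel code: one scanning pass only identifies expired clients and submits them to a worker pool, so the scanner can run again without waiting for teardown to finish. The names `expire_client` and `laundromat_pass`, the lease values, and the sleep that stands in for blocking work are all assumptions made for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def expire_client(client_id, io_delay=0.01):
    """Hypothetical per-client teardown; the sleep models blocking I/O."""
    time.sleep(io_delay)
    return client_id


def laundromat_pass(clients, now, lease_time, pool):
    """Identify expired clients and hand them to the pool without waiting.

    `clients` maps a client id to its last renewal time (assumed layout).
    """
    expired = [cid for cid, last_renew in clients.items()
               if now - last_renew > lease_time]
    # submit() returns immediately, so the scanner is free to run again
    # while the workers expire clients in parallel.
    return [pool.submit(expire_client, cid) for cid in expired]


if __name__ == "__main__":
    clients = {f"client{i}": 0.0 for i in range(8)}
    clients["fresh"] = 100.0  # renewed just now, must survive the pass
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = laundromat_pass(clients, now=100.0, lease_time=90.0,
                                  pool=pool)
        expired = sorted(f.result() for f in futures)
    print(len(expired), "clients expired")
```

In the kernel the equivalent would presumably be queued work items rather than a thread pool; the sketch only shows the split between identifying clients and destroying them.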
I see. Currently on my lowly 1-CPU VM it takes about 35 secs to
destroy 128 clients, each with only a few states (generated by pynfs's
CID5 test). We can improve on this.
Careful--it's not the CPU that's the issue, it's waiting for disk.
If you're on a hard drive, for example, it's going to take at least one
seek (probably at least 10ms) to expire a single client, so you're never
going to destroy more than 100 per second. That's what you need to
parallelize. See item 3 from
https://lore.kernel.org/linux-nfs/20220523154026.GD24163@xxxxxxxxxxxx/
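
The arithmetic behind the 100-per-second ceiling can be written out as a small sketch. The function name and the `syncs_per_expiry` and `in_flight` parameters are hypothetical knobs, not anything from nfsd, and `in_flight` assumes that concurrent expiries overlap their waits perfectly (for example by sharing one commit), which real disks only approximate.

```python
def max_expiries_per_second(seek_ms, syncs_per_expiry=1, in_flight=1):
    """Upper bound on the expiry rate when each expiry waits on the disk.

    seek_ms          -- time for one seek/sync, in milliseconds
    syncs_per_expiry -- how many such waits one expiry needs (assumption)
    in_flight        -- expiries whose waits fully overlap (idealized)
    """
    return in_flight * 1000.0 / (seek_ms * syncs_per_expiry)


if __name__ == "__main__":
    # One 10 ms seek per client, strictly serialized: the bound quoted above.
    print(max_expiries_per_second(10))               # 100.0
    # Same disk, eight expiries overlapped under the idealized assumption.
    print(max_expiries_per_second(10, in_flight=8))  # 800.0
```

The bound scales with the number of overlapped waits and shrinks with every extra sync per expiry, which is why the waiting, not the CPU, is the part worth parallelizing.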
Right! Thank you for reminding me of this. I'll add it to my plate
if no one has gotten to it yet.
-Dai
Also, looks like current nfs-utils is still doing 3 commits per expiry.
Steve, for some reason I think "nfsdcld: use WAL journal for faster
commits" never got applied:
https://lore.kernel.org/linux-nfs/20220104222445.GF12040@xxxxxxxxxxxx/
--b.
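
For readers unfamiliar with the WAL idea, here is a minimal sketch of switching an SQLite database to write-ahead logging and removing several clients in one transaction. It is not the patch referenced above and not nfsdcld's code: the `clients` table, its column, and both function names are assumptions made for illustration.

```python
import os
import sqlite3
import tempfile


def open_wal_db(path):
    """Open (or create) a database and ask SQLite for WAL journaling.

    Returns the connection and the journal mode SQLite actually selected.
    """
    conn = sqlite3.connect(path)
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    # Hypothetical schema: one row per tracked client.
    conn.execute("CREATE TABLE IF NOT EXISTS clients (id BLOB PRIMARY KEY)")
    conn.commit()
    return conn, mode


def expire_clients(conn, ids):
    """Delete a batch of clients in a single transaction (one commit)."""
    with conn:  # commits on success, rolls back on an exception
        conn.executemany("DELETE FROM clients WHERE id = ?",
                         [(i,) for i in ids])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        conn, mode = open_wal_db(os.path.join(tmp, "demo.sqlite"))
        with conn:
            conn.executemany("INSERT INTO clients VALUES (?)",
                             [(b"a",), (b"b",), (b"c",)])
        expire_clients(conn, [b"a", b"b"])
        left = conn.execute("SELECT COUNT(*) FROM clients").fetchone()[0]
        print(mode, left)
        conn.close()
```

Batching several expiries into one transaction is a separate choice from the journal mode; the sketch combines them only to show where the number of commits per expiry comes from.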