On Mon, 25 Apr 2022 at 14:22, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
>
> On Mon, Apr 25, 2022 at 02:00:32PM +0100, Daire Byrne wrote:
> > On Mon, 21 Feb 2022 at 13:59, Daire Byrne <daire@xxxxxxxx> wrote:
> > >
> > > On Fri, 18 Feb 2022 at 07:46, NeilBrown <neilb@xxxxxxx> wrote:
> > > > I've ported it to mainline without much trouble. I started some simple
> > > > testing (parallel create/delete of the same file) and hit a bug quite
> > > > easily. I fixed that (eventually) and then tried with more than 1 CPU,
> > > > and hit another bug. But then it was quitting time. If I can get rid
> > > > of all the easy to find bugs, I'll post it with a CC to you, and you can
> > > > find some more for me!
> > >
> > > That would be awesome! I have a real world production case for this
> > > and it's a pretty heavy workload. If that doesn't shake out any bugs,
> > > nothing will.
> > >
> > > The only caveat being that it will likely be restricted to NFSv3
> > > testing due to the concurrency limitations with NFSv4.1+ (from the
> > > other thread).
> > >
> > > Daire
> >
> > Just to follow up on this again - I have been using Neil's patch for
> > parallel file creates (thanks!) but I'm a bit confused as to why it
> > doesn't seem to help in my NFS re-export case.
> >
> > With the patch, I can achieve much higher parallel (multi process)
> > creates directly on my re-export server to a high latency remote
> > server mount, but when I re-export that to multiple clients, the
> > aggregate create rate again degrades to that which we might expect
> > either without the patch or if there was only one process creating
> > the files in sequence.
> >
> > My assumption was that the nfsd threads of the re-export server would
> > act as multiple independent processes and its clients would be spread
> > across them such that they would also benefit from the parallel
> > creates patch on the re-export server. So I expected many clients
> > creating files in the same directory would achieve much higher
> > aggregate performance.
>
> That's the idea.
>
> I've lost track, where's the latest version of Neil's patch?
>
> --b.

The latest is still the one from this thread (with a minor update to
apply it to v5.18-rc):

https://lore.kernel.org/lkml/893053D7-E5DD-43DB-941A-05C10FF5F396@xxxxxxxxx/T/#m922999bf830cacb745f32cc464caf72d5ffa7c2c

My test is something like this:

reexport1 # for x in {1..5000}; do
    echo /srv/server1/touch.$HOSTNAME.$x
done | xargs -n1 -P 200 -iX -t touch X 2>&1 | pv -l -a >|/dev/null

Without the patch this results in 3 creates/s, and with the patch it's
~250 creates/s with 200 threads/processes (200ms latency) when run
directly against a remote RHEL8 server (server1).

Then I run something similar to this, but simultaneously across 200
clients of the "reexport1" server's re-export of the originating
"server1". I get an aggregate of around 3 creates/s even with the patch
applied to reexport1 (v5.18-rc2), which is suspiciously similar to the
performance without the parallel vfs create patch.

The clients don't run any special kernels or configurations. I have
only tested NFSv3 so far.

Daire
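P.S. For anyone who wants to try the shape of the harness locally, here
is a self-contained sketch of the loop above with pv, -t, and the NFS
mount stripped out. A local tmpdir stands in for /srv/server1 and the
counts are scaled down, so it only exercises the xargs fan-out, not NFS
create rates:

```shell
# Local sketch of the create test; a tmpdir stands in for the NFS
# mount, so this demonstrates the harness only, not NFS behaviour.
dir=$(mktemp -d)

# Generate target names and fan them out to parallel touch workers,
# same shape as the reexport1 test (portable -I instead of GNU -i).
for x in $(seq 1 100); do
    echo "$dir/touch.$(hostname).$x"
done | xargs -P 20 -I X touch X

# Every generated name should now exist as a file.
created=$(ls "$dir" | wc -l)
echo "created: $created"

rm -rf "$dir"
```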