On Mon, 24 Jan 2022 at 19:38, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
>
> On Sun, Jan 23, 2022 at 11:53:08PM +0000, Daire Byrne wrote:
> > I've been experimenting a bit more with high latency NFSv4.2 (200ms).
> > I've noticed a difference between the file creation rates when you
> > have parallel processes running against a single client mount creating
> > files in multiple directories compared to in one shared directory.
>
> The Linux VFS requires an exclusive lock on the directory while you're
> creating a file.

Right. So when I mounted the same server/dir multiple times using
namespaces, all I was really doing was making the VFS *think* I wanted
locks on different directories, even though the remote server directory
was actually the same?

> So, if L is the time in seconds required to create a single file, you're
> never going to be able to create more than 1/L files per second, because
> there's no parallelism.

And things like directory delegations can't help with this kind of
workload? You can't batch directory locks or file creates, I guess.

> So, it's not surprising you'd get a higher rate when creating in
> multiple directories.
>
> Also, that lock's taken on both client and server. So it makes sense
> that you might get a little more parallelism from multiple clients.
>
> So the usual advice is just to try to get that latency number as low as
> possible, by using a low-latency network and storage that can commit
> very quickly. (An NFS server isn't permitted to reply to the RPC
> creating the new file until the new file actually hits stable storage.)
>
> Are you really seeing 200ms in production?

Yeah, it's just a (crazy) test for now. This is the latency between two
of our offices. Running batch jobs over this kind of latency with an NFS
re-export server doing all the caching works surprisingly well.

It's just these file creations that are the deal breaker. A batch job
might create 100,000+ files in a single directory across many clients.
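
As a rough sanity check on the 1/L bound quoted above, here is a minimal
Python sketch of the arithmetic. It assumes each create costs exactly one
round trip of L seconds, that creates within one directory are fully
serialised, and that separate directories proceed fully in parallel. The
function names are made up for illustration, and real rates will be lower
once server commit time is added on top of the network latency.

```python
def max_create_rate(latency_s: float, directories: int = 1) -> float:
    """Upper bound on creates/second when each directory serialises creates.

    Assumption: one create takes exactly latency_s seconds, and creates in
    different directories do not contend with each other.
    """
    if latency_s <= 0:
        raise ValueError("latency_s must be positive")
    if directories < 1:
        raise ValueError("directories must be at least 1")
    return directories / latency_s


def time_to_create(n_files: int, latency_s: float, directories: int = 1) -> float:
    """Lower bound, in seconds, on the time needed to create n_files."""
    return n_files / max_create_rate(latency_s, directories)


if __name__ == "__main__":
    latency = 0.2      # 200ms, the latency between the two offices
    n_files = 100_000  # files created by one batch job

    rate = max_create_rate(latency)
    hours = time_to_create(n_files, latency) / 3600
    print(f"one directory: {rate:.1f} creates/s, about {hours:.1f} hours")

    # Hypothetical: the same files spread evenly over 10 directories.
    hours_10 = time_to_create(n_files, latency, directories=10) / 3600
    print(f"ten directories: about {hours_10:.2f} hours")
```

Under those assumptions a single shared directory tops out at 5 creates
per second, so 100,000 files need at least 20,000 seconds (roughly five
and a half hours), no matter how many processes on the one client are
creating files.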

Maybe many containerised re-export servers in round-robin with a common
cache is the only way to get more directory locks and file creates in
flight at the same time.

Cheers,

Daire