Dominique Martinet wrote on Mon, Jul 23, 2018:
> I'll try to get figures for various approaches before the merge window
> for 4.19 starts, it's getting closer though...

Here are some numbers, with v4.18-rc7 + current test tree (my 9p-next)
as a base.

For context, I'm running on VMs that bind their cores to CPUs on the
host (32 cores), and have a Connect-IB Mellanox card through SRIOV.
The server is nfs-ganesha, serving a tmpfs filesystem on a second VM
(different host).
Mounting with msize=$((1024*1024))

My main problem with this test is that the client has way too much
memory and it's mostly pristine with a boot not long before, so any kind
of memory pressure won't be seen here.
If someone knows how to fragment memory quickly I'll take that and rerun
the tests :)

I've changed my mind from mdtest to a simple ior: as I'm testing on
trans=rdma there's no difference, and I'm more familiar with ior
options.

I ran two workloads:
 - 32 processes, file per process, 512k at a time writing a total of
   32GB (1GB per file), repeated 10 times
 - 32 processes, file per process, 32 bytes at a time writing a total of
   16MB (512k per file), repeated 10 times.

The first test gives a proper impression of the throughput the systems
can sustain and the results are pretty much around what I was expecting
for the setup; the second test is purely a latency test (how long does
it take to send 512k RPCs).

I ran almost all of these tests with KASAN enabled in the VMs a first
time, so I'm leaving the results with KASAN at the end for reference...

Overall I'm rather happy with the result: without KASAN the overhead of
the patch isn't negligible (~6%) but I'd say it's acceptable for
correctness, and with an extra two patches with the suggested changes
(rounding down the alloc size to not include the struct overhead and
separate kmem cache) it's getting down to 0.5%, which is quite good, I
think.

I'll send the two patches to the list shortly.
The first one is rather huge even if it's a trivial change logically,
so part of me wants to get it merged quickly to not have to deal with
rebases... ;)

With KASAN, well, it certainly does more things but I hope
performance-critical systems don't have it enabled in the first place.


Raw results:

 * Base = 4.18-rc7 + queued patches, without request cache rework
- "Big" I/Os:
Summary of all tests:
Operation  Max(MiB)  Min(MiB) Mean(MiB)    StdDev   Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write       5842.40   5751.58   5793.53     23.93   5.65606 0 32 32 10 1 0 1 0 0 1 1073741824 524288 34359738368 POSIX 0
read        6098.92   6018.63   6064.30     20.00   5.40348 0 32 32 10 1 0 1 0 0 1 1073741824 524288 34359738368 POSIX 0

- "Small" I/Os:
Operation  Max(MiB)  Min(MiB) Mean(MiB)    StdDev   Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write          2.10      1.91      2.00      0.05   8.01074 0 32 32 10 1 0 1 0 0 1 524288 32 16777216 POSIX 0
read           1.27      1.07      1.15      0.06  13.93901 0 32 32 10 1 0 1 0 0 1 524288 32 16777216 POSIX 0

 -> 512k / 8.01074 = 65.4k req/s

 * Base + patch as submitted
- "Big" I/Os:
Operation  Max(MiB)  Min(MiB) Mean(MiB)    StdDev   Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write       5844.84   5665.32   5787.15     48.94   5.66261 0 32 32 10 1 0 1 0 0 1 1073741824 524288 34359738368 POSIX 0
read        6082.24   6039.62   6057.14     12.50   5.40983 0 32 32 10 1 0 1 0 0 1 1073741824 524288 34359738368 POSIX 0

- "Small" I/Os:
Operation  Max(MiB)  Min(MiB) Mean(MiB)    StdDev   Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write          1.95      1.82      1.88      0.04   8.50453 0 32 32 10 1 0 1 0 0 1 524288 32 16777216 POSIX 0
read           1.18      1.07      1.14      0.03  14.04634 0 32 32 10 1 0 1 0 0 1 524288 32 16777216 POSIX 0

 -> 512k / 8.50453 = 61.6k req/s

 * Base + patch as submitted + moving the header into req so the
allocation is "round" as suggested by Matthew
- "Big" I/Os:
Operation  Max(MiB)  Min(MiB) Mean(MiB)    StdDev   Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write       5861.79   5680.99   5795.71     48.84   5.65424 0 32 32 10 1 0 1 0 0 1 1073741824 524288 34359738368 POSIX 0
read        6098.54   6037.55   6067.80     19.39   5.40036 0 32 32 10 1 0 1 0 0 1 1073741824 524288 34359738368 POSIX 0

- "Small" I/Os:
Operation  Max(MiB)  Min(MiB) Mean(MiB)    StdDev   Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write          1.98      1.81      1.90      0.06   8.43521 0 32 32 10 1 0 1 0 0 1 524288 32 16777216 POSIX 0
read           1.19      1.08      1.13      0.03  14.11709 0 32 32 10 1 0 1 0 0 1 524288 32 16777216 POSIX 0

 -> 62.2k req/s

 * Base + patches submitted + round alloc + kmem cache in the client
struct
- "Big" I/Os:
Operation  Max(MiB)  Min(MiB) Mean(MiB)    StdDev   Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write       5859.51   5747.64   5808.22     34.81   5.64186 0 32 32 10 1 0 1 0 0 1 1073741824 524288 34359738368 POSIX 0
read        6087.90   6037.03   6063.98     15.14   5.40374 0 32 32 10 1 0 1 0 0 1 1073741824 524288 34359738368 POSIX 0

- "Small" I/Os:
Operation  Max(MiB)  Min(MiB) Mean(MiB)    StdDev   Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write          2.07      1.95      1.99      0.03   8.05362 0 32 32 10 1 0 1 0 0 1 524288 32 16777216 POSIX 0
read           1.22      1.11      1.16      0.04  13.75312 0 32 32 10 1 0 1 0 0 1 524288 32 16777216 POSIX 0

 -> 65.1k req/s

 * Base + patches submitted + kmem cache in the client struct (kind of
similar to testing an 'odd' msize like 1.001MB)
- "Big" I/Os:
Operation  Max(MiB)  Min(MiB) Mean(MiB)    StdDev   Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write       5883.03   5725.30   5811.58     45.22   5.63874 0 32 32 10 1 0 1 0 0 1 1073741824 524288 34359738368 POSIX 0
read        6090.29   6015.23   6062.49     25.93   5.40514 0 32 32 10 1 0 1 0 0 1 1073741824 524288 34359738368 POSIX 0

- "Small" I/Os:
Operation  Max(MiB)  Min(MiB) Mean(MiB)    StdDev   Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write          2.07      1.89      1.98      0.05   8.10028 0 32 32 10 1 0 1 0 0 1 524288 32 16777216 POSIX 0
read           1.23      1.05      1.12      0.05  14.25607 0 32 32 10 1 0 1 0 0 1 524288 32 16777216 POSIX 0

 -> 64.7k req/s


Raw results with KASAN:

 * Base = 4.18-rc7 + queued patches, without request cache rework
- "Big" I/Os:
Operation  Max(MiB)  Min(MiB) Mean(MiB)    StdDev   Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write       5790.03   5705.32   5749.69     27.63   5.69922 0 32 32 10 1 0 1 0 0 1 1073741824 524288 34359738368 POSIX 0
read        6095.11   6007.29   6066.50     26.26   5.40157 0 32 32 10 1 0 1 0 0 1 1073741824 524288 34359738368 POSIX 0

- "Small" I/Os:
Operation  Max(MiB)  Min(MiB) Mean(MiB)    StdDev   Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write          1.63      1.53      1.58      0.03  10.10286 0 32 32 10 1 0 1 0 0 1 524288 32 16777216 POSIX 0
read           1.43      1.19      1.31      0.07  12.27704 0 32 32 10 1 0 1 0 0 1 524288 32 16777216 POSIX 0

 * Base + patch as submitted
- "Big" I/Os:
Operation  Max(MiB)  Min(MiB) Mean(MiB)    StdDev   Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write       5773.60   5673.92   5729.01     29.63   5.71982 0 32 32 10 1 0 1 0 0 1 1073741824 524288 34359738368 POSIX 0
read        6097.96   6006.50   6059.40     26.74   5.40790 0 32 32 10 1 0 1 0 0 1 1073741824 524288 34359738368 POSIX 0

- "Small" I/Os:
Operation  Max(MiB)  Min(MiB) Mean(MiB)    StdDev   Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write          1.15      1.08      1.12      0.02  14.32230 0 32 32 10 1 0 1 0 0 1 524288 32 16777216 POSIX 0
read           1.18      1.06      1.10      0.04  14.51172 0 32 32 10 1 0 1 0 0 1 524288 32 16777216 POSIX 0

 * Base + patch as submitted + moving the header into req so the
allocation is "round" as suggested by Matthew
- "Big" I/Os:
Operation  Max(MiB)  Min(MiB) Mean(MiB)    StdDev   Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write       5878.75   5709.74   5798.96     57.12   5.65122 0 32 32 10 1 0 1 0 0 1 1073741824 524288 34359738368 POSIX 0
read        6089.83   6039.75   6072.64     14.78   5.39604 0 32 32 10 1 0 1 0 0 1 1073741824 524288 34359738368 POSIX 0

- "Small" I/Os:
Operation  Max(MiB)  Min(MiB) Mean(MiB)    StdDev   Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write          1.33      1.26      1.29      0.02  12.38185 0 32 32 10 1 0 1 0 0 1 524288 32 16777216 POSIX 0
read           1.18      1.08      1.15      0.03  13.90525 0 32 32 10 1 0 1 0 0 1 524288 32 16777216 POSIX 0

 * Base + patches submitted + round alloc + kmem cache in the client
struct
- "Big" I/Os:
Operation  Max(MiB)  Min(MiB) Mean(MiB)    StdDev   Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write       5816.89   5729.58   5775.02     26.71   5.67422 0 32 32 10 1 0 1 0 0 1 1073741824 524288 34359738368 POSIX 0
read        6087.33   6032.62   6058.69     16.73   5.40847 0 32 32 10 1 0 1 0 0 1 1073741824 524288 34359738368 POSIX 0

- "Small" I/Os:
Operation  Max(MiB)  Min(MiB) Mean(MiB)    StdDev   Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
write          0.87      0.85      0.86      0.01  18.59584 0 32 32 10 1 0 1 0 0 1 524288 32 16777216 POSIX 0
read           0.89      0.86      0.88      0.01  18.26275 0 32 32 10 1 0 1 0 0 1 524288 32 16777216 POSIX 0

 -> I'm not sure why it's so different, actually; the cache doesn't
turn up in /proc/slabinfo so I'm figuring it got merged with
kmalloc-1024, in which case there should be no difference? And this
turned out fine without KASAN...

-- 
Dominique
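
For reference, the req/s figures and the ~6% / 0.5% overheads quoted
above follow from the Mean(s) column of the small-I/O write runs without
KASAN. A minimal Python sketch of that arithmetic, assuming "512k" means
16 MiB / 32 bytes = 524288 requests and "k req/s" means thousands; the
labels and helper names are illustrative only, the timings are copied
from the tables:

```python
# Number of 32-byte requests in the small-I/O test: 16 MiB total / 32 bytes.
TOTAL_BYTES = 16 * 1024 * 1024
XFER_BYTES = 32
N_RPCS = TOTAL_BYTES // XFER_BYTES  # 524288, the "512k" in the mail

# Mean(s) of the small-I/O write runs without KASAN (from the tables).
mean_write_s = {
    "base": 8.01074,
    "patch": 8.50453,
    "patch + round alloc": 8.43521,
    "patch + round alloc + kmem cache": 8.05362,
    "patch + kmem cache": 8.10028,
}


def req_per_s(seconds):
    """Requests per second for one run of N_RPCS requests."""
    return N_RPCS / seconds


def overhead_pct(seconds, base=mean_write_s["base"]):
    """Extra wall time relative to the base run, in percent."""
    return (seconds / base - 1.0) * 100.0


if __name__ == "__main__":
    for label, secs in mean_write_s.items():
        print(f"{label:34s} {req_per_s(secs) / 1000:5.1f}k req/s"
              f"  {overhead_pct(secs):+5.2f}%")
```

This only looks at write times; the read numbers would need the same
treatment with their own Mean(s) values.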