On Wed, 15 Nov 2023 at 10:50, David Howells <dhowells@xxxxxxxxxx> wrote:
>
> Add kunit tests to benchmark 256MiB copies to a KVEC iterator, a BVEC
> iterator, an XARRAY iterator and to a loop that allocates 256-page BVECs
> and fills them in (similar to a maximal bio struct being set up).

I see *zero* advantage of doing this in the kernel as opposed to doing
this benchmarking in user space.

If you cannot see the performance difference due to some user space
interface costs, then the performance difference doesn't matter.

Yes, some of the cases may be harder to trigger than others.
iov_iter_xarray() isn't as common an op as ubuf/iovec/etc, but that
either means that it doesn't matter enough, or that maybe some more
filesystems could be taught to use it for splice or whatever.

Particularly for something like different versions of memcpy(), this
whole benchmarking would want

 (a) profiles

 (b) be run on many different machines

 (c) be run repeatedly to get some idea of variance

and all of those only get *harder* to do with Kunit tests. In user
space? Just run the damn binary (ok, to get profiles you then have to
make sure you have the proper permission setup to get the kernel
profiles too, but a

     echo 1 > /proc/sys/kernel/perf_event_paranoid

as root will do that for you without you having to then do the actual
profiling run as root)

                Linus