On 4/23/22 12:02 PM, Jens Axboe wrote:
> On 4/23/22 11:32 AM, Jens Axboe wrote:
>>> I guess copy_to_user saves us from having to consider endianness.
>>
>> I was considering that too, definitely something that should be
>> investigated. Making it a 1/2/4/8 switch and using put_user() is
>> probably a better idea. Easy enough to benchmark.
>
> FWIW, this is the current version. Some quick benchmarking doesn't show
> any difference between copy_to_user and put_user, but that may depend on
> the arch as well (using aarch64). But we might as well use put_user and
> combine it with the length check, so we explicitly only support 1/2/4/8
> sizes.

In terms of performance, on this laptop, I can do about 36-37M NOP
requests per second. If I use IORING_OP_MEMCPY with immediate mode, it's
around 15M ops/sec. This is regardless of the size; I get about the same
whether it's a 1 or 8 byte memory write.

-- 
Jens Axboe
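
For anyone following along, here is a rough userspace sketch of the "1/2/4/8 switch" idea discussed above. It is not the actual patch: store_imm() is a hypothetical name, and a fixed-size memcpy() of the truncated value stands in for what would be a put_user() of the matching type in the kernel. The point is only that the length check and the store fold into one switch, and that storing a typed value (rather than copying the first len bytes of a 64-bit immediate) behaves the same on either endianness.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for an immediate-mode store.
 * Assumption: in the kernel each case would be a put_user() of the
 * matching type; here a fixed-size memcpy() of the truncated value
 * plays that role. Any length other than 1/2/4/8 is rejected.
 */
static int store_imm(void *dst, uint64_t val, size_t len)
{
	switch (len) {
	case 1: {
		uint8_t v = (uint8_t)val;	/* keep the low 8 bits */
		memcpy(dst, &v, sizeof(v));
		return 0;
	}
	case 2: {
		uint16_t v = (uint16_t)val;	/* keep the low 16 bits */
		memcpy(dst, &v, sizeof(v));
		return 0;
	}
	case 4: {
		uint32_t v = (uint32_t)val;	/* keep the low 32 bits */
		memcpy(dst, &v, sizeof(v));
		return 0;
	}
	case 8:
		memcpy(dst, &val, sizeof(val));
		return 0;
	default:
		/* the length check is folded into the switch */
		return -EINVAL;
	}
}
```

Because each case truncates the value before storing it, a 4-byte store of a 64-bit immediate always writes the low 32 bits, whereas a raw copy of the first 4 bytes would pick up the high half on a big-endian machine.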