On 11/30/18 2:44 PM, Jeff Moyer wrote:
> Hi, Jens,
>
> Jens Axboe <axboe@xxxxxxxxx> writes:
>
>> If we have fixed user buffers, we can map them into the kernel when we
>> setup the io_context. That avoids the need to do get_user_pages() for
>> each and every IO.
>>
>> To utilize this feature, the application must set both
>> IOCTX_FLAG_USERIOCB, to provide iocb's in userspace, and then
>> IOCTX_FLAG_FIXEDBUFS. The latter tells aio that the iocbs that are
>> mapped already contain valid destination and sizes. These buffers can
>> then be mapped into the kernel for the life time of the io_context, as
>> opposed to just the duration of the each single IO.
>>
>> Only works with non-vectored read/write commands for now, not with
>> PREADV/PWRITEV.
>>
>> A limit of 4M is imposed as the largest buffer we currently support.
>> There's nothing preventing us from going larger, but we need some cap,
>> and 4M seemed like it would definitely be big enough.
>
> Doesn't this mean that a user can pin a bunch of memory? Something like
> 4MB * aio_max_nr?
>
> $ sysctl fs.aio-max-nr
> fs.aio-max-nr = 1048576
>
> If so, it may be a good idea to account the memory under RLIMIT_MEMLOCK.

Yes, it'll need some kind of limiting, right now the limit would indeed
be aio-max-nr * 4MB. 4G isn't terrible, but... RLIMIT_MEMLOCK isn't a
bad idea.

> I'm not sure how close you are to proposing this patch set for realz.
> If it's soon (now?), then CC-ing linux-api and writing man pages would
> be a good idea. I can help out with the libaio bits if you'd like. I
> haven't yet had time to take this stuff for a spin, sorry. I'll try to
> get to that soonish.

I am proposing it for real, not sure how long it'll take to get it
reviewed and moved forward. Unless I get lucky. 4.22 seems like a more
viable version than 4.21.

I'll take any help I can get on the API/man page parts. And/or testing!

> The speedups are pretty impressive!

That's why I put them in there, maybe that'd get people's attention :-)

-- 
Jens Axboe
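
The worst-case pinning bound raised in the thread (aio-max-nr * 4MB) and the suggested RLIMIT_MEMLOCK check can be sketched roughly as below. This is a userspace illustration in Python, not code from the patch set: the helper names (read_aio_max_nr, worst_case_pinned, fits_memlock) are hypothetical, the 4 MiB cap and the fallback of 1048576 are taken from the messages above, and the comparison only approximates what kernel-side accounting might do. With those quoted values the product is 2^20 * 4 MiB = 4 TiB.

```python
import resource

# Per-buffer cap from the patch description ("A limit of 4M is imposed").
MAX_FIXED_BUF = 4 * 1024 * 1024


def read_aio_max_nr(path="/proc/sys/fs/aio-max-nr", default=1048576):
    """Return fs.aio-max-nr, falling back to the value quoted in the thread."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return default


def worst_case_pinned(aio_max_nr, buf_cap=MAX_FIXED_BUF):
    """Upper bound on pinned bytes if every iocb maps a maximum-size buffer."""
    return aio_max_nr * buf_cap


def fits_memlock(nbytes, soft_limit):
    """Hypothetical accounting check: would nbytes fit under RLIMIT_MEMLOCK?"""
    return soft_limit == resource.RLIM_INFINITY or nbytes <= soft_limit


if __name__ == "__main__":
    nr = read_aio_max_nr()
    bound = worst_case_pinned(nr)
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    print(f"aio-max-nr = {nr}")
    print(f"worst-case pinned = {bound} bytes ({bound / 2**40:.2f} TiB)")
    print(f"RLIMIT_MEMLOCK soft limit = {soft}")
    print(f"fits under RLIMIT_MEMLOCK: {fits_memlock(bound, soft)}")
```

Running the script on a Linux box prints the local sysctl value, the resulting bound, and whether that bound would fit under the calling process's current soft RLIMIT_MEMLOCK.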