On Thu, Jul 24, 2014 at 11:16 PM, Dave Kleikamp
<dave.kleikamp@xxxxxxxxxx> wrote:
> On 07/23/2014 08:57 PM, Ming Lei wrote:
>> On Thu, Jul 24, 2014 at 7:16 AM, Zach Brown <zab@xxxxxxxxx> wrote:
>>> On Thu, Jul 24, 2014 at 06:55:28AM +0800, Ming Lei wrote:
>>>> From: Dave Kleikamp <dave.kleikamp@xxxxxxxxxx>
>>>>
>>>> This adds an interface that lets kernel callers submit aio iocbs without
>>>> going through the user space syscalls. This lets kernel callers avoid
>>>> the management limits and overhead of the context. It will also let us
>>>> integrate aio operations with other kernel apis that the user space
>>>> interface doesn't have access to.
>>>>
>>>> This patch is based on Dave's posts in below links:
>>>>
>>>> https://lkml.org/lkml/2013/10/16/365
>>>> https://groups.google.com/forum/#!topic/linux.kernel/l7mogGJZoKQ
>>>
>>> This was originally written a billion years ago when dinosaurs roamed
>>> the earth. Also, notably, before Kent and Ben reworked a bunch of the
>>
>> Not so far away, this patch is based on Dave's last version of V9, which
>> was posted in Oct, 2013, :-)
>
> Which was based on a much earlier patch from Zach. I regret that I left
> aio_kernel_submit entangled with aio_run_iocb when I reworked his patches.
>
>>> aio core. I'd want them to take a look at this patch to make sure that
>>> it doesn't rely on any assumptions that have changed.
>>
>> Looks I missed to Cc Ken, :-(
>>
>>>
>>>> +/* opcode values not exposed to user space */
>>>> +enum {
>>>> +	IOCB_CMD_READ_ITER = 0x10000,
>>>> +	IOCB_CMD_WRITE_ITER = 0x10001,
>>>> +};
>>>
>>> And I think the consensus was that this isn't good enough. Find a way
>>> to encode the kernel caller ops without polluting the uiocb cmd name
>>> space.
>>
>> That is easy, since the two cmd names are only for kernel AIO, whatever
>> should be OK, but looks I didn't see such comment.
>
> Agreed. These were added because the flags had been interpreted by
> aio_run_iocb(). I'm happy that is no longer the case.

We can remove the two cmd names completely and just use one read/write
flag; I will do that in V1.

>
>>>
>>> (I've now come to think that this entire approach of having loop use aio
>>> is misguided and that the way forward is to have dio consume what loop
>>> naturally produces -- bios, blk-mq rqs, whatever -- but I'm on to other
>>
>> Yes, that is what these patches are doing, and actually AIO's
>> model is a good match to driver's interface. Lots of drivers
>> use the asynchronous model(submit, complete, ...).
>>
>>> things these days.)
>>
>> At least, loop can improve its throughput much by kernel AIO
>> without big changes to fs/direct-io(attribute much to ITER_BVEC),
>> and vhost-scsi should benefit from it too.
>>
>> Thanks,
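
For illustration only, here is a rough sketch of the "one read/write flag"
idea: the direction lives in a kernel-private field, so no extra IOCB_CMD_*
values are needed. This is plain userspace C, not the actual patch, and every
name in it (kaio_req, KAIO_WRITE, kaio_init, kaio_is_write, kaio_submit) is
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical kernel-private flag; it is never part of the user-visible
 * IOCB_CMD_* opcode space. When the flag is clear, the request is a read. */
#define KAIO_WRITE (1u << 0)

/* Hypothetical stand-in for a kernel-side request, not a real kernel type. */
struct kaio_req {
	unsigned int flags;	/* KAIO_WRITE or 0 */
	size_t nbytes;		/* length of the transfer */
	int last_op;		/* set by kaio_submit(): 0 = read, 1 = write */
};

static void kaio_init(struct kaio_req *req, bool is_write, size_t nbytes)
{
	req->flags = is_write ? KAIO_WRITE : 0;
	req->nbytes = nbytes;
	req->last_op = -1;	/* nothing submitted yet */
}

static bool kaio_is_write(const struct kaio_req *req)
{
	return (req->flags & KAIO_WRITE) != 0;
}

/* Dispatch on the flag instead of on a cmd opcode. A real implementation
 * would call the file's read or write path here; this sketch only records
 * which branch was taken. */
static int kaio_submit(struct kaio_req *req)
{
	assert(req != NULL);
	req->last_op = kaio_is_write(req) ? 1 : 0;
	return 0;
}
```

A kernel caller would then pick the direction when it sets up the request,
e.g. kaio_init(&req, true, 4096) for a write, and the submit path needs no
cmd value at all.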