Hello all,

First off, sorry for the wide-reaching To and Cc, but this patch series touches the core kernel and also reaches across subsystems a bit. If some of the people who read this can provide review feedback, I would very much appreciate it.

This series introduces new AIO functionality that makes use of kernel threads (by way of queue_work()) to implement additional asynchronous operations. The work came about as the result of various tuning done for my employer (Solace Systems) to the kernel that we ship in our products.

First, the benefits: using kernel threads to implement AIO functionality has a significant benefit in our application. Compared to a user space thread pool based AIO implementation, we see roughly a 25% performance improvement in our application by using this new kernel based AIO functionality. This comes about as a consequence of fewer context switches, fewer transitions to/from userspace, and the ability to make certain optimizations in the kernel that are otherwise impossible in userspace (i.e. the new readahead functionality).

Now the downsides: when using queue_work(), code executes in the context of a different task than the submitter of the operation. This means that there are significant security concerns if there are any bugs in the code that sets up the appropriate security credentials and related context in struct task_struct. There may well be DoS bugs in this implementation which have yet to be discovered.

Given the benefits, I am of the opinion that this patch series is a useful addition to the kernel. Since this code will be experimental for some period of time as the interactions with other subsystems are reviewed and tested, I have implemented a config option to allow this code to be compiled out and a sysctl (fs.aio-auto-threads) that must be explicitly set to 1 before this new functionality is available to userspace.
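
For anyone scripting around the sysctl gate, a minimal userspace sketch of checking it before submitting the new operations might look like the following. It assumes the sysctl shows up at /proc/sys/fs/aio-auto-threads (the usual procfs mapping for fs.* sysctls); the helper names are hypothetical and are not part of this series or of libaio.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed procfs location of the fs.aio-auto-threads sysctl. */
#define AIO_AUTO_THREADS_PATH "/proc/sys/fs/aio-auto-threads"

/*
 * Hypothetical helper: interpret the text read from the sysctl file.
 * Returns 1 if the value is exactly 1 (the only setting the cover
 * letter describes as enabling the feature), 0 for any other integer,
 * and -1 if the text is empty or not a plain integer.
 */
static int parse_sysctl_flag(const char *text)
{
	char *end;
	long val;

	if (text == NULL || *text == '\0')
		return -1;

	val = strtol(text, &end, 10);
	if (end == text)
		return -1;	/* no digits were consumed */

	/* Only trailing whitespace (typically one newline) is accepted. */
	while (*end == '\n' || *end == ' ' || *end == '\t')
		end++;
	if (*end != '\0')
		return -1;

	return val == 1;
}

/*
 * Hypothetical helper: read the sysctl file at 'path'.
 * Returns 1 if enabled, 0 if disabled, and -1 if the file cannot be
 * opened (e.g. the option is compiled out) or cannot be parsed.
 */
static int aio_auto_threads_enabled(const char *path)
{
	char buf[32];
	size_t n;
	FILE *f;

	assert(path != NULL);

	f = fopen(path, "r");
	if (f == NULL)
		return -1;

	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';

	return parse_sysctl_flag(buf);
}
```

Calling aio_auto_threads_enabled(AIO_AUTO_THREADS_PATH) at startup lets an application fall back to its existing userspace thread pool when the result is not 1.
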
Hopefully this is enough to address the security concerns during the growing pains and allow other developers to safely explore the new functionality.

Caveats: the existing O_DIRECT AIO code path is currently bypassed when the new thread helpers are enabled. I plan to do additional work in this area, but the fact that the dio code can block under certain conditions is not acceptable to the applications I am working on, as it leads to starvation of other requests the system is processing. That said, this is what's ready today, and I hope that people can provide feedback to help drive further improvements.

I will be posting further documentation and test cases later this week for people to experiment with, but for those looking for a few test programs to exercise the new functionality, there is a collection of code at git://git.kvack.org/aio-testprogs.git/ . Getting the code cleaned up from the internal implementation to something that is in reasonable condition for submission ended up taking longer than expected. Thankfully, this kernel cycle lines up with some internal QA work, so there should be additional testing taking place over the next couple of months.

Also, the libaio test harness has some bugs that the new functionality revealed. A version with fixes for those tests can be fetched from git://git.kvack.org/~bcrl/libaio.git/ . Wrappers for the new IOCB_CMD types should be posted there by the end of the day.

Some notes on the new functionality: all operations are cancellable, provided that the kernel subsystem involved aborts operations when delivered a SIGKILL. This ensures that async operations on pipes and sockets are cancelled when the process that issued the operations exits. A couple of the test programs exercise this functionality on pipes.

Signal handling is slightly impacted by this AIO functionality. Specifically, the first patch in the series introduces a new helper, io_send_sig(), that delivers a signal intended for the performer of an I/O operation.
This is used to deliver signals like SIGXFSZ and SIGPIPE. It is a straightforward replacement of send_sig(SIGXXX, current, 0) with io_send_sig(SIGXXX).

As always, comments, bug reports and feedback are appreciated. Developers looking for a git pull can find one at git://git.kvack.org/aio-next.git/ .

Cheers!

		-ben

Benjamin LaHaise (13):
  signals: distinguish signals sent due to i/o via io_send_sig()
  aio: add aio_get_mm() helper
  aio: for async operations, make the iter argument persistent
  signals: add and use aio_get_task() to direct signals sent via io_send_sig()
  fs: make do_loop_readv_writev() non-static
  aio: add queue_work() based threaded aio support
  aio: enabled thread based async fsync
  aio: add support for aio poll via aio thread helper
  aio: add support for async openat()
  aio: add async unlinkat functionality
  mm: enable __do_page_cache_readahead() to include present pages
  aio: add support for aio readahead
  aio: add support for aio renameat operation

 drivers/gpu/drm/drm_lock.c     |   2 +-
 drivers/gpu/drm/ttm/ttm_lock.c |   6 +-
 fs/aio.c                       | 727 ++++++++++++++++++++++++++++++++++++++---
 fs/attr.c                      |   2 +-
 fs/binfmt_flat.c               |   2 +-
 fs/fuse/dev.c                  |   2 +-
 fs/internal.h                  |   6 +
 fs/namei.c                     |   2 +-
 fs/pipe.c                      |   4 +-
 fs/read_write.c                |   5 +-
 fs/splice.c                    |   8 +-
 include/linux/aio.h            |   9 +
 include/linux/fs.h             |   3 +
 include/linux/sched.h          |   6 +
 include/uapi/linux/aio_abi.h   |  15 +-
 init/Kconfig                   |  13 +
 kernel/auditsc.c               |   6 +-
 kernel/signal.c                |  20 ++
 kernel/sysctl.c                |   9 +
 mm/filemap.c                   |   6 +-
 mm/internal.h                  |   4 +-
 mm/readahead.c                 |  13 +-
 net/atm/common.c               |   4 +-
 net/ax25/af_ax25.c             |   2 +-
 net/caif/caif_socket.c         |   2 +-
 net/core/stream.c              |   2 +-
 net/decnet/af_decnet.c         |   2 +-
 net/irda/af_irda.c             |   4 +-
 net/netrom/af_netrom.c         |   2 +-
 net/rose/af_rose.c             |   2 +-
 net/sctp/socket.c              |   2 +-
 net/unix/af_unix.c             |   4 +-
 net/x25/af_x25.c               |   2 +-
 33 files changed, 817 insertions(+), 81 deletions(-)

-- 
2.5.0

-- 
"Thought is the essence of where you are now."