On Fri 23-11-12 02:55:03, Christoph Hellwig wrote: > Add support to the core direct-io code to defer AIO completions to user > context using a workqueue. This replaces opencoded and less efficient > code in XFS and ext4 and will be needed to properly support O_(D)SYNC > for AIO. > > The communication between the filesystem and the direct I/O code requires > a new buffer head flag, which is a bit ugly but not avoidable until the > direct I/O code stops abusing the buffer_head structure for communicating > with the filesystems. > > Currently this creates a per-superblock unbnound workqueue for these > completions, which is taken from an earlier patch by Jan Kara. I'm > not really convince about this use and would prefer a "normal" global > workqueue with a high concurrently limit, but this needs further discussion. > > Signed-off-by: Christoph Hellwig <hch@xxxxxx> I like this, but this patch already breaks ext4, doesn't it? > Index: linux-2.6/fs/ext4/inode.c > =================================================================== > --- linux-2.6.orig/fs/ext4/inode.c 2012-11-06 13:33:43.483326794 +0100 > +++ linux-2.6/fs/ext4/inode.c 2012-11-21 21:19:35.227136042 +0100 > @@ -2876,15 +2876,13 @@ static int ext4_get_block_write_nolock(s > } > > static void ext4_end_io_dio(struct kiocb *iocb, loff_t offset, > - ssize_t size, void *private, int ret, > - bool is_async) > + ssize_t size, void *private) > { > - struct inode *inode = iocb->ki_filp->f_path.dentry->d_inode; > ext4_io_end_t *io_end = iocb->private; > > /* if not async direct IO or dio with 0 bytes write, just return */ > if (!io_end || !size) > - goto out; > + return; > > ext_debug("ext4_end_io_dio(): io_end 0x%p " > "for inode %lu, iocb 0x%p, offset %llu, size %zd\n", > @@ -2896,19 +2894,11 @@ static void ext4_end_io_dio(struct kiocb > /* if not aio dio with unwritten extents, just free io and return */ > if (!(io_end->flag & EXT4_IO_END_UNWRITTEN)) { > ext4_free_io_end(io_end); > -out: > - if (is_async) > - aio_complete(iocb, ret, 0); > - inode_dio_done(inode); > return; > } > > io_end->offset = offset; > io_end->size = size; > - if (is_async) { > - io_end->iocb = iocb; > - io_end->result = ret; > - } > > ext4_add_complete_io(io_end); ^^^ Here ext4 offloads IO completion to a worker thread. So you now complete AIO / DIO before ext4_end_io() runs which is a bug (ext4_end_io() is responsible for example for calling end_page_writeback()). I'll modify these patches to work for ext4 tomorrow I hope... Honza -- Jan Kara <jack@xxxxxxx> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html