On Fri 30-08-13 16:53:01, Al Viro wrote:
> On Wed, Aug 14, 2013 at 11:10:54AM +0200, Jan Kara wrote:
> >   Hello,
> >
> >   this is the second iteration of patches to fix handling of O_SYNC AIO DIO.
> > Since the previous version I've addressed Dave's comments:
> > - slightly expanded changelog of the first patch
> > - workqueue is now created with parameters allowing parallelism
> > - workqueue name contains sb->s_id
> > - workqueue is created on demand (I decided to do this to reduce the
> >   overhead in unnecessary cases)
> >
> > The patchset survives xfstests run for ext4 & xfs so it should be sane. Since
> > this touches several filesystems (although only ext4 & xfs are non-trivial),
> > the question is who should carry these patches. Maybe Al? But since xfs and
> > ext4 changes are non-trivial, I'd like to have a review from their
> > developers...
>
> Looks sane, except that I'd probably put destroying the queue after
> evict_inodes(), next to ->put_super() call.
  OK, I've changed that. I'll send v3 in a moment.

> Said that, there's another interesting problem in the code affected by that
> sucker: generic_file_aio_write() might very well sync the wrong range.
> Consider O_APPEND case; __generic_file_aio_write() will call
> generic_write_checks(), which will update its copy of pos, and proceed to
> write starting from there. All right and proper, but then we return into
> generic_file_aio_write() and sync the range of the right length, starting
> at the *original* value of pos...
  Yes, that looks like a bug. I was looking into how we could fix that, and
the easiest option seems to be to move generic_segment_checks() and
generic_write_checks() from __generic_file_aio_write() to
generic_file_aio_write(). There are only three callers of
__generic_file_aio_write().
- cifs_writev(), which can and should use generic_file_aio_write() anyway
- ext4_file_dio_write(), which could use generic_file_aio_write() if we
  cleaned up the code and moved it around a bit
- blkdev_aio_write(), which really needs to call __generic_file_aio_write()
  (it doesn't want to grab i_mutex)

So that last caller would need to do the moved checks manually. But this all
seems a bit complex, so I'd prefer to do it as a separate series.

								Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
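
For illustration, here is a minimal userspace sketch of the O_APPEND range
mismatch discussed above. It is not kernel code: effective_pos(),
sync_range_buggy() and sync_range_fixed() are hypothetical names, and the
model assumes the only adjustment generic_write_checks() makes is moving pos
to i_size when O_APPEND is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a byte range handed to a sync routine. */
struct sync_range {
	long long start;	/* first byte to sync */
	long long len;		/* number of bytes to sync */
};

/*
 * Assumed model of the pos adjustment: with O_APPEND the write starts at
 * the current end of file, otherwise at the position the caller passed in.
 */
long long effective_pos(long long pos, long long i_size, bool append)
{
	return append ? i_size : pos;
}

/*
 * Buggy variant: the range has the right length but starts at the caller's
 * original pos, because the adjusted copy never propagates back.
 */
struct sync_range sync_range_buggy(long long pos, long long i_size,
				   bool append, long long written)
{
	struct sync_range r;

	(void)i_size;
	(void)append;
	assert(written >= 0);
	r.start = pos;
	r.len = written;
	return r;
}

/*
 * Fixed variant: the checks run before the range is computed, so the range
 * starts where the data actually landed.
 */
struct sync_range sync_range_fixed(long long pos, long long i_size,
				   bool append, long long written)
{
	struct sync_range r;

	assert(written >= 0);
	r.start = effective_pos(pos, i_size, append);
	r.len = written;
	return r;
}
```

With pos = 0, i_size = 4096 and a 100-byte O_APPEND write, the buggy variant
would sync bytes starting at offset 0, while the data sits at offset 4096.
Without O_APPEND both variants agree.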