On Mon, May 28, 2018 at 03:04:34PM +0100, Al Viro wrote:
> How about lifting the list removal into aio_complete_rw() and aio_poll_work(),
> with WARN_ON() left in its place in aio_complete() itself?  Look:
> aio_complete() call chains are
> 	aio_complete_rw()
> 	aio_fsync_work()
> 	__aio_poll_complete()
> 		aio_poll_work()
> 		aio_poll_wake()
> 		aio_poll()
> 
> The call in aio_fsync_work() is guaranteed to have iocb not on cancel lists.
> The call in aio_poll_wake() *relies* upon aio_complete() not going into
> list removal.  The call in aio_poll() is also guaranteed to be not on the
> cancel list - we get there only if mask != 0, and we add to ->active_reqs
> only if mask == 0.
> 
> So if we take the list removal into aio_complete_rw() and aio_poll_wake() we
> should get the right ordering - iocb gets removed from the list before fput()
> in all cases.  And the aio_complete() locking footprint becomes simpler...
> As a fringe benefit, __aio_poll_complete() becomes simply
> 	fput(req->file);
> 	aio_complete(iocb, mangle_poll(mask), 0);
> since we don't need to order fput() vs. aio_complete() anymore - the caller
> of __aio_poll_complete() has already taken care of the ->ki_cancel()
> possibility.

Anyway, what I have in mind is in vfs.git#work.aio; on top of your fix for
the missing break it's

      aio: take list removal to (some) callers of aio_complete()
      aio: all callers of aio_{read,write,fsync,poll} treat 0 and -EIOCBQUEUED the same way
      aio_read_events_ring(): make a bit more readable
      aio: shift copyin of iocb into io_submit_one()
      aio: fold do_io_submit() into callers
      aio: sanitize the limit checking in io_submit(2)

(in followups)