On Wed, Mar 06, 2019 at 09:53:23PM +0800, zhengbin wrote:
> CPU 0                           CPU 1
> aio_poll-->vfs_poll
>                                 eventfd_write-->spin_lock_irq(lock)
>                                 -->..-->aio_poll_wake
>                                 -->spin_unlock_irq(lock)
> -->spin_lock(lock)
> -->if (req->woken)
>      mask = 0; --->did not call aio_poll_complete
> -->iocb_put
>
> aio_poll_wake
> req->woken = true;
> if (mask) {
>         if (!(mask & req->events))
>                 return 0; --->did not call aio_poll_complete too

... and it's still on waitqueue, so it shouldn't be different from
_not_ having had a wakeup yet.  And yes, aio_poll() in mainline
right now ends up _not_ adding it to "can be cancelled" list, leading
to that bug.

> vfs_poll-->eventfd_poll-->poll_wait-->aio_poll_queue_proc(add
> aio_poll_wake to req->head)
>
> eventfd_write-->wake_up_locked_poll-->__wake_up_common-->curr->func
> -->aio_poll_wake
>
> This patch fixes that. by the way, fix the bug of the error handling path.

Leak on error is real (see thread a few days ago), and overall logic for
"woken" should be similar to what you suggest, but I'd rather handle it
slightly differently (see the same thread).

I've a patch that ought to fix that and it seems to survive testing;
I'll post once I finish carving it up - too many cleanups mixed into
it.  Give me a couple of hours; should be done (and posted) by then.
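
To make the interleaving above concrete, here is a rough userspace model of
the two code paths. It is a sketch only, not kernel code and not the patch
mentioned above: struct model_req, model_wake(), model_submit_tail(),
model_is_stranded() and both policy names are made up for illustration, and
everything past the quoted "return 0" in the wakeup side is an assumption.
WOKEN_MEANS_DONE mimics the behaviour described in the quoted trace;
WOKEN_ONLY_IF_DEQUEUED treats "woken but still on the waitqueue" the same as
"no wakeup yet".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one poll request; field names are illustrative. */
struct model_req {
	unsigned events;	/* events the submitter asked for */
	bool woken;		/* set by the wakeup side */
	bool on_waitqueue;	/* still queued on the wait queue head */
	bool completed;		/* stand-in for aio_poll_complete() having run */
	bool cancellable;	/* stand-in for being on the "can be cancelled" list */
};

enum woken_policy {
	WOKEN_MEANS_DONE,	/* any wakeup means the waker owns completion */
	WOKEN_ONLY_IF_DEQUEUED	/* only a wakeup that dequeued us counts */
};

/* Wakeup side, following the quoted fragment. Returns 1 if it completed. */
static int model_wake(struct model_req *req, unsigned mask)
{
	assert(req->on_waitqueue);	/* wakeups only reach queued entries */
	req->woken = true;
	if (mask && !(mask & req->events))
		return 0;		/* not our event: stays queued, not completed */
	/* Assumption: otherwise the waker dequeues and completes the request. */
	req->on_waitqueue = false;
	req->completed = true;
	return 1;
}

/* Submit side after the poll call returned 'mask', run under the lock. */
static void model_submit_tail(struct model_req *req, unsigned mask,
			      enum woken_policy policy)
{
	bool waker_owns_it = req->woken &&
		(policy == WOKEN_MEANS_DONE || !req->on_waitqueue);

	if (waker_owns_it) {
		mask = 0;			/* leave completion to the waker */
	} else if (mask) {
		req->on_waitqueue = false;	/* event already pending: dequeue */
	} else {
		req->cancellable = true;	/* really waiting: allow cancel */
	}
	if (mask)
		req->completed = true;
}

/* True if nobody will ever complete or be able to cancel the request. */
static bool model_is_stranded(const struct model_req *req)
{
	return req->on_waitqueue && !req->completed && !req->cancellable;
}
```

Replaying the trace with a non-matching wakeup, e.g. model_wake(&req, 0x4)
on a request with events = 0x1 followed by model_submit_tail(&req, 0, ...),
leaves the request stranded under WOKEN_MEANS_DONE and cancellable under
WOKEN_ONLY_IF_DEQUEUED.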