On Wed, Mar 06, 2019 at 05:30:21PM -0800, Linus Torvalds wrote:
> On Wed, Mar 6, 2019 at 5:20 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > I'll try to massage that series on top of your patch; I still hate the
> > post-vfs_poll() logics in aio_poll() ;-/ Give me about half an hour
> > and I'll have something to post.
>
> No inherent hurry, I sent the ping just to make sure it hadn't gotten lost.
>
> And yeah, I think the post-vfs_poll() logic cannot possibly be
> necessary. My gut feel is that *if* we have the refcounting right,
> then we should be able to just let the wakeup come in at any later
> point, and ordering shouldn't matter all that much, and we shouldn't
> even need any locking.
>
> I'd like to think that it can be done with something like "just 'or'
> in the mask atomically" (so that we don't care about ordering between
> the synchronous vfs_poll() and the async poll wakeup), together with
> "when refcount goes to zero, finish the thing off and complete it" (so
> that we don't care who finishes first).
>
> No "woken" logic, no "who fired first" logic, no BS. Just make the
> operations work regardless of ordering.
>
> And maybe it can't be done. But the current model seems just so hacky
> that it can't be the right model.

Umm... It is kinda-sorta doable; we do need something vaguely similar
to ->woken ("should we add it to the list of cancellables, or is the
async reference already gone?"), but other than that it seems to be
feasible. See vfs.git#work.aio; the crucial bits are in these commits:
	keep io_event in aio_kiocb
	get rid of aio_complete() res/res2 arguments
	move aio_complete() to final iocb_put(), try to fix aio_poll() logics
The first two are preparations, the last is where the fixes (hopefully)
happen. The logics in aio_poll() after vfs_poll():
  * we might want to steal the async reference (e.g. due to event
    returned from the very beginning, or due to attempt to put on more
    than one waitqueue, which makes results unreliable).
    That's _NOT_ possible if the thing had been put on a waitqueue, but
    currently isn't there. It might be either due to early wakeup having
    done everything or the same having scheduled aio_poll_complete_work().
    In either case, the best we can do is to ignore the return value of
    vfs_poll() and, in case of error, mark the sucker cancelled. We
    *can't* return an error in that case.
  * if we want and can steal the async reference, rip it from waitqueue;
    otherwise, put it on the "cancellable" list, unless it's already gone
    or unless we are simulating the cancel ourselves.
  * if vfs_poll() has reported something we want and we have successfully
    stolen the iocb, put it there, have the reference we'd taken over
    dropped and return 0

Comments?
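
For reference, the scheme quoted above ("just 'or' in the mask atomically"
plus "when refcount goes to zero, finish the thing off and complete it")
can be sketched in userspace C11 roughly as follows. This is a toy
illustration only, not the code in vfs.git#work.aio: struct poll_req and
the poll_req_*() helpers are hypothetical names, and there are no
waitqueues and no cancellation here, which is the part that still needs
something ->woken-like.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a poll request; NOT struct aio_kiocb. */
struct poll_req {
	atomic_uint mask;	/* events seen so far, OR-ed in by any party */
	atomic_int refs;	/* one reference per party that may report */
	unsigned result;	/* written once, by whoever drops the last ref */
	bool completed;
};

static void poll_req_init(struct poll_req *req, int parties)
{
	atomic_init(&req->mask, 0);
	atomic_init(&req->refs, parties);
	req->result = 0;
	req->completed = false;
}

/* Drop one reference; the caller that takes it to zero completes. */
static bool poll_req_put(struct poll_req *req)
{
	/* atomic_fetch_sub() returns the value before the decrement. */
	if (atomic_fetch_sub(&req->refs, 1) != 1)
		return false;
	assert(!req->completed);	/* only one caller can get here */
	req->result = atomic_load(&req->mask);
	req->completed = true;
	return true;
}

/* Shared by the synchronous path and the (simulated) wakeup path. */
static bool poll_req_report(struct poll_req *req, unsigned events)
{
	atomic_fetch_or(&req->mask, events);
	return poll_req_put(req);
}
```

Whichever order the two reporters run in, the second one takes the
refcount to zero and completes with the union of both masks, so nobody
has to track who fired first.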