On Tue, Jan 18, 2011 at 6:07 AM, Jeff Moyer <jmoyer@xxxxxxxxxx> wrote: > Nick Piggin <npiggin@xxxxxxxxx> writes: > >> On Sat, Jan 15, 2011 at 1:52 AM, Jeff Moyer <jmoyer@xxxxxxxxxx> wrote: >>> Nick Piggin <npiggin@xxxxxxxxx> writes: >>> >>>> Hi, >>>> >>>> While hunting down a bug in NFS's AIO, I believe I found this >>>> buggy code... >>>> >>>> fs: aio fix rcu ioctx lookup >>>> >>>> aio-dio-invalidate-failure GPFs in aio_put_req from io_submit. >>>> >>>> lookup_ioctx doesn't implement the rcu lookup pattern properly. >>>> rcu_read_lock does not prevent refcount going to zero, so we >>>> might take a refcount on a zero count ioctx. >>> >>> So, does this patch fix the problem? You didn't actually say.... >> >> No, it seemd to be an NFS AIO problem, although it was a >> slightly older kernel so I'll re test after -rc1 if I haven't heard >> back about it. > > OK. > >> Do you agree with the theoretical problem? I didn't try to >> write a racer to break it yet. Inserting a delay before the >> get_ioctx might do the trick. > > I'm not convinced, no. The last reference to the kioctx is always the > process, released in the exit_aio path, or via sys_io_destroy. In both > cases, we cancel all aios, then wait for them all to complete before > dropping the final reference to the context. That wouldn't appear to prevent a concurrent thread from doing an io operation that requires ioctx lookup, and taking the last reference after the io_cancel thread drops the ref. > So, while I agree that what you wrote is better, I remain unconvinced of > it solving a real-world problem. Feel free to push it in as a cleanup, > though. Well I think it has to be technically correct first. If there is indeed a guaranteed ref somehow, it just needs a comment. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html