On 3/6/20 9:44 AM, Paul E. McKenney wrote: > On Fri, Mar 06, 2020 at 04:36:20PM +0100, Jann Horn wrote: >> On Fri, Mar 6, 2020 at 4:34 PM Jens Axboe <axboe@xxxxxxxxx> wrote: >>> On 3/6/20 7:57 AM, Jann Horn wrote: >>>> +paulmck >>>> >>>> On Wed, Mar 4, 2020 at 3:40 PM Jens Axboe <axboe@xxxxxxxxx> wrote: >>>>> On 3/4/20 12:59 AM, Dmitry Vyukov wrote: >>>>>> On Fri, Feb 7, 2020 at 9:14 AM syzbot >>>>>> <syzbot+e017e49c39ab484ac87a@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote: >>>>>>> >>>>>>> Hello, >>>>>>> >>>>>>> syzbot found the following crash on: >>>>>>> >>>>>>> HEAD commit: 4c7d00cc Merge tag 'pwm/for-5.6-rc1' of git://git.kernel.o.. >>>>>>> git tree: upstream >>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=12fec785e00000 >>>>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=e162021ddededa72 >>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=e017e49c39ab484ac87a >>>>>>> compiler: clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81) >>>>>>> >>>>>>> Unfortunately, I don't have any reproducer for this crash yet. >>>>>>> >>>>>>> IMPORTANT: if you fix the bug, please add the following tag to the commit: >>>>>>> Reported-by: syzbot+e017e49c39ab484ac87a@xxxxxxxxxxxxxxxxxxxxxxxxx >>>>>> >>>>>> +io_uring maintainers >>>>>> >>>>>> Here is a repro: >>>>>> https://gist.githubusercontent.com/dvyukov/6b340beab6483a036f4186e7378882ce/raw/cd1922185516453c201df8eded1d4b006a6d6a3a/gistfile1.txt >>>>> >>>>> I've queued up a fix for this: >>>>> >>>>> https://git.kernel.dk/cgit/linux-block/commit/?h=io_uring-5.6&id=9875fe3dc4b8cff1f1b440fb925054a5124403c3 >>>> >>>> I believe that this fix relies on call_rcu() having FIFO ordering; but >>>> <https://www.kernel.org/doc/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html#Callback%20Registry> >>>> says: >>>> >>>> | call_rcu() normally acts only on CPU-local state[...] It simply >>>> enqueues the rcu_head structure on a per-CPU list, > > Indeed. For but one example, if there was a CPU-to-CPU migration between > the two call_rcu() invocations, it would not be at all surprising for > the two callbacks to execute out of order. > >>>> Is this fix really correct? >>> >>> That's a good point, there's a potentially stronger guarantee we need >>> here that isn't "nobody is inside an RCU critical section", but rather >>> that we're depending on a previous call_rcu() to have happened. Hence I >>> think you are right - it'll shrink the window drastically, since the >>> previous callback is already queued up, but it's not a full close. >>> >>> Hmm... >> >> You could potentially hack up the semantics you want by doing a >> call_rcu() whose callback does another call_rcu(), or something like >> that - but I'd like to hear paulmck's opinion on this first. > > That would work! > > Or, alternatively, do an rcu_barrier() between the two calls to > call_rcu(), assuming that the use case can tolerate rcu_barrier() > overhead and latency. If the nested call_rcu() works, that seems greatly preferable to needing the rcu_barrier(), even if that would not be a showstopper for me. The nested call_rcu() is just a bit odd, but with a comment it should be OK. Incremental here I'm going to test, would just fold in of course. diff --git a/fs/io_uring.c b/fs/io_uring.c index f3218fc81943..95ba95b4d8ec 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -5330,7 +5330,7 @@ static void io_file_ref_kill(struct percpu_ref *ref) complete(&data->done); } -static void io_file_ref_exit_and_free(struct rcu_head *rcu) +static void __io_file_ref_exit_and_free(struct rcu_head *rcu) { struct fixed_file_data *data = container_of(rcu, struct fixed_file_data, rcu); @@ -5338,6 +5338,18 @@ static void io_file_ref_exit_and_free(struct rcu_head *rcu) kfree(data); } +static void io_file_ref_exit_and_free(struct rcu_head *rcu) +{ + /* + * We need to order our exit+free call again the potentially + * existing call_rcu() for switching to atomic. One way to do that + * is to have this rcu callback queue the final put and free, as we + * could otherwise a pre-existing atomic switch complete _after_ + * the free callback we queued. + */ + call_rcu(rcu, __io_file_ref_exit_and_free); +} + static int io_sqe_files_unregister(struct io_ring_ctx *ctx) { struct fixed_file_data *data = ctx->file_data; -- Jens Axboe