Hi,

On Mon, Jul 22, 2019 at 1:51 PM Kent Overstreet <kent.overstreet@xxxxxxxxx> wrote:
>
> On Thu, Jul 18, 2019 at 03:46:46PM +0800, Coly Li wrote:
> > On 2019/7/16 6:47 PM, Coly Li wrote:
> > > Hi Kent,
> > >
> > > On 2019/6/11 3:14 AM, Kent Overstreet wrote:
> > >> Signed-off-by: Kent Overstreet <kent.overstreet@xxxxxxxxx>
> > > Acked-by: Coly Li <colyli@xxxxxxx>
> > >
> > > I have also received a report of a suspicious closure race condition in
> > > bcache, and people are asking for this patch to go into Linux v5.3.
> > >
> > > So before this patch gets merged upstream, I plan to rebase it onto
> > > drivers/md/bcache/closure.c for now. Of course the author is you.
> > >
> > > Once lib/closure.c is merged upstream, I will rebase all closure usage
> > > in bcache onto lib/closure.{c,h}.
> >
> > Hi Kent,
> >
> > The race bug reporter replied to me that the closure race is very hard
> > to reproduce; after applying the patch and testing, they are not sure
> > whether their closure race problem is fixed or not.
> >
> > I also notice that rcu_read_lock()/rcu_read_unlock() is used here, but
> > it is not clear to me what the RCU read lock accomplishes in
> > closure_sync_fn(). I believe you have a reason for using RCU here;
> > could you please provide some hints to help me understand the change
> > better?
>
> The race was when a thread using closure_sync() notices cl->s->done == 1 before
> the thread calling closure_put() calls wake_up_process(). Then, it's possible
> for that thread to return and exit just before wake_up_process() is called - so
> we're trying to wake up a process that no longer exists.
>
> rcu_read_lock() is sufficient to protect against this, as there's an RCU barrier
> somewhere in the process teardown path.

Is rcu_read_lock() sufficient even in a kernel that uses "CONFIG_PREEMPT_NONE=y"?
I ask because I hit this panic quite frequently, and it sounds like the
problem you described this patch as fixing/preventing:

crash> bt
PID: 0      TASK: ffff88862461ab00  CPU: 2   COMMAND: "swapper/2"
 #0 [ffffc90000100a88] machine_kexec at ffffffff8103d6b5
 #1 [ffffc90000100ad0] __crash_kexec at ffffffff8110d37b
 #2 [ffffc90000100b98] crash_kexec at ffffffff8110e07d
 #3 [ffffc90000100ba8] oops_end at ffffffff8101a9de
 #4 [ffffc90000100bc8] no_context at ffffffff81045e99
 #5 [ffffc90000100c30] async_page_fault at ffffffff81e010cf
    [exception RIP: atomic_try_cmpxchg+2]
    RIP: ffffffff810d3e3b  RSP: ffffc90000100ce8  RFLAGS: 00010046
    RAX: 0000000000000000  RBX: 0000000000000003  RCX: 0000000080080007
    RDX: 0000000000000001  RSI: ffffc90000100cf4  RDI: 0000000100000a5b
    RBP: 00000000ffffffff   R8: 0000000000000001   R9: ffffffffa0422d4e
    R10: ffff888532779000  R11: ffff888532779000  R12: 0000000000000046
    R13: 0000000000000000  R14: ffff88858a062000  R15: 0000000100000a5b
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #6 [ffffc90000100ce8] _raw_spin_lock_irqsave at ffffffff81cf7d7d
 #7 [ffffc90000100d08] try_to_wake_up at ffffffff810c1624
 #8 [ffffc90000100d58] closure_sync_fn at ffffffffa0419b07 [bcache]
 #9 [ffffc90000100d60] clone_endio at ffffffff81aac48c
#10 [ffffc90000100d90] call_bio_endio at ffffffff81a78e20
#11 [ffffc90000100da8] raid_end_bio_io at ffffffff81a78e69
#12 [ffffc90000100dd8] raid1_end_write_request at ffffffff81a79ad9
#13 [ffffc90000100e48] blk_update_request at ffffffff814c3ab1
#14 [ffffc90000100e88] blk_mq_end_request at ffffffff814caaf2
#15 [ffffc90000100ea0] blk_mq_complete_request at ffffffff814c91c1
#16 [ffffc90000100ec8] nvme_complete_cqes at ffffffffa002fb03 [nvme]
#17 [ffffc90000100f08] nvme_irq at ffffffffa002fb7f [nvme]
#18 [ffffc90000100f30] __handle_irq_event_percpu at ffffffff810e0d60
#19 [ffffc90000100f70] handle_irq_event_percpu at ffffffff810e0e65
#20 [ffffc90000100f98] handle_irq_event at ffffffff810e0ecb
#21 [ffffc90000100fb0] handle_edge_irq at ffffffff810e494d
#22 [ffffc90000100fc8] do_IRQ at ffffffff81e01900
--- <IRQ stack> ---
#23 [ffffc90000077e38] ret_from_intr at ffffffff81e00a0f
    [exception RIP: default_idle+24]
    RIP: ffffffff81cf7938  RSP: ffffc90000077ee0  RFLAGS: 00000246
    RAX: 0000000000000000  RBX: ffff88862461ab00  RCX: ffff888627a96000
    RDX: 0000000000000001  RSI: 0000000000000002  RDI: 0000000000000001
    RBP: 0000000000000000   R8: 000000cd42e4dffb   R9: 00023d75261d1b20
    R10: 00000000000001f0  R11: 0000000000000000  R12: 0000000000000000
    R13: 0000000000000002  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffdb  CS: 0010  SS: 0018
#24 [ffffc90000077ee0] do_idle at ffffffff810c4ee4
#25 [ffffc90000077f20] cpu_startup_entry at ffffffff810c514f
#26 [ffffc90000077f30] start_secondary at ffffffff81037045
#27 [ffffc90000077f50] secondary_startup_64 at ffffffff810000d4
crash> dis closure_sync_fn
0xffffffffa0419af4 <closure_sync_fn>:     mov    0x8(%rdi),%rax
0xffffffffa0419af8 <closure_sync_fn+4>:   movl   $0x1,0x8(%rax)
0xffffffffa0419aff <closure_sync_fn+11>:  mov    (%rax),%rdi
0xffffffffa0419b02 <closure_sync_fn+14>:  callq  0xffffffff810c185f <wake_up_process>
0xffffffffa0419b07 <closure_sync_fn+19>:  retq
crash>

I'm using Linux 5.4.69.

--Marc