On 6/13/19 4:45 PM, Jens Axboe wrote:
> On 5/31/19 12:05 AM, Bob Liu wrote:
>> Duplicate call of null_del_dev() will trigger null pointer error like below.
>> The reason is a race condition between nullb_device_power_store() and
>> nullb_group_drop_item().
>>
>> CPU#0                         CPU#1
>> ----------------              -----------------
>> do_rmdir()
>>  >configfs_rmdir()
>>   >client_drop_item()
>>    >nullb_group_drop_item()
>>                               nullb_device_power_store()
>>                                >null_del_dev()
>>
>>     >test_and_clear_bit(NULLB_DEV_FL_UP
>>      >null_del_dev()
>>      ^^^^^
>>      Duplicated null_del_dev() triggers null pointer error
>>
>>                                >clear_bit(NULLB_DEV_FL_UP
>>
>> The fix could be to keep the sequence of clearing NULLB_DEV_FL_UP and calling null_del_dev().
>>
>> [ 698.613600] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
>> [ 698.613608] #PF error: [normal kernel read fault]
>> [ 698.613611] PGD 0 P4D 0
>> [ 698.613619] Oops: 0000 [#1] SMP PTI
>> [ 698.613627] CPU: 3 PID: 6382 Comm: rmdir Not tainted 5.0.0+ #35
>> [ 698.613631] Hardware name: LENOVO 20LJS2EV08/20LJS2EV08, BIOS R0SET33W (1.17 ) 07/18/2018
>> [ 698.613644] RIP: 0010:null_del_dev+0xc/0x110 [null_blk]
>> [ 698.613649] Code: 00 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b eb 97 e8 47 bb 2a e8 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 54 53 <8b> 77 18 48 89 fb 4c 8b 27 48 c7 c7 40 57 1e c1 e8 bf c7 cb e8 48
>> [ 698.613654] RSP: 0018:ffffb887888bfde0 EFLAGS: 00010286
>> [ 698.613659] RAX: 0000000000000000 RBX: ffff9d436d92bc00 RCX: ffff9d43a9184681
>> [ 698.613663] RDX: ffffffffc11e5c30 RSI: 0000000068be6540 RDI: 0000000000000000
>> [ 698.613667] RBP: ffffb887888bfdf0 R08: 0000000000000001 R09: 0000000000000000
>> [ 698.613671] R10: ffffb887888bfdd8 R11: 0000000000000f16 R12: ffff9d436d92bc08
>> [ 698.613675] R13: ffff9d436d94e630 R14: ffffffffc11e5088 R15: ffffffffc11e5000
>> [ 698.613680] FS: 00007faa68be6540(0000) GS:ffff9d43d14c0000(0000) knlGS:0000000000000000
>> [ 698.613685] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 698.613689] CR2: 0000000000000018 CR3: 000000042f70c002 CR4: 00000000003606e0
>> [ 698.613693] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [ 698.613697] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [ 698.613700] Call Trace:
>> [ 698.613712] nullb_group_drop_item+0x50/0x70 [null_blk]
>> [ 698.613722] client_drop_item+0x29/0x40
>> [ 698.613728] configfs_rmdir+0x1ed/0x300
>> [ 698.613738] vfs_rmdir+0xb2/0x130
>> [ 698.613743] do_rmdir+0x1c7/0x1e0
>> [ 698.613750] __x64_sys_rmdir+0x17/0x20
>> [ 698.613759] do_syscall_64+0x5a/0x110
>> [ 698.613768] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>
>> Signed-off-by: Bob Liu <bob.liu@xxxxxxxxxx>
>> ---
>>  drivers/block/null_blk_main.c | 11 ++++++-----
>>  1 file changed, 6 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/block/null_blk_main.c b/drivers/block/null_blk_main.c
>> index 62c9654..99dd0ab 100644
>> --- a/drivers/block/null_blk_main.c
>> +++ b/drivers/block/null_blk_main.c
>> @@ -326,11 +326,12 @@ static ssize_t nullb_device_power_store(struct config_item *item,
>>  		set_bit(NULLB_DEV_FL_CONFIGURED, &dev->flags);
>>  		dev->power = newp;
>>  	} else if (dev->power && !newp) {
>> -		mutex_lock(&lock);
>> -		dev->power = newp;
>> -		null_del_dev(dev->nullb);
>> -		mutex_unlock(&lock);
>> -		clear_bit(NULLB_DEV_FL_UP, &dev->flags);
>> +		if (test_and_clear_bit(NULLB_DEV_FL_UP, &dev->flags)) {
>> +			mutex_lock(&lock);
>> +			dev->power = newp;
>> +			null_del_dev(dev->nullb);
>> +			mutex_unlock(&lock);
>> +		}
>>  		clear_bit(NULLB_DEV_FL_CONFIGURED, &dev->flags);
>
> Is the ->power check safe? Should that be under the lock as well?
>

I think it's unnecessary. Even if dev->power is modified after the check,
test_and_clear_bit() still ensures that null_del_dev() is not called wrongly.

CPU#0                         CPU#1
----------------              -----------------
do_rmdir()
 >configfs_rmdir()
  >client_drop_item()
   >nullb_group_drop_item()
                              nullb_device_power_store()
                               > if dev->power
    >if test_and_clear_bit(NULLB_DEV_FL_UP
     > dev->power=false
     ^^^ Even if dev->power is modified after the check on CPU#1

                               > if test_and_clear_bit(NULLB_DEV_FL_UP
                               ^^^^ This test_and_clear_bit still ensures
                                    that null_del_dev() is not called twice
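
To make the argument concrete, here is a minimal userspace sketch of the same pattern. It is not kernel code: struct fake_dev, DEV_FL_UP, test_and_clear(), power_off() and run_race() are hypothetical names made up for illustration, a C11 atomic_fetch_and() stands in for test_and_clear_bit(), and a pthread mutex stands in for the driver's lock. Every caller races to power the device off, but only the one that observes the bit set and clears it is allowed to do the teardown.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

#define DEV_FL_UP   0x1u  /* hypothetical stand-in for NULLB_DEV_FL_UP */
#define MAX_THREADS 64

struct fake_dev {
	atomic_uint flags;     /* bit 0 plays the role of NULLB_DEV_FL_UP */
	pthread_mutex_t lock;  /* plays the role of the driver's mutex */
	int *res;              /* stands in for dev->nullb */
	int teardowns;         /* teardown count, protected by lock */
};

/* Userspace analogue of test_and_clear_bit(): atomically clear the bit
 * and report whether it was set beforehand. */
static int test_and_clear(atomic_uint *flags, unsigned int bit)
{
	return (atomic_fetch_and(flags, ~bit) & bit) != 0;
}

/* Both the "rmdir" path and the "power store" path run this. */
static void *power_off(void *arg)
{
	struct fake_dev *d = arg;

	if (test_and_clear(&d->flags, DEV_FL_UP)) {
		pthread_mutex_lock(&d->lock);
		assert(d->res != NULL);  /* a second teardown would trip here */
		free(d->res);
		d->res = NULL;
		d->teardowns++;
		pthread_mutex_unlock(&d->lock);
	}
	return NULL;
}

/* Race up to nthreads callers on one device; return the teardown count,
 * or -1 if the stand-in resource could not be allocated. */
int run_race(int nthreads)
{
	struct fake_dev d = { .res = malloc(sizeof(int)), .teardowns = 0 };
	pthread_t t[MAX_THREADS];
	int created = 0;
	int i;

	if (!d.res)
		return -1;
	if (nthreads > MAX_THREADS)
		nthreads = MAX_THREADS;
	atomic_init(&d.flags, DEV_FL_UP);
	pthread_mutex_init(&d.lock, NULL);

	for (i = 0; i < nthreads; i++) {
		if (pthread_create(&t[created], NULL, power_off, &d) == 0)
			created++;
	}
	for (i = 0; i < created; i++)
		pthread_join(t[i], NULL);

	pthread_mutex_destroy(&d.lock);
	free(d.res);  /* NULL (a no-op) if a teardown already ran */
	return d.teardowns;
}
```

Calling run_race(8) from a small main() (built with -pthread) should return 1 no matter how the threads interleave, because exactly one caller can see the bit go from set to clear. This is the same reason the unlocked dev->power check is harmless in the patch: a stale read of dev->power can only lead to a test_and_clear_bit() that fails.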