On 2018/8/5 4:07 PM, shenghui wrote:
>
>
> On 08/05/2018 12:14 PM, Coly Li wrote:
>> On 2018/8/5 10:16 AM, shenghui wrote:
>>>
>>>
>>> On 08/05/2018 01:35 AM, Coly Li wrote:
>>>> On 2018/8/3 6:57 PM, Shenghui Wang wrote:
>>>>> Recal cached_dev_sectors on cached_dev detached, as recal done on
>>>>> cached_dev attached.
>>>>>
>>>>> Signed-off-by: Shenghui Wang <shhuiw@xxxxxxxxxxx>
>>>>> ---
>>>>> drivers/md/bcache/super.c | 1 +
>>>>> 1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
>>>>> index fa4058e43202..a5612c8a6c14 100644
>>>>> --- a/drivers/md/bcache/super.c
>>>>> +++ b/drivers/md/bcache/super.c
>>>>> @@ -991,6 +991,7 @@ static void cached_dev_detach_finish(struct work_struct *w)
>>>>>
>>>>> 	bcache_device_detach(&dc->disk);
>>>>> 	list_move(&dc->list, &uncached_devices);
>>>>> +	calc_cached_dev_sectors(dc->disk.c);
>>>>>
>>>>> 	clear_bit(BCACHE_DEV_DETACHING, &dc->disk.flags);
>>>>> 	clear_bit(BCACHE_DEV_UNLINK_DONE, &dc->disk.flags);
>>>>>
>>>>
>>>> Hi Shenghui,
>>>>
>>>> During my testing, after writeback all dirty data, when I detach the
>>>> backing device from cache set, a NULL pointer dereference error happens.
>>>> Here is the oops message,
>>>>
>>>> [ 4114.687721] BUG: unable to handle kernel NULL pointer dereference at
>>>> 0000000000000cf8
>>>> [ 4114.691136] PGD 0 P4D 0
>>>> [ 4114.692094] Oops: 0000 [#1] PREEMPT SMP PTI
>>>> [ 4114.693962] CPU: 1 PID: 1845 Comm: kworker/1:43 Tainted: G
>>>> E 4.18.0-rc7-1-default+ #1
>>>> [ 4114.697732] Hardware name: VMware, Inc. VMware Virtual Platform/440BX
>>>> Desktop Reference Platform, BIOS 6.00 05/19/2017
>>>> [ 4114.701886] Workqueue: events cached_dev_detach_finish [bcache]
>>>> [ 4114.704072] RIP: 0010:cached_dev_detach_finish+0x127/0x1e0 [bcache]
>>>> [ 4114.706377] Code: 3f 58 01 00 31 d2 4c 89 60 08 48 89 83 a8 f3 ff ff
>>>> 48 c7 83 b0 f3 ff ff 10 72 31 c0 4c 89 25 20 58 01 00 48 8b bb 48 f4 ff
>>>> ff <48> 8b 87 f8 0c 00 00 48 8d b7 f8 0c 00 00 48 39 c6 74 1e 48 8b 88
>>>> [ 4114.714524] RSP: 0018:ffffba4881b33e30 EFLAGS: 00010246
>>>> [ 4114.716537] RAX: ffffffffc0317210 RBX: ffff9bea33c00c58 RCX:
>>>> 0000000000000000
>>>> [ 4114.719193] RDX: 0000000000000000 RSI: ffff9bea2ffb15e0 RDI:
>>>> 0000000000000000
>>>> [ 4114.721790] RBP: ffff9bea33c00010 R08: 0000000000000000 R09:
>>>> 000000000000000f
>>>> [ 4114.724477] R10: ffff9bea254ec928 R11: 0000000000000010 R12:
>>>> ffff9bea33c00000
>>>> [ 4114.727170] R13: 0000000000000000 R14: ffff9bea35666500 R15:
>>>> 0000000000000000
>>>> [ 4114.730012] FS: 0000000000000000(0000) GS:ffff9bea35640000(0000)
>>>> knlGS:0000000000000000
>>>> [ 4114.732966] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> [ 4114.735068] CR2: 0000000000000cf8 CR3: 000000012300a004 CR4:
>>>> 00000000003606e0
>>>> [ 4114.737693] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>>> 0000000000000000
>>>> [ 4114.740286] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>>>> 0000000000000400
>>>> [ 4114.743187] Call Trace:
>>>> [ 4114.744133] ? bch_keybuf_init+0x60/0x60 [bcache]
>>>> [ 4114.745969] ? bch_sectors_dirty_init.cold.21+0x1b/0x1b [bcache]
>>>> [ 4114.748181] process_one_work+0x1d1/0x310
>>>> [ 4114.749677] worker_thread+0x28/0x3c0
>>>> [ 4114.751053] ? rescuer_thread+0x330/0x330
>>>> [ 4114.752541] kthread+0x108/0x120
>>>> [ 4114.753752] ? kthread_create_worker_on_cpu+0x60/0x60
>>>> [ 4114.756001] ret_from_fork+0x35/0x40
>>>> [ 4114.757332] Modules linked in: bcache(E) af_packet(E) iscsi_ibft(E)
>>>> iscsi_boot_sysfs(E) vmw_vsock_vmci_transport(E) vsock(E) vmw_balloon(E)
>>>> e1000(E) vmw_vmci(E) sr_mod(E) cdrom(E) ata_piix(E) uhci_hcd(E)
>>>> ehci_pci(E) ehci_hcd(E) mptspi(E) scsi_transport_spi(E) mptscsih(E)
>>>> usbcore(E) mptbase(E) sg(E)
>>>> [ 4114.766902] CR2: 0000000000000cf8
>>>> [ 4114.768135] ---[ end trace 467143bbdebef7b9 ]---
>>>> [ 4114.769992] RIP: 0010:cached_dev_detach_finish+0x127/0x1e0 [bcache]
>>>> [ 4114.772287] Code: 3f 58 01 00 31 d2 4c 89 60 08 48 89 83 a8 f3 ff ff
>>>> 48 c7 83 b0 f3 ff ff 10 72 31 c0 4c 89 25 20 58 01 00 48 8b bb 48 f4 ff
>>>> ff <48> 8b 87 f8 0c 00 00 48 8d b7 f8 0c 00 00 48 39 c6 74 1e 48 8b 88
>>>> [ 4114.779325] RSP: 0018:ffffba4881b33e30 EFLAGS: 00010246
>>>> [ 4114.781300] RAX: ffffffffc0317210 RBX: ffff9bea33c00c58 RCX:
>>>> 0000000000000000
>>>> [ 4114.783960] RDX: 0000000000000000 RSI: ffff9bea2ffb15e0 RDI:
>>>> 0000000000000000
>>>> [ 4114.786582] RBP: ffff9bea33c00010 R08: 0000000000000000 R09:
>>>> 000000000000000f
>>>> [ 4114.789207] R10: ffff9bea254ec928 R11: 0000000000000010 R12:
>>>> ffff9bea33c00000
>>>> [ 4114.791827] R13: 0000000000000000 R14: ffff9bea35666500 R15:
>>>> 0000000000000000
>>>> [ 4114.794521] FS: 0000000000000000(0000) GS:ffff9bea35640000(0000)
>>>> knlGS:0000000000000000
>>>> [ 4114.797509] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> [ 4114.799613] CR2: 0000000000000cf8 CR3: 000000012300a004 CR4:
>>>> 00000000003606e0
>>>> [ 4114.802559] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>>> 0000000000000000
>>>> [ 4114.805195] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>>>> 0000000000000400
>>>>
>>>> Could you please to have a look ?
>>>> cached_dev_detach_finish() is executed in a work queue, when it is
>>>> called, it is possible that the cache set memory is released already.
>>>>
>>>> Thanks.
>>>>
>>>> Coly Li
>>>>
>>>>
>>>
>>> Hi Coly,
>>>
>>> I checked the code path, and found that bcache_device_detach will
>>> set bcache_device->c to NULL before my previous change. So I made
>>> a new change.
>>>
>>> Please check the followed new patch.
>>
>> Sure no problem, just double check, do you test/verify the change before
>> posting it ?
>>
>> Thanks.
>>
>> Coly Li
>>
>
> Hi Coly,
>
> I did basic attach/detach test.
>
> Will you please share your test case, so that I can do further test?

Sure, here is my procedure,
1, make 100G cache set and 500G backing device
2, attach as writeback mode
3, set congested_read/write_threshold to 0
4, run fio to generate dirty data on cache set
5, when dirty data exceeds about 20% of dirty target, stop fio jobs
6, wait until all dirty data is written back to backing device
7, "echo 1 > /sys/block/bcache0/bcache/detach" to detach the backing
   device from cache set
8, "echo 1 > /sys/block/bcache0/bcache/stop" to stop backing device
9, "echo 1 > /sys/fs/bcache/<UUID>/stop" to stop cache set
10, rmmod bcache

Here is my fio job file, this is function verification on my laptop.

[global]
thread=1
ioengine=libaio
direct=1

[job0]
filename=/dev/bcache0
readwrite=randrw
rwmixread=0.5
blocksize=32k
numjobs=2
iodepth=16
runtime=20m

Thanks.

Coly Li
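
For reference, steps 7 to 10 of the procedure above could be scripted
roughly as follows. This is only a sketch: the function names and the
DRY_RUN switch are made up for illustration, the device name and cache
set UUID are placeholders passed as arguments, and it assumes writeback
has already finished (step 6). With DRY_RUN=1, the default here, it only
prints the commands instead of touching sysfs.

```shell
#!/bin/sh
# Sketch of the teardown part (steps 7-10) of the test procedure.
# sysfs_write, bcache_teardown and DRY_RUN are hypothetical names.

DRY_RUN="${DRY_RUN:-1}"

# Write a value to a sysfs attribute, or only print the command in dry-run mode.
sysfs_write() {
    if [ "$DRY_RUN" = "1" ]; then
        printf 'echo %s > %s\n' "$1" "$2"
    else
        echo "$1" > "$2"
    fi
}

# $1: bcache device name (e.g. bcache0), $2: cache set UUID
bcache_teardown() {
    dev="$1"
    uuid="$2"
    sysfs_write 1 "/sys/block/$dev/bcache/detach"   # step 7: detach backing device
    sysfs_write 1 "/sys/block/$dev/bcache/stop"     # step 8: stop backing device
    sysfs_write 1 "/sys/fs/bcache/$uuid/stop"       # step 9: stop cache set
    if [ "$DRY_RUN" = "1" ]; then                   # step 10: unload the module
        echo "rmmod bcache"
    else
        rmmod bcache
    fi
}
```

Calling "bcache_teardown bcache0 <UUID>" prints the four commands in
order; running it as root with DRY_RUN=0 would execute them. The detach
in step 7 completes asynchronously in a work queue, so a real run may
need to wait between steps.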