Re: [PATCH] bcache: recal cached_dev_sectors on detach

On 2018/8/5 4:07 PM, shenghui wrote:
> 
> 
> On 08/05/2018 12:14 PM, Coly Li wrote:
>> On 2018/8/5 10:16 AM, shenghui wrote:
>>>
>>>
>>> On 08/05/2018 01:35 AM, Coly Li wrote:
>>>> On 2018/8/3 6:57 PM, Shenghui Wang wrote:
>>>>> Recal cached_dev_sectors on cached_dev detached, as recal done on
>>>>> cached_dev attached.
>>>>>
>>>>> Signed-off-by: Shenghui Wang <shhuiw@xxxxxxxxxxx>
>>>>> ---
>>>>>  drivers/md/bcache/super.c | 1 +
>>>>>  1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
>>>>> index fa4058e43202..a5612c8a6c14 100644
>>>>> --- a/drivers/md/bcache/super.c
>>>>> +++ b/drivers/md/bcache/super.c
>>>>> @@ -991,6 +991,7 @@ static void cached_dev_detach_finish(struct work_struct *w)
>>>>>  
>>>>>  	bcache_device_detach(&dc->disk);
>>>>>  	list_move(&dc->list, &uncached_devices);
>>>>> +	calc_cached_dev_sectors(dc->disk.c);
>>>>>  
>>>>>  	clear_bit(BCACHE_DEV_DETACHING, &dc->disk.flags);
>>>>>  	clear_bit(BCACHE_DEV_UNLINK_DONE, &dc->disk.flags);
>>>>>
>>>>
>>>> Hi Shenghui,
>>>>
>>>> During my testing, after writeback all dirty data, when I detach the
>>>> backing device from cache set, a NULL pointer dereference error happens.
>>>> Here is the oops message,
>>>>
>>>> [ 4114.687721] BUG: unable to handle kernel NULL pointer dereference at
>>>> 0000000000000cf8
>>>> [ 4114.691136] PGD 0 P4D 0
>>>> [ 4114.692094] Oops: 0000 [#1] PREEMPT SMP PTI
>>>> [ 4114.693962] CPU: 1 PID: 1845 Comm: kworker/1:43 Tainted: G
>>>> E     4.18.0-rc7-1-default+ #1
>>>> [ 4114.697732] Hardware name: VMware, Inc. VMware Virtual Platform/440BX
>>>> Desktop Reference Platform, BIOS 6.00 05/19/2017
>>>> [ 4114.701886] Workqueue: events cached_dev_detach_finish [bcache]
>>>> [ 4114.704072] RIP: 0010:cached_dev_detach_finish+0x127/0x1e0 [bcache]
>>>> [ 4114.706377] Code: 3f 58 01 00 31 d2 4c 89 60 08 48 89 83 a8 f3 ff ff
>>>> 48 c7 83 b0 f3 ff ff 10 72 31 c0 4c 89 25 20 58 01 00 48 8b bb 48 f4 ff
>>>> ff <48> 8b 87 f8 0c 00 00 48 8d b7 f8 0c 00 00 48 39 c6 74 1e 48 8b 88
>>>> [ 4114.714524] RSP: 0018:ffffba4881b33e30 EFLAGS: 00010246
>>>> [ 4114.716537] RAX: ffffffffc0317210 RBX: ffff9bea33c00c58 RCX:
>>>> 0000000000000000
>>>> [ 4114.719193] RDX: 0000000000000000 RSI: ffff9bea2ffb15e0 RDI:
>>>> 0000000000000000
>>>> [ 4114.721790] RBP: ffff9bea33c00010 R08: 0000000000000000 R09:
>>>> 000000000000000f
>>>> [ 4114.724477] R10: ffff9bea254ec928 R11: 0000000000000010 R12:
>>>> ffff9bea33c00000
>>>> [ 4114.727170] R13: 0000000000000000 R14: ffff9bea35666500 R15:
>>>> 0000000000000000
>>>> [ 4114.730012] FS:  0000000000000000(0000) GS:ffff9bea35640000(0000)
>>>> knlGS:0000000000000000
>>>> [ 4114.732966] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> [ 4114.735068] CR2: 0000000000000cf8 CR3: 000000012300a004 CR4:
>>>> 00000000003606e0
>>>> [ 4114.737693] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>>> 0000000000000000
>>>> [ 4114.740286] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>>>> 0000000000000400
>>>> [ 4114.743187] Call Trace:
>>>> [ 4114.744133]  ? bch_keybuf_init+0x60/0x60 [bcache]
>>>> [ 4114.745969]  ? bch_sectors_dirty_init.cold.21+0x1b/0x1b [bcache]
>>>> [ 4114.748181]  process_one_work+0x1d1/0x310
>>>> [ 4114.749677]  worker_thread+0x28/0x3c0
>>>> [ 4114.751053]  ? rescuer_thread+0x330/0x330
>>>> [ 4114.752541]  kthread+0x108/0x120
>>>> [ 4114.753752]  ? kthread_create_worker_on_cpu+0x60/0x60
>>>> [ 4114.756001]  ret_from_fork+0x35/0x40
>>>> [ 4114.757332] Modules linked in: bcache(E) af_packet(E) iscsi_ibft(E)
>>>> iscsi_boot_sysfs(E) vmw_vsock_vmci_transport(E) vsock(E) vmw_balloon(E)
>>>> e1000(E) vmw_vmci(E) sr_mod(E) cdrom(E) ata_piix(E) uhci_hcd(E)
>>>> ehci_pci(E) ehci_hcd(E) mptspi(E) scsi_transport_spi(E) mptscsih(E)
>>>> usbcore(E) mptbase(E) sg(E)
>>>> [ 4114.766902] CR2: 0000000000000cf8
>>>> [ 4114.768135] ---[ end trace 467143bbdebef7b9 ]---
>>>> [ 4114.769992] RIP: 0010:cached_dev_detach_finish+0x127/0x1e0 [bcache]
>>>> [ 4114.772287] Code: 3f 58 01 00 31 d2 4c 89 60 08 48 89 83 a8 f3 ff ff
>>>> 48 c7 83 b0 f3 ff ff 10 72 31 c0 4c 89 25 20 58 01 00 48 8b bb 48 f4 ff
>>>> ff <48> 8b 87 f8 0c 00 00 48 8d b7 f8 0c 00 00 48 39 c6 74 1e 48 8b 88
>>>> [ 4114.779325] RSP: 0018:ffffba4881b33e30 EFLAGS: 00010246
>>>> [ 4114.781300] RAX: ffffffffc0317210 RBX: ffff9bea33c00c58 RCX:
>>>> 0000000000000000
>>>> [ 4114.783960] RDX: 0000000000000000 RSI: ffff9bea2ffb15e0 RDI:
>>>> 0000000000000000
>>>> [ 4114.786582] RBP: ffff9bea33c00010 R08: 0000000000000000 R09:
>>>> 000000000000000f
>>>> [ 4114.789207] R10: ffff9bea254ec928 R11: 0000000000000010 R12:
>>>> ffff9bea33c00000
>>>> [ 4114.791827] R13: 0000000000000000 R14: ffff9bea35666500 R15:
>>>> 0000000000000000
>>>> [ 4114.794521] FS:  0000000000000000(0000) GS:ffff9bea35640000(0000)
>>>> knlGS:0000000000000000
>>>> [ 4114.797509] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> [ 4114.799613] CR2: 0000000000000cf8 CR3: 000000012300a004 CR4:
>>>> 00000000003606e0
>>>> [ 4114.802559] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>>> 0000000000000000
>>>> [ 4114.805195] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>>>> 0000000000000400
>>>>
>>>> Could you please have a look?
>>>> cached_dev_detach_finish() is executed in a work queue, when it is
>>>> called, it is possible that the cache set memory is released already.
>>>>
>>>> Thanks.
>>>>
>>>> Coly Li
>>>>
>>>>
>>>
>>> Hi Coly,
>>>
>>> I checked the code path, and found that bcache_device_detach will 
>>> set bcache_device->c to NULL before my previous change. So I made
>>> a new change. 
>>>
>>> Please check the following new patch.
>>
>> Sure no problem, just double check, do you test/verify the change before
>> posting it ?
>>
>> Thanks.
>>
>> Coly Li
>>
> 
> Hi Coly,
> 
> I did basic attach/detach test.
> 
> Will you please share your test case, so that I can do further test?

Sure, here is my procedure,
1, make a 100G cache set and a 500G backing device
2, attach in writeback mode
3, set congested_read_threshold_us and congested_write_threshold_us to 0
4, run fio to generate dirty data on the cache set
5, when dirty data exceeds about 20% of the dirty target, stop the fio jobs
6, wait until all dirty data is written back to the backing device
7, "echo 1 > /sys/block/bcache0/bcache/detach" to detach the backing
device from cache set
8, "echo 1 > /sys/block/bcache0/bcache/stop" to stop backing device
9, "echo 1 > /sys/fs/bcache/<UUID>/stop" to stop cache set
10, rmmod bcache
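
If it helps, steps 6 to 9 could be scripted roughly as below. This is only a
sketch, not what I actually ran: the function names and the SYSFS_ROOT
override are made up for illustration (the override just allows a dry run
against a scratch directory), and polling the backing device's "state"
attribute for "clean" is my assumption for how to detect that writeback has
finished.

```shell
# Root of sysfs. Overridable (hypothetical knob) so the helpers can be
# dry-run against a scratch directory instead of the real /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Step 6: poll the backing device's "state" attribute until it reads "clean".
# Assumption: "clean" means no dirty data is left to write back.
wait_clean() {
    dev="$1"
    while [ "$(cat "$SYSFS_ROOT/block/$dev/bcache/state")" != "clean" ]; do
        sleep 5
    done
}

# Steps 7-9: detach the backing device, stop it, then stop the cache set.
detach_and_stop() {
    dev="$1"
    uuid="$2"
    echo 1 > "$SYSFS_ROOT/block/$dev/bcache/detach"
    echo 1 > "$SYSFS_ROOT/block/$dev/bcache/stop"
    echo 1 > "$SYSFS_ROOT/fs/bcache/$uuid/stop"
}
```

Usage would be "wait_clean bcache0" followed by "detach_and_stop bcache0 <UUID>",
then "rmmod bcache" for step 10. The detach is finished from a work queue, so
the script does not guarantee that it has completed before the stop is written.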

Here is my fio job file; this is a functional verification run on my laptop.
[global]
thread=1
ioengine=libaio
direct=1

[job0]
filename=/dev/bcache0
readwrite=randrw
rwmixread=0.5
blocksize=32k
numjobs=2
iodepth=16
runtime=20m

Thanks.

Coly Li



