Re: bcache issues on PPC64LE

Coly Li <colyli@xxxxxxx> · Thu, 12 Jul 2018 20:51:27 +0800

On 2018/7/9 11:51 PM, Cameron Berkenpas wrote:
> Hello,
> 
> Thank you for the fast response! Sorry if this message is too verbose...
> 
> Yes, ppc64le is just PPC in little endian mode, correct.
> 
> How I'm creating the devices for bcache:
> make-bcache -B /dev/sdd1
> 
> (I've tried removing the writeback and discard options too)
> make-bcache -C --writeback --discard /dev/sdc1
> 
> When attaching the caching device for the *first* time (I suspect this
> is normal):
> [  193.275300] bcache: register_bdev() registered backing device sdd1
> [  193.292590] bcache: run_cache_set() invalidating existing data
> [  223.527043] bcache: register_cache() registered cache device sdc1
> [  223.534950] bcache: bch_cached_dev_attach() Caching sdd1 as bcache0
> on set 6aa362b3-606e-4c51-9bc7-807b8a6a8442
> 
> Detaching caching device:
> [  325.293675] bcache: cached_dev_detach_finish() Caching disabled for sdd1
> 
> And finally when I attempt to re-attach, things hang. No messages.
> 
> Here's the trace from 'echo l > /proc/sysrq-trigger':
> [  526.384192] sysrq: SysRq : Show backtrace of all active CPUs
> [  526.384673] sysrq: CPU20:
> [  526.384710] Call Trace:
> [  526.384742] [c000001e5085f930] [c000000000778ce0] showacpu+0x80/0xa0
> (unreliable)
> [  526.384841] [c000001e5085f9a0] [c0000000001d6dd8]
> flush_smp_call_function_queue+0x128/0x1d0
> [  526.384958] [c000001e5085fa20] [c00000000004d89c]
> smp_ipi_demux_relaxed+0x9c/0x110
> [  526.385075] [c000001e5085fa60] [c000000000048750]
> doorbell_exception+0xb0/0xf0
> [  526.385173] [c000001e5085faa0] [c000000000009fa8]
> h_doorbell_common+0x158/0x160
> [  526.385282] --- interrupt: e81 at replay_interrupt_return+0x0/0x4
>                    LR = arch_local_irq_restore+0x74/0x90
> [  526.385420] [c000001e5085fd90] [0000000000000014] 0x14 (unreliable)
> [  526.385511] [c000001e5085fdb0] [c0000000009f82d0]
> cpuidle_enter_state+0xf0/0x400
> [  526.385618] [c000001e5085fe10] [c000000000154df0] call_cpuidle+0x70/0xd0
> [  526.385713] [c000001e5085fe50] [c00000000015554c] do_idle+0x31c/0x3a0
> [  526.385798] [c000001e5085fec0] [c00000000015582c]
> cpu_startup_entry+0x3c/0x50
> [  526.385892] [c000001e5085fef0] [c00000000004ec6c]
> start_secondary+0x4fc/0x540
> [  526.385992] [c000001e5085ff90] [c00000000000b270]
> start_secondary_prolog+0x10/0x14
> 
> Now that the re-attach has hung for a while, I have the following:
> [  605.232666] INFO: task bash:2134 blocked for more than 120 seconds.
> [  605.232715]       Not tainted 4.17.5 #7
> [  605.232746] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  605.232807] bash            D    0  2134   2133 0x00040000
> [  605.232846] Call Trace:
> [  605.232883] [c000001c0ce8f850] [c00000000001e6ec]
> __switch_to+0x30c/0x4d0
> [  605.232957] [c000001c0ce8f8b0] [c000000000c08470] __schedule+0x330/0xa90
> [  605.233030] [c000001c0ce8f980] [c000000000c08c10] schedule+0x40/0xc0
> [  605.233113] [c000001c0ce8f9a0] [c000000000c0d708]
> rwsem_down_write_failed+0x198/0x390
> [  605.233220] [c000001c0ce8fa50] [c000000000c0c588] down_write+0x78/0xa0
> [  605.233304] [c000001c0ce8fa80] [c0000000009d79fc]
> bch_cached_dev_attach+0x35c/0x5c0
> [  605.233394] [c000001c0ce8fb50] [c0000000009db7b0]
> __cached_dev_store+0x820/0x8c0
> [  605.233466] [c000001c0ce8fc00] [c0000000009db8b4]
> bch_cached_dev_store+0x64/0x1a0
> [  605.233529] [c000001c0ce8fc50] [c000000000490bcc]
> sysfs_kf_write+0x7c/0xc0
> [  605.233585] [c000001c0ce8fc90] [c00000000048f79c]
> kernfs_fop_write+0x18c/0x250
> [  605.233693] [c000001c0ce8fce0] [c0000000003c144c] __vfs_write+0x6c/0x1d0
> [  605.233776] [c000001c0ce8fd80] [c0000000003c1808] vfs_write+0xd8/0x240
> [  605.233860] [c000001c0ce8fdd0] [c0000000003c1bd0] ksys_write+0x70/0x120
> [  605.233935] [c000001c0ce8fe30] [c00000000000b9e0] system_call+0x58/0x6c
> 
> And finally, here's the stack for that bash process:
> [<0>]           (null)
> [<0>] __switch_to+0x30c/0x4d0
> [<0>] bch_cached_dev_attach+0x35c/0x5c0
> [<0>] __cached_dev_store+0x820/0x8c0
> [<0>] bch_cached_dev_store+0x64/0x1a0
> [<0>] sysfs_kf_write+0x7c/0xc0
> [<0>] kernfs_fop_write+0x18c/0x250
> [<0>] __vfs_write+0x6c/0x1d0
> [<0>] vfs_write+0xd8/0x240
> [<0>] ksys_write+0x70/0x120
> [<0>] system_call+0x58/0x6c
> 
> In case it's useful, here's the /proc/<pid>/stack of all the bcache
> kernel processes:
> 
> [bcache]:
> [<0>]           (null)
> [<0>] __switch_to+0x30c/0x4d0
> [<0>] rescuer_thread+0x3a8/0x470
> [<0>] kthread+0x1a8/0x1b0
> [<0>] ret_from_kernel_thread+0x5c/0x8c
> 
> [bcache_gc]:
> [<0>]           (null)
> [<0>] __switch_to+0x30c/0x4d0
> [<0>] rescuer_thread+0x3a8/0x470
> [<0>] kthread+0x1a8/0x1b0
> [<0>] ret_from_kernel_thread+0x5c/0x8c
> 
> [bcache_allocato]:
> [<0>] 0xa00000000
> [<0>] __switch_to+0x30c/0x4d0
> [<0>] bch_allocator_thread+0x2e8/0xde0
> [<0>] kthread+0x1a8/0x1b0
> [<0>] ret_from_kernel_thread+0x5c/0x8c
> 
> [bcache_gc]:
> [<0>]           (null)
> [<0>] __switch_to+0x30c/0x4d0
> [<0>] bch_gc_thread+0x220/0x260
> [<0>] kthread+0x1a8/0x1b0
> [<0>] ret_from_kernel_thread+0x5c/0x8c
> 
> [bcache_writebac]:
> [<0>]           (null)
> [<0>] __switch_to+0x30c/0x4d0
> [<0>] rescuer_thread+0x3a8/0x470
> [<0>] kthread+0x1a8/0x1b0
> [<0>] ret_from_kernel_thread+0x5c/0x8c
> 

Hi Cameron,

This looks like a deadlock on the writeback_lock semaphore. The 4.17
kernel is quite recent, so I suspect the upstream kernel may have the
same issue.

Would it be possible for you to compile and run Linux v4.18-rc3? If so, I
will send you a debug patch that prints kernel messages so we can see what
happens. I know of one very rare deadlock (which might never happen in
practice), but I am not sure whether it matches your case.
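
If you end up collecting the per-thread stacks again on the newer kernel,
a small loop along these lines might save some typing. It is only a
sketch: the function name dump_bcache_stacks and its optional proc-root
argument are made up for illustration, and it assumes the interesting
threads all have names starting with "bcache", as in your listing.
Reading /proc/<pid>/stack normally requires root.

```shell
# Print the kernel stack of every task whose name starts with "bcache".
# The optional first argument is a hypothetical override of the proc
# directory (defaults to /proc); it exists only so the loop can be tried
# against a fake directory tree.
dump_bcache_stacks() {
    proc_root="${1:-/proc}"
    for dir in "$proc_root"/[0-9]*; do
        # Skip entries that vanished or have no readable task name.
        [ -r "$dir/comm" ] || continue
        name=$(cat "$dir/comm" 2>/dev/null) || continue
        case "$name" in
            bcache*)
                printf '[%s] (pid %s):\n' "$name" "${dir##*/}"
                # The stack file may be unreadable without root; ignore errors.
                cat "$dir/stack" 2>/dev/null
                echo
                ;;
        esac
    done
}
```

Running dump_bcache_stacks with no argument, as root, while the re-attach
is hung should print one block per bcache thread, similar to what you
pasted above.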

Thanks.

Coly Li