Some other details I just remembered.
I can trigger this hang whether or not the bcache device (i.e.,
/dev/bcache0) is formatted.
While the re-attach of the cache device is hung, I can still mount the
filesystem, but any attempt to access it (e.g., ls) hangs indefinitely.
PPC64LE is PPC64 in little-endian mode, but is it possible that some
code paths assume PPC is always big endian? I've heard this is the
case with some userspace software.
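As a quick sanity check on the endianness question, the byte order the running system actually uses can be confirmed from userspace. The sketch below is not from the original report, and `host_byte_order` is a hypothetical helper name. It feeds two known bytes to `od` and looks at how they come back as one 16-bit word.

```shell
# Hypothetical helper (not part of the report): print the host byte order.
host_byte_order() {
    # od reads the bytes 0x01 0x00 as a single 16-bit word in host order.
    word=$(printf '\001\000' | od -An -tx2 | tr -d ' \n')
    case "$word" in
        0001) echo little ;;   # low byte stored first
        0100) echo big ;;      # high byte stored first
        *)    echo unknown ;;
    esac
}
```

On a ppc64le system, `host_byte_order` should print `little`. That only confirms the mode the machine runs in; it says nothing about whether a given kernel code path handles it correctly.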
On 07/09/2018 08:51 AM, Cameron Berkenpas wrote:
Hello,
Thank you for the fast response! Sorry if this message is too verbose...
Yes, ppc64le is just PPC in little endian mode, correct.
How I'm creating the devices for bcache:
make-bcache -B /dev/sdd1
make-bcache -C --writeback --discard /dev/sdc1
(I've also tried without the --writeback and --discard options.)
When attaching the caching device for the *first* time (I suspect this
is normal):
[ 193.275300] bcache: register_bdev() registered backing device sdd1
[ 193.292590] bcache: run_cache_set() invalidating existing data
[ 223.527043] bcache: register_cache() registered cache device sdc1
[ 223.534950] bcache: bch_cached_dev_attach() Caching sdd1 as bcache0 on set 6aa362b3-606e-4c51-9bc7-807b8a6a8442
Detaching caching device:
[ 325.293675] bcache: cached_dev_detach_finish() Caching disabled for sdd1
And finally when I attempt to re-attach, things hang. No messages.
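For reference, the attach, detach, and re-attach cycle can be sketched as below. The report does not show the exact commands used, so this is an assumption based on the sysfs files documented for bcache (`attach` and `detach` under `/sys/block/<dev>/bcache/`). `SYSFS_ROOT`, `BCACHE_DEV`, and the function names are hypothetical, added so the sketch can be pointed at a scratch directory instead of a live system.

```shell
# Assumed defaults; override them to test against a scratch directory.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"
BCACHE_DEV="${BCACHE_DEV:-bcache0}"

# Write a value to one bcache sysfs attribute of the backing device.
bcache_write() {
    attr="$1"
    value="$2"
    printf '%s\n' "$value" > "$SYSFS_ROOT/block/$BCACHE_DEV/bcache/$attr"
}

# Attach the backing device to the cache set with the given UUID.
attach_cache() { bcache_write attach "$1"; }

# Ask bcache to detach the backing device from its cache set.
detach_cache() { bcache_write detach 1; }

# Attach, detach, then attach again; the last step is where the hang is reported.
reattach_cycle() {
    cset_uuid="$1"
    attach_cache "$cset_uuid" &&
    detach_cache &&
    attach_cache "$cset_uuid"
}
```

With the cache set UUID from the log above, this would be run as root as `reattach_cycle 6aa362b3-606e-4c51-9bc7-807b8a6a8442`. On a real system the detach may not complete immediately, so it is probably safer to wait for the "Caching disabled" message before the second attach.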
Here's the trace from 'echo l > /proc/sysrq-trigger':
[ 526.384192] sysrq: SysRq : Show backtrace of all active CPUs
[ 526.384673] sysrq: CPU20:
[ 526.384710] Call Trace:
[ 526.384742] [c000001e5085f930] [c000000000778ce0] showacpu+0x80/0xa0 (unreliable)
[ 526.384841] [c000001e5085f9a0] [c0000000001d6dd8] flush_smp_call_function_queue+0x128/0x1d0
[ 526.384958] [c000001e5085fa20] [c00000000004d89c] smp_ipi_demux_relaxed+0x9c/0x110
[ 526.385075] [c000001e5085fa60] [c000000000048750] doorbell_exception+0xb0/0xf0
[ 526.385173] [c000001e5085faa0] [c000000000009fa8] h_doorbell_common+0x158/0x160
[ 526.385282] --- interrupt: e81 at replay_interrupt_return+0x0/0x4
LR = arch_local_irq_restore+0x74/0x90
[ 526.385420] [c000001e5085fd90] [0000000000000014] 0x14 (unreliable)
[ 526.385511] [c000001e5085fdb0] [c0000000009f82d0] cpuidle_enter_state+0xf0/0x400
[ 526.385618] [c000001e5085fe10] [c000000000154df0] call_cpuidle+0x70/0xd0
[ 526.385713] [c000001e5085fe50] [c00000000015554c] do_idle+0x31c/0x3a0
[ 526.385798] [c000001e5085fec0] [c00000000015582c] cpu_startup_entry+0x3c/0x50
[ 526.385892] [c000001e5085fef0] [c00000000004ec6c] start_secondary+0x4fc/0x540
[ 526.385992] [c000001e5085ff90] [c00000000000b270] start_secondary_prolog+0x10/0x14
Now that the re-attach has hung for a while, I have the following:
[ 605.232666] INFO: task bash:2134 blocked for more than 120 seconds.
[ 605.232715] Not tainted 4.17.5 #7
[ 605.232746] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 605.232807] bash D 0 2134 2133 0x00040000
[ 605.232846] Call Trace:
[ 605.232883] [c000001c0ce8f850] [c00000000001e6ec] __switch_to+0x30c/0x4d0
[ 605.232957] [c000001c0ce8f8b0] [c000000000c08470] __schedule+0x330/0xa90
[ 605.233030] [c000001c0ce8f980] [c000000000c08c10] schedule+0x40/0xc0
[ 605.233113] [c000001c0ce8f9a0] [c000000000c0d708] rwsem_down_write_failed+0x198/0x390
[ 605.233220] [c000001c0ce8fa50] [c000000000c0c588] down_write+0x78/0xa0
[ 605.233304] [c000001c0ce8fa80] [c0000000009d79fc] bch_cached_dev_attach+0x35c/0x5c0
[ 605.233394] [c000001c0ce8fb50] [c0000000009db7b0] __cached_dev_store+0x820/0x8c0
[ 605.233466] [c000001c0ce8fc00] [c0000000009db8b4] bch_cached_dev_store+0x64/0x1a0
[ 605.233529] [c000001c0ce8fc50] [c000000000490bcc] sysfs_kf_write+0x7c/0xc0
[ 605.233585] [c000001c0ce8fc90] [c00000000048f79c] kernfs_fop_write+0x18c/0x250
[ 605.233693] [c000001c0ce8fce0] [c0000000003c144c] __vfs_write+0x6c/0x1d0
[ 605.233776] [c000001c0ce8fd80] [c0000000003c1808] vfs_write+0xd8/0x240
[ 605.233860] [c000001c0ce8fdd0] [c0000000003c1bd0] ksys_write+0x70/0x120
[ 605.233935] [c000001c0ce8fe30] [c00000000000b9e0] system_call+0x58/0x6c
And finally, here's the stack for that bash process:
[<0>] (null)
[<0>] __switch_to+0x30c/0x4d0
[<0>] bch_cached_dev_attach+0x35c/0x5c0
[<0>] __cached_dev_store+0x820/0x8c0
[<0>] bch_cached_dev_store+0x64/0x1a0
[<0>] sysfs_kf_write+0x7c/0xc0
[<0>] kernfs_fop_write+0x18c/0x250
[<0>] __vfs_write+0x6c/0x1d0
[<0>] vfs_write+0xd8/0x240
[<0>] ksys_write+0x70/0x120
[<0>] system_call+0x58/0x6c
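To find tasks stuck in uninterruptible sleep (the `D` state shown for bash in the hung-task report) without waiting for the hung-task detector, `/proc/<pid>/status` can be scanned. This is a minimal sketch; `PROC_ROOT` and `list_blocked_tasks` are hypothetical names, not something from the report.

```shell
# Assumed default; override it to test against a scratch directory.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Print "<pid> <name>" for every task whose state is D (uninterruptible sleep).
list_blocked_tasks() {
    for dir in "$PROC_ROOT"/[0-9]*; do
        [ -r "$dir/status" ] || continue
        # The State line looks like "State:  D (disk sleep)".
        state=$(awk '$1 == "State:" { print $2; exit }' "$dir/status")
        [ "$state" = "D" ] || continue
        # Only the first word of the name is kept.
        name=$(awk '$1 == "Name:" { print $2; exit }' "$dir/status")
        printf '%s %s\n' "${dir##*/}" "$name"
    done
}
```

Each PID it prints can then be inspected as root with `cat /proc/<pid>/stack`, which is how a stack like the one for the bash process can be obtained.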
In case it's useful, here's the /proc/<pid>/stack of all the bcache
kernel processes:
[bcache]:
[<0>] (null)
[<0>] __switch_to+0x30c/0x4d0
[<0>] rescuer_thread+0x3a8/0x470
[<0>] kthread+0x1a8/0x1b0
[<0>] ret_from_kernel_thread+0x5c/0x8c
[bcache_gc]:
[<0>] (null)
[<0>] __switch_to+0x30c/0x4d0
[<0>] rescuer_thread+0x3a8/0x470
[<0>] kthread+0x1a8/0x1b0
[<0>] ret_from_kernel_thread+0x5c/0x8c
[bcache_allocato]:
[<0>] 0xa00000000
[<0>] __switch_to+0x30c/0x4d0
[<0>] bch_allocator_thread+0x2e8/0xde0
[<0>] kthread+0x1a8/0x1b0
[<0>] ret_from_kernel_thread+0x5c/0x8c
[bcache_gc]:
[<0>] (null)
[<0>] __switch_to+0x30c/0x4d0
[<0>] bch_gc_thread+0x220/0x260
[<0>] kthread+0x1a8/0x1b0
[<0>] ret_from_kernel_thread+0x5c/0x8c
[bcache_writebac]:
[<0>] (null)
[<0>] __switch_to+0x30c/0x4d0
[<0>] rescuer_thread+0x3a8/0x470
[<0>] kthread+0x1a8/0x1b0
[<0>] ret_from_kernel_thread+0x5c/0x8c
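A listing like the one above can be gathered in one pass by matching on the thread name in `/proc/<pid>/comm`. The sketch below is one assumed way to do it, not necessarily how the output above was produced. `PROC_ROOT` and `dump_bcache_stacks` are hypothetical names.

```shell
# Assumed default; override it to test against a scratch directory.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Print the kernel stack of every task whose name starts with "bcache".
dump_bcache_stacks() {
    for dir in "$PROC_ROOT"/[0-9]*; do
        [ -r "$dir/comm" ] || continue
        name=$(cat "$dir/comm")
        case "$name" in
            bcache*)
                printf '[%s]:\n' "$name"
                # Reading the stack file normally requires root.
                cat "$dir/stack" 2>/dev/null || echo "(stack not readable)"
                ;;
        esac
    done
}
```

Kernel thread names in `comm` are truncated to 15 characters, which is why names such as `bcache_allocato` and `bcache_writebac` appear cut off.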
On 07/09/2018 06:09 AM, Coly Li wrote:
On 2018/7/7 11:55 AM, Cameron Berkenpas wrote:
Hello,
I've recently tried to play around with bcache on some PPC64LE (POWER9,
little endian mode) hardware, and I've run into some issues.
I can format a device as a backing store no problem. If I never attach a
caching device, it never has problems.
I can format a caching device no problem, and I can attach it and use the
cached bcache volume without issue.
If I detach the caching device, there are no perceivable issues. However,
once I attempt to re-attach the caching device, bcache hangs. For example,
if I try to cat an arbitrary item under /sys (such as 'cat
/sys/block/bcache0/bcache/sequential_cutoff'), it hangs indefinitely.
If I reboot with the caching device attached, then after the system comes
back up, neither the caching nor the backing volume is recognized as a
bcache device. Both are unmountable, and both must have make-bcache run on
them again.
If I never attach a caching device to the backing device, the backing
device never has problems. It appears to stay healthy between reboots. I've
had uptimes of over 8 hours without a caching device and without issue. No
load was put on the volume during that time, but otherwise there were no
issues.
The RAID controller I'm using (details below) is a very thoroughly
tested device that reports no issues. Specifically, the RAID controller
was tested on an x86 machine that's using bcache in the same manner as
here without issue.
What is the appropriate place to file a bug against this issue?
I've tried against multiple filesystems:
xfs
ext4
btrfs
I've tried against multiple kernels:
linux-image-4.16.0-2-powerpc64le (stock Debian kernel)
linux-image-4.16.18
linux-image-4.17.4
My setup:
Debian buster (testing)
1x POWER9 processor
LSI MegaRAID 9361-16i (ALL disks are attached to this device).
128GB ECC memory
1x 10TB RAID1 volume (2 spinning disks)
2x 2TB Samsung 860 Pro SSDs
2x 1TB Samsung 860 Pro SSDs
(Note: all SSDs are in JBOD mode so I can access the individual disks
for TRIM, etc.)
I've formatted the entire 10TB RAID1 volume as a backing store. I've
tried using 2TB and 1TB SSDs as caching devices (all brand new). I've
even tried using a 1TB SSD as a caching device for a 2TB SSD. The
results have not differed.
Thanks!
Hi Cameron,
Do you see anything suspicious in the kernel messages? Or, when the process
hangs, can you get its stack trace from /proc or via sysrq?
I haven't touched any PPC64LE machine so far, and I assume it is little
endian like x86-64, am I right? I currently have a bug report that bcache
does not work on big-endian machines, and I am working on the fix. But
your machine is PPC64LE, so maybe this is a separate issue.
Thanks.
Coly Li