Re: bcache gets stuck flushing writeback cache when used in combination with LUKS/dm-crypt and non-default bucket size

On Sun, 8 May 2016, James Johnston wrote:
> [1.] One line summary of the problem:
> 
> bcache gets stuck flushing writeback cache when used in combination with
> LUKS/dm-crypt and non-default bucket size
> 
> [2.] Full description of the problem/report:
> 
> I've run into a problem where the bcache writeback cache can't be flushed to
> disk when the backing device is a LUKS/dm-crypt device and the cache set has
> a non-default bucket size.  Only a few megabytes get flushed to disk before it
> gets stuck: the bcache writeback task thrashes the disk, reading hundreds of
> MB/s from the cache set in an infinite loop without making any progress
> (dirty_data never decreases beyond a certain point).

While it's thrashing, can you try getting a stack trace from the
[bcache_writebac] kernel thread with `cat /proc/<pid>/stack`?

Run it several times, since the trace is bound to change; maybe we can
track down where the writeback path is spinning on disk I/O and add some
debug code.  Perhaps there is some error-and-retry logic that needs debug
output.
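
Something along these lines (a rough, untested sketch -- adjust the thread
name and sysfs paths if yours differ) would grab a handful of traces along
with dirty_data while it spins:

# untested sketch: kernel thread names are truncated to 15 characters, so
# the writeback thread shows up as "bcache_writebac" in ps/pgrep; if you
# have more than one bcache device, pick the right pid by hand
pid=$(pgrep -x bcache_writebac)
for i in $(seq 1 10); do
    date
    cat /sys/block/bcache0/bcache/dirty_data
    cat /proc/"$pid"/stack
    echo ----
    sleep 2
done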

--
Eric Wheeler



> 
> Can anybody else reproduce this apparent bug?  Apologies for mailing both the
> device mapper and bcache mailing lists, but I'm not sure where the bug lies,
> as I've only reproduced it when both are used in combination.
> 
> As far as I can tell, the situation is unrecoverable: if you attempt to detach
> the cache set, the cache set disk just gets thrashed even harder, forever, and
> the detach never completes.  The only solution seems to be to back up the data
> and destroy the volume...
> 
> [3.] Keywords (i.e., modules, networking, kernel):
> 
> bcache, dm-crypt, LUKS, device mapper, LVM
> 
> [4.] Kernel information
> [4.1.] Kernel version (from /proc/version):
> Linux version 4.6.0-040600rc6-generic (kernel@gloin) (gcc version 5.2.1 20151010 (Ubuntu 5.2.1-22ubuntu2) ) #201605012031 SMP Mon May 2 00:33:26 UTC 2016
> 
> [7.] A small shell script or example program which triggers the
>      problem (if possible)
> 
> Here are the steps I used to reproduce:
> 
> 1.  Set up an Ubuntu 16.04 virtual machine in VMware with three SATA hard
>     drives.  Ubuntu was installed with default settings, except that (1) guided
>     partitioning was used with NO LVM or dm-crypt, and (2) the OpenSSH server
>     was installed.  The first SATA drive holds the operating system, the second
>     SATA drive is used for the bcache cache set, and the third SATA drive holds
>     the dm-crypt/LUKS + bcache backing device.  Note that all drives have
>     512-byte physical sectors.  Also, all virtual drives are backed by a single
>     physical SSD with 512-byte sectors (i.e. not advanced format).
> 
> 2.  Ubuntu was updated to the latest packages as of 5/8/2016.  The problem
>     reproduces with both the distribution kernel 4.4.0-22-generic and the
>     mainline kernel 4.6.0-040600rc6-generic distributed by the Ubuntu kernel
>     team.  The installed bcache-tools package was 1.0.8-2; the installed
>     cryptsetup-bin package was 2:1.6.6-5ubuntu2.
> 
> 3.  Set up the cache set, dm-crypt, and backing device:
> 
> sudo -s
> # Make cache set on second drive
> # IMPORTANT:  Problem does not occur if I omit the --bucket parameter.
> make-bcache --bucket 2M -C /dev/sdb
> # Set up LUKS/dm-crypt on the third drive.
> # IMPORTANT:  Problem does not occur if I omit the dm-crypt layer.
> cryptsetup luksFormat /dev/sdc
> cryptsetup open --type luks /dev/sdc backCrypt
> # Make bcache backing device & enable writeback
> make-bcache -B /dev/mapper/backCrypt
> bcache-super-show /dev/sdb | grep cset.uuid | \
> cut -f 3 > /sys/block/bcache0/bcache/attach
> echo writeback > /sys/block/bcache0/bcache/cache_mode
> 
> 4.  Finally, this is the kill sequence to bring the system to its knees:
> 
> sudo -s
> cd /sys/block/bcache0/bcache
> echo 0 > sequential_cutoff
> # Verify that the cache is attached (i.e. does not say "no cache").  It should
> # say that it's clean since we haven't written anything yet.
> cat state
> # Copy some random data.
> dd if=/dev/urandom of=/dev/bcache0 bs=1M count=250
> # Show current state.  On my system approximately 20 to 25 MB remain in
> # writeback cache.
> cat dirty_data
> cat state
> # Detach the cache set.  This will start the cache set disk thrashing.
> echo 1 > detach
> # After a few moments, confirm that the cache set is not going anywhere.  On
> # my system, only a few MB have been flushed as evidenced by a small decrease
> # in dirty_data.  State remains dirty.
> cat dirty_data
> cat state
> # At this point, the hypervisor reports hundreds of MB/second of reads to the
> # underlying physical SSD coming from the virtual machine; the hard drive
> # light is stuck on...  The hypervisor status bar shows the activity is on the
> # cache set.  No writes seem to be occurring on any disk.
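
While it is wedged like this, an untested loop along these lines (same sysfs
files as used above) would also show whether dirty_data is moving at all over
time:

while true; do
    date
    cat /sys/block/bcache0/bcache/dirty_data /sys/block/bcache0/bcache/state
    sleep 5
done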
> 
> [8.] Environment
> [8.1.] Software (add the output of the ver_linux script here)
> Linux bcachetest2 4.6.0-040600rc6-generic #201605012031 SMP Mon May 2 00:33:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> 
> Util-linux              2.27.1
> Mount                   2.27.1
> Module-init-tools       22
> E2fsprogs               1.42.13
> Xfsprogs                4.3.0
> Linux C Library         2.23
> Dynamic linker (ldd)    2.23
> Linux C++ Library       6.0.21
> Procps                  3.3.10
> Net-tools               1.60
> Kbd                     1.15.5
> Console-tools           1.15.5
> Sh-utils                8.25
> Udev                    229
> Modules Loaded          8250_fintek ablk_helper aesni_intel aes_x86_64 ahci async_memcpy async_pq async_raid6_recov async_tx async_xor autofs4 btrfs configfs coretemp crc32_pclmul crct10dif_pclmul cryptd drm drm_kms_helper e1000 fb_sys_fops fjes gf128mul ghash_clmulni_intel glue_helper hid hid_generic i2c_piix4 ib_addr ib_cm ib_core ib_iser ib_mad ib_sa input_leds iscsi_tcp iw_cm joydev libahci libcrc32c libiscsi libiscsi_tcp linear lrw mac_hid mptbase mptscsih mptspi multipath nfit parport parport_pc pata_acpi ppdev psmouse raid0 raid10 raid1 raid456 raid6_pq rdma_cm scsi_transport_iscsi scsi_transport_spi serio_raw shpchp syscopyarea sysfillrect sysimgblt ttm usbhid vmw_balloon vmwgfx vmw_vmci vmw_vsock_vmci_transport vsock xor
> 
> [8.2.] Processor information (from /proc/cpuinfo):
> processor       : 0
> vendor_id       : GenuineIntel
> cpu family      : 6
> model           : 42
> model name      : Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
> stepping        : 7
> microcode       : 0x29
> cpu MHz         : 2491.980
> cache size      : 3072 KB
> physical id     : 0
> siblings        : 1
> core id         : 0
> cpu cores       : 1
> apicid          : 0
> initial apicid  : 0
> fpu             : yes
> fpu_exception   : yes
> cpuid level     : 13
> wp              : yes
> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc aperfmperf eagerfpu pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx hypervisor lahf_lm epb tsc_adjust dtherm ida arat pln pts
> bugs            :
> bogomips        : 4983.96
> clflush size    : 64
> cache_alignment : 64
> address sizes   : 42 bits physical, 48 bits virtual
> power management:
> 
> [8.3.] Module information (from /proc/modules):
> ppdev 20480 0 - Live 0x0000000000000000
> vmw_balloon 20480 0 - Live 0x0000000000000000
> vmw_vsock_vmci_transport 28672 1 - Live 0x0000000000000000
> vsock 36864 2 vmw_vsock_vmci_transport, Live 0x0000000000000000
> coretemp 16384 0 - Live 0x0000000000000000
> joydev 20480 0 - Live 0x0000000000000000
> input_leds 16384 0 - Live 0x0000000000000000
> serio_raw 16384 0 - Live 0x0000000000000000
> shpchp 36864 0 - Live 0x0000000000000000
> vmw_vmci 65536 2 vmw_balloon,vmw_vsock_vmci_transport, Live 0x0000000000000000
> i2c_piix4 24576 0 - Live 0x0000000000000000
> nfit 40960 0 - Live 0x0000000000000000
> 8250_fintek 16384 0 - Live 0x0000000000000000
> parport_pc 32768 0 - Live 0x0000000000000000
> parport 49152 2 ppdev,parport_pc, Live 0x0000000000000000
> mac_hid 16384 0 - Live 0x0000000000000000
> ib_iser 49152 0 - Live 0x0000000000000000
> rdma_cm 53248 1 ib_iser, Live 0x0000000000000000
> iw_cm 49152 1 rdma_cm, Live 0x0000000000000000
> ib_cm 45056 1 rdma_cm, Live 0x0000000000000000
> ib_sa 36864 2 rdma_cm,ib_cm, Live 0x0000000000000000
> ib_mad 49152 2 ib_cm,ib_sa, Live 0x0000000000000000
> ib_core 122880 6 ib_iser,rdma_cm,iw_cm,ib_cm,ib_sa,ib_mad, Live 0x0000000000000000
> ib_addr 20480 3 rdma_cm,ib_sa,ib_core, Live 0x0000000000000000
> configfs 40960 2 rdma_cm, Live 0x0000000000000000
> iscsi_tcp 20480 0 - Live 0x0000000000000000
> libiscsi_tcp 24576 1 iscsi_tcp, Live 0x0000000000000000
> libiscsi 53248 3 ib_iser,iscsi_tcp,libiscsi_tcp, Live 0x0000000000000000
> scsi_transport_iscsi 98304 4 ib_iser,iscsi_tcp,libiscsi, Live 0x0000000000000000
> autofs4 40960 2 - Live 0x0000000000000000
> btrfs 1024000 0 - Live 0x0000000000000000
> raid10 49152 0 - Live 0x0000000000000000
> raid456 110592 0 - Live 0x0000000000000000
> async_raid6_recov 20480 1 raid456, Live 0x0000000000000000
> async_memcpy 16384 2 raid456,async_raid6_recov, Live 0x0000000000000000
> async_pq 16384 2 raid456,async_raid6_recov, Live 0x0000000000000000
> async_xor 16384 3 raid456,async_raid6_recov,async_pq, Live 0x0000000000000000
> async_tx 16384 5 raid456,async_raid6_recov,async_memcpy,async_pq,async_xor, Live 0x0000000000000000
> xor 24576 2 btrfs,async_xor, Live 0x0000000000000000
> raid6_pq 102400 4 btrfs,raid456,async_raid6_recov,async_pq, Live 0x0000000000000000
> libcrc32c 16384 1 raid456, Live 0x0000000000000000
> raid1 36864 0 - Live 0x0000000000000000
> raid0 20480 0 - Live 0x0000000000000000
> multipath 16384 0 - Live 0x0000000000000000
> linear 16384 0 - Live 0x0000000000000000
> hid_generic 16384 0 - Live 0x0000000000000000
> usbhid 49152 0 - Live 0x0000000000000000
> hid 122880 2 hid_generic,usbhid, Live 0x0000000000000000
> crct10dif_pclmul 16384 0 - Live 0x0000000000000000
> crc32_pclmul 16384 0 - Live 0x0000000000000000
> ghash_clmulni_intel 16384 0 - Live 0x0000000000000000
> aesni_intel 167936 0 - Live 0x0000000000000000
> aes_x86_64 20480 1 aesni_intel, Live 0x0000000000000000
> lrw 16384 1 aesni_intel, Live 0x0000000000000000
> gf128mul 16384 1 lrw, Live 0x0000000000000000
> glue_helper 16384 1 aesni_intel, Live 0x0000000000000000
> ablk_helper 16384 1 aesni_intel, Live 0x0000000000000000
> cryptd 20480 3 ghash_clmulni_intel,aesni_intel,ablk_helper, Live 0x0000000000000000
> vmwgfx 237568 1 - Live 0x0000000000000000
> ttm 98304 1 vmwgfx, Live 0x0000000000000000
> drm_kms_helper 147456 1 vmwgfx, Live 0x0000000000000000
> syscopyarea 16384 1 drm_kms_helper, Live 0x0000000000000000
> psmouse 131072 0 - Live 0x0000000000000000
> sysfillrect 16384 1 drm_kms_helper, Live 0x0000000000000000
> sysimgblt 16384 1 drm_kms_helper, Live 0x0000000000000000
> fb_sys_fops 16384 1 drm_kms_helper, Live 0x0000000000000000
> drm 364544 4 vmwgfx,ttm,drm_kms_helper, Live 0x0000000000000000
> ahci 36864 2 - Live 0x0000000000000000
> libahci 32768 1 ahci, Live 0x0000000000000000
> e1000 135168 0 - Live 0x0000000000000000
> mptspi 24576 0 - Live 0x0000000000000000
> mptscsih 40960 1 mptspi, Live 0x0000000000000000
> mptbase 102400 2 mptspi,mptscsih, Live 0x0000000000000000
> scsi_transport_spi 32768 1 mptspi, Live 0x0000000000000000
> pata_acpi 16384 0 - Live 0x0000000000000000
> fjes 28672 0 - Live 0x0000000000000000
> 
> [8.6.] SCSI information (from /proc/scsi/scsi)
> Attached devices:
> Host: scsi3 Channel: 00 Id: 00 Lun: 00
>   Vendor: ATA      Model: VMware Virtual S Rev: 0001
>   Type:   Direct-Access                    ANSI  SCSI revision: 05
> Host: scsi4 Channel: 00 Id: 00 Lun: 00
>   Vendor: NECVMWar Model: VMware SATA CD01 Rev: 1.00
>   Type:   CD-ROM                           ANSI  SCSI revision: 05
> Host: scsi5 Channel: 00 Id: 00 Lun: 00
>   Vendor: ATA      Model: VMware Virtual S Rev: 0001
>   Type:   Direct-Access                    ANSI  SCSI revision: 05
> Host: scsi6 Channel: 00 Id: 00 Lun: 00
>   Vendor: ATA      Model: VMware Virtual S Rev: 0001
>   Type:   Direct-Access                    ANSI  SCSI revision: 05
> 
> Best regards,
> 
> James Johnston
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


