Filesystem corruption caused by disk from backing device kicked out of RAID1 array

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello,

I've unfortunately experienced a filesystem corruption with bcache in writeback mode. The good news is I can reproduce it.

Short version: this is on stock kernel 3.10.0, bcache-tools from 20130626. Both my backing and cache devices are in a RAID1 mdadm array. Writeback caching is enabled. One disk from the backing RAID1 suddenly has an issue and gets kicked out of the RAID. Everything continues smoothly, as expected.

The server is rebooted (several hours later) in order to 'reset' the failed disk and check if it's still functioning or needs to be replaced (this is a non-hot swap chassis). The disk is fine, but bcache is now very confused as it sees 2 devices with the same UUID. Boom, the filesystem is now corrupted.

Longer version: here's how to reproduce.

* create mdadm RAID1 arrays with a superblock at the end of the device (0.9 or 1.0):
mdadm -C /dev/md10 --metadata=1.0 -l 1 -n 2 /dev/sdc1 /dev/sdd1
mdadm -C /dev/md11 --metadata=1.0 -l 1 -n 2 /dev/sdc2 /dev/sdd2

* create your bcache device:
make-bcache -B /dev/md10 -C /dev/md11

* create a filesystem and mount it:
mkfs -t xfs /dev/bcache0
mount /dev/bcache0 /mnt

* enable writeback mode:
echo writeback > /sys/block/bcache0/bcache/cache_mode

* simulate writes on the filesystem:
iozone -a

* wait a bit, make sure data is being written on both the backing and cache device. At some point, simulate a failure on the disk:
mdadm -f /dev/md10 /dev/sdc1

* wait again until more data is written on the /dev/md10 backing device: /dev/sdc1 and /dev/sdd1 are now diverging.

* finally, reboot.

Here's what I get (relevant lines only):

[    9.759055] md: bind<sdc1>
[    9.759418] md: bind<sdd1>
[    9.759532] md: kicking non-fresh sdc1 from array!
[    9.759636] md: unbind<sdc1>
[    9.842171] md: export_rdev(sdc1)
[    9.844579] md/raid1:md10: active with 1 out of 2 mirrors
[    9.844708] md10: detected capacity change from 0 to 5379194880
[    9.844931] RAID1 conf printout:
[    9.844933]  --- wd:1 rd:2
[    9.844935]  disk 1, wo:0, o:1, dev:sdd1
[    9.845295] md: md11 stopped.
[    9.845729]  md10: unknown partition table
[    9.864962] md: bind<sdd2>
[    9.865332] md: bind<sdc2>
[    9.867314] md/raid1:md11: active with 2 out of 2 mirrors
[    9.867432] md11: detected capacity change from 0 to 1085669376
[    9.867611] RAID1 conf printout:
[    9.867613]  --- wd:2 rd:2
[    9.867615]  disk 0, wo:0, o:1, dev:sdc2
[    9.867617]  disk 1, wo:0, o:1, dev:sdd2
[    9.868148]  md11: unknown partition table

[   10.746609] bcache: register_bdev() registered backing device sdc1
[   10.855288] bcache: register_bdev() registered backing device md10
[ 10.926637] bcache: bch_journal_replay() journal replay done, 547 keys in 4 entries, seq 293 [ 10.936029] bcache: bch_cached_dev_attach() Caching md10 as bcache1 on set ba9ed0f7-c75c-401c-a841-7cc99c393b05

[   10.940232] ------------[ cut here ]------------
[   10.940342] WARNING: at fs/sysfs/dir.c:530 sysfs_add_one+0xbc/0xe0()
[ 10.940450] sysfs: cannot create duplicate filename '/fs/bcache/ba9ed0f7-c75c-401c-a841-7cc99c393b05/bdev0'
[   10.941491] ehci-pci 0000:00:1d.7: USB 2.0 started, EHCI 1.00
[ 10.940595] Modules linked in: bcache i2c_i801(+) ioatdma(+) ehci_pci(+) ehci_hcd coretemp i7core_edac edac_core usbhid uhci_hcd usbcore usb_common megaraid_sas netconsole ixgbe mdio igb hwmon dca i2c_algo_bit ptp pps_core
[   10.942312] CPU: 7 PID: 5126 Comm: bcache-register Not tainted 3.10.0 #1
[ 10.942450] Hardware name: Intel Corporation S5520HC/S5520HC, BIOS S5500.86B.01.00.0059.082320111421 08/23/2011 [ 10.942597] 0000000000000212 ffff880c2dd23848 ffffffff8169349e ffff880c2dd23888 [ 10.942974] ffffffff81044a1b ffff880c6ac38bc0 ffff880c2f37e978 ffff880c2dd239c8 [ 10.943351] ffff880c31795000 00000000ffffffef ffff880c2f3233c8 ffff880c2dd23968
[   10.943728] Call Trace:
[   10.943827]  [<ffffffff8169349e>] dump_stack+0x19/0x1b
[   10.943934]  [<ffffffff81044a1b>] warn_slowpath_common+0x6b/0xa0
[   10.944041]  [<ffffffff81044b49>] warn_slowpath_fmt+0x69/0x70
[   10.944150]  [<ffffffff81408d84>] ? strlcat+0x74/0x90
[   10.944255]  [<ffffffff81408d84>] ? strlcat+0x74/0x90
[   10.944359]  [<ffffffff81408d84>] ? strlcat+0x74/0x90
[   10.944463]  [<ffffffff81408d84>] ? strlcat+0x74/0x90
[   10.944567]  [<ffffffff811a1bbc>] sysfs_add_one+0xbc/0xe0
[   10.944674]  [<ffffffff811a2ccf>] sysfs_do_create_link_sd+0xbf/0x1f0
[   10.944783]  [<ffffffff811a2e5b>] sysfs_create_link+0x2b/0x30
[   10.944895]  [<ffffffffa0227586>] bcache_device_link+0xe6/0x100 [bcache]
[ 10.945009] [<ffffffffa0229390>] bch_cached_dev_attach+0x380/0x4b0 [bcache]
[   10.945123]  [<ffffffffa0228b6d>] ? __write_super+0x15d/0x170 [bcache]
[ 10.945236] [<ffffffffa0228cd3>] ? bcache_write_super+0x153/0x170 [bcache]
[   10.945350]  [<ffffffffa0229ced>] run_cache_set+0x82d/0x9d0 [bcache]
[   10.945463]  [<ffffffffa022a0aa>] register_cache_set+0x21a/0x2b0 [bcache]
[   10.945576]  [<ffffffffa022a803>] register_cache+0x6c3/0x710 [bcache]
[   10.945686]  [<ffffffff81121b68>] ? kmem_cache_alloc_trace+0xb8/0x1e0
[   10.945798]  [<ffffffffa022ae48>] register_bcache+0x5f8/0xb20 [bcache]
[   10.945908]  [<ffffffff81402987>] kobj_attr_store+0x17/0x20
[   10.946014]  [<ffffffff811a092c>] sysfs_write_file+0xcc/0x150
[   10.946123]  [<ffffffff81130a02>] vfs_write+0xe2/0x1e0
[   10.946228]  [<ffffffff81130c0c>] SyS_write+0x5c/0xa0
[   10.946335]  [<ffffffff8169ea12>] system_call_fastpath+0x16/0x1b
[   10.946442] ---[ end trace 1f31f40a281dfb68 ]---
[   10.946545] ------------[ cut here ]------------
[ 10.946652] WARNING: at drivers/md/bcache/super.c:689 bcache_device_link+0xc5/0x100 [bcache]()
[   10.947846] Couldn't create device <-> cache set symlinks
[ 10.947930] Modules linked in:<7>[ 10.948184] ioatdma 0000:00:16.2: irq 122 for MSI/MSI-X

[ 10.948095] bcache i2c_i801(+) ioatdma(+) ehci_pci ehci_hcd coretemp i7core_edac edac_core usbhid uhci_hcd usbcore usb_common megaraid_sas netconsole ixgbe mdio igb hwmon dca i2c_algo_bit ptp pps_core [ 10.949635] CPU: 7 PID: 5126 Comm: bcache-register Tainted: G W 3.10.0 #1 [ 10.949777] Hardware name: Intel Corporation S5520HC/S5520HC, BIOS S5500.86B.01.00.0059.082320111421 08/23/2011
[   10.949818] ioatdma 0000:00:16.6: irq 126 for MSI/MSI-X
[   10.950212] ioatdma 0000:00:16.7: irq 127 for MSI/MSI-X
[ 10.949924] 00000000000002b1 ffff880c2dd238f8 ffffffff8169349e ffff880c2dd23938 [ 10.950304] ffffffff81044a1b ffff880c2dd23928 ffff880c32260040 ffff880c32260010
[   10.950655] i801_smbus 0000:00:1f.3: SMBus using PCI Interrupt
[ 10.950803] ffff880c31f00040 ffff880c3226008c ffff880c31f00000 ffff880c2dd23a18
[   10.951208] Call Trace:
[   10.951306]  [<ffffffff8169349e>] dump_stack+0x19/0x1b
[   10.951411]  [<ffffffff81044a1b>] warn_slowpath_common+0x6b/0xa0
[   10.951528]  [<ffffffff81044b49>] warn_slowpath_fmt+0x69/0x70
[   10.951637]  [<ffffffff811a1d85>] ? release_sysfs_dirent+0x75/0x130
[   10.951746]  [<ffffffff811a2d86>] ? sysfs_do_create_link_sd+0x176/0x1f0
[   10.951857]  [<ffffffff811a2e5b>] ? sysfs_create_link+0x2b/0x30
[   10.951969]  [<ffffffffa0227565>] bcache_device_link+0xc5/0x100 [bcache]
[ 10.952083] [<ffffffffa0229390>] bch_cached_dev_attach+0x380/0x4b0 [bcache]
[   10.952198]  [<ffffffffa0228b6d>] ? __write_super+0x15d/0x170 [bcache]
[ 10.952312] [<ffffffffa0228cd3>] ? bcache_write_super+0x153/0x170 [bcache]
[   10.952426]  [<ffffffffa0229ced>] run_cache_set+0x82d/0x9d0 [bcache]
[   10.952540]  [<ffffffffa022a0aa>] register_cache_set+0x21a/0x2b0 [bcache]
[   10.952653]  [<ffffffffa022a803>] register_cache+0x6c3/0x710 [bcache]
[   10.952763]  [<ffffffff81121b68>] ? kmem_cache_alloc_trace+0xb8/0x1e0
[   10.952875]  [<ffffffffa022ae48>] register_bcache+0x5f8/0xb20 [bcache]
[   10.952986]  [<ffffffff81402987>] kobj_attr_store+0x17/0x20
[   10.953093]  [<ffffffff811a092c>] sysfs_write_file+0xcc/0x150
[   10.953200]  [<ffffffff81130a02>] vfs_write+0xe2/0x1e0
[   10.953305]  [<ffffffff81130c0c>] SyS_write+0x5c/0xa0
[   10.953410]  [<ffffffff8169ea12>] system_call_fastpath+0x16/0x1b
[   10.953517] ---[ end trace 1f31f40a281dfb69 ]---
[ 10.953621] bcache: bch_cached_dev_attach() Caching sdc1 as bcache0 on set ba9ed0f7-c75c-401c-a841-7cc99c393b05
[   10.953789] bcache: register_cache() registered cache device md11

And the filesystem is now obviously corrupted:

[   14.312553] XFS (bcache0): Mounting Filesystem
[   14.413270] XFS (bcache0): Starting recovery (logdev: internal)
[ 14.414033] XFS (bcache0): Internal error xlog_valid_rec_header(1) at line 3610 of file fs/xfs/xfs_log_recover.c. Caller 0xffffffff812f6da2
(XFS trace)
[   14.418639] XFS (bcache0): log mount/recovery failed: error 117
[   14.418800] XFS (bcache0): log mount failed

If the RAID arrays are created with a 1.1 or 1.2 superblock, the issue doesn't occur, as the kicked out disk is not recognized by bcache.

I'm not sure what bcache could do to prevent this. Preventing the second cache device from being created wouldn't solve the problem.

Thanks,
Cyril

--
To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux ARM Kernel]     [Linux Filesystem Development]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux