On 9/19/18 10:40 AM, nitroxis wrote:
Hi,
I think I found a bug in bcache. When I try to create a cache set on a
loop device, I get the following error in dmesg:
------------[ cut here ]------------
kernel BUG at drivers/md/bcache/super.c:2040!
invalid opcode: 0000 [#2] PREEMPT SMP PTI
CPU: 0 PID: 16925 Comm: bcache-register Tainted: G D
4.18.7-arch1-1-ARCH #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:register_bcache+0x122d/0x1560 [bcache]
Code: e9 48 f1 48 8b 55 90 48 c7 c6 18 f8 ae c0 48 c7 c7 40 c6 af c0
e8 71 ac df f0 48 8b 7d a0 e8 8a 14 47 f1 e9 7a 7c 00 00 0f 0b <0f> 0b
3e 41 80 89 40 03 00 00 01 4c 89 cf e8 70 df ff ff e9 8c fc
RSP: 0018:ffffa89d8081fd70 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff9e9889ed7000 RCX: 0000000022619a00
RDX: 0000000022619800 RSI: 0000000022619800 RDI: ffff9e98fd001a80
RBP: ffffa89d8081fe00 R08: ffff9e989eb933e8 R09: 00000000ffffffff
R10: ffff9e98ea7b40f0 R11: 0000000000000240 R12: 000000000000000b
R13: ffff9e98e6968340 R14: ffff9e98ea7b4710 R15: ffff9e989eb92000
FS: 00007ff70ab56540(0000) GS:ffff9e98ffc00000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000056292c0afbf0 CR3: 000000006f15e004 CR4: 00000000001606f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
? retint_kernel+0x1b/0x1d
? kernfs_fop_write+0x116/0x190
? bch_cache_set_alloc+0x4f0/0x4f0 [bcache]
kernfs_fop_write+0x116/0x190
__vfs_write+0x36/0x190
? __audit_syscall_entry+0xd7/0x160
? handle_mm_fault+0x10a/0x250
vfs_write+0xa9/0x190
ksys_write+0x4f/0xb0
do_syscall_64+0x5b/0x170
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7ff70aa7d7a8
Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e
fa 48 8d 05 95 6d 0d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d
00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
RSP: 002b:00007ffca7aa07e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000000b RCX: 00007ff70aa7d7a8
RDX: 000000000000000b RSI: 0000557ba02e9260 RDI: 0000000000000003
RBP: 0000557ba02e9260 R08: 00007ffca7aa2e37 R09: 00007ffca7aa0460
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffca7aa0890
R13: 000000000000000b R14: 00007ff70ab4b5c0 R15: 000000000000000b
Modules linked in: bcache loop nf_conntrack_ipv6 nf_defrag_ipv6
ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp
xt_state xt_conntrack nf_conntrack iptable_filter cirrus ttm
drm_kms_helper drm ext4 crc16 mbcache jbd2 fscrypto psmouse joydev
input_leds intel_agp led_class pcspkr mousedev intel_gtt syscopyarea
sysfillrect sysimgblt i2c_piix4 fb_sys_fops agpgart evdev qemu_fw_cfg
mac_hid zram sg ip_tables x_tables btrfs libcrc32c crc32c_generic xor
zstd_decompress zstd_compress xxhash raid6_pq algif_skcipher af_alg
dm_crypt dm_mod hid_generic usbhid hid virtio_net net_failover
failover virtio_rng virtio_blk rng_core virtio_balloon sr_mod cdrom
ata_generic pata_acpi crct10dif_pclmul crc32_pclmul crc32c_intel
ghash_clmulni_intel serio_raw atkbd libps2 pcbc ata_piix uhci_hcd
ehci_pci ehci_hcd aesni_intel libata aes_x86_64 crypto_simd cryptd
glue_helper virtio_pci virtio_ring virtio usbcore usb_common scsi_mod
floppy i8042 serio
---[ end trace 9c400b357ca9bc1b ]---
RIP: 0010:register_bcache+0x122d/0x1560 [bcache]
Code: e9 48 f1 48 8b 55 90 48 c7 c6 18 f8 ae c0 48 c7 c7 40 c6 af c0
e8 71 ac df f0 48 8b 7d a0 e8 8a 14 47 f1 e9 7a 7c 00 00 0f 0b <0f> 0b
3e 41 80 89 40 03 00 00 01 4c 89 cf e8 70 df ff ff e9 8c fc
RSP: 0018:ffffa89d808bfd70 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff9e98b2618000 RCX: 0000000021ffc000
RDX: 0000000021ffbe00 RSI: 0000000021ffbe00 RDI: ffff9e98fd001a80
RBP: ffffa89d808bfe00 R08: ffff9e989eb973e8 R09: 00000000ffffffff
R10: ffff9e989c35aad0 R11: 0000000000000f20 R12: 000000000000000d
R13: ffff9e98e696ad80 R14: ffff9e989c35af50 R15: ffff9e989eb96000
FS: 00007ff70ab56540(0000) GS:ffff9e98ffc00000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f3ce70f6000 CR3: 000000006f15e001 CR4: 00000000001606f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
audit: type=1006 audit(1537323691.553:147): pid=17000 uid=0
old-auid=4294967295 auid=1000 tty=(none) old-ses=4294967295 ses=14 res=1
It appears to be a kernel BUG hit during register_bcache. The steps to
reproduce are pretty simple:
# dd if=/dev/zero of=data bs=1M count=100
# losetup /dev/loop0 data
# sudo make-bcache -C /dev/loop0
make-bcache probably automatically registers the set, which then leads
to the bug. This happened on 4.18.7 and 4.18.8 on two different machines
(one a VPS, one a physical machine).
Thanks for the information. I can also reproduce it on my machine. The
error seems to come from one of the following lines:
2038         if (!init_fifo(&ca->free[RESERVE_BTREE], btree_buckets, GFP_KERNEL) ||
2039             !init_fifo_exact(&ca->free[RESERVE_PRIO], prio_buckets(ca), GFP_KERNEL) ||
2040             !init_fifo(&ca->free[RESERVE_MOVINGGC], free, GFP_KERNEL) ||
2041             !init_fifo(&ca->free[RESERVE_NONE], free, GFP_KERNEL) ||
2042             !init_fifo(&ca->free_inc, free << 2, GFP_KERNEL) ||
2043             !init_heap(&ca->heap, free << 3, GFP_KERNEL) ||
2044             !(ca->buckets = vzalloc(array_size(sizeof(struct bucket),
2045                                                ca->sb.nbuckets))) ||
2046             !(ca->prio_buckets = kzalloc(array3_size(sizeof(uint64_t),
2047                                                      prio_buckets(ca), 2),
2048                                          GFP_KERNEL)) ||
2049             !(ca->disk_buckets = alloc_bucket_pages(GFP_KERNEL, ca)))
2050                 return -ENOMEM;
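One possible explanation, not confirmed in this thread: line 2040 is the
first init_fifo() call that takes `free` as its size, so a `free` of zero
on a very small device would be worth ruling out. The userspace sketch
below works through that arithmetic for the 100 MiB loop device from the
report. The formula `roundup_pow_of_two(nbuckets) >> 10` for `free`, the
512 KiB bucket size, and the helper names `roundup_pow2` and
`free_fifo_size` are all assumptions made for illustration; none of them
appear in the listing above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round n up to the next power of two. Userspace stand-in for the
 * kernel's roundup_pow_of_two(); hypothetical helper for this sketch.
 * Returns 1 for n == 0 or n == 1.
 */
static uint64_t roundup_pow2(uint64_t n)
{
	uint64_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * ASSUMPTION: the size passed as `free` is derived from the bucket
 * count as roundup_pow_of_two(nbuckets) >> 10. This is a guess used
 * to illustrate the arithmetic, not code quoted from super.c.
 */
static uint64_t free_fifo_size(uint64_t dev_bytes, uint64_t bucket_bytes)
{
	uint64_t nbuckets;

	assert(bucket_bytes != 0);
	nbuckets = dev_bytes / bucket_bytes;	/* ignores superblock overhead */
	return roundup_pow2(nbuckets) >> 10;
}
```

Under these assumptions, free_fifo_size(100ULL << 20, 512ULL << 10) works
out as 200 buckets, rounded up to 256, shifted right by 10, which is 0. A
1 GiB device with the same bucket size gives 2048 buckets and a result of
2, so only very small devices would be affected.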
Let me check.
Coly Li