Hello, I encountered a bug while working with DM snapshot targets: having a snapshot-origin target with all snapshots removed may lead to BUG_ON(sectors <= 0) in function bio_split() (file block/bio.c). /proc/version: Linux version 4.19.59 (root@test-dm) (gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP Fri Jul 19 15:26:13 CEST 2019 dmesg: [ 856.268151] ------------[ cut here ]------------ [ 856.268154] kernel BUG at block/bio.c:1803! [ 856.268230] invalid opcode: 0000 [#1] SMP PTI [ 856.268294] CPU: 3 PID: 677 Comm: e2fsck Tainted: G E 4.19.59 #1 [ 856.268373] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014 [ 856.268462] RIP: 0010:bio_split+0x7a/0x80 [ 856.268526] Code: 41 8b 74 24 30 48 89 ef e8 63 ec ff ff c7 45 38 00 00 00 00 f6 45 15 04 74 08 66 41 81 4c 24 14 00 04 4c 89 e0 5b 5d 41 5c c3 <0f> 0b 0f 0b 66 90 66 66 66 66 90 41 56 45 31 f6 41 55 41 54 55 53 [ 856.268866] RSP: 0018:ffffb274012aba58 EFLAGS: 00010246 [ 856.268934] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff979a62d4ae90 [ 856.269030] RDX: 0000000000600000 RSI: 0000000000000000 RDI: ffff979a3488c100 [ 856.269108] RBP: ffff979a76068000 R08: ffff979a770489c0 R09: ffff979a35625858 [ 856.269184] R10: 0000000000000000 R11: ffff979a35625858 R12: ffff979a3488c100 [ 856.269261] R13: 0000000000000000 R14: ffffffffc076b580 R15: ffffb274012abe48 [ 856.269339] FS: 00007fc6c6d76780(0000) GS:ffff979a77b80000(0000) knlGS:0000000000000000 [ 856.269423] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 856.269492] CR2: 000055ff1b8b1278 CR3: 0000000235026002 CR4: 0000000000060ee0 [ 856.269573] Call Trace: [ 856.269648] __split_and_process_bio+0xe4/0x1a0 [dm_mod] [ 856.269724] __dm_make_request.isra.37+0x3f/0xa0 [dm_mod] [ 856.269796] generic_make_request+0x1a4/0x400 [ 856.269861] submit_bio+0x45/0x140 [ 856.269919] ? guard_bio_eod+0x32/0x100 [ 856.269979] submit_bh_wbc+0x163/0x190 [ 856.270039] __block_write_full_page+0x22b/0x440 [ 856.270103] ? touch_buffer+0x60/0x60 [ 856.270162] ? check_disk_change+0x60/0x60 [ 856.270224] __writepage+0x19/0x50 [ 856.270282] write_cache_pages+0x1e1/0x470 [ 856.270343] ? __wb_calc_thresh+0x130/0x130 [ 856.270405] ? __block_commit_write.isra.39+0x4c/0xa0 [ 856.272244] ? block_write_end+0x2f/0x80 [ 856.272764] ? iov_iter_copy_from_user_atomic+0xbe/0x310 [ 856.273298] ? blkdev_write_end+0x1d/0x80 [ 856.273805] ? balance_dirty_pages_ratelimited+0x1f5/0x3c0 [ 856.274313] generic_writepages+0x56/0x90 [ 856.274823] do_writepages+0x41/0xd0 [ 856.275326] ? blkdev_write_iter+0xb0/0x120 [ 856.275826] ? io_schedule_timeout+0x19/0x40 [ 856.276319] __filemap_fdatawrite_range+0xbe/0xf0 [ 856.276806] file_write_and_wait_range+0x4c/0xa0 [ 856.277295] blkdev_fsync+0x16/0x40 [ 856.277768] do_fsync+0x38/0x70 [ 856.278237] __x64_sys_fsync+0x10/0x20 [ 856.278700] do_syscall_64+0x55/0xf0 [ 856.279156] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 856.279617] RIP: 0033:0x7fc6c6e8b214 [ 856.280075] Code: 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8d 05 e9 f4 0c 00 8b 00 85 c0 75 13 b8 4a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 3c c3 0f 1f 00 53 89 fb 48 83 ec 10 e8 24 55 [ 856.281131] RSP: 002b:00007fffd9870c18 EFLAGS: 00000246 ORIG_RAX: 000000000000004a [ 856.282181] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc6c6e8b214 [ 856.282690] RDX: 0000000000000000 RSI: 000055ff1b8ab130 RDI: 0000000000000003 [ 856.283179] RBP: 000055ff1b8ab130 R08: 0000000000000000 R09: 000055ff1b8ad2c4 [ 856.283663] R10: 00007fc6c7011200 R11: 0000000000000246 R12: 0000000000000200 [ 856.284129] R13: 000055ff1b8ad2e0 R14: 0000000000000000 R15: 000055ff1b8aaef0 [ 856.284583] Modules linked in: dm_snapshot(E) dm_bufio(E) dm_mod(E) loop(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) pcbc(E) snd_pcm(E) aesni_intel(E) bochs_drm(E) snd_timer(E) aes_x86_64(E) crypto_simd(E) ttm(E) snd(E) cryptd(E) glue_helper(E) 9pnet_virtio(E) joydev(E) evdev(E) soundcore(E) drm_kms_helper(E) sg(E) hid_generic(E) serio_raw(E) 9pnet(E) pcspkr(E) virtio_balloon(E) iTCO_wdt(E) iTCO_vendor_support(E) drm(E) qemu_fw_cfg(E) button(E) virtio_rng(E) rng_core(E) ip_tables(E) x_tables(E) autofs4(E) usbhid(E) hid(E) ext4(E) crc32c_generic(E) crc16(E) mbcache(E) jbd2(E) fscrypto(E) sr_mod(E) cdrom(E) virtio_blk(E) virtio_console(E) virtio_net(E) net_failover(E) failover(E) ahci(E) libahci(E) libata(E) xhci_pci(E) crc32c_intel(E) xhci_hcd(E) [ 856.288356] scsi_mod(E) psmouse(E) lpc_ich(E) i2c_i801(E) usbcore(E) virtio_pci(E) virtio_ring(E) virtio(E) [ 856.289169] ---[ end trace 54a02b5e2590a1fe ]--- Environment: - Kernel version: 4.19.59 (same problem with versions 5.2.1, latest 4.19 from Debian Buster and latest 4.9 from Debian Stretch) - Distribution: Debian 10 - Software versions: GNU Make 4.2.1 Binutils 2.31.1 Util-linux 2.33.1 Mount 2.33.1 Module-init-tools 26 E2fsprogs 1.44.5 Linux C Library 2.28 Dynamic linker (ldd) 2.28 Linux C++ Library 6.0.25 Procps 3.3.15 Kbd 2.0.4 Console-tools 2.0.4 Sh-utils 8.30 Udev 241 Modules Loaded 9pnet 9pnet_virtio aesni_intel aes_x86_64 ahci autofs4 bochs_drm button cdrom crc16 crc32c_generic crc32c_intel crc32_pclmul crct10dif_pclmul cryptd crypto_simd drm drm_kms_helper evdev ext4 failover fscrypto ghash_clmulni_intel glue_helper hid hid_generic i2c_i801 ip_tables irqbypass iTCO_vendor_support iTCO_wdt jbd2 joydev kvm kvm_intel libahci libata lpc_ich mbcache net_failover pcbc pcspkr psmouse qemu_fw_cfg rng_core scsi_mod serio_raw sg snd snd_pcm snd_timer soundcore sr_mod ttm usbcore usbhid virtio virtio_balloon virtio_blk virtio_console virtio_net virtio_pci virtio_ring virtio_rng xhci_hcd xhci_pci x_tables Steps to reproduce: truncate -s 500M origin.bin truncate -s 50M snapshot.bin losetup /dev/loop0 origin.bin losetup /dev/loop1 snapshot.bin mkfs.ext4 /dev/loop0 dmsetup create snap --table "0 $(blockdev --getsz /dev/loop0) snapshot /dev/loop0 /dev/loop1 N 256" dmsetup create orig --table "0 $(blockdev --getsz /dev/loop0) snapshot-origin /dev/loop0" # use /dev/mapper/snap and /dev/mapper/orig then unmount them dmsetup suspend orig dmsetup remove snap dmsetup resume orig e2fsck /dev/mapper/orig # BUG in bio_split() Steps to reproduce (the express way): truncate -s 500M origin.bin losetup /dev/loop0 origin.bin mkfs.ext4 /dev/loop0 dmsetup create orig --table "0 $(blockdev --getsz /dev/loop0) snapshot-origin /dev/loop0" e2fsck /dev/mapper/orig # BUG in bio_split() I looked at the code and to my opinion the problem comes from function origin_map (file drivers/md/dm-snap.c). In the following code: static int origin_map(struct dm_target *ti, struct bio *bio) { struct dm_origin *o = ti->private; unsigned available_sectors; ... available_sectors = o->split_boundary - ((unsigned)bio->bi_iter.bi_sector & (o->split_boundary - 1)); if (bio_sectors(bio) > available_sectors) dm_accept_partial_bio(bio, available_sectors); ... when there is no snapshot, split_boundary is 0 so available_sectors gets an invalid value. The problem no more appears if the function origin_map early exits using the following patch: --- a/drivers/md/dm-snap.c 2019-07-14 08:11:23.000000000 +0200 +++ b/drivers/md/dm-snap.c 2019-07-19 17:50:15.876000000 +0200 @@ -2328,6 +2328,9 @@ static int origin_map(struct dm_target * if (bio_data_dir(bio) != WRITE) return DM_MAPIO_REMAPPED; + if (unlikely(!o->split_boundary)) + return do_origin(o->dev, bio); + available_sectors = o->split_boundary - ((unsigned)bio->bi_iter.bi_sector & (o->split_boundary - 1)); Best regards, Cédric Delmas -- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel