On Wed, Mar 6, 2024 at 11:00 AM Ming Lei <ming.lei@xxxxxxxxxx> wrote:
>
> On Tue, Mar 05, 2024 at 12:45:13PM -0500, Mike Snitzer wrote:
> > On Thu, Feb 29 2024 at 5:05P -0500,
> > Goffredo Baroncelli <kreijack@xxxxxxxxx> wrote:
> >
> > > On 29/02/2024 21.22, Patrick Plenefisch wrote:
> > > > On Thu, Feb 29, 2024 at 2:56 PM Goffredo Baroncelli <kreijack@xxxxxxxxx> wrote:
> > > > >
> > > > > > Your understanding is correct. The only thing that comes to my mind to
> > > > > > cause the problem is asymmetry of the SATA devices. I have one 8TB
> > > > > > device, plus a 1.5TB, 3TB, and 3TB drives. Doing math on the actual
> > > > > > extents, lowerVG/single spans (3TB+3TB), and
> > > > > > lowerVG/lvmPool/lvm/brokenDisk spans (3TB+1.5TB). Both obviously have
> > > > > > the other leg of raid1 on the 8TB drive, but my thought was that the
> > > > > > jump across the 1.5+3TB drive gap was at least "interesting"
> > > > >
> > > > >
> > > > > what about lowerVG/works ?
> > > > >
> > > >
> > > > That one is only on two disks, it doesn't span any gaps
> > >
> > > Sorry, but re-reading the original email I found something that I missed before:
> > >
> > > > BTRFS error (device dm-75): bdev /dev/mapper/lvm-brokenDisk errs: wr
> > > > 0, rd 0, flush 1, corrupt 0, gen 0
> > > > BTRFS warning (device dm-75): chunk 13631488 missing 1 devices, max
> > >                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > tolerance is 0 for writable mount
> > > > BTRFS: error (device dm-75) in write_all_supers:4379: errno=-5 IO
> > > > failure (errors while submitting device barriers.)
> > >
> > > Looking at the code, it seems that if a FLUSH commands fails, btrfs
> > > considers that the disk is missing. The it cannot mount RW the device.
> > >
> > > I would investigate with the LVM developers, if it properly passes
> > > the flush/barrier command through all the layers, when we have an
> > > lvm over lvm (raid1).
> > > The fact that the lvm is a raid1, is important because
> > > a flush command to be honored has to be honored by all the
> > > devices involved.
>
> Hello Patrick & Goffredo,
>
> I can trigger this kind of btrfs complaint by simulating one FLUSH failure.
>
> If you can reproduce this issue easily, please collect log by the
> following bpftrace script, which may show where the flush failure is,
> and maybe it can help to narrow down the issue in the whole stack.
>
>
> #!/usr/bin/bpftrace
>
> #ifndef BPFTRACE_HAVE_BTF
> #include <linux/blkdev.h>
> #endif
>
> kprobe:submit_bio_noacct,
> kprobe:submit_bio
> / (((struct bio *)arg0)->bi_opf & (1 << __REQ_PREFLUSH)) != 0 /
> {
>         $bio = (struct bio *)arg0;
>         @submit_stack[arg0] = kstack;
>         @tracked[arg0] = 1;
> }
>
> kprobe:bio_endio
> /@tracked[arg0] != 0/
> {
>         $bio = (struct bio *)arg0;
>
>         if (($bio->bi_flags & (1 << BIO_CHAIN)) && $bio->__bi_remaining.counter > 1) {
>                 return;
>         }
>
>         if ($bio->bi_status != 0) {
>                 printf("dev %s bio failed %d, submitter %s completion %s\n",
>                         $bio->bi_bdev->bd_disk->disk_name,
>                         $bio->bi_status, @submit_stack[arg0], kstack);
>         }
>         delete(@submit_stack[arg0]);
>         delete(@tracked[arg0]);
> }
>
> END {
>         clear(@submit_stack);
>         clear(@tracked);
> }
>

Attaching 4 probes...
dev dm-77 bio failed 10, submitter
        submit_bio_noacct+5
        __send_duplicate_bios+358
        __send_empty_flush+179
        dm_submit_bio+857
        __submit_bio+132
        submit_bio_noacct_nocheck+345
        write_all_supers+1718
        btrfs_commit_transaction+2342
        transaction_kthread+345
        kthread+229
        ret_from_fork+49
        ret_from_fork_asm+27
completion
        bio_endio+5
        dm_submit_bio+955
        __submit_bio+132
        submit_bio_noacct_nocheck+345
        write_all_supers+1718
        btrfs_commit_transaction+2342
        transaction_kthread+345
        kthread+229
        ret_from_fork+49
        ret_from_fork_asm+27

dev dm-86 bio failed 10, submitter
        submit_bio_noacct+5
        write_all_supers+1718
        btrfs_commit_transaction+2342
        transaction_kthread+345
        kthread+229
        ret_from_fork+49
        ret_from_fork_asm+27
completion
        bio_endio+5
        clone_endio+295
        clone_endio+295
        process_one_work+369
        worker_thread+635
        kthread+229
        ret_from_fork+49
        ret_from_fork_asm+27

For context, dm-86 is /dev/lvm/brokenDisk and dm-77 is /dev/lowerVG/lvmPool

>
>
> Thanks,
> Ming
>

And to answer Mike's question:

>
> Also, I didn't see any kernel logs that show DM-specific errors. I
> doubt you'd have left any DM-specific errors out in your report. So
> is btrfs the canary here? To be clear: You're only seeing btrfs
> errors in the kernel log?

Correct, that's why I initially thought it was a btrfs issue. No DM
errors in dmesg, btrfs is just the canary.
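
In case it helps anyone comparing the two records above, here is a minimal Python sketch (not part of Ming's script) that splits the bpftrace output into per-device records with separate submitter and completion stacks. The names `FlushFailure`, `parse_trace` and `BLK_STATUS_NAMES` are made up for illustration, and the status table only lists the few values I am reasonably sure of from include/linux/blk_types.h, so please check it against your own kernel tree.

```python
import re
from dataclasses import dataclass, field

# Assumption: values as I read them in include/linux/blk_types.h.
# The table is deliberately incomplete; verify against your kernel.
BLK_STATUS_NAMES = {0: "BLK_STS_OK", 1: "BLK_STS_NOTSUPP", 10: "BLK_STS_IOERR"}

# One record per "dev ... bio failed ..." line printed by the script.
RECORD = re.compile(
    r"dev (\S+) bio failed (\d+), submitter\s+(.*?)\s+completion\s+(.*?)"
    r"(?=\s*dev \S+ bio failed|\Z)",
    re.S,
)
# A kstack frame looks like "symbol+offset".
FRAME = re.compile(r"([A-Za-z_][\w.]*)\+(\d+)")


@dataclass
class FlushFailure:
    dev: str
    status: int
    submitter: list = field(default_factory=list)
    completion: list = field(default_factory=list)

    @property
    def status_name(self):
        return BLK_STATUS_NAMES.get(self.status, "unknown")


def _frames(blob):
    """Keep only tokens shaped like kstack frames; return the symbol names."""
    names = []
    for token in blob.split():
        m = FRAME.fullmatch(token)
        if m:
            names.append(m.group(1))
    return names


def parse_trace(text):
    """Return one FlushFailure per failed bio found in the trace text."""
    return [
        FlushFailure(m.group(1), int(m.group(2)),
                     _frames(m.group(3)), _frames(m.group(4)))
        for m in RECORD.finditer(text)
    ]


# The dm-86 record from this mail, shortened to the first few frames.
SAMPLE = """dev dm-86 bio failed 10, submitter
        submit_bio_noacct+5
        write_all_supers+1718
completion
        bio_endio+5
        clone_endio+295
"""

for f in parse_trace(SAMPLE):
    print(f.dev, f.status, f.status_name)
    print("  submitted via:", " <- ".join(f.submitter))
    print("  completed via:", " <- ".join(f.completion))
```

The parser works on whitespace-separated tokens, so it should not matter whether the mail client kept the stack frames on separate lines or folded them into one. If my reading of blk_types.h is right, status 10 is the generic I/O error, which would line up with the errno=-5 in the btrfs message.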