On Wed, Feb 28, 2024 at 2:19 PM Goffredo Baroncelli <kreijack@xxxxxxxxx> wrote:
>
> On 28/02/2024 18.25, Patrick Plenefisch wrote:
> > I'm unsure if this is just an LVM bug, or a BTRFS+LVM interaction bug,
> > but LVM is definitely involved somehow.
> >
> > Upgrading from 5.10 to 6.1, I noticed one of my filesystems was
> > read-only. In dmesg, I found:
> >
> > BTRFS error (device dm-75): bdev /dev/mapper/lvm-brokenDisk errs: wr
> > 0, rd 0, flush 1, corrupt 0, gen 0
> > BTRFS warning (device dm-75): chunk 13631488 missing 1 devices, max
> > tolerance is 0 for writable mount
> > BTRFS: error (device dm-75) in write_all_supers:4379: errno=-5 IO
> > failure (errors while submitting device barriers.)
> > BTRFS info (device dm-75: state E): forced readonly
> > BTRFS warning (device dm-75: state E): Skipping commit of aborted transaction.
> > BTRFS: error (device dm-75: state EA) in cleanup_transaction:1992:
> > errno=-5 IO failure
> >
> > At first I suspected a btrfs error, but a scrub found no errors, and
> > it continued to be read-write on 5.10 kernels.
> >
> > Here is my setup:
> >
> > /dev/lvm/brokenDisk is an lvm-on-lvm volume. I have /dev/sd{a,b,c,d}
> > (of varying sizes) in a lower VG, which has three LVs, all raid1
> > volumes. Two of the volumes are further used as PVs for upper VGs.
> > One of the upper VGs has no issues. The non-PV LV has no issues. The
> > remaining one, /dev/lowerVG/lvmPool, hosting nested LVM, is used as a
> > PV for VG "lvm", and has 3 volumes inside. Two of those volumes have
> > no issues (and are btrfs), but the last one is /dev/lvm/brokenDisk.
> > This volume is the only one that exhibits this behavior, so something
> > about it is special.
> >
> > Or described as layers:
> > /dev/sd{a,b,c,d} => PV => VG "lowerVG"
> > /dev/lowerVG/single (RAID1 LV) => BTRFS, works fine
> > /dev/lowerVG/works (RAID1 LV) => PV => VG "workingUpper"
> > /dev/workingUpper/{a,b,c} => BTRFS, works fine
> > /dev/lowerVG/lvmPool (RAID1 LV) => PV => VG "lvm"
> > /dev/lvm/{a,b} => BTRFS, works fine
> > /dev/lvm/brokenDisk => BTRFS, exhibits errors
>
> I am a bit curious about the reasons of this setup.

The lowerVG is supposed to be a pool of storage for several VMs &
containers. [workingUpper] is for one VM, and [lvm] is for another VM.
However, right now I'm still organizing the files directly because I
don't have all the VMs fully set up yet.

> However I understood that:
>
> /dev/sda -+                +-- single (RAID1) -> ok             +-> a ok
> /dev/sdb  |                |                                    |-> b ok
> /dev/sdc  +--> [lowerVG]>--+-- works (RAID1) -> [workingUpper] -+-> c ok
> /dev/sdd -+                |
>                            |                                    +-> a -> ok
>                            +-- lvmPool -> [lvm] ->--------------|
>                                                                 +-> b -> ok
>                                                                 |
>                                                                 +-> brokenDisk -> fail
>
> [xxx] means VG, the others are LVs that may also act as a PV in
> an upper VG

Note that lvmPool is also RAID1, but yes.

> So, it seems that
>
> 1) lowerVG/lvmPool/lvm/a
> 2) lowerVG/lvmPool/lvm/a
> 3) lowerVG/lvmPool/lvm/brokenDisk
>
> are equivalent ... so I don't understand how 1) and 2) are fine but 3) is
> problematic.

I assume you meant lvm/b for 2?

> Is my understanding of the LVM layouts correct ?

Your understanding is correct. The only thing that comes to mind that
could cause the problem is the asymmetry of the SATA devices: I have one
8TB drive plus 1.5TB, 3TB, and 3TB drives. Doing the math on the actual
extents, lowerVG/single spans (3TB+3TB), and
lowerVG/lvmPool/lvm/brokenDisk spans (3TB+1.5TB). Both obviously have
the other leg of raid1 on the 8TB drive, but my thought was that the
jump across the 1.5TB+3TB drive gap was at least "interesting".
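
In case it's useful, this is roughly how the extent placement can be
inspected. The commands below are only illustrative (the VG and device
names match the layout above, but this isn't a paste of my exact session):

  # list every LV, including the hidden raid1 sub-LVs, with the PVs each one sits on
  lvs -a -o lv_name,lv_size,devices lowerVG

  # per-PV view of which extent ranges are allocated to which LV
  pvdisplay --maps /dev/sda /dev/sdb /dev/sdc /dev/sdd
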
> > After some investigation, here is what I've found:
> >
> > 1. This regression was introduced in 5.19. With 5.18 and earlier kernels I
> > can keep this filesystem rw and everything works as expected, while on
> > 5.19.0 and later the filesystem immediately goes ro on any write
> > attempt. I couldn't build rc1, but I did confirm rc2 already has this
> > regression.
> > 2. Passing /dev/lvm/brokenDisk to a KVM VM as /dev/vdb with an
> > unaffected kernel inside the VM still exhibits the ro barrier problem.
>
> Is /dev/lvm/brokenDisk *always* problematic with affected ( >= 5.19 ) and
> UNaffected ( < 5.19 ) kernel ?

Yes. I didn't test it in as much depth, but 5.15 and 6.1 in the VM (and
6.1 on the host) are identically problematic.

> > 3. Passing /dev/lowerVG/lvmPool to a KVM VM as /dev/vdb with an
> > affected kernel inside the VM and using LVM inside the VM exhibits
> > correct behavior (I can keep the filesystem rw, no barrier errors on
> > host or guest)
>
> Is /dev/lowerVG/lvmPool problematic with only "affected" kernel ?

Uh, passing lvmPool directly to the VM is never problematic. I tested
5.10 and 6.1 in the VM (and 6.1 on the host), and neither setup throws
barrier errors.

> [...]
>
> --
> gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
> Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
>
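
P.S. For anyone trying to reproduce tests 2 and 3 above: the guest tests
simply hand the raw LV to the VM as a second virtio disk, along these
lines. This is only an illustrative qemu invocation (the guest image path
and memory size are placeholders, not my actual command line):

  qemu-system-x86_64 -enable-kvm -m 4G \
    -drive file=guest.img,format=qcow2,if=virtio \
    -drive file=/dev/lvm/brokenDisk,format=raw,if=virtio,cache=none
  # the second virtio drive appears in the guest as /dev/vdb; for test 3,
  # substitute /dev/lowerVG/lvmPool and create the nested VG/LVs inside the guest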