>>>>> "Andrei" == Andrei Rodionov <andrei.rodionov@xxxxxxxxx> writes: Andrei> I've provisioned an LVM RAID 6 across 4 physical disks. I'm Andrei> trying to understand the RAID behavior after injecting the Andrei> failure - removing physical disk /dev/sdc. The docs state you need to use 5 devices for RAID6 under LVM, not four. And you do show 5 disks in your vgcreate, but not your lvcreate command. Maybe you could post a test script you use to do your testing to make sure you're calling it correctly? Andrei> pvcreate /dev/sdc /dev/sdd /dev/sde /dev/sdf Andrei> vgcreate pool_vg /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg Andrei> lvcreate -l +100%FREE -n pool_lv --type raid6 pool_vg Andrei> mkfs.xfs /dev/pool_vg/pool_lv Andrei> echo "/dev/mapper/pool_vg-pool_lv /mnt xfs defaults,x-systemd.mount-timeout=30 0 0" >> /etc/fstab This looks ok, but maybe you need to specify the explicit stripe count and size? lvcreate --type raid6 -l 100%FREE --stripes 3 --stripesize 1 -n pool_lv pool_vg Andrei> Everything appears to be working fine: Andrei> # pvs --segments -o pv_name,pv_size,seg_size,vg_name,lv_name,lv_attr,lv_size,seg_pe_ranges Andrei> PV PSize SSize VG LV Attr LSize PE Ranges Andrei> /dev/sda3 <49.00g <24.50g ubuntu-vg ubuntu-lv -wi-ao---- <24.50g /dev/sda3:0-6270 Andrei> /dev/sda3 <49.00g 24.50g ubuntu-vg 0 Andrei> /dev/sdc <100.00g 4.00m pool_vg [pool_lv_rmeta_0] ewi-aor--- 4.00m /dev/sdc:0-0 Andrei> /dev/sdc <100.00g 99.99g pool_vg [pool_lv_rimage_0] iwi-aor--- 99.99g /dev/sdc:1-25598 Andrei> /dev/sdd <100.00g 4.00m pool_vg [pool_lv_rmeta_1] ewi-aor--- 4.00m /dev/sdd:0-0 Andrei> /dev/sdd <100.00g 99.99g pool_vg [pool_lv_rimage_1] iwi-aor--- 99.99g /dev/sdd:1-25598 Andrei> /dev/sde <100.00g 4.00m pool_vg [pool_lv_rmeta_2] ewi-aor--- 4.00m /dev/sde:0-0 Andrei> /dev/sde <100.00g 99.99g pool_vg [pool_lv_rimage_2] iwi-aor--- 99.99g /dev/sde:1-25598 Andrei> /dev/sdf <100.00g 4.00m pool_vg [pool_lv_rmeta_3] ewi-aor--- 4.00m /dev/sdf:0-0 Andrei> /dev/sdf <100.00g 99.99g pool_vg [pool_lv_rimage_3] iwi-aor--- 99.99g /dev/sdf:1-25598 Andrei> /dev/sdg <100.00g 4.00m pool_vg [pool_lv_rmeta_4] ewi-aor--- 4.00m /dev/sdg:0-0 Andrei> /dev/sdg <100.00g 99.99g pool_vg [pool_lv_rimage_4] iwi-aor--- 99.99g /dev/sdg:1-25598 Andrei> # lvs -a -o name,lv_attr,copy_percent,health_status,devices pool_vg Andrei> LV Attr Cpy%Sync Health Devices Andrei> pool_lv rwi-aor--- 100.00 pool_lv_rimage_0(0),pool_lv_rimage_1 Andrei> (0),pool_lv_rimage_2(0),pool_lv_rimage_3(0),pool_lv_rimage_4(0) Andrei> [pool_lv_rimage_0] iwi-aor--- /dev/sdc(1) Andrei> [pool_lv_rimage_1] iwi-aor--- /dev/sdd(1) Andrei> [pool_lv_rimage_2] iwi-aor--- /dev/sde(1) Andrei> [pool_lv_rimage_3] iwi-aor--- /dev/sdf(1) Andrei> [pool_lv_rimage_4] iwi-aor--- /dev/sdg(1) Andrei> [pool_lv_rmeta_0] ewi-aor--- /dev/sdc(0) Andrei> [pool_lv_rmeta_1] ewi-aor--- /dev/sdd(0) Andrei> [pool_lv_rmeta_2] ewi-aor--- /dev/sde(0) Andrei> [pool_lv_rmeta_3] ewi-aor--- /dev/sdf(0) Andrei> [pool_lv_rmeta_4] ewi-aor--- /dev/sdg(0) Andrei> After the /dev/sdc is removed and the system is rebooted, the Andrei> RAID goes into "partial" health state and is no longer Andrei> accessible. Just for grins, what happens if you re-add the sdc and then reboot? Does it re-find the array? Andrei> # lvs -a -o name,lv_attr,copy_percent,health_status,devices pool_vg Andrei> WARNING: Couldn't find device with uuid 03KtEG-cJ5S-cMAD-RlL8-yBXM-jCav-EyD9I3. Andrei> WARNING: VG pool_vg is missing PV 03KtEG-cJ5S-cMAD-RlL8-yBXM-jCav-EyD9I3 (last written to /dev/ Andrei> sdc). 
Andrei> # lvs -a -o name,lv_attr,copy_percent,health_status,devices pool_vg
Andrei>   WARNING: Couldn't find device with uuid 03KtEG-cJ5S-cMAD-RlL8-yBXM-jCav-EyD9I3.
Andrei>   WARNING: VG pool_vg is missing PV 03KtEG-cJ5S-cMAD-RlL8-yBXM-jCav-EyD9I3 (last written to /dev/sdc).
Andrei>   LV                  Attr       Cpy%Sync Health   Devices
Andrei>   pool_lv             rwi---r-p-          partial  pool_lv_rimage_0(0),pool_lv_rimage_1(0),pool_lv_rimage_2(0),pool_lv_rimage_3(0),pool_lv_rimage_4(0)
Andrei>   [pool_lv_rimage_0]  Iwi---r-p-          partial  [unknown](1)
Andrei>   [pool_lv_rimage_1]  Iwi---r---                   /dev/sdd(1)
Andrei>   [pool_lv_rimage_2]  Iwi---r---                   /dev/sde(1)
Andrei>   [pool_lv_rimage_3]  Iwi---r---                   /dev/sdf(1)
Andrei>   [pool_lv_rimage_4]  Iwi---r---                   /dev/sdg(1)
Andrei>   [pool_lv_rmeta_0]   ewi---r-p-          partial  [unknown](0)
Andrei>   [pool_lv_rmeta_1]   ewi---r---                   /dev/sdd(0)
Andrei>   [pool_lv_rmeta_2]   ewi---r---                   /dev/sde(0)
Andrei>   [pool_lv_rmeta_3]   ewi---r---                   /dev/sdf(0)
Andrei>   [pool_lv_rmeta_4]   ewi---r---                   /dev/sdg(0)

Andrei> From what I understand, the RAID should be able to continue
Andrei> with a physical disk loss and be in a "degraded" state, not
Andrei> "partial", because the data is fully present on the surviving
Andrei> disks.

Andrei> From /etc/lvm/lvm.conf:

Andrei> #   degraded
Andrei> #     Like complete, but additionally RAID LVs of segment type raid1,
Andrei> #     raid4, raid5, radid6 and raid10 will be activated if there is no
Andrei> #     data loss, i.e. they have sufficient redundancy to present the
Andrei> #     entire addressable range of the Logical Volume.
Andrei> #   partial
Andrei> #     Allows the activation of any LV even if a missing or failed PV
Andrei> #     could cause data loss with a portion of the LV inaccessible.
Andrei> #     This setting should not normally be used, but may sometimes
Andrei> #     assist with data recovery.

What is your actual setting in /etc/lvm/lvm.conf for the block:

  activation {
      ...
      activation_mode = "degraded"
      ...
  }

I'm on Debian, not RHEL 8, and I haven't tested this myself, but I
wonder if you really needed to apply the '--stripes 3' value when you
built it?

John

_______________________________________________
linux-lvm mailing list
linux-lvm@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/