Re: LVM RAID behavior after losing physical disk

Hello,

I apologize for replying to my own message; I was subscribed in digest
mode...

I've been hitting my head against this one for a while now. I
originally discovered it on Ubuntu 20.04, but I'm seeing the same
issue on RHEL 8.5: the loss of a single disk leaves the RAID in
"partial" mode, when it should be "degraded".
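
As far as I understand lvm.conf(5), "degraded" means redundancy is
lost but the data is still complete (a raid6 missing one leg), while
"partial" means data is actually missing -- and only degraded LVs are
auto-activated under the default policy. As a sketch, this is the knob
I would expect to matter here; "degraded" is already the documented
default, so a raid6 with one missing leg should still activate:

# /etc/lvm/lvm.conf
activation {
    # "complete" | "degraded" | "partial"
    activation_mode = "degraded"
}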

I've tried explicitly specifying the number of stripes, but it made no
difference. After adding the missing disk back, the array is healthy
again. Please see below.

# cat /etc/redhat-release
Red Hat Enterprise Linux release 8.5 (Ootpa)
# lvm version
  LVM version:     2.03.12(2)-RHEL8 (2021-05-19)
  Library version: 1.02.177-RHEL8 (2021-05-19)
  Driver version:  4.43.0

# lsblk
NAME          MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda             8:0    0   50G  0 disk
├─sda1          8:1    0    1G  0 part /boot
└─sda2          8:2    0   49G  0 part
  ├─rhel-root 253:0    0   44G  0 lvm  /
  └─rhel-swap 253:1    0    5G  0 lvm  [SWAP]
sdb             8:16   0   70G  0 disk
sdc             8:32   0  100G  0 disk
sdd             8:48   0  100G  0 disk
sde             8:64   0  100G  0 disk
sdf             8:80   0  100G  0 disk
sdg             8:96   0  100G  0 disk

# pvcreate /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg
# vgcreate pool_vg /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg
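
Just to confirm the VG really sees all five PVs before creating the LV
(standard pvs/vgs reporting fields, included here as a sketch):

# pvs -o pv_name,vg_name,pv_size
# vgs -o vg_name,pv_count,vg_size,vg_free pool_vg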

# lvcreate -l +100%FREE -n pool_lv --type raid6 --stripes 3 --stripesize 1 pool_vg
  Invalid stripe size 1.00 KiB.
  Run `lvcreate --help' for more information.

# lvcreate -l +100%FREE -n pool_lv --type raid6 --stripes 3 --stripesize 4 pool_vg
  Logical volume "pool_lv" created.
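
To double-check that --stripes 3 actually took effect, the segment
fields can be listed (a sketch -- I'm not certain whether #Str counts
only the data stripes or all five images for raid6, but the segment
type and stripe size are confirmed either way):

# lvs -a -o name,segtype,stripes,stripe_size,devices pool_vg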

# mkfs.xfs /dev/pool_vg/pool_lv
# echo "/dev/mapper/pool_vg-pool_lv /mnt xfs defaults,x-systemd.mount-timeout=30 0 0" >> /etc/fstab
# mount -a
# touch /mnt/test
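
As a side note, a nofail entry should at least keep a failed mount of
/mnt from dropping the boot into the emergency shell (standard
mount/systemd options; a sketch of the variant I would use, not a fix
for the partial-vs-degraded issue itself):

# echo "/dev/mapper/pool_vg-pool_lv /mnt xfs defaults,nofail,x-systemd.device-timeout=30 0 0" >> /etc/fstab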

Note the RAID is correctly striped across all 5 disks:

# lvs -a -o name,lv_attr,copy_percent,health_status,devices pool_vg
  LV                 Attr       Cpy%Sync Health          Devices
  pool_lv            rwi-aor--- 100.00
pool_lv_rimage_0(0),pool_lv_rimage_1(0),pool_lv_rimage_2(0),pool_lv_rimage_3(0),pool_lv_rimage_4(0)
  [pool_lv_rimage_0] iwi-aor---                          /dev/sdc(1)
  [pool_lv_rimage_1] iwi-aor---                          /dev/sdd(1)
  [pool_lv_rimage_2] iwi-aor---                          /dev/sde(1)
  [pool_lv_rimage_3] iwi-aor---                          /dev/sdf(1)
  [pool_lv_rimage_4] iwi-aor---                          /dev/sdg(1)
  [pool_lv_rmeta_0]  ewi-aor---                          /dev/sdc(0)
  [pool_lv_rmeta_1]  ewi-aor---                          /dev/sdd(0)
  [pool_lv_rmeta_2]  ewi-aor---                          /dev/sde(0)
  [pool_lv_rmeta_3]  ewi-aor---                          /dev/sdf(0)
  [pool_lv_rmeta_4]  ewi-aor---                          /dev/sdg(0)

After shutting down the OS and removing a disk, the reboot drops the
system into single-user mode because it cannot mount /mnt! The RAID is
now in "partial" mode, when it should only be "degraded":

# lvs -a -o name,lv_attr,copy_percent,health_status,devices pool_vg
  WARNING: Couldn't find device with uuid
d5y3gp-taRv-2YMa-3mR0-94ZZ-72Od-IKF8Co.
  WARNING: VG pool_vg is missing PV
d5y3gp-taRv-2YMa-3mR0-94ZZ-72Od-IKF8Co (last written to /dev/sdc).
  LV                 Attr       Cpy%Sync Health          Devices
  pool_lv            rwi---r-p-          partial
pool_lv_rimage_0(0),pool_lv_rimage_1(0),pool_lv_rimage_2(0),pool_lv_rimage_3(0),pool_lv_rimage_4(0)
  [pool_lv_rimage_0] Iwi---r-p-          partial         [unknown](1)
  [pool_lv_rimage_1] Iwi---r---                          /dev/sdc(1)
  [pool_lv_rimage_2] Iwi---r---                          /dev/sdd(1)
  [pool_lv_rimage_3] Iwi---r---                          /dev/sde(1)
  [pool_lv_rimage_4] Iwi---r---                          /dev/sdf(1)
  [pool_lv_rmeta_0]  ewi---r-p-          partial         [unknown](0)
  [pool_lv_rmeta_1]  ewi---r---                          /dev/sdc(0)
  [pool_lv_rmeta_2]  ewi---r---                          /dev/sdd(0)
  [pool_lv_rmeta_3]  ewi---r---                          /dev/sde(0)
  [pool_lv_rmeta_4]  ewi---r---                          /dev/sdf(0)
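
From the resulting emergency shell I would expect to be able to bring
the LV up in degraded mode, since only one leg of the raid6 is missing
-- but with the LV flagged "partial" it presumably refuses. A sketch of
what I try (standard lvchange syntax as I understand it; forcing
--activationmode partial instead would risk data):

# lvchange -ay --activationmode degraded pool_vg/pool_lv

And if a spare disk were available, I would expect the normal recovery
path to be replacing the failed leg rather than re-inserting the
original disk (again a sketch -- /dev/sdh is just a placeholder for a
spare PV):

# pvcreate /dev/sdh
# vgextend pool_vg /dev/sdh
# lvconvert --repair pool_vg/pool_lv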

After adding the missing disk back, the system boots correctly and
there are no issues with the RAID:

# lvs -a -o name,lv_attr,copy_percent,health_status,devices pool_vg
  LV                 Attr       Cpy%Sync Health          Devices
  pool_lv            rwi-a-r--- 100.00
pool_lv_rimage_0(0),pool_lv_rimage_1(0),pool_lv_rimage_2(0),pool_lv_rimage_3(0),pool_lv_rimage_4(0)
  [pool_lv_rimage_0] iwi-aor---                          /dev/sdc(1)
  [pool_lv_rimage_1] iwi-aor---                          /dev/sdd(1)
  [pool_lv_rimage_2] iwi-aor---                          /dev/sde(1)
  [pool_lv_rimage_3] iwi-aor---                          /dev/sdf(1)
  [pool_lv_rimage_4] iwi-aor---                          /dev/sdg(1)
  [pool_lv_rmeta_0]  ewi-aor---                          /dev/sdc(0)
  [pool_lv_rmeta_1]  ewi-aor---                          /dev/sdd(0)
  [pool_lv_rmeta_2]  ewi-aor---                          /dev/sde(0)
  [pool_lv_rmeta_3]  ewi-aor---                          /dev/sdf(0)
  [pool_lv_rmeta_4]  ewi-aor---                          /dev/sdg(0)
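
After the disk comes back I also keep an eye on the resync status and
run a scrub, roughly like this (RAID reporting fields and --syncaction
as I understand them; a sketch, not verified output):

# lvs -a -o name,lv_attr,copy_percent,raid_sync_action,raid_mismatch_count pool_vg
# lvchange --syncaction check pool_vg/pool_lv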


On Tue, Jan 25, 2022 at 11:44 AM Andrei Rodionov
<andrei.rodionov@xxxxxxxxx> wrote:
>
> Hello,
>
> I've provisioned an LVM RAID 6 across 5 physical disks. I'm trying to understand the RAID behavior after injecting the failure - removing physical disk /dev/sdc.





