Possible bug with concurrent RAID syncs on the same underlying devices

Hi,

Up until now I have had 8 mdadm RAID6 arrays sharing the same 6
different-sized devices, split into 1TB partitions, like:
md0: sda1 sdb1 sdc1...
md1: sda2 sdb2 sdc2...
.
.
.
md7: sda8 sdb8 sde5 sdd7...

It was set up like this so I could efficiently use the space on
different-sized disks.
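
Roughly, the old arrays were created with something like this (device
names and counts are just illustrative, not my exact commands):

mdadm --create /dev/md0 --level=6 --raid-devices=6 \
      /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
mdadm --create /dev/md1 --level=6 --raid-devices=6 \
      /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2 /dev/sde2 /dev/sdf2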

Since lvmraid has support for integrity on raid LVs, I backed up
everything and am now trying to recreate a similar structure with
lvmraid and integrity enabled.
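
To give an idea, the new LVs are being created roughly like this (PV
set and sizes are placeholders, not my exact commands):

vgcreate raid6-0 /dev/sda3 /dev/sdb1 /dev/sdd6 /dev/sde6 /dev/sdf1 /dev/sdg4
lvcreate --type raid6 --stripes 4 --extents 100%FREE \
         --raidintegrity y --name md0 raid6-0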

In the past, when multiple mdadm arrays needed to resync, they would
wait for each other to finish, because md detected that those arrays
shared the same disks.
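
You can check this in /proc/mdstat while several arrays want to
resync; roughly:

# Only one array per set of shared disks actually resyncs at a time;
# the others are reported as "resync=DELAYED" until the disks are free.
grep -E 'resync|recovery' /proc/mdstat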

Now, while recreating the arrays, I realized that the initial lvmraid
syncs don't wait for each other.
This means I can't recreate the whole structure in one go, as it would
thrash the I/O on these HDDs.

I don't know if this is intentional, because I haven't tried lvmraid
before, but I know lvmraid uses md under the hood, and I suspect this
might be a bug: the md code in the kernel probably can't detect the
shared underlying devices through the integrity layer.
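
The stacking is visible with lvs; a command along these lines (the
column choice is just my guess at what is useful) shows that each raid
image sits on top of an integrity sub-LV rather than directly on the
PV, so md presumably only ever sees the dm-integrity devices:

lvs -a -o lv_name,segtype,devices raid6-0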

But I think it might be worth fixing, as even with just 3 raid6
lvmraid LVs, and with the sync speed reduced to 10M via the
dev.raid.speed_limit_max sysctl, I get a pretty high load:

[root@hp ~] 2021-04-10 02:07:38
# lvs
  LV   VG      Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  root pve     rwi-aor--- 29,25g                                    100,00
  md0  raid6-0 rwi-a-r--- <3,61t                                    40,54
  md1  raid6-1 rwi-a-r--- <2,71t                                    8,54
  md2  raid6-2 rwi-a-r--- <3,61t                                    1,01
[root@hp ~] 2021-04-10 02:30:46
# pvs -S vg_name=raid6-0
  PV         VG      Fmt  Attr PSize   PFree
  /dev/sda3  raid6-0 lvm2 a--  931,50g 4,00m
  /dev/sdb1  raid6-0 lvm2 a--  931,50g 4,00m
  /dev/sdd6  raid6-0 lvm2 a--  931,50g 4,00m
  /dev/sde6  raid6-0 lvm2 a--  931,50g 4,00m
  /dev/sdf1  raid6-0 lvm2 a--  931,50g 4,00m
  /dev/sdg4  raid6-0 lvm2 a--  931,50g 4,00m
[root@hp ~] 2021-04-10 02:35:39
# uptime
 02:35:40 up 1 day, 29 min,  4 users,  load average: 138,20, 126,23, 135,60
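
For reference, the throttle mentioned above was set with the usual md
sysctl; as far as I know the value is in KiB/s, so capping each resync
at roughly 10 MiB/s looks like this:

# limit md/raid resync throughput; the value is in KiB/s
sysctl -w dev.raid.speed_limit_max=10240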

Although this is mostly due to the insane number of integrity kworker
processes, and the system is still pretty usable, I think it would be
much nicer to have only one sync running per physical device at a
time.
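
In the meantime I'll probably just serialize the initial syncs by
hand, something along these lines (untested sketch, LV names are
placeholders):

# Wait until a raid LV reports ~100% in-sync before creating the next
# one, so only one initial resync hits the shared disks at a time.
wait_sync() {
    until lvs --noheadings -o sync_percent "$1" | grep -q '100[.,]00'; do
        sleep 60
    done
}

# lvcreate --type raid6 --raidintegrity y ... in raid6-0, then:
# wait_sync raid6-0/md0
# lvcreate ... in raid6-1, then:
# wait_sync raid6-1/md1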

_______________________________________________
linux-lvm mailing list
linux-lvm@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/



