Re: LVM RAID: task mdX_raid1:221 blocked for more than 120 seconds

Resending; I erroneously replied only to Zdenek, sorry.


On 26/11/18 09:49, Zdenek Kabelac wrote:
It does look like 'freeze' happens during LV  resize of device
(just wild guess from bug=913138)

To track down the issue - there would need to be probably some communication with bug reporters - they would need to expose what they were doing plus state
of dm tables and number of other things.

I can provide details about this one, which was filed by me:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=913119

It's about a desktop PC with two SSDs (Samsung 850 EVO) on which I built RAID1 using LVM.
# pvs
  PV         VG  Fmt  Attr PSize    PFree
  /dev/sdb3  vg0 lvm2 a--  <250,00g 15,98g
  /dev/sdc3  vg0 lvm2 a--  <250,00g 15,98g

# lvs
  LV    VG  Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  home  vg0 rwi-aor--- 200,00g                                    100,00
  root  vg0 rwi-aor---  30,00g                                    100,00
  swap0 vg0 rwi-aor---   4,00g                                    100,00
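
If the state of the dm tables is also needed, it could be captured with something like the sketch below. The function name, the output directory and the file names are hypothetical, and the exact set of commands is only my assumption about what is useful; it needs to be run as root to get real output.

```shell
#!/bin/sh
# Sketch: save device-mapper and LVM state into a directory for a bug report.
# collect_dm_state and the default directory name are hypothetical.
collect_dm_state() {
    out="${1:-./dm-state}"
    mkdir -p "$out" || return 1

    # run NAME CMD [ARGS...]: store the output of CMD in $out/NAME.txt,
    # or a short note if CMD is not installed.
    run() {
        name="$1"; shift
        if command -v "$1" >/dev/null 2>&1; then
            "$@" >"$out/$name.txt" 2>&1 || true
        else
            echo "$1: command not found" >"$out/$name.txt"
        fi
    }

    run dmsetup-table  dmsetup table       # mapping tables of all dm devices
    run dmsetup-status dmsetup status      # per-target status, incl. raid sync
    run dmsetup-info   dmsetup info -c     # open counts and suspended state
    run lvs            lvs -a -o +devices  # include hidden rimage/rmeta sub-LVs
    run pvs            pvs
    return 0
}
```

For example, `collect_dm_state /tmp/dm-state` run while the system is frozen (if a shell still responds) or right after it recovers.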

It's a desktop PC running Debian unstable, so it's rebooted quite often due to frequent updates. The freezes happen during normal work, without any resizing or other LVM maintenance going on. Most of the time I noticed the freeze while I was using Thunderbird. Eventually the freezes resolve by themselves: I wait a few minutes and the system suddenly becomes responsive again. Sometimes I've noticed freezes that left no message in dmesg; maybe they resolved before the kernel's 120-second hung-task threshold was reached. But most of the time another freeze follows soon (it could be after 1-2 hours, but also after minutes), so a reboot is really necessary.

I've not noticed any corruption due to these freezes, but they are often very long and very disruptive. The only reliable workaround I found was to reboot with:
scsi_mod.use_blk_mq=0 dm_mod.use_blk_mq=0

Alternatively, rebooting with Debian kernel 4.16.16 (linux-image-4.16.0-2-amd) also avoids the problem; it is the last one that works without problems, and also the last one before the Debian maintainers enabled SCSI_MQ_DEFAULT and DM_MQ_DEFAULT.
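
To confirm which mode a booted system is actually using, the module parameters can be read back from sysfs. This is only a minimal sketch: the function name is hypothetical, the optional argument exists just to make it testable, and the files are only present on kernels that still have the use_blk_mq parameters.

```shell
#!/bin/sh
# Sketch: print the current use_blk_mq setting for scsi_mod and dm_mod.
# report_blk_mq is a hypothetical helper; $1 optionally overrides /sys.
report_blk_mq() {
    root="${1:-/sys}"
    for mod in scsi_mod dm_mod; do
        f="$root/module/$mod/parameters/use_blk_mq"
        if [ -r "$f" ]; then
            # Boolean module parameters read back as Y or N.
            printf '%s.use_blk_mq=%s\n' "$mod" "$(cat "$f")"
        else
            printf '%s.use_blk_mq=unavailable\n' "$mod"
        fi
    done
}
```

Calling `report_blk_mq` with no argument after booting with the two options above should report N for both modules.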

To me the only evidence is that the problem does not happen when blk-mq is disabled, so it looks like an interaction with blk-mq. I've read in the RHEL8 release notes that it will be enabled by default there, so I wonder whether this has happened to others. I have a Fedora Server 29 VM, upgraded from 28, but there, if I recall correctly, SCSI_MQ_DEFAULT and DM_MQ_DEFAULT are not set.
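
Whether a given kernel was built with these defaults can be checked in its config file. A minimal sketch, assuming the distribution installs the config as /boot/config-<release> (Debian and Fedora do); the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: show the blk-mq default options in a kernel config file.
# show_mq_defaults is a hypothetical helper; $1 optionally names the file.
show_mq_defaults() {
    cfg="${1:-/boot/config-$(uname -r)}"
    if [ ! -r "$cfg" ]; then
        echo "cannot read $cfg"
        return 1
    fi
    # Match both "CONFIG_X=y" and "# CONFIG_X is not set".
    grep -E '^(# )?CONFIG_(SCSI|DM)_MQ_DEFAULT[= ]' "$cfg" \
        || echo "options not present in $cfg"
}
```

Run without arguments, it inspects the config of the running kernel; a path can be passed to inspect another installed kernel.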

Anyway without way more info such bug report is meaningless.

Please ask, I'll do my best to provide any info you need.

Cesare.

_______________________________________________
linux-lvm mailing list
linux-lvm@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/



