Re: [REGRESSION] 6.7.1: md: raid5 hang and unresponsive system; successfully bisected

Hi Dan,

On 1/25/24 12:31 PM, Dan Moulding wrote:
> On this Fedora 39 VM, I created a 1GiB LVM volume to use as the RAID-5
> journal from space on the "boot" disk. Then I attached 3 additional
> 100 GiB virtual disks and created the RAID-5 from those 3 disks and
> the write-journal device. I then created a new LVM volume group from
> the md0 array and created one LVM logical volume named "data", using
> all but 64GiB of the available VG space. I then created an ext4 file
> system on the "data" volume, mounted it, and used "dd" to copy 1MiB
> blocks from /dev/urandom to a file on the "data" file system, and just
> let it run. Eventually "dd" hangs and top shows that md0_raid5 is
> using 100% CPU.

I can't reproduce this issue with this test case, even after running it overnight; dd keeps making good progress. dd is very busy, close to 100% CPU, and it sometimes sits in D state, but only for a moment. md0_raid5 stays around 60% and never reaches 100%.
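To tell a momentary D state from a task that is stuck there, it may help to sample the state over time. Below is a minimal sketch, not something I ran for the results above. It reads the state field from /proc/<pid>/stat and reports the longest run of consecutive D samples. The helper names, the sample count, and the interval are all illustrative assumptions.

```python
import time
from pathlib import Path


def parse_stat_state(stat_line: str) -> tuple[str, str]:
    """Return (comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the last ')' instead of splitting on whitespace.
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    comm = stat_line[open_idx + 1:close_idx]
    state = stat_line[close_idx + 1:].split()[0]
    return comm, state


def longest_d_streak(states: list[str]) -> int:
    """Length of the longest run of consecutive 'D' samples."""
    longest = current = 0
    for state in states:
        current = current + 1 if state == "D" else 0
        longest = max(longest, current)
    return longest


def sample_states(pid: int, samples: int = 30, interval: float = 1.0) -> list[str]:
    """Poll /proc/<pid>/stat and collect the state letter of each sample."""
    states = []
    for _ in range(samples):
        line = Path(f"/proc/{pid}/stat").read_text()
        states.append(parse_stat_state(line)[1])
        time.sleep(interval)
    return states
```

Calling longest_d_streak(sample_states(pid)) with the pid of dd gives a rough picture: a streak of one or two samples matches what I see here, while a streak covering the whole sampling window would point towards a real hang.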

I am wondering whether your case is a performance issue or a real hang. If it is a hang, I would expect to see a hung-task call trace for dd in dmesg, unless you have disabled kernel.hung_task_timeout_secs.
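A quick way to confirm that hung-task detection is active is to read the sysctl directly. The sketch below assumes the usual procfs location; a value of 0 means the check is disabled, and the file is absent when the kernel was built without CONFIG_DETECT_HUNG_TASK. The function name and the returned strings are only illustrative.

```python
from pathlib import Path

# Usual procfs location of the sysctl (assumed to be mounted at /proc).
HUNG_TASK_SYSCTL = Path("/proc/sys/kernel/hung_task_timeout_secs")


def hung_task_status(path: Path = HUNG_TASK_SYSCTL) -> str:
    """Describe whether the kernel hung-task detector would report a stuck dd."""
    try:
        raw = path.read_text().strip()
    except FileNotFoundError:
        # Most likely a kernel without CONFIG_DETECT_HUNG_TASK.
        return "unavailable"
    timeout = int(raw)
    if timeout == 0:
        # 0 disables the check, so no call trace would show up in dmesg.
        return "disabled"
    return f"enabled ({timeout}s)"


if __name__ == "__main__":
    print(hung_task_status())
```

If this reports "enabled", a task blocked in D state for longer than the timeout should produce an "INFO: task ... blocked for more than N seconds" message with a call trace in dmesg.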

Also, are you able to configure kdump and trigger a core dump when the issue reproduces?

Thanks,

Junxiao.



