Re: Linux RAID with btrfs stuck and consume 100 % CPU

Thanks a lot Manuel for your findings and information.

It's good to know btrfs is not causing this issue and the common symptom is an MD journal on another RAID device.

I have moved the journal from a logical volume on RAID1 to a plain partition on an SSD, and I will monitor the state.

Vojtech



On 17. 03. 21 5:35, Manuel Riel wrote:
Final update on this issue for anyone who encounters a similar problem in the future:

I didn't observe any "hanging" RAID devices after switching to an ordinary NVMe partition as the journal. So using another md array (e.g. an md-RAID1) as the journal device doesn't seem to be supported.

The docs[1] say "This means the cache disk must be ... sustainable." The "sustainable" part is what motivated me to use an md-RAID1 array. I think the docs should mention that the journal can't be on another RAID array.

I'm sending in a patch to emphasize this in the docs.


1: https://www.kernel.org/doc/html/latest/driver-api/md/raid5-cache.html

On Feb 28, 2021, at 4:34 PM, Manuel Riel <manu@xxxxxxxxxxxxx> wrote:

Hit another mdadm "hanger" today. Reads were no longer possible and md4_raid6 was stuck at 100% CPU.

I've now moved the write journal off the RAID1 device. So it's not a "nested" RAID any more. Hope this will help.

With only one hardware device used as write cache, I suppose only write-through mode[1] is suggested now.


1: https://www.kernel.org/doc/Documentation/md/raid5-cache.txt
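For anyone who wants to check which journal mode an array is currently using, here is a minimal sketch. It assumes the sysfs file /sys/block/<md>/md/journal_mode described in the raid5-cache document linked above, and it assumes that reading the file lists both modes with the active one in square brackets (e.g. "[write-through] write-back"). The helper names and the sysfs_root parameter are made up for illustration and are not part of mdadm or the kernel.

```python
from pathlib import Path

# The two mode names used by the raid5-cache documentation.
VALID_MODES = ("write-through", "write-back")


def journal_mode_path(md_name, sysfs_root="/sys"):
    """Build the path to the journal_mode file for an array such as 'md4'.

    sysfs_root is a hypothetical parameter so the helper can be pointed
    at a scratch directory instead of the real /sys.
    """
    return Path(sysfs_root) / "block" / md_name / "md" / "journal_mode"


def read_journal_mode(md_name, sysfs_root="/sys"):
    """Return the active journal mode, or None if it cannot be determined."""
    try:
        text = journal_mode_path(md_name, sysfs_root).read_text()
    except OSError:
        # No such file: the array probably has no journal, or is not raid4/5/6.
        return None
    # Assumed format: the active mode is the token wrapped in brackets.
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # Fallback: the file holds a single bare mode name.
    token = text.strip()
    return token if token in VALID_MODES else None


def set_journal_mode(md_name, mode, sysfs_root="/sys"):
    """Write a new mode to journal_mode (needs root on a real system)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown journal mode: {mode!r}")
    journal_mode_path(md_name, sysfs_root).write_text(mode + "\n")


if __name__ == "__main__":
    # 'md4' is the array name from this thread; adjust for your system.
    print(read_journal_mode("md4"))
```

With a single, non-redundant journal device, calling set_journal_mode("md4", "write-through") would match the suggestion above; whether that is the right choice for a given setup is something to confirm against the kernel documentation.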
