Re: Linux RAID with btrfs stuck and consume 100 % CPU

On Wed, Jul 29, 2020 at 3:06 PM Guoqing Jiang
<guoqing.jiang@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 7/22/20 10:47 PM, Vojtech Myslivec wrote:
> > 1. What should be the cause of this problem?
>
> Just a quick glance based on the stacks which you attached, I guess it
> could be
> a deadlock issue of raid5 cache super write.
>
> Maybe the commit 8e018c21da3f ("raid5-cache: fix a deadlock in superblock
> write") didn't fix the problem completely.  Cc Song.

That references discards, and it made me take another look at mdadm -D,
which shows a journal device:

       0     253        2        -      journal   /dev/dm-2

Vojtech, can you confirm this device is an SSD? There are a couple of
SSDs that show up in the dmesg, if I recall correctly.
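
One rough way to check is to read the block queue attributes from sysfs.
The sketch below assumes the journal is still /dev/dm-2 and that the
standard queue attributes (rotational, discard_granularity,
discard_max_bytes) are exposed for it; queue_attrs is just a name made up
for illustration. For a device-mapper node the values reflect the stacked
device, so the underlying physical disk may be worth checking too.

```python
from pathlib import Path


def queue_attrs(dev, sysfs_root="/sys/block"):
    """Read a few queue attributes for a block device.

    Returns a dict mapping attribute name to an int, or to None when the
    attribute is missing or unreadable.
    """
    queue = Path(sysfs_root) / dev / "queue"
    attrs = {}
    for name in ("rotational", "discard_granularity", "discard_max_bytes"):
        try:
            attrs[name] = int((queue / name).read_text().strip())
        except (OSError, ValueError):
            # Attribute not present or not a plain integer.
            attrs[name] = None
    return attrs
```

Calling queue_attrs("dm-2") on the affected machine would give rotational
== 0 for a non-rotational device, and discard_max_bytes == 0 if the block
layer does not accept discards for that device at all.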

What is the default discard hinting for this SSD when it's used as a
journal device for mdadm? And what is the write behavior of the
journal? I'm not familiar with this feature at all, whether it's
treated as a raw block device for the journal or if the journal
resides on a file system. So I get kinda curious what might happen
long term if this is a very busy file system, very busy raid5/6
journal on this SSD, without any discard hints? Is it possible the SSD
runs out of ready-to-write erase blocks, and the firmware has become
super slow doing erasure/garbage collection on demand? And the journal
is now having a hard time flushing?


-- 
Chris Murphy


