Re: parity raid and ext4 get stuck in writes

On Sun, Dec 24, 2023 at 11:39:05PM -0800, Daniel Dawson wrote:
> On 12/22/23 12:48 PM, Carlos Carvalho wrote:
> > This is finally a summary of a long standing problem. When lots of writes to
> > many files are sent in a short time the kernel gets stuck and stops sending
> > write requests to the disks. Sometimes it recovers and finally sends the
> > modified pages to permanent storage, sometimes not and eventually other
> > functions degrade and the machine crashes.
> > 
> > A simple way to reproduce: expand a kernel source tree, like
> > xzcat linux-6.5.tar.xz | tar x -f -
> This sounds almost exactly like a problem I was having, right down to
> triggering it by writing the files of a kernel tree, though the details in
> my case are slightly different. I wanted to report it, but wanted to get a
> better handle on it and never managed it, and now I've changed my setup such
> that it doesn't happen anymore.
> > - it happens only with ext4 on a parity raid array
> 
> This is where it differs for me. I experienced it only with btrfs. But I had

Hi Daniel,

I think some other people are noticing something similar on btrfs as
well [1]. It may be related to the issue you saw, although they have not
mentioned anything about raid in btrfs.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=2242391

Regards,
ojaswin
> two arrays with it, one on SSDs and one on HDDs. The HDD array exhibited the
> problem almost exclusively (the SSDs, I think, exhibited it once in several
> months, while the HDDs did pretty much every time I tried to compile a new
> kernel (until I started working around it), and even from some other things,
> which was a couple of times a week). I imagine this is because HDDs are
> much slower and therefore allow more data to get cached.
> 
> Now that I've switched the HDD array to ext4, I haven't experienced the
> issue even once. But the setup has better performance, so maybe it's just
> because it flushes its writes faster.
> 
> -- 
> PGP fingerprint: 5BBD5080FEB0EF7F142F8173D572B791F7B4422A
> 
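
One way to check whether dirty pages are piling up while reproducing the
stall discussed above is to sample the Dirty and Writeback counters in
/proc/meminfo during the tar extraction. The following is only a minimal
sketch, assuming a Linux system; the function names (parse_meminfo, sample,
watch) are made up for illustration and do not come from this thread.

```python
import time


def parse_meminfo(text, fields=("Dirty", "Writeback")):
    """Return {field: value in kB} for the requested /proc/meminfo fields."""
    values = {}
    for line in text.splitlines():
        # Lines look like "Dirty:              1234 kB"
        name, _, rest = line.partition(":")
        if name in fields:
            values[name] = int(rest.split()[0])
    return values


def sample(path="/proc/meminfo"):
    """Read the counters once from the given meminfo file."""
    with open(path) as f:
        return parse_meminfo(f.read())


def watch(interval=1.0, count=10):
    """Print a timestamped sample every `interval` seconds, `count` times."""
    for _ in range(count):
        print(time.strftime("%H:%M:%S"), sample())
        time.sleep(interval)


if __name__ == "__main__":
    watch()
```

If Dirty stays large while Writeback stays near zero for a long time, that
would be consistent with the kernel not sending write requests to the disks,
though it does not by itself identify the cause.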



