Re: problem with recovered array

If your directory entries are large (lots of small files in a
directory), then recovering the missing data could be just enough
to push your array too hard.

find /<mount> -type d -size +1M -ls     will find large directories.

Run ls -l <largedirname> | wc -l to see how many files are in there
(the count is one too high, because ls -l prints a leading "total" line).
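
The two steps above can be combined into one pass. This is only a rough
sketch: the function names are made up for illustration, it assumes GNU
find (for -printf, -mindepth and -maxdepth), and it has not been run
against the array in question.

```shell
#!/bin/sh
# Sketch only: function names are illustrative; requires GNU find.

# count_entries DIR: number of entries directly inside DIR
# (hidden files included, "." and ".." excluded).
count_entries() {
    # Print one byte per entry and count bytes, so file names that
    # contain newlines are not counted twice.
    find "$1" -mindepth 1 -maxdepth 1 -printf x | wc -c
}

# report_large_dirs ROOT [SIZE]: for every directory under ROOT whose own
# size exceeds SIZE (default +1M, as in the find command above), print
# "<entry count> <path>". -xdev keeps the scan on one filesystem.
report_large_dirs() {
    find "$1" -xdev -type d -size "${2:-+1M}" -exec sh -c '
        for dir do
            n=$(find "$dir" -mindepth 1 -maxdepth 1 -printf x | wc -c)
            printf "%s %s\n" "$n" "$dir"
        done' sh {} +
}
```

Something like report_large_dirs /<mount> | sort -n | tail would then
list the directories with the most entries last.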

ext3/4 has issues with really big directories.

The perf top output showed that just about all of the time was being
spent in ext3/4 threads allocating new blocks, directory entries and such.

How much free space does the disk show in df?
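
If it helps, something like the following prints the numbers in one go.
It is a sketch with a made-up function name. The inode line goes beyond
what I asked for above; I am only assuming it is worth a look when there
are very many small files. Some filesystems report "-" for inode usage.

```shell
#!/bin/sh
# Sketch only: fs_usage is an illustrative name, not an existing tool.

# fs_usage MOUNT: print the used percentage of blocks and of inodes for
# the filesystem that holds MOUNT. -P forces the one-line-per-filesystem
# POSIX output format, so the second line is always the data line.
fs_usage() {
    df -P -k "$1" | awk 'NR == 2 { print "blocks_used_pct", $5 }'
    df -P -i "$1" | awk 'NR == 2 { print "inodes_used_pct", $5 }'
}
```

Usage would be fs_usage /<mount>.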


On Tue, Oct 31, 2023 at 4:29 AM <eyal@xxxxxxxxxxxxxx> wrote:
>
> On 31/10/2023 14.21, Carlos Carvalho wrote:
> > Roger Heflin (rogerheflin@xxxxxxxxx) wrote on Mon, Oct 30, 2023 at 01:14:49PM -03:
> >> look at the sar -d output for all the disks in the raid6.   It may be a
> >> disk issue (though I suspect not given the 100% cpu shown in raid).
> >>
> >> Clearly something very expensive/deadlockish is happening because of
> >> the raid having to rebuild the data from the missing disk, not sure
> >> what could be wrong with it.
> >
> > This is very similar to what I complained about some 3 months ago. For me it happens
> > with an array in normal state. sar shows no disk activity yet there are no
> > writes to the array (reads happen normally) and the flushd thread uses 100%
> > cpu.
> >
> > For the latest 6.5.* I can reliably reproduce it with
> > % xzcat linux-6.5.tar.xz | tar x -f -
> >
> > This leaves the machine with ~1.5GB of dirty pages (as reported by
> > /proc/meminfo) that it never manages to write to the array. I've waited for
> > several hours to no avail. After a reboot the kernel tree had only about 220MB
> > instead of ~1.5GB...
>
> More evidence that the problem relates to the cache not flushed to disk.
>
> If I run 'rsync --fsync ...' it slows it down as the writing is flushed to disk for each file.
> But it also evicts it from the cache, so nothing accumulates.
> The result is a slower than otherwise copying but it streams with no pauses.
>
> It seems that the array is slow to sync files somehow. Mythtv has no problems because it writes
> only a few large files. rsync copies a very large number of small files, which somehow triggers
> the problem.
>
> This is why my 'dd if=/dev/zero of=file-on-array' goes fast without problems.
>
> Just my guess.
>
> BTW I ran fsck on the fs (on the array) and it found no fault.
>
> --
> Eyal at Home (eyal@xxxxxxxxxxxxxx)
>
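
To watch the dirty pages mentioned in the quoted text while a copy is
running, a small helper along these lines might do. It is a sketch with
a made-up function name. It reads the Dirty and Writeback fields of
/proc/meminfo, and takes an optional file argument only so it can be
tried on saved output.

```shell
#!/bin/sh
# Sketch only: dirty_kb is an illustrative name, not an existing tool.

# dirty_kb [FILE]: print the Dirty and Writeback values (in kB) from a
# meminfo-format file; FILE defaults to /proc/meminfo.
dirty_kb() {
    # Exact field match, so WritebackTmp and similar fields are skipped.
    awk '$1 == "Dirty:" || $1 == "Writeback:" { print $1, $2 }' \
        "${1:-/proc/meminfo}"
}
```

Running while sleep 5; do dirty_kb; done next to the rsync should show
whether Dirty keeps growing or drains between files.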



