If your directory entries are large (lots of small files in a directory)
then the recovery of the missing data could be just enough to push your
array too hard.

find /<mount> -type d -size +1M -ls

will find large directories. Then do a

ls -l <largedirname> | wc -l

and see how many files are in there. ext3/4 has issues with really big
directories. perf top showed that just about all of the time was being
spent in ext3/4 threads allocating new blocks/directory entries and such.

How much free space does the disk show in df?

On Tue, Oct 31, 2023 at 4:29 AM <eyal@xxxxxxxxxxxxxx> wrote:
>
> On 31/10/2023 14.21, Carlos Carvalho wrote:
> > Roger Heflin (rogerheflin@xxxxxxxxx) wrote on Mon, Oct 30, 2023 at 01:14:49PM -03:
> >> Look at sar -d output for all the disks in the raid6. It may be a
> >> disk issue (though I suspect not, given the 100% cpu shown in raid).
> >>
> >> Clearly something very expensive/deadlockish is happening because of
> >> the raid having to rebuild the data from the missing disk; not sure
> >> what could be wrong with it.
> >
> > This is very similar to what I complained about some 3 months ago. For me
> > it happens with an array in normal state. sar shows no disk activity,
> > there are no writes to the array (reads happen normally), and the flushd
> > thread uses 100% cpu.
> >
> > For the latest 6.5.* I can reliably reproduce it with
> > % xzcat linux-6.5.tar.xz | tar x -f -
> >
> > This leaves the machine with ~1.5GB of dirty pages (as reported by
> > /proc/meminfo) that it never manages to write to the array. I've waited
> > for several hours to no avail. After a reboot the kernel tree had only
> > about 220MB instead of ~1.5GB...
>
> More evidence that the problem relates to the cache not being flushed to disk.
>
> If I run 'rsync --fsync ...' it slows down, as the writing is flushed to
> disk for each file. But that also evicts it from the cache, so nothing
> accumulates. The result is slower copying than otherwise, but it streams
> with no pauses.
> It seems that the array is slow to sync files somehow. MythTV has no
> problems because it writes only a few large files. rsync copies a very
> large number of small files, which somehow triggers the problem.
>
> This is why my 'dd if=/dev/zero of=file-on-array' goes fast without problems.
>
> Just my guess.
>
> BTW I ran fsck on the fs (on the array) and it found no fault.
>
> --
> Eyal at Home (eyal@xxxxxxxxxxxxxx)
>
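For what it's worth, here is a self-contained sketch of the large-directory check suggested at the top of the thread. The scratch directory and file count are made up for the demo; on the real system you would point find at the array's mount point instead:

```shell
#!/bin/sh
# Demo of the large-directory check. "$scratch" is a hypothetical
# stand-in for the array's mount point.
scratch=$(mktemp -d)
for i in $(seq 1 1000); do touch "$scratch/file$i"; done

# For a directory, -size measures the directory file itself (its
# entry blocks), not its contents, so this flags directories with
# very many entries. Our demo dir is far below 1M, so nothing
# prints here; on a problem fs this would list the huge directories.
find "$scratch" -type d -size +1M -ls

# Count the entries in a suspect directory. Remember ls -l prints
# a "total" line, so this shows one more than the file count:
ls -l "$scratch" | wc -l   # prints 1001 (1000 files + "total" line)

# Carlos's dirty-page check after the untar (Linux only):
grep '^Dirty:' /proc/meminfo

rm -rf "$scratch"
```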