Did you check with iostat/sar? (iostat -x 2 5). The md housekeeping/background stuff does not show on the md device itself, but it shows on the underlying disks.

It might also be related to the bitmap keeping track of how far the missing disk is behind.

Small files are troublesome. A tiny file takes several reads and writes.

I think the bitmap is tracking how many writes need to be done on the missing disk, and if so, it will not start cleaning up until the new disk is put back.

On Tue, Oct 31, 2023 at 4:40 PM <eyal@xxxxxxxxxxxxxx> wrote:
>
> On 31/10/2023 21.24, Roger Heflin wrote:
> > If your directory entries are large (lots of small files in a
> > directory) then the recovery of the missing data could be just enough
> > to push your array too hard.
>
> Nah, the directory I am copying has nothing really large, and the target directory is created new.
>
> > find /<mount> -type d -size +1M -ls will find large directories.
> >
> > do a ls -l <largedirname> | wc -l and see how many files are in there.
> >
> > ext3/4 has issues with really big directories.
> >
> > The perf top showed just about all of the time was being spent in
> > ext3/4 threads allocating new blocks/directory entries and such.
>
> Just in case there is an issue, I will copy another directory as a test.
> [later] Same issue. This time the files were Pictures, 1-3MB each, so it went faster (but not as fast as the array can sustain).
> After a few minutes (9GB copied) it took a long pause and a second kworker started. This one was gone after I killed the copy.
>
> However, this same content was copied from an external USB disk (NOT to the array) without a problem.
>
> > How much free space does the disk show in df?
>
> Enough room:
> /dev/md127 55T 45T 9.8T 83% /data1
>
> I still suspect an issue with the array after it was recovered.
>
> A related issue is that there is a constant rate of writes to the array (iostat says) at about 5KB/s
> when there is no activity on this fs.
> In the past I saw zero read/write in iostat in this situation.
>
> Is there some background md process? Can it be hurried to completion?
>
> > On Tue, Oct 31, 2023 at 4:29 AM <eyal@xxxxxxxxxxxxxx> wrote:
> >>
> >> On 31/10/2023 14.21, Carlos Carvalho wrote:
> >>> Roger Heflin (rogerheflin@xxxxxxxxx) wrote on Mon, Oct 30, 2023 at 01:14:49PM -03:
> >>>> look at SAR -d output for all the disks in the raid6. It may be a
> >>>> disk issue (though I suspect not given the 100% cpu show in raid).
> >>>>
> >>>> Clearly something very expensive/deadlockish is happening because of
> >>>> the raid having to rebuild the data from the missing disk, not sure
> >>>> what could be wrong with it.
> >>>
> >>> This is very similar to what I complained some 3 months ago. For me it happens
> >>> with an array in normal state. sar shows no disk activity yet there are no
> >>> writes to the array (reads happen normally) and the flushd thread uses 100%
> >>> cpu.
> >>>
> >>> For the latest 6.5.* I can reliably reproduce it with
> >>> % xzcat linux-6.5.tar.xz | tar x -f -
> >>>
> >>> This leaves the machine with ~1.5GB of dirty pages (as reported by
> >>> /proc/meminfo) that it never manages to write to the array. I've waited for
> >>> several hours to no avail. After a reboot the kernel tree had only about 220MB
> >>> instead of ~1.5GB...
> >>
> >> More evidence that the problem relates to the cache not flushed to disk.
> >>
> >> If I run 'rsync --fsync ...' it slows it down as the writing is flushed to disk for each file.
> >> But it also evicts it from the cache, so nothing accumulates.
> >> The result is a slower than otherwise copying but it streams with no pauses.
> >>
> >> It seems that the array is slow to sync files somehow. Mythtv has no problems because it writes
> >> only a few large files. rsync copies a very large number of small files which somehow triggers
> >> the problem.
> >>
> >> This is why my 'dd if=/dev/zero of=file-on-array' goes fast without problems.
> >>
> >> Just my guess.
> >>
> >> BTW I ran fsck on the fs (on the array) and it found no fault.
> >>
> >> --
> >> Eyal at Home (eyal@xxxxxxxxxxxxxx)
> >>
>
> --
> Eyal at Home (eyal@xxxxxxxxxxxxxx)
>
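
To see whether the dirty pages mentioned in the quoted thread (the /proc/meminfo figure) are actually draining during the copy, something like the following could be left running alongside iostat. This is only a minimal sketch: the function name dirty_kb is made up here, and it assumes the usual "Dirty:" and "Writeback:" lines in /proc/meminfo.

```shell
#!/bin/sh
# Print the Dirty and Writeback counters (in kB) from a meminfo-style file.
# dirty_kb is a hypothetical helper name used only for this sketch; the file
# argument defaults to /proc/meminfo.
dirty_kb() {
    awk '$1 == "Dirty:"     { d = $2 }
         $1 == "Writeback:" { w = $2 }
         END { printf "Dirty=%skB Writeback=%skB\n", d, w }' "${1:-/proc/meminfo}"
}

# Example: sample every 2 seconds while the rsync runs (Ctrl-C to stop).
# while sleep 2; do dirty_kb; done
```

If Dirty keeps climbing while the underlying disks look mostly idle in iostat, that would point more toward the flushing side than toward the disks themselves.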