Re: problem with recovered array


 



On 01/11/2023 21.30, Roger Heflin wrote:
Did you check with iostat/sar?    (iostat -x 2 5).  The md
housekeeping/background stuff does not show on the md device itself,
but it shows on the underlying disks.

Yes, I have iostat on both the md device and the components and I see sparse activity on both.

It might also be related to the bitmap keeping track of how far the
missing disk is behind.

This may be the case. Knowing this would help if the disk is re-added,
but if a new disk is added then the full disk will be rebuilt anyway.
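As a hedged aside (not a command from the thread): the write-intent bitmap state can be inspected directly. A minimal sketch, assuming a Linux host; the member-disk name is hypothetical:

```shell
# Sketch: show md bitmap/resync state (device names are examples)
bitmap_state() {
    if [ -r /proc/mdstat ]; then
        cat /proc/mdstat      # the "bitmap:" line shows pages still dirty
    else
        echo "no /proc/mdstat on this machine"
    fi
    # On a member disk, mdadm can dump the bitmap superblock itself
    # (needs root; /dev/sda1 is a hypothetical member name):
    #   mdadm --examine-bitmap /dev/sda1
}
bitmap_state
```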

Small files are troublesome.    A tiny file takes several reads and writes.

Yes, but the array is so fast that this should not be a problem.
The rsync source is 175MB/s and the target array is 800MB/s so I do not
see how the writing can slow the copy.

In one case I (virsh) saved a VM, which created one 8GB file, which took
many hours to be written.

I think the bitmap is tracking how many writes still need to be done on
the missing disk, and if so it will not start cleaning up until the new
disk is put back.


On Tue, Oct 31, 2023 at 4:40 PM <eyal@xxxxxxxxxxxxxx> wrote:

On 31/10/2023 21.24, Roger Heflin wrote:
If your directory entries are large (lots of small files in a
directory) then the recovery of the missing data could be just enough
to push your array too hard.

Nah, the directory I am copying has nothing really large, and the target directory is created new.

find /<mount> -type d -size +1M -ls     will find large directories.

do a ls -l <largedirname> | wc -l and see how many files are in there.

ext3/4 has issues with really big directories.

The perf top output showed just about all of the time was being spent in
ext3/4 threads allocating new blocks/directory entries and such.
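The directory-size check above can be sketched as follows; the helper name and example path are mine, not from the thread:

```shell
# Sketch: count entries in a directory to spot oversized ones
count_entries() {
    # ls -1A: one name per line, include dotfiles, skip . and ..
    ls -1A "$1" | wc -l
}

# Example usage (path is illustrative):
count_entries /tmp
```

Directories whose own inode has grown large can be found separately with the `find -type d -size +1M` invocation given above.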

Just in case there is an issue, I will copy another directory as a test.
[later] Same issue. This time the files were Pictures, 1-3MB each, so it went faster (but not as fast as the array can sustain).
After a few minutes (9GB copied) it took a long pause and a second kworker started. That kworker went away after I killed the copy.

However, this same content was copied from an external USB disk (NOT to the array) without a problem.

How much free space does the disk show in df?

Enough room:
         /dev/md127       55T   45T  9.8T  83% /data1

I still suspect an issue with the array after it was recovered.

A related issue is that there is a constant rate of writes to the array (per iostat) of about 5KB/s
even when there is no activity on this fs. In the past iostat showed zero reads/writes in this situation.

Is there some background md process? Can it be hurried to completion?
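One way to check for such a background process yourself (a sketch, not from the thread; md127 is the device from the df output above, and the sysfs attribute names are standard md ones):

```shell
# Sketch: look for background md activity on an array
md_state() {
    if [ -r /proc/mdstat ]; then
        cat /proc/mdstat              # shows any resync/recovery progress
    fi
    for attr in sync_action sync_completed; do
        f="/sys/block/$1/md/$attr"
        if [ -r "$f" ]; then
            echo "$attr: $(cat "$f")"
        fi
    done
}
md_state md127
```

`sync_action` reads `idle` when no housekeeping is running.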

On Tue, Oct 31, 2023 at 4:29 AM <eyal@xxxxxxxxxxxxxx> wrote:

On 31/10/2023 14.21, Carlos Carvalho wrote:
Roger Heflin (rogerheflin@xxxxxxxxx) wrote on Mon, Oct 30, 2023 at 01:14:49PM -03:
Look at sar -d output for all the disks in the raid6.   It may be a
disk issue (though I suspect not, given the 100% cpu shown in raid).

Clearly something very expensive/deadlockish is happening because of
the raid having to rebuild the data from the missing disk, not sure
what could be wrong with it.

This is very similar to what I complained about some 3 months ago. For me it happens
with an array in normal state: sar shows no disk activity, writes to the array never
complete (reads happen normally), and the flush thread uses 100%
cpu.

For the latest 6.5.* I can reliably reproduce it with
% xzcat linux-6.5.tar.xz | tar x -f -

This leaves the machine with ~1.5GB of dirty pages (as reported by
/proc/meminfo) that it never manages to write to the array. I've waited for
several hours to no avail. After a reboot the kernel tree had only about 220MB
instead of ~1.5GB...
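The dirty-page figure quoted above can be watched directly while the extract runs; a sketch, assuming a Linux host:

```shell
# Sketch: show how much dirty data is waiting for writeback
# (the same /proc/meminfo fields the report quotes)
show_dirty() {
    if [ -r /proc/meminfo ]; then
        grep -E '^(Dirty|Writeback):' /proc/meminfo
    else
        echo "/proc/meminfo not available"
    fi
}
show_dirty
```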

More evidence that the problem relates to the cache not being flushed to disk.

If I run 'rsync --fsync ...' it slows down, as the writing is flushed to disk for each file.
But it also evicts the data from the cache, so nothing accumulates.
The result is a slower copy than otherwise, but it streams with no pauses.
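A related knob, offered only as a hedged suggestion (not something the thread itself recommends): shrinking the kernel's dirty-page window makes writeback start earlier, which can approximate the steady streaming of --fsync without a per-file sync. The byte values below are illustrative, not tuned:

```shell
# Config sketch: start background writeback at 64MB of dirty data,
# and throttle writers at 256MB (needs root; values are illustrative)
sysctl vm.dirty_background_bytes=67108864
sysctl vm.dirty_bytes=268435456
```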

It seems that the array is somehow slow to sync files. Mythtv has no problems because it writes
only a few large files. rsync copies a very large number of small files, which somehow triggers
the problem.

This is why my 'dd if=/dev/zero of=file-on-array' goes fast without problems.

Just my guess.

BTW I ran fsck on the fs (on the array) and it found no fault.

--
Eyal at Home (eyal@xxxxxxxxxxxxxx)





