On Tue, Nov 27, 2018 at 3:51 AM NeilBrown <neilb@xxxxxxxx> wrote:
>
>> Now I tried do change chunks size on the first server, but no success:
>> # mdadm --grow /dev/md3 --chunk=4096 --backup-file=/home/md3-backup
>> chunk size for /dev/md3 set to 16777216
>
> Hmmm - that's a bug. In Grow.c (in mdadm)
>
>     printf("chunk size for %s set to %d\n",
>            devname, array.chunk_size);
>
> should be
>
>     printf("chunk size for %s set to %d\n",
>            devname, info->new_chunk);

I see. But it shouldn't prevent the array from being reshaped (nothing
happens apart from this message).

> You shouldn't need --backup-file if kernel and mdadm are reasonably
> recent.
>
> What kernel and what mdadm are you using? What does "mdadm --examine"
> of the devices show?

I started with 4.18.12 from Debian backports and later switched to a
vanilla kernel; 4.20-rc4 is in use now.
The mdadm Debian package version is 3.4-4+b1:

# mdadm --version
mdadm - v3.4 - 28th January 2016

I uploaded the mdadm --examine output to
https://bugzilla.kernel.org/show_bug.cgi?id=201331
I also uploaded dmesg output with CONFIG_LOCKDEP=y.

>
>> I have some questions:
>> 1. Is deadlock under load an expected behavior with 16Mb chunk size?
>> Or it is a bug and should be fixed?
>
> It's a bug. Maybe it can be fixed by calling
>     md_wakeup_thread(bitmap->mddev->thread);
> in md_bitmap_startwrite() just before the call to schedule().

OK, I'll give it a try.

>
>> 2. Is it possible to reshape existing RAID with smaller chunk size?
>> (without data loss)
>
> Yes.

I have not managed to do it yet.

>
>> 3. Why chunk size over 4Mb causes bad write performance?
>
> The larger the chunk size, the more read-modify-write cycles are
> needed. With smaller chunk sizes, a write can cover whole stripes, and
> doesn't need to read anything.

I found a threshold value of 4 MiB. With a chunk size above it, a random
write test produces lots of reads even if stripe_cache_size is set to
the maximum (32768). I do not understand why.
IMHO, if the write block is smaller than the chunk, it shouldn't matter
how large the chunk size is.

BTW, why is stripe_cache_size limited to 32768? It seems that this limit
could be safely increased for machines with a lot of RAM.
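
For reference, the stripe arithmetic behind question 3 can be written down as a rough sketch. The array geometry is not stated in this thread, so the 6-device RAID6 (4 data disks) used here is purely hypothetical, as are the function names. The memory estimate uses the formula from the kernel md documentation (page size * number of disks * stripe_cache_size) and assumes 4 KiB pages. This only models when a write can contain a whole aligned stripe; it is not a description of what the raid5 code actually does internally.

```python
KIB = 1024
MIB = 1024 * KIB


def full_stripe_bytes(chunk_bytes, data_disks):
    """Amount of data in one full stripe (one chunk on every data disk)."""
    return chunk_bytes * data_disks


def covers_full_stripe(offset, length, chunk_bytes, data_disks):
    """True if [offset, offset + length) contains at least one whole,
    stripe-aligned stripe, i.e. parity could be computed without reads."""
    stripe = full_stripe_bytes(chunk_bytes, data_disks)
    first_aligned = -(-offset // stripe) * stripe  # round offset up
    return first_aligned + stripe <= offset + length


def stripe_cache_bytes(stripe_cache_size, nr_disks, page_size=4096):
    """Memory used by the stripe cache, per the md documentation formula."""
    return page_size * nr_disks * stripe_cache_size


if __name__ == "__main__":
    data_disks = 4   # hypothetical: 6-device RAID6
    nr_disks = 6     # hypothetical
    for chunk in (512 * KIB, 4 * MIB, 16 * MIB):
        stripe = full_stripe_bytes(chunk, data_disks)
        print(f"chunk {chunk // KIB} KiB -> full stripe {stripe // MIB} MiB")
    cache = stripe_cache_bytes(32768, nr_disks)
    print(f"stripe cache at 32768 entries: {cache // MIB} MiB")
```

Under these assumptions, a 16 MiB chunk gives a 64 MiB full stripe, so a write has to be at least that large and aligned before it can avoid reads entirely, while the maximum stripe cache still costs well under 1 GiB of RAM.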