On Sat, Nov 24 2018, Chris Murphy wrote:

> On Sun, Nov 18, 2018 at 8:01 PM Ed Spiridonov <edo.rus@xxxxxxxxx> wrote:
>>
>> I've set up a server with a large amount of disk space (10x10TB HDDs).
>> This server should deliver files over HTTP to many clients; the usual
>> file size is several MB.
>> I use RAID 6 and XFS.
>> I decided to make the chunk size as large as possible.
>>
>> My reasoning is:
>> HDD performance is mostly limited by seeks.
>> With the default chunk size (512KB), reading a 4MB file touches 8 HDDs (8 seeks).
>> With a large chunk size, only one HDD is touched (1 seek).
>> In short, reads prefer large chunks, writes prefer small chunks.
>>
>> So I created the array with the maximum possible chunk size (16MB),
>> and I have issues with this array:
>> https://bugzilla.kernel.org/show_bug.cgi?id=201331
>>
>> I have another server with a similar setup, and I ran some tests on it.
>> As expected, a large chunk size gives significantly better
>> multithreaded large-block read performance,
>> but write performance drops with chunk sizes over 4MB.
>> So I set up the second server with a 4MB chunk size, and I see no such
>> deadlocks on that server.
>>
>> Now I have tried to change the chunk size on the first server, but without success:
>> # mdadm --grow /dev/md3 --chunk=4096 --backup-file=/home/md3-backup
>> chunk size for /dev/md3 set to 16777216

Hmmm - that's a bug.  In Grow.c (in mdadm)

   printf("chunk size for %s set to %d\n", devname, array.chunk_size);

should be

   printf("chunk size for %s set to %d\n", devname, info->new_chunk);

You shouldn't need --backup-file if your kernel and mdadm are reasonably
recent.  What kernel and what mdadm are you using?
What does "mdadm --examine" of the devices show?

>>
>> (and no changes in /proc/mdstat)
>>
>> I have some questions:
>> 1. Is a deadlock under load expected behavior with a 16MB chunk size,
>> or is it a bug that should be fixed?

It's a bug.
Maybe it can be fixed by calling

   md_wakeup_thread(bitmap->mddev->thread);

in md_bitmap_startwrite(), just before the call to schedule().

>> 2. Is it possible to reshape an existing RAID to a smaller chunk size
>> (without data loss)?

Yes.

>> 3. Why does a chunk size over 4MB cause bad write performance?

The larger the chunk size, the more read-modify-write cycles are needed.
With smaller chunk sizes, a write can cover whole stripes and doesn't
need to read anything.

NeilBrown

(I didn't get the original email due to email problems, so I'm replying
to a reply.)