----- Original Message -----
> From: "Shaohua Li" <shli@xxxxxxxxxx>
> To: "Xiao Ni" <xni@xxxxxxxxxx>
> Cc: linux-raid@xxxxxxxxxxxxxxx, "ming lei" <ming.lei@xxxxxxxxxx>, ncroxon@xxxxxxxxxx, neilb@xxxxxxxx
> Sent: Monday, April 2, 2018 7:02:37 AM
> Subject: Re: [PATCH V3 1/1] MD: fix lock contention for flush bios
>
> On Wed, Mar 21, 2018 at 02:47:22PM +0800, Xiao Ni wrote:
> > There is lock contention when many processes send flush bios to an md
> > device, e.g. when creating many LVs on one raid device and running
> > mkfs.xfs on each LV.
> >
> > Currently flush requests can only be handled sequentially: a new flush
> > has to wait for mddev->flush_bio to become NULL, otherwise it blocks on
> > mddev->lock.
> >
> > This patch removes mddev->flush_bio and handles flush bios asynchronously.
> > I ran a test with "dbench -s 128 -t 4800". These are the results:
> >
> > =================Without the patch============================
> > Operation                Count    AvgLat    MaxLat
> > --------------------------------------------------
> > Flush                     5239   142.590  3972.034
> > Close                    53114     0.176   498.236
> > LockX                      208     0.066     0.907
> > Rename                    2793     0.335     7.203
> > ReadX                    98100     0.020     2.280
> > WriteX                   67800   555.649  8238.498
> > Unlink                    7985     1.742   446.503
> > UnlockX                    208     0.058     1.013
> > FIND_FIRST               21035     0.141     3.147
> > SET_FILE_INFORMATION      6419     0.090     1.539
> > QUERY_FILE_INFORMATION   18244     0.007     0.130
> > QUERY_PATH_INFORMATION   55622     0.060     3.884
> > QUERY_FS_INFORMATION      9451     0.040     1.148
> > NTCreateX                63960     0.717   536.542
> >
> > Throughput 12.1782 MB/sec (sync open) 128 clients 128 procs  max_latency=8238.513 ms
> >
> > =====================With the patch===========================
> > Operation                Count    AvgLat    MaxLat
> > --------------------------------------------------
> > Flush                    34858    36.484   668.243
> > Close                   379883     0.107   252.232
> > LockX                     1792     0.048     1.070
> > Rename                   21761     0.804   266.659
> > ReadX                   817947     0.021    42.891
> > WriteX                  254804   142.485   948.090
> > Unlink                   99665     3.590   899.816
> > UnlockX                   1792     0.056     1.240
> > FIND_FIRST              178857     0.187    23.287
> > SET_FILE_INFORMATION     41612     0.135    26.575
> > QUERY_FILE_INFORMATION   83691     0.007     2.589
> > QUERY_PATH_INFORMATION  470889     0.077    83.846
> > QUERY_FS_INFORMATION     82764     0.056    10.368
> > NTCreateX               512262     0.616   809.980
> >
> > Throughput 53.6545 MB/sec (sync open) 128 clients 128 procs  max_latency=948.105 ms
> >
> > V3:
> > Shaohua suggests a mempool is overkill. In v3 the memory is allocated
> > when the raid device is created, and a simple bitmap records which
> > resources are free.
>
> Sorry for the delay. The bitmap method is still too complicated. Can we do
> something like this:
>
> in mddev:
>     struct flush_info flush_infos[8 or 16];
> Maybe we don't need a special struct either.
>
> The info can include a lock. Every time we handle a flush, we select a
> flush_info by flush_infos[jhash(flush_request_bio_address)];
>
> We then take the lock and do whatever the current code does against the
> specific flush_info. Isn't this a simpler implementation? I think 8 or 16
> locks should reduce most of the lock contention.

Hi Shaohua

There is a problem. If the hash gives the same result for two bios, both
flush bios have to wait on the same flush_info, even when other flush_infos
are free. With the bitmap method, a flush request can take any free
flush_info in flush_infos[].

Best Regards
Xiao
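
For readers following the thread, here is a minimal sketch of the hashed
flush_info scheme Shaohua describes above. It is only an illustration of the
idea, not the actual md code: the names flush_info, NR_FLUSH_INFOS and
pick_flush_info, and the struct layout, are assumptions made up for the
example (in the real proposal the array would live in struct mddev).

    /*
     * Sketch only: spread flush handling across a small array of slots,
     * each with its own lock, indexed by a hash of the flush bio pointer.
     */
    #include <linux/jhash.h>
    #include <linux/spinlock.h>
    #include <linux/bio.h>

    #define NR_FLUSH_INFOS 16               /* "8 or 16" per the suggestion */

    struct flush_info {
            spinlock_t      lock;           /* protects this slot only */
            struct bio      *flush_bio;     /* flush currently in flight here */
    };

    /* In the proposal this array would be a member of struct mddev. */
    static struct flush_info flush_infos[NR_FLUSH_INFOS];

    static struct flush_info *pick_flush_info(struct bio *bio)
    {
            /* Hash the bio address so concurrent flushes spread across slots. */
            u32 idx = jhash_1word((u32)(unsigned long)bio, 0) % NR_FLUSH_INFOS;

            return &flush_infos[idx];
    }

The caller would take fi->lock for the selected slot and then do what the
current per-mddev code does, but against that slot only. Two bios that hash
to the same index still serialize on that slot's lock even if other slots
are idle, which is exactly the collision concern raised in the reply above.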