On Mon, Apr 02, 2018 at 10:48:36PM -0400, Xiao Ni wrote:
> 
> 
> ----- Original Message -----
> > From: "Shaohua Li" <shli@xxxxxxxxxx>
> > To: "Xiao Ni" <xni@xxxxxxxxxx>
> > Cc: linux-raid@xxxxxxxxxxxxxxx, "ming lei" <ming.lei@xxxxxxxxxx>, ncroxon@xxxxxxxxxx, neilb@xxxxxxxx
> > Sent: Monday, April 2, 2018 7:02:37 AM
> > Subject: Re: [PATCH V3 1/1] MD: fix lock contention for flush bios
> > 
> > On Wed, Mar 21, 2018 at 02:47:22PM +0800, Xiao Ni wrote:
> > > There is lock contention when many processes send flush bios to an md
> > > device, e.g. when creating many LVs on one raid device and running
> > > mkfs.xfs on each LV.
> > > 
> > > Currently flush requests can only be handled sequentially: a process has
> > > to wait for mddev->flush_bio to become NULL while holding mddev->lock.
> > > 
> > > This patch removes mddev->flush_bio and handles flush bios asynchronously.
> > > I ran a test with the command dbench -s 128 -t 4800. These are the results:
> > > 
> > > =================Without the patch============================
> > >  Operation                 Count    AvgLat    MaxLat
> > >  --------------------------------------------------
> > >  Flush                      5239   142.590  3972.034
> > >  Close                     53114     0.176   498.236
> > >  LockX                       208     0.066     0.907
> > >  Rename                     2793     0.335     7.203
> > >  ReadX                     98100     0.020     2.280
> > >  WriteX                    67800   555.649  8238.498
> > >  Unlink                     7985     1.742   446.503
> > >  UnlockX                     208     0.058     1.013
> > >  FIND_FIRST                21035     0.141     3.147
> > >  SET_FILE_INFORMATION       6419     0.090     1.539
> > >  QUERY_FILE_INFORMATION    18244     0.007     0.130
> > >  QUERY_PATH_INFORMATION    55622     0.060     3.884
> > >  QUERY_FS_INFORMATION       9451     0.040     1.148
> > >  NTCreateX                 63960     0.717   536.542
> > > 
> > > Throughput 12.1782 MB/sec (sync open) 128 clients 128 procs max_latency=8238.513 ms
> > > 
> > > =====================With the patch===========================
> > >  Operation                 Count    AvgLat    MaxLat
> > >  --------------------------------------------------
> > >  Flush                     34858    36.484   668.243
> > >  Close                    379883     0.107   252.232
> > >  LockX                      1792     0.048     1.070
> > >  Rename                    21761     0.804   266.659
> > >  ReadX                    817947     0.021    42.891
> > >  WriteX                   254804   142.485   948.090
> > >  Unlink                    99665     3.590   899.816
> > >  UnlockX                    1792     0.056     1.240
> > >  FIND_FIRST               178857     0.187    23.287
> > >  SET_FILE_INFORMATION      41612     0.135    26.575
> > >  QUERY_FILE_INFORMATION    83691     0.007     2.589
> > >  QUERY_PATH_INFORMATION   470889     0.077    83.846
> > >  QUERY_FS_INFORMATION      82764     0.056    10.368
> > >  NTCreateX                512262     0.616   809.980
> > > 
> > > Throughput 53.6545 MB/sec (sync open) 128 clients 128 procs max_latency=948.105 ms
> > > 
> > > V3:
> > > Shaohua suggested a mempool is overkill. In v3 the memory is allocated
> > > when the raid device is created, and a simple bitmap records which
> > > resources are free.
> > 
> > Sorry for the delay. The bitmap method is still too complicated. Can we do
> > something like this:
> > 
> > in mddev:
> > struct flush_info flush_infos[8 or 16]; Maybe we don't even need a special
> > struct.
> > 
> > The info can include a lock. Every time we handle a flush, we select a
> > flush_info with flush_infos[jhash(flush_request_bio_address)];
> > 
> > we then take the lock and do whatever the current code does against that
> > specific flush_info. Isn't this a simpler implementation? I think 8 or 16
> > locks should reduce most of the lock contention.
> 
> Hi Shaohua
> 
> There is a problem: if the hash method gives the same result for different
> bios, those flush bios will all wait for one flush_info, even when other
> flush_infos are free. With the bitmap method, a flush request can always
> get a free flush_info from flush_infos[].

If the hash algorithm is good, this shouldn't be a problem. The bitmap
method needs a global lock, which we want to avoid, I think.

Thanks,
Shaohua
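
A minimal, hypothetical C sketch of the hash-partitioned flush_info scheme
discussed in this thread, to make the idea concrete. It is not the actual
patch: NR_FLUSH_INFOS, select_flush_info() and md_flush_request_hashed() are
invented names, the per-flush submission/completion logic is elided, and in
real code the array would live inside struct mddev with its locks initialized
with spin_lock_init() when the device is created.

#include <linux/spinlock.h>
#include <linux/jhash.h>
#include <linux/bio.h>

#define NR_FLUSH_INFOS 16               /* "8 or 16", as suggested above */

struct flush_info {
        spinlock_t      lock;
        struct bio      *flush_bio;     /* in-flight flush for this slot, or NULL */
        /* ... whatever per-flush state md currently keeps in mddev ... */
};

/* illustration only; the real array would be a member of struct mddev */
static struct flush_info flush_infos[NR_FLUSH_INFOS];

/* pick a slot by hashing the bio address so unrelated flushes spread out */
static struct flush_info *select_flush_info(struct bio *bio)
{
        u32 h = jhash(&bio, sizeof(bio), 0);

        return &flush_infos[h % NR_FLUSH_INFOS];
}

static void md_flush_request_hashed(struct bio *bio)
{
        struct flush_info *fi = select_flush_info(bio);

        spin_lock_irq(&fi->lock);
        /*
         * Same waiting/submission logic as today's md_flush_request(),
         * but serialized only against flushes that hashed to this slot
         * rather than against every flush on the mddev.
         */
        spin_unlock_irq(&fi->lock);
}

The trade-off debated above is visible in the sketch: contention is split
across NR_FLUSH_INFOS independent locks with no global state, but two bios
that hash to the same slot still serialize even if other slots are idle,
which is what the bitmap scheme avoids at the cost of a global lock.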