Re: [RFC PATCH V2 3/3] dm: support bio polling

On Mon, Jun 21, 2021 at 07:33:34PM +0800, JeffleXu wrote:
> 
> 
> On 6/18/21 10:39 PM, Ming Lei wrote:
> > From 47e523b9ee988317369eaadb96826323cd86819e Mon Sep 17 00:00:00 2001
> > From: Ming Lei <ming.lei@xxxxxxxxxx>
> > Date: Wed, 16 Jun 2021 16:13:46 +0800
> > Subject: [RFC PATCH V3 3/3] dm: support bio polling
> > 
> > Support bio(REQ_POLLED) polling in the following approach:
> > 
> > 1) only support io polling on normal READ/WRITE; other abnormal IOs
> > still fall back to IRQ mode, so the target io is exactly inside the dm
> > io.
> > 
> > 2) hold one refcnt on io->io_count after submitting this dm bio with
> > REQ_POLLED
> > 
> > 3) support dm native bio splitting: any dm io instance associated with
> > the current bio is added to one list whose head is bio->bi_end_io,
> > which is recovered before ending this bio
> > 
> > 4) implement .poll_bio() callback, call bio_poll() on the single target
> > bio inside the dm io which is retrieved via bio->bi_bio_drv_data; call
> > dec_pending() after the target io is done in .poll_bio()
> > 
> > 5) enable QUEUE_FLAG_POLL if all underlying queues enable QUEUE_FLAG_POLL,
> > which is based on Jeffle's previous patch.
> > 
> > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
> > ---
> > V3:
> > 	- covers all comments from Jeffle
> > 	- fix corner cases when polling on abnormal ios
> > 
> ...
> 
> One bug and one performance issue, though I haven't investigated either
> in depth.
> 
> 
> kernel base: based on Jens' for-next, applying Christoph and Leiming's
> patchset.
> 
> 
> 1. One bug occurs with a stacked DM device, e.g., dm-linear on top of
> another dm-linear. It can be reproduced with the following steps:
> 
> ```
> $ sudo dmsetup create tmpdev --table '0 2097152 linear /dev/nvme0n1 0'
> 
> $ cat tmp.table
> 0 2097152 linear /dev/mapper/tmpdev 0
> 2097152 2097152 linear /dev/nvme0n1 0
> 
> $ cat tmp.table | dmsetup create testdev
> 
> $ fio -name=test -ioengine=io_uring -iodepth=128 -numjobs=1 -thread \
>     -rw=randread -direct=1 -bs=4k -time_based -runtime=10 -cpus_allowed=6 \
>     -filename=/dev/mapper/testdev -hipri=1
> ```
> 
> 
> BUG: unable to handle page fault for address: ffffffffc01a6208
> #PF: supervisor write access in kernel mode
> #PF: error_code(0x0003) - permissions violation
> PGD 39740c067 P4D 39740c067 PUD 39740e067 PMD 1035db067 PTE 1ddf6f061
> Oops: 0003 [#1] SMP PTI
> CPU: 6 PID: 5899 Comm: fio Tainted: G S 5.13.0-0.1.git.81bcdc3.al7.x86_64 #1
> Hardware name: Inventec     K900G3-10G/B900G3, BIOS A2.20 06/23/2017
> RIP: 0010:dm_submit_bio+0x171/0x3e0 [dm_mod]

This has been fixed in my local repo by initializing ci->submit_as_polled:

@@ -1608,6 +1649,7 @@ static void init_clone_info(struct clone_info *ci, struct mapped_device *md,
        ci->map = map;
        ci->io = alloc_io(md, bio);
        ci->sector = bio->bi_iter.bi_sector;
+       ci->submit_as_polled = false;
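
To illustrate the kind of fix shown in the hunk above, here is a minimal userspace sketch. It assumes the flag was simply left unset before the fix; that is an inference from the one-line diff, not something stated in this thread. `struct clone_info_sketch` and `init_clone_info_sketch()` are hypothetical stand-ins, not the real dm code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for dm's struct clone_info. */
struct clone_info_sketch {
	unsigned long long sector;
	bool submit_as_polled;
};

/*
 * Mirrors the shape of the fix: every field is assigned explicitly, so
 * whatever value the caller's storage held before cannot leak into
 * submit_as_polled.
 */
static void init_clone_info_sketch(struct clone_info_sketch *ci,
				   unsigned long long start_sector)
{
	assert(ci != NULL);
	ci->sector = start_sector;
	ci->submit_as_polled = false;
}
```

After the init call the flag is false regardless of what the structure contained beforehand.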

> 
> 
> 2. Performance Issue
> 
> I tested on both x86 (with only one NVMe) and aarch64 (with multiple NVMes).
> 
> The result (IOPS) on x86 is as expected:
> 
> Type      | IRQ  | Polling
> --------- | ---- | -------
> dm-linear | 239k | 357k
> 
> - dm-linear built upon one NVMe, bs=4k, iopoll=1, iodepth=128,
> numjobs=1, direct, randread, ioengine=io_uring

This data looks good.
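
For reference, the relative gain on x86 works out to roughly 49%. A minimal sketch of that arithmetic, using only the two x86 numbers quoted above (the helper name `polling_gain_pct` is purely illustrative):

```c
#include <assert.h>

/* Relative IOPS gain of polling over IRQ mode, in percent. */
static double polling_gain_pct(double irq_kiops, double poll_kiops)
{
	assert(irq_kiops > 0.0);
	return (poll_kiops - irq_kiops) / irq_kiops * 100.0;
}
```

Calling `polling_gain_pct(239.0, 357.0)` evaluates (357 - 239) / 239, i.e. a little over 49%.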

> 
> 
> 
> The result on aarch64, however, is a little confusing.
> 
> Type          | IRQ  | Polling
> ------------- | ---- | -------
> dm-linear [1] | 208k | 230k
> dm-linear [2] | 637k | 691k
> dm-stripe     | 310k | 354k
> 
> - dm-linear [1] built upon *one* NVMe, bs=4k, iopoll=1, iodepth=128,
> *numjobs=1*, direct, randread, ioengine=io_uring
> - dm-linear [2] built upon *three* NVMes, bs=4k, iopoll=1, iodepth=128,
> *numjobs=3*, direct, randread, ioengine=io_uring
> - dm-stripe built upon *three* NVMes, chunk_size=4k, bs=12k, iopoll=1,
> iodepth=128, numjobs=3, direct, randread, ioengine=io_uring
> 
> 
> Following is the corresponding test result of Leiming's last
> implementation for bio-based polling on aarch64.
> 
> Type          | IRQ  | IOPOLL | ratio
> ------------- | ---- | ------ | -----
> dm-linear [2] | 639K | 835K   | ~30%
> dm-stripe     | 314K | 408K   | ~30%

The previous version polls a hw queue only once when several bios are
submitted to the same hw queue. We might improve this version in the future.
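
As a rough illustration of why polling each hw queue only once helps, here is a userspace toy model, not kernel code. `count_deduped_polls()` and `MAX_HW_QUEUES` are hypothetical names, and the model only counts poll calls; it does not reflect how either patch version is actually implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HW_QUEUES 64	/* arbitrary bound for this sketch */

/*
 * Given the hw queue index of each in-flight bio, return how many poll
 * calls one pass needs if every distinct queue is polled only once.
 * Polling per bio instead would cost nr_bios calls.
 */
static size_t count_deduped_polls(const unsigned int *bio_hwq, size_t nr_bios)
{
	bool seen[MAX_HW_QUEUES] = { false };
	size_t polls = 0;

	for (size_t i = 0; i < nr_bios; i++) {
		assert(bio_hwq[i] < MAX_HW_QUEUES);
		if (!seen[bio_hwq[i]]) {
			seen[bio_hwq[i]] = true;
			polls++;	/* first bio seen on this queue */
		}
	}
	return polls;
}
```

With five bios spread over two hw queues, the model needs two poll calls per pass instead of five.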


Thanks,
Ming

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/dm-devel



