On 3/8/21 4:54 AM, JeffleXu wrote:
On 3/6/21 1:56 AM, Heinz Mauelshagen wrote:
On 3/5/21 6:46 PM, Heinz Mauelshagen wrote:
On 3/5/21 10:52 AM, JeffleXu wrote:
On 3/3/21 6:09 PM, Mikulas Patocka wrote:
On Wed, 3 Mar 2021, JeffleXu wrote:
On 3/3/21 3:05 AM, Mikulas Patocka wrote:
Support I/O polling if submit_bio_noacct_mq_direct returned non-empty
cookie.
Signed-off-by: Mikulas Patocka <mpatocka@xxxxxxxxxx>
---
drivers/md/dm.c | 5 +++++
1 file changed, 5 insertions(+)
Index: linux-2.6/drivers/md/dm.c
===================================================================
--- linux-2.6.orig/drivers/md/dm.c 2021-03-02 19:26:34.000000000 +0100
+++ linux-2.6/drivers/md/dm.c 2021-03-02 19:26:34.000000000 +0100
@@ -1682,6 +1682,11 @@ static void __split_and_process_bio(stru
}
}
+ if (ci.poll_cookie != BLK_QC_T_NONE) {
+ while (atomic_read(&ci.io->io_count) > 1 &&
+ blk_poll(ci.poll_queue, ci.poll_cookie, true)) ;
+ }
+
/* drop the extra reference count */
dec_pending(ci.io, errno_to_blk_status(error));
}
It seems that the general idea of your design is to
1) submit *one* split bio
2) blk_poll(), waiting until the previously submitted split bio completes
No, I submit all the bios and poll for the last one.
and then submit the next split bio, repeating the above process. I'm
afraid performance may be an issue here, since the batch that blk_poll()
reaps each time may shrink.
Could you benchmark it?
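To make the two submission patterns under discussion concrete, here is a toy user-space model in Python. It is only a sketch, not kernel code: ToyQueue, submit_one_then_poll and submit_all_then_poll are hypothetical names, and the assumption that a single poll call reaps everything currently in flight is a simplification of what blk_poll() does.

```python
class ToyQueue:
    """Hypothetical stand-in for a polled hardware queue."""

    def __init__(self):
        self.inflight = 0

    def submit(self):
        self.inflight += 1

    def poll(self):
        # Assumption: one poll call reaps every request currently in flight.
        reaped, self.inflight = self.inflight, 0
        return reaped


def submit_one_then_poll(num_splits):
    """Pattern JeffleXu assumed: poll after each submitted split bio."""
    queue, batches = ToyQueue(), []
    for _ in range(num_splits):
        queue.submit()
        batches.append(queue.poll())
    return batches


def submit_all_then_poll(num_splits):
    """Pattern Mikulas describes: submit everything, then poll until done."""
    queue = ToyQueue()
    io_count = 1  # extra reference held by the submitter, as in the patch
    for _ in range(num_splits):
        io_count += 1
        queue.submit()
    batches = []
    while io_count > 1:
        reaped = queue.poll()
        if reaped == 0:
            break  # nothing reaped; mirrors the loop ending when blk_poll() returns 0
        io_count -= reaped
        batches.append(reaped)
    return batches


print(submit_one_then_poll(3))
print(submit_all_then_poll(3))
```

In this model the first pattern reaps three batches of one request each, while the second reaps a single batch of three, which is the batching concern raised above.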
I only tested dm-linear.
The configuration (dm table) of dm-linear is:
0 1048576 linear /dev/nvme0n1 0
1048576 1048576 linear /dev/nvme2n1 0
2097152 1048576 linear /dev/nvme5n1 0
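For readers less familiar with dm-linear tables, the mapping above can be sketched in a few lines of Python. Each row is (start sector, length in sectors, backing device, offset on that device), and this sketch assumes the usual 512-byte device-mapper sector. TABLE, SECTOR_SIZE and map_sector are hypothetical names used only for illustration.

```python
SECTOR_SIZE = 512  # assumption: device-mapper tables count 512-byte sectors

# (start, length, backing device, offset), copied from the table above
TABLE = [
    (0,       1048576, "/dev/nvme0n1", 0),
    (1048576, 1048576, "/dev/nvme2n1", 0),
    (2097152, 1048576, "/dev/nvme5n1", 0),
]


def map_sector(sector):
    """Return (device, device_sector) for a sector of the mapped device."""
    for start, length, device, offset in TABLE:
        if start <= sector < start + length:
            return device, offset + (sector - start)
    raise ValueError(f"sector {sector} is outside the mapped range")


def segment_size_mib(length):
    """Size of one table segment in MiB."""
    return length * SECTOR_SIZE // (1024 * 1024)


print(map_sector(1048576))
print(segment_size_mib(1048576))
```

Under that assumption each segment is 512 MiB, so a random 4k read workload over the whole device spreads across all three NVMe drives.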
fio script used is:
```
$cat fio.conf
[global]
name=iouring-sqpoll-iopoll-1
ioengine=io_uring
iodepth=128
numjobs=1
thread
rw=randread
direct=1
registerfiles=1
hipri=1
runtime=10
time_based
group_reporting
randrepeat=0
filename=/dev/mapper/testdev
bs=4k
[job-1]
cpus_allowed=14
```
IOPS (IRQ mode) | IOPS (iopoll mode (hipri=1))
--------------- | --------------------
213k | 19k
At least, it doesn't work well with the io_uring interface.
Jeffle,
I ran your fio test above on a linear LV split across 3 NVMes to
mirror your split mapping (system: 32-core Intel, 256 GiB RAM),
comparing the I/O engines sync, libaio and io_uring, the latter with and
without hipri (sync and libaio obviously without registerfiles and
hipri). The results look OK:
sync | libaio | IRQ mode (hipri=0) | iopoll (hipri=1)
-------|----------|---------------------|-----------------
56.3K | 290K | 329K | 351K
I can't reproduce your drastic hipri=1 drop here...
Hummm, that's indeed somewhat strange...
My test environment:
- CPU: 128 cores, though only one CPU core is used since
'cpus_allowed=14' in fio configuration
- memory: 983G memory free
- NVMe: Huawei ES3510P (HWE52P434T0L005N), with 'nvme.poll_queues=3'
Maybe you didn't specify 'nvme.poll_queues=XXX'? In that case, I/O still
goes through IRQ mode, even if you have specified 'hipri=1'.
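One way to check this before benchmarking is to read the relevant sysfs attributes. The Python sketch below assumes the nvme module exposes its poll_queues parameter under /sys/module/nvme/parameters/ and that the block queue exposes io_poll under /sys/block/<disk>/queue/; both paths, and the helper names read_int and polling_status, are assumptions for illustration and may differ by kernel version.

```python
from pathlib import Path


def read_int(path):
    """Return the integer stored in a sysfs-style file, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def polling_status(disk="nvme0n1"):
    """Collect the settings that decide whether hipri I/O is actually polled."""
    return {
        # Assumed path: number of poll queues the nvme driver was loaded with.
        "nvme.poll_queues": read_int("/sys/module/nvme/parameters/poll_queues"),
        # Assumed path: whether polling is enabled on this request queue.
        "queue/io_poll": read_int(f"/sys/block/{disk}/queue/io_poll"),
    }


print(polling_status())
```

A value of 0 or None for nvme.poll_queues would suggest that hipri=1 requests are still completed via interrupts, which would make the two columns of a comparison measure the same thing.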
Jeffle,
nvme.poll_queues was zero indeed.
With nvme.poll_queues=3, the results for hipri=0 / hipri=1 are 1699K / 1702K IOPS (all cores)
Single core results : 315K / 329K
Still no extreme drop...
FWIW:
Thanks,
Heinz