Re: [PATCH for-next v3 0/4] fixed-buffer for uring-cmd/passthrough

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Sep 02, 2022 at 03:25:33PM -0600, Jens Axboe wrote:
On 9/2/22 1:32 PM, Jens Axboe wrote:
On 9/2/22 12:46 PM, Kanchan Joshi wrote:
On Fri, Sep 02, 2022 at 10:32:16AM -0600, Jens Axboe wrote:
On 9/2/22 10:06 AM, Jens Axboe wrote:
On 9/2/22 9:16 AM, Kanchan Joshi wrote:
Hi,

Currently uring-cmd lacks the ability to leverage the pre-registered
buffers. This series adds the support in uring-cmd, and plumbs
nvme passthrough to work with it.

Using registered-buffers showed peak-perf hike from 1.85M to 2.17M IOPS
in my setup.

Without fixedbufs
*****************
# taskset -c 0 t/io_uring -b512 -d128 -c32 -s32 -p0 -F1 -B0 -O0 -n1 -u1 /dev/ng0n1
submitter=0, tid=5256, file=/dev/ng0n1, node=-1
polled=0, fixedbufs=0/0, register_files=1, buffered=1, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=1.85M, BW=904MiB/s, IOS/call=32/31
IOPS=1.85M, BW=903MiB/s, IOS/call=32/32
IOPS=1.85M, BW=902MiB/s, IOS/call=32/32
^CExiting on signal
Maximum IOPS=1.85M

With the poll support queued up, I ran this one as well. tldr is:

bdev (non pt)??? 122M IOPS
irq driven??? 51-52M IOPS
polled??????? 71M IOPS
polled+fixed??? 78M IOPS

except first one, rest three entries are for passthru? somehow I didn't
see that big of a gap. I will try to align my setup in coming days.

Right, sorry it was badly labeled. First one is bdev with polling,
registered buffers, etc. The others are all the passthrough mode. polled
goes to 74M with the caching fix, so it's about a 74M -> 82M bump using
registered buffers with passthrough and polling.

polled+fixed??? 82M

I suspect the remainder is due to the lack of batching on the request
freeing side, at least some of it. Haven't really looked deeper yet.

One issue I saw - try and use passthrough polling without having any
poll queues defined and it'll stall just spinning on completions. You
need to ensure that these are processed as well - look at how the
non-passthrough io_uring poll path handles it.

Had tested this earlier, and it used to run fine. And it does not now.
I see that io are getting completed, irq-completion is arriving in nvme
and it is triggering task-work based completion (by calling
io_uring_cmd_complete_in_task). But task-work never got called and
therefore no completion happened.

io_uring_cmd_complete_in_task -> io_req_task_work_add -> __io_req_task_work_add

Seems task work did not get added. Something about newly added
IORING_SETUP_DEFER_TASKRUN changes the scenario.

static inline void __io_req_task_work_add(struct io_kiocb *req, bool allow_local)
{
?????? struct io_uring_task *tctx = req->task->io_uring;
?????? struct io_ring_ctx *ctx = req->ctx;
?????? struct llist_node *node;

?????? if (allow_local && ctx->flags & IORING_SETUP_DEFER_TASKRUN) {
?????????????? io_req_local_work_add(req);
?????????????? return;
?????? }
????....

To confirm, I commented that in t/io_uring and it runs fine.
Please see if that changes anything for you? I will try to find the
actual fix tomorow.

Ah gotcha, yes that actually makes a lot of sense. I wonder if regular
polling is then also broken without poll queues if
IORING_SETUP_DEFER_TASKRUN is set. It should be, I'll check into
io_iopoll_check().

A mix of fixes and just cleanups, here's what I got.

Thanks, this looks much better. Just something to discuss on the fix
though. Will use other thread for that.





[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux