Re: [PATCH 0/8] Add support for vectored registered buffers

Pavel Begunkov <asml.silence@xxxxxxxxx> · Tue, 4 Mar 2025 10:21:26 +0000

On 3/3/25 21:03, Andres Freund wrote:
Hi,

On 2025-03-03 15:50:55 +0000, Pavel Begunkov wrote:
Add registered buffer support for vectored io_uring operations. That
allows passing an iovec, all entries of which must belong to and
point into the same registered buffer specified by sqe->buf_index.
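
To make the constraint above concrete, here is a minimal userspace sketch
of the rule that every iovec entry has to fall inside one registered
buffer. The helper name iovec_within_buffer() is hypothetical; it is not
part of the kernel or liburing, and it only mirrors the stated rule rather
than the series' actual validation code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/*
 * Hypothetical helper (not a kernel or liburing function): returns true
 * only if every segment of the iovec lies entirely inside the single
 * buffer [buf, buf + buf_len), i.e. the buffer that would have been
 * registered and selected via sqe->buf_index.
 */
static bool iovec_within_buffer(const struct iovec *iov, size_t nr_segs,
                                const void *buf, size_t buf_len)
{
    uintptr_t start = (uintptr_t)buf;

    for (size_t i = 0; i < nr_segs; i++) {
        uintptr_t seg = (uintptr_t)iov[i].iov_base;
        size_t len = iov[i].iov_len;

        /* Segment starts before the buffer. */
        if (seg < start)
            return false;

        /* Overflow-safe check that the segment ends inside the buffer. */
        size_t off = seg - start;
        if (off > buf_len || len > buf_len - off)
            return false;
    }
    return true;
}
```

An application could run such a check before submitting a request, so that
an iovec mixing registered and unregistered memory is caught early rather
than surfacing as a failed completion.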

This is very much appreciated!

Glad to hear. I do remember you mentioning the contention issue
on the list. A bunch of other people were interested as well.

The series covers zerocopy sendmsg and reads / writes. Reads and
writes are implemented as new opcodes, while zerocopy sendmsg
reuses IORING_RECVSEND_FIXED_BUF for the api.

Results are in line with what one would expect from registered buffers:

t/io_uring + nullblk, single segment 16K:
   34 -> 46 GiB/s

FWIW, I'd expect bigger wins with real IO when using 1GB huge pages. I

I didn't even benchmark it meaningfully as we should be able to
extrapolate results from the registered buffer test, but I agree, such
contention might make it even more desirable.

encountered when there were a lot of reads from a large nvme raid into a small
set of shared huge pages (database buffer pool), by many processes
concurrently. The constant pinning/unpinning of the relevant folio caused a
lot of contention.

Unfortunately switching to registered buffers would, until now, have required
using non-vectored IO, which causes significant performance regressions in
other cases...

--
Pavel Begunkov