Re: [PATCHSET 0/5] User mapped provided buffer rings

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2023-03-16 10:17 p.m., Jens Axboe wrote:
On 3/16/23 8:09?PM, Jens Axboe wrote:
On 3/16/23 1:46?PM, Jens Axboe wrote:
On 3/16/23 1:08?PM, John David Anglin wrote:
On 2023-03-15 5:18 p.m., Jens Axboe wrote:
On 3/15/23 2:38?PM, Jens Axboe wrote:
On 3/15/23 2:07?PM, Helge Deller wrote:
On 3/15/23 21:03, Helge Deller wrote:
Hi Jens,

Thanks for doing those fixes!

On 3/14/23 18:16, Jens Axboe wrote:
One issue that became apparent when running io_uring code on parisc is
that for data shared between the application and the kernel, we must
ensure that it's placed correctly to avoid aliasing issues that render
it useless.

The first patch in this series is from Helge, and ensures that the
SQ/CQ rings are mapped appropriately. This makes io_uring actually work
there.

Patches 2..4 are prep patches for patch 5, which adds a variant of
ring mapped provided buffers that have the kernel allocate the memory
for them and the application mmap() it. This brings these mapped
buffers in line with how the SQ/CQ rings are managed too.

I'm not fully sure if this ONLY impacts archs that set SHM_COLOUR,
of which there is only parisc, or if SHMLBA setting archs (of which
there are others) are impact to any degree as well...
It would be interesting to find out. I'd assume that other arches,
e.g. sparc, might have similiar issues.
Have you tested your patches on other arches as well?
By the way, I've now tested this series on current git head on an
older parisc box (with PA8700 / PCX-W2 CPU).

Results of liburing testsuite:
Tests timed out (1): <send-zerocopy.t> - (may not be a failure)
Tests failed (5): <buf-ring.t> <file-verify.t> <poll-race-mshot.t> <ringbuf-read.t> <send_recvmsg.t>
If you update your liburing git copy, switch to the ring-buf-alloc branch,
then all of the above should work:
With master liburing branch, test/poll-race-mshot.t still crashes my rp3440:
Running test poll-race-mshot.t Bad cqe res -233
Bad cqe res -233
Bad cqe res -233

There is a total lockup with no messages of any kind.

I think the io_uring code needs to reject user supplied ring buffers that are not equivalently mapped
to the corresponding kernel pages.  Don't know if it would be possible to reallocate kernel pages so they
are equivalently mapped.
We can do that, you'd just want to add that check in io_pin_pbuf_ring()
when the pages have been mapped AND we're on an arch that has those
kinds of requirements. Maybe something like the below, totally
untested...

I am puzzled where the crash is coming from, though. It should just hit
the -ENOBUFS case as it can't find a buffer, and that'd terminate that
request. Which does seem to be what is happening above, that is really
no different than an attempt to read/receive from a buffer group that
has no buffers available. So a bit puzzling on what makes your kernel
crash after that has happened, as we do have generic test cases that
exercise that explicitly.


diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index cd1d9dddf58e..73f290aca7f1 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -491,6 +491,15 @@ static int io_pin_pbuf_ring(struct io_uring_buf_reg *reg,
  		return PTR_ERR(pages);
br = page_address(pages[0]);
+#ifdef SHM_COLOUR
+	if ((reg->ring_addr & (unsigned long) br) & SHM_COLOUR) {
& (SHM_COLOUR - 1)) {

of course...
Full version, I think this should do the right thing. If the kernel and
app side isn't aligned on the same SHM_COLOUR boundary, we'll return
-EINVAL rather than setup the ring.

For the ring-buf-alloc branch, this is handled automatically. But we
should, as you mentioned, ensure that the kernel doesn't allow setting
something up that will not work.

Note that this is still NOT related to your hang, I honestly have no
idea what that could be. Unfortunately parisc doesn't have a lot of
debugging aids for this... Could even be a generic kernel issue. I
looked up your rp3440, and it sounds like we have basically the same
setup. I'm running a dual socket PA8900 at 1GHz.
With this change, test/poll-race-mshot.t no longer crashes my rp34404.

Results on master are:
Tests timed out (2): <a4c0b3decb33.t> <send-zerocopy.t>
Tests failed (1): <fd-pass.t>

Running test buf-ring.t 0 sec [0]
Running test poll-race-mshot.t Skipped

Results on ring-buf-alloc are:
Tests timed out (2): <a4c0b3decb33.t> <send-zerocopy.t>
Tests failed (2): <buf-ring.t> <fd-pass.t>

Running test buf-ring.t register buf ring failed -22
test_full_page_reg failed
Test buf-ring.t failed with ret 1
Running test poll-race-mshot.t 4 sec

Without the change, the test/poll-race-mshot.t test causes HPMCs on my rp3440 (two processors).
The front status LED turns red and the event is logged in the hardware system log.  I looked at where
the HPMC occurred but the locations were unrelated to io_uring.

I tried running the test under strace.  With output to console, the test doesn't cause a crash and it more
or less exits normally (need ^C to kill one process).  With output to file, system crashes and file is empty
on reboot.

fd-pass.t fail is new.

I don't think buf-ring.t and send_recvmsg.t actually pass on master with change.  Tests probably need
updating.

The "Bad cqe res -233" messages are gone😁

Aside from additional server related stuff, the rp3440 is architecturally similar to c8000.  Both used PA8800
and PA8900 CPUs.



diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index cd1d9dddf58e..7c6544456f90 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -491,6 +491,15 @@ static int io_pin_pbuf_ring(struct io_uring_buf_reg *reg,
  		return PTR_ERR(pages);
br = page_address(pages[0]);
+#ifdef SHM_COLOUR
+	if ((reg->ring_addr | (unsigned long) br) & (SHM_COLOUR - 1)) {
+		int i;
+
+		for (i = 0; i < nr_pages; i++)
+			unpin_user_page(pages[i]);
+		return -EINVAL;
+	}
+#endif
  	bl->buf_pages = pages;
  	bl->buf_nr_pages = nr_pages;
  	bl->buf_ring = br;


Dave

--
John David Anglin  dave.anglin@xxxxxxxx




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux