Re: io_uring failure on parisc with VIPT caches

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/15/23 16:56, Jens Axboe wrote:
On 2/15/23 8:52?AM, Helge Deller wrote:
On 2/15/23 16:16, Jens Axboe wrote:
On 2/14/23 7:12?PM, John David Anglin wrote:
On 2023-02-14 6:29 p.m., Jens Axboe wrote:
On 2/14/23 4:09?PM, Helge Deller wrote:
* John David Anglin<dave.anglin@xxxxxxxx>:
On 2023-02-13 5:05 p.m., Helge Deller wrote:
On 2/13/23 22:05, Jens Axboe wrote:
On 2/13/23 1:59?PM, Helge Deller wrote:
Yep sounds like it. What's the caching architecture of parisc?
parisc is Virtually Indexed, Physically Tagged (VIPT).
That's what I assumed, so virtual aliasing is what we're dealing with
here.

Thanks for the patch!
Sadly it doesn't fix the problem, as the kernel still sees
ctx->rings->sq.tail as being 0.
Interestingly it worked once (not reproduceable) directly after bootup,
which indicates that we at least look at the right address from kernel side.

So, still needs more debugging/testing.
It's not like this is untested stuff, so yeah it'll generally be
correct, it just seems that parisc is a bit odd in that the virtual
aliasing occurs between the kernel and userspace addresses too. At least
that's what it seems like.
True.

But I wonder if what needs flushing is the user side, not the kernel
side? Either that, or my patch is not flushing the right thing on the
kernel side.
The patch below seems to fix the issue.

I've successfuly tested it with the io_uring-test testcase on
physical parisc machines with 32- and 64-bit 6.1.11 kernels.

The idea is similiar on how a file is mmapped shared by two
userspace processes by keeping the lower bits of the virtual address
the same.

Cache flushes from userspace don't seem to be needed.
Are they from the kernel side, if the lower bits mean we end up
with the same coloring? Because I think this is a bit of a big
hammer, in terms of overhead for flushing. As an example, on arm64
that is perfectly fine with the existing code, it's about a 20-25%
performance hit.

The io_uring-test testcase still works on rp3440 with the kernel
flushes removed.

That's what I suspected, the important bit here is just aligning it for
identical coloring. Can you confirm if the below works for you? Had to
fiddle it a bit to get it to work without coloring.

Yes, the patch works for me on 32- and 64-bit, even with PA8900 CPUs...

Is there maybe somewhere a more detailled testcase which I could try too?

Just git clone liburing:

git clone git://git.kernel.dk/liburing

and run make && make runtests in there, that'll go through the whole
regression suite.

Thanks!
I'll test.

Some nits below...

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index db623b3185c8..1d4562067949 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -72,6 +72,7 @@
   #include <linux/io_uring.h>
   #include <linux/audit.h>
   #include <linux/security.h>
+#include <asm/shmparam.h>

   #define CREATE_TRACE_POINTS
   #include <trace/events/io_uring.h>
@@ -3200,6 +3201,51 @@ static __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
       return remap_pfn_range(vma, vma->vm_start, pfn, sz, vma->vm_page_prot);
   }

+static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
+            unsigned long addr, unsigned long len,
+            unsigned long pgoff, unsigned long flags)
+{
+    const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
+    struct vm_unmapped_area_info info;
+    void *ptr;
+
+    ptr = io_uring_validate_mmap_request(filp, pgoff, len);
+    if (IS_ERR(ptr))
+        return -ENOMEM;
+
+    /* we do not support requesting a specific address */
+    if (addr)
+        return -EINVAL;

With this ^ we disallow users to provide a proposed address.
I think this is ok and I suggest to keep it that way.

Alternatively one could check the given address against the
alignment which is calculated below, but this will make the
code IMHO unnecessary bigger.

liburing won't provide an address, so I'd say let's just keep it as-is.

Good.

+
+    info.flags = VM_UNMAPPED_AREA_TOPDOWN;
+    info.length = len;
+    info.low_limit = max(PAGE_SIZE, mmap_min_addr);
+    info.high_limit = arch_get_mmap_base(addr, current->mm->mmap_base);
+    info.align_mask = PAGE_MASK;
+    info.align_offset = (unsigned long) ptr;

For parisc I introduced SHM_COLOUR because it allows userspace
to map a shared file initially at any PAGE_SIZE-aligned address.
Only if then a second user maps the same file, the aliasing will be enforced.

Other platforms just have SHMLBA, and for some SHMLBA is > PAGE_SIZE.
So, instead of above code, this untested code might be better for those other
platforms ?
info.align_mask = PAGE_MASK & (SHMLBA - 1);
info.align_offset = (unsigned long)ptr & (SHMLBA - 1);

Yeah, I did peek at SHMLBA as well and it seems more common. Could you
test that and send out a "real" patch so we can get it queued up?

Sure, I'll do.

Helge




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux