On 2/15/23 8:52?AM, Helge Deller wrote: > On 2/15/23 16:16, Jens Axboe wrote: >> On 2/14/23 7:12?PM, John David Anglin wrote: >>> On 2023-02-14 6:29 p.m., Jens Axboe wrote: >>>> On 2/14/23 4:09?PM, Helge Deller wrote: >>>>> * John David Anglin<dave.anglin@xxxxxxxx>: >>>>>> On 2023-02-13 5:05 p.m., Helge Deller wrote: >>>>>>> On 2/13/23 22:05, Jens Axboe wrote: >>>>>>>> On 2/13/23 1:59?PM, Helge Deller wrote: >>>>>>>>>> Yep sounds like it. What's the caching architecture of parisc? >>>>>>>>> parisc is Virtually Indexed, Physically Tagged (VIPT). >>>>>>>> That's what I assumed, so virtual aliasing is what we're dealing with >>>>>>>> here. >>>>>>>> >>>>>>>>> Thanks for the patch! >>>>>>>>> Sadly it doesn't fix the problem, as the kernel still sees >>>>>>>>> ctx->rings->sq.tail as being 0. >>>>>>>>> Interestingly it worked once (not reproduceable) directly after bootup, >>>>>>>>> which indicates that we at least look at the right address from kernel side. >>>>>>>>> >>>>>>>>> So, still needs more debugging/testing. >>>>>>>> It's not like this is untested stuff, so yeah it'll generally be >>>>>>>> correct, it just seems that parisc is a bit odd in that the virtual >>>>>>>> aliasing occurs between the kernel and userspace addresses too. At least >>>>>>>> that's what it seems like. >>>>>>> True. >>>>>>> >>>>>>>> But I wonder if what needs flushing is the user side, not the kernel >>>>>>>> side? Either that, or my patch is not flushing the right thing on the >>>>>>>> kernel side. >>>>> The patch below seems to fix the issue. >>>>> >>>>> I've successfuly tested it with the io_uring-test testcase on >>>>> physical parisc machines with 32- and 64-bit 6.1.11 kernels. >>>>> >>>>> The idea is similiar on how a file is mmapped shared by two >>>>> userspace processes by keeping the lower bits of the virtual address >>>>> the same. >>>>> >>>>> Cache flushes from userspace don't seem to be needed. >>>> Are they from the kernel side, if the lower bits mean we end up >>>> with the same coloring? Because I think this is a bit of a big >>>> hammer, in terms of overhead for flushing. As an example, on arm64 >>>> that is perfectly fine with the existing code, it's about a 20-25% >>>> performance hit. >>> >>> The io_uring-test testcase still works on rp3440 with the kernel >>> flushes removed. >> >> That's what I suspected, the important bit here is just aligning it for >> identical coloring. Can you confirm if the below works for you? Had to >> fiddle it a bit to get it to work without coloring. > > Yes, the patch works for me on 32- and 64-bit, even with PA8900 CPUs... > > Is there maybe somewhere a more detailled testcase which I could try too? Just git clone liburing: git clone git://git.kernel.dk/liburing and run make && make runtests in there, that'll go through the whole regression suite. > Some nits below... > >> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c >> index db623b3185c8..1d4562067949 100644 >> --- a/io_uring/io_uring.c >> +++ b/io_uring/io_uring.c >> @@ -72,6 +72,7 @@ >> #include <linux/io_uring.h> >> #include <linux/audit.h> >> #include <linux/security.h> >> +#include <asm/shmparam.h> >> >> #define CREATE_TRACE_POINTS >> #include <trace/events/io_uring.h> >> @@ -3200,6 +3201,51 @@ static __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma) >> return remap_pfn_range(vma, vma->vm_start, pfn, sz, vma->vm_page_prot); >> } >> >> +static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp, >> + unsigned long addr, unsigned long len, >> + unsigned long pgoff, unsigned long flags) >> +{ >> + const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags); >> + struct vm_unmapped_area_info info; >> + void *ptr; >> + >> + ptr = io_uring_validate_mmap_request(filp, pgoff, len); >> + if (IS_ERR(ptr)) >> + return -ENOMEM; >> + >> + /* we do not support requesting a specific address */ >> + if (addr) >> + return -EINVAL; > > With this ^ we disallow users to provide a proposed address. > I think this is ok and I suggest to keep it that way. > > Alternatively one could check the given address against the > alignment which is calculated below, but this will make the > code IMHO unnecessary bigger. liburing won't provide an address, so I'd say let's just keep it as-is. >> + >> + info.flags = VM_UNMAPPED_AREA_TOPDOWN; >> + info.length = len; >> + info.low_limit = max(PAGE_SIZE, mmap_min_addr); >> + info.high_limit = arch_get_mmap_base(addr, current->mm->mmap_base); >> + info.align_mask = PAGE_MASK; >> + info.align_offset = (unsigned long) ptr; > > For parisc I introduced SHM_COLOUR because it allows userspace > to map a shared file initially at any PAGE_SIZE-aligned address. > Only if then a second user maps the same file, the aliasing will be enforced. > > Other platforms just have SHMLBA, and for some SHMLBA is > PAGE_SIZE. > So, instead of above code, this untested code might be better for those other > platforms ? > info.align_mask = PAGE_MASK & (SHMLBA - 1); > info.align_offset = (unsigned long)ptr & (SHMLBA - 1); Yeah, I did peek at SHMLBA as well and it seems more common. Could you test that and send out a "real" patch so we can get it queued up? > this is ok -> >> +#ifdef SHM_COLOUR >> + info.align_mask &= (SHM_COLOUR - 1); >> + info.align_offset &= (SHM_COLOUR - 1) > > ^^ misses a ";" at the end. Oops indeed. -- Jens Axboe