Re: io_uring failure on parisc with VIPT caches

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/15/23 8:52?AM, Helge Deller wrote:
> On 2/15/23 16:16, Jens Axboe wrote:
>> On 2/14/23 7:12?PM, John David Anglin wrote:
>>> On 2023-02-14 6:29 p.m., Jens Axboe wrote:
>>>> On 2/14/23 4:09?PM, Helge Deller wrote:
>>>>> * John David Anglin<dave.anglin@xxxxxxxx>:
>>>>>> On 2023-02-13 5:05 p.m., Helge Deller wrote:
>>>>>>> On 2/13/23 22:05, Jens Axboe wrote:
>>>>>>>> On 2/13/23 1:59?PM, Helge Deller wrote:
>>>>>>>>>> Yep sounds like it. What's the caching architecture of parisc?
>>>>>>>>> parisc is Virtually Indexed, Physically Tagged (VIPT).
>>>>>>>> That's what I assumed, so virtual aliasing is what we're dealing with
>>>>>>>> here.
>>>>>>>>
>>>>>>>>> Thanks for the patch!
>>>>>>>>> Sadly it doesn't fix the problem, as the kernel still sees
>>>>>>>>> ctx->rings->sq.tail as being 0.
>>>>>>>>> Interestingly it worked once (not reproduceable) directly after bootup,
>>>>>>>>> which indicates that we at least look at the right address from kernel side.
>>>>>>>>>
>>>>>>>>> So, still needs more debugging/testing.
>>>>>>>> It's not like this is untested stuff, so yeah it'll generally be
>>>>>>>> correct, it just seems that parisc is a bit odd in that the virtual
>>>>>>>> aliasing occurs between the kernel and userspace addresses too. At least
>>>>>>>> that's what it seems like.
>>>>>>> True.
>>>>>>>
>>>>>>>> But I wonder if what needs flushing is the user side, not the kernel
>>>>>>>> side? Either that, or my patch is not flushing the right thing on the
>>>>>>>> kernel side.
>>>>> The patch below seems to fix the issue.
>>>>>
>>>>> I've successfuly tested it with the io_uring-test testcase on
>>>>> physical parisc machines with 32- and 64-bit 6.1.11 kernels.
>>>>>
>>>>> The idea is similiar on how a file is mmapped shared by two
>>>>> userspace processes by keeping the lower bits of the virtual address
>>>>> the same.
>>>>>
>>>>> Cache flushes from userspace don't seem to be needed.
>>>> Are they from the kernel side, if the lower bits mean we end up
>>>> with the same coloring? Because I think this is a bit of a big
>>>> hammer, in terms of overhead for flushing. As an example, on arm64
>>>> that is perfectly fine with the existing code, it's about a 20-25%
>>>> performance hit.
>>>
>>> The io_uring-test testcase still works on rp3440 with the kernel
>>> flushes removed.
>>
>> That's what I suspected, the important bit here is just aligning it for
>> identical coloring. Can you confirm if the below works for you? Had to
>> fiddle it a bit to get it to work without coloring.
> 
> Yes, the patch works for me on 32- and 64-bit, even with PA8900 CPUs...
> 
> Is there maybe somewhere a more detailled testcase which I could try too?

Just git clone liburing:

git clone git://git.kernel.dk/liburing

and run make && make runtests in there, that'll go through the whole
regression suite.

> Some nits below...
> 
>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>> index db623b3185c8..1d4562067949 100644
>> --- a/io_uring/io_uring.c
>> +++ b/io_uring/io_uring.c
>> @@ -72,6 +72,7 @@
>>   #include <linux/io_uring.h>
>>   #include <linux/audit.h>
>>   #include <linux/security.h>
>> +#include <asm/shmparam.h>
>>
>>   #define CREATE_TRACE_POINTS
>>   #include <trace/events/io_uring.h>
>> @@ -3200,6 +3201,51 @@ static __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
>>       return remap_pfn_range(vma, vma->vm_start, pfn, sz, vma->vm_page_prot);
>>   }
>>
>> +static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
>> +            unsigned long addr, unsigned long len,
>> +            unsigned long pgoff, unsigned long flags)
>> +{
>> +    const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
>> +    struct vm_unmapped_area_info info;
>> +    void *ptr;
>> +
>> +    ptr = io_uring_validate_mmap_request(filp, pgoff, len);
>> +    if (IS_ERR(ptr))
>> +        return -ENOMEM;
>> +
>> +    /* we do not support requesting a specific address */
>> +    if (addr)
>> +        return -EINVAL;
> 
> With this ^ we disallow users to provide a proposed address.
> I think this is ok and I suggest to keep it that way.
> 
> Alternatively one could check the given address against the
> alignment which is calculated below, but this will make the
> code IMHO unnecessary bigger.

liburing won't provide an address, so I'd say let's just keep it as-is.

>> +
>> +    info.flags = VM_UNMAPPED_AREA_TOPDOWN;
>> +    info.length = len;
>> +    info.low_limit = max(PAGE_SIZE, mmap_min_addr);
>> +    info.high_limit = arch_get_mmap_base(addr, current->mm->mmap_base);
>> +    info.align_mask = PAGE_MASK;
>> +    info.align_offset = (unsigned long) ptr;
> 
> For parisc I introduced SHM_COLOUR because it allows userspace
> to map a shared file initially at any PAGE_SIZE-aligned address.
> Only if then a second user maps the same file, the aliasing will be enforced.
> 
> Other platforms just have SHMLBA, and for some SHMLBA is > PAGE_SIZE.
> So, instead of above code, this untested code might be better for those other
> platforms ?
> info.align_mask = PAGE_MASK & (SHMLBA - 1);
> info.align_offset = (unsigned long)ptr & (SHMLBA - 1);

Yeah, I did peek at SHMLBA as well and it seems more common. Could you
test that and send out a "real" patch so we can get it queued up?

> this is ok ->
>> +#ifdef SHM_COLOUR
>> +    info.align_mask &= (SHM_COLOUR - 1);
>> +    info.align_offset &= (SHM_COLOUR - 1)
> 
> ^^ misses a ";" at the end.

Oops indeed.

-- 
Jens Axboe




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux