Re: [PATCH, V3] parisc: Rewrite cache flush code for PA8800/PA8900

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 5/17/22 16:26, John David Anglin wrote:
> On 2022-05-17 9:24 a.m., Helge Deller wrote:
>> On 5/17/22 15:19, Rolf Eike Beer wrote:
>>> Am Montag, 16. Mai 2022, 23:49:10 CEST schrieb Helge Deller:
>>>> On 5/16/22 23:28, Rolf Eike Beer wrote:
>>>>> Am Montag, 16. Mai 2022, 17:14:47 CEST schrieb John David Anglin:
>>>>>> Originally, I was convinced that we needed to use tmpalias flushes
>>>>>> everwhere, for both user and kernel flushes. However, when I modified
>>>>>> flush_kernel_dcache_page_addr, to use a tmpalias flush, my c8000
>>>>>> would crash quite early when booting.
>>>>>>
>>>>>> The PDC returns alias values of 0 for the icache and dcache. This
>>>>>> indicates that either the alias boundary is greater than 16MB or
>>>>>> equivalent aliasing doesn't work. I modified the tmpalias code to
>>>>>> make it easy to try alternate boundaries. I tried boundaries up to
>>>>>> 128MB but still kernel tmpalias flushes didn't work on c8000.
>>>>>>
>>>>>> This led me to conclude that tmpalias flushes don't work on PA8800
>>>>>> and PA8900 machines, and that we needed to flush directly using the
>>>>>> virtual address of user and kernel pages. This is likely the major
>>>>>> cause of instability on the c8000 and rp34xx machines.
>>>>>>
>>>>>> Flushing user pages requires doing a temporary context switch as we
>>>>>> have to flush pages that don't belong to the current context. Further,
>>>>>> we have to deal with pages that aren't present. If a page isn't
>>>>>> present, the flush instructions fault on every line.
>>>>>>
>>>>>> Other code has been rearranged and simplified based on testing. For
>>>>>> example, I introduced a flush_cache_dup_mm routine. flush_cache_mm
>>>>>> and flush_cache_dup_mm differ in that flush_cache_mm calls
>>>>>> purge_cache_pages and flush_cache_dup_mm calls flush_cache_pages.
>>>>>> In some implementations, pdc is more efficient than fdc. Based on
>>>>>> my testing, I don't believe there's any performance benefit on the
>>>>>> c8000.
>>>>>>
>>>>>> V2:
>>>>>> 1) Add flush_cache_page_check_pte routine.
>>>>>> 2) Use it in copy_to_user_page and copy_from_user_page.
>>>>>> 3) flush_anon_page moved to cache.c and updated.
>>>>>> 4) Changed alias boundary to 64 MB for 64-bit kernels. Revised comment
>>>>>>
>>>>>>     regarding alias boundary for PA8800/PA8900 processors.
>>>>>>
>>>>>> 5) Removed struct mm_struct * argument from flush_cache_pages.
>>>>>> 6) Fixed thinko in flush_cache_range. It increased the number of pages
>>>>>>
>>>>>>     flushed and slowed performance.
>>>>>>
>>>>>> 7) Removed sync changes from pacache.S.
>>>>>>
>>>>>> V3:
>>>>>> 1) copy_to_user_page and copy_from_user_page moved to cache.c to
>>>>>>
>>>>>>     improve inlining.
>>>>>>
>>>>>> 2) Replaced copy_user_page with copy_user_highpage.
>>>>>> 3) Fixed cache threshold calculation on 32-bit kernels.
>>>>>> 4) Don't warn on inequivalent private mappings in flush_dcache_page.
>>>>>> 5) Return early from mm_total_size if size exceeds
>>>>>>
>>>>>>     parisc_cache_flush_threshold.
>>>>>>
>>>>>> 6) Disable non-access TLB warning in handle_nadtlb_fault. Warning
>>>>>>
>>>>>>     happens occassionally handling flushes for COW faults.
>>>>>>
>>>>>> 7) Remove flush_cache_dup_mm.
>>>>>> 8) Flush entire cache in flush_cache_mm and flush_cache_range on
>>>>>>
>>>>>>     processors with aliasing caches. Only flush small cache ranges
>>>>>>     on machines with PA8800/PA8900 processors.
>>>>>>
>>>>>> 9) Tested on rp3440, c8000 and c3750.
>>>>> Given how long these changelogs are, and how fragile the whole caching is
>>>>> I
>>>>> think it is a good idea to split this patch into smaller ones, to improve
>>>>> readability and being able to bisect it.
>>>> FWIW, I've done some cleanups to this patch and committed it to my for-next
>>>> tree. In case it's split up, please use the revised version.
>>> Why did you modify get_ptep()? Until now it was just moved around in the file,
>>> and IMHO it becomes less readable because all these needless variables are
>>> batched up at the start of the function now. The only point I would see in
>>> moving them all to the front is if there would be no nesting anymore, and
>>> every condition was inverted:
>> Dave's original patch did not moved the variables to the beginning.
>> That change was me - just because checkpatch complained otherwise.
>>
>> I agree that it's less readable.
> The get_ptep code was originally based on the vmalloc_to_page code in vmalloc.c.
> vmalloc_to_page code.  This code has now changed, so I think get_ptep needs updating.
> See get_old_pud in mremap.c.  It looks good to me.

So, what's the consense of this discussion now?

I can easily split out the pr_warn("WARNING").
Moving the get_ptep() back to the original place seems ok, and I'll keep
the strange indenting which checkpatch want.
Is that ok?

Helge




[Index of Archives]     [Linux SoC]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux