On 2019-06-02 10:38 a.m., John David Anglin wrote:
> On 2019-05-31 8:23 a.m., John David Anglin wrote:
>> On 2019-05-30 4:59 p.m., Sven Schnelle wrote:
>>> Hi,
>>>
>>> On Thu, May 30, 2019 at 09:55:43PM +0200, Sven Schnelle wrote:
>>>> Hi,
>>>>
>>>> On Wed, May 29, 2019 at 04:15:03PM +0200, Helge Deller wrote:
>>>>>>> Exactly. And as:
>>>>>>>
>>>>>>> a) All C3600 PDC versions clear the NP bit
>>>>>>> b) All C37XX/J5000 PDC version set the NP bit
>>>>>>>
>>>>>>> i don't think there's some bug in the PDC. I would guess that the patch Carlo
>>>>>>> reported to fix issues is just hiding the real problem. Would be interesting
>>>>>>> to run Carlo's Test on a C37XX.
>>>>>> Probably, hardware cache coherent I/O is not implemented correctly for Elroy based systems.
>>>>>> https://www.hpl.hp.com/hpjournal/96feb/feb96a6.pdf
>>>>>> Does it work on C360?
>>>>> I slowly start to get confused...
>>>>> Just thinking about another possibility: Maybe we can rely on the value of the
>>>>> NP iopdir_fdc bit only on machines with >= PA8700 CPUs?
>>>>> For older machines (which would need opdir_fdc) HP-UX or other operating
>>>>> systems decides on the found CPU.
>>>>> This would explain why it's not set on Carlo's C3600, and if Sven's C240
>>>>> (with a PA8200 CPU) doesn't has the bit set too, then this could explain this theory.
>>>> I just re-tested my kexec branch, and the HPMC i was seeing when kexec'ing a new
>>>> kernel on my J5000 is now gone with Helge's patch. J5000 also has PCX-W. It was
>>>> only triggered when i had SMP enabled, but this is somehow not suprising given
>>>> the fact that a cache flush was missing.
>>> Looks like i'm also confused now. My J5000 crashed with the kexec stuff again.
>>> It's much less than before, only 1 out of 10 times.
>>>
>>> The patch does:
>>>
>>>     if ((cond & ALT_COND_NO_IOC_FDC) &&
>>>         ((boot_cpu_data.cpu_type < pcxw) ||
>>>          (boot_cpu_data.cpu_type == pcxw_) ||
>>>          (boot_cpu_data.pdc.capabilities & PDC_MODEL_IOPDIR_FDC)))
>>>             continue;
>>>
>>> So there should be no change for PCX-W and my statement that this fixes anything
>>> on my J5000 is wrong. I think i'll disable the patching and see whether the problem
>>> disappears.
>> Is it possible that we are running in a mode where the cache/TLB does not issue coherent
>> operations? There is a PDC_CACHE call to set the coherence state.
>
> I checked the machines that I have and they all have coherent caches and TLBs. I think
> flush and sync are required on all machines with write-back caches. This makes write
> visible to I/O adapter (memory). The c3600 has a write-back data cache. See "PDC Procedures"
> page 4-21.
>
> This might be affected by the TLB U bit. Possibly, the U bit is not set for pages in the
> I/O address region (IO-PDIR) and we need flush/sync as a result.
>
Possibly, this change will fix the alternative coding NP IO-PDIR optimization
on machines that don't need to flush/sync.

It's boot tested on c8000 but it needs testing on c3600 and j5000, etc, to
make sure it resolves the issue on machines that don't need to flush/sync
the IO-PDIR.

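For anyone skimming the diff, here is a minimal standalone sketch of how the
page protection bits are composed before and after the change. It is not
kernel code: the SK_* bit values and the sk_* helper names are made up for
illustration, the computation of "end" as phys_addr + size - 1 is an
assumption (the hunk context does not show it), and the range check is
assumed to be compiled in (in the kernel it sits inside an #ifdef block).

```c
#include <stdbool.h>

/* Hypothetical stand-in bit values, for illustration only. The kernel's
 * real _PAGE_* definitions are different. */
#define SK_PAGE_PRESENT   0x01UL
#define SK_PAGE_RW        0x02UL
#define SK_PAGE_DIRTY     0x04UL
#define SK_PAGE_ACCESSED  0x08UL
#define SK_PAGE_NO_CACHE  0x10UL

#define SK_BASE_PROT (SK_PAGE_PRESENT | SK_PAGE_RW | \
		      SK_PAGE_DIRTY | SK_PAGE_ACCESSED)

/* Address-range test copied from the context lines of the first hunk. */
static bool sk_in_special_range(unsigned long phys_addr, unsigned long end)
{
	return (phys_addr >= 0x00080000 && end < 0x000fffff) ||
	       (phys_addr >= 0x00500000 && end < 0x03bfffff);
}

/* Before the change: the no-cache bit is OR-ed in only when the address
 * range matches (or when the caller already passed it in flags). */
static unsigned long sk_prot_before(unsigned long phys_addr,
				    unsigned long size, unsigned long flags)
{
	unsigned long end = phys_addr + size - 1;	/* assumed definition */

	if (sk_in_special_range(phys_addr, end))
		flags |= SK_PAGE_NO_CACHE;
	return SK_BASE_PROT | flags;
}

/* After the change: the no-cache bit is set for every mapping,
 * regardless of the address range or the caller's flags. */
static unsigned long sk_prot_after(unsigned long phys_addr,
				   unsigned long size, unsigned long flags)
{
	(void)phys_addr;
	(void)size;
	return SK_BASE_PROT | SK_PAGE_NO_CACHE | flags;
}
```

With flags == 0, sk_prot_before() sets the no-cache bit only for a range such
as 0x00090000 + 0x1000, while sk_prot_after() sets it for any address. For
callers that already pass the no-cache flag the two give the same result.
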
Dave

diff --git a/arch/parisc/mm/ioremap.c b/arch/parisc/mm/ioremap.c
index 92a9b5f12f98..9baea70e38c4 100644
--- a/arch/parisc/mm/ioremap.c
+++ b/arch/parisc/mm/ioremap.c
@@ -38,7 +38,6 @@ void __iomem * __ioremap(unsigned long phys_addr, unsigned long size, unsigned l
 	if ((phys_addr >= 0x00080000 && end < 0x000fffff) ||
 	    (phys_addr >= 0x00500000 && end < 0x03bfffff)) {
 		phys_addr |= F_EXTEND(0xfc000000);
-		flags |= _PAGE_NO_CACHE;
 	}
 #endif
 
@@ -65,7 +64,7 @@ void __iomem * __ioremap(unsigned long phys_addr, unsigned long size, unsigned l
 	}
 
 	pgprot = __pgprot(_PAGE_PRESENT | _PAGE_RW | _PAGE_DIRTY |
-			  _PAGE_ACCESSED | flags);
+			  _PAGE_ACCESSED | _PAGE_NO_CACHE | flags);
 
 	/*
 	 * Mappings have to be page-aligned