Re: [parisc] A500 boot crash with 44786880df196a4200c178945c4d41675faf9fb7

On 24.10.2018 19:29, John David Anglin wrote:
> On 2018-10-24 11:45 AM, Helge Deller wrote:
>> On 24.10.2018 17:03, John David Anglin wrote:
>>> On 2018-10-24 7:59 AM, John David Anglin wrote:
>>>> The fault occurred while executing the instruction "stw r31,0(r25)". Register r31 contains the
>>>> instruction "pdtlb,l r0(sr1,r3)".  This indicates the fault occurred during alternative patching.
>>>>
>>>> I suspect all kernel TLB entries need to be flushed prior to alternative patching to ensure that kernel
>>>> pages are writeable.
>>> Looks like this is a problem with set_kernel_text_rw().  Maybe this causes problems:
>>>
>>> int __flush_tlb_range(unsigned long sid, unsigned long start,
>>>                        unsigned long end)
>>> {
>>>          unsigned long flags;
>>>
>>>          if ((!IS_ENABLED(CONFIG_SMP) || !arch_irqs_disabled()) &&
>>>              end - start >= parisc_tlb_flush_threshold) {
>>>                  flush_tlb_all();
>>>                  return 1;
>>>          }
>>>
>>> I believe that we need to disable this optimization until the parisc_tlb_flush_threshold is
>>> calculated.  I think this crash is related to the occasional crash in parisc_setup_cache_timing().
>>>
>>> Maybe change the initial definition of parisc_tlb_flush_threshold in cache.c:
>>> static unsigned long parisc_tlb_flush_threshold __read_mostly = ~0UL;
>> If it ran into flush_tlb_all(), then I'd expect that all TLB entries would have been flushed and
>> we wouldn't see an issue.
>> Maybe the info in the cache_info struct, which is used in the assembly of flush_tlb_all_local(),
>> hasn't been initialized yet and thus the whole cache hasn't been flushed?
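
To make the effect of the proposed "~0UL" initial value concrete, here is a
minimal user-space sketch of the condition quoted from __flush_tlb_range()
above. It is not kernel code: takes_full_flush() and its parameters are
hypothetical names, and the IS_ENABLED(CONFIG_SMP) and arch_irqs_disabled()
checks are replaced by plain booleans. The threshold of 0 mentioned below is
only a contrasting example; the actual initial value in cache.c is not stated
in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the shortcut test in __flush_tlb_range().
 * Returns true when the range would be handled by flush_tlb_all()
 * instead of being purged page by page.
 *
 * threshold     - stands in for parisc_tlb_flush_threshold
 * smp           - stands in for IS_ENABLED(CONFIG_SMP)
 * irqs_disabled - stands in for arch_irqs_disabled()
 */
static bool takes_full_flush(unsigned long threshold, bool smp,
                             bool irqs_disabled,
                             unsigned long start, unsigned long end)
{
        return (!smp || !irqs_disabled) &&
               end - start >= threshold;
}
```

With a threshold of ~0UL, any realistic range (for example 0x10000000 to
0x11000000) fails the size comparison, so the flush_tlb_all() shortcut is
effectively disabled until a real threshold is calculated. With a threshold of
0 the same range takes the shortcut on a UP kernel.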

> Since the fault occurred before the write bit is removed, it seems to
> me that the only way this can happen is that the TLB entry is left
> over from a previous instantiation of the OS.
Agreed.

> parisc_kernel_start() doesn't seem to whack the TLB.  This suggests that
> the __flush_tlb_range() call in set_kernel_text_rw() didn't work as
> expected.

Yes, seems so.
This system has only one CPU, so one flush_tlb_all_local() should have been sufficient.
> Maybe start or end are wrong (same function pointer issue as os_hpmc)?
I don't think _start and _end are wrong. If they were, the issue would probably be reproducible.
Meelis, do you still have the original System.map file (or the vmlinux) so that we could check?

Helge


