Re: possible Malta 4Kc cache problem ...


 



> I attached the test case.  Untar it.  Type 'make' and run 'a.out'.
> 
> If the test fails you will see a print-out.  Otherwise you see nothing.
> 
> It does not always fail.  But if it fails, it is usually pretty consistent.
> Try a few times.  Moving the source tree to a different directory may cause
> the symptom to appear or disappear.
> 
> I spent quite some time to trace this problem, and came to suspect
> there might be a hardware problem.
> 
> The problem involves emulating a "lw" instruction in a cp1 branch delay
> slot, which needs to set up a trampoline on the user stack.  The net effect
> looks as if the icache line or dcache line is not flushed properly.
> 
> Using gdb/kgdb, printf or printk in any useful places would hide the bug.
> 
> I did find a smaller part of the problem.  flush_cache_sigtramp for
> MIPS32 (4Kc) calls protected_writeback_dcache_line in mips32_cache.h.
> It uses Hit_Writeback_D, and the 4Kc manual says it is not implemented
> and executed as no-op (*ick*).

Which version of the 4Kc manual are you looking at?  I'm looking
at a very recent version of the 4Kc Software User's Manual
(version 1.17, dated September 25, 2002), and it only shows
Hit_Writeback_D to be invalid for *secondary and tertiary*
caches, which makes sense, since the 4KSc doesn't have any.

> Even after fixing this, I still see the problem happening.

That's not too surprising.  The 4Kc D-cache is write-through,
so if you're really seeing a problem with trampolines, it is almost
certain to be a problem with the Icache invalidation, not the
Dcache flush.
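
To make that reasoning concrete, here is a toy model in C.  It is
not kernel code and does not model the real 4Kc: the names
(store_word, fetch, icache_hit_invalidate, run_trampoline), the
cache geometry and the instruction values are all made up for
illustration.  It only shows why, with a write-through D-cache,
memory already holds the new trampoline and a stale I-cache line
is the one place the old instruction can still come from.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_WORDS 16   /* toy RAM size, word-addressed (assumption) */
#define IC_LINES  4    /* toy direct-mapped I-cache, one word per line */

#define OLD_INSN 0x11111111u   /* hypothetical stale contents */
#define NEW_INSN 0x22222222u   /* hypothetical trampoline instruction */

static uint32_t memory[MEM_WORDS];

struct icache {
    int      valid[IC_LINES];
    unsigned tag[IC_LINES];
    uint32_t data[IC_LINES];
};

/* Write-through store: RAM is updated at once, so there is never
 * dirty data waiting for a D-cache writeback. */
static void store_word(unsigned addr, uint32_t value)
{
    assert(addr < MEM_WORDS);
    memory[addr] = value;
}

/* Instruction fetch: hit returns the cached word, miss refills from RAM. */
static uint32_t fetch(struct icache *ic, unsigned addr)
{
    unsigned idx = addr % IC_LINES;

    assert(addr < MEM_WORDS);
    if (!ic->valid[idx] || ic->tag[idx] != addr) {
        ic->valid[idx] = 1;
        ic->tag[idx]   = addr;
        ic->data[idx]  = memory[addr];
    }
    return ic->data[idx];
}

/* Invalidate the line only if it currently holds addr (a "hit" op). */
static void icache_hit_invalidate(struct icache *ic, unsigned addr)
{
    unsigned idx = addr % IC_LINES;

    if (ic->valid[idx] && ic->tag[idx] == addr)
        ic->valid[idx] = 0;
}

/* Write a trampoline over an address that was executed before and
 * return the instruction word the CPU would then fetch from it. */
static uint32_t run_trampoline(int invalidate)
{
    struct icache ic;
    unsigned addr = 5;

    memset(&ic, 0, sizeof ic);
    memset(memory, 0, sizeof memory);

    memory[addr] = OLD_INSN;
    fetch(&ic, addr);               /* earlier execution caches the old word */

    store_word(addr, NEW_INSN);     /* trampoline written through to RAM */
    if (invalidate)
        icache_hit_invalidate(&ic, addr);

    return fetch(&ic, addr);
}
```

In this model run_trampoline(0) still fetches the old word even
though RAM is up to date, and run_trampoline(1) fetches the new
one.  No D-cache operation appears anywhere, which is the point.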
 
> If you replace flush_cache_sigtramp() with flush_cache_all(), the symptom
> would disappear.

Which again would make sense if there's a problem on
the icache side of the flush.  Oddly enough, we've seen
some glitches on other CPUs with other kernels that 
might have been explicable by failures of protected_flush_icache_line(),
but we never found a problem with it, and a higher-level
memory management patch made the problem go away.
Makes me wonder if we shouldn't look at it again, more
closely.  Is there any possibility that the logic for restarting
a protected kernel access following a page fault will somehow
screw up on CACHE instructions, as opposed to the loads
and stores for which the code was originally written?

> Several of my tests seem to suggest it is the icache that did not
> get flushed (or updated) properly.
> 
> Not reproducible on other MIPS boards.  At least so far.
> 
> Does anybody with more knowledge about 4Kc have any clues here?
> 
> Thanks.
> 
> Jun

