On Mon, Jun 25, 2001 at 01:36:15PM +0200, Maciej W. Rozycki wrote:

> After extensive debugging I managed to track down the bug that was
> preventing me from building binutils since the beginning of February.
> Once again the culprit turned out to be the explicit nature of MIPS'
> caches.
>
> The problem lies in r3k_flush_cache_sigtramp(). It flushes three
> consecutive word-wide locations starting from the address passed as an
> argument. The argument is normally a sigreturn trampoline that is set up
> by setup_frame() or setup_rt_frame(). But these functions set up two
> opcodes only -- the third word is left untouched. In my case the address
> was something like 0x7???bff8. So the area to be flushed spanned a page
> boundary and since the third word was unreferenced, a TLB entry for the
> page the word was located in was absent. As a result, a TLB refill
> exception happened with caches isolated, which is not necessarily a win.
> The symptom was a solid crash.
>
> I don't see any reason to flush the third word location, so I removed the
> code doing it. This fixed the crashes I was observing, but since we are
> using mapped (KUSEG) addresses in r3k_flush_cache_sigtramp(), I believe we
> need more protection against unwanted TLB exceptions. The point is we are
> running with interrupts enabled and a reschedule may happen between
> touching the trampoline in setup*_frame() and flushing the cache. Hence
> the TLB entries for the trampoline area, even once present, may get
> removed meanwhile. So I added some code to explicitly load the entries,
> if needed, with interrupts disabled just before isolating caches.
> Following is a resulting patch.
>
> Ralf, this is a showstopper bug -- please apply the fix ASAP.

Applied.

  Ralf
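
For anyone following the page-boundary arithmetic in the report, here is a minimal sketch of the check involved. It is not the kernel code or the patch. The helper name spans_page_boundary() and the constants are hypothetical, and it assumes 4 KiB pages and 4-byte words. Only the page offset matters, so it can be tried with the low 12 bits of the quoted address (0xff8).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages and 32-bit (4-byte) words. */
#define PAGE_SHIFT 12u
#define WORD_SIZE  4u

/*
 * Hypothetical helper, for illustration only: returns 1 if `nwords`
 * consecutive words starting at `addr` touch more than one page,
 * 0 otherwise.
 */
int spans_page_boundary(uint32_t addr, unsigned nwords)
{
    assert(nwords > 0);

    /* Address of the last word in the range. */
    uint32_t last = addr + (nwords - 1u) * WORD_SIZE;

    /* Different page numbers mean the range crosses a boundary. */
    return (addr >> PAGE_SHIFT) != (last >> PAGE_SHIFT);
}
```

With an offset of 0xff8, three words reach into the next page while two words stay within the same one, which matches the observation that dropping the third flush avoided the crash in that case. A two-word trampoline placed at offset 0xffc would still straddle pages, which is one more reason the explicit TLB preloading described above seems worthwhile.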