RE: SMTC support status in latest git head.

STUART VENTERS <stuart.venters@xxxxxxxxxx> · Wed, 22 Dec 2010 10:34:39 -0600

Anoop,

Nothing jumps out to me in the new set of register values.

It might be worth dumping all the CP0 registers.
   I'm especially interested in Config3, to see the VEIC bit.
   The timer registers might be useful as well.

Regards,

Stuart

-----Original Message-----
From: Kevin D. Kissell [mailto:kevink@xxxxxxxxxxxxx]
Sent: Wednesday, December 22, 2010 7:03 AM
To: Anoop P A
Cc: Anoop P.A.; STUART VENTERS; linux-mips@xxxxxxxxxxxxxx
Subject: Re: SMTC support status in latest git head.

On 12/22/10 3:51 AM, Anoop P A wrote:
> On Wed, 2010-12-22 at 03:37 -0800, Kevin D. Kissell wrote:
>> Thanks.  This is indeed strange.  The VPE0 Status and TC0 TCStatus/Cause
>> all indicate that interrupts are enabled and not inhibited at the per-TC
>> level, and the presumed timer interrupt, in the 0x4000 bit, is present
>> and not masked-off.  Logically, the system must be entering (and
>> exiting) the interrupt handler, yet the timer calibration isn't
>> completing.  That leaves more complex possible explanations for failure,
>> most of which would fall into two categories:
>>
>> 1)  The platform interrupt handler is failing to decode the event
>> properly as a timer event.
>> 2)  Despite there being only one TC active, the calibration code is
>> waiting for some handshake from another "CPU"
>>
>> To test the first, you might consider adding a printk() to the case of
>> a "spurious" timer-like interrupt being detected and ignored...
> I have tried it. Only one interrupt is coming; the platform handler
> detects it as a timer interrupt and acknowledges it properly. You can see
> the time stamp change in the logs.
That's really strange.  And your timer interrupt is definitely on the 
interrupt that corresponds to the 0x4000 mask?

I may have written the MT spec and the original SMTC code, but I don't 
have a copy of the spec, and it's been a few years, and I can't 
interpret the MVP and VPE control/config values. But I just don't see 
how the processor could not be taking more interrupts.  Stuart did 
decode the global/VPE state enough to observe that global multithreaded 
execution wasn't enabled, which is indeed strange - it shouldn't matter 
for single-TC execution, but I don't recall there being any special-case 
in the SMTC initialization that bypassed that enable.  That makes me
suspect that someone changed the initialization sequence so that it
skips one of the canonical steps, which would break SMTC, but I don't
know why that would result in the interrupt behavior you observe.

It might be yet another blind alley, but could you add/arm diagnostic 
output for each of the initialization functions in smtc.c?

Ah, yes, and one other thing.  You should add a dump of ErrorEPC to the 
MT register dump.  I did it for myself once upon a time when I was 
confronted with a similar mystery, but never filed a patch.  If you're 
breaking in with NMI, that could help identify more precisely where it's 
locking up.

You really ought to try to borrow an EJTAG probe.  It would save us both 
a lot of time.  And my time to trouble-shoot this with you is limited.

             Regards,

             Kevin K.