On 12/22/10 3:51 AM, Anoop P A wrote:
On Wed, 2010-12-22 at 03:37 -0800, Kevin D. Kissell wrote:
Thanks. This is indeed strange. The VPE0 Status and TC0 TCStatus/Cause
all indicate that interrupts are enabled and not inhibited at the per-TC
level, and the presumed timer interrupt, in the 0x4000 bit, is present
and not masked-off. Logically, the system must be entering (and
exiting) the interrupt handler, yet the timer calibration isn't
completing. That leaves more complex possible explanations for failure,
most of which would fall into two categories:
1) The platform interrupt handler is failing to decode the event
properly as a timer event.
2) Despite there being only one TC active, the calibration code is
waiting for some handshake from another "CPU"
To test the first, you might consider adding a printk() to the case of
a "spurious" timer-like interrupt being detected and ignored...
I have tried it. Only one interrupt is coming, and the platform handler
detects it as a timer interrupt and acknowledges it properly. You can
see a timestamp change in the logs.
That's really strange. And your timer interrupt is definitely on the
interrupt that corresponds to the 0x4000 mask?
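For reference, here is a minimal user-space sketch of the check being discussed, not code from the kernel tree. It assumes the standard MIPS32 layout, with the IP/IM fields in bits 15:8 of Cause/Status, IE/EXL/ERL in bits 0-2 of Status, and the MT ASE IXMT inhibit in bit 10 of TCStatus. The function names and the sample values in the note below are hypothetical, chosen only for illustration. Under that layout the 0x4000 mask is IP6.

```c
#include <stdint.h>

/* Standard MIPS32 CP0 Status bits (assumed layout, see lead-in). */
#define ST_IE          0x00000001u  /* global interrupt enable */
#define ST_EXL         0x00000002u  /* exception level: blocks interrupts */
#define ST_ERL         0x00000004u  /* error level: blocks interrupts */
#define IP_SHIFT       8            /* IP0/IM0 sit at bit 8 */
#define IP_FIELD       0x0000ff00u  /* Cause.IP and Status.IM, bits 15:8 */
#define TCSTATUS_IXMT  0x00000400u  /* MT ASE per-TC interrupt inhibit */

/*
 * Hypothetical helper: return the set of interrupts (as a mask in
 * bits 15:8) that are both pending in Cause and unmasked in Status,
 * or 0 if interrupts are disabled globally or inhibited for this TC.
 */
static uint32_t deliverable_irqs(uint32_t status, uint32_t cause,
                                 uint32_t tcstatus)
{
    if (!(status & ST_IE))
        return 0;
    if (status & (ST_EXL | ST_ERL))
        return 0;
    if (tcstatus & TCSTATUS_IXMT)
        return 0;
    return status & cause & IP_FIELD;
}

/*
 * Hypothetical helper: map a single-bit mask such as 0x4000 to its
 * IP index (0..7). Returns -1 if the mask is not exactly one IP bit.
 */
static int ip_index(uint32_t mask)
{
    for (int i = 0; i < 8; i++) {
        if (mask == (1u << (IP_SHIFT + i)))
            return i;
    }
    return -1;
}
```

With made-up values Status = 0x0000ff01, Cause = 0x00004000 and TCStatus = 0, deliverable_irqs() returns 0x4000 and ip_index(0x4000) returns 6. Which source actually drives that line is platform-specific, so the numbers are only useful for cross-checking the dump against what the platform handler claims to be decoding.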
I may have written the MT spec and the original SMTC code, but I don't
have a copy of the spec, and it's been a few years, and I can't
interpret the MVP and VPE control/config values. But I just don't see
how the processor could not be taking more interrupts. Stuart did
decode the global/VPE state enough to observe that global multithreaded
execution wasn't enabled, which is indeed strange - it shouldn't matter
for single-TC execution, but I don't recall there being any special-case
in the SMTC initialization that bypassed that enable. That makes me
suspect that someone may have changed the initialization sequence so
that it skips one of the canonical initialization steps, which would
break SMTC, but I don't know why that would result in the
interrupt behavior you observe.
It might be yet another blind alley, but could you add/arm diagnostic
output for each of the initialization functions in smtc.c?
Ah, yes, and one other thing. You should add a dump of ErrorEPC to the
MT register dump. I did it for myself once upon a time when I was
confronted with a similar mystery, but never filed a patch. If you're
breaking in with NMI, that could help identify more precisely where it's
locking up.
You really ought to try to borrow an EJTAG probe. It would save us both
a lot of time. And my time to trouble-shoot this with you is limited.
Regards,
Kevin K.