On 13.12.2011 22:30, Ninja wrote:
On 12.12.2011 12:08, Marko Ristola wrote:
On 12/10/2011 01:57 AM, Ninja wrote:
Hi,
does anyone have an idea how the SMP problems could be fixed?
You could turn on the Mantis kernel module's debug messages.
They would show you which interrupts are emitted.
One risky thing in the interrupt handler code is that
MANTIS_GPIF_STATUS is cleared even when IRQ0 isn't active yet.
This could lead to a rare starvation of the wait queue you described.
I supplied a patch below. Does it help?
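
To make the suspected race easier to follow, here is a minimal userspace model of one possible interleaving. It is not the patch referred to above and not the driver's code: the bit values, the struct and all function names are hypothetical, and only the name MANTIS_GPIF_STATUS comes from the message. The model assumes the handler samples the interrupt status first and that the CAM operation completes right after that sample.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values for this model, not the real Mantis register layout. */
#define INT_IRQ0   (1u << 0)   /* host interface (CAM) event pending */
#define INT_RISCI  (1u << 1)   /* DMA/RISC interrupt */
#define GPIF_DONE  (1u << 0)   /* "operation complete" flag in the GPIF status */

struct model_dev {
    uint32_t int_stat;      /* interrupt status register */
    uint32_t gpif_status;   /* stands in for MANTIS_GPIF_STATUS */
    uint32_t waiter_status; /* value handed to the woken waiter */
    bool     waiter_woken;
};

/* Hardware side: the CAM operation completes and raises IRQ0. */
static void hw_complete_hif_op(struct model_dev *d)
{
    d->gpif_status |= GPIF_DONE;
    d->int_stat |= INT_IRQ0;
}

/*
 * Second half of a handler that already sampled int_stat into `stat`.
 * guard == false: the GPIF status is cleared on every interrupt.
 * guard == true:  it is cleared only if IRQ0 was part of the sample.
 */
static void irq_finish(struct model_dev *d, uint32_t stat, bool guard)
{
    if (stat & INT_IRQ0) {
        d->waiter_status = d->gpif_status;
        d->waiter_woken = true;
    }
    if (!guard || (stat & INT_IRQ0))
        d->gpif_status = 0;
    d->int_stat &= ~stat;   /* acknowledge only what was sampled */
}

/* Returns true if the waiter ends up seeing GPIF_DONE. */
static bool run_race(bool guard)
{
    struct model_dev d = {0};
    uint32_t stat;

    d.int_stat = INT_RISCI;     /* a DMA interrupt fires */
    stat = d.int_stat;          /* handler samples: no IRQ0 yet */
    hw_complete_hif_op(&d);     /* CAM operation completes in the window */
    irq_finish(&d, stat, guard);

    stat = d.int_stat;          /* IRQ0 is still pending, handler runs again */
    irq_finish(&d, stat, guard);

    return d.waiter_woken && (d.waiter_status & GPIF_DONE);
}
```

In this model run_race(false) returns false, because the completion flag is wiped before the IRQ0 is handled and the waiter would have to run into its timeout. run_race(true) returns true. Whether the real hardware can produce this interleaving is an assumption.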
I did some further investigation. Comparing the number of
interrupts with all cores enabled against the number with only one
core enabled, it seems that only IRQ0 changed; the other IRQs and
the total number stay about the same:
4 Cores:
All IRQ/sec: 493
Masked IRQ/sec: 400
Unknown IRQ/sec: 0
DMA/sec: 400
IRQ-0/sec: 143
IRQ-1/sec: 0
OCERR/sec: 0
PABRT/sec: 0
RIPRR/sec: 0
PPERR/sec: 0
FTRGT/sec: 0
RISCI/sec: 258
RACK/sec: 0
1 Core:
All IRQ/sec: 518
Masked IRQ/sec: 504
Unknown IRQ/sec: 0
DMA/sec: 504
IRQ-0/sec: 246
IRQ-1/sec: 0
OCERR/sec: 0
PABRT/sec: 0
RIPRR/sec: 0
PPERR/sec: 0
FTRGT/sec: 0
RISCI/sec: 258
RACK/sec: 0
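
To put numbers on the comparison, the following small sketch computes the relative change of each non-zero counter between the two listings. The figures are copied from above; the struct and the function names are made up for this illustration.

```c
#include <stddef.h>
#include <string.h>

struct irq_rate {
    const char *name;
    int four_cores;   /* interrupts per second with 4 cores */
    int one_core;     /* interrupts per second with 1 core */
};

/* Figures copied from the two listings above; all-zero counters omitted. */
static const struct irq_rate rates[] = {
    { "All IRQ",    493, 518 },
    { "Masked IRQ", 400, 504 },
    { "DMA",        400, 504 },
    { "IRQ-0",      143, 246 },
    { "RISCI",      258, 258 },
};

/* Relative change of the 4-core rate versus the 1-core rate, in percent. */
static double change_percent(const struct irq_rate *r)
{
    return 100.0 * (r->four_cores - r->one_core) / r->one_core;
}

/* Look up a counter by name; returns NULL if it is not in the table. */
static const struct irq_rate *find_rate(const char *name)
{
    for (size_t i = 0; i < sizeof(rates) / sizeof(rates[0]); i++)
        if (strcmp(rates[i].name, name) == 0)
            return &rates[i];
    return NULL;
}
```

For example, change_percent(find_rate("IRQ-0")) gives about -42%, while the total comes out at about -5% and RISCI at exactly 0%. By the same arithmetic the Masked IRQ and DMA rates are about 21% lower with 4 cores.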
So, where might the problem be?
Turning on the Mantis debug messages might reveal the difference between
these interrupts.
....
I hope somebody can help, because I think we are very close to a
fully functional CAM here.
I ran out of things to test to get closer to the solution :(
Btw: Is there any documentation available for the Mantis PCI bridge?
Not that I know of.
Manuel
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Regards,
Marko Ristola
Hi Marko,
thanks for the patch. I did some quick testing today. The IRQ0 problem
stays, but it seems like the small hangs (3-5 seconds every 20 minutes
or something) are fixed :)
Manuel
Hi,
I did some further investigation of my problem. Almost all IRQ0s
originate from calls to the function "mantis_hif_read_iom" (at least when
the CAM is up and running). Increasing the udelay between the writes to
about 100 almost eliminates the lost IRQ0 problem, but somehow it also
raises the total number of interrupts and the number of IRQ0s to about
two to three times the numbers seen with udelay(20).
This increase doesn't happen when reducing the number of cores as a
workaround.
And getting *almost* no timeouts doesn't help much, because every
timeout causes a hang/freeze until the CAM is initialized again.
Changing the PCI latency to 0xff didn't help either.
Btw: The DMA patches Marko posted in the other thread "Multiple
Mantis devices gives me glitches" don't help me further, since I'm
using the latest code, which already includes them.
Manuel