On Nov 23, 2008, at 10:31 AM, Andy Walls wrote:

> Brandon,
>
> [culling out and reorganizing interesting stuff:]
>
> cx18-2 warning: Possibly falling behind: CPU self-ack'ed our incoming CPU to EPU mailbox (sequence no. 7427) while processing
> cx18-2 warning: Possibly falling behind: CPU self-ack'ed our incoming CPU to EPU mailbox (sequence no. 9372) while processing
> cx18-2 warning: Possibly falling behind: CPU self-ack'ed our incoming CPU to EPU mailbox (sequence no. 10914) while processing
> cx18-2 warning: Possibly falling behind: CPU self-ack'ed our incoming CPU to EPU mailbox (sequence no. 12855)
> cx18-2 warning: Possibly falling behind: CPU self-ack'ed our incoming CPU to EPU mailbox (sequence no. 12978) while processing
> cx18-2 warning: Possibly falling behind: CPU self-ack'ed our incoming CPU to EPU mailbox (sequence no. 13012)
> cx18-2 warning: Possibly falling behind: CPU self-ack'ed our incoming CPU to EPU mailbox (sequence no. 14966)
> cx18-2 warning: Possibly falling behind: CPU self-ack'ed our incoming CPU to EPU mailbox (sequence no. 17100)
> cx18-2 warning: Possibly falling behind: CPU self-ack'ed our incoming CPU to EPU mailbox (sequence no. 17405) while processing
> cx18-2 warning: Possibly falling behind: CPU self-ack'ed our incoming CPU to EPU mailbox (sequence no. 17422) while processing
>
> This is all OK. They are all pretty far apart. The ones with "while
> processing" on the end are really no big deal, as we're sure we got a
> good copy of the mailbox data on those. And there are no messages
> about detecting buffers that have fallen out of rotation and being
> put back, so you're actually not losing buffers (that's why the
> message says "Possibly"). You're in good shape.
>
> I would note that only "cx18-2" is having trouble meeting the IRQ
> handling timeline imposed by its firmware. There's either:
>
> a. some piece of hardware sharing an interrupt with this cx18-2
> board, and its IRQ handling routine is causing delays in invoking the
> cx18 ISR for this board.
>
> b. the PCI bus MMIO to cx18-2 is slow because latency timer settings
> on other devices are set really, really long.
>
> c. You've got 3 boards in a machine with only two cores, and the
> other cx18 boards' IRQ handling routines have interrupts disabled on
> both cores when the IRQ for cx18-2 comes in.
>
> d. anything else you can dream up for why the time from the hardware
> interrupt line being asserted to cx18_irq_handler() being invoked is
> longer than for the other boards. :)
>
> cx18-0: sending CX18_CPU_DE_SET_MDL timed out waiting 10 msecs for RPU acknowledgement
> cx18-0: sending CX18_CPU_DE_SET_MDL timed out waiting 10 msecs for RPU acknowledgement
> cx18-2: sending CX18_CPU_DE_SET_MDL timed out waiting 10 msecs for RPU acknowledgement
> cx18-2: sending CX18_CPU_DE_SET_MDL timed out waiting 10 msecs for RPU acknowledgement
> cx18-0: sending CX18_CPU_DE_SET_MDL timed out waiting 10 msecs for RPU acknowledgement
> cx18-0: sending CX18_CPU_DE_SET_MDL timed out waiting 10 msecs for RPU acknowledgement
> cx18-0: sending CX18_CPU_DE_SET_MDL timed out waiting 10 msecs for RPU acknowledgement
> cx18-0: sending CX18_CPU_DE_SET_MDL timed out waiting 10 msecs for RPU acknowledgement
> cx18-1: sending CX18_CPU_DE_SET_MDL timed out waiting 10 msecs for RPU acknowledgement
> cx18-0: sending CX18_CPU_DE_SET_MDL timed out waiting 10 msecs for RPU acknowledgement
> [snip]
> cx18-0: sending CX18_CPU_DE_SET_MDL timed out waiting 10 msecs for RPU acknowledgement
>
> These are no big deal. We get impatient with the firmware and move on
> when it doesn't respond to our command - we can't spend forever
> sleeping for something that has little corrective action when it
> fails. Since you got no "stuck mailbox" messages, the outgoing
> command likely did complete. Even if the SET_MDL command actually
> fails on occasion, the "buffer fallen out of rotation" detection
> logic will pick it up later and send it again.
>
> It's somewhat annoying that the firmware wants response times on the
> order of hundreds of usecs (I think) for its data to be ack'ed, but
> it makes us wait over 10 msecs for it to ack us at times. The 10 msec
> figure was an empirical number from a single-board machine. I may
> have to increase the timeout or just quiet the message to a debug
> level.
>
> All in all you're in good shape. Your system appears to have low
> enough interrupt service latency to meet the demands of the firmware.
> (At least one ivtv-users list user has a system that is having real
> trouble meeting the interrupt service latency timeline of the
> firmware. I may have to add a polled mailbox I/O mode to the driver
> for these systems to use.)
>
> Again, a lot of this was going on previously; it's just that the cx18
> driver never bothered to look for it on incoming DMA done
> notifications, report the precise condition in the logs, or correct
> for it very well.
>
> Thanks for the testing and providing data!
>
> Regards,
> Andy
>
>> Brandon

Andy,

A couple of points to note:

1) cx18-2 was the only board making both analog and HD recordings. The
other devices were only performing HD OTA captures.

2) cx18-2 shares its IRQ with other devices:

ls /proc/irq/18/
cx18-2  ehci_hcd:usb4  smp_affinity  spurious  uhci_hcd:usb3  uhci_hcd:usb7

Can I use IRQ 17? Nothing seems to be on it.

3) This is actually a quad core CPU.

Thanks,
Brandon

_______________________________________________
linux-dvb mailing list
linux-dvb@xxxxxxxxxxx
http://www.linuxtv.org/cgi-bin/mailman/listinfo/linux-dvb
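
Regarding point 2 in Brandon's reply: the handler names that `ls /proc/irq/18/` prints can also be collected by a small script, which makes it easier to compare several IRQ lines at once. What follows is only a minimal sketch. It assumes that each registered handler appears as a subdirectory of /proc/irq/<N>/ (as cx18-2 and the *_hcd entries do in the listing), while smp_affinity and spurious are plain files. The names `irq_handlers`, `is_shared`, and the `root` parameter are made up for illustration; they are not part of the cx18 driver or of any existing tool.

```python
import os


def irq_handlers(irq, root="/proc/irq"):
    """Return the sorted names of the handlers registered on an IRQ line.

    Assumption: handlers show up as subdirectories of <root>/<irq>/,
    while entries such as smp_affinity and spurious are plain files.
    """
    irq_dir = os.path.join(root, str(irq))
    return sorted(
        name
        for name in os.listdir(irq_dir)
        if os.path.isdir(os.path.join(irq_dir, name))
    )


def is_shared(irq, root="/proc/irq"):
    """True if more than one handler is registered on the IRQ line."""
    return len(irq_handlers(irq, root)) > 1


if __name__ == "__main__":
    # IRQ numbers taken from the thread: 18 is the shared line,
    # 17 is the one that looked unused.
    for irq in (17, 18):
        try:
            print(irq, irq_handlers(irq), "shared" if is_shared(irq) else "")
        except OSError:
            print(irq, "cannot read IRQ directory")
```

An empty list for IRQ 17 would only confirm that nothing is registered on that line at the moment. Whether the board can actually be moved to it depends on the slot and the interrupt routing, which this script does not inspect.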