On 05/06/13 18:43, De Roo, Steven wrote:
> Dear Martyn, Manohar & all VME-users,
>
> I'm still struggling with my VME SBC & the VME-drivers...
> Please be so kind to have another look at the current situation:
>
> I got to the point that I had a board perfectly reading/writing data to another board,
> even with DMA calls to speed up things with a factor 8.
>
> Yesterday, we put the board in a production environment,
> and things have become quite complicated since then.
> (finally we get some sunshine in Belgium, but now clouds are covering my project...)
>
> Apparently, the existing VME-crate into which the new board was put,
> has a lot of VME bus errors in the existing traffic,
> as can be seen with a VME analyzer card.
> The existing legacy hardware/software cannot be modified however,
> so I'll have to live with this situation.
>
> These bus errors cause the TSI148 chipset to abort the read-calls,
> and generate an interrupt, which is caught by function 'tsi148_irqhandler()',
> and there some strange things happen...
>

Yes, there's a linked list that records bus errors. It is checked if you
have err_chk enabled, but it is only cleared while you are doing reads and
writes. Minimally, items should only be added to the linked list if
err_chk != 0. tsi148_dma_list_exec() should probably also check this list
if err_chk != 0. If err_chk == 0, there's possibly a case for not enabling
the VME bus error interrupt at all, as the driver doesn't care.

> In the beginning of this function, the 'interrupt enable out'
> and 'interrupt status' registers are read out, as shown below:
> ...
> /* Determine which interrupts are unmasked and set */
> enable = ioread32be(bridge->base + TSI148_LCSR_INTEO);
> stat = ioread32be(bridge->base + TSI148_LCSR_INTS);
> ...
>
> The first time an interrupt occurs, both 'enable' and 'stat' are 0,
> which is not ok, since the 'enable' register is definitely set to a proper value
> when the 'vme_tsi148' module is loaded/probed.
>
> A consecutive interrupt leads to value -1 (0xFFFFFFFF) for both 'enable' and 'stat',
> and then a kernel crash is not far away...
> In this case, all handlers (DMA, LM, MB, PERR, VERR, ...) are called,
> and inside these handlers, NULL pointers are dereferenced, leading to a crash.
>
> What could possibly be corrupting the TSI148_LCSR_INTEO and TSI148_LCSR_INTS registers ?
> Could it be the driver, or can it be the TSI148 chipset itself ?

Have you added debug code that reads the registers? Apparently, reading the
interrupt registers more than once has been seen to cause issues.

> (e.g. can it be that the interrupt is raised before the registers are filled in correctly ?)
>

I've just taken a quick look at an XVME-6300 datasheet floating around the
web. The datasheet suggests that the VMEbus interface has hardware byte
swapping. This is not a feature of the TSI148. I have seen boards in the
past place an FPGA between the VMEbus and the VME-PCI bridge to do byte
swapping; these have also done funny things with the interrupt. It might be
worth looking at any documentation you have for the XVME-6300, or contacting
the manufacturer, to see whether there are any other sources (such as a
byte-swapping FPGA) that end up generating the same interrupt as the TSI148.

> Also, I can't find any good documentation on the difference between the
> 'interrupt enable out' and 'interrupt enable' registers. Do you have a clue ?
>

I think this is covered in the TSI148 manual. From memory, one is for PCI
interrupts; the other can enable some error conditions to be converted into
VME interrupts.
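
For what it's worth, here is a rough sketch of the kind of defensive check
that could sit right after the two register reads while you are debugging.
It is plain C so it can be tried outside the kernel, and it is not code from
the vme_tsi148 driver: irq_regs_plausible() and irq_bits_to_service() are
made-up names. The one assumption it relies on is that a PCI read which gets
no answer from the device generally comes back as all ones, which would match
the 0xFFFFFFFF you see and would explain why every handler gets called.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of vme_tsi148): decide whether the
 * INTEO/INTS values read in the handler look trustworthy.
 */
static bool irq_regs_plausible(uint32_t enable, uint32_t stat)
{
	/* All ones usually means the read never reached the device. */
	if (enable == 0xFFFFFFFFu || stat == 0xFFFFFFFFu)
		return false;

	/* Nothing both enabled and pending: nothing for us to handle. */
	if ((enable & stat) == 0)
		return false;

	return true;
}

/*
 * Hypothetical helper: return the interrupt bits that are both enabled
 * and set, or 0 if the values should not be acted upon at all.
 */
static uint32_t irq_bits_to_service(uint32_t enable, uint32_t stat)
{
	if (!irq_regs_plausible(enable, stat))
		return 0;

	return enable & stat;
}
```

In a handler, a result of 0 would mean returning without dispatching to the
DMA/LM/MB/PERR/VERR handlers. That only hides the symptom; it doesn't explain
where the bogus values come from.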

Martyn

>
> Kind regards,
> Steven De Roo
>

-- 
Martyn Welch (Lead Software Engineer)  | Registered in England and Wales
GE Intelligent Platforms               | (3828642) at 100 Barbirolli Square
T +44(0)1327322748                     | Manchester, M2 3AB
E martyn.welch@xxxxxx                  | VAT:GB 927559189
_______________________________________________
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxx
http://driverdev.linuxdriverproject.org/mailman/listinfo/devel