On 01/12/17 14:05, Victor Ascroft wrote:
> I have a iMX6 running a 4.9 kernel with a custom kernel driver communicating
> with a FPGA over PCIe. The driver is not built in to the kernel but loaded as
> a module after complete boot up. During the running of the system, after a few
> hours the kernel completely freezes. No kernel panics or stack traces, nothing.
> I have access to the serial console.

I've done a lot of work with the i.MX6 and an Altera Cyclone IV FPGA connected
over PCIe, and I've not experienced any major issues with this setup.

> In such a scenario what are the ways to debug and try locating the source of
> the problem? I am not looking for a solution for my problem but things or
> approaches one can go about trying while trying to fix such a scenario?

This is a difficult situation and it will take a lot of time to debug, but
you really just need to spend time picking apart the driver. Try disabling
various parts of it and adding dynamic debug messages or tracing.

My first suspicion in these cases, however, is always interrupts. There have
been a few times when our FPGA code had a fault and the interrupts failed, so
my first port of call is usually to disable interrupts in my driver and
replace them with polling from high-resolution timers (hrtimers).

You might also want to look at load balancing the interrupts. ARM processors
deliver interrupts to one core (or they did in the kernels I've been using),
and you can either assign the interrupts to other cores manually or use
irqbalance to do so automatically. I preferred the manual solution, as
irqbalance didn't seem to distribute my workload efficiently across the cores.

At any rate, you should probably be monitoring the interrupts.

Good luck!

Regards,
Philip Downer
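
For reference, a minimal sketch of the interrupt monitoring suggested above.
It assumes a Linux system where /proc/interrupts is readable, and it is
written in Python only for brevity. The name "my_fpga" is a hypothetical
placeholder for whatever name the driver registers its interrupt under. The
script samples /proc/interrupts repeatedly and prints the per-CPU increase
for matching lines.

```python
import time


def parse_interrupts(text):
    """Parse /proc/interrupts text into {label: (per_cpu_counts, description)}."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        # Some rows (e.g. ERR) carry fewer counters than there are CPUs.
        for field in fields[:ncpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        table[label.strip()] = (counts, " ".join(fields[len(counts):]))
    return table


def diff_counts(before, after):
    """Return {label: per-CPU increase} for IRQs present in both samples."""
    deltas = {}
    for label, (new_counts, _) in after.items():
        if label in before:
            old_counts = before[label][0]
            deltas[label] = [n - o for n, o in zip(new_counts, old_counts)]
    return deltas


def watch(match, interval=1.0, path="/proc/interrupts"):
    """Loop forever, printing deltas for rows whose description contains `match`."""
    with open(path) as f:
        before = parse_interrupts(f.read())
    while True:
        time.sleep(interval)
        with open(path) as f:
            after = parse_interrupts(f.read())
        for label, delta in diff_counts(before, after).items():
            if match in after[label][1]:
                print(label, delta, "total", sum(delta))
        before = after
```

Calling watch("my_fpga") on the target and logging its output over the serial
console shows whether the count stopped increasing or climbed unusually fast
just before the freeze, and which core was servicing the interrupt. For the
manual assignment mentioned above, the usual interface is writing a
hexadecimal CPU mask to /proc/irq/<N>/smp_affinity, where <N> is the IRQ
number shown in the first column.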