On Sat, Jan 18, 2020 at 07:16:14AM +0530, Muni Sekhar wrote:
> On Thu, Jan 9, 2020 at 10:05 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> >
> > On Thu, Jan 09, 2020 at 08:47:51AM +0530, Muni Sekhar wrote:
> > > On Thu, Jan 9, 2020 at 1:45 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> > > > On Tue, Jan 07, 2020 at 09:45:13PM +0530, Muni Sekhar wrote:
> > > > > Hi,
> > > > >
> > > > > I have a module with a Xilinx FPGA. It implements UART(s), SPI(s),
> > > > > parallel I/O and interfaces them to the Host CPU via PCI Express bus.
> > > > > I see that my system freezes without capturing the crash dump for
> > > > > certain tests. I debugged this issue and it was tracked down to the
> > > > > below mentioned interrupt handler code.
> > > > >
> > > > >
> > > > > The ISR first reads the Interrupt Status register using ‘readl()’ as
> > > > > given below.
> > > > >     status = readl(ctrl->reg + INT_STATUS);
> > > > >
> > > > >
> > > > > And then clears the pending interrupts using ‘writel()’ as given below.
> > > > >     writel(status, ctrl->reg + INT_STATUS);
> > > > >
> > > > >
> > > > > I've noticed a kernel hang if the INT_STATUS register is read again after
> > > > > clearing the pending interrupts.
> > > > >
> > > > > Can someone clarify why the kernel hangs without a crash dump in case
> > > > > I read the INT_STATUS register using readl() after clearing the
> > > > > pending bits?
> > > > >
> > > > > Can readl() block?
> > > >
> > > > readl() should not block in software. Obviously at the hardware CPU
> > > > instruction level, the read instruction has to wait for the result of
> > > > the read. Since that data is provided by the device, i.e., your FPGA,
> > > > it's possible there's a problem there.
> > >
> > > Thank you very much for your reply.
> > > Where can I find the details about what the protocol is for reading
> > > ‘memory mapped IO’? Can you point me to any useful links?
> > > I tried to locate the exact point in the kernel code where the CPU waits
> > > for the read instruction, as given below.
> > > readl() -> __raw_readl() -> return *(const volatile u32 __force *)add
> > > Do I need to check the assembly instructions here?
> >
> > The C pointer dereference, e.g., "*address", will be some sort of a
> > "load" instruction in assembly. The CPU wait isn't explicit; it's
> > just that when you load a value, the CPU waits for the value.
> >
> > > > Can you tell whether the FPGA has received the Memory Read for
> > > > INT_STATUS and sent the completion?
>
> I have not seen any ‘missing’ completions on the logic analyser. Is
> there any other way to debug this one?

If you see the Memory Read and the associated Completion, and you
still see a hang in the kernel, then most likely the problem is not in
PCIe. I would start by trying to prove that the instruction after the
readl() is or is not executed.

Bjorn
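
One way to picture the "prove that the instruction after the readl() is or is not executed" suggestion is to bracket every register access with a checkpoint and look at the last one that was reached. The sketch below is a userspace simulation, not driver code: `struct fake_ctrl`, `sim_readl()`, `sim_writel()`, `checkpoint()`, `sim_isr()` and the `INT_STATUS` offset value are all hypothetical stand-ins, and it assumes INT_STATUS has write-1-to-clear semantics, which the thread implies but does not state. In a real ISR the checkpoint would more plausibly be something like trace_printk() or a write to a debug GPIO.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical offset; the real one comes from the FPGA's register map. */
#define INT_STATUS 0x10u
#define NUM_REGS   16u

/* Simulated device: a small register file standing in for the mapped BAR. */
struct fake_ctrl {
    volatile uint32_t regs[NUM_REGS];
    int last_step;              /* last checkpoint the ISR reached */
};

/* Userspace stand-in for readl(): a plain volatile load. */
uint32_t sim_readl(struct fake_ctrl *c, uint32_t off)
{
    assert(off / 4 < NUM_REGS);
    return c->regs[off / 4];
}

/* Userspace stand-in for writel().
 * Assumption: INT_STATUS is write-1-to-clear, so writing back the
 * value that was read clears exactly the bits that were pending. */
void sim_writel(struct fake_ctrl *c, uint32_t val, uint32_t off)
{
    assert(off / 4 < NUM_REGS);
    if (off == INT_STATUS)
        c->regs[off / 4] &= ~val;
    else
        c->regs[off / 4] = val;
}

/* Record and print how far the handler got. If the handler stalls,
 * the last step printed identifies the access that never returned. */
void checkpoint(struct fake_ctrl *c, int step)
{
    c->last_step = step;
    fprintf(stderr, "isr: reached step %d\n", step);
}

/* Same access sequence as described in the thread:
 * read status, clear it, then read it again. */
uint32_t sim_isr(struct fake_ctrl *c)
{
    uint32_t status;
    uint32_t again;

    checkpoint(c, 1);                       /* before first read   */
    status = sim_readl(c, INT_STATUS);
    checkpoint(c, 2);                       /* first read returned */
    sim_writel(c, status, INT_STATUS);
    checkpoint(c, 3);                       /* clear was issued    */
    again = sim_readl(c, INT_STATUS);
    checkpoint(c, 4);                       /* second read returned */
    (void)again;

    return status;
}
```

In the simulation all four checkpoints are always reached. On the real hardware, if the last step recorded is 3, the second read never returned to the CPU; if step 4 is reached, the hang is somewhere after the readl() and the read itself is probably not the culprit.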