On Thu, Jan 9, 2020 at 1:45 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>
> On Tue, Jan 07, 2020 at 09:45:13PM +0530, Muni Sekhar wrote:
> > Hi,
> >
> > I have a module with a Xilinx FPGA. It implements UART(s), SPI(s),
> > parallel I/O and interfaces them to the Host CPU via PCI Express bus.
> > I see that my system freezes without capturing the crash dump for
> > certain tests. I debugged this issue and it was tracked down to the
> > below mentioned interrupt handler code.
> >
> >
> > The ISR first reads the Interrupt Status register using ‘readl()’ as
> > given below.
> >     status = readl(ctrl->reg + INT_STATUS);
> >
> >
> > And then clears the pending interrupts using ‘writel()’ as given below.
> >     writel(status, ctrl->reg + INT_STATUS);
> >
> >
> > I've noticed a kernel hang if the INT_STATUS register is read again
> > after clearing the pending interrupts.
> >
> > Can someone clarify why the kernel hangs without a crash dump in case
> > I read the INT_STATUS register using readl() after clearing the
> > pending bits?
> >
> > Can readl() block?
>
> readl() should not block in software.  Obviously at the hardware CPU
> instruction level, the read instruction has to wait for the result of
> the read.  Since that data is provided by the device, i.e., your FPGA,
> it's possible there's a problem there.

Thank you very much for your reply.
Where can I find details about the protocol for reading memory-mapped
I/O? Can you point me to any useful links?
I tried to locate the exact point in the kernel code where the CPU waits
for the read instruction, as given below:
readl() -> __raw_readl() -> return *(const volatile u32 __force *)addr
Do I need to check the assembly instructions here?

>
> Can you tell whether the FPGA has received the Memory Read for
> INT_STATUS and sent the completion?

Is there a way to know this with the help of software debugging (either
enabling dynamic debugging or adding new debug prints)? Can you please
point me to any tools or hardware needed to find this?
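One software-side check I could add, as a minimal sketch: as far as I
understand, on many x86 systems a PCI read that never gets a completion
returns all ones (0xFFFFFFFF) instead of real data, so the ISR could test
for that value. The sketch below builds in user space; mmio_read32() is a
stand-in for readl(), and the INT_STATUS offset of 0x10 and the helper
names are assumptions for illustration, not taken from my driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical offset: the real INT_STATUS offset is not given in this thread. */
#define INT_STATUS 0x10u

/* User-space stand-in for readl(): one 32-bit volatile load from base + offset. */
static inline uint32_t mmio_read32(const volatile void *base, uint32_t offset)
{
	const volatile uint8_t *p = (const volatile uint8_t *)base + offset;

	return *(const volatile uint32_t *)p;
}

/* All ones is what a read with no completion commonly looks like (assumption). */
static inline bool read_looks_failed(uint32_t val)
{
	return val == UINT32_MAX;
}

/*
 * Read INT_STATUS into *status.
 * Returns false when the value looks like a failed read, so the caller
 * can log it and bail out instead of acting on bogus status bits.
 */
static bool read_int_status(const volatile void *base, uint32_t *status)
{
	*status = mmio_read32(base, INT_STATUS);
	return !read_looks_failed(*status);
}
```

This only helps if the read eventually returns. It would not catch a case
where the CPU never gets any response at all, and a register whose valid
contents can be all ones would give false positives.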
>
> On the architectures I'm familiar with, if a device doesn't respond,
> something would eventually time out so the CPU doesn't wait forever.

What is the timeout here? I mean, how long does the CPU wait for the
completion? Since this code runs in interrupt context, does it cause the
system to freeze if the timeout is long?

lspci output:
$ lspci
00:00.0 Host bridge: Intel Corporation Atom Processor Z36xxx/Z37xxx Series SoC Transaction Register (rev 11)
00:02.0 VGA compatible controller: Intel Corporation Atom Processor Z36xxx/Z37xxx Series Graphics & Display (rev 11)
00:13.0 SATA controller: Intel Corporation Atom Processor E3800 Series SATA AHCI Controller (rev 11)
00:14.0 USB controller: Intel Corporation Atom Processor Z36xxx/Z37xxx, Celeron N2000 Series USB xHCI (rev 11)
00:1a.0 Encryption controller: Intel Corporation Atom Processor Z36xxx/Z37xxx Series Trusted Execution Engine (rev 11)
00:1b.0 Audio device: Intel Corporation Atom Processor Z36xxx/Z37xxx Series High Definition Audio Controller (rev 11)
00:1c.0 PCI bridge: Intel Corporation Atom Processor E3800 Series PCI Express Root Port 1 (rev 11)
00:1c.2 PCI bridge: Intel Corporation Atom Processor E3800 Series PCI Express Root Port 3 (rev 11)
00:1c.3 PCI bridge: Intel Corporation Atom Processor E3800 Series PCI Express Root Port 4 (rev 11)
00:1d.0 USB controller: Intel Corporation Atom Processor Z36xxx/Z37xxx Series USB EHCI (rev 11)
00:1f.0 ISA bridge: Intel Corporation Atom Processor Z36xxx/Z37xxx Series Power Control Unit (rev 11)
00:1f.3 SMBus: Intel Corporation Atom Processor E3800 Series SMBus Controller (rev 11)
01:00.0 RAM memory: PLDA Device 5555
03:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)

> >
> > Snippet of the ISR code is given below:
> >
> > https://pastebin.com/WdnZJZF5
> >
> > static irqreturn_t pcie_isr(int irq, void *dev_id)
> > {
> > 	struct test_device *ctrl = dev_id;
> > 	u32 status;
> > 	…
> >
> > 	status = readl(ctrl->reg + INT_STATUS);
> >
> > 	/*
> > 	 * Check to see if it was our interrupt
> > 	 */
> > 	if (!(status & 0x000C))
> > 		return IRQ_NONE;
> >
> > 	/* Clear the interrupt */
> > 	writel(status, ctrl->reg + INT_STATUS);
> >
> > 	if (status & 0x0004) {
> > 		/*
> > 		 * Tx interrupt pending.
> > 		 */
> > 		....
> > 	}
> >
> > 	if (status & 0x0008) {
> > 		/* Rx interrupt Pending */
> > 		/* The system freezes if I read again the INT_STATUS register as given below */
> > 		status = readl(ctrl->reg + INT_STATUS);
> > 		....
> > 	}
> > 	..
> > 	return IRQ_HANDLED;
> > }
> >
> >
> > --
> > Thanks,
> > Sekhar

--
Thanks,
Sekhar