Hi John!

On Sat, Mar 17, 2018 at 6:47 PM, John David Anglin <dave.anglin@xxxxxxxx> wrote:
> Hi Grant,
>
> On 2018-03-17 12:12 PM, Grant Grundler wrote:
>>
>> "Master Abort" means the MMIO
>> transaction timed out - usually due to the device not responding to an
>> MMIO read.
>
> In lba_pci.c and sba_iommu.c, it says "BE WARNED: register writes are
> posted" and need to be followed by a read. It seems there are some
> routines in these modules that have writes that aren't followed by a read.
> One is lba_wr_cfg(). Another might be the macro
> LBA_CFG_RESTORE(). Are these okay?

I looked through the two examples you point out and I *think* both are
ok. lba_wr_cfg() issues an MMIO write and immediately afterwards calls
LBA_CFG_MASTER_ABORT_CHECK(), which performs an MMIO read from the same
base address. LBA_CFG_RESTORE() is "lazy": the next MMIO read will
flush those three writes and (I believe) any following MMIO writes will
still be issued in order.

Typically, the problem with posted MMIO writes is that DMA or other
events don't start until the MMIO write is "seen" by the device. This
is important when specific timing between MMIO transactions is required
OR some magic (e.g. device reset, frame buffer update, etc.) happens.

> It seems probable that the problem that Carlo is having is a conflict
> between devices.

Hrm. I don't know. I haven't yet looked at the latest dump that Carlo
helpfully provided as I'm still traveling. Why do you suspect this?

I'm skeptical about "conflict between devices" (due to lba_wr_cfg())
for two reasons:
1) configuration space accesses are usually not part of normal IO
   device transaction processing.
2) I've nearly always found that PCI Master Aborts (on MMIO reads) are
   just a symptom of something else going wrong and not the root cause.

Typically, the issues I recall running into are around the drivers
hitting a corner case where the device is still performing DMA to an
address that gets unmapped by the driver.
This wedges the IOMMU (sba), and subsequent MMIO reads will then
generate an HPMC.

The hard part is to determine what the corner case is based on a DMA
address (as reported in SER PIM output). It requires a deeper
understanding of the DMA programming for the given SATA controller
(driver directing HW what to do), how transaction completions are
reported (SATA controller HW) and handled (driver operation).

In the past, I've sorted several of these issues out for the tg3 and
tulip NIC drivers, and I can say with confidence that some issues still
remain in the tulip driver shutdown path. But I gave up on trying to
fix those and later lost interest.

cheers,
grant
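
PS: The posted-write point above (a write is not "seen" by the device
until a later read pushes it out) can be illustrated with a toy
userspace model. This is only a sketch under stated assumptions, not
the code in lba_pci.c or sba_iommu.c: struct toy_dev, toy_writel(),
toy_readl(), toy_drain(), toy_writel_flush(), NREGS and QUEUE_DEPTH are
all hypothetical names made up for the illustration, and the "bridge"
is reduced to a simple FIFO that any read drains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NREGS       4   /* hypothetical number of device registers */
#define QUEUE_DEPTH 8   /* hypothetical depth of the posting buffer */

struct posted_write {
    size_t   reg;
    uint32_t val;
};

/* Toy device: writes wait in a bridge-side FIFO until something drains it. */
struct toy_dev {
    uint32_t            regs[NREGS];        /* what the device has actually seen */
    struct posted_write queue[QUEUE_DEPTH]; /* writes posted but not delivered */
    size_t              pending;            /* number of queued writes */
};

/* Posted write: returns at once; the device has NOT seen the value yet. */
void toy_writel(struct toy_dev *dev, size_t reg, uint32_t val)
{
    assert(reg < NREGS && dev->pending < QUEUE_DEPTH);
    dev->queue[dev->pending].reg = reg;
    dev->queue[dev->pending].val = val;
    dev->pending++;
}

/* Deliver all queued writes to the device, in the order they were issued. */
void toy_drain(struct toy_dev *dev)
{
    for (size_t i = 0; i < dev->pending; i++)
        dev->regs[dev->queue[i].reg] = dev->queue[i].val;
    dev->pending = 0;
}

/* Read: not posted, so every earlier write is delivered before it completes. */
uint32_t toy_readl(struct toy_dev *dev, size_t reg)
{
    assert(reg < NREGS);
    toy_drain(dev);
    return dev->regs[reg];
}

/* Write followed by a read-back, so the write has landed when this returns. */
void toy_writel_flush(struct toy_dev *dev, size_t reg, uint32_t val)
{
    toy_writel(dev, reg, val);
    (void)toy_readl(dev, reg);
}
```

In this model, toy_writel_flush() corresponds to the "write, then read
from the same device" pattern, while several toy_writel() calls
followed by one later toy_readl() correspond to the "lazy" case: the
writes stay queued, then arrive in order when the read happens.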