On 21/03/2019 00:07, Robin Murphy wrote: > Unfortunately, having looked around the code, I think I do. 4.4 long > predates the iommu-map binding, and in the absence of anything other > than the hard-coded SID==RID assumption of arm-smmu at the time, they > apparently went and did their own wacky thing[1]. AFAICS the Stream ID > appears to be pretty much derived from the PCI topology as I would hope, > but it looks like it might depend on some sort of lookup table being > programmed appropriately as well. > > Bear in mind that the this is _The Qualcomm Android Kernel_ we're trying > to reason about here - playing true to the stereotype, the diff against > the mainline driver is significantly bigger than the entire mainline > driver itself; the line count of arm-smmu.c alone is pushing > 2-and-a-half times that of the file in 4.4.y ;) > > Since the curiosity had set in, I finally got round to dumping the ACPI > tables from my Snapdragon 835 laptop, and judging by the IORT it seems > like the EFI firmware for Windows machines does provide some set of > static ID mappings which could probably transcribe to an iommu-map (if > indeed it's valid at all - Windows itself doesn't seem to be even > touching PCI here), but I guess the Android BSP might not be so > generous. That'll be a question for the Qualcomm folks. FWIW mine > interestingly claims that its SMMU instances are all sharing SPI 231 as > a global fault interrupt, but whether that's true and/or depends on the > runtime firmware, again I really have no idea. > > Robin. > > [1] > https://source.codeaurora.org/quic/la/kernel/msm-4.4/tree/drivers/pci/host/pci-msm.c?h=LE.UM.1.3.r3.25&id=f1fa301f977f06dcf990c0452d85e2f67d8cbbf1#n4687 As far as I can tell, the specific line that crashes the system is: AT_WRITE_REG(hw, REG_RXQ_CTRL, rxq); which attempts to write 0x80800000 to [0x1b300000 + 0x15a0] Note that I wrote "crashes the system" not "crashes the kernel" because there is no panic, no log from the kernel. The system just hangs for a few seconds, and then reboots. I suspect the system might be trapping into the FW, which is unhappy because $REASON. What I find confusing is that the previous write was: AT_WRITE_REG(hw, REG_TXQ_CTRL, txq); Why would REG_RXQ_CTRL be handled differently than REG_TXQ_CTRL? Sigh...