On Thu, Jan 5, 2017 at 10:49 AM, Will Deacon <will.deacon@xxxxxxx> wrote: > On Thu, Jan 05, 2017 at 10:27:27AM -0500, Rob Clark wrote: >> On Thu, Jan 5, 2017 at 6:55 AM, Will Deacon <will.deacon@xxxxxxx> wrote: >> > On Tue, Jan 03, 2017 at 04:30:54PM -0500, Rob Clark wrote: >> >> TODO maybe we want two options, one to enable stalling, and 2nd to punt >> >> handling to wq? I haven't needed to use mm APIs from fault handler yet >> >> (although it is something that I think we'll want some day). Perhaps >> >> stalling support is limited to just letting driver dump some extra >> >> debugging information otherwise. Threaded handling probably only useful >> >> with stalling, but inverse may not always be true. >> > >> > I'd actually like to see this stuck on a worker thread, because I think >> > that's more generally useful and I don't want to have a situation where >> > sometimes the IOMMU fault notifier is run in IRQ context and sometimes it's >> > not. >> >> So I was talking a bit w/ Jordan on IRC yesterday.. and we also have >> the GPU's hw hang-detect to contend with. So I *suspect* that when we >> get to the point of using this to do things like page in things from >> swap and resume the faulting transaction, we probably want to get >> called immediately from the IRQ handler so we can disable the hw >> hang-detect. > > Well, if you want to use an SMMU for paging, then the GPU driver would > need to request that explicitly when allocating its DMA buffers, to that > would be the time to either delay or disable the hang detection. If userspace is using SVM, for example, it is pretty impossible to know when to expect a fault. The best you could do is keep track that *some* process which has active work queued up for gpu is using SVM and disable hang detect for *everyone*.. which is kind of sad. >> I'm not sure if the better solution then would be to have two fault >> callbacks, one immediately from the IRQ and a later one from wq. Or >> let the driver handle the wq business and give it a way to tell the >> IOMMU when to resume. >> >> I kinda think we should punt on the worker thread for now until we are >> ready to resume faulting transactions, because I guess a strong chance >> that whatever way we do it now will be wrong ;-) > > I guess what I'm after is for you to change the interrupt handlers to be > threaded, like they are for SMMUv3. I *think* you can do that with a NULL > thread_fn for now, and just call report_iommu_fault from the handler. > The return value of that could, in theory, be used to queued the paging > request and wake the paging thread in future. If we only pass in the non-threaded irq fxn, I'm not really sure how that changes anything.. or maybe I'm not understanding what you mean. But yeah, I guess we could use request_threaded_irq() to get both IRQ context notification and a later thread context notification rather than doing the wq thing. Either way the iommu API has to change slightly. >> > I wonder if this should also be predicated on the compatible string, so >> > that the "arm,smmu-enable-stall" property is ignored (with a warning) if >> > the compatible string isn't specific enough to identify an implementation >> > with the required SS behaviour? On the other hand, it feels pretty >> > redundant and a single "stalling works" property is all we need. >> >> We could also drop the property and key the behavior on specific >> compat strings I guess. Having both seems a bit odd. Anyways, I'll >> defer to DT folks about what the cleaner approach is. > > As Robin pointed out, we do need to be able to distinguish the integration > of the device from the device itself. For example, MMU-9000 might be capable > of stalling, but if it's bolted to a PCI RC, it's not safe to do so. Hmm, well we install the fault handler on the iommu_domain.. perhaps maybe a combo of dts property (or deciding based on more specific compat string), plus extra param passed in to iommu_set_fault_hander(). The dts property or compat string to indicate whether the iommu (and how it is wired up) can handle stalls, and enable_stall param when fault handler is registered to indicate whether the device itself can cope.. if either can't do stalling, then don't set CFCFG. BR, -R -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html