Hi Russell/Laurent/Robin, On 5/24/2017 4:12 AM, Russell King - ARM Linux wrote: > On Wed, May 24, 2017 at 12:46:51AM +0300, Laurent Pinchart wrote: >> Hi Russell, >> >> On Tuesday 23 May 2017 18:53:19 Russell King - ARM Linux wrote: >>> On Tue, May 23, 2017 at 05:55:57PM +0100, Robin Murphy wrote: >>>> On 23/05/17 17:25, Russell King - ARM Linux wrote: >>>>> So, I've come to apply this patch (since it's finally landed in the >>>>> patch system), and I'm not convinced that the commit message is really >>>>> up to scratch. >>>>> >>>>> The current commit message looks like this: >>>>> >>>>> " ARM: 8674/1: dma-mapping: Reset the device's dma_ops >>>>> arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops(), >>>>> dma_ops should be cleared in the teardown path. Otherwise this >>>>> causes problem when the probe of device is retried after being >>>>> deferred. The device's iommu structures are cleared after >>>>> EPROBEDEFER error, but on the next try dma_ops will still be set to >>>>> old value, which is not right." >>>>> >>>>> It is obviously a fix, but a fix for which patch? Looking at the >>>>> history, we have "arm: dma-mapping: Don't override dma_ops in >>>>> arch_setup_dma_ops()" which I would have guessed is the appropriate >>>>> one, but this post-dates your patch (it's very recent, v4.12-rc >>>>> recent.) >>>>> >>>>> So, I need more description about the problem you were seeing when >>>>> you first proposed this patch. >>>>> >>>>> How does leaving the dma_ops in place prior to "arm: dma-mapping: >>>>> Don't override dma_ops in arch_setup_dma_ops()" cause problems for >>>>> deferred probing? >>>>> >>>>> What patch is your change trying to fix? In other words, how far >>>>> back does this patch need to be backported? >>>> >>>> In effect, it's fixing a latent inconsistency that's been present since >>>> its introduction in 4bb25789ed28. However, that has clearly not proven >>>> to be an issue in practice since then. With 09515ef5ddad we start >>>> actually calling arch_teardown_dma_ops() in a manner that might leave >>>> things partially initialised if anyone starts removing and reprobing >>>> drivers, but so far that's still from code inspection[1] rather than >>>> anyone hitting it. >>>> >>>> Given that the changes which tickle it are fresh in -rc1 I'd say there's >>>> no need to backport this, but at the same time it shouldn't do any >>>> damage if you really want to. >>> >>> Well, looking at this, I'm not convinced that much of it is correct. >>> >>> 1) set_dma_ops() is in arch_setup_dma_ops() but this patch adds >>> the unsetting of the DMA ops inside arm_teardown_iommu_dma_ops() >>> rather than arch_teardown_dma_ops(). >>> >>> This doesn't strike me as being particularly symmetric. >>> arm_teardown_iommu_dma_ops() is arm_setup_iommu_dma_ops()'s >>> counterpart. >>> >>> 2) arch_setup_dma_ops(), the recent patch to add the existing dma_ops >>> check, and Xen - Xen wants to override the DMA ops if in the >>> initial domain. It's not clear (at least to me) whether the recent >>> patch adding the dma_ops check took account of this or not. >>> >>> 3) random places seem to fiddle with the dma_ops - notice that >>> arm_iommu_detach_device() sets the dma_ops to NULL. >>> >>> In fact, I think moving __arm_iommu_detach_device() into >>> arm_iommu_detach_device(), calling arm_iommu_detach_device(), >>> and getting rid of the explicit set_dma_ops(, NULL) in this >>> path would be a good first step. >>> >>> 4) I think arch_setup_dma_ops() is over-complex. >>> >>> So, in summary, this code is a mess today, and that means it's not >>> obviously correct - which is bad. This needs sorting. >> >> We've reached the same conclusion independently, but I'll refrain from >> commenting on whether that's a good or bad thing :-) >> >> I don't think this patch should be applied, as it could break Xen (and other >> platforms/drivers that set the DMA operations manually) by resetting DMA >> operations at device remove() time even if they have been set independently of >> arch_setup_dma_ops(). > > That will only occur if the dma ops have been overriden once the DMA > operations have been setup via arch_setup_dma_ops. What saves it from > wholesale NULLing of the DMA operations is the check for a valid > dma_iommu_mapping structure in arm_teardown_iommu_dma_ops(). This only > exists when arm_setup_iommu_dma_ops() has attached a mapping to the > device. > Right, only if the dma ops are set and no dma_iommu_mapping is created for the device, then arch_teardown_iommu_dma_ops does nothing. Firstly, this patch for resetting the dma_ops in teardown was required only when arch_setup_dma_ops has created both the mapping and dma_ops for the device. Because mappings that were created in arch_setup_dma_ops are cleared in teardown, so dma ops should also be reset. But this can be done by calling arm_iommu_detach_device() from arm_teardown_iommu_dma_ops to avoid explicitly calling set_dma_ops again, probably this what was suggested in #3 above ? Really sorry for the mess, but below cleanups looks required otherwise, 1) set_dma_ops is called for mach-highbank, mach-mevbu during the BUS_NOTIFY_ADD_DEVICE. That should be removed and made to come from DT path and arch_setup_dma_ops. dmabounce.c also does set_dma_ops, not very sure how to handle that (swiotlb ?), are call dmabounce_register_dev during the device's probe instead to have the dma_set_ops overriding later in probe ? 2) arm_iommu_attach_device is called from ipmmu-vmsa.c, mtk_iommu_v1.c, iommu drivers, from the iommu add_device callback, called from BUS_NOTIFY_ADD_DEVICE notifier. This is a problem because, with probe-deferral, this can be overridden in arch_setup_dma_ops during device probe and cleared in teardown path. But the add_device callback notifier is not called again when the device gets reprobed again. With probe deferral, add_device callback also gets called from of_iommu_configure during device probe, so the above drivers should be adapted to properly register the iommu_ops to have its add_device called from of_iommu_configure path. mtk_iommu_v1.c seems to be fine, but ipmmu-vmsa.c should be adapted. otherwise if the devices attached to those iommus call arm_iommu_attach_device from its probe path to override the default ops set in arch_setup_dma_ops, then all is fine. This seems to be the case with exynos_drm_iommu.c, omap3isp/isp.c. If the above two are done, the overridding of the default dma_ops and mapping should happen after arch_setup_dma_ops is called and also overridden every time the device gets reprobed. That should help to get rid of couple of fixes that has been added. 3) As Laurent already pointed out earlier, return error codes from some of the IOMMU apis needs to standardized. Please let me know if its right way of doing it ? Regards, Sricharan -- "QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation -- To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html