On Tue, Feb 04, 2025 at 09:21:12AM -0400, Jason Gunthorpe wrote: > On Tue, Feb 04, 2025 at 12:26:46PM +0900, Mikko Perttunen wrote: > > On 2/4/25 4:13 AM, Jason Gunthorpe wrote: > > > On Mon, Feb 03, 2025 at 06:49:07PM +0000, Robin Murphy wrote: > > > > I'd hope the historical reasons for not supporting IOMMU_DOMAIN_DMA in > > > > tegra-smmu no longer apply, given that all the default domain stuff has now > > > > been integrated into host1x for the newer arm-smmu based Tegras. > > > > > > Indeed I do see appropriate looking calls to the normal DMA API, and > > > the other mapping path is conditionalized by !host->domain. > > > > > > So, why didn't it work for Diogo? Even in identity mode the DMA API > > > will return correct DMA addresses and the !host->domain path will skip > > > mapping them. > > > > > > Poking around I wonder if there is some assumption that if other parts > > > of the stack, maybe the DRM driver, are using the special domain than > > > everyone is? It seems to blindly pass around IOVA without really > > > checking who is consuming it. > > > > I'm not sure where that would be, but it's certainly possible given that > > this combination of code paths wouldn't have been tested. > > I saw some weird stuff.. Like tegra_bo_pin() does: > > /* > * If we've manually mapped the buffer object through the IOMMU, make sure to return the > * existing IOVA address of our mapping. > */ > if (!obj->mm) { > } else { > map->phys = obj->iova; > map->chunks = 1; > > And obj->iova isn't obviously linked to a dma map on dev.. The comment > reads like it is making the assumption that there is only one iommu > domain shared by everyone (without checking dev is part of that > scheme) Yes, that's correct. We used to have a bit more code around to deal with single domain (Tegra SMMU is supported on some platforms where we have a maximum of 4 address spaces, hence we have to be very careful about what devices share which address space). In fact the host1x and DRM drivers also support Tegra20, which didn't have an IOMMU at all, but one of those really old GARTs. On top of that it only supported a 32 *M*iB window. I think others have concluded that today we can't practically use GART and it's better to use something like CMA for those cases. So maybe that's not an argument anymore. Most of these restrictions no longer apply as of Tegra124 (I think) and starting with Tegra186 the IOMMU is an ARM SMMU, so none of these work- arounds should be needed there. Most of this code exists for very legacy reasons, but there's still people with Tegra30, Tegra114, Tegra124 and Tegra210, all of which are susceptible to this (at least partially). > > > Christian was telling me DMABUF had some drivers that made the > > > (incorrect!) assumption they were all sharing translation. > > > > > > It does seem like a nice project for someone who has the hardware to > > > rip out all of this custom domain stuff and just have the iommu layer > > > setup a shared dma-iommu domain. > > > > This has been a long-standing project. The issue is that some boot chains > > set up the display expecting identity mappings, I should revive an old patch series that attempted to get rid of most of this. The plan was indeed to remove all of the direct IOMMU API code and use DMA API exclusively. There were a couple of issues with it, such as performance regressions due to the extra mappings required, but I think I got pretty far in mitigating most of those. In the end I got frustrated with it a few years back because there are some corner cases that I couldn't make work and testing of this becomes increasingly complicated. And yes, display was specifically an issue on Tegra210 platforms (earlier platforms didn't usually set up display on boot). One additional problem with supporting the reserved-memory identity mappings is that existing firmware doesn't set things up correctly and these devices are software EOL'ed, so no new releases are planned that could contain a fix for this. > The smmu driver can match on compatible strings and leave some devices > in identity mode, if that helps. Other drivers are doing this to work > around various issues. This might indeed be a viable path. Again, we need to be careful about the older devices, but maybe a shared dma-iommu domain would work these days. We definitely want to transition to a DMA IOMMU domain later on, but it would need to be after display has had a chance to set things up. And it would still need a firmware that actually passes the right information to DT because otherwise we'd still get SMMU faults from the display hardware trying to scan out from memory that it doesn't own (or I guess scanning out garbage from memory that's now being used for other purposes). > > See https://lore.kernel.org/linux-iommu/20220512190052.1152377-5-thierry.reding@xxxxxxxxx/ > > But using RMR seems like a better solution? reserved-memory is the DT equivalent of ACPI's RMR, as far as I understand. Or at least, it's the closest thing to it. We don't have ACPI on these devices, hence why we do this via reserved-memory DT nodes. But again, the problem is that firmware doesn't currently pass this correctly to the kernel, so we'd need a new release for that. I'd need to find out what can be done to achieve this. One big issue might be that it'd be a DT ABI break, technically. So upstream Linux would only work reliably using that new version. That might be fine if we can make some sort of official release. > We could perhaps also figure out some way to leave the translation in > identity until the DRM driver completes the reset, then the DRM driver > could activate the DMA API translation manually. Yeah. We do something similar for Tegra186 and later where we have the memory controller redirect accesses to the SMMU only after the device has been attached. I'm not sure if we can do that for the old Tegra SMMU, though. It would probably still require a firmware update to make sure this can be properly handed off to the kernel. Thierry
Attachment:
signature.asc
Description: PGP signature