Re: [REGRESSION] Invalid gather when using Tegra210 media engines

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 04, 2025 at 09:21:12AM -0400, Jason Gunthorpe wrote:
> On Tue, Feb 04, 2025 at 12:26:46PM +0900, Mikko Perttunen wrote:
> > On 2/4/25 4:13 AM, Jason Gunthorpe wrote:
> > > On Mon, Feb 03, 2025 at 06:49:07PM +0000, Robin Murphy wrote:
> > > > I'd hope the historical reasons for not supporting IOMMU_DOMAIN_DMA in
> > > > tegra-smmu no longer apply, given that all the default domain stuff has now
> > > > been integrated into host1x for the newer arm-smmu based Tegras.
> > > 
> > > Indeed I do see appropriate looking calls to the normal DMA API, and
> > > the other mapping path is conditionalized by !host->domain.
> > > 
> > > So, why didn't it work for Diogo? Even in identity mode the DMA API
> > > will return correct DMA addresses and the !host->domain path will skip
> > > mapping them.
> > > 
> > > Poking around I wonder if there is some assumption that if other parts
> > > of the stack, maybe the DRM driver, are using the special domain than
> > > everyone is? It seems to blindly pass around IOVA without really
> > > checking who is consuming it.
> > 
> > I'm not sure where that would be, but it's certainly possible given that
> > this combination of code paths wouldn't have been tested.
> 
> I saw some weird stuff.. Like tegra_bo_pin() does:
> 
> 	/*
> 	 * If we've manually mapped the buffer object through the IOMMU, make sure to return the
> 	 * existing IOVA address of our mapping.
> 	 */
> 	if (!obj->mm) {
> 	} else {
> 		map->phys = obj->iova;
> 		map->chunks = 1;
> 
> And obj->iova isn't obviously linked to a dma map on dev.. The comment
> reads like it is making the assumption that there is only one iommu
> domain shared by everyone (without checking dev is part of that
> scheme)

Yes, that's correct. We used to have a bit more code around to deal with
single domain (Tegra SMMU is supported on some platforms where we have a
maximum of 4 address spaces, hence we have to be very careful about what
devices share which address space).

In fact the host1x and DRM drivers also support Tegra20, which didn't
have an IOMMU at all, but one of those really old GARTs. On top of that
it only supported a 32 *M*iB window. I think others have concluded that
today we can't practically use GART and it's better to use something
like CMA for those cases. So maybe that's not an argument anymore.

Most of these restrictions no longer apply as of Tegra124 (I think) and
starting with Tegra186 the IOMMU is an ARM SMMU, so none of these work-
arounds should be needed there. Most of this code exists for very legacy
reasons, but there's still people with Tegra30, Tegra114, Tegra124 and
Tegra210, all of which are susceptible to this (at least partially).

> > > Christian was telling me DMABUF had some drivers that made the
> > > (incorrect!) assumption they were all sharing translation.
> > > 
> > > It does seem like a nice project for someone who has the hardware to
> > > rip out all of this custom domain stuff and just have the iommu layer
> > > setup a shared dma-iommu domain.
> > 
> > This has been a long-standing project. The issue is that some boot chains
> > set up the display expecting identity mappings,

I should revive an old patch series that attempted to get rid of most of
this. The plan was indeed to remove all of the direct IOMMU API code and
use DMA API exclusively. There were a couple of issues with it, such as
performance regressions due to the extra mappings required, but I think
I got pretty far in mitigating most of those.

In the end I got frustrated with it a few years back because there are
some corner cases that I couldn't make work and testing of this becomes
increasingly complicated. And yes, display was specifically an issue on
Tegra210 platforms (earlier platforms didn't usually set up display on
boot).

One additional problem with supporting the reserved-memory identity
mappings is that existing firmware doesn't set things up correctly and
these devices are software EOL'ed, so no new releases are planned that
could contain a fix for this.

> The smmu driver can match on compatible strings and leave some devices
> in identity mode, if that helps. Other drivers are doing this to work
> around various issues.

This might indeed be a viable path. Again, we need to be careful about
the older devices, but maybe a shared dma-iommu domain would work these
days.

We definitely want to transition to a DMA IOMMU domain later on, but it
would need to be after display has had a chance to set things up. And it
would still need a firmware that actually passes the right information
to DT because otherwise we'd still get SMMU faults from the display
hardware trying to scan out from memory that it doesn't own (or I guess
scanning out garbage from memory that's now being used for other
purposes).

> > See https://lore.kernel.org/linux-iommu/20220512190052.1152377-5-thierry.reding@xxxxxxxxx/
> 
> But using RMR seems like a better solution?

reserved-memory is the DT equivalent of ACPI's RMR, as far as I
understand. Or at least, it's the closest thing to it. We don't have
ACPI on these devices, hence why we do this via reserved-memory DT
nodes. But again, the problem is that firmware doesn't currently pass
this correctly to the kernel, so we'd need a new release for that. I'd
need to find out what can be done to achieve this. One big issue might
be that it'd be a DT ABI break, technically. So upstream Linux would
only work reliably using that new version. That might be fine if we can
make some sort of official release.

> We could perhaps also figure out some way to leave the translation in
> identity until the DRM driver completes the reset, then the DRM driver
> could activate the DMA API translation manually.

Yeah. We do something similar for Tegra186 and later where we have the
memory controller redirect accesses to the SMMU only after the device
has been attached. I'm not sure if we can do that for the old Tegra
SMMU, though. It would probably still require a firmware update to make
sure this can be properly handed off to the kernel.

Thierry

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [ARM Kernel]     [Linux ARM]     [Linux ARM MSM]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux