On 2022-04-07 14:59, Jason Gunthorpe wrote:
On Thu, Apr 07, 2022 at 07:18:48AM +0000, Tian, Kevin wrote:
From: Jason Gunthorpe <jgg@xxxxxxxxxx>
Sent: Thursday, April 7, 2022 1:17 AM
On Wed, Apr 06, 2022 at 06:10:31PM +0200, Christoph Hellwig wrote:
On Wed, Apr 06, 2022 at 01:06:23PM -0300, Jason Gunthorpe wrote:
On Wed, Apr 06, 2022 at 05:50:56PM +0200, Christoph Hellwig wrote:
On Wed, Apr 06, 2022 at 12:18:23PM -0300, Jason Gunthorpe wrote:
Oh, I didn't know about device_get_dma_attr()..
Which is completely broken for any non-OF, non-ACPI plaform.
I saw that, but I spent some time searching and could not find an
iommu driver that would load independently of OF or ACPI. ie no IOMMU
platform drivers are created by board files. Things like Intel/AMD
discover only from ACPI, etc.
Intel discovers IOMMUs (and optionally ACPI namespace devices) from
ACPI, but there is no ACPI description for PCI devices i.e. the current
logic of device_get_dma_attr() cannot be used on PCI devices.
Oh? So on x86 acpi_get_dma_attr() returns DEV_DMA_NON_COHERENT or
DEV_DMA_NOT_SUPPORTED?
I think it _should_ return DEV_DMA_COHERENT on x86/IA-64 (unless a _CCA
method was actually present to say otherwise), based on
acpi_init_coherency(), but I only know for sure what happens on arm64.
I think I should give up on this and just redefine the existing iommu
cap flag to IOMMU_CAP_CACHE_SUPPORTED or something.
TBH I don't see any issue with current name, but I'd certainly be happy
to nail down a specific definition for it, along the lines of "this
means that IOMMU_CACHE mappings are generally coherent". That works for
things like Arm's S2FWB making it OK to assign an otherwise-non-coherent
device without extra hassle.
For the specific case of overriding PCIe No Snoop (which is more
problematic from an Arm SMMU PoV) when assigning to a VM, would that not
be easier solved by just having vfio-pci clear the "Enable No Snoop"
control bit in the endpoint's PCIe capability?
We could alternatively use existing device_get_dma_attr() as a default
with an iommu wrapper and push the exception down through the iommu
driver and s390 can override it.
if going this way probably device_get_dma_attr() should be renamed to
device_fwnode_get_dma_attr() instead to make it clearer?
I'm looking at the few users:
drivers/ata/ahci_ceva.c
drivers/ata/ahci_qoriq.c
- These are ARM only drivers. They are trying to copy the dma-coherent
property from its DT/ACPI definition to internal register settings
which look like they tune how the AXI bus transactions are created.
I'm guessing the SATA IP block's AXI interface can be configured to
generate coherent or non-coherent requests and it has to be set
in a way that is consistent with the SOC architecture and match
what the DMA API expects the device will do.
drivers/crypto/ccp/sp-platform.c
- Only used on ARM64 and also programs a HW register similar to the
sata drivers. Refuses to work if the FW property is not present.
drivers/net/ethernet/amd/xgbe/xgbe-platform.c
- Seems to be configuring another ARM AXI block
drivers/gpu/drm/panfrost/panfrost_drv.c
- Robin's commit comment here is good, and one of the things this
controls is if the coherent_walk is set for the io-pgtable-arm.c
code which avoids DMA API calls
drivers/gpu/drm/tegra/uapi.c
- Returns DRM_TEGRA_CHANNEL_CAP_CACHE_COHERENT to userspace. No idea.
My take is that the drivers using this API are doing it to make sure
their HW blocks are setup in a way that is consistent with the DMA API
they are also using, and run in constrained embedded-style
environments that know the firmware support is present.
So in the end it does not seem suitable right now for linking to
IOMMU_CACHE..
That seems a pretty good summary - I think they're basically all
"firmware told Linux I'm coherent so I'd better act coherent" cases, but
that still doesn't necessarily mean that they're *forced* to respect
that. One of the things on my to-do list is to try adding a
DMA_ATTR_NO_SNOOP that can force DMA cache maintenance for coherent
devices, primarily to hook up in Panfrost (where there is a bit of a
performance to claw back on the coherent AmLogic SoCs by leaving certain
buffers non-cacheable).
Cheers,
Robin.