On 08/07/2022 17:32, Tian, Kevin wrote:
From: Alexey Kardashevskiy <aik@xxxxxxxxx>
Sent: Friday, July 8, 2022 2:35 PM
On 7/8/22 15:00, Alexey Kardashevskiy wrote:
On 7/8/22 01:10, Jason Gunthorpe wrote:
On Thu, Jul 07, 2022 at 11:55:52PM +1000, Alexey Kardashevskiy wrote:
Historically PPC64 managed to avoid using iommu_ops. The VFIO driver
uses a SPAPR TCE sub-driver and all iommu_ops uses were kept in
the Type1 VFIO driver. Recent development though has added a
coherency
capability check to the generic part of VFIO and essentially disabled
VFIO on PPC64; the similar story about
iommu_group_dma_owner_claimed().
This adds an iommu_ops stub which reports support for cache
coherency. Because bus_set_iommu() triggers IOMMU probing of PCI
devices,
this provides minimum code for the probing to not crash.
stale comment since this patch doesn't use bus_set_iommu() now.
+
+static int spapr_tce_iommu_attach_dev(struct iommu_domain *dom,
+ struct device *dev)
+{
+ return 0;
+}
It is important when this returns that the iommu translation is all
emptied. There should be no left over translations from the DMA API at
this point. I have no idea how power works in this regard, but it
should be explained why this is safe in a comment at a minimum.
> It will turn into a security problem to allow kernel mappings to leak
> past this point.
>
I've added for v2 checking for no valid mappings for a device (or, more
precisely, in the associated iommu_group), this domain does not need
checking, right?
Uff, not that simple. Looks like once a device is in a group, its
dma_ops is set to iommu_dma_ops and IOMMU code owns DMA. I guess
then
there is a way to set those to NULL or do something similar to let
dma_map_direct() from kernel/dma/mapping.c return "true", is not there?
dev->dma_ops is NULL as long as you don't handle DMA domain type
here and don't call iommu_setup_dma_ops().
Given this only supports blocking domain then above should be irrelevant.
For now I'll add a comment in spapr_tce_iommu_attach_dev() that it is
fine to do nothing as tce_iommu_take_ownership() and
tce_iommu_take_ownership_ddw() take care of not having active DMA
mappings. Thanks,
In general, is "domain" something from hardware or it is a software
concept? Thanks,
'domain' is a software concept to represent the hardware I/O page
table.
About this. If a platform has a concept of explicit DMA windows (2 or
more), is it one domain with 2 windows or 2 domains with one window each?
If it is 2 windows, iommu_domain_ops misses windows manipulation
callbacks (I vaguely remember it being there for embedded PPC64 but
cannot find it quickly).
If it is 1 window per a domain, then can a device be attached to 2
domains at least in theory (I suspect not)?
On server POWER CPUs, each DMA window is backed by an independent IOMMU
page table. (reminder) A window is a bus address range where devices are
allowed to DMA to/from ;)
Thanks,
A blocking domain means all DMAs from a device attached to
this domain are blocked/rejected (equivalent to an empty I/O page
table), usually enforced in the .attach_dev() callback.
Yes, a comment for why having a NULL .attach_dev() is OK is welcomed.
Thanks
Kevin
--
Alexey