On Tue, Jan 04, 2022 at 02:08:00AM -0800, Christoph Hellwig wrote:
> On Tue, Jan 04, 2022 at 09:56:31AM +0800, Lu Baolu wrote:
> > Multiple devices may be placed in the same IOMMU group because they
> > cannot be isolated from each other. These devices must either be
> > entirely under kernel control or userspace control, never a mixture.

I guess the reason is that if a group contained a mixture, userspace
could attack the kernel by programming a device to DMA to a device
owned by the kernel?

> > This adds dma ownership management in iommu core and exposes several
> > interfaces for the device drivers and the device userspace assignment
> > framework (i.e. vfio), so that any conflict between user and kernel
> > controlled DMA could be detected at the beginning.

Maybe I'm missing the point because I don't know what "conflict
between user and kernel controlled DMA" is. Are you talking about
both userspace and the kernel programming the same device to do DMA?

> > The device driver oriented interfaces are,
> >
> > 	int iommu_device_use_dma_api(struct device *dev);
> > 	void iommu_device_unuse_dma_api(struct device *dev);

Nit, do we care whether it uses the actual DMA API? Or is it just
that iommu_device_use_dma_api() tells us the driver may program the
device to do DMA?

> > Devices under kernel drivers control must call iommu_device_use_dma_api()
> > before driver probes. The driver binding process must be aborted if it
> > returns failure.

"Devices" don't call functions. Drivers do, or in this case, it looks
like the bus DMA code (platform, amba, fsl, pci, etc).

These functions are EXPORT_SYMBOL_GPL(), but it looks like all the
callers are built-in, so maybe the export is unnecessary?

You use "iommu"/"IOMMU" and "dma"/"DMA" interchangeably above. Would
be easier to read if you picked one.

> > The vfio oriented interfaces are,
> >
> > 	int iommu_group_set_dma_owner(struct iommu_group *group,
> > 				      void *owner);
> > 	void iommu_group_release_dma_owner(struct iommu_group *group);
> > 	bool iommu_group_dma_owner_claimed(struct iommu_group *group);
> >
> > The device userspace assignment must be disallowed if the set dma owner
> > interface returns failure.

Can you connect this back to the "never a mixture" from the beginning?
If all you cared about was preventing an IOMMU group from containing
devices with a mixture of kernel drivers and userspace drivers, I
assume you could do that without iommu_device_use_dma_api(). So is
this a way to *allow* a mixture under certain restricted conditions?

Another nit below.

> > Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > Signed-off-by: Kevin Tian <kevin.tian@xxxxxxxxx>
> > Signed-off-by: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
> > ---
> >  include/linux/iommu.h |  31 ++++++++
> >  drivers/iommu/iommu.c | 161 +++++++++++++++++++++++++++++++++++++++++-
> >  2 files changed, 189 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > index de0c57a567c8..568f285468cf 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -682,6 +682,13 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev,
> >  void iommu_sva_unbind_device(struct iommu_sva *handle);
> >  u32 iommu_sva_get_pasid(struct iommu_sva *handle);
> >
> > +int iommu_device_use_dma_api(struct device *dev);
> > +void iommu_device_unuse_dma_api(struct device *dev);
> > +
> > +int iommu_group_set_dma_owner(struct iommu_group *group, void *owner);
> > +void iommu_group_release_dma_owner(struct iommu_group *group);
> > +bool iommu_group_dma_owner_claimed(struct iommu_group *group);
> > +
> >  #else /* CONFIG_IOMMU_API */
> >
> >  struct iommu_ops {};
> > @@ -1082,6 +1089,30 @@ static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
> >  {
> >  	return NULL;
> >  }
> > +
> > +static inline int iommu_device_use_dma_api(struct device *dev)
> > +{
> > +	return 0;
> > +}
> > +
> > +static inline void iommu_device_unuse_dma_api(struct device *dev)
> > +{
> > +}
> > +
> > +static inline int
> > +iommu_group_set_dma_owner(struct iommu_group *group, void *owner)
> > +{
> > +	return -ENODEV;
> > +}
> > +
> > +static inline void iommu_group_release_dma_owner(struct iommu_group *group)
> > +{
> > +}
> > +
> > +static inline bool iommu_group_dma_owner_claimed(struct iommu_group *group)
> > +{
> > +	return false;
> > +}
> >  #endif /* CONFIG_IOMMU_API */
> >
> >  /**
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > index 8b86406b7162..ff0c8c1ad5af 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -48,6 +48,8 @@ struct iommu_group {
> >  	struct iommu_domain *default_domain;
> >  	struct iommu_domain *domain;
> >  	struct list_head entry;
> > +	unsigned int owner_cnt;
> > +	void *owner;
> >  };
> >
> >  struct group_device {
> > @@ -289,7 +291,12 @@ int iommu_probe_device(struct device *dev)
> >  	mutex_lock(&group->mutex);
> >  	iommu_alloc_default_domain(group, dev);
> >
> > -	if (group->default_domain) {
> > +	/*
> > +	 * If device joined an existing group which has been claimed
> > +	 * for none kernel DMA purpose, avoid attaching the default
> > +	 * domain.

Another "none kernel DMA purpose" that doesn't read well. Is this
supposed to be "non-kernel"? What does "claimed for non-kernel DMA
purpose" mean? What interface does that?

> > +	 */
> > +	if (group->default_domain && !group->owner) {
> >  		ret = __iommu_attach_device(group->default_domain, dev);
> >  		if (ret) {
> >  			mutex_unlock(&group->mutex);
> > @@ -2320,7 +2327,7 @@ static int __iommu_attach_group(struct iommu_domain *domain,
> >  {
> >  	int ret;
> >
> > -	if (group->default_domain && group->domain != group->default_domain)
> > +	if (group->domain && group->domain != group->default_domain)
> >  		return -EBUSY;
> >
> >  	ret = __iommu_group_for_each_dev(group, domain,
> > @@ -2357,7 +2364,11 @@ static void __iommu_detach_group(struct iommu_domain *domain,
> >  {
> >  	int ret;
> >
> > -	if (!group->default_domain) {
> > +	/*
> > +	 * If group has been claimed for none kernel DMA purpose, avoid
> > +	 * re-attaching the default domain.
> > +	 */
>
> none kernel reads odd. But maybe drop that and just say 'claimed
> already' ala:
>
> /*
>  * If the group has been claimed already, do not re-attach the default
>  * domain.
>  */
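
To check my reading of the commit log, here is a rough userspace model
of the ownership rule as I understand it. This is only a sketch, not
the patch's implementation: struct group_model and every model_*()
name are made up for illustration, the -EBUSY/-EINVAL return values
are my guess, and I am assuming a group with any existing claim
rejects a new userspace owner. Only the owner_cnt and owner field
names come from the hunk above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the two fields added to struct iommu_group. */
struct group_model {
	unsigned int owner_cnt;	/* number of active claims on the group */
	void *owner;		/* non-NULL only while userspace owns the group */
};

/* Kernel-driver claim: refused only if userspace already owns the group. */
static int model_use_dma_api(struct group_model *g)
{
	if (g->owner_cnt && g->owner)
		return -EBUSY;
	g->owner_cnt++;
	return 0;
}

/* Drop one kernel-driver claim; the caller must hold one. */
static void model_unuse_dma_api(struct group_model *g)
{
	assert(g->owner_cnt > 0 && !g->owner);
	g->owner_cnt--;
}

/* Userspace claim: refused if anybody has already claimed the group. */
static int model_set_dma_owner(struct group_model *g, void *owner)
{
	if (!owner)
		return -EINVAL;
	if (g->owner_cnt)
		return -EBUSY;
	g->owner = owner;
	g->owner_cnt = 1;
	return 0;
}

/* Release the userspace claim so kernel drivers may bind again. */
static void model_release_dma_owner(struct group_model *g)
{
	assert(g->owner_cnt > 0 && g->owner);
	g->owner = NULL;
	g->owner_cnt = 0;
}
```

If that matches the intent, then several kernel drivers can share a
group, but a userspace owner and a kernel driver can never hold it at
the same time, which would be the "never a mixture" rule from the
commit log.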