Re: [PATCH RFC v2 1/4] vdpa: introduce .reset_map operation callback

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Sep 12, 2023 at 7:31 AM Si-Wei Liu <si-wei.liu@xxxxxxxxxx> wrote:
>
> Hi Jason,
>
> On 9/10/2023 8:42 PM, Jason Wang wrote:
> > Hi Si-Wei:
> >
> > On Sat, Sep 9, 2023 at 9:34 PM Si-Wei Liu <si-wei.liu@xxxxxxxxxx> wrote:
> >> On-chip IOMMU parent driver could use it to restore memory mapping
> >> to the initial state.
> > As discussed before. On-chip IOMMU is the hardware details that need
> > to be hidden by the vDPA bus.
> I guess today this is exposed to the bus driver layer already, for e.g.
> vhost_vdpa_map() can call into the  .dma_map, or .set_map, or
> iommu_map() flavors depending on the specific hardware IOMMU
> implementation underneath? Specifically, "struct iommu_domain *domain"
> is now part of "struct vhost_vdpa" at an individual bus driver
> (vhost-vdpa), rather than being wrapped around under the vdpa core
> "struct vdpa_device" as vdpa device level object. Do we know for what
> reason the hardware details could be exposed to bus callers like
> vhost_vdpa_map and vhost_vdpa_general_unmap, while it's prohibited for
> other similar cases on the other hand? Or is there a boundary in between
> I was not aware of?

Let me try to explain:

set_map(), dma_map(), dma_unmap() is used for parent specific
mappings. It means the parents want to do vendor specific setup for
the mapping. The abstraction of translation is still one dimension
(thought the actual implementation in the parent could be two
dimensions).  So it's not necessarily the on-chip stuff (see the
example of the VDUSE).

That means we never expose two dimension mappings like (on-chip)
beyond the bus. So it's not one dimension vs two dimensions but the
platform specific mappings vs vendor specific mappings.

>
> I think a more fundamental question I don't quite understand, is adding
> an extra API to on-chip IOMMU itself an issue, or just that you don't
> like the way how the IOMMU model gets exposed via this specific API of
> .reset_map?

extra API to on-chip IOMMU, since the on-chip logics should be hidden
by the bus unless we want to introduce the two dimensions abstraction
(which seems to be an overkill).

> For the platform IOMMU case, internally there exists
> distinction between the 1:1 identify (passthrough) mode and DMA page
> mapping mode, and this distinction is somehow getting exposed and
> propagated through the IOMMU API - for e.g. iommu_domain_alloc() and
> iommu_attach_device() are being called explicitly from
> vhost_vdpa_alloc_domain() by vhost-vdpa (and the opposite from within
> vhost_vdpa_free_domain), while for virtio-vdpa it doesn't call any IOMMU
> API at all on the other hand

It's the way the kernel manages DMA mappings. For a userspace driver
via vhost-vDPA, it needs to call IOMMU APIs. And for a kernel driver
via virtio-vDPA, DMA API is used (via the dma_dev exposed through
virtio_vdpa). DMA API may decide to call IOMMU API if IOMMU is enabled
but not in passthrough mode.

> - which is to inherit what default IOMMU
> domain has.

Yes, but it's not a 1:1 (identify) mapping, it really depends on the
configuration. (And there could even be a swiotlb layer in the
middle).

> Ideally for on-chip IOMMU we can and should do pretty much
> the same, but I don't think there's a clean way without introducing any
> driver API to make vhost-vdpa case distinguish from the virtio-vdpa
> case. I'm afraid to say that it was just a hack to hide the necessary
> distinction needed by vdpa bus users for e.g. in the deep of
> vdpa_reset(), if not introducing any new driver API is the goal here...

So rest_map() is fine if it is not defined just for on-chip. For
example, does VDUSE need to implement it or not?

>
> > Exposing this will complicate the implementation of bus drivers.
> As said above, this distinction is needed by bus drivers, and it's
> already done by platform IOMMU via IOMMU API. I can drop the .reset_map
> API while add another set of similar driver API to mimic
> iommu_domain_alloc/iommu_domain_free, but doing this will complicate the
> parent driver's implementation on the other hand.

I'm not sure I understand the issue. But something like PD
allocation/free in RDMA?

> While .reset_map is
> what I can think of to be the simplest for parent, I can do the other
> way if you're fine with it. Let me know how it sounds.

I think what I still don't understand is: how is reset_map() related
to persistent IOTLB? I guess it's a must but I still didn't figure out
why.

Thanks

>
> Thanks,
> -Siwei
>
> >
> > Thanks
> >
> >> Signed-off-by: Si-Wei Liu <si-wei.liu@xxxxxxxxxx>
> >> ---
> >>   include/linux/vdpa.h | 7 +++++++
> >>   1 file changed, 7 insertions(+)
> >>
> >> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> >> index 17a4efa..daecf55 100644
> >> --- a/include/linux/vdpa.h
> >> +++ b/include/linux/vdpa.h
> >> @@ -324,6 +324,12 @@ struct vdpa_map_file {
> >>    *                             @iova: iova to be unmapped
> >>    *                             @size: size of the area
> >>    *                             Returns integer: success (0) or error (< 0)
> >> + * @reset_map:                 Reset device memory mapping (optional)
> >> + *                             Needed for device that using device
> >> + *                             specific DMA translation (on-chip IOMMU)
> >> + *                             @vdev: vdpa device
> >> + *                             @asid: address space identifier
> >> + *                             Returns integer: success (0) or error (< 0)
> >>    * @get_vq_dma_dev:            Get the dma device for a specific
> >>    *                             virtqueue (optional)
> >>    *                             @vdev: vdpa device
> >> @@ -401,6 +407,7 @@ struct vdpa_config_ops {
> >>                         u64 iova, u64 size, u64 pa, u32 perm, void *opaque);
> >>          int (*dma_unmap)(struct vdpa_device *vdev, unsigned int asid,
> >>                           u64 iova, u64 size);
> >> +       int (*reset_map)(struct vdpa_device *vdev, unsigned int asid);
> >>          int (*set_group_asid)(struct vdpa_device *vdev, unsigned int group,
> >>                                unsigned int asid);
> >>          struct device *(*get_vq_dma_dev)(struct vdpa_device *vdev, u16 idx);
> >> --
> >> 1.8.3.1
> >>
>

_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization




[Index of Archives]     [KVM Development]     [Libvirt Development]     [Libvirt Users]     [CentOS Virtualization]     [Netdev]     [Ethernet Bridging]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]

  Powered by Linux