On 10/15/2023 11:32 PM, Jason Wang wrote:
On Fri, Oct 13, 2023 at 3:36 PM Si-Wei Liu <si-wei.liu@xxxxxxxxxx> wrote:
On 10/12/2023 8:01 PM, Jason Wang wrote:
On Tue, Oct 10, 2023 at 5:05 PM Si-Wei Liu <si-wei.liu@xxxxxxxxxx> wrote:
Devices with on-chip IOMMU or vendor specific IOTLB implementation
may need to restore iotlb mapping to the initial or default state
using the .reset_map op, as it's desirable for some parent devices
to manipulate mappings entirely on their own, independent of virtio
device state. For instance, device reset does not cause mappings to go
away on such an IOTLB model, which needs persistent mappings. Before
vhost-vdpa goes away, give them a chance to reset iotlb back to the
initial state in vhost_vdpa_cleanup().
Signed-off-by: Si-Wei Liu <si-wei.liu@xxxxxxxxxx>
---
drivers/vhost/vdpa.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 851535f..a3f8160 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -131,6 +131,15 @@ static struct vhost_vdpa_as *vhost_vdpa_find_alloc_as(struct vhost_vdpa *v,
return vhost_vdpa_alloc_as(v, asid);
}
+static void vhost_vdpa_reset_map(struct vhost_vdpa *v, u32 asid)
+{
+ struct vdpa_device *vdpa = v->vdpa;
+ const struct vdpa_config_ops *ops = vdpa->config;
+
+ if (ops->reset_map)
+ ops->reset_map(vdpa, asid);
+}
+
static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
{
struct vhost_vdpa_as *as = asid_to_as(v, asid);
@@ -140,6 +149,13 @@ static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
hlist_del(&as->hash_link);
vhost_vdpa_iotlb_unmap(v, &as->iotlb, 0ULL, 0ULL - 1, asid);
+ /*
+ * Devices with vendor specific IOMMU may need to restore
+ * iotlb to the initial or default state which is not done
+ * through device reset, as the IOTLB mapping manipulation
+ * could be decoupled from the virtio device life cycle.
+ */
Should we do this according to whether IOTLB_PERSIST is set?
Well, in theory it seems so, but it's actually an unnecessary code
change, as that is how a vDPA parent behind a platform IOMMU works
today, and userspace doesn't break as of today. :)
Well, this is a question I've asked before. You have explained
that one of the reasons we don't break userspace is that it may
couple IOTLB reset with vDPA reset as well. One example is QEMU.
Nope, it was the opposite. Maybe it was not clear enough, let me try
once more - userspace CANNOT decouple IOTLB reset from vDPA reset today.
This is because a bug/discrepancy in mlx5_vdpa and vdpa_sim already
breaks userspace's expectation, keeping the vhost-vdpa mapping
interface from behaving as it promised and should have done. Only when
userspace sees the IOTLB_PERSIST flag can it trust the vhost-vdpa kernel
interface *reliably* to decouple IOTLB reset from vDPA reset. Without
seeing this flag, no matter how the code in QEMU was written, today's
older userspace was never able to assume the mappings will *definitely*
be cleared by vDPA reset. If any userspace implementation wants to get
consistent behavior for all vDPA parent devices, it still has to
*explicitly* clear all existing mappings on its own by sending a bunch
of unmap (iotlb invalidate) requests to the vhost-vdpa kernel interface
before resetting the vDPA backend.
In brief, userspace is already broken by kernel implementation today,
and new userspace needs some device flag to know for sure if kernel bug
has already been fixed; older userspace doesn't care about preserving
the broken kernel behavior at all, regardless of whether or not it wants
to decouple IOTLB from vDPA reset.
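To make the "explicitly clear all existing mappings" step concrete, here is a minimal sketch of how userspace might build and send one unmap (iotlb invalidate) request. It assumes the vhost UAPI header <linux/vhost.h> is available and that VHOST_BACKEND_F_IOTLB_MSG_V2 has already been negotiated on the vhost-vdpa fd. The helper names build_iotlb_invalidate() and send_iotlb_invalidate() are made up for illustration and are not taken from QEMU or any other userspace.

```c
#include <linux/vhost.h>   /* struct vhost_msg_v2, VHOST_IOTLB_* */
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: fill in an unmap request for [iova, iova + size). */
static void build_iotlb_invalidate(struct vhost_msg_v2 *msg,
                                   uint64_t iova, uint64_t size)
{
    /* Zeroing also leaves the address space id at 0 (an assumption here). */
    memset(msg, 0, sizeof(*msg));
    msg->type = VHOST_IOTLB_MSG_V2;
    msg->iotlb.type = VHOST_IOTLB_INVALIDATE;
    msg->iotlb.iova = iova;
    msg->iotlb.size = size;
}

/* Hypothetical helper: send one unmap request; 0 on success, -1 on failure. */
static int send_iotlb_invalidate(int vdpa_fd, uint64_t iova, uint64_t size)
{
    struct vhost_msg_v2 msg;

    build_iotlb_invalidate(&msg, iova, size);
    return write(vdpa_fd, &msg, sizeof(msg)) == (ssize_t)sizeof(msg) ? 0 : -1;
}
```

Userspace that wants consistent behavior across parents would call something like send_iotlb_invalidate() for every range it previously mapped, and only then reset the vDPA backend.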
As explained in previous threads [1][2], when IOTLB_PERSIST is not set
it doesn't necessarily mean the iotlb will definitely be destroyed
across reset (think about the platform IOMMU case), so userspace today
already tolerates either a good or a bad IOMMU. Not checking whether
IOTLB_PERSIST is set here is intentional: there's no point in
emulating bad IOMMU behavior even for older userspace (doing that
improper emulation would result in even worse performance).
For two reasons:
1) backend features need to be acked by userspace; this is by design
There's no breakage on this part. Backend feature IOTLB_PERSIST won't be
set if userspace doesn't ack.
2) keeping the odd behaviour seems safer, as we can't audit
every userspace program
We definitely don't have to audit every userspace program, but I cannot
think of a case where a sane userspace program can be broken. Can you
elaborate on one or two potential userspace usages that may break
because of this? As said, platform IOMMU already does it this way.
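As a rough illustration of the userspace side of this argument, the sketch below models the decision a program would make after backend feature negotiation. It is a toy model under stated assumptions, not kernel or QEMU code: SKETCH_F_IOTLB_PERSIST is a stand-in bit position chosen for this example (the real definition lives in the vhost UAPI headers), both function names are hypothetical, and negotiation is simplified to "offered AND acked".

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in bit position for this sketch only; not the UAPI definition. */
#define SKETCH_F_IOTLB_PERSIST 8

/* Simplified model: a backend feature is in effect only if the kernel
 * offers it and userspace acks it. */
static uint64_t negotiated_features(uint64_t offered, uint64_t acked)
{
    return offered & acked;
}

/* Without the persist flag, userspace cannot know whether mappings
 * survive a vDPA reset, so it has to unmap explicitly before resetting
 * if it wants the same behavior on every parent device. */
static bool must_unmap_before_reset(uint64_t negotiated)
{
    return !(negotiated & (1ULL << SKETCH_F_IOTLB_PERSIST));
}
```

In this model an older userspace that never acks the flag always lands on the explicit-unmap path, whichever way the parent happens to treat mappings across reset.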
Regards,
-Siwei
Thanks
I think
the purpose of the IOTLB_PERSIST flag is just to give userspace 100%
certainty of persistent iotlb mapping not getting lost across vdpa reset.
Thanks,
-Siwei
[1]
https://lore.kernel.org/virtualization/9f118fc9-4f6f-dd67-a291-be78152e47fd@xxxxxxxxxx/
[2]
https://lore.kernel.org/virtualization/3364adfd-1eb7-8bce-41f9-bfe5473f1f2e@xxxxxxxxxx/
Otherwise
we may break old userspace.
Thanks
+ vhost_vdpa_reset_map(v, asid);
kfree(as);
return 0;
--
1.8.3.1