On Thu, Jul 18, 2024 at 2:29 AM Steven Sistare <steven.sistare@xxxxxxxxxx> wrote: > > On 7/16/2024 1:28 AM, Jason Wang wrote: > > On Mon, Jul 15, 2024 at 10:28 PM Steven Sistare > > <steven.sistare@xxxxxxxxxx> wrote: > >> > >> On 7/14/2024 10:34 PM, Jason Wang wrote: > >>> On Fri, Jul 12, 2024 at 9:19 PM Steve Sistare <steven.sistare@xxxxxxxxxx> wrote: > >>>> > >>>> When device ownership is passed to a new process via VHOST_NEW_OWNER, > >>>> some devices need to know the new userland addresses of the dma mappings. > >>>> Define the new iotlb message type VHOST_IOTLB_REMAP to update the uaddr > >>>> of a mapping. The new uaddr must address the same memory object as > >>>> originally mapped. > >>>> > >>>> The user must suspend the device before the old address is invalidated, > >>>> and cannot resume it until after VHOST_IOTLB_REMAP is called, but this > >>>> requirement is not enforced by the API. > >>>> > >>>> Signed-off-by: Steve Sistare <steven.sistare@xxxxxxxxxx> > >>>> --- > >>>> drivers/vhost/vdpa.c | 58 ++++++++++++++++++++++++++++++++ > >>>> include/uapi/linux/vhost_types.h | 11 +++++- > >>>> 2 files changed, 68 insertions(+), 1 deletion(-) > >>>> > >>>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c > >>>> index 4396fe1a90c4..51f71c45c4a9 100644 > >>>> --- a/drivers/vhost/vdpa.c > >>>> +++ b/drivers/vhost/vdpa.c > >>>> @@ -1257,6 +1257,61 @@ static int vhost_vdpa_pa_map(struct vhost_vdpa *v, > >>>> > >>>> } > >>>> > >>>> +static int vhost_vdpa_process_iotlb_remap(struct vhost_vdpa *v, > >>>> + struct vhost_iotlb *iotlb, > >>>> + struct vhost_iotlb_msg *msg) > >>>> +{ > >>>> + struct vdpa_device *vdpa = v->vdpa; > >>>> + const struct vdpa_config_ops *ops = vdpa->config; > >>>> + u32 asid = iotlb_to_asid(iotlb); > >>>> + u64 start = msg->iova; > >>>> + u64 last = start + msg->size - 1; > >>>> + struct vhost_iotlb_map *map; > >>>> + int r = 0; > >>>> + > >>>> + if (msg->perm || !msg->size) > >>>> + return -EINVAL; > >>>> + > >>>> + map = vhost_iotlb_itree_first(iotlb, start, last); > >>>> + if (!map) > >>>> + return -ENOENT; > >>>> + > >>>> + if (map->start != start || map->last != last) > >>>> + return -EINVAL; > >>> > >>> I had a question here, if a buggy user space that: > >>> > >>> 1) forget to update some of the mappings > >>> 2) address is wrong > >>> 3) other cases. > >>> > >>> Does this mean the device can DMA to the previous owner? > >> > >> Yes, but only to the mappings which were already pinned for DMA for this > >> device, and the old owner is giving the new owner permission to DMA to that > >> memory even without bugs. > >> > >>> If yes, does > >>> this have security implications? > >> > >> No. The previous owner has given the new owner permission to take over. They > >> trust each other completely. In the live update case, they are the same; the new > >> owner is the old owner reincarnated. > > > > I understand the processes may trust each other but I meant the kernel > > may not trust those processes. > > If a process holds the key (the fd) then the kernel can trust that is has > permission to use it. Keys are only passed around voluntarily, unless there > is a bug. Looks not, for example, kernel can choose to limit various operations on a fd, even if the process holds the key 1) privileged process do setup on fd, passing that fd to unprivileged fd 2) unprivileged process can only use a subset of the functions of a fd In the case of Qemu, it prevents the kernel from the case where for example malicious guests can escape to Qemu. In the case of vhost-net, the privilege is the owner. For example, the following seems to be valid in the case of vhost-net: 1) Two processes (A and B) share a part of the memory 2) A is the owner of the vhost-net who is in charge of building memory mappings via IOTLB 3) A passess vhost-net fd to process B > > > For example: > > > > 1) old owner pass fd to new owner which is another process > > 2) the new owner do VHOST_NEW_OWNER > > 3) new owner doesn't do remap correctly > > > > There's no way for the old owner to remove/unpin the mappings as we > > have the owner check in IOTLB_UPDATE. Looks like a potential way for > > DOS. > > This is a bug in the second cooperating process, not a DOS. The application > must fix it. Sometimes you cannot recover from an application bug at run time. > > BTW, at one time vfio enforced the concept of an owner, but Alex deleted it. > It adds no value, because possession of the fd is the key. > ffed0518d871 ("vfio: remove useless judgement") This seems to be a great relaxation of the ownership check. I would like to hear from Michael first. Thanks > > - Steve > > >>>> + > >>>> + /* > >>>> + * The current implementation does not support the platform iommu > >>>> + * with use_va. And if !use_va, remap is not necessary. > >>>> + */ > >>>> + if (!ops->set_map && !ops->dma_map) > >>>> + return -EINVAL; > >>>> + > >>>> + /* > >>>> + * The current implementation supports set_map but not dma_map. > >>>> + */ > >>>> + if (!ops->set_map) > >>>> + return -EINVAL; > >>>> + > >>>> + /* > >>>> + * Do not verify that the new size@uaddr points to the same physical > >>>> + * pages as the old size@uaddr, because that would take time O(npages), > >>>> + * and would increase guest down time during live update. If the app > >>>> + * is buggy and they differ, then the app may corrupt its own memory, > >>>> + * but no one else's. > >>>> + */ > >>>> + > >>>> + /* > >>>> + * Batch will finish later in BATCH_END by calling set_map for the new > >>>> + * addresses collected here. Non-batch must do it now. > >>>> + */ > >>>> + if (!v->in_batch) > >>>> + r = ops->set_map(vdpa, asid, iotlb); > >>>> + if (!r) > >>>> + map->addr = msg->uaddr; > >>>> + > >>>> + return r; > >>>> +} > >>>> + > >>>> static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, > >>>> struct vhost_iotlb *iotlb, > >>>> struct vhost_iotlb_msg *msg) > >>>> @@ -1336,6 +1391,9 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, u32 asid, > >>>> ops->set_map(vdpa, asid, iotlb); > >>>> v->in_batch = false; > >>>> break; > >>>> + case VHOST_IOTLB_REMAP: > >>>> + r = vhost_vdpa_process_iotlb_remap(v, iotlb, msg); > >>>> + break; > >>>> default: > >>>> r = -EINVAL; > >>>> break; > >>>> diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h > >>>> index 9177843951e9..35908315ff55 100644 > >>>> --- a/include/uapi/linux/vhost_types.h > >>>> +++ b/include/uapi/linux/vhost_types.h > >>>> @@ -79,7 +79,7 @@ struct vhost_iotlb_msg { > >>>> /* > >>>> * VHOST_IOTLB_BATCH_BEGIN and VHOST_IOTLB_BATCH_END allow modifying > >>>> * multiple mappings in one go: beginning with > >>>> - * VHOST_IOTLB_BATCH_BEGIN, followed by any number of > >>>> + * VHOST_IOTLB_BATCH_BEGIN, followed by any number of VHOST_IOTLB_REMAP or > >>>> * VHOST_IOTLB_UPDATE messages, and ending with VHOST_IOTLB_BATCH_END. > >>>> * When one of these two values is used as the message type, the rest > >>>> * of the fields in the message are ignored. There's no guarantee that > >>>> @@ -87,6 +87,15 @@ struct vhost_iotlb_msg { > >>>> */ > >>>> #define VHOST_IOTLB_BATCH_BEGIN 5 > >>>> #define VHOST_IOTLB_BATCH_END 6 > >>>> + > >>>> +/* > >>>> + * VHOST_IOTLB_REMAP registers a new uaddr for the existing mapping at iova. > >>>> + * The new uaddr must address the same memory object as originally mapped. > >>>> + * Failure to do so will result in user memory corruption and/or device > >>>> + * misbehavior. iova and size must match the arguments used to create the > >>>> + * an existing mapping. Protection is not changed, and perm must be 0. > >>>> + */ > >>>> +#define VHOST_IOTLB_REMAP 7 > >>>> __u8 type; > >>>> }; > >>> > >>> Thanks > >>> > >>>> > >>>> -- > >>>> 2.39.3 > >>>> > >>> > >> > > >