On Wed, Apr 5, 2023 at 7:08 AM Mike Christie <michael.christie@xxxxxxxxxx> wrote:
>
> On 4/4/23 3:00 AM, Jason Wang wrote:
> >>
> >> -static void vhost_worker_free(struct vhost_dev *dev)
> >> +static void vhost_worker_put(struct vhost_dev *dev, struct vhost_worker *worker)
> >>  {
> >> -       struct vhost_worker *worker = dev->worker;
> >> -
> >>         if (!worker)
> >>                 return;
> >>
> >> -       dev->worker = NULL;
> >> +       if (!refcount_dec_and_test(&worker->refcount))
> >> +               return;
> >> +
> >>         WARN_ON(!llist_empty(&worker->work_list));
> >>         vhost_task_stop(worker->vtsk);
> >>         kfree(worker);
> >>  }
> >>
> >> +static void vhost_vq_detach_worker(struct vhost_virtqueue *vq)
> >> +{
> >> +       if (vq->worker)
> >
> > What happens to the pending work that is queued for the old worker?
>
> I didn't think there would be works queued at this time. I see your comment
> below about my assumption about the backend being set being wrong. Will
> discuss down there.
>
>
> >>
> >> +/* Caller must have device and virtqueue mutex */
> >> +static void __vhost_vq_attach_worker(struct vhost_virtqueue *vq,
> >> +                                    struct vhost_worker *worker)
> >> +{
> >> +       refcount_inc(&worker->refcount);
> >> +       vhost_vq_detach_worker(vq);
> >> +       vq->worker = worker;
> >
> > What happens if there's a pending flush in a specific device (e.g. in
> > vhost_scsi_tmf_resp_work())?
>
> We wouldn't hit that specific case where we are running the above code and
> vhost_scsi_tmf_resp_work at the same time.
>
> Either:
>
> 1. The backend is NULL and has never been set, so we would never have
> called vhost_scsi_tmf_resp_work, because it can only be called after the
> backend is set.
>
> 2. If the backend has been set and vhost_scsi_tmf_resp_work has
> run or is running, then we would not have called __vhost_vq_attach_worker
> from vhost_vq_attach_worker, because it would see vhost_vq_get_backend
> returning a non-NULL value.
>
> If vhost_scsi later sets the backend to NULL, then vhost_scsi_clear_endpoint
> will have made sure the flush has completed when the clear function returns.
> It does that with the device mutex, so when we run __vhost_vq_attach_worker
> it will only see a vq/worker with no flushes in progress.

Ok.

>
> For the general case of whether we can be doing a vhost_dev_flush/vhost_work_flush_on
> and __vhost_vq_attach_worker at the same time, I thought we are ok as well because I
> thought we currently have to hold the device mutex when we flush, so those can't
> race with ioctl calls to vhost_vq_attach_worker since we hold the dev mutex
> during those ioctls.

I'm not sure I understand here, but we can't assume the caller of
vhost_work_queue() is in the ioctl context. See
vhost_zerocopy_callback(). But since you want to limit the call to
before set_backend, another question comes: do we really need the
dynamic attaching/creating in this case? How about a single ioctl that
is used to build the queue-to-workers mapping instead?
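Just to make the idea concrete, a minimal sketch of such a uAPI. The
struct, its field names and the ioctl number below are made up for
illustration only, they are not part of this patchset:

#include <linux/ioctl.h>
#include <linux/types.h>
#include <linux/vhost.h>	/* VHOST_VIRTIO */

/*
 * Hypothetical: install the whole virtqueue -> worker mapping in one
 * call, before the backend is set, instead of per-vq attach ioctls.
 */
struct vhost_worker_map {
	__u32 nvqs;		/* number of entries in vq_workers[] */
	__u32 vq_workers[];	/* vq_workers[i] = worker id used by virtqueue i */
};

#define VHOST_SET_WORKER_MAP _IOW(VHOST_VIRTIO, 0x80, struct vhost_worker_map)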
> Or we call flush from the file_operations release function
> so the device is closed and can't race with ioctls.
>
> >
> > Does this mean we need to hold the vq mutex when doing the flush? (which
> > seems not to be the case for vhost_scsi_tmf_resp_work()).
>
> >> +}
> >> +
> >> +/* Caller must have device and virtqueue mutex */
> >> +static int vhost_vq_attach_worker(struct vhost_virtqueue *vq,
> >> +                                 struct vhost_vring_worker *info)
> >> +{
> >> +       unsigned long index = info->worker_id;
> >> +       struct vhost_dev *dev = vq->dev;
> >> +       struct vhost_worker *worker;
> >> +
> >> +       if (!dev->use_worker)
> >> +               return -EINVAL;
> >> +
> >> +       /*
> >> +        * We don't support setting a worker on an active vq to make flushing
> >> +        * and removal simple.
> >> +        */
> >> +       if (vhost_vq_get_backend(vq))
> >> +               return -EBUSY;
> >
> > This assumes the worker won't be started until the backend is set,
> > which is not true.
>
> I can see how we get flushes before setting the backend, but I thought we are
> ok because we have the device mutex held.
>
> What are the other possible cases? Is one case that we could get a
> VHOST_SET_VRING_KICK before setting the backend, so vhost_poll_start starts
> to allow vhost_poll_queue before the backend is set?

Yes, and it's not necessarily the kick; the networking core could wake
up workers before set_backend.

> Is there anything else?

Haven't found time to do this, but what I want to say is: if we want
this assumption, we need to document it and change the devices that
are affected by this change.

>
> I'll fix this issue.
>
>
> >> +
> >> +/* Caller must have device mutex */
> >> +static int vhost_free_worker(struct vhost_dev *dev,
> >> +                            struct vhost_worker_state *info)
> >> +{
> >> +       unsigned long index = info->worker_id;
> >> +       struct vhost_worker *worker;
> >> +
> >> +       if (!dev->use_worker)
> >> +               return -EINVAL;
> >> +
> >> +       worker = xa_find(&dev->worker_xa, &index, UINT_MAX, XA_PRESENT);
> >
> > So we use int for worker_id, which conflicts with UINT_MAX here?
>
> I switched from idr in the earlier versions to xa at the last second and
> added this mistake. Will fix.
>
> >
> >
> > struct vhost_worker_state {
> >         /*
> >          * For VHOST_NEW_WORKER the kernel will return the new vhost_worker id.
> >          * For VHOST_FREE_WORKER this must be set to the id of the vhost_worker
> >          * to free.
> >          */
> >         int worker_id;
> > };
> >
> > A side effect of using xa_index directly is that userspace can guess
> > the xa_index of the default worker and free it here. Is this intended?
> I don't need the functionality. It was only something that I didn't
> think mattered much, because you can only free the worker if it's not in
> use by any virtqueues, so I didn't add any special code to handle it.
> The user would have had to do an ATTACH to the default worker and replace
> it, then it could free it, so it works like the other workers.

It's about the robustness of the uAPI:

struct vhost_worker_state {
        /*
         * For VHOST_NEW_WORKER the kernel will return the new vhost_worker id.
         * For VHOST_FREE_WORKER this must be set to the id of the vhost_worker
         * to free.
         */
        int worker_id;
};

It looks like the worker_id is the one userspace gets from
VHOST_NEW_WORKER. If yes, we should hide the default worker.
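Something like the sketch below would do both: bound the ids to a range
that fits the uAPI's int and keep the default worker unreachable. It is
only a sketch against this patch; it assumes worker_xa is initialized
with XA_FLAGS_ALLOC (xa_init_flags()) and that struct vhost_worker
grows an id field, and it only shows the id handling, not the
vhost_task setup:

static int vhost_worker_id_alloc(struct vhost_dev *dev,
                                 struct vhost_worker *worker)
{
        u32 id;
        int ret;

        /*
         * Hand out ids in [1, INT_MAX] so they always fit the int
         * worker_id in struct vhost_worker_state. The default worker
         * would be created separately and never inserted into
         * worker_xa, so userspace can neither look it up nor free it.
         */
        ret = xa_alloc(&dev->worker_xa, &id, worker,
                       XA_LIMIT(1, INT_MAX), GFP_KERNEL);
        if (ret)
                return ret;

        worker->id = id;
        return 0;
}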
>
> > Should we hide the default worker from xa?
>
> I can change it if you are worried about future problems.
>
>
> >> -       default:
> >> +       case VHOST_ATTACH_VRING_WORKER:
> >> +               if (copy_from_user(&w, argp, sizeof(w))) {
> >> +                       r = -EFAULT;
> >> +                       break;
> >> +               }
> >> +               r = vhost_vq_attach_worker(vq, &w);
> >> +               if (!r && copy_to_user(argp, &w, sizeof(w)))
> >> +                       r = -EFAULT;
> >> +               break;
> >
> > It's a little odd that we have new and attach but only a free.
>
> Do you mean we would also want a detach? I've been debating that.
> I'm not sure what the user would want though.
>
> Would it leave the virtqueue with no worker? Or does it pick the first
> worker it finds? If there is no worker, because we did the former or
> because userspace detached all of them, then do we just drop work that
> gets queued?
>
> It seemed like a user would never want to drop work, so I made it so
> you can only tell it to use new workers (attach, which I guess is more
> like a swap worker)

Swap and free the old worker indeed?

static void vhost_vq_detach_worker(struct vhost_virtqueue *vq)
{
        if (vq->worker)
                vhost_worker_put(vq->dev, vq->worker);
        vq->worker = NULL;
}

That makes me think: under which case should we use VHOST_FREE_WORKER?

Thanks

> so we always have a worker and I don't have to
> worry about dropping work.
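To make the "swap, then free the old worker" flow concrete, the
userspace sequence would look roughly like the sketch below. It assumes
the ioctls and structs added by this series (VHOST_NEW_WORKER,
VHOST_ATTACH_VRING_WORKER, VHOST_FREE_WORKER, struct vhost_worker_state
and struct vhost_vring_worker with index/worker_id fields); the exact
layouts may change as the series is reworked, and error handling is
minimal:

#include <sys/ioctl.h>
#include <linux/vhost.h>

static int swap_vq_worker(int vhost_fd, unsigned int vq_index,
                          int old_worker_id)
{
        struct vhost_worker_state state = { 0 };
        struct vhost_vring_worker w = { 0 };

        /* Create a new worker; the kernel fills in state.worker_id. */
        if (ioctl(vhost_fd, VHOST_NEW_WORKER, &state) < 0)
                return -1;

        /* Attach the vq to the new worker (backend must not be set yet). */
        w.index = vq_index;
        w.worker_id = state.worker_id;
        if (ioctl(vhost_fd, VHOST_ATTACH_VRING_WORKER, &w) < 0)
                return -1;

        /* The old worker is no longer used by any vq, so it can be freed. */
        state.worker_id = old_worker_id;
        return ioctl(vhost_fd, VHOST_FREE_WORKER, &state);
}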