On 2020/12/8 2:31 AM, Mike Christie wrote:
On 12/6/20 10:27 PM, Jason Wang wrote:
On 2020/12/5 12:32 AM, Mike Christie wrote:
On 12/4/20 2:09 AM, Jason Wang wrote:
On 2020/12/4 3:56 PM, Mike Christie wrote:
+static long vhost_vring_set_cpu(struct vhost_dev *d, struct vhost_virtqueue *vq,
+ void __user *argp)
+{
+ struct vhost_vring_state s;
+ int ret = 0;
+
+ if (vq->private_data)
+ return -EBUSY;
+
+ if (copy_from_user(&s, argp, sizeof s))
+ return -EFAULT;
+
+ if (s.num == -1) {
+ vq->cpu = s.num;
+ return 0;
+ }
+
+ if (s.num >= nr_cpu_ids)
+ return -EINVAL;
+
+ if (!d->ops || !d->ops->get_workqueue)
+ return -EINVAL;
+
+ if (!d->wq)
+ d->wq = d->ops->get_workqueue();
+ if (!d->wq)
+ return -EINVAL;
+
+ vq->cpu = s.num;
+ return ret;
+}
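For readers following the check order in the quoted function, here is a rough userspace model of it, not kernel code. The struct vq_model type, its fields, and model_set_cpu() are hypothetical stand-ins invented for illustration; only the sequence of checks mirrors the patch. It assumes s.num is an unsigned int, as in struct vhost_vring_state, so the "-1" sentinel compares as UINT_MAX and has to be tested before the nr_cpu_ids range check.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model of the checks in the quoted
 * vhost_vring_set_cpu(). None of these names exist in the kernel. */
struct vq_model {
    bool has_backend; /* stands in for vq->private_data being set */
    bool can_get_wq;  /* stands in for get_workqueue() being usable */
    int cpu;          /* -1 means "not bound to a CPU" */
};

static long model_set_cpu(struct vq_model *vq, unsigned int num,
                          unsigned int nr_cpu_ids)
{
    /* A virtqueue that already has a backend cannot be rebound. */
    if (vq->has_backend)
        return -EBUSY;

    /* (unsigned int)-1 is the "unbind" sentinel. It must be handled
     * before the range check, which would otherwise reject it. */
    if (num == (unsigned int)-1) {
        vq->cpu = -1;
        return 0;
    }

    if (num >= nr_cpu_ids)
        return -EINVAL;

    /* The driver must be able to supply a workqueue. */
    if (!vq->can_get_wq)
        return -EINVAL;

    vq->cpu = (int)num;
    return 0;
}
```

Note that in this model a failed call leaves the previous cpu value untouched, which matches the quoted code.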
So one question here. Who is in charge of doing this set_cpu? Note
that sched_setaffinity(2) requires CAP_SYS_NICE to work, so I
wonder whether or not it's legal for unprivileged Qemu to do this.
I was having qemu do it when it's setting up the vqs since it had
the info there already.
Is it normally the tool that makes calls into qemu that does the
operations that require CAP_SYS_NICE?
My understanding is that it only matters for scheduling. And this patch
wants to change the affinity, which should check that capability.
If so, then I see the interface needs to be changed.
Actually, if I read this patch correctly, it requires e.g. qemu to make
the decision instead of the management layer. This may cause some
trouble for e.g. the libvirt emulatorpin[1] implementation.
Let me make sure I understood you.
I thought qemu would just have a new property, and users would pass
that in like they do for the number of queues setting. Then qemu would
pass that to the kernel. The primary user I have to support at work
does not use libvirt based tools so I thought that was a common point
that would work for everyone.
I think we need to talk with the libvirt guys to see if it works for them. My
understanding is that scheduling should be their responsibility, not qemu's.
For my work requirement, and given your emulatorpin and CAP_SYS_NICE
comments, that means we want an interface that something other than qemu
can use, right? So the tools would call directly into the kernel and
not go through qemu, right?
Yes, qemu usually runs without any privileges. So could it be e.g. a sysfs
interface or something else?
Thanks