On Mon, May 31, 2021 at 12:39 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:
>
>
> On 2021/5/31 12:27 PM, Yongji Xie wrote:
> > On Fri, May 28, 2021 at 10:31 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >>
> >> On 2021/5/27 9:17 PM, Yongji Xie wrote:
> >>> On Thu, May 27, 2021 at 4:41 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >>>> On 2021/5/27 3:34 PM, Yongji Xie wrote:
> >>>>> On Thu, May 27, 2021 at 1:40 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >>>>>> On 2021/5/27 1:08 PM, Yongji Xie wrote:
> >>>>>>> On Thu, May 27, 2021 at 1:00 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >>>>>>>> On 2021/5/27 12:57 PM, Yongji Xie wrote:
> >>>>>>>>> On Thu, May 27, 2021 at 12:13 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >>>>>>>>>> On 2021/5/17 5:55 PM, Xie Yongji wrote:
> >>>>>>>>>>> +
> >>>>>>>>>>> +static int vduse_dev_msg_sync(struct vduse_dev *dev,
> >>>>>>>>>>> +			      struct vduse_dev_msg *msg)
> >>>>>>>>>>> +{
> >>>>>>>>>>> +	init_waitqueue_head(&msg->waitq);
> >>>>>>>>>>> +	spin_lock(&dev->msg_lock);
> >>>>>>>>>>> +	vduse_enqueue_msg(&dev->send_list, msg);
> >>>>>>>>>>> +	wake_up(&dev->waitq);
> >>>>>>>>>>> +	spin_unlock(&dev->msg_lock);
> >>>>>>>>>>> +	wait_event_killable(msg->waitq, msg->completed);
> >>>>>>>>>> What happens if a (malicious) userspace never gives a response?
> >>>>>>>>>>
> >>>>>>>>>> It looks like a DoS. If yes, we need to consider a way to fix that.
> >>>>>>>>>>
> >>>>>>>>> How about using wait_event_killable_timeout() instead?
> >>>>>>>> Probably, and then we need to choose a suitable timeout and, more
> >>>>>>>> importantly, need to report the failure to virtio.
> >>>>>>>>
> >>>>>>> Makes sense to me. But it looks like some
> >>>>>>> vdpa_config_ops/virtio_config_ops such as set_status() don't have a
> >>>>>>> return value. For now I add a WARN_ON() for the failure. Do you mean we
> >>>>>>> need to change the virtio core to handle the failure?
> >>>>>> Maybe, but I'm not sure how hard it would be to do that.
> >>>>>>
> >>>>> We need to change all virtio device drivers in this way.
> >>>> Probably.
> >>>>
> >>>>
> >>>>>> We had NEEDS_RESET but it looks like we don't implement it.
> >>>>>>
> >>>>> Could it handle the failure of get_feature() and get/set_config()?
> >>>> Looks not:
> >>>>
> >>>> "
> >>>>
> >>>> The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state
> >>>> that a reset is needed. If DRIVER_OK is set, after it sets
> >>>> DEVICE_NEEDS_RESET, the device MUST send a device configuration change
> >>>> notification to the driver.
> >>>>
> >>>> "
> >>>>
> >>>> This seems to imply that NEEDS_RESET may only work after the device is
> >>>> probed. But in the current design, even the reset() is not reliable.
> >>>>
> >>>>
> >>>>>> Or a rough idea is that maybe we need some relaxing, to be coupled
> >>>>>> loosely with userspace. E.g. the device (control path) is implemented
> >>>>>> in the kernel but the datapath is implemented in userspace, like
> >>>>>> TUN/TAP.
> >>>>>>
> >>>>> I think it can work for most cases. One problem is that set_config
> >>>>> might change the behavior of the data path at runtime, e.g.
> >>>>> virtnet_set_mac_address() in the virtio-net driver and
> >>>>> cache_type_store() in the virtio-blk driver. Not sure if this path is
> >>>>> able to return before the datapath is aware of this change.
> >>>> Good point.
> >>>>
> >>>> But set_config() should be rare:
> >>>>
> >>>> E.g. in the case of virtio-net with VERSION_1, config space is read only,
> >>>> and it is set via the control vq.
> >>>>
> >>>> For block, we can
> >>>>
> >>>> 1) start without WCE or
> >>>> 2) add a config change notification to userspace or
> >>> I prefer this way. And I think we also need to do similar things for
> >>> set/get_vq_state().
> >>
> >> Yes, I agree.
> >>
> > Hi Jason,
> >
> > Now I'm working on this.
> > But I found the config change notification
> > must be synchronous in the virtio-blk case, which means the kernel
> > still needs to wait for the response from userspace in set_config().
> > Otherwise, some I/Os might still run the old way after we change the
> > cache_type in sysfs.
> >
> > The simple ways to solve this problem are:
> >
> > 1. Only support read-only config space, disable WCE as you suggested
> > 2. Add a return value to set_config() and handle the failure only in
> >    virtio-blk driver
> > 3. Print some warnings after timeout since it only affects the
> >    dataplane which is under userspace's control
> >
> > Any suggestions?
>
>
> Let's go without WCE first and make VDUSE work first. We can then think
> of a solution for WCE on top.
>

It's fine with me.

Thanks,
Yongji
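
For readers following the thread, here is a rough userspace sketch of the "wait with a timeout and report the failure" pattern discussed above. It is not the VDUSE code and does not use wait_event_killable_timeout(); it swaps in pthread_cond_timedwait() so it can be built outside the kernel. The names struct sync_msg, sync_msg_init(), sync_msg_complete() and sync_msg_wait() are hypothetical, and the timeout value is left to the caller because the thread does not settle on one.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical userspace stand-in for a message awaiting a reply. */
struct sync_msg {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool completed;
};

static void sync_msg_init(struct sync_msg *msg)
{
	pthread_mutex_init(&msg->lock, NULL);
	pthread_cond_init(&msg->cond, NULL);
	msg->completed = false;
}

/* Called by the responder once it has handled the message. */
static void sync_msg_complete(struct sync_msg *msg)
{
	pthread_mutex_lock(&msg->lock);
	msg->completed = true;
	pthread_cond_signal(&msg->cond);
	pthread_mutex_unlock(&msg->lock);
}

/*
 * Wait up to timeout_ms for the responder.
 * Returns 0 if the message was completed, -ETIMEDOUT otherwise, so the
 * caller can propagate the failure instead of blocking forever.
 */
static int sync_msg_wait(struct sync_msg *msg, long timeout_ms)
{
	struct timespec deadline;
	int ret;

	assert(timeout_ms >= 0);

	/* Default condition variables measure against CLOCK_REALTIME. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&msg->lock);
	while (!msg->completed) {
		/* Non-zero means the deadline passed (or an error occurred). */
		if (pthread_cond_timedwait(&msg->cond, &msg->lock, &deadline))
			break;
	}
	ret = msg->completed ? 0 : -ETIMEDOUT;
	pthread_mutex_unlock(&msg->lock);

	return ret;
}
```

A caller would check the return value of sync_msg_wait() and hand the error upward. That is the part the thread identifies as hard on the kernel side, since ops such as set_status() and set_config() currently have no return value to carry it.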