On Fri, May 28, 2021 at 9:33 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
>
>
> On 2021/5/27 6:14 PM, Yongji Xie wrote:
> > On Thu, May 27, 2021 at 4:43 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >>
> >> On 2021/5/27 4:41 PM, Jason Wang wrote:
> >>> On 2021/5/27 3:34 PM, Yongji Xie wrote:
> >>>> On Thu, May 27, 2021 at 1:40 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >>>>> On 2021/5/27 1:08 PM, Yongji Xie wrote:
> >>>>>> On Thu, May 27, 2021 at 1:00 PM Jason Wang <jasowang@xxxxxxxxxx>
> >>>>>> wrote:
> >>>>>>> On 2021/5/27 12:57 PM, Yongji Xie wrote:
> >>>>>>>> On Thu, May 27, 2021 at 12:13 PM Jason Wang <jasowang@xxxxxxxxxx>
> >>>>>>>> wrote:
> >>>>>>>>> On 2021/5/17 5:55 PM, Xie Yongji wrote:
> >>>>>>>>>> +
> >>>>>>>>>> +static int vduse_dev_msg_sync(struct vduse_dev *dev,
> >>>>>>>>>> +                              struct vduse_dev_msg *msg)
> >>>>>>>>>> +{
> >>>>>>>>>> +       init_waitqueue_head(&msg->waitq);
> >>>>>>>>>> +       spin_lock(&dev->msg_lock);
> >>>>>>>>>> +       vduse_enqueue_msg(&dev->send_list, msg);
> >>>>>>>>>> +       wake_up(&dev->waitq);
> >>>>>>>>>> +       spin_unlock(&dev->msg_lock);
> >>>>>>>>>> +       wait_event_killable(msg->waitq, msg->completed);
> >>>>>>>>> What happens if a (malicious) userspace never gives a
> >>>>>>>>> response?
> >>>>>>>>>
> >>>>>>>>> It looks like a DoS. If yes, we need to consider a way to fix that.
> >>>>>>>>>
> >>>>>>>> How about using wait_event_killable_timeout() instead?
> >>>>>>> Probably, and then we need to choose a suitable timeout and, more
> >>>>>>> importantly, need to report the failure to virtio.
> >>>>>>>
> >>>>>> Makes sense to me. But it looks like some
> >>>>>> vdpa_config_ops/virtio_config_ops such as set_status() don't have a
> >>>>>> return value. For now I add a WARN_ON() for the failure. Do you mean
> >>>>>> we need to change the virtio core to handle the failure?
> >>>>> Maybe, but I'm not sure how hard that would be.
> >>>>>
> >>>> We would need to change all virtio device drivers in this way.
> >>>
> >>> Probably.
> >>>
> >>>
> >>>>> We had NEEDS_RESET but it looks like we don't implement it.
> >>>>>
> >>>> Could it handle the failure of get_feature() and get/set_config()?
> >>>
> >>> Looks like not:
> >>>
> >>> "
> >>>
> >>> The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state
> >>> that a reset is needed. If DRIVER_OK is set, after it sets
> >>> DEVICE_NEEDS_RESET, the device MUST send a device configuration change
> >>> notification to the driver.
> >>>
> >>> "
> >>>
> >>> This seems to imply that NEEDS_RESET may only work after the device is
> >>> probed. But in the current design, even reset() is not reliable.
> >>>
> >>>
> >>>>> Or a rough idea is that we may need some relaxing so that the kernel
> >>>>> is only loosely coupled with userspace. E.g. the device (control
> >>>>> path) is implemented in the kernel but the datapath is implemented
> >>>>> in userspace, like TUN/TAP.
> >>>>>
> >>>> I think it can work for most cases. One problem is that set_config
> >>>> might change the behavior of the data path at runtime, e.g.
> >>>> virtnet_set_mac_address() in the virtio-net driver and
> >>>> cache_type_store() in the virtio-blk driver. Not sure if this path is
> >>>> able to return before the datapath is aware of this change.
> >>>
> >>> Good point.
> >>>
> >>> But set_config() should be rare:
> >>>
> >>> E.g. in the case of virtio-net with VERSION_1, the config space is
> >>> read-only, and it is set via the control vq.
> >>>
> >>> For block, we can
> >>>
> >>> 1) start without WCE or
> >>> 2) add a config change notification to userspace or
> >>> 3) extend the spec to use a vq instead of config space
> >>>
> >>> Thanks
> >>
> >> Another thing if we want to go this way:
> >>
> >> We need to find a way to terminate the data path from the kernel side,
> >> to implement the reset semantics.
> >>
> > Do you mean terminating the data path in vdpa_reset()?
>
>
> Yes.
>
>
> > Is it ok to just
> > notify userspace to stop the data path asynchronously?
>
>
> For well-behaved userspace, yes, but not for buggy or malicious ones.
>

But buggy or malicious daemons can't do anything, if my understanding
is correct.

> I had an idea: how about terminating the IOTLB in this case? Then we
> in fact turn the datapath off.
>

Sorry, I didn't get your point here. What do you mean by terminating
the IOTLB? Removing the iotlb mapping? But userspace can still access
the mapped region.

Thanks,
Yongji
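
For reference, the bounded-wait idea discussed above (switching from wait_event_killable() to wait_event_killable_timeout() and reporting the failure to the caller) can be modelled in plain userspace C. The sketch below is not kernel code and is not taken from the patch: struct model_dev, struct model_msg, the daemon callbacks, the tick-based timeout and the "broken" flag are all hypothetical stand-ins, and marking the device broken after a timeout is only one possible way to report the failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the structures in the patch; not kernel code. */
struct model_msg {
	bool completed;		/* set by the daemon when it replies */
};

struct model_dev {
	bool broken;		/* assumption: set once a request times out */
};

/* The userspace daemon is modelled as a callback polled once per tick. */
typedef void (*daemon_fn)(struct model_msg *msg, unsigned int tick);

/* Well-behaved daemon: replies on the third tick. */
static void daemon_good(struct model_msg *msg, unsigned int tick)
{
	if (tick >= 3)
		msg->completed = true;
}

/* Buggy or malicious daemon: never replies. */
static void daemon_silent(struct model_msg *msg, unsigned int tick)
{
	(void)msg;
	(void)tick;
}

/*
 * Bounded wait. Returns 0 if the daemon completes the message within
 * timeout_ticks, -ETIMEDOUT if it does not, and -EIO if the device was
 * already marked broken by an earlier timeout.
 */
static int model_msg_sync(struct model_dev *dev, struct model_msg *msg,
			  daemon_fn daemon, unsigned int timeout_ticks)
{
	unsigned int tick;

	assert(dev != NULL && msg != NULL && daemon != NULL);

	if (dev->broken)
		return -EIO;

	msg->completed = false;
	for (tick = 1; tick <= timeout_ticks; tick++) {
		daemon(msg, tick);
		if (msg->completed)
			return 0;
	}

	/* The caller never blocks forever; the failure is made visible. */
	dev->broken = true;
	return -ETIMEDOUT;
}
```

With a 10-tick timeout, a call using daemon_good returns 0, a call using daemon_silent returns -ETIMEDOUT and marks the model device broken, and any later call returns -EIO. The open question in the thread remains: callers such as set_status() have no return value through which such an error could be propagated.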