On Thu, Jun 24, 2021 at 11:35 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
>
>
> On 2021/6/23 1:50 PM, Yongji Xie wrote:
> > On Wed, Jun 23, 2021 at 11:31 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >>
> >> On 2021/6/22 4:14 PM, Yongji Xie wrote:
> >>> On Tue, Jun 22, 2021 at 3:50 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >>>> On 2021/6/22 3:22 PM, Yongji Xie wrote:
> >>>>>> We need to find a way to propagate the error to the userspace.
> >>>>>>
> >>>>>> E.g if we want to stop the device, we will delay the status reset until
> >>>>>> we get a response from the userspace?
> >>>>>>
> >>>>> I didn't get how to delay the status reset. And should it be a DoS
> >>>>> that we want to fix if the userspace doesn't give a response forever?
> >>>> You're right. So let's make set_status() able to fail first, then
> >>>> propagate its failure via VHOST_VDPA_SET_STATUS.
> >>>>
> >>> OK. So we only need to propagate the failure in the vhost-vdpa case, right?
> >>
> >> I think not, we need to deal with the reset for virtio as well:
> >>
> >> E.g in register_virtio_device(), we have:
> >>
> >>         /* We always start by resetting the device, in case a previous
> >>          * driver messed it up.  This also tests that code path a
> >>          * little. */
> >>         dev->config->reset(dev);
> >>
> >> We probably need to make reset able to fail and then fail the
> >> register_virtio_device() as well.
> >>
> > OK, looks like virtio_add_status() and virtio_device_ready()[1] should
> > be also modified if we need to propagate the failure in the
> > virtio-vdpa case. Or do we only need to care about the reset case?
> >
> > [1] https://lore.kernel.org/lkml/20210517093428.670-1-xieyongji@xxxxxxxxxxxxx/
>
>
> My understanding is DRIVER_OK is not something that needs to be validated:
>
> "
>
> DRIVER_OK (4)
> Indicates that the driver is set up and ready to drive the device.
>
> "
>
> Since the spec doesn't require to re-read the status and check if
> DRIVER_OK is set in 3.1.1 Driver Requirements: Device Initialization.
>
> It's more about "telling the device that driver is ready."
>
> But we do have some status bits that require synchronization with
> the device.
>
> 1) FEATURES_OK, spec requires to re-read the status bit to check whether
> or not it was set by the device:
>
> "
>
> Re-read device status to ensure the FEATURES_OK bit is still set:
> otherwise, the device does not support our subset of features and the
> device is unusable.
>
> "
>
> This is useful for some device which can only support a subset of the
> features. E.g a device that can only work for packed virtqueue. This
> means the current design of set_features won't work, we need either:
>
> 1a) relay the set_features request to userspace
>
> or
>
> 1b) introduce a mandated_device_features during device creation and
> validate the driver features during the set_features(), and don't set
> FEATURES_OK if they don't match.
>
>
> 2) Some transports (PCI) require to re-read the status to ensure the
> synchronization.
>
> "
>
> After writing 0 to device_status, the driver MUST wait for a read of
> device_status to return 0 before reinitializing the device.
>
> "
>
> So we need to deal with both FEATURES_OK and reset, but probably not
> DRIVER_OK.
>

OK, I see. Thanks for the explanation. One more question: how about
clearing the corresponding status bit in get_status() rather than
making set_status() fail? The spec recommends this way for validation,
which is done in virtio_dev_remove() and virtio_finalize_features().

Thanks,
Yongji
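
To make the get_status() idea concrete, here is a rough userspace sketch
for the FEATURES_OK case only. It is not VDUSE or kernel code: struct
demo_dev and all demo_*() helpers are hypothetical names made up for
illustration, and mandated_features stands in for the
mandated_device_features from option 1b) above. The status bit values
are the ones defined by the virtio spec.

```c
#include <stdint.h>

/* Device status bits as defined by the virtio spec. */
#define VIRTIO_CONFIG_S_ACKNOWLEDGE  1
#define VIRTIO_CONFIG_S_DRIVER       2
#define VIRTIO_CONFIG_S_DRIVER_OK    4
#define VIRTIO_CONFIG_S_FEATURES_OK  8

/* Hypothetical device state, for illustration only. */
struct demo_dev {
	uint64_t mandated_features; /* features the device cannot work without */
	uint64_t driver_features;   /* features the driver has written */
	uint8_t  status;            /* last status the driver has written */
};

/* Record the driver's feature set; no validation happens here. */
static void demo_set_features(struct demo_dev *d, uint64_t features)
{
	d->driver_features = features;
}

/* set_status() cannot fail: it only stores what the driver wrote. */
static void demo_set_status(struct demo_dev *d, uint8_t status)
{
	d->status = status;
}

/*
 * get_status() reports the stored status, but hides FEATURES_OK when the
 * driver's features do not cover every mandated feature. The driver sees
 * the rejection when it re-reads the status, as the spec asks it to.
 */
static uint8_t demo_get_status(const struct demo_dev *d)
{
	uint8_t status = d->status;

	if ((d->driver_features & d->mandated_features) != d->mandated_features)
		status &= (uint8_t)~VIRTIO_CONFIG_S_FEATURES_OK;

	return status;
}

/*
 * Driver-side sequence, loosely modelled on what a driver does when
 * finalizing features: write features, set FEATURES_OK, then re-read.
 * Returns 0 on success and -1 if the device dropped FEATURES_OK.
 */
static int demo_finalize_features(struct demo_dev *d, uint64_t features)
{
	demo_set_features(d, features);
	demo_set_status(d, (uint8_t)(d->status | VIRTIO_CONFIG_S_FEATURES_OK));

	if (!(demo_get_status(d) & VIRTIO_CONFIG_S_FEATURES_OK))
		return -1;

	return 0;
}
```

With this shape, a device that mandates the packed ring (feature bit 34)
rejects a driver offering only VERSION_1 (bit 32), and accepts one
offering both, without set_status() ever returning an error.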