On Fri, Apr 17, 2015 at 03:59:15PM +0800, Fam Zheng wrote:
> Currently, virtio code chooses to kill QEMU if the guest passes any invalid
> data with vring. That has drawbacks such as losing unsaved data (e.g. when
> guest user is writing a very long email), or possible denial of service in
> a nested vm use case where virtio device is passed through.
>
> virtio-1 has introduced a new status bit "NEEDS RESET" which could be used to
> improve this by communicating the error state between virtio devices and
> drivers. The device notifies guest upon setting the bit, then the guest driver
> should detect this bit and report to userspace, or recover the device by
> resetting it.

Unfortunately, virtio 1 spec does not have a conformance statement
that requires driver to recover. We merely have a non-normative looking
text:

    Note: For example, the driver can't assume requests in flight
    will be completed if DEVICE_NEEDS_RESET is set, nor can it assume
    that they have not been completed. A good implementation will try
    to recover by issuing a reset.

Implementing this reset for all devices in a race-free manner might also
be far from trivial. I think we'd need a feature bit for this.
OTOH as long as we make this a new feature, would an ability to
reset a single VQ be a better match for what you are trying to
achieve?

> This series makes necessary changes in virtio core code, based on which
> virtio-blk is converted. Other devices now keep the existing behavior by
> passing in "error_abort". They will be converted in following series. The Linux
> driver part will also be worked on.
>
> One concern with this behavior change is that it's now harder to notice the
> actual driver bug that caused the error, as the guest continues to run. To
> address that, we could probably add a new error action option to virtio
> devices, similar to the "read/write werror" in block layer, so the vm could be
> paused and the management will get an event in QMP like pvpanic.
> This work can be done on top.

At the architectural level, that's only one concern. Others would be
- workloads such as openstack handle guest crash better than
  a guest that's e.g. slow because of a memory leak
- it's easier for guests to probe host for security issues
  if guest isn't killed
- guest can flood host log with guest-triggered errors

At the implementation level, there's one big issue you seem to have
missed: DMA to invalid memory addresses causes a crash in memory core.
I'm not sure whether it makes sense to recover from virtio core bugs
when we can't recover from device bugs.

>
>
> Fam Zheng (18):
>   virtio: Return error from virtqueue_map_sg
>   virtio: Return error from virtqueue_num_heads
>   virtio: Return error from virtqueue_get_head
>   virtio: Return error from virtqueue_next_desc
>   virtio: Return error from virtqueue_get_avail_bytes
>   virtio: Return error from virtqueue_pop
>   virtio: Return error from virtqueue_avail_bytes
>   virtio: Return error from virtio_add_queue
>   virtio: Return error from virtio_del_queue
>   virtio: Add macro for VIRTIO_CONFIG_S_NEEDS_RESET
>   virtio: Add "needs_reset" flag to virtio device
>   virtio: Return -EINVAL if the vdev needs reset in virtqueue_pop
>   virtio-blk: Graceful error handling of virtqueue_pop
>   qtest: Add "QTEST_FILTER" to filter test cases
>   qtest: virtio-blk: Extract "setup" for future reuse
>   libqos: Add qvirtio_needs_reset
>   qtest: Add test case for "needs reset" of virtio-blk
>   qtest: virtio-blk: Suppress virtio error messages in "make check"
>
>  hw/9pfs/virtio-9p-device.c                     |   2 +-
>  hw/9pfs/virtio-9p.c                            |   2 +-
>  hw/block/dataplane/virtio-blk.c                |   9 +-
>  hw/block/virtio-blk.c                          |  62 +++++--
>  hw/char/virtio-serial-bus.c                    |  30 ++--
>  hw/net/virtio-net.c                            |  36 +++--
>  hw/scsi/virtio-scsi.c                          |   8 +-
>  hw/virtio/virtio-balloon.c                     |  13 +-
>  hw/virtio/virtio-rng.c                         |   6 +-
>  hw/virtio/virtio.c                             | 214 ++++++++++++++++++-------
>  include/hw/virtio/virtio-blk.h                 |   3 +-
>  include/hw/virtio/virtio.h                     |  17 +-
>  include/standard-headers/linux/virtio_config.h |   2 +
>  tests/Makefile                                 |   6 +-
>  tests/libqos/virtio.c                          |   5 +
>  tests/libqos/virtio.h                          |   2 +
>  tests/virtio-blk-test.c                        | 196 ++++++++++++++++++++--
>  17 files changed, 482 insertions(+), 131 deletions(-)
>
> --
> 1.9.3
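
For readers following the thread, the DEVICE_NEEDS_RESET handshake discussed above can be sketched roughly as follows. This is only a toy model under stated assumptions, not code from the series or from QEMU: the status bit values are the ones defined by the virtio 1.0 spec, but struct toy_vdev and every toy_* function are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Device status bits as defined by the virtio 1.0 specification. */
#define STATUS_ACKNOWLEDGE  0x01
#define STATUS_DRIVER       0x02
#define STATUS_DRIVER_OK    0x04
#define STATUS_FEATURES_OK  0x08
#define STATUS_NEEDS_RESET  0x40

/* Hypothetical toy device: just a status byte and a reset counter. */
struct toy_vdev {
    uint8_t status;
    unsigned resets;
};

/* Device side: on invalid vring data, flag the error instead of exiting. */
static void toy_device_mark_broken(struct toy_vdev *d)
{
    d->status |= STATUS_NEEDS_RESET;
}

/* Driver side: writing 0 to the status field resets the device. */
static void toy_driver_reset(struct toy_vdev *d)
{
    d->status = 0;
    d->resets++;
}

/* Driver side: the usual initialization sequence, collapsed into one step. */
static void toy_driver_init(struct toy_vdev *d)
{
    d->status = STATUS_ACKNOWLEDGE | STATUS_DRIVER |
                STATUS_FEATURES_OK | STATUS_DRIVER_OK;
}

/*
 * Driver side: returns 1 if a recovery was performed, 0 otherwise.
 * The state of in-flight requests is unknown once the bit is set, so a
 * real driver would have to fail or requeue them around the reset.
 */
static int toy_driver_recover(struct toy_vdev *d)
{
    if (!(d->status & STATUS_NEEDS_RESET)) {
        return 0;
    }
    toy_driver_reset(d);
    toy_driver_init(d);
    assert(!(d->status & STATUS_NEEDS_RESET));
    return 1;
}
```

A caller would run toy_driver_init() once, and call toy_driver_recover() whenever the device signals a status change. The sketch deliberately leaves out the hard parts raised in this reply: racing with in-flight requests, and what to do when the driver never recovers.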