On Tue, Oct 12, 2021 at 10:43:57AM +0800, Jason Wang wrote: > On Mon, Oct 11, 2021 at 8:36 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote: > > > > On Mon, Oct 11, 2021 at 03:36:51PM +0800, Jason Wang wrote: > > > On Tue, Oct 5, 2021 at 3:42 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote: > > > > > > > > On Mon, Sep 13, 2021 at 01:53:44PM +0800, Jason Wang wrote: > > > > > Hi All: > > > > > > > > > > This series treis to do more hardening for virito. > > > > > > > > > > patch 1 validates the num_queues for virio-blk device. > > > > > patch 2-4 validates max_nr_ports for virito-console device. > > > > > patch 5-7 harden virtio-pci interrupts to make sure no exepcted > > > > > interrupt handler is tiggered. If this makes sense we can do similar > > > > > things in other transport drivers. > > > > > patch 8-9 validate used ring length. > > > > > > > > > > Smoking test on blk/net with packed=on/off and iommu_platform=on/off. > > > > > > > > > > Please review. > > > > > > > > > > Thanks > > > > > > > > So I poked at console at least, and I think I see > > > > an issue: if interrupt handler queues a work/bh, > > > > then it can still run while reset is in progress. > > > > > > Looks like a bug which is unrelated to the hardening? > > > > Won't preventing use after free be relevant? > > Oh right. > > > I frankly don't know what does hardening means then. > > > E.g the driver > > > should sync with work/bh before reset. > > > > No, there's no way to fix it ATM without extra locks and state which I > > think we should strive to avoid or make it generic, not per-driver, > > since sync before reset is useless, new interrupts will just arrive and > > queue more work. And a sync after reset is too late since driver will > > try to add buffers. > > Can we do something like > > 1) disable interrupt > 2) sync bh > > Or I guess this is somehow you meant in the following steps. So that would mean a new API to disable vq interrupts. reset will re-enable. E.g. virtqueue_cancel_cb_before_reset()? Then drivers can sync, then reset. This means maintaining more state though, which I don't like. An alternative is something like this: static void (*virtio_flush_device)(struct virtio_device *dev); void virtio_reset_device(struct virtio_device *dev, virtio_flush_device flush) { might_sleep(); if (flush) { dev->config->disable_interrupts(dev); flush(dev); dev->config->reset(dev); dev->config->enable_interrupts(dev); } else { dev->config->reset(dev); } } I have patches wrapping all reset calls in virtio_reset_device (without the flush parameter but that's easy to tweak). > > > > Maybe we can break device. Two issues with that > > - drivers might not be ready to handle add_buf failures > > - restore needs to unbreak then and we don't have a way to do that yet > > > > So .. careful reading of all device drivers and hoping we don't mess > > things up even more ... here we come. > > Yes. The biggest issue with this trick is drivers not handling add_buf errors, adding a failure path here risks creating memory leaks. OTOH with e.g. bounce buffers maybe it's possible for add buf to fail anyway? > > > > > > > > > > I sent a patch to fix it for console removal specifically, > > > > but I suspect it's not enough e.g. freeze is still broken. > > > > And note this has been reported without any TDX things - > > > > it's not a malicious device issue, can be triggered just > > > > by module unload. > > > > > > > > I am vaguely thinking about new APIs to disable/enable callbacks. > > > > An alternative: > > > > > > > > 1. adding new remove_nocb/freeze_nocb calls > > > > 2. disabling/enabling interrupts automatically around these > > > > 3. gradually moving devices to using these > > > > 4. once/if all device move, removing the old callbacks > > > > > > > > the advantage here is that we'll be sure calls are always > > > > paired correctly. > > > > > > I'm not sure I get the idea, but my feeling is that it doesn't > > > conflict with the interrupt hardening here (or at least the same > > > method is required e.g NO_AUTO_EN). > > > > > > Thanks > > > > Right. It's not that it conflicts, it's that I was hoping that > > since you are working on hardening you can take up fixing that. > > Let me know whether you have the time. Thanks! > > I can do that. > > Thanks > > > > > > > > > > > > > > > > > > > > > > > > > > > > Jason Wang (9): > > > > > virtio-blk: validate num_queues during probe > > > > > virtio: add doc for validate() method > > > > > virtio-console: switch to use .validate() > > > > > virtio_console: validate max_nr_ports before trying to use it > > > > > virtio_config: introduce a new ready method > > > > > virtio_pci: harden MSI-X interrupts > > > > > virtio-pci: harden INTX interrupts > > > > > virtio_ring: fix typos in vring_desc_extra > > > > > virtio_ring: validate used buffer length > > > > > > > > > > drivers/block/virtio_blk.c | 3 +- > > > > > drivers/char/virtio_console.c | 51 +++++++++++++++++++++--------- > > > > > drivers/virtio/virtio_pci_common.c | 43 +++++++++++++++++++++---- > > > > > drivers/virtio/virtio_pci_common.h | 7 ++-- > > > > > drivers/virtio/virtio_pci_legacy.c | 5 +-- > > > > > drivers/virtio/virtio_pci_modern.c | 6 ++-- > > > > > drivers/virtio/virtio_ring.c | 27 ++++++++++++++-- > > > > > include/linux/virtio.h | 1 + > > > > > include/linux/virtio_config.h | 6 ++++ > > > > > 9 files changed, 118 insertions(+), 31 deletions(-) > > > > > > > > > > -- > > > > > 2.25.1 > > > > > > _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization