在 2022/3/24 下午7:03, Michael S. Tsirkin 写道:
On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
This is a rework on the previous IRQ hardening that is done for
virtio-pci where several drawbacks were found and were reverted:
1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
that is used by some device such as virtio-blk
2) done only for PCI transport
In this patch, we tries to borrow the idea from the INTX IRQ hardening
in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
by introducing a global irq_soft_enabled variable for each
virtio_device. Then we can to toggle it during
virtio_reset_device()/virtio_device_ready(). A synchornize_rcu() is
used in virtio_reset_device() to synchronize with the IRQ handlers. In
the future, we may provide config_ops for the transport that doesn't
use IRQ. With this, vring_interrupt() can return check and early if
irq_soft_enabled is false. This lead to smp_load_acquire() to be used
but the cost should be acceptable.
Maybe it should be but is it? Can't we use synchronize_irq instead?
Even if we allow the transport driver to synchornize through
synchronize_irq() we still need a check in the vring_interrupt().
We do something like the following previously:
if (!READ_ONCE(vp_dev->intx_soft_enabled))
return IRQ_NONE;
But it looks like a bug since speculative read can be done before the
check where the interrupt handler can't see the uncommitted setup which
is done by the driver.
To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
module parameter is introduced to enable the hardening so function
hardening is disabled by default.
Which devices are these? How come they send an interrupt before there
are any buffers in any queues?
I copied this from the commit log for 22b7050a024d7
"
This change will also benefit old hypervisors (before 2009)
that send interrupts without checking DRIVER_OK: previously,
the callback could race with driver-specific initialization.
"
If this is only for config interrupt, I can remove the above log.
Note that the hardening is only done for vring interrupt since the
config interrupt hardening is already done in commit 22b7050a024d7
("virtio: defer config changed notifications"). But the method that is
used by config interrupt can't be reused by the vring interrupt
handler because it uses spinlock to do the synchronization which is
expensive.
Signed-off-by: Jason Wang <jasowang@xxxxxxxxxx>
---
drivers/virtio/virtio.c | 19 +++++++++++++++++++
drivers/virtio/virtio_ring.c | 9 ++++++++-
include/linux/virtio.h | 4 ++++
include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
4 files changed, 56 insertions(+), 1 deletion(-)
diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index 8dde44ea044a..85e331efa9cc 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -7,6 +7,12 @@
#include <linux/of.h>
#include <uapi/linux/virtio_ids.h>
+static bool irq_hardening = false;
+
+module_param(irq_hardening, bool, 0444);
+MODULE_PARM_DESC(irq_hardening,
+ "Disalbe IRQ software processing when it is not expected");
+
/* Unique numbering for virtio devices. */
static DEFINE_IDA(virtio_index_ida);
@@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
* */
void virtio_reset_device(struct virtio_device *dev)
{
+ /*
+ * The below synchronize_rcu() guarantees that any
+ * interrupt for this line arriving after
+ * synchronize_rcu() has completed is guaranteed to see
+ * irq_soft_enabled == false.
News to me I did not know synchronize_rcu has anything to do
with interrupts. Did not you intend to use synchronize_irq?
I am not even 100% sure synchronize_rcu is by design a memory barrier
though it's most likely is ...
According to the comment above tree RCU version of synchronize_rcu():
"""
* RCU read-side critical sections are delimited by rcu_read_lock()
* and rcu_read_unlock(), and may be nested. In addition, but only in
* v5.0 and later, regions of code across which interrupts, preemption,
* or softirqs have been disabled also serve as RCU read-side critical
* sections. This includes hardware interrupt handlers, softirq handlers,
* and NMI handlers.
"""
So interrupt handlers are treated as read-side critical sections.
And it has the comment for explain the barrier:
"""
* Note that this guarantee implies further memory-ordering guarantees.
* On systems with more than one CPU, when synchronize_rcu() returns,
* each CPU is guaranteed to have executed a full memory barrier since
* the end of its last RCU read-side critical section whose beginning
* preceded the call to synchronize_rcu(). In addition, each CPU having
"""
So on SMP it provides a full barrier. And for UP/tiny RCU we don't need
the barrier, if the interrupt come after WRITE_ONCE() it will see the
irq_soft_enabled as false.
+ */
+ WRITE_ONCE(dev->irq_soft_enabled, false);
+ synchronize_rcu();
+
dev->config->reset(dev);
}
EXPORT_SYMBOL_GPL(virtio_reset_device);
Please add comment explaining where it will be enabled.
Also, we *really* don't need to synch if it was already disabled,
let's not add useless overhead to the boot sequence.
Ok.
@@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
spin_lock_init(&dev->config_lock);
dev->config_enabled = false;
dev->config_change_pending = false;
+ dev->irq_soft_check = irq_hardening;
+
+ if (dev->irq_soft_check)
+ dev_info(&dev->dev, "IRQ hardening is enabled\n");
/* We always start by resetting the device, in case a previous
* driver messed it up. This also tests that code path a little. */
one of the points of hardening is it's also helpful for buggy
devices. this flag defeats the purpose.
Do you mean:
1) we need something like config_enable? This seems not easy to be
implemented without obvious overhead, mainly the synchronize with the
interrupt handlers
2) enable this by default, so I don't object, but this may have some
risk for old hypervisors
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 962f1477b1fa..0170f8c784d8 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
}
-irqreturn_t vring_interrupt(int irq, void *_vq)
+irqreturn_t vring_interrupt(int irq, void *v)
{
+ struct virtqueue *_vq = v;
+ struct virtio_device *vdev = _vq->vdev;
struct vring_virtqueue *vq = to_vvq(_vq);
+ if (!virtio_irq_soft_enabled(vdev)) {
+ dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
+ return IRQ_NONE;
+ }
+
if (!more_used(vq)) {
pr_debug("virtqueue interrupt with no work for %p\n", vq);
return IRQ_NONE;
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 5464f398912a..957d6ad604ac 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
* @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
* @config_enabled: configuration change reporting enabled
* @config_change_pending: configuration change reported while disabled
+ * @irq_soft_check: whether or not to check @irq_soft_enabled
+ * @irq_soft_enabled: callbacks enabled
* @config_lock: protects configuration change reporting
* @dev: underlying device.
* @id: the device type identification (used to match it with a driver).
@@ -109,6 +111,8 @@ struct virtio_device {
bool failed;
bool config_enabled;
bool config_change_pending;
+ bool irq_soft_check;
+ bool irq_soft_enabled;
spinlock_t config_lock;
spinlock_t vqs_list_lock; /* Protects VQs list access */
struct device dev;
diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index dafdc7f48c01..9c1b61f2e525 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
return __virtio_test_bit(vdev, fbit);
}
+/*
+ * virtio_irq_soft_enabled: whether we can execute callbacks
+ * @vdev: the device
+ */
+static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
+{
+ if (!vdev->irq_soft_check)
+ return true;
+
+ /*
+ * Read irq_soft_enabled before reading other device specific
+ * data. Paried with smp_store_relase() in
paired
Will fix.
Thanks
+ * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
+ * virtio_reset_device().
+ */
+ return smp_load_acquire(&vdev->irq_soft_enabled);
+}
+
/**
* virtio_has_dma_quirk - determine whether this device has the DMA quirk
* @vdev: the device
@@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
if (dev->config->enable_cbs)
dev->config->enable_cbs(dev);
+ /*
+ * Commit the driver setup before enabling the virtqueue
+ * callbacks. Paried with smp_load_acuqire() in
+ * virtio_irq_soft_enabled()
+ */
+ smp_store_release(&dev->irq_soft_enabled, true);
+
BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
}
--
2.25.1
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization