Bcc: 
Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
Message-ID: <20220325021422-mutt-send-email-mst@xxxxxxxxxx>
Reply-To: 
In-Reply-To: <f7046303-7d7d-e39f-3c71-3688126cc812@xxxxxxxxxx>

On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> 
> On 2022/3/24 7:03 PM, Michael S. Tsirkin wrote:
> > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > This is a rework on the previous IRQ hardening that is done for
> > > virtio-pci where several drawbacks were found and were reverted:
> > > 
> > > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> > >     that is used by some device such as virtio-blk
> > > 2) done only for PCI transport
> > > 
> > > In this patch, we try to borrow the idea from the INTX IRQ hardening
> > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> > > by introducing an irq_soft_enabled variable for each
> > > virtio_device. Then we can toggle it during
> > > virtio_reset_device()/virtio_device_ready(). A synchronize_rcu() is
> > > used in virtio_reset_device() to synchronize with the IRQ handlers. In
> > > the future, we may provide config_ops for the transport that doesn't
> > > use IRQ. With this, vring_interrupt() can check and return early if
> > > irq_soft_enabled is false. This leads to smp_load_acquire() being used,
> > > but the cost should be acceptable.
> > Maybe it should be but is it? Can't we use synchronize_irq instead?
> 
> 
> Even if we allow the transport driver to synchronize through
> synchronize_irq(), we still need a check in vring_interrupt().
> 
> We do something like the following previously:
> 
>         if (!READ_ONCE(vp_dev->intx_soft_enabled))
>                 return IRQ_NONE;
> 
> But that looks like a bug, since a speculative read can be done before the
> check, in which case the interrupt handler can't see the setup that the
> driver has not yet committed.

I don't think so - if you sync after setting the value then
you are guaranteed that any handler running afterwards
will see the new value.

Although I couldn't find anything about this in memory-barriers.txt
which surprises me.

CC Paul to help make sure I'm right.


> 
> > 
> > > To avoid breaking legacy devices which can send IRQs before DRIVER_OK, a
> > > module parameter is introduced to enable the hardening, so the
> > > hardening is disabled by default.
> > Which devices are these? How come they send an interrupt before there
> > are any buffers in any queues?
> 
> 
> I copied this from the commit log for 22b7050a024d7
> 
> "
> 
>     This change will also benefit old hypervisors (before 2009)
>     that send interrupts without checking DRIVER_OK: previously,
>     the callback could race with driver-specific initialization.
> "
> 
> If this is only for config interrupt, I can remove the above log.


This is only for config interrupt.

> 
> > 
> > > Note that the hardening is only done for vring interrupt since the
> > > config interrupt hardening is already done in commit 22b7050a024d7
> > > ("virtio: defer config changed notifications"). But the method that is
> > > used by config interrupt can't be reused by the vring interrupt
> > > handler because it uses spinlock to do the synchronization which is
> > > expensive.
> > > 
> > > Signed-off-by: Jason Wang <jasowang@xxxxxxxxxx>
> > 
> > > ---
> > >   drivers/virtio/virtio.c       | 19 +++++++++++++++++++
> > >   drivers/virtio/virtio_ring.c  |  9 ++++++++-
> > >   include/linux/virtio.h        |  4 ++++
> > >   include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > >   4 files changed, 56 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > index 8dde44ea044a..85e331efa9cc 100644
> > > --- a/drivers/virtio/virtio.c
> > > +++ b/drivers/virtio/virtio.c
> > > @@ -7,6 +7,12 @@
> > >   #include <linux/of.h>
> > >   #include <uapi/linux/virtio_ids.h>
> > > +static bool irq_hardening = false;
> > > +
> > > +module_param(irq_hardening, bool, 0444);
> > > +MODULE_PARM_DESC(irq_hardening,
> > > +		 "Disalbe IRQ software processing when it is not expected");
> > > +
> > >   /* Unique numbering for virtio devices. */
> > >   static DEFINE_IDA(virtio_index_ida);
> > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > >    * */
> > >   void virtio_reset_device(struct virtio_device *dev)
> > >   {
> > > +	/*
> > > +	 * The below synchronize_rcu() guarantees that any
> > > +	 * interrupt for this line arriving after
> > > +	 * synchronize_rcu() has completed is guaranteed to see
> > > +	 * irq_soft_enabled == false.
> > > News to me; I did not know synchronize_rcu has anything to do
> > > with interrupts. Didn't you intend to use synchronize_irq?
> > > I am not even 100% sure synchronize_rcu is by design a memory barrier,
> > > though it most likely is ...
> 
> 
> According to the comment above tree RCU version of synchronize_rcu():
> 
> """
> 
>  * RCU read-side critical sections are delimited by rcu_read_lock()
>  * and rcu_read_unlock(), and may be nested.  In addition, but only in
>  * v5.0 and later, regions of code across which interrupts, preemption,
>  * or softirqs have been disabled also serve as RCU read-side critical
>  * sections.  This includes hardware interrupt handlers, softirq handlers,
>  * and NMI handlers.
> """
> 
> So interrupt handlers are treated as read-side critical sections.
> 
> And it has a comment explaining the barrier:
> 
> """
> 
>  * Note that this guarantee implies further memory-ordering guarantees.
>  * On systems with more than one CPU, when synchronize_rcu() returns,
>  * each CPU is guaranteed to have executed a full memory barrier since
>  * the end of its last RCU read-side critical section whose beginning
>  * preceded the call to synchronize_rcu().  In addition, each CPU having
> """
> 
> So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> barrier: if the interrupt comes after WRITE_ONCE(), it will see
> irq_soft_enabled as false.
> 

You are right. So then
1. I do not think we need load_acquire - why is it needed? Just
   READ_ONCE should do.
2. isn't synchronize_irq also doing the same thing?
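
For reference, here is a minimal userspace sketch of the pairing under discussion, with C11 atomics standing in for the kernel primitives. It is an illustration under assumptions, not kernel code: struct fake_vdev, its setup_data field and both function names are hypothetical, memory_order_release/memory_order_acquire only approximate smp_store_release()/smp_load_acquire(), and memory_order_relaxed only roughly approximates READ_ONCE().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct virtio_device; not kernel code. */
struct fake_vdev {
	int setup_data;               /* driver state the handler consumes */
	atomic_bool irq_soft_enabled; /* gate checked by the handler */
};

/* Models virtio_device_ready(): publish the setup, then open the gate. */
static void fake_device_ready(struct fake_vdev *vdev, int data)
{
	/* Loosely mirrors the BUG_ON() against a second DRIVER_OK. */
	assert(!atomic_load_explicit(&vdev->irq_soft_enabled,
				     memory_order_relaxed));
	vdev->setup_data = data;
	/* Release: setup_data is visible to whoever observes "true". */
	atomic_store_explicit(&vdev->irq_soft_enabled, true,
			      memory_order_release);
}

/* Models vring_interrupt(): returns -1 (think IRQ_NONE) while gated. */
static int fake_vring_interrupt(struct fake_vdev *vdev)
{
	/* Acquire: the read of setup_data cannot be hoisted above this. */
	if (!atomic_load_explicit(&vdev->irq_soft_enabled,
				  memory_order_acquire))
		return -1;
	return vdev->setup_data;
}
```

In the patch as posted, the acquire pairs with the smp_store_release() in virtio_device_ready(). Replacing memory_order_acquire with memory_order_relaxed in the handler gives the READ_ONCE()-style variant asked about in point 1.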


> > 
> > > +	 */
> > > +	WRITE_ONCE(dev->irq_soft_enabled, false);
> > > +	synchronize_rcu();
> > > +
> > >   	dev->config->reset(dev);
> > >   }
> > >   EXPORT_SYMBOL_GPL(virtio_reset_device);
> > Please add comment explaining where it will be enabled.
> > Also, we *really* don't need to synch if it was already disabled,
> > let's not add useless overhead to the boot sequence.
> 
> 
> Ok.
> 
> 
> > 
> > 
> > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > >   	spin_lock_init(&dev->config_lock);
> > >   	dev->config_enabled = false;
> > >   	dev->config_change_pending = false;
> > > +	dev->irq_soft_check = irq_hardening;
> > > +
> > > +	if (dev->irq_soft_check)
> > > +		dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > >   	/* We always start by resetting the device, in case a previous
> > >   	 * driver messed it up.  This also tests that code path a little. */
> > > One of the points of hardening is that it's also helpful for buggy
> > > devices. This flag defeats the purpose.
> 
> 
> Do you mean:
> 
> 1) we need something like config_enable? This does not seem easy to
> implement without obvious overhead, mainly from synchronizing with the
> interrupt handlers

But synchronize is only on tear-down path. That is not critical for any
users at the moment, even less than probe.

> 2) enable this by default, so I don't object, but this may have some risk
> for old hypervisors


The risk is if there's a driver adding buffers without setting DRIVER_OK.
So with this approach, how about we rename the flag "driver_ok"?
And then add_buf can actually test it and BUG_ON if not there (at least
in the debug build).

And going down from there, how about we cache status in the
device? Then we don't need to keep re-reading it every time,
speeding boot up a tiny bit.
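
One possible reading of the two suggestions above, as a userspace sketch rather than a proposed implementation. Everything prefixed fake_ or FAKE_ is hypothetical, the "hardware" is just a struct field, and assert() stands in for a debug-build BUG_ON(). The only value taken from the real headers is 4 for VIRTIO_CONFIG_S_DRIVER_OK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FAKE_S_DRIVER_OK 4 /* same value as VIRTIO_CONFIG_S_DRIVER_OK */

/* Hypothetical device model; names are illustrative only. */
struct fake_vdev {
	uint8_t hw_status;     /* what the transport would hold */
	uint8_t cached_status; /* driver-side copy, updated on every write */
	unsigned int hw_reads; /* counts trips to the "hardware" */
};

/* Slow path: models config->get_status() going out to the device. */
static uint8_t fake_hw_get_status(struct fake_vdev *vdev)
{
	vdev->hw_reads++;
	return vdev->hw_status;
}

/* All status writes go through here so the cache stays coherent. */
static void fake_set_status(struct fake_vdev *vdev, uint8_t status)
{
	vdev->hw_status = status;
	vdev->cached_status = status;
}

/* "driver_ok" derived from the cache, with no device access. */
static bool fake_driver_ok(const struct fake_vdev *vdev)
{
	return (vdev->cached_status & FAKE_S_DRIVER_OK) != 0;
}

/* Models add_buf: asserts DRIVER_OK before queueing (BUG_ON stand-in). */
static int fake_add_buf(struct fake_vdev *vdev)
{
	assert(fake_driver_ok(vdev));
	return 0;
}
```

A cache like this only tracks what the driver itself wrote. Status bits that the device sets on its own, such as DEVICE_NEEDS_RESET, would still need a real read.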

> 
> > 
> > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > index 962f1477b1fa..0170f8c784d8 100644
> > > --- a/drivers/virtio/virtio_ring.c
> > > +++ b/drivers/virtio/virtio_ring.c
> > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > >   	return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > >   }
> > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > +irqreturn_t vring_interrupt(int irq, void *v)
> > >   {
> > > +	struct virtqueue *_vq = v;
> > > +	struct virtio_device *vdev = _vq->vdev;
> > >   	struct vring_virtqueue *vq = to_vvq(_vq);
> > > +	if (!virtio_irq_soft_enabled(vdev)) {
> > > +		dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > +		return IRQ_NONE;
> > > +	}
> > > +
> > >   	if (!more_used(vq)) {
> > >   		pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > >   		return IRQ_NONE;
> > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > index 5464f398912a..957d6ad604ac 100644
> > > --- a/include/linux/virtio.h
> > > +++ b/include/linux/virtio.h
> > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > >    * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > >    * @config_enabled: configuration change reporting enabled
> > >    * @config_change_pending: configuration change reported while disabled
> > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > + * @irq_soft_enabled: callbacks enabled
> > >    * @config_lock: protects configuration change reporting
> > >    * @dev: underlying device.
> > >    * @id: the device type identification (used to match it with a driver).
> > > @@ -109,6 +111,8 @@ struct virtio_device {
> > >   	bool failed;
> > >   	bool config_enabled;
> > >   	bool config_change_pending;
> > > +	bool irq_soft_check;
> > > +	bool irq_soft_enabled;
> > >   	spinlock_t config_lock;
> > >   	spinlock_t vqs_list_lock; /* Protects VQs list access */
> > >   	struct device dev;
> > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > index dafdc7f48c01..9c1b61f2e525 100644
> > > --- a/include/linux/virtio_config.h
> > > +++ b/include/linux/virtio_config.h
> > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > >   	return __virtio_test_bit(vdev, fbit);
> > >   }
> > > +/*
> > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > + * @vdev: the device
> > > + */
> > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > +{
> > > +	if (!vdev->irq_soft_check)
> > > +		return true;
> > > +
> > > +	/*
> > > +	 * Read irq_soft_enabled before reading other device specific
> > > +	 * data. Paried with smp_store_relase() in
> > paired
> 
> 
> Will fix.
> 
> Thanks
> 
> 
> > 
> > > +	 * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > +	 * virtio_reset_device().
> > > +	 */
> > > +	return smp_load_acquire(&vdev->irq_soft_enabled);
> > > +}
> > > +
> > >   /**
> > >    * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > >    * @vdev: the device
> > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > >   	if (dev->config->enable_cbs)
> > >                     dev->config->enable_cbs(dev);
> > > +	/*
> > > +	 * Commit the driver setup before enabling the virtqueue
> > > +	 * callbacks. Paried with smp_load_acuqire() in
> > > +	 * virtio_irq_soft_enabled()
> > > +	 */
> > > +	smp_store_release(&dev->irq_soft_enabled, true);
> > > +
> > >   	BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > >   	dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > >   }
> > > -- 
> > > 2.25.1
