(Re-adding Rusty to CC, looks like the linux-foundation ml drops CCs.) On (Sun) Oct 03 2010 [17:53:59], Michael S. Tsirkin wrote: > On Wed, Sep 29, 2010 at 05:42:24PM +0530, Amit Shah wrote: > > Let host know of our state (partial yet) so that the host PCI device is > > up to where we were when we were suspended. > > > > This is still an RFC as this doesn't completely work: an unused device > > at the time of suspend will work fine after resume, but a device that > > has seen some activity doesn't behave well after resume. Especially, > > host->guest communication doesn't go through. This could be due to > > interrupts/MSI not being wired properly. I haven't looked at this part > > of the problem yet. > > > > We also need a per-driver resume callback which can update the devices > > with driver-specific data. Eg, for virtio_console, the guest port > > open/close status will have to be known to the device. > > > > QEMU also starts using the queues as soon as the VIRTIO_PCI_QUEUE_PFN > > command is sent, so it has to be taught to only start using queues when > > the device is ready and the queues are set up. > > > > However, I just wanted to send this out to get reactions / comments. > > > > Not-yet-Signed-off-by: Amit Shah <amit.shah@xxxxxxxxxx> > > --- > > drivers/virtio/virtio_pci.c | 34 ++++++++++++++++++++++++++++++++++ > > drivers/virtio/virtio_ring.c | 8 ++++++++ > > include/linux/virtio.h | 2 ++ > > include/linux/virtio_pci.h | 4 +++- > > 4 files changed, 47 insertions(+), 1 deletions(-) > > > > diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c > > index ef8d9d5..a3c7f59 100644 > > --- a/drivers/virtio/virtio_pci.c > > +++ b/drivers/virtio/virtio_pci.c > > @@ -698,8 +698,42 @@ static int virtio_pci_suspend(struct pci_dev *pci_dev, pm_message_t state) > > > > static int virtio_pci_resume(struct pci_dev *pci_dev) > > { > > + struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev); > > + struct virtio_pci_vq_info *info; > > + unsigned long flags; > > + > > pci_restore_state(pci_dev); > > pci_set_power_state(pci_dev, PCI_D0); > > + > > + iowrite32(vp_dev->vdev.features[0], vp_dev->ioaddr+VIRTIO_PCI_GUEST_FEATURES); > > need spaces around + That bit is taken from a place elsewhere in the file -- I guess it's that way to keep everything on the same line. But in the final submission, I'll split up the line as I've done the others below. > > + if (vp_dev->msix_used_vectors) > > + iowrite16(vp_dev->msix_used_vectors, > > + vp_dev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR); > > I think this is wrong, msix_used_vectors is a counter not a vector > number, try with 2 vectors and it will break I think: this is shared > vectors configuration. We just never had a need to remember the config > vector number, either do remember it, or rely on the fact that it is > always 0 in our code: and maybe add BUG_ON in vp_request_msix_vectors > where it is first assigned. The condition would simply be > msix_enabled. Yes, I hadn't paid attention to MSI, this helps -- I'll adjust the code accordingly. > > + > > + spin_lock_irqsave(&vp_dev->lock, flags); > > + list_for_each_entry(info, &vp_dev->virtqueues, node) { > > + /* Select the queue we're interested in */ > > + iowrite16(info->queue_index, > > + vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL); > > + > > + /* Update the last idx we sent data in */ > > + iowrite16(virtqueue_get_avail_idx(info->vq), > > + vp_dev->ioaddr + VIRTIO_PCI_AVAIL_IDX); > > Interesting. Could we just reset the ring instead? > I think this would also solve the qemu problem you > outline, won't it? The problem here is qemu doesn't "know" we went into suspend and came out of it. When going to suspend, qemu could've received a kick notification and would've been just about to process some queue entries. Now when we resume and reset the ring, qemu could crash anyway seeing invalid index values. I think that's happening now anyway. However, I think the current qemu problem I have might be related to me not introducing memory barriers here. > > + if (info->msix_vector != VIRTIO_MSI_NO_VECTOR) { > > + iowrite16(info->msix_vector, > > + vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR); > > + } > > actually it's probably better to write msix_vector if it is > VIRTIO_MSI_NO_VECTOR as well, just to make sure > we are in a known state. The condition > should be that msix is enabled in the device. > > Further, please read the value back and verify that it was > written successfully: on error you get VIRTIO_MSI_NO_VECTOR. Yes, error handling is to be done yet. > > + /* activate the queue */ > > + iowrite32(virt_to_phys(info->queue) >> VIRTIO_PCI_QUEUE_ADDR_SHIFT, > > + vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN); > > + > > + } > > + spin_unlock_irqrestore(&vp_dev->lock, flags); > > + > > + vp_set_status(&vp_dev->vdev, VIRTIO_CONFIG_S_DRIVER_OK); > > + > > return 0; > > } > > #endif > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > > index 1475ed6..3eb91d1 100644 > > --- a/drivers/virtio/virtio_ring.c > > +++ b/drivers/virtio/virtio_ring.c > > @@ -237,6 +237,14 @@ add_head: > > } > > EXPORT_SYMBOL_GPL(virtqueue_add_buf_gfp); > > > > +u16 virtqueue_get_avail_idx(struct virtqueue *_vq) > > +{ > > + struct vring_virtqueue *vq = to_vvq(_vq); > > + > > + return vq->vring.avail->idx; > > +} > > +EXPORT_SYMBOL_GPL(virtqueue_get_avail_idx); > > + > > void virtqueue_kick(struct virtqueue *_vq) > > { > > struct vring_virtqueue *vq = to_vvq(_vq); > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h > > index aff5b4f..cdb04ec 100644 > > --- a/include/linux/virtio.h > > +++ b/include/linux/virtio.h > > @@ -88,6 +88,8 @@ bool virtqueue_enable_cb(struct virtqueue *vq); > > > > void *virtqueue_detach_unused_buf(struct virtqueue *vq); > > > > +u16 virtqueue_get_avail_idx(struct virtqueue *vq); > > + > > This is a bit of a layering violation, in that the avail index > belongs to the ring internals. Yes, it is. This exercise is basically to find out what's needed to make suspend work; we can design the final solution once we have a better idea of what's involved. > Possibly the best solution to expose it > is to put the index in the ring. This was one of the ideas > proposed for vring2. Other options include getting rid of the > index entirely. > > > /** > > * virtio_device - representation of a device using virtio > > * @index: unique position on the virtio bus > > diff --git a/include/linux/virtio_pci.h b/include/linux/virtio_pci.h > > index 9a3d7c4..aa3e584 100644 > > --- a/include/linux/virtio_pci.h > > +++ b/include/linux/virtio_pci.h > > @@ -55,9 +55,11 @@ > > /* Vector value used to disable MSI for queue */ > > #define VIRTIO_MSI_NO_VECTOR 0xffff > > > > +#define VIRTIO_PCI_AVAIL_IDX 24 > > + > > /* The remaining space is defined by each driver as the per-driver > > * configuration space */ > > -#define VIRTIO_PCI_CONFIG(dev) ((dev)->msix_enabled ? 24 : 20) > > +#define VIRTIO_PCI_CONFIG(dev) 26 > > This will silently break all devices out there, won't it? Sure it does :-) (Has to be handled when this is proposed as a fix officially.) Amit _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linux-foundation.org/mailman/listinfo/virtualization