On Tue, Feb 11, 2020 at 12:03:57PM +0100, David Hildenbrand wrote:
> On 22.01.20 18:43, Alexander Duyck wrote:
> > From: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
> >
> > Add support for the page reporting feature provided by virtio-balloon.
> > Reporting differs from the regular balloon functionality in that it is
> > much less durable than a standard memory balloon. Instead of creating a
> > list of pages that cannot be accessed the pages are only inaccessible
> > while they are being indicated to the virtio interface. Once the
> > interface has acknowledged them they are placed back into their respective
> > free lists and are once again accessible by the guest system.
> >
> > Unlike a standard balloon we don't inflate and deflate the pages. Instead
> > we perform the reporting, and once the reporting is completed it is
> > assumed that the page has been dropped from the guest and will be faulted
> > back in the next time the page is accessed.
> >
> > Acked-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
> > Reviewed-by: David Hildenbrand <david@xxxxxxxxxx>
> > Signed-off-by: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
> > ---
> >  drivers/virtio/Kconfig              |  1 +
> >  drivers/virtio/virtio_balloon.c     | 64 +++++++++++++++++++++++++++++++++++
> >  include/uapi/linux/virtio_balloon.h |  1 +
> >  3 files changed, 66 insertions(+)
> >
> > diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> > index 078615cf2afc..4b2dd8259ff5 100644
> > --- a/drivers/virtio/Kconfig
> > +++ b/drivers/virtio/Kconfig
> > @@ -58,6 +58,7 @@ config VIRTIO_BALLOON
> >  	tristate "Virtio balloon driver"
> >  	depends on VIRTIO
> >  	select MEMORY_BALLOON
> > +	select PAGE_REPORTING
> >  	---help---
> >  	 This driver supports increasing and decreasing the amount
> >  	 of memory within a KVM guest.
> > diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> > index 40bb7693e3de..a07b9e18a292 100644
> > --- a/drivers/virtio/virtio_balloon.c
> > +++ b/drivers/virtio/virtio_balloon.c
> > @@ -19,6 +19,7 @@
> >  #include <linux/mount.h>
> >  #include <linux/magic.h>
> >  #include <linux/pseudo_fs.h>
> > +#include <linux/page_reporting.h>
> >
> >  /*
> >   * Balloon device works in 4K page units. So each page is pointed to by
> > @@ -47,6 +48,7 @@ enum virtio_balloon_vq {
> >  	VIRTIO_BALLOON_VQ_DEFLATE,
> >  	VIRTIO_BALLOON_VQ_STATS,
> >  	VIRTIO_BALLOON_VQ_FREE_PAGE,
> > +	VIRTIO_BALLOON_VQ_REPORTING,
> >  	VIRTIO_BALLOON_VQ_MAX
> >  };
> >
> > @@ -114,6 +116,10 @@ struct virtio_balloon {
> >
> >  	/* To register a shrinker to shrink memory upon memory pressure */
> >  	struct shrinker shrinker;
> > +
> > +	/* Free page reporting device */
> > +	struct virtqueue *reporting_vq;
> > +	struct page_reporting_dev_info pr_dev_info;
> >  };
> >
> >  static struct virtio_device_id id_table[] = {
> > @@ -153,6 +159,33 @@ static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq)
> >
> >  }
> >
> > +int virtballoon_free_page_report(struct page_reporting_dev_info *pr_dev_info,
> > +				   struct scatterlist *sg, unsigned int nents)
> > +{
> > +	struct virtio_balloon *vb =
> > +		container_of(pr_dev_info, struct virtio_balloon, pr_dev_info);
> > +	struct virtqueue *vq = vb->reporting_vq;
> > +	unsigned int unused, err;
> > +
> > +	/* We should always be able to add these buffers to an empty queue. */
> > +	err = virtqueue_add_inbuf(vq, sg, nents, vb, GFP_NOWAIT | __GFP_NOWARN);
> > +
> > +	/*
> > +	 * In the extremely unlikely case that something has occurred and we
> > +	 * are able to trigger an error we will simply display a warning
> > +	 * and exit without actually processing the pages.
> > +	 */
> > +	if (WARN_ON_ONCE(err))
> > +		return err;
> > +
> > +	virtqueue_kick(vq);
> > +
> > +	/* When host has read buffer, this completes via balloon_ack */
> > +	wait_event(vb->acked, virtqueue_get_buf(vq, &unused));
> > +
> > +	return 0;
> > +}
>
>
> Did you see the discussion regarding unifying handling of
> inflate/deflate/free_page_hinting_free_page_reporting, requested by
> Michael? I think free page reporting is special and shall be left alone.

Not sure what you mean by "left alone" here. Could you clarify?

> VIRTIO_BALLOON_F_REPORTING is nothing but a more advanced inflate, right
> (sg, inflate based on size - not "virtio pages")?

Not exactly - it's also initiated by the guest as opposed to the host, and
it is not guided by the balloon size request set by the host.
It also uses a dedicated queue to avoid blocking other functionality ...

I really think this is more like an inflate immediately followed by a deflate.

> And you rely on
> deflates not being required before reusing an inflated page.
>
> I suggest the following:
>
> /* New interface (+ 2 virtqueues) to inflate/deflate using a SG */
> VIRTIO_BALLOON_F_SG
> /*
>  * No need to deflate when reusing pages (once the inflate request was
>  * processed). Applies to all inflate queues.
>  */
> VIRTIO_BALLOON_F_OPTIONAL_DEFLATE
>
> And two new virtqueues
>
> VIRTIO_BALLOON_VQ_INFLATE_SG
> VIRTIO_BALLOON_VQ_DEFLATE_SG
>
>
> Your feature would depend on VIRTIO_BALLOON_F_SG &&
> VIRTIO_BALLOON_F_OPTIONAL_DEFLATE. VIRTIO_BALLOON_F_OPTIONAL_DEFLATE
> could be reused to avoid deflating on certain events (e.g., from
> OOM/shrinker).
>
> Thoughts?

I'd rather wait until we have a use case, and preferably a POC
showing it helps, before we add optional deflate ...
For now I personally am fine with just letting this go ahead as is,
and implying SG and OPTIONAL_DEFLATE just for this VQ.

Do you feel strongly that we need to bring this up to a TC vote?
It means a spec patch needs to be written, but it does not have
to be a big patch ...
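
To make the dependency rule proposed above concrete, here is a rough,
standalone sketch of the check it would imply on the offered feature bits.
Everything in it is an assumption for illustration: the sketch_* / SKETCH_*
names are hypothetical and the bit numbers are placeholders, not values taken
from the spec or from this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit numbers, for illustration only (not from the spec). */
#define SKETCH_F_REPORTING		5
#define SKETCH_F_SG			6
#define SKETCH_F_OPTIONAL_DEFLATE	7

#define SKETCH_BIT(n)	(UINT64_C(1) << (n))

/*
 * Hypothetical helper: returns true if the feature set is consistent
 * with the proposed rule, i.e. REPORTING is only offered together with
 * both SG and OPTIONAL_DEFLATE.
 */
static bool sketch_features_valid(uint64_t features)
{
	const uint64_t deps = SKETCH_BIT(SKETCH_F_SG) |
			      SKETCH_BIT(SKETCH_F_OPTIONAL_DEFLATE);

	/* Nothing to check when REPORTING is not offered. */
	if (!(features & SKETCH_BIT(SKETCH_F_REPORTING)))
		return true;

	return (features & deps) == deps;
}
```

Under this rule a device offering REPORTING alone would be rejected. Implying
SG and OPTIONAL_DEFLATE just for the reporting VQ, as suggested above, amounts
to not having such a check at all.
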
> --
> Thanks,
>
> David / dhildenb