On 7/24/19 3:56 PM, David Hildenbrand wrote:
> On 24.07.19 21:47, Michael S. Tsirkin wrote:
>> On Wed, Jul 10, 2019 at 03:51:58PM -0400, Nitesh Narayan Lal wrote:
>>> Enables the kernel to negotiate VIRTIO_BALLOON_F_HINTING feature with the
>>> host. If it is available and page_hinting_flag is set to true, page_hinting
>>> is enabled and its callbacks are configured along with the max_pages count
>>> which indicates the maximum number of pages that can be isolated and hinted
>>> at a time. Currently, only free pages of order >= (MAX_ORDER - 2) are
>>> reported. To prevent any false OOM max_pages count is set to 16.
>>>
>>> By default page_hinting feature is enabled and gets loaded as soon
>>> as the virtio-balloon driver is loaded. However, it could be disabled
>>> by writing the page_hinting_flag which is a virtio-balloon parameter.
>>>
>>> Signed-off-by: Nitesh Narayan Lal <nitesh@xxxxxxxxxx>
>>> ---
>>>  drivers/virtio/Kconfig              |  1 +
>>>  drivers/virtio/virtio_balloon.c     | 91 ++++++++++++++++++++++++++++-
>>>  include/uapi/linux/virtio_balloon.h | 11 ++++
>>>  3 files changed, 102 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
>>> index 023fc3bc01c6..dcc0cb4269a5 100644
>>> --- a/drivers/virtio/Kconfig
>>> +++ b/drivers/virtio/Kconfig
>>> @@ -47,6 +47,7 @@ config VIRTIO_BALLOON
>>>  	tristate "Virtio balloon driver"
>>>  	depends on VIRTIO
>>>  	select MEMORY_BALLOON
>>> +	select PAGE_HINTING
>>>  	---help---
>>>  	 This driver supports increasing and decreasing the amount
>>>  	 of memory within a KVM guest.
>>> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
>>> index 44339fc87cc7..1fb0eb0b2c20 100644
>>> --- a/drivers/virtio/virtio_balloon.c
>>> +++ b/drivers/virtio/virtio_balloon.c
>>> @@ -18,6 +18,7 @@
>>>  #include <linux/mm.h>
>>>  #include <linux/mount.h>
>>>  #include <linux/magic.h>
>>> +#include <linux/page_hinting.h>
>>>
>>>  /*
>>>   * Balloon device works in 4K page units. So each page is pointed to by
>>> @@ -35,6 +36,12 @@
>>>  /* The size of a free page block in bytes */
>>>  #define VIRTIO_BALLOON_FREE_PAGE_SIZE \
>>>  	(1 << (VIRTIO_BALLOON_FREE_PAGE_ORDER + PAGE_SHIFT))
>>> +/* Number of isolated pages to be reported to the host at a time.
>>> + * TODO:
>>> + * 1. Set it via host.
>>> + * 2. Find an optimal value for this.
>>> + */
>>> +#define PAGE_HINTING_MAX_PAGES 16
>>>
>>>  #ifdef CONFIG_BALLOON_COMPACTION
>>>  static struct vfsmount *balloon_mnt;
>>> @@ -45,6 +52,7 @@ enum virtio_balloon_vq {
>>>  	VIRTIO_BALLOON_VQ_DEFLATE,
>>>  	VIRTIO_BALLOON_VQ_STATS,
>>>  	VIRTIO_BALLOON_VQ_FREE_PAGE,
>>> +	VIRTIO_BALLOON_VQ_HINTING,
>>>  	VIRTIO_BALLOON_VQ_MAX
>>>  };
>>>
>>> @@ -54,7 +62,8 @@ enum virtio_balloon_config_read {
>>>
>>>  struct virtio_balloon {
>>>  	struct virtio_device *vdev;
>>> -	struct virtqueue *inflate_vq, *deflate_vq, *stats_vq, *free_page_vq;
>>> +	struct virtqueue *inflate_vq, *deflate_vq, *stats_vq, *free_page_vq,
>>> +			 *hinting_vq;
>>>
>>>  	/* Balloon's own wq for cpu-intensive work items */
>>>  	struct workqueue_struct *balloon_wq;
>>> @@ -112,6 +121,9 @@ struct virtio_balloon {
>>>
>>>  	/* To register a shrinker to shrink memory upon memory pressure */
>>>  	struct shrinker shrinker;
>>> +
>>> +	/* Array object pointing at the isolated pages ready for hinting */
>>> +	struct isolated_memory isolated_pages[PAGE_HINTING_MAX_PAGES];
>>>  };
>>>
>>>  static struct virtio_device_id id_table[] = {
>>> @@ -119,6 +131,66 @@ static struct virtio_device_id id_table[] = {
>>>  	{ 0 },
>>>  };
>>>
>>> +static struct page_hinting_config page_hinting_conf;
>>> +bool page_hinting_flag = true;
>>> +struct virtio_balloon *hvb;
>>> +module_param(page_hinting_flag, bool, 0444);
>>> +MODULE_PARM_DESC(page_hinting_flag, "Enable page hinting");
>>> +
>>> +static int page_hinting_report(void)
>>> +{
>>> +	struct virtqueue *vq = hvb->hinting_vq;
>>> +	struct scatterlist sg;
>>> +	int err = 0, unused;
>>> +
>>> +	mutex_lock(&hvb->balloon_lock);
>>> +	sg_init_one(&sg, hvb->isolated_pages, sizeof(hvb->isolated_pages[0]) *
>>> +		    PAGE_HINTING_MAX_PAGES);
>>> +	err = virtqueue_add_outbuf(vq, &sg, 1, hvb, GFP_KERNEL);
>> In Alex's patch, I really like it that he's passing pages as sg
>> entries. IMHO that's both cleaner and allows seamless
>> support for arbitrary page sizes.
>>
> +1
>
> I especially like passing full addresses and sizes instead of PFNs and
> orders (compared to Alex's v1, where he would pass PFNs and orders).

I agree it fixes the issues which could have been introduced due to
different page sizes in the host and the guest.
>
-- 
Thanks
Nitesh
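
To make the page-size point above concrete, here is a minimal userspace sketch (not code from either patch series). It shows that the same (PFN, order) pair describes different byte ranges depending on the page shift of whoever interprets it, while an (address, length) pair needs no such context. The struct hint_range type and the pfn_order_to_range() helper are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration type; not part of the patch or the virtio ABI. */
struct hint_range {
	uint64_t addr;	/* start address in bytes */
	uint64_t len;	/* length in bytes */
};

/*
 * Expand a (PFN, order) pair into a byte range. The result depends on
 * page_shift, which is exactly the context that is lost when only the
 * PFN and order are sent to a peer with a different page size.
 */
static inline struct hint_range pfn_order_to_range(uint64_t pfn,
						   unsigned int order,
						   unsigned int page_shift)
{
	struct hint_range r;

	/* Keep the shift below the width of uint64_t. */
	assert(page_shift + order < 64);

	r.addr = pfn << page_shift;
	r.len = (uint64_t)1 << (page_shift + order);
	return r;
}
```

As an assumed example, PFN 0x100 with order 9 expands to address 0x100000 and length 2 MiB under a 4K page shift (12), but to address 0x1000000 and length 32 MiB under a 64K page shift (16). Sending the expanded address and length avoids that ambiguity.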