RE: [PATCH V2 2/3] Drivers: hv: balloon: Support 2M page allocations for ballooning

KY Srinivasan <kys@xxxxxxxxxxxxx> · Tue, 19 Mar 2013 14:53:57 +0000

> -----Original Message-----
> From: Michal Hocko [mailto:mhocko@xxxxxxx]
> Sent: Tuesday, March 19, 2013 10:46 AM
> To: KY Srinivasan
> Cc: gregkh@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> devel@xxxxxxxxxxxxxxxxxxxxxx; olaf@xxxxxxxxx; apw@xxxxxxxxxxxxx;
> andi@xxxxxxxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> kamezawa.hiroyuki@xxxxxxxxx; hannes@xxxxxxxxxxx; yinghan@xxxxxxxxxx
> Subject: Re: [PATCH V2 2/3] Drivers: hv: balloon: Support 2M page allocations for
> ballooning
> 
> On Mon 18-03-13 13:51:37, K. Y. Srinivasan wrote:
> > On Hyper-V it will be very efficient to use 2M allocations in the guest as this
> > makes the ballooning protocol with the host that much more efficient. Hyper-V
> > uses page ranges (start pfn : number of pages) to specify memory being moved
> > around and with 2M pages this encoding can be very efficient. However, when
> > memory is returned to the guest, the host does not guarantee any granularity.
> > To deal with this issue, split the page soon after a successful 2M allocation
> > so that this memory can potentially be freed as 4K pages.
> 
> How many pages are requested usually?

This depends entirely on how many pages the guest has, the pressure reported by the guest, and
the overall memory demand as perceived by the host. On idle guests that have been configured with
several gigabytes of memory, I have seen several gigabytes ballooned out of the guest as soon
as new VMs are started or pressure goes up in an existing VM. In these cases, if 2M allocations succeed,
the ballooning operation can be done very efficiently.
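
To put rough numbers on the encoding, here is a small userspace sketch (not driver code).
It counts how many (start pfn : number of pages) ranges are needed to describe a given
amount of ballooned memory. The helper name ranges_needed() and the worst-case assumption
that no two allocations are physically contiguous are illustrative assumptions, not part
of the patch.

```c
#define PAGE_SIZE_4K 4096ULL
#define PAGES_PER_2M 512ULL /* 2M / 4K */

/*
 * Hypothetical helper: number of page ranges needed to describe `bytes`
 * of ballooned memory when every allocation covers `alloc_unit` 4K pages.
 * Assumes the worst case where no two allocations are contiguous, so each
 * allocation needs its own range entry.
 */
static unsigned long long ranges_needed(unsigned long long bytes,
					unsigned long long alloc_unit)
{
	unsigned long long pages = bytes / PAGE_SIZE_4K;

	/* Round up so a partial chunk still gets a range entry. */
	return (pages + alloc_unit - 1) / alloc_unit;
}
```

Under these assumptions, ballooning 1 GB takes 262144 range entries with 4K allocations
and 512 entries with 2M allocations.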
> 
> > If 2M allocations fail, we revert to 4K allocations.
> >
> > In this version of the patch, based on the feedback from Michal Hocko
> > <mhocko@xxxxxxx>, I have added some additional commentary to the patch
> > description.
> >
> > Signed-off-by: K. Y. Srinivasan <kys@xxxxxxxxxxxxx>
> 
> I am not going to ack the patch because I am still not entirely
> convinced that big allocations are worth it. But that is up to you and
> hyper-V users.

As I said, Hyper-V has chosen 2M as the unit that is most efficient for moving memory
around. All I am doing is making Linux participate in this protocol just as efficiently as Windows
guests do.

Regards,

K. Y
> 
> > ---
> >  drivers/hv/hv_balloon.c |   18 ++++++++++++++++--
> >  1 files changed, 16 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
> > index 2cf7d4e..71655b4 100644
> > --- a/drivers/hv/hv_balloon.c
> > +++ b/drivers/hv/hv_balloon.c
> > @@ -997,6 +997,14 @@ static int  alloc_balloon_pages(struct hv_dynmem_device *dm, int num_pages,
> >
> >  		dm->num_pages_ballooned += alloc_unit;
> >
> > +		/*
> > +		 * If we allocated 2M pages, split them so we
> > +		 * can free them in any order we get.
> > +		 */
> > +
> > +		if (alloc_unit != 1)
> > +			split_page(pg, get_order(alloc_unit << PAGE_SHIFT));
> > +
> >  		bl_resp->range_count++;
> >  		bl_resp->range_array[i].finfo.start_page =
> >  			page_to_pfn(pg);
> 
> I would suggest also using __GFP_NO_KSWAPD (or basically use
> GFP_TRANSHUGE for alloc_unit>0) for the allocation to be as least
> disruptive as possible.
> 
> > @@ -1023,9 +1031,10 @@ static void balloon_up(struct work_struct *dummy)
> >
> >
> >  	/*
> > -	 * Currently, we only support 4k allocations.
> > +	 * We will attempt 2M allocations. However, if we fail to
> > +	 * allocate 2M chunks, we will go back to 4k allocations.
> >  	 */
> > -	alloc_unit = 1;
> > +	alloc_unit = 512;
> >
> >  	while (!done) {
> >  		bl_resp = (struct dm_balloon_response *)send_buffer;
> > @@ -1041,6 +1050,11 @@ static void balloon_up(struct work_struct *dummy)
> >  						bl_resp, alloc_unit,
> >  						 &alloc_error);
> >
> 
> You should handle alloc_balloon_pages returns 0 && !alloc_error which
> happens when num_pages < alloc_unit.
> 
> > +		if ((alloc_error) && (alloc_unit != 1)) {
> > +			alloc_unit = 1;
> > +			continue;
> > +		}
> > +
> >  		if ((alloc_error) || (num_ballooned == num_pages)) {
> >  			bl_resp->more_pages = 0;
> >  			done = true;
> > --
> > 1.7.4.1
> >
> 
> --
> Michal Hocko
> SUSE Labs
> 
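
For reference, a userspace sketch of the 2M-to-4K fallback discussed in the quoted
review, including the case where the allocator returns 0 without setting the error
flag because fewer than alloc_unit pages remain. Everything here (struct fake_mm,
fake_alloc_balloon_pages(), balloon_up_sketch()) is a hypothetical stand-in that only
models the control flow; it is not the hv_balloon code.

```c
#include <stdbool.h>

#define PAGES_PER_2M 512

/* Hypothetical model of what the guest can still allocate. */
struct fake_mm {
	int free_2m_chunks;
	int free_4k_pages;
};

/*
 * Stand-in for the allocator: hands out chunks of alloc_unit 4K pages
 * while at least alloc_unit pages are still wanted. Returns the number
 * of pages obtained and sets *alloc_error if an allocation fails.
 * Returns 0 with no error when num_pages < alloc_unit.
 */
static int fake_alloc_balloon_pages(struct fake_mm *mm, int num_pages,
				    int alloc_unit, bool *alloc_error)
{
	int got = 0;

	while (num_pages - got >= alloc_unit) {
		int *pool = (alloc_unit == PAGES_PER_2M) ?
			&mm->free_2m_chunks : &mm->free_4k_pages;

		if (*pool == 0) {
			*alloc_error = true;
			return got;
		}
		(*pool)--;
		got += alloc_unit;
	}
	return got;
}

/* Try 2M chunks first, then drop to 4K pages for the rest. */
static int balloon_up_sketch(struct fake_mm *mm, int num_pages)
{
	int alloc_unit = PAGES_PER_2M;
	int ballooned = 0;

	while (ballooned < num_pages) {
		bool alloc_error = false;
		int got = fake_alloc_balloon_pages(mm, num_pages - ballooned,
						   alloc_unit, &alloc_error);

		ballooned += got;

		/*
		 * Fall back on a failed 2M allocation, and also when the
		 * remainder is smaller than one 2M chunk (got == 0 with
		 * no error), so the loop cannot spin forever.
		 */
		if (alloc_unit != 1 && (alloc_error || got == 0)) {
			alloc_unit = 1;
			continue;
		}

		if (alloc_error)
			break;
	}
	return ballooned;
}
```

In this model, a request for 1000 pages with 2M chunks available is served as one
2M chunk plus 488 single 4K pages.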
