RE: [PATCH] video: hyperv: hyperv_fb: Use physical memory for fb on HyperV Gen 1 VMs.

Michael Kelley <mikelley@xxxxxxxxxxxxx> · Sat, 2 Nov 2019 00:28:32 +0000

From: Wei Hu <weh@xxxxxxxxxxxxx> Sent: Tuesday, October 22, 2019 4:11 AM
> 
> On Hyper-V, Generation 1 VMs can directly use VM's physical memory for
> their framebuffers. This can improve the efficiency of framebuffer and
> overall performence for VM. The physical memory assigned to framebuffer
> must be contiguous. We use CMA allocator to get contiguouse physicial
> memory when the framebuffer size is greater than 4MB. For size under
> 4MB, we use alloc_pages to achieve this.
> 
> To enable framebuffer memory allocation from CMA, supply a kernel
> parameter to give enough space to CMA allocator at boot time. For
> example:
>     cma=130m
> This gives 130MB memory to CAM allocator that can be allocated to
> framebuffer. If this fails, we fall back to the old way of using
> mmio for framebuffer.
> 
> Signed-off-by: Wei Hu <weh@xxxxxxxxxxxxx>
> ---

[snip]

> +/*
> + * Allocate enough contiguous physical memory.
> + * Return physical address if succeeded or -1 if failed.
> + */
> +static unsigned long hvfb_get_phymem(unsigned int request_size)
> +{
> +	struct page *page = NULL;
> +	unsigned int request_pages;
> +	unsigned long paddr = 0;

Using 'unsigned long' for physical addresses is problematic on 32-bit
systems with Physical Address Extension (PAE) enabled. Linux has
phys_addr_t that handles 32-bit PAE correctly and that probably
should be used here.   This whole driver needs to be carefully
checked for the correct type for physical addresses.

> +	unsigned int order = get_order(request_size);
> +
> +	if (request_size == 0)
> +		return -1;
> +
> +	/* Try call alloc_pages if the size is less than 2^MAX_ORDER */
> +	if (order < MAX_ORDER) {
> +		page = alloc_pages(GFP_KERNEL | __GFP_ZERO, order);
> +		if (!page)
> +			return -1;
> +
> +		request_pages = (1 << order);
> +		goto get_phymem1;
> +	}
> +
> +	/* Allocate from CMA */
> +	// request_pages = (request_size >> PAGE_SHIFT) + 1;
> +	request_pages = (round_up(request_size, PAGE_SIZE) >> PAGE_SHIFT);
> +	page = dma_alloc_from_contiguous(NULL, request_pages, 0, false);
> +
> +	if (page == NULL)
> +		return -1;
> +
> +get_phymem1:
> +	paddr = (page_to_pfn(page) << PAGE_SHIFT);
> +
> +	pr_info("Allocated %d pages starts at physical addr 0x%lx\n",
> +		request_pages, paddr);
> +
> +	return paddr;
> +}
> +
> +/* Release contiguous physical memory */
> +static void hvfb_release_phymem(unsigned long paddr, unsigned int size)

Same issue here with 'unsigned long paddr'.

> +{
> +	unsigned int order = get_order(size);
> +
> +	if (order < MAX_ORDER)
> +		__free_pages(pfn_to_page(paddr >> PAGE_SHIFT), order);
> +	else
> +		dma_release_from_contiguous(NULL,
> +					    pfn_to_page(paddr >> PAGE_SHIFT),
> +					    (round_up(size, PAGE_SIZE) >>
> +					     PAGE_SHIFT));
> +					    // (size >> PAGE_SHIFT) + 1);
> +}
> +
> 
>  /* Get framebuffer memory from Hyper-V video pci space */
>  static int hvfb_getmem(struct hv_device *hdev, struct fb_info *info)
> @@ -947,8 +1017,58 @@ static int hvfb_getmem(struct hv_device *hdev, struct fb_info
> *info)
>  	void __iomem *fb_virt;
>  	int gen2vm = efi_enabled(EFI_BOOT);
>  	resource_size_t pot_start, pot_end;
> +	unsigned long paddr;

Same issue with 'unsigned long'.

>  	int ret;
> 
> +	if (!gen2vm) {
> +		pdev = pci_get_device(PCI_VENDOR_ID_MICROSOFT,
> +			PCI_DEVICE_ID_HYPERV_VIDEO, NULL);
> +		if (!pdev) {
> +			pr_err("Unable to find PCI Hyper-V video\n");
> +			return -ENODEV;
> +		}
> +	}
> +
> +	info->apertures = alloc_apertures(1);
> +	if (!info->apertures)
> +		goto err1;
> +
> +	if (gen2vm) {
> +		info->apertures->ranges[0].base = screen_info.lfb_base;
> +		info->apertures->ranges[0].size = screen_info.lfb_size;
> +	} else {
> +		info->apertures->ranges[0].base = pci_resource_start(pdev, 0);
> +		info->apertures->ranges[0].size = pci_resource_len(pdev, 0);
> +	}
> +
> +	/*
> +	 * For Gen 1 VM, we can directly use the contiguous memory
> +	 * from VM. If we success, deferred IO happens directly
> +	 * on this allocated framebuffer memory, avoiding extra
> +	 * memory copy.
> +	 */
> +	if (!gen2vm) {
> +		paddr = hvfb_get_phymem(screen_fb_size);
> +		if (paddr != (unsigned long) -1) {

phys_addr_t

> +			par->mmio_pp = paddr;

I'm not really sure what to do about the above, because mmio_pp is
declared as 'unsigned long' when it really should be phys_addr_t.
But maybe a MMIO address will always be less than 4 Gb and therefore
will fit in 32 bits?

> +			par->mmio_vp = par->dio_vp = __va(paddr);
> +
> +			info->fix.smem_start = paddr;

smem_start is also declared as 'unsigned long', which seems
problematic with 32-bit PAE.  There are a lot of drivers that cast values
as 'unsigned long' before storing into smem_start

> +			info->fix.smem_len = screen_fb_size;
> +			info->screen_base = par->mmio_vp;
> +			info->screen_size = screen_fb_size;
> +
> +			par->need_docopy = false;
> +			goto getmem1;
> +		} else {
> +			pr_info("Unable to allocate enough contiguous physical memory on
> Gen 1 VM. Use MMIO instead.\n");
> +		}
> +	}
> +
> +	/*
> +	 * Cannot use the contiguous physical memory.
> +	 * Allocate mmio space for framebuffer.
> +	 */
>  	dio_fb_size =
>  		screen_width * screen_height * screen_depth / 8;
> 
> @@ -956,13 +1076,6 @@ static int hvfb_getmem(struct hv_device *hdev, struct fb_info
> *info)
>  		pot_start = 0;
>  		pot_end = -1;
>  	} else {
> -		pdev = pci_get_device(PCI_VENDOR_ID_MICROSOFT,
> -			      PCI_DEVICE_ID_HYPERV_VIDEO, NULL);
> -		if (!pdev) {
> -			pr_err("Unable to find PCI Hyper-V video\n");
> -			return -ENODEV;
> -		}
> -
>  		if (!(pci_resource_flags(pdev, 0) & IORESOURCE_MEM) ||
>  		    pci_resource_len(pdev, 0) < screen_fb_size) {
>  			pr_err("Resource not available or (0x%lx < 0x%lx)\n",
> @@ -991,20 +1104,6 @@ static int hvfb_getmem(struct hv_device *hdev, struct fb_info
> *info)
>  	if (par->dio_vp == NULL)
>  		goto err3;
> 
> -	info->apertures = alloc_apertures(1);
> -	if (!info->apertures)
> -		goto err4;
> -
> -	if (gen2vm) {
> -		info->apertures->ranges[0].base = screen_info.lfb_base;
> -		info->apertures->ranges[0].size = screen_info.lfb_size;
> -		remove_conflicting_framebuffers(info->apertures,
> -						KBUILD_MODNAME, false);
> -	} else {
> -		info->apertures->ranges[0].base = pci_resource_start(pdev, 0);
> -		info->apertures->ranges[0].size = pci_resource_len(pdev, 0);
> -	}
> -
>  	/* Physical address of FB device */
>  	par->mmio_pp = par->mem->start;

32-bit PAE problem could also occur here, even though the above line isn't
part of your patch.

Ideally, we would build the kernel in 32-bit mode with PAE enabled, and
test in an environment where the frame buffer gets allocated above the
4 Gbyte line.  There may be some similar bugs in other parts of this
driver that aren't touched by the patch.

Michael