Re: [PATCH] virtio: 9p: correctly pass physical address to userspace for high pages

Hi Rusty,

On Thu, Oct 18, 2012 at 03:19:06AM +0100, Rusty Russell wrote:
> Will Deacon <will.deacon@xxxxxxx> writes:
> > When using a virtio transport, the 9p net device allocates pages to back
> > the descriptors inserted into the virtqueue. These allocations may be
> > performed from atomic context (under the channel lock) and can therefore
> > return high mappings which aren't suitable for virt_to_phys.
> 
> I had not appreciated that subtlety about GFP_ATOMIC :(

Yeah, it's unfortunate for poor old userspace.

> This isn't just 9p, the console, block, scsi and net devices also use
> GFP_ATOMIC.

OK, I'll split this patch in two, since I think only 9p has the zero-copy
stuff, which is why an extra fix is needed there to create the scatterlist
correctly.

> > @@ -165,7 +166,8 @@ static int vring_add_indirect(struct vring_virtqueue *vq,
> >  	/* Use a single buffer which doesn't continue */
> >  	head = vq->free_head;
> >  	vq->vring.desc[head].flags = VRING_DESC_F_INDIRECT;
> > -	vq->vring.desc[head].addr = virt_to_phys(desc);
> > +	vq->vring.desc[head].addr = page_to_phys(kmap_to_page(desc)) +
> > +				    ((unsigned long)desc & ~PAGE_MASK);
> >  	vq->vring.desc[head].len = i * sizeof(struct vring_desc);
> 
> Gah, virt_to_phys_harder()?

Tell me about it...

> What's the performance effect?  If it's negligible, why doesn't
> virt_to_phys() just do this for us?

I've not measured it, but even when you don't have CONFIG_HIGHMEM, there's
going to be an overhead here because we go around the houses to get the page
and then add the offset on afterwards. I doubt it's something we want to
plumb directly into virt_to_phys (also, kmap_to_page may call virt_to_phys via
the __pa macro so we'd get stuck).
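
To make the shape of that hunk concrete, here is a rough userspace sketch of
the "look up the page, then add the in-page offset back on" arithmetic. It is
only a model: toy_mapping, toy_page_phys and toy_virt_to_phys_harder are
made-up names standing in for the kernel's page_to_phys(kmap_to_page(...))
path, and a 4K page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4K pages, as on most configurations. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  ((uintptr_t)1 << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Hypothetical toy table: one entry per mapped virtual page. */
struct toy_mapping {
	uintptr_t virt_page;	/* page-aligned virtual address */
	uint64_t  phys_page;	/* page-aligned physical address */
};

/* Stand-in for page_to_phys(kmap_to_page(vaddr)): find the backing page. */
static uint64_t toy_page_phys(const struct toy_mapping *map, size_t n,
			      uintptr_t vaddr)
{
	uintptr_t vpage = vaddr & PAGE_MASK;

	for (size_t i = 0; i < n; i++)
		if (map[i].virt_page == vpage)
			return map[i].phys_page;
	return UINT64_MAX;	/* not mapped in this toy model */
}

/* Page lookup first, then add the offset within the page afterwards. */
static uint64_t toy_virt_to_phys_harder(const struct toy_mapping *map,
					size_t n, uintptr_t vaddr)
{
	uint64_t base = toy_page_phys(map, n, vaddr);

	if (base == UINT64_MAX)
		return UINT64_MAX;
	assert((base & ~PAGE_MASK) == 0);	/* page base must be aligned */
	return base + (vaddr & ~PAGE_MASK);
}
```

The point of the sketch is just that the translation never assumes a linear
virt-to-phys relationship; the extra lookup is where the overhead comes from.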

> We do have an alternate solution: masking out __GFP_HIGHMEM from the
> kmalloc of desc.  If it fails, we will fall back to laying out the
> virtio request directly inside the ring; if it doesn't fit, we'll wait
> for the device to consume more buffers.

Hmm, that will probably work for the vring but the zero-copy code for 9p may
just give us an address from userspace if I'm understanding it correctly. In
that case, we really have to do the translation as below (which is actually
much cleaner because everything is page-aligned).

> > @@ -325,7 +326,7 @@ static int p9_get_mapped_pages(struct virtio_chan *chan,
> >  		int count = nr_pages;
> >  		while (nr_pages) {
> >  			s = rest_of_page(data);
> > -			pages[index++] = virt_to_page(data);
> > +			pages[index++] = kmap_to_page(data);
> >  			data += s;
> >  			nr_pages--;
> >  		}
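
For reference, the loop in the hunk above steps through the buffer one page
at a time. A rough userspace model of that walk follows; it assumes 4K pages
and that rest_of_page() returns the number of bytes left in the current page,
and it records page base addresses where the kernel code records struct page
pointers. pages_spanned and walk_pages are made-up helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4K pages. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  ((uintptr_t)1 << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Bytes remaining from addr to the end of its page (assumed semantics). */
static size_t rest_of_page(uintptr_t addr)
{
	return PAGE_SIZE - (addr & (PAGE_SIZE - 1));
}

/* Hypothetical helper: number of pages touched by [addr, addr + len). */
static size_t pages_spanned(uintptr_t addr, size_t len)
{
	if (len == 0)
		return 0;
	return ((addr + len - 1) >> PAGE_SHIFT) - (addr >> PAGE_SHIFT) + 1;
}

/* Same loop shape as the hunk: record one entry per page, then advance. */
static size_t walk_pages(uintptr_t data, size_t len,
			 uintptr_t *page_bases, size_t max)
{
	size_t nr_pages = pages_spanned(data, len);
	size_t index = 0;

	assert(nr_pages <= max);
	while (nr_pages) {
		size_t s = rest_of_page(data);

		page_bases[index++] = data & PAGE_MASK;
		data += s;	/* now at the start of the next page */
		nr_pages--;
	}
	return index;
}
```

After the first iteration data is page-aligned, so each later step is a
whole page; that is why the per-page translation is the simple case.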

So what do you reckon? How about I leave this hunk as a separate patch and
have a play masking out __GFP_HIGHMEM for the vring descriptor?

Cheers,

Will