Re: Infrastructure for zerocopy I/O

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Nov 30, 2015 at 12:14:59PM -0500, Alan Stern wrote:
> As I recall, Markus's original patch took care of this by checking to
> see whether the transfer buffer was in one of the mmap'ed areas.  If it
> was then the transfer would be zerocopy; otherwise it would be normal.  
> That seems like a reasonable approach.

OK. So that's basically preserved in this version.

>> Clarification appreciated ;-)
> Detailed notes below.

Thanks for the review.

>> +struct usb_memory {
>> +	struct list_head memlist;
>> +	int vma_use_count;
>> +	int usb_use_count;
>> +	u32 offset;
> What's the reason for the "offset" member?  It doesn't seem to do
> anything.

No longer needed; removed.

>> +	u32 size;
>> +	void *mem;
> You'll need to store the DMA address as well.

Done.

>> +	mem = alloc_pages_exact(size, GFP_KERNEL | GFP_DMA32);
> dma_zalloc_coherent(), not alloc_pages_exact().

Done.

>> +	memset(mem, 0x0, PAGE_SIZE << get_order(size));
> Then this line won't be needed.

Removed.

>> +	usbm->mem = mem;
>> +	usbm->offset = virt_to_phys(mem);
>> +	usbm->size = size;
>> +	usbm->ps = ps;
>> +	spin_lock_irqsave(&ps->lock, flags);
>> +	list_add_tail(&usbm->memlist, &ps->memory_list);
>> +	spin_unlock_irqrestore(&ps->lock, flags);
>> +
>> +	return usbm;
>> +}
> This subroutine should be merged into usbdev_mmap().

Hm. Can you elaborate a bit on why? I thought it was nice to have it as a
counterpoint to dec_use_count(), the deallocation function, and reasonably
short.

> In fact, all the memory-oriented routines should be located together
> in the source file.  It's confusing to put some of them near the start
> and others near the end.

Is this about usbdev_mmap, or about usbdev_vm_{open,close}? I looked at the
former as a syscall function and thus somehow better suited closer to e.g.
the ioctl function. find_memory_area() can't be moved too far down since it's
used from other places, unless you want me to do a forward declaration?

> Forget about the WARN_ON; you know all the callers because this patch 
> introduces them.  If you prefer, instead of a pointer pass an 
> enumeration value: USB_MEMORY_URB_COUNT or USB_MEMORY_VMA_COUNT.

Removed. (It was mostly meant as documentation to the programmer.)

> Also, you might change the name to make it a little less generic.  For 
> example, dec_usb_memory_use_count().

Done.

>> +	spin_lock_irqsave(&ps->lock, flags);
>> +	list_for_each_entry(iter, &ps->memory_list, memlist) {
>> +		if (iter->usb_use_count == 0 &&
> I don't think this is necessary.  It should be legal to keep the data
> for two URBs in the same memory region, so long as they don't overlap.

Removed. I haven't added an overlap check, but I guess this should be the
caller's problem?

>> +		    uurb_start >= iter->vm_start &&
>> +		    uurb->buffer_length <= iter->vm_start - uurb_start +
>> +				(PAGE_SIZE << get_order(iter->size))) {
> Shouldn't this be:
> 
> 			uurb->start >= iter->vm_start &&
> 			uurb->start < iter->vm_start + iter->size &&
> 			uurb->buffer_length <= iter->vm_start + iter->size -
> 					uurb->start) {

I think both will work (modulo maybe issues with the PAGE_SIZE part?),
but yours is clearer. Changed. (I assume you meant uurb_start, not
uurb->start.)

>> +	spin_lock_irqsave(&ps->lock, flags);
>> +	list_for_each_entry(usbm_iter, &ps->memory_list, memlist) {
>> +		if (usbm_iter->mem == as->urb->transfer_buffer) {
>> +			usbm = usbm_iter;
>> +			break;
>> +		}
>> +	}
>> +	spin_unlock_irqrestore(&ps->lock, flags);
> There's no need to do this.  Just store the usbm pointer in the as
> structure.

Done.

>> +	list_for_each_safe(p, q, &ps->memory_list) {
>> +		tmp = list_entry(p, struct usb_memory, memlist);
>> +		list_del(p);
>> +		if (tmp->mem)
>> +			free_pages_exact(tmp->mem, tmp->size);
>> +		kfree(tmp);
> No, you can't do this here.  The memory might still be in use by the
> VMA.  You have to do:
> 
> 	list_del(&ps->memory_list);
> 
> perhaps with a comment explaining when the usb_memory structures and
> the memory buffers get freed (i.e., when the use counters go to 0).

Done.

>> +		/* do not use SG buffers when memory mapped segments
>> +		 * are allocated
>> +		 */
>> +		if (list_empty(&ps->memory_list)) {
> No, first call find_memory_area().  If the result is NULL then do this.

Done.

>> +	as->ps = ps;
>> +
> Why move this?  Isn't it fine where it is?

Reverted. (It was part of some other cleanup that I tried, I think.)

>>  	as->userurb = arg;
>> -	if (is_in && uurb->buffer_length > 0)
>> +	if (is_in && uurb->buffer_length > 0 && usbm == NULL)
> I think you should keep this even when usbm is not NULL.  No?

No, then processcompl() will try to copy data there, which defeats the point
of zerocopy. What other use would the userbuffer member have?

>>  		as->userbuffer = uurb->buffer;
>>  	else
>>  		as->userbuffer = NULL;
> When you use dma_zalloc_coherent(), you have to tell the USB core that
> the buffer is already mapped for DMA by setting the URB_NO_TRANSFER_DMA_MAP
> flag in urb->transfer_flags and by storing the DMA address in
> urb->transfer_dma.

Done. I have one question, though: Should I offset the transfer_dma address
to match with the offset in the buffer, or is this just a handle? (I've
assumed the latter, but I'm not at all sure it's correct.)

>> +	if (usbm)
>> +		dec_use_count(usbm, &usbm->usb_use_count);
>>  	kfree(isopkt);
>>  	kfree(dr);
>>  	if (as)
> You have to change processcompl().  The copy_urb_data() call isn't
> needed when zerocopy is being used -- in fact, it rather defeats the 
> purpose of zerocopy!

See previous comment on this. :-)

>> +	if (remap_pfn_range(vma, vma->vm_start,
>> +			    virt_to_phys(usbm->mem) >> PAGE_SHIFT,
>> +			    size, vma->vm_page_prot) < 0)
>> +		return -EAGAIN;
> As Oliver pointed out, the memory needs to be reclaimed and accounted
> for on the failure pathways.

Done. (I saw you used a plural; is there more than this one?)

New patch attached. However, it's not working; I'm getting several of these
messages:

  [   28.796244] DMAR: Allocating domain for 2-2 failed

/* Steinar */
-- 
Homepage: https://www.sesse.net/
>From c470a7907df4b1ff526900400e8ba0a1f4cca529 Mon Sep 17 00:00:00 2001
From: "Steinar H. Gunderson" <sesse@xxxxxxxxxx>
Date: Thu, 26 Nov 2015 01:19:13 +0100
Subject: [PATCH] Add support for usbfs zerocopy.

This is essentially a patch by Markus Rechberger with some updates.
The original can be found at

  http://sundtek.de/support/devio_mmap_v0.4.diff

This version has the following changes:

  - Rebased against a newer kernel (with some conflicts fixed).
  - Fixed all checkpatch violations and some style issues.
  - Fixes an issue where isochronous transfers would not really be
    zero-copy, but go through a pointless memcpy from one area to
    itself.
  - Ask for cached memory instead of uncached.
  - Drop the custom ioctl; use mmap only for allocation.

Signed-off-by: Steinar H. Gunderson <sesse@xxxxxxxxxx>
Signed-off-by: Markus Rechberger <mrechberger@xxxxxxxxx>
---
 drivers/usb/core/devio.c | 202 ++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 192 insertions(+), 10 deletions(-)

diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
index 38ae877c..95c9fed 100644
--- a/drivers/usb/core/devio.c
+++ b/drivers/usb/core/devio.c
@@ -50,6 +50,7 @@
 #include <linux/user_namespace.h>
 #include <linux/scatterlist.h>
 #include <linux/uaccess.h>
+#include <linux/dma-mapping.h>
 #include <asm/byteorder.h>
 #include <linux/moduleparam.h>
 
@@ -69,6 +70,7 @@ struct usb_dev_state {
 	spinlock_t lock;            /* protects the async urb lists */
 	struct list_head async_pending;
 	struct list_head async_completed;
+	struct list_head memory_list;
 	wait_queue_head_t wait;     /* wake up if a request completed */
 	unsigned int discsignr;
 	struct pid *disc_pid;
@@ -79,6 +81,17 @@ struct usb_dev_state {
 	u32 disabled_bulk_eps;
 };
 
+struct usb_memory {
+	struct list_head memlist;
+	int vma_use_count;
+	int usb_use_count;
+	u32 size;
+	void *mem;
+	dma_addr_t dma_handle;
+	unsigned long vm_start;
+	struct usb_dev_state *ps;
+};
+
 struct async {
 	struct list_head asynclist;
 	struct usb_dev_state *ps;
@@ -89,6 +102,7 @@ struct async {
 	void __user *userbuffer;
 	void __user *userurb;
 	struct urb *urb;
+	struct usb_memory *usbm;
 	unsigned int mem_usage;
 	int status;
 	u32 secid;
@@ -157,6 +171,75 @@ static int connected(struct usb_dev_state *ps)
 			ps->dev->state != USB_STATE_NOTATTACHED);
 }
 
+static struct usb_memory *
+alloc_usb_memory(struct usb_dev_state *ps, size_t size)
+{
+	struct usb_memory *usbm;
+	void *mem;
+	unsigned long flags;
+	dma_addr_t dma_handle;
+
+	mem = dma_zalloc_coherent(&ps->dev->dev, size, &dma_handle,
+				  GFP_KERNEL | GFP_DMA32);
+	if (!mem)
+		return NULL;
+
+	usbm = kzalloc(sizeof(struct usb_memory), GFP_KERNEL);
+	if (!usbm) {
+		dma_free_coherent(&ps->dev->dev, size, mem, dma_handle);
+		return NULL;
+	}
+	usbm->mem = mem;
+	usbm->dma_handle = dma_handle;
+	usbm->size = size;
+	usbm->ps = ps;
+	spin_lock_irqsave(&ps->lock, flags);
+	list_add_tail(&usbm->memlist, &ps->memory_list);
+	spin_unlock_irqrestore(&ps->lock, flags);
+
+	return usbm;
+}
+
+static void dec_usb_memory_use_count(struct usb_memory *usbm, int *count)
+{
+	struct usb_dev_state *ps = usbm->ps;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ps->lock, flags);
+	--*count;
+	if (usbm->usb_use_count == 0 && usbm->vma_use_count == 0) {
+		list_del_init(&usbm->memlist);
+		dma_free_coherent(&ps->dev->dev, usbm->size, usbm->mem,
+				  usbm->dma_handle);
+		usbfs_decrease_memory_usage(
+			usbm->size + sizeof(struct usb_memory));
+		kfree(usbm);
+	}
+	spin_unlock_irqrestore(&ps->lock, flags);
+}
+
+static struct usb_memory *
+find_memory_area(struct usb_dev_state *ps, const struct usbdevfs_urb *uurb)
+{
+	struct usb_memory *usbm = NULL, *iter;
+	unsigned long flags;
+	unsigned long uurb_start = (unsigned long)uurb->buffer;
+
+	spin_lock_irqsave(&ps->lock, flags);
+	list_for_each_entry(iter, &ps->memory_list, memlist) {
+		if (uurb_start >= iter->vm_start &&
+		    uurb_start < iter->vm_start + iter->size &&
+		    uurb->buffer_length <= iter->vm_start + iter->size -
+			uurb_start) {
+			usbm = iter;
+			usbm->usb_use_count++;
+			break;
+		}
+	}
+	spin_unlock_irqrestore(&ps->lock, flags);
+	return usbm;
+}
+
 static loff_t usbdev_lseek(struct file *file, loff_t offset, int orig)
 {
 	loff_t ret;
@@ -288,6 +371,7 @@ static struct async *alloc_async(unsigned int numisoframes)
 
 static void free_async(struct async *as)
 {
+	struct usb_memory *usbm = NULL;
 	int i;
 
 	put_pid(as->pid);
@@ -297,8 +381,13 @@ static void free_async(struct async *as)
 		if (sg_page(&as->urb->sg[i]))
 			kfree(sg_virt(&as->urb->sg[i]));
 	}
+
 	kfree(as->urb->sg);
-	kfree(as->urb->transfer_buffer);
+	if (as->usbm == NULL)
+		kfree(as->urb->transfer_buffer);
+	else
+		dec_usb_memory_use_count(usbm, &as->usbm->usb_use_count);
+
 	kfree(as->urb->setup_packet);
 	usb_free_urb(as->urb);
 	usbfs_decrease_memory_usage(as->mem_usage);
@@ -910,6 +999,7 @@ static int usbdev_open(struct inode *inode, struct file *file)
 	INIT_LIST_HEAD(&ps->list);
 	INIT_LIST_HEAD(&ps->async_pending);
 	INIT_LIST_HEAD(&ps->async_completed);
+	INIT_LIST_HEAD(&ps->memory_list);
 	init_waitqueue_head(&ps->wait);
 	ps->discsignr = 0;
 	ps->disc_pid = get_pid(task_pid(current));
@@ -962,6 +1052,13 @@ static int usbdev_release(struct inode *inode, struct file *file)
 		free_async(as);
 		as = async_getcompleted(ps);
 	}
+
+	/* Any elements still left on this list are still in use and cannot
+	 * be deleted here, but will be freed once the counters go to 0
+	 * (see dec_usb_memory_use_count()).
+	 */
+	list_del(&ps->memory_list);
+
 	kfree(ps);
 	return 0;
 }
@@ -1291,6 +1388,7 @@ static int proc_do_submiturb(struct usb_dev_state *ps, struct usbdevfs_urb *uurb
 	struct usb_host_endpoint *ep;
 	struct async *as = NULL;
 	struct usb_ctrlrequest *dr = NULL;
+	struct usb_memory *usbm = NULL;
 	unsigned int u, totlen, isofrmlen;
 	int i, ret, is_in, num_sgs = 0, ifnum = -1;
 	int number_of_packets = 0;
@@ -1372,9 +1470,17 @@ static int proc_do_submiturb(struct usb_dev_state *ps, struct usbdevfs_urb *uurb
 			uurb->type = USBDEVFS_URB_TYPE_INTERRUPT;
 			goto interrupt_urb;
 		}
-		num_sgs = DIV_ROUND_UP(uurb->buffer_length, USB_SG_SIZE);
-		if (num_sgs == 1 || num_sgs > ps->dev->bus->sg_tablesize)
-			num_sgs = 0;
+		/* do not use SG buffers when memory mapped segments
+		 * are in use
+		 */
+		if (find_memory_area(ps, uurb) == NULL) {
+			num_sgs = DIV_ROUND_UP(uurb->buffer_length,
+					       USB_SG_SIZE);
+			if (num_sgs == 1 ||
+			    num_sgs > ps->dev->bus->sg_tablesize) {
+				num_sgs = 0;
+			}
+		}
 		if (ep->streams)
 			stream_id = uurb->stream_id;
 		break;
@@ -1445,6 +1551,7 @@ static int proc_do_submiturb(struct usb_dev_state *ps, struct usbdevfs_urb *uurb
 	if (ret)
 		goto error;
 	as->mem_usage = u;
+	as->usbm = NULL;
 
 	if (num_sgs) {
 		as->urb->sg = kmalloc(num_sgs * sizeof(struct scatterlist),
@@ -1476,21 +1583,33 @@ static int proc_do_submiturb(struct usb_dev_state *ps, struct usbdevfs_urb *uurb
 			totlen -= u;
 		}
 	} else if (uurb->buffer_length > 0) {
-		as->urb->transfer_buffer = kmalloc(uurb->buffer_length,
-				GFP_KERNEL);
+		if (!list_empty(&ps->memory_list)) {
+			usbm = find_memory_area(ps, uurb);
+			if (usbm) {
+				as->usbm = usbm;
+				as->urb->transfer_buffer = usbm->mem;
+			} else {
+				ret = -ENOMEM;
+				goto error;
+			}
+		} else {
+			as->urb->transfer_buffer = kmalloc(uurb->buffer_length,
+							   GFP_KERNEL);
+		}
 		if (!as->urb->transfer_buffer) {
 			ret = -ENOMEM;
 			goto error;
 		}
 
-		if (!is_in) {
+		if (!is_in && usbm == NULL) {
 			if (copy_from_user(as->urb->transfer_buffer,
 					   uurb->buffer,
 					   uurb->buffer_length)) {
 				ret = -EFAULT;
 				goto error;
 			}
-		} else if (uurb->type == USBDEVFS_URB_TYPE_ISO) {
+		} else if (uurb->type == USBDEVFS_URB_TYPE_ISO &&
+			   usbm == NULL) {
 			/*
 			 * Isochronous input data may end up being
 			 * discontiguous if some of the packets are short.
@@ -1545,10 +1664,15 @@ static int proc_do_submiturb(struct usb_dev_state *ps, struct usbdevfs_urb *uurb
 	isopkt = NULL;
 	as->ps = ps;
 	as->userurb = arg;
-	if (is_in && uurb->buffer_length > 0)
+	if (is_in && uurb->buffer_length > 0 && usbm == NULL) {
 		as->userbuffer = uurb->buffer;
-	else
+	} else {
 		as->userbuffer = NULL;
+	}
+	if (usbm != NULL) {
+		as->urb->transfer_flags |= URB_NO_TRANSFER_DMA_MAP;
+		as->urb->transfer_dma = usbm->dma_handle;
+	}
 	as->signr = uurb->signr;
 	as->ifnum = ifnum;
 	as->pid = get_pid(task_pid(current));
@@ -1604,6 +1728,8 @@ static int proc_do_submiturb(struct usb_dev_state *ps, struct usbdevfs_urb *uurb
 	return 0;
 
  error:
+	if (usbm)
+		dec_usb_memory_use_count(usbm, &usbm->usb_use_count);
 	kfree(isopkt);
 	kfree(dr);
 	if (as)
@@ -2337,6 +2463,61 @@ static long usbdev_ioctl(struct file *file, unsigned int cmd,
 	return ret;
 }
 
+static void usbdev_vm_open(struct vm_area_struct *vma)
+{
+	struct usb_memory *usbm = vma->vm_private_data;
+	unsigned long flags;
+
+	spin_lock_irqsave(&usbm->ps->lock, flags);
+	++usbm->vma_use_count;
+	spin_unlock_irqrestore(&usbm->ps->lock, flags);
+}
+
+static void usbdev_vm_close(struct vm_area_struct *vma)
+{
+	struct usb_memory *usbm = vma->vm_private_data;
+
+	dec_usb_memory_use_count(usbm, &usbm->vma_use_count);
+}
+
+struct vm_operations_struct usbdev_vm_ops = {
+	.open = usbdev_vm_open,
+	.close = usbdev_vm_close
+};
+
+static int usbdev_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct usb_memory *usbm = NULL;
+	struct usb_dev_state *ps = file->private_data;
+	size_t size = vma->vm_end - vma->vm_start;
+	int ret;
+
+	ret = usbfs_increase_memory_usage(size + sizeof(struct usb_memory));
+	if (ret)
+		return ret;
+
+	usbm = alloc_usb_memory(ps, size);
+	if (!usbm)
+		return -ENOMEM;
+
+	if (remap_pfn_range(vma, vma->vm_start,
+			    virt_to_phys(usbm->mem) >> PAGE_SHIFT,
+			    size, vma->vm_page_prot) < 0) {
+		int dummy_count = 1;
+
+		dec_usb_memory_use_count(usbm, &dummy_count);
+		return -EAGAIN;
+	}
+
+	usbm->vm_start = vma->vm_start;
+	vma->vm_flags |= VM_IO;
+	vma->vm_flags |= (VM_DONTEXPAND | VM_DONTDUMP);
+	vma->vm_ops = &usbdev_vm_ops;
+	vma->vm_private_data = usbm;
+	usbdev_vm_open(vma);
+	return 0;
+}
+
 #ifdef CONFIG_COMPAT
 static long usbdev_compat_ioctl(struct file *file, unsigned int cmd,
 			unsigned long arg)
@@ -2373,6 +2554,7 @@ const struct file_operations usbdev_file_operations = {
 #ifdef CONFIG_COMPAT
 	.compat_ioctl =   usbdev_compat_ioctl,
 #endif
+	.mmap =           usbdev_mmap,
 	.open =		  usbdev_open,
 	.release =	  usbdev_release,
 };
-- 
2.1.4


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux