Re: [RFT PATCH 1/4] usb: dwc2: Simplify and fix DMA alignment code

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/27/20 2:06 PM, Doug Anderson wrote:
> Hi,
> 
> On Wed, Feb 26, 2020 at 1:04 PM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
>>
>> The code to align buffers for DMA was first introduced with commit
>> 3bc04e28a030 ("usb: dwc2: host: Get aligned DMA in a more supported way").
>> It was updated with commit 56406e017a88 ("usb: dwc2: Fix DMA alignment
>> to start at allocated boundary") because it did not really align buffers to
>> DMA boundaries but to word offsets. This was then optimized in commit
>> 1e111e885238 ("usb: dwc2: Fix inefficient copy of unaligned buffers")
>> to only copy actual data rather than the whole buffer. Commit 4a4863bf2e79
>> ("usb: dwc2: Fix DMA cache alignment issues") changed this further to add
>> a padding at the end of the buffer to ensure that the old data pointer is
>> not in the same cache line as the buffer.
>>
>> This last commit states "Otherwise, the stored_xfer_buffer gets corrupted
>> for IN URBs on non-cache-coherent systems". However, such corruptions are
>> still observed. This suggests that the commit may have been hiding a
>> problem rather than fixing it. Further analysis shows that this is indeed
>> the case: The code in dwc2_hc_start_transfer() assumes that the transfer
>> size is a multiple of wMaxPacketSize, and rounds up the transfer size
>> communicated to the chip accordingly. Added debug code confirms that
>> the chip does under some circumstances indeed send more data than requested
>> in the urb receive buffer size.
>>
>> On top of that, it turns out that buffers are still not guaranteed to be
>> aligned to dma_get_cache_alignment(), but to DWC2_USB_DMA_ALIGN (4).
>> Further debugging shows that packets aligned to DWC2_USB_DMA_ALIGN
>> but not to dma_get_cache_alignment() are indeed common and work just fine.
>> This suggests that commit 56406e017a88 was not really necessary because
>> even without it packets were already aligned to DWC2_USB_DMA_ALIGN.
>>
>> To simplify the code, move the old data pointer back to the beginning of
>> the new buffer, restoring most of the original commit. Stop aligning the
>> buffer to dma_get_cache_alignment() since it isn't needed and only makes
>> the code more complex. Instead, ensure that the allocated buffer is a
>> multiple of wMaxPacketSize to ensure that the chip does not write beyond
>> the end of the buffer.
> 
> I do like the cleanliness of being able to easily identify the old
> buffer (AKA by putting it first) and I agree that the existing code
> was only really guaranteeing 4-byte alignment and if we truly needed
> more alignment then we'd be allocating a lot more bounce buffers
> (which is pretty expensive).
> 
> ...but the argument in commit 56406e017a88 ("usb: dwc2: Fix DMA
> alignment to start at allocated boundary") is still a compelling one.

Sure, but, as mentioned above, it wasn't followed anyway.

> Maybe at least put a comment in the code next to the "#define
> DWC2_USB_DMA_ALIGN" saying that we think that this is enough alignment
> for anyone using dwc2's built-in DMA logic?
> 
Makes sense. I'll add a comment.

> 
>> Cc: Douglas Anderson <dianders@xxxxxxxxxxxx>
>> Cc: Boris Arzur <boris@xxxxxxxxx>
>> Fixes: 56406e017a88 ("usb: dwc2: Fix DMA alignment to start at allocated boundary")
>> Signed-off-by: Guenter Roeck <linux@xxxxxxxxxxxx>
>> ---
>>  drivers/usb/dwc2/hcd.c | 67 ++++++++++++++++++++++--------------------
>>  1 file changed, 35 insertions(+), 32 deletions(-)
> 
> Sorry for such a mess and thank you for all the work tracking down and
> documenting all the problems.  Clearly deep understanding of DMA is
> not something I can claim.  :(
> 
> A few points of order first:
> * Although get_maintainer doesn't identify him, it has seemed like
> Felipe Balbi lands most of the dwc2 things.  Probably a good idea to
> CC him.
> * I have historically found Stefan Wahren interested in dwc2 fixes and
> willing to test them on Raspberry Pi w/ various peripherals.
> * Since one of the commits you refer to was written by Martin Schiller
> it probably wouldn't hurt to CC him.
> 
I'll do that.

> 
>> diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
>> index b90f858af960..f6d8cc9cee34 100644
>> --- a/drivers/usb/dwc2/hcd.c
>> +++ b/drivers/usb/dwc2/hcd.c
>> @@ -2471,19 +2471,21 @@ static int dwc2_alloc_split_dma_aligned_buf(struct dwc2_hsotg *hsotg,
>>
>>  #define DWC2_USB_DMA_ALIGN 4
>>
>> +struct dma_aligned_buffer {
>> +       void *old_xfer_buffer;
>> +       u8 data[0];
>> +};
>> +
>>  static void dwc2_free_dma_aligned_buffer(struct urb *urb)
>>  {
>> -       void *stored_xfer_buffer;
>> +       struct dma_aligned_buffer *dma;
>>         size_t length;
>>
>>         if (!(urb->transfer_flags & URB_ALIGNED_TEMP_BUFFER))
>>                 return;
>>
>> -       /* Restore urb->transfer_buffer from the end of the allocated area */
>> -       memcpy(&stored_xfer_buffer,
>> -              PTR_ALIGN(urb->transfer_buffer + urb->transfer_buffer_length,
>> -                        dma_get_cache_alignment()),
>> -              sizeof(urb->transfer_buffer));
>> +       dma = container_of(urb->transfer_buffer,
>> +                          struct dma_aligned_buffer, data);
>>
>>         if (usb_urb_dir_in(urb)) {
>>                 if (usb_pipeisoc(urb->pipe))
>> @@ -2491,49 +2493,50 @@ static void dwc2_free_dma_aligned_buffer(struct urb *urb)
>>                 else
>>                         length = urb->actual_length;
>>
>> -               memcpy(stored_xfer_buffer, urb->transfer_buffer, length);
>> +               memcpy(dma->old_xfer_buffer, dma->data, length);
>>         }
>> -       kfree(urb->transfer_buffer);
>> -       urb->transfer_buffer = stored_xfer_buffer;
>> +       urb->transfer_buffer = dma->old_xfer_buffer;
>> +       kfree(dma);
>>
>>         urb->transfer_flags &= ~URB_ALIGNED_TEMP_BUFFER;
>>  }
>>
>>  static int dwc2_alloc_dma_aligned_buffer(struct urb *urb, gfp_t mem_flags)
>>  {
>> -       void *kmalloc_ptr;
>> +       struct dma_aligned_buffer *dma;
>>         size_t kmalloc_size;
>>
>> -       if (urb->num_sgs || urb->sg ||
>> -           urb->transfer_buffer_length == 0 ||
>> +       if (urb->num_sgs || urb->sg || urb->transfer_buffer_length == 0 ||
>> +           (urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP) ||
>>             !((uintptr_t)urb->transfer_buffer & (DWC2_USB_DMA_ALIGN - 1)))
> 
> Maybe I'm misunderstanding things, but it feels like we need something
> more here.  Specifically I'm worried about the fact when the transfer
> buffer is already aligned but the length is not a multiple of the
> endpoint's maximum transfer size.  You need to handle that, right?
> AKA something like this (untested):
> 
> /* Simple case of not having to allocate a bounce buffer */
> if (urb->num_sgs || urb->sg || urb->transfer_buffer_length == 0 ||
>     (urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP))
>   return 0;
> 
> /* Can also avoid bounce buffer if alignment and size are good */
> maxp = usb_endpoint_maxp(&ep->desc);
> if (maxp == urb->transfer_buffer_length &&

No, transfer_buffer_length would have to be a multiple of maxp. There
are many situations where roundup(transfer_buffer_length, maxp) !=
transfer_buffer_length. I agree, this would be the prudent approach
(and it was my original implementation), but then it didn't seem to
cause trouble so far, and I was hesitant to add it in because it results
in creating temporary buffers for almost every receive operation.
I'd like to get some test feedback from Boris - if the current code
causes crashes with his use case, we'll know that it is needed.
Otherwise, we'll have to decide if the current approach (with fewer
copies) is worth the risk, or if we want to play save and always
copy if roundup(transfer_buffer_length, maxp) != transfer_buffer_length.

Thanks,
Guenter

> 
>     !((uintptr_t)urb->transfer_buffer & (DWC2_USB_DMA_ALIGN - 1)))
>   return 0;
> >
> Other than that this looks good to me.
> 
> -Doug
> 




[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux