Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> writes:
> On Tue, Oct 04, 2016 at 03:23:12PM +0300, Felipe Balbi wrote:
>> 
>> Hi,
>> 
>> Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> writes:
>> > On Mon, Oct 03, 2016 at 07:08:43PM +0300, Felipe Balbi wrote:
>> >> 
>> >> Hi,
>> >> 
>> >> ville.syrjala@xxxxxxxxxxxxxxx writes:
>> >> > From: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
>> >> >
>> >> > This reverts commit 55a0237f8f47957163125e20ee9260538c5c341c.
>> >> >
>> >> > commit  55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for
>> >> > LST bit") causes my BYT FFRD8 with g_ether to behave poorly. ssh/scp
>> >> > is very sluggish and can even stall entirely. Revert cures it.
>> >> >
>> >> > Cc: Felipe Balbi <felipe.balbi@xxxxxxxxxxxxxxx>
>> >> > Cc: stable@xxxxxxxxxxxxxxx
>> >> > Signed-off-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
>> >> 
>> >> Reverting that commit is not the best idea. Let's start with the usual:
>> >> 
>> >> - Kernel version
>> >> - dmesg on both sides (host and device)
>> >> - dwc3 tracepoints:
>> >> 
>> >> # mkdir -p /t
>> >> # mount -t tracefs none /t
>> >> # echo 8192 > /t/buffer_size_kb
>> >> # echo 1 > /t/events/dwc3/enable
>> >> # echo 0 > /t/events/dwc3/dwc3_readl/enable
>> >> # echo 0 > /t/events/dwc3/dwc3_writel/enable
>> >> 
>> >> This should be enough to tell me what's really going on.
>> >
>> > I've attached the (compressed due to size) dmesgs/traces from the device.
>> > For my test I just did 'for i in `seq 1 5` ; dmesg ; done' in an ssh
>> > session, and I did get the lag in the bad run, so presumably the trace
>> > should have caught it. Both were on 4.8 (+ the revert for the good run).
>> >
>> > Host side dmesg just showes me:
>> >  usb 3-3.4: new high-speed USB device number 25 using xhci_hcd
>> >  cdc_ether 3-3.4:1.0 usb0: register 'cdc_ether' at usb-0000:00:14.0-3.4, CDC Ethernet Device, ...
>> > when I boot the device. Nothing extra appears when things lag around.
>> > I'm running 4.5.something on the host.
>> 
>> Looking at your logs what I can see is that "revert.trace" doesn't
>> respect gadget driver's IRQ throttling, which means you get an interrupt
>> for each and every request, while "bad.trace" shows dwc3 respecting
>> "no_interrupt" flag passed in by g_ether.
>> 
>> The only explanation I have for this behavior when we start respecting
>> gadget driver's flags, is that there's a possible race in u_ether.c
>> where it can go for a long time without an interrupt due to that
>> throttling.
>> 
>> This only happens with IN endpoints (the ones which are throttled) and
>> the behavior can be seen in log snippet below:
>> 
>>   27782:     irq/17-dwc3-2524  [000] d...   666.891969: tx_complete: req ffff88016da1a6c0 length 1514/1514 flags ZsI tx_qlen 4 qmult 5
>>   27802:     ksoftirqd/0-3     [000] .Ns1   666.892047: eth_start_xmit: req ffff88016da1a6c0 length 0/1514 flags Zsi tx_qlen 5 qmult 5
>>   27808:     ksoftirqd/0-3     [000] .Ns2   666.892062: eth_start_xmit: req ffff88016da1a300 length 0/1514 flags ZsI tx_qlen 6 qmult 5
>>   27814:     ksoftirqd/0-3     [000] .Ns2   666.892069: eth_start_xmit: req ffff88016da1ab40 length 0/1514 flags Zsi tx_qlen 7 qmult 5
>>   27820:     ksoftirqd/0-3     [000] .Ns2   666.892075: eth_start_xmit: req ffff88016da1acc0 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>>   27826:     ksoftirqd/0-3     [000] .Ns2   666.892082: eth_start_xmit: req ffff88016da1ae40 length 0/1514 flags Zsi tx_qlen 9 qmult 5
>>   27832:     ksoftirqd/0-3     [000] .Ns2   666.892088: eth_start_xmit: req ffff88016da1a780 length 0/1514 flags Zsi tx_qlen 10 qmult 5
>>   27884:     irq/17-dwc3-2524  [000] d...   666.892213: tx_complete: req ffff88016da1a840 length 1514/1514 flags Zsi tx_qlen 9 qmult 5
>>   27889:     irq/17-dwc3-2524  [000] d...   666.892215: tx_complete: req ffff88016da1a600 length 1514/1514 flags Zsi tx_qlen 8 qmult 5
>>   27897:            sshd-2684  [002] ..s1   666.892223: eth_start_xmit: req ffff88016da1a840 length 0/1514 flags Zsi tx_qlen 9 qmult 5
>>   27898:     irq/17-dwc3-2524  [000] d...   666.892223: tx_complete: req ffff88016da1a180 length 1514/1514 flags Zsi tx_qlen 8 qmult 5
>>   27903:     irq/17-dwc3-2524  [000] d...   666.892225: tx_complete: req ffff88016da1a240 length 1514/1514 flags Zsi tx_qlen 7 qmult 5
>>   27910:            sshd-2684  [002] ..s1   666.892230: eth_start_xmit: req ffff88016da1a180 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>>   27913:     irq/17-dwc3-2524  [000] d...   666.892231: tx_complete: req ffff88016da1a6c0 length 1514/1514 flags Zsi tx_qlen 7 qmult 5
>>   27920:            sshd-2684  [002] ..s1   666.892236: eth_start_xmit: req ffff88016da1a6c0 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>>   27923:     irq/17-dwc3-2524  [000] d...   666.892238: tx_complete: req ffff88016da1a300 length 1514/1514 flags ZsI tx_qlen 7 qmult 5
>>   27928:            sshd-2684  [002] ..s1   666.892242: eth_start_xmit: req ffff88016da1a240 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>>   27944:            sshd-2684  [002] ..s1   666.892253: eth_start_xmit: req ffff88016da1a300 length 0/1514 flags Zsi tx_qlen 9 qmult 5
>>   27953:            sshd-2684  [002] ..s1   666.892262: eth_start_xmit: req ffff88016da1a600 length 0/1514 flags Zsi tx_qlen 10 qmult 5
>> 
>> Note here that we reached tx_qlen = 10 without any interrupts being requested.
>> 
>> I patched u_ether with below to capture log above:
>> 
>> diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c
>> index 5f562c1ec795..e34c26ffcdf4 100644
>> --- a/drivers/usb/gadget/function/u_ether.c
>> +++ b/drivers/usb/gadget/function/u_ether.c
>> @@ -465,6 +465,15 @@ static void tx_complete(struct usb_ep *ep, struct usb_request *req)
>>         dev_kfree_skb_any(skb);
>>  
>>         atomic_dec(&dev->tx_qlen);
>> +
>> +       trace_printk("req %p length %d/%d flags %c%c%c tx_qlen %d qmult %d\n",
>> +                       req, req->actual, req->length,
>> +                       req->zero ? 'Z' : 'z',
>> +                       req->short_not_ok ? 'S' : 's',
>> +                       req->no_interrupt ? 'i' : 'I',
>> +                       atomic_read(&dev->tx_qlen),
>> +                       dev->qmult);
>> +
>>         if (netif_carrier_ok(dev->net))
>>                 netif_wake_queue(dev->net);
>>  }
>> @@ -600,6 +609,12 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,
>>         case 0:
>>                 netif_trans_update(net);
>>                 atomic_inc(&dev->tx_qlen);
>> +               trace_printk("req %p length %d/%d flags %c%c%c tx_qlen %d qmult %d\n",
>> +                               req, req->actual, req->length, req->zero ? 'Z' : 'z',
>> +                               req->short_not_ok ? 'S' : 's',
>> +                               req->no_interrupt ? 'i' : 'I',
>> +                               atomic_read(&dev->tx_qlen),
>> +                               dev->qmult);
>>         }
>>  
>>         if (retval) {
>> 
>> The only way I can think of to avoid this problem is to make sure we
>> don't starve our request queue with something like:
>> 
>> diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c
>> index 5f562c1ec795..ddfa866c2cba 100644
>> --- a/drivers/usb/gadget/function/u_ether.c
>> +++ b/drivers/usb/gadget/function/u_ether.c
>> @@ -587,8 +596,9 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,
>>  
>>         /* throttle high/super speed IRQ rate back slightly */
>>         if (gadget_is_dualspeed(dev->gadget))
>> -               req->no_interrupt = (dev->gadget->speed == USB_SPEED_HIGH ||
>> -                                    dev->gadget->speed == USB_SPEED_SUPER)
>> +               req->no_interrupt = (((dev->gadget->speed == USB_SPEED_HIGH ||
>> +                                      dev->gadget->speed == USB_SPEED_SUPER)) &&
>> +                                       !list_empty(&dev->tx_reqs))
>>                         ? ((atomic_read(&dev->tx_qlen) % dev->qmult) != 0)
>>                         : 0;
>>  
>> I have been testing this for about half an hour and I can't see delays
>> or link getting stuck. Care to test ? If it works for your setup, I'll
>> send this with Cc: stable and your Reported-by and Tested-by tags.
>
> That doesn't appear to cure the problem unfortunately. I've attached another
> trace, with your u_ether trace_printks included.

Now I only seem to find delays on ep1out. Here are a few instances:

     irq/22-dwc3-803   [001] d..1    19.446979: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
     irq/22-dwc3-803   [001] d..1    20.452232: dwc3_event: event (00006084): ep1out: Transfer In-Progress

     irq/22-dwc3-803   [001] d..1    20.761702: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
     irq/22-dwc3-803   [001] d..1    21.044345: dwc3_event: event (00006084): ep1out: Transfer In-Progress

     irq/22-dwc3-803   [001] d..1    21.999383: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
[...]
     irq/22-dwc3-803   [001] d..1    22.239576: dwc3_event: event (00006084): ep1out: Transfer In-Progress

     irq/22-dwc3-803   [001] d..1    22.979703: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
[...]
     irq/22-dwc3-803   [001] d..1    23.125096: dwc3_event: event (00006084): ep1out: Transfer In-Progress

     irq/22-dwc3-803   [001] d..1    23.820697: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
     irq/22-dwc3-803   [001] d..1    24.461117: dwc3_event: event (00006084): ep1out: Transfer In-Progress

     irq/22-dwc3-803   [001] d..1    24.882567: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
     irq/22-dwc3-803   [001] d..1    25.693330: dwc3_event: event (00006084): ep1out: Transfer In-Progress

     irq/22-dwc3-803   [001] d..1    25.702107: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
     irq/22-dwc3-803   [001] d..1    26.045495: dwc3_event: event (00006084): ep1out: Transfer In-Progress

Based on this, it would seem that host side isn't sending the data; but
that's quite odd. Let's put a sniffer here tomorrow and look at this
more carefully, I couldn't reproduce this with SKL as I was testing this
today.

-- 
balbi

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]