Re: [PATCH] USB: usbtmc: Fix RCU stall warning

From: Guido Kiener <Guido.Kiener@xxxxxxxxxxxxxxxxx>
Sent: Friday, 23 July 2021 01:33
To: Greg KH; dave penkler
Cc: Zhang, Qiang; Alan Stern; Dmitry Vyukov; paulmck@xxxxxxxxxx; USB
Subject: Re: [PATCH] USB: usbtmc: Fix RCU stall warning

> From: Greg KH
> Sent: Wednesday, July 21, 2021 11:48 AM
> Subject: *EXT* Re: [PATCH] USB: usbtmc: Fix RCU stall warning
>
> On Wed, Jul 21, 2021 at 11:44:23AM +0200, dave penkler wrote:
> > On Wed, 21 Jul 2021 at 09:52, Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Wed, Jul 21, 2021 at 09:41:15AM +0200, dave penkler wrote:
> > > > > On Wed, 21 Jul 2021 at 09:08, Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > > On Tue, Jun 29, 2021 at 11:32:36AM +0800, qiang.zhang@xxxxxxxxxxxxx wrote:
> > > > > > From: Zqiang <qiang.zhang@xxxxxxxxxxxxx>
> > > > >
> > > > > I need a "full" name here, and in the signed-off-by line please.
> > > > >
> > > > > >
> > > > > > rcu: INFO: rcu_preempt self-detected stall on CPU
> > > > > > rcu:    1-...!: (2 ticks this GP) idle=d92/1/0x4000000000000000
> > > > > >         softirq=25390/25392 fqs=3
> > > > > >         (t=12164 jiffies g=31645 q=43226)
> > > > > > rcu: rcu_preempt kthread starved for 12162 jiffies! g31645 f0x0
> > > > > >      RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> > > > > > rcu:    Unless rcu_preempt kthread gets sufficient CPU time,
> > > > > >         OOM is now expected behavior.
> > > > > > rcu: RCU grace-period kthread stack dump:
> > > > > > task:rcu_preempt     state:R  running task
> > > > > >
> > > > > > > When the system uses dummy_hcd as the USB controller and a
> > > > > > > usbtmc device is disconnected, usbtmc_interrupt() resubmits
> > > > > > > the urb if its status is unknown. The urb may then be inserted
> > > > > > > into dum_hcd->urbp_list again, which keeps dummy_timer() from
> > > > > > > exiting for a long time. Because dummy_timer() is called in
> > > > > > > softirq context with local_bh disabled, this not only makes
> > > > > > > the RCU read-side critical section take too long but also
> > > > > > > keeps tasks on the current CPU's runqueue from running in
> > > > > > > time, which triggers the RCU stall.
> > > > > > >
> > > > > > > Return directly when the urb status is not zero to fix it.
> > > > > >
> > > > > > > Reported-by: syzbot+e2eae5639e7203360018@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > > > > Signed-off-by: Zqiang <qiang.zhang@xxxxxxxxxxxxx>
> > > > >
> > > > > What commit does this fix?  Does it need to go to stable kernels?
> > > > >
> > > > > What about the usbtmc maintainers, what do they think about this?
> > > >
> > > > This patch makes the babbling endpoint retry/recovery code in the
> > > > real world usb host controller drivers redundant and would prevent
> > > > usbtmc applications from benefiting from it.
> > >
> > > I do not understand, is this change ok or not?
> > >
> > > Why do usbtmc applications need to know if babbling happens or not?
> > >
> > > confused,
> > Sorry, the issue this patch is trying to fix occurs because the
> > current usbtmc driver resubmits the URB when it gets an EPROTO return.
> > The dummy usb host controller driver used in the syzbot tests keeps
> > returning the resubmitted URB with EPROTO causing a loop that starves
> > RCU. With an actual HCI driver it either recovers or returns an EPIPE.
> > In either case no loop occurs. So for my part as a user and maintainer
> > this patch is not ok.
>
> Thanks for the review.
>
> Zqiang, can you fix this patch up based on this please?
>
> thanks,
>
> greg k-h

>Qiang,
>
>After discussions with Alan and Dave we think that fixing the usbtmc driver is the best approach to fix the RCU stall warning.
>Your first proposal was almost ok, but I think we should use dev_dbg() instead of dev_err() to avoid printing the EPROTO errors. See below:
>
>Please feel free to add the following text to your patch description.
>
>-Guido
>
>
>The function usbtmc_interrupt() resubmits urbs when the error status
>of an urb is -EPROTO. In systems using the dummy_hcd usb controller
>this can result in endless interrupt loops when the usbtmc device is
>disconnected from the host system.
>
>Since host controller drivers already try to recover from transmission
>errors, there is no need to resubmit the urb or try other solutions
>to repair the error situation.
>
>In case of errors the INT pipe simply stops waiting for further packets.
>
>Reviewed-by: Guido Kiener <guido.kiener@xxxxxxxxxxxxxxxxx>
>
>diff --git a/drivers/usb/class/usbtmc.c b/drivers/usb/class/usbtmc.c
>index 74d5a9c5238a..73f419adce61 100644
>--- a/drivers/usb/class/usbtmc.c
>+++ b/drivers/usb/class/usbtmc.c
>@@ -2324,17 +2324,10 @@ static void usbtmc_interrupt(struct urb *urb)
>                dev_err(dev, "overflow with length %d, actual length is %d\n",
>                        data->iin_wMaxPacketSize, urb->actual_length);
>                fallthrough;
>-       case -ECONNRESET:
>-       case -ENOENT:
>-       case -ESHUTDOWN:
>-       case -EILSEQ:
>-       case -ETIME:
>-       case -EPIPE:
>+       default:
>                /* urb terminated, clean up */
>                dev_dbg(dev, "urb terminated, status: %d\n", status);
>                return;
>-       default:
>-               dev_err(dev, "unknown status received: %d\n", status);
>        }
> exit:
>        rv = usb_submit_urb(urb, GFP_ATOMIC);
>

Thanks

I will resend v2

Qiang
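
For readers following the thread, the effect of the proposed diff can be sketched in plain userspace C. This is only an illustrative model, not driver code: the function names old_policy_resubmits(), new_policy_resubmits() and count_completions() are hypothetical, and the sketch assumes that status 0 is the only path that still reaches usb_submit_urb() after the change (the diff shows the error cases only). The "host controller" here is a loop that keeps completing the urb with the same status, which is roughly the behaviour described above for dummy_hcd.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Model of the switch before the diff: the listed statuses terminate the
 * urb; anything else, including -EPROTO, hits the old "default:" and
 * falls out of the switch to the resubmit path.
 */
bool old_policy_resubmits(int status)
{
	switch (status) {
	case -EOVERFLOW:
	case -ECONNRESET:
	case -ENOENT:
	case -ESHUTDOWN:
	case -EILSEQ:
	case -ETIME:
	case -EPIPE:
		return false;	/* urb terminated, clean up */
	default:
		return true;	/* "unknown status", resubmit */
	}
}

/*
 * Model of the switch after the diff: every error status takes the new
 * "default:" and returns. Assumption: only status 0 is resubmitted.
 */
bool new_policy_resubmits(int status)
{
	return status == 0;
}

/*
 * Hypothetical stand-in for a host controller that completes every
 * submitted urb with the same status. Returns how many completions run
 * before the handler stops resubmitting, bounded by cap.
 */
unsigned int count_completions(bool (*resubmits)(int), int status,
			       unsigned int cap)
{
	unsigned int n = 0;

	while (n < cap) {
		n++;			/* one completion callback */
		if (!resubmits(status))
			break;		/* handler returned without resubmitting */
	}
	return n;
}
```

With -EPROTO as the status, the old policy runs until the cap is reached, while the new policy stops after the first completion. A real host controller would not return the same status forever, so the cap only stands in for the endless loop reported against dummy_hcd.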


