> -----Original Message-----
> From: Thinh Nguyen [mailto:Thinh.Nguyen@xxxxxxxxxxxx]
> Sent: Friday, March 11, 2022 10:57 AM
> To: 정재훈; Thinh Nguyen; 'Felipe Balbi'; 'Greg Kroah-Hartman'
> Cc: 'open list:USB XHCI DRIVER'; 'open list'; 'Seungchull Suh'; 'Daehwan
> Jung'; cpgs@xxxxxxxxxxx; cpgsproxy5@xxxxxxxxxxx
> Subject: Re: [PATCH] usb: dwc3: Add dwc3 lock for blocking interrupt
> storming
>
> 정재훈 wrote:
> > Hi.
> >
> >> -----Original Message-----
> >> From: Thinh Nguyen [mailto:Thinh.Nguyen@xxxxxxxxxxxx]
> >> Sent: Thursday, March 10, 2022 11:14 AM
> >> To: JaeHun Jung; Felipe Balbi; Greg Kroah-Hartman
> >> Cc: open list:USB XHCI DRIVER; open list; Seungchull Suh; Daehwan
> >> Jung
> >> Subject: Re: [PATCH] usb: dwc3: Add dwc3 lock for blocking interrupt
> >> storming
> >>
> >> Hi,
> >>
> >> JaeHun Jung wrote:
> >>> Interrupt Storming occurred with a very low probability of occurrence.
> >>> The occurrence of the problem is estimated to be caused by a race
> >>> condition between the top half and bottom half of the interrupt
> >>> service routine.
> >>> It was confirmed that variables have values that cannot be held when
> >>> ISR occurs through normal H / W irq.
> >>> =====================================================================
> >>> (struct dwc3_event_buffer *) ev_buf = 0xFFFFFF88DE6A0380 (
> >>>   (void *) buf = 0xFFFFFFC01594E000,
> >>>   (void *) cache = 0xFFFFFF88DDC14080,
> >>>   (unsigned int) length = 4096,
> >>>   (unsigned int) lpos = 0,
> >>>   (unsigned int) count = 0, <<
> >>>   (unsigned int) flags = 1, <<
> >>> =====================================================================
> >>> "evt->count=0" and "evt->flags=DWC3_EVENT_PENDING" cannot be set
> >>> at the same time.
> >>>
> >>> We estimate that a race condition occurred between dwc3_interrupt()
> >>> and dwc3_process_event_buf() called by
> >>> dwc3_gadget_process_pending_events().
> >>> So I try to block the race condition through spin_lock.
> >>
> >> This looks like it needs a memory barrier. Would this work for you?
> > Maybe it could be. But "evt->count = 0;" is updated on
> > dwc3_process_event_buf().
> > So, I think spin_lock is more clear routine for this issue.
> >
>
> Not really. If problem is due to the evt->flags not updated in time, then
> the solution should be using the memory barrier. The spin_lock would
> obfuscate the issue. And we should avoid using spin_lock in the top-half.

This issue was caught by the watchdog. The interrupt fires at intervals of
4 to 5 us and cannot be released until the bottom half is executed.
If this were a memory barrier problem, the value would be updated after a
few clock cycles and the top half would then run normally, wouldn't it?
Also, could you explain why we should avoid using spin_lock in the top half?

>
> BR,
> Thinh
>
> >>
> >> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> >> index c02e239978e0..a96c344b9f17 100644
> >> --- a/drivers/usb/dwc3/gadget.c
> >> +++ b/drivers/usb/dwc3/gadget.c
> >> @@ -5340,6 +5340,9 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
> >>                 return IRQ_HANDLED;
> >>         }
> >>
> >> +       /* Make sure the event flags is updated */
> >> +       wmb();
> >> +
> >>         /*
> >>          * With PCIe legacy interrupt, test shows that top-half irq handler can
> >>          * be called again after HW interrupt deassertion. Check if bottom-half
> >>
> >>
> >> Thanks,
> >> Thinh
> >
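
For readers following the barrier-versus-lock discussion, here is a rough
user-space sketch of the kind of count/flags handshake being debated. It is
not the dwc3 code: struct evt_model, top_half(), bottom_half(), EVT_PENDING
and the IRQ_RES_* values are all hypothetical names, and C11 release/acquire
atomics stand in for the kernel barrier primitives. The assumed invariant is
the one stated in the patch description, namely that a pending flag should
never be visible together with a zero count.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins; none of these names come from the driver. */
#define EVT_PENDING 1u

enum irq_result { IRQ_RES_NONE, IRQ_RES_HANDLED, IRQ_RES_WAKE_THREAD };

struct evt_model {
    unsigned int count;  /* event bytes handed to the bottom half */
    atomic_uint flags;   /* EVT_PENDING while the bottom half owns them */
};

/* Top half: hand new events to the bottom half, or back off. */
static enum irq_result top_half(struct evt_model *evt, unsigned int hw_count)
{
    /* Acquire pairs with the release in bottom_half(): once the flag
     * reads as clear, the count reset before it is visible too. */
    if (atomic_load_explicit(&evt->flags, memory_order_acquire) & EVT_PENDING)
        return IRQ_RES_HANDLED;

    if (hw_count == 0)
        return IRQ_RES_NONE;

    evt->count = hw_count;
    /* Release publishes count before the pending flag becomes visible. */
    atomic_fetch_or_explicit(&evt->flags, EVT_PENDING, memory_order_release);
    return IRQ_RES_WAKE_THREAD;
}

/* Bottom half: consume the events, then give the buffer back. */
static enum irq_result bottom_half(struct evt_model *evt)
{
    if (!(atomic_load_explicit(&evt->flags, memory_order_acquire) & EVT_PENDING))
        return IRQ_RES_NONE;

    /* Assumed invariant: pending implies a non-zero count. */
    assert(evt->count != 0);

    /* ... event processing would happen here ... */

    evt->count = 0;
    /* Release orders the count reset before the flag is cleared. */
    atomic_fetch_and_explicit(&evt->flags, ~EVT_PENDING, memory_order_release);
    return IRQ_RES_HANDLED;
}
```

Calling top_half(&evt, 8) on a zero-initialized evt_model hands the buffer
over, a second top_half() call backs off until bottom_half() has run, and
after that both fields are zero again. Whether ordering of this kind is
enough for the real driver, or whether a lock is needed, is exactly the
open question in this thread.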