Hi,

On 01/15/2015 07:24 AM, Felipe Balbi wrote:
>>>>>>>>>>> This is really, really odd. Register accesses are atomic, so the lock
>>>>>>>>>>> isn't really doing anything. Besides, you're calling
>>>>>>>>>>> dwc2_is_controller_alive() from within the IRQ handler, so IRQs are
>>>>>>>>>>> already disabled.
>>>>>>>>>>
>>>>>>>>>> Spinlocks sometimes do more than you think. For instance, here the
>>>>>>>>>> lock prevents the register access from happening while some other CPU
>>>>>>>>>> is holding the lock. If a silicon quirk causes the register access to
>>>>>>>>>> interfere with other activities, this could be important.
>>>>>>>>>
>>>>>>>>> readl() (which is used by dwc2_is_controller_alive()) adds a memory
>>>>>>>>> barrier to the register accesses, that should force all register
>>>>>>>>> accesses to be correctly ordered.
>>>>>>>>
>>>>>>>> Memory barriers will order accesses that are all made on the same CPU
>>>>>>>> with respect to each other. They do not order these accesses against
>>>>>>>> accesses made from another CPU -- that's why we have spinlocks. :-)
>>>>>>>
>>>>>>> a fair point :-) The register is still read-only, so that shouldn't
>>>>>>> matter either :-)
>>>>>>>
>>>>>>>>> I fail to see how a silicon quirk
>>>>>>>>> could cause this and if, indeed, it does, I'd be more comfortable with a
>>>>>>>>> proper STARS ticket number from synopsys :-s
>>>>>>>>
>>>>>>>> Maybe accessing this register somehow resets something else. I don't
>>>>>>>> know. It seems unlikely, but at least it explains how adding a
>>>>>>>> spinlock could fix the problem.
>>>>>>>
>>>>>>> I would really need Paul (or someone at Synopsys) to confirm this
>>>>>>> somehow. Maybe it has something to do with how the register is
>>>>>>> implemented, dunno.
>>>>>>>
>>>>>>> Paul, do you have any idea what could cause this ? Could the HW get into
>>>>>>> some weird state if we read GSNPSID at random locations or when data is
>>>>>>> being transferred, or anything like that ?
>>>>>>
>>>>>> Only thing I can think of is that there is some silicon bug in Robert's
>>>>>> platform. But I am not aware of any STARs that mention accesses to the
>>>>>> GSNPSID register as being problematic.
>>>>>>
>>>>>> Funny thing is, this code has been basically the same since at least
>>>>>> November 2013. So I think some other recent change must have modified
>>>>>> the timing of the register accesses, or something like that. But that's
>>>>>> just handwaving, really.
>>>>>
>>>>> Alright, I'll apply this patch but for 3.20 with a stable tag as I have
>>>>> already sent my last pull request to Greg. Unless someone has a really
>>>>> big complaint about doing things as such.
>>>>
>>>> It should go to 3.19-rc shouldn't it? It's a fix, and Robert's platform
>>>> is broken without it, IIUC.
>>>
>>> It can also be categorized as "has-never-worked-before" because the code
>>> has been like this forever. Since we don't really have a git bisect
>>> result pointing to a commit that went in v3.19 merge window, I'm not
>>> sure how I can convince myself that this absolutely needs to be in
>>> v3.19.
>>>
>>> At a minimum, I need a proper bisection with a proper commit being
>>> blamed (even if it's a commit from months ago). From my point of view,
>>> debugging of this "regression" has not been finalized and we're just
>>> "assuming" it's caused by GSNPSID because moving that inside the
>>> spin_lock seems to fix the problem.
>>
>> On further investigation, I was wrong about "this code has been
>> basically the same since at least November 2013". Prior to commit
>> db8178c33db "usb: dwc2: Update common interrupt handler to call gadget
>> interrupt handler" from November 2014, the gadget interrupt handler
>> did not read from the GSNPSID register.
>
> right, but the common IRQ always did. So unless Robert's SoC has always
> been used only for peripheral, then I agree with you that behavior did,
> in fact, change.

As far as I know, DWC2 on this platform has always been used as a
peripheral. Exynos SoCs have EHCI USB controllers, so in 99% of cases
there is simply no need to use DWC2 as a host.

>
>> So likely the bug in Robert's hardware has been there all along, and
>> that commit just caused it to manifest itself.
>
> Robert, out of curiosity, which SoC are you using ? Is it UP or SMP ?
>
> I guess we need a mention on commit log that at least SoC XYZ is known
> to break unless the register access is done with locks held.
>

I'm using Exynos4412 (Odroid U3). Revision number of my DWC2 is 2.81a.
I will update the commit message and send patch v3.

Thanks,
Robert Baldyga
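
For anyone following the thread, the idea under discussion (doing the
GSNPSID read only while the lock is held) can be sketched in plain
userspace C. This is only an illustration, not the actual patch: the
struct and every fake_* name are hypothetical, a C11 atomic_flag stands in
for the kernel spinlock, an ordinary variable stands in for the
memory-mapped register, and treating an all-ones read as "controller not
alive" is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver state; not the real dwc2 struct. */
struct fake_hsotg {
    atomic_flag lock;          /* stands in for the kernel spinlock */
    volatile uint32_t gsnpsid; /* stands in for the GSNPSID register */
};

static void fake_init(struct fake_hsotg *h, uint32_t id)
{
    assert(h != NULL);
    atomic_flag_clear(&h->lock);
    h->gsnpsid = id;
}

/* Busy-wait until the flag is ours; acquire ordering on success. */
static void fake_lock(struct fake_hsotg *h)
{
    while (atomic_flag_test_and_set_explicit(&h->lock, memory_order_acquire))
        ; /* spin: some other CPU holds the lock */
}

static void fake_unlock(struct fake_hsotg *h)
{
    atomic_flag_clear_explicit(&h->lock, memory_order_release);
}

/* Caller must hold the lock. Assumption: all-ones means a dead read. */
static bool fake_is_controller_alive(struct fake_hsotg *h)
{
    return h->gsnpsid != 0xffffffffu;
}

/*
 * The register read happens inside the locked region, so it cannot
 * overlap with another CPU's critical section on the same lock.
 */
static bool fake_irq_handler(struct fake_hsotg *h)
{
    bool handled = false;

    fake_lock(h);
    if (fake_is_controller_alive(h))
        handled = true; /* real interrupt handling would go here */
    fake_unlock(h);

    return handled;
}
```

The sketch only shows the ordering property Alan described: a memory
barrier orders accesses made by one CPU, while the lock keeps the register
read from running concurrently with anything another CPU does under the
same lock. It says nothing about why the hardware misbehaves.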