On Mon, May 17, 2021 at 02:57:28PM +0200, Benjamin Berg wrote: > Hi Heikki, > > > On Mon, 2021-05-17 at 15:15 +0300, Heikki Krogerus wrote: > > On Mon, May 17, 2021 at 01:03:12PM +0300, Heikki Krogerus wrote: > > > On Sat, May 15, 2021 at 09:09:53PM -0700, Bjorn Andersson wrote: > > > > It's possible that the interrupt handler for the UCSI driver > > > > signals a > > > > connector changes after the handler clears the PENDING bit, but > > > > before > > > > it has sent the acknowledge request. The result is that the handler > > > > is > > > > invoked yet again, to ack the same connector change. > > > > > > > > At least some versions of the Qualcomm UCSI firmware will not > > > > handle the > > > > second - "spurious" - acknowledgment gracefully. So make sure to > > > > not > > > > clear the pending flag until the change is acknowledged. > > > > > > > > Any connector changes coming in after the acknowledgment, that > > > > would > > > > have the pending flag incorrectly cleared, would afaict be covered > > > > by > > > > the subsequent connector status check. > > > > > > > > Fixes: 217504a05532 ("usb: typec: ucsi: Work around PPM losing > > > > change information") > > > > Signed-off-by: Bjorn Andersson <bjorn.andersson@xxxxxxxxxx> > > > > > > I'm OK with this if Bejamin does not see any problems with it. I'll > > > wait for his comments before giving my reviewed-by tag. > > > > > > That workaround (commit 217504a05532) is unfortunately too fragile. > > > I'm going to now separate the processing of the connector state from > > > the event handler (interrupt handler). That way we should be fairly > > > sure we don't loose any of the connector states even if an event is > > > generated while we are still in the middle of processing the previous > > > one(s), and at the same time be sure that we also don't confuse the > > > firmware. > > > > > > So the event handler shall after that only read the connector status, > > > schedule the unique job where it's processed and ACK the event. > > > Nothing else. > > > > Seems to be straightforward to implement. I'm attaching the patch I > > made for that. I think it should actually also remove the problem you > > are seeing. Can you test it? > > Hmm, I am happy to try the patch tomorrow in the scenario where I > observed my race condition. > > Unfortunately, I don't feel it'll work. The problem that I was seeing > looked like a race condition in the PPM itself, where the window is the > time between the UCSI_GET_CONNECTOR_STATUS command and the subsequent > ACK. > For such a firmware level bug in the PPM, we need a way to detect the > race condition when it happens (or get a fix for the firmware). OK. Let me know does the patch bring the issue back for you. > Note that there was also some code to always sent a > UCSI_GET_CAM_SUPPORTED command "to ensure the PPM does not get stuck in > case it assumes we do". I have no idea where this came from or what > PPMs might require this workaround. But I doubt it is a good idea to > drop it. Sure. thanks, -- heikki