On 2023-06-05 01:04 PM, Benjamin Tissoires wrote: > > > On Jun 05 2023, Mark Lord wrote: >> >> On 2023-06-05 10:27 AM, Benjamin Tissoires wrote: >>> >>> On Jun 05 2023, Linux regression tracking (Thorsten Leemhuis) wrote: >>>> >>>> On 03.06.23 14:41, Jiri Kosina wrote: >>>>> On Wed, 31 May 2023, Jiri Kosina wrote: >>>>> >>>>>>> If an attempt at contacting a receiver or a device fails because the >>>>>>> receiver or device never responds, don't restart the communication, only >>>>>>> restart it if the receiver or device answers that it's busy, as originally >>>>>>> intended. >>>>>>> >>>>>>> This was the behaviour on communication timeout before commit 586e8fede795 >>>>>>> ("HID: logitech-hidpp: Retry commands when device is busy"). >>>>>>> >>>>>>> This fixes some overly long waits in a critical path on boot, when >>>>>>> checking whether the device is connected by getting its HID++ version. >>>>>>> >>>>>>> Signed-off-by: Bastien Nocera <hadess@xxxxxxxxxx> >>>>>>> Suggested-by: Mark Lord <mlord@xxxxxxxxx> >>>>>>> Fixes: 586e8fede795 ("HID: logitech-hidpp: Retry commands when device is busy") >>>>>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217412 >>>>> [...] >>>>>> >>>>>> I have applied this even before getting confirmation from the reporters in >>>>>> bugzilla, as it's the right thing to do anyway. >>>>> >>>>> Unfortunately it doesn't seem to cure the reported issue (while reverting >>>>> 586e8fede79 does): >>>> >>>> BTW, remind me again: was fixing this by reverting 586e8fede79 for now a >>>> option? I guess it's not, but if I'm wrong I wonder if that might at >>>> this point be the best way forward. >>> >>> Could be. I don't think we thought at simply reverting it because it is >>> required for some new supoprted devices because they might differ >>> slightly from what we currently supported. >>> >>> That being said, Bastien will be unavailable for at least a week AFAIU, >>> so maybe we should revert that patch. >>> >>>> >>>>> https://bugzilla.kernel.org/show_bug.cgi?id=217523#c2 >> >> Pardon me, but I don't understand what that "retry" loop >> is even attempting to do inside function hidpp_send_message_sync(). >> >> It appears to unconditionally loop until: >> (1) the __hidpp_send_report() fails, >> or (2) it gets a HIDPP_ERROR, >> or (3) it gets a HIDPP20_ERROR other than HIDPP20_ERROR_BUSY, >> or (4) until it has looped 3 times, which appears to be the normal exit. >> >> It doesn't seem to have any provision to exit the loop earlier on "success" >> (whatever that is). >> >> And so when it finally does exit after the 3 iterations, >> it then returns the last value of "ret", >> which will be zero from the __hidpp_send_report() call, >> or sometimes the most recent non-BUSY HIDPP20_ERROR seen. >> >> Obviously I'm missing something, as otherwise this code would never have >> passed review and made it into the Linux kernel in the first place. Right? > > Ouch, very much ouch :( > > So that one is on me, sorry, I completely missed that and trusted the > local tests. We are human and can make mistakes. And TBH a lot of people > have been staring at this code for a while now without noticing that. > > Would you mind giving a shot at the following (untested) patch: > > --- >>From 7d03a6b765571d7c518c85e7e74f6f9d498fa354 Mon Sep 17 00:00:00 2001 > From: Benjamin Tissoires <benjamin.tissoires@xxxxxxxxxx> > Date: Mon, 5 Jun 2023 18:56:36 +0200 > Subject: [PATCH] HID: hidpp: terminate retry loop on success > > It seems we forgot the normal case to terminate the retry loop, > making us asking 3 times each command, which is probably a little bit > too much. > > And remove the ugly "goto exit" that can be replaced by a simpler "break" > > Fixes: 586e8fede795 ("HID: logitech-hidpp: Retry commands when device is busy") > Suggested-by: Mark Lord <mlord@xxxxxxxxx> > Signed-off-by: Benjamin Tissoires <benjamin.tissoires@xxxxxxxxxx> > --- > drivers/hid/hid-logitech-hidpp.c | 13 ++++++------- > 1 file changed, 6 insertions(+), 7 deletions(-) > > diff --git a/drivers/hid/hid-logitech-hidpp.c b/drivers/hid/hid-logitech-hidpp.c > index 2246044b1639..5e1a412fd28f 100644 > --- a/drivers/hid/hid-logitech-hidpp.c > +++ b/drivers/hid/hid-logitech-hidpp.c > @@ -286,7 +286,7 @@ static int hidpp_send_message_sync(struct hidpp_device *hidpp, > struct hidpp_report *message, > struct hidpp_report *response) > { > - int ret; > + int ret = -1; > int max_retries = 3; > > mutex_lock(&hidpp->send_mutex); > @@ -300,13 +300,13 @@ static int hidpp_send_message_sync(struct hidpp_device *hidpp, > */ > *response = *message; > > - for (; max_retries != 0; max_retries--) { > + for (; max_retries != 0 && ret; max_retries--) { > ret = __hidpp_send_report(hidpp->hid_dev, message); > > if (ret) { > dbg_hid("__hidpp_send_report returned err: %d\n", ret); > memset(response, 0, sizeof(struct hidpp_report)); > - goto exit; > + break; > } > > if (!wait_event_timeout(hidpp->wait, hidpp->answer_available, > @@ -314,14 +314,14 @@ static int hidpp_send_message_sync(struct hidpp_device *hidpp, > dbg_hid("%s:timeout waiting for response\n", __func__); > memset(response, 0, sizeof(struct hidpp_report)); > ret = -ETIMEDOUT; > - goto exit; > + break; > } > > if (response->report_id == REPORT_ID_HIDPP_SHORT && > response->rap.sub_id == HIDPP_ERROR) { > ret = response->rap.params[1]; > dbg_hid("%s:got hidpp error %02X\n", __func__, ret); > - goto exit; > + break; > } > > if ((response->report_id == REPORT_ID_HIDPP_LONG || > @@ -330,13 +330,12 @@ static int hidpp_send_message_sync(struct hidpp_device *hidpp, > ret = response->fap.params[1]; > if (ret != HIDPP20_ERROR_BUSY) { > dbg_hid("%s:got hidpp 2.0 error %02X\n", __func__, ret); > - goto exit; > + break; > } > dbg_hid("%s:got busy hidpp 2.0 error %02X, retrying\n", __func__, ret); > } > } > > -exit: > mutex_unlock(&hidpp->send_mutex); > return ret; > > Tested-by: Mark Lord <mlord@xxxxxxxxx> That works fine for me on top of 6.3.6, and I don't even see the ETIMEDOUT happening there either (added a printk() for it). I am unable to test on 6.4.0-rc5, because that kernel doesn't work with my USB3 docking station. Cheers -- Mark Lord