On Mon, May 22, 2023 at 5:38 AM Linux regression tracking (Thorsten Leemhuis) <regressions@xxxxxxxxxxxxx> wrote: > > FWIW, in case anybody is interested in a status update: one reporter > bisected the problem down to 586e8fede79 ("HID: logitech-hidpp: Retry > commands when device is busy"); reverting that commit on-top of 6.3 > fixes the problem for that reporter. For that reporter things also work > on 6.4-rc; but for someone else that is affected that's not the case. Hmm. It's likely timing-dependent. But that code is clearly buggy. If the wait_event_timeout() returns early, the device hasn't replied, but the code does if (!wait_event_timeout(hidpp->wait, hidpp->answer_available, 5*HZ)) { dbg_hid("%s:timeout waiting for response\n", __func__); memset(response, 0, sizeof(struct hidpp_report)); ret = -ETIMEDOUT; } and then continues to look at the response _anyway_. Now, depending on out hardening options, that response may have been initialized by the compiler, or may just be random stack contents. That bug is pre-existing (ie the problem was not introduced by that commit), but who knows if the retry makes things worse (ie if it then triggers on a retry, the response data will be the *previous* response). The whole "goto exit" games should be removed too, because we're in a for-loop, and instead of "goto exit" it should just do "break". IOW, something like this might be worth testing. That said, while I think the code is buggy, I doubt this is the actual cause of the problem people are reporting. But it would be lovely to hear if the attached patch makes any difference, and I think this is fixing a real - but unlikely - problem anyway. And obviously it might be helpful to actually enable those dbg_hid() messages, but I didn't look at what the magic config option to do so was. NOTE! Patch below *ENTIRELY* untested. I just looked at the code when that commit was mentioned, and went "that's not right"... Linus
drivers/hid/hid-logitech-hidpp.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/hid/hid-logitech-hidpp.c b/drivers/hid/hid-logitech-hidpp.c index 0fcfd85fea0f..72bd62d2f984 100644 --- a/drivers/hid/hid-logitech-hidpp.c +++ b/drivers/hid/hid-logitech-hidpp.c @@ -306,7 +306,7 @@ static int hidpp_send_message_sync(struct hidpp_device *hidpp, if (ret) { dbg_hid("__hidpp_send_report returned err: %d\n", ret); memset(response, 0, sizeof(struct hidpp_report)); - goto exit; + break; } if (!wait_event_timeout(hidpp->wait, hidpp->answer_available, @@ -314,13 +314,14 @@ static int hidpp_send_message_sync(struct hidpp_device *hidpp, dbg_hid("%s:timeout waiting for response\n", __func__); memset(response, 0, sizeof(struct hidpp_report)); ret = -ETIMEDOUT; + break; } if (response->report_id == REPORT_ID_HIDPP_SHORT && response->rap.sub_id == HIDPP_ERROR) { ret = response->rap.params[1]; dbg_hid("%s:got hidpp error %02X\n", __func__, ret); - goto exit; + break; } if ((response->report_id == REPORT_ID_HIDPP_LONG || @@ -329,13 +330,12 @@ static int hidpp_send_message_sync(struct hidpp_device *hidpp, ret = response->fap.params[1]; if (ret != HIDPP20_ERROR_BUSY) { dbg_hid("%s:got hidpp 2.0 error %02X\n", __func__, ret); - goto exit; + break; } dbg_hid("%s:got busy hidpp 2.0 error %02X, retrying\n", __func__, ret); } } -exit: mutex_unlock(&hidpp->send_mutex); return ret;