On 6/27/24 17:58, Dmitry Baryshkov wrote: > On Thu, 27 Jun 2024 at 18:51, Fabrice Gasnier > <fabrice.gasnier@xxxxxxxxxxx> wrote: >> >> On 6/25/24 18:49, Dmitry Baryshkov wrote: >>> On Tue, Jun 25, 2024 at 05:24:54PM GMT, Fabrice Gasnier wrote: >>>> On 6/21/24 00:55, Dmitry Baryshkov wrote: >>>>> Extract common functions to handle command sending and to handle events >>>>> from UCSI. This ensures that all UCSI glue drivers handle the ACKs in >>>>> the same way. >>>>> >>>>> The CCG driver used DEV_CMD_PENDING both for internal >>>>> firmware-related commands and for UCSI control handling. Leave the >>>>> former use case intact. >>>>> >>>>> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@xxxxxxxxxx> >>>>> --- >>>>> drivers/usb/typec/ucsi/ucsi.c | 43 +++++++++++++++++++++++++++ >>>>> drivers/usb/typec/ucsi/ucsi.h | 7 +++++ >>>>> drivers/usb/typec/ucsi/ucsi_acpi.c | 46 ++--------------------------- >>>>> drivers/usb/typec/ucsi/ucsi_ccg.c | 21 ++----------- >>>>> drivers/usb/typec/ucsi/ucsi_glink.c | 47 ++--------------------------- >>>>> drivers/usb/typec/ucsi/ucsi_stm32g0.c | 44 ++-------------------------- >>>>> drivers/usb/typec/ucsi/ucsi_yoga_c630.c | 52 ++------------------------------- >>>>> 7 files changed, 62 insertions(+), 198 deletions(-) >>>>> >>>>> diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c >>>>> index 4ba22323dbf9..691ee0c4ef87 100644 >>>>> --- a/drivers/usb/typec/ucsi/ucsi.c >>>>> +++ b/drivers/usb/typec/ucsi/ucsi.c >>>>> @@ -36,6 +36,48 @@ >>>>> */ >>>>> #define UCSI_SWAP_TIMEOUT_MS 5000 >>>>> >>>>> +void ucsi_notify_common(struct ucsi *ucsi, u32 cci) >>>>> +{ >>>>> + if (UCSI_CCI_CONNECTOR(cci)) >>>>> + ucsi_connector_change(ucsi, UCSI_CCI_CONNECTOR(cci)); >>>>> + >>>>> + if (cci & UCSI_CCI_ACK_COMPLETE && >>>>> + test_bit(ACK_PENDING, &ucsi->flags)) >>>>> + complete(&ucsi->complete); >>>>> + >>>>> + if (cci & UCSI_CCI_COMMAND_COMPLETE && >>>>> + test_bit(COMMAND_PENDING, &ucsi->flags)) >>>>> + complete(&ucsi->complete); >>>> >>>> Hi Dmitry, >>>> >>>> I've recently faced some race with ucsi_stm32g0 driver, and have sent a >>>> fix for it [1], as you've noticed in the cover letter. >>>> >>>> To fix that, I've used test_and_clear_bit() in above two cases, instead >>>> of test_bit(). >>> >>> Could you possible describe, why do you need test_and_clear_bit() >>> instead of just test_bit()? The bits are cleared at the end of the >>> .sync_write(), also there can be no other command (or ACK_CC) submission >>> before this one is fully processed. >> >> Hi Dmitry, >> >> It took me some time to reproduce this race I observed earlier. >> (I observe this during DR swap.) >> >> Once the ->async_control(UCSI_ACK_CC_CI) call bellow gets completed, and >> before the ACK_PENDING bit gets cleared, e.g. clear_bit(ACK_PENDING), I >> get an asynchronous interrupt. >> >> Basically, Then the above complete() gets called (due to >> UCSI_CCI_ACK_COMPLETE & ACK_PENDING). >> >> Subsequent UCSI_GET_CONNECTOR_STATUS command (from >> ucsi_handle_connector_change) will be unblocked immediately due to >> complete() call has already happen, without UCSI_CCI_COMMAND_COMPLETE >> cci flag, hence returning -EIO. > > But the ACK_CI is being sent as a response to a command. This means > that the ppm_lock should be locked. The UCSI_GET_CONNECTOR_STATUS > command should wait for ppm_lock to be freed and only then it can > proceed with sending the command. Maybe I'm misunderstanding the case > or maybe there is a loophole somewhere. There's probably also some miss-understanding at my end, I don't get how the ppm_lock would protext from non atomic async_control()/clear_bit(). Let me share some trace here, I hope it can better show the difference at my end when I get the race. I've added some debug prints around these lines, to trace the set/clear of the COMMAND/ACK_PENDING, main commands and cci flags. normal case is: --- ucsi_send_command_common: UCSI_SET_UOR set:COMMAND_PENDING ucsi_stm32g0_irq_handler ucsi_notify_common: UCSI_CCI_COMMAND_COMPLETE clr:COMMAND_PENDING ucsi_acknowledge: UCSI_ACK_COMMAND_COMPLETE UCSI_ACK_CONNECTOR_CHANGE set:ACK_PENDING ucsi_stm32g0_irq_handler ucsi_notify_common: UCSI_CCI_ACK_COMPLETE clr:ACK_PENDING ucsi_stm32g0_irq_handler ucsi_notify_common: ucsi_connector_change ucsi_send_command_common: UCSI_GET_CONNECTOR_STATUS set:COMMAND_PENDING ucsi_stm32g0_irq_handler ucsi_notify_common: UCSI_CCI_COMMAND_COMPLETE clr:COMMAND_PENDING ucsi_acknowledge: UCSI_ACK_COMMAND_COMPLETE set:ACK_PENDING ucsi_stm32g0_irq_handler ucsi_notify_common: UCSI_CCI_ACK_COMPLETE clr:ACK_PENDING racy case: --- ucsi_send_command_common: UCSI_SET_UOR set:COMMAND_PENDING ucsi_stm32g0_irq_handler ucsi_notify_common: UCSI_CCI_COMMAND_COMPLETE clr:COMMAND_PENDING ucsi_acknowledge: UCSI_ACK_COMMAND_COMPLETE UCSI_ACK_CONNECTOR_CHANGE set:ACK_PENDING ucsi_stm32g0_irq_handler <-- up to here nothing supicious ucsi_notify_common: ucsi_connector_change ucsi_notify_common: UCSI_CCI_ACK_COMPLETE ucsi_stm32g0_irq_handler <-- 2nd IRQ before clr:ACK_PENDING ucsi_notify_common: ucsi_connector_change ucsi_notify_common: UCSI_CCI_ACK_COMPLETE clr:ACK_PENDING ucsi_send_command_common: UCSI_GET_CONNECTOR_STATUS set:COMMAND_PENDING <-- completion already done :-( clr:COMMAND_PENDING ucsi_handle_connector_change: GET_CONNECTOR_STATUS failed (-5) I notice the two conditions are met above: both ucsi_connector_change() and cci ack completed. This happens before clear_bit(ACK_PENDING), which lead to anticipate completion of the subsequent UCSI_GET_CONNECTOR_STATUS command. So the test_and_clear_bit() would address a robustness case? Please advise, Best Regards, Fabrice > >> This is where the test_and_clear_bit() atomic operation helps, to avoid >> non atomic operation: >> >> -> async_control(UCSI_ACK_CC_CI) >> new interrupt may occur here >> -> clear_bit(ACK_PENDING) >> >>> >>>> >>>> https://lore.kernel.org/linux-usb/20240612124656.2305603-1-fabrice.gasnier@xxxxxxxxxxx/ >>>> >>>>> +} >>>>> +EXPORT_SYMBOL_GPL(ucsi_notify_common); >>>>> + >>>>> +int ucsi_sync_control_common(struct ucsi *ucsi, u64 command) >>>>> +{ >>>>> + bool ack = UCSI_COMMAND(command) == UCSI_ACK_CC_CI; >>>>> + int ret; >>>>> + >>>>> + if (ack) >>>>> + set_bit(ACK_PENDING, &ucsi->flags); >>>>> + else >>>>> + set_bit(COMMAND_PENDING, &ucsi->flags); >>>>> + >>>>> + ret = ucsi->ops->async_control(ucsi, command); >>>>> + if (ret) >>>>> + goto out_clear_bit; >>>>> + >>>>> + if (!wait_for_completion_timeout(&ucsi->complete, 5 * HZ)) >>>>> + ret = -ETIMEDOUT; >>>> >>>> With test_and_clear_bit(), could return 0, in case of success here. >>> >>> Oh, I see. So your code returns earlier. I have a feeling that this >>> approach is less logical and slightly harder to follow. >> >> By reading your proposal bellow, I'd agree with you. >>> >>> Maybe it's easier if it is implemented as: >>> >>> if (wait_for_completion_timeout(...)) >>> return 0; >> >> Yes, sounds good to me. >> >>> >>> if (ack) >>> clear_bit(ACK_PENDING) >>> else >>> clear_bit(COMMAND_PENDING) >>> >>> return -ETIMEDOUT; >>> >>> >>> OR >>> >>> if (!wait_for_completion_timeout(...)) { >>> if (ack) >>> clear_bit(ACK_PENDING) >>> else >>> clear_bit(COMMAND_PENDING) >>> >>> return -ETIMEDOUT; >>> } >> >> Both seems fine. >> >> Please advise, >> BR, >> Fabrice >> >>> >>> return 0; >>> >>> But really, unless there is an actual issue with the current code, I'd >>> prefer to keep it. It makes it clear that the bits are set and then are >>> cleared properly. >>> >>>> I'd suggest to use similar approach here, unless you see some drawback? >>>> >>>> Best Regards, >>>> Fabrice >>>> >>>>> + >>>>> +out_clear_bit: >>>>> + if (ack) >>>>> + clear_bit(ACK_PENDING, &ucsi->flags); >>>>> + else >>>>> + clear_bit(COMMAND_PENDING, &ucsi->flags); >>>>> + >>>>> + return ret; >>>>> +} >>>>> +EXPORT_SYMBOL_GPL(ucsi_sync_control_common); >>>>> + >>>>> static int ucsi_acknowledge(struct ucsi *ucsi, bool conn_ack) >>>>> { >>>>> u64 ctrl; >>>>> @@ -1883,6 +1925,7 @@ struct ucsi *ucsi_create(struct device *dev, const struct ucsi_operations *ops) >>>>> INIT_WORK(&ucsi->resume_work, ucsi_resume_work); >>>>> INIT_DELAYED_WORK(&ucsi->work, ucsi_init_work); >>>>> mutex_init(&ucsi->ppm_lock); >>>>> + init_completion(&ucsi->complete); >>>>> ucsi->dev = dev; >>>>> ucsi->ops = ops; >>>> >>>> [snip] >>>> >>>>> ucsi->ucsi = ucsi_create(dev, &pmic_glink_ucsi_ops); >>>>> diff --git a/drivers/usb/typec/ucsi/ucsi_stm32g0.c b/drivers/usb/typec/ucsi/ucsi_stm32g0.c >>>>> index 14737ca3724c..d948c3f579e1 100644 >>>>> --- a/drivers/usb/typec/ucsi/ucsi_stm32g0.c >>>>> +++ b/drivers/usb/typec/ucsi/ucsi_stm32g0.c >>>>> @@ -61,11 +61,7 @@ struct ucsi_stm32g0 { >>>> >>>> [snip] >>>> >>>>> - >>>>> - ret = ucsi_stm32g0_async_control(ucsi, command); >>>>> - if (ret) >>>>> - goto out_clear_bit; >>>>> - >>>>> - if (!wait_for_completion_timeout(&g0->complete, msecs_to_jiffies(5000))) >>>>> - ret = -ETIMEDOUT; >>>>> - else >>>>> - return 0; >>>>> - >>>>> -out_clear_bit: >>>>> - if (ack) >>>>> - clear_bit(ACK_PENDING, &g0->flags); >>>>> - else >>>>> - clear_bit(COMMAND_PENDING, &g0->flags); >>>>> - >>>>> - return ret; >>>>> -} >>>>> - >>>>> static irqreturn_t ucsi_stm32g0_irq_handler(int irq, void *data) >>>>> { >>>>> struct ucsi_stm32g0 *g0 = data; >>>>> @@ -449,13 +416,7 @@ static irqreturn_t ucsi_stm32g0_irq_handler(int irq, void *data) >>>>> if (ret) >>>>> return IRQ_NONE; >>>>> >>>>> - if (UCSI_CCI_CONNECTOR(cci)) >>>>> - ucsi_connector_change(g0->ucsi, UCSI_CCI_CONNECTOR(cci)); >>>>> - >>>>> - if (cci & UCSI_CCI_ACK_COMPLETE && test_and_clear_bit(ACK_PENDING, &g0->flags)) >>>>> - complete(&g0->complete); >>>>> - if (cci & UCSI_CCI_COMMAND_COMPLETE && test_and_clear_bit(COMMAND_PENDING, &g0->flags)) >>>>> - complete(&g0->complete); >>>>> + ucsi_notify_common(g0->ucsi, cci); >>>> >>>> I can see the fix "test_and_clear_bit()" sent earlier is removed from here. >>>> >>>> I'd suggest to use similar approach as here, unless you see some drawback? >>>> >>>> Please advise, >>>> Best Regards, >>>> Fabrice >>> > > >