The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@xxxxxxxxxxxxxxx>.

To reproduce the conflict and resubmit, you may use the following commands:

git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 437a310b22244d4e0b78665c3042e5d1c0f45306
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@xxxxxxxxxxxxxxx>' --in-reply-to '2024012719-remarry-magical-0c2e@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..

Possible dependencies:

437a310b2224 ("firmware: arm_scmi: Check mailbox/SMT channel for consistency")
13fba878ccdd ("firmware: arm_scmi: Add priv parameter to scmi_rx_callback")
e9b21c96181c ("firmware: arm_scmi: Make .clear_channel optional")
ed7c04c1fea3 ("firmware: arm_scmi: Handle concurrent and out-of-order messages")
9ca5a1838e59 ("firmware: arm_scmi: Introduce monotonically increasing tokens")
3669032514be ("firmware: arm_scmi: Remove scmi_dump_header_dbg() helper")
e30d91d4ffda ("firmware: arm_scmi: Move reinit_completion from scmi_xfer_get to do_xfer")
0cb7af474e0d ("firmware: arm_scmi: Reset Rx buffer to max size during async commands")
d4f9dddd21f3 ("firmware: arm_scmi: Add dynamic scmi devices creation")
f5800e0bf6f9 ("firmware: arm_scmi: Add protocol modularization support")
a02d7c93c1f3 ("firmware: arm_scmi: Make notify_priv really private")
9162afa2ae99 ("firmware: arm_scmi: Cleanup unused core transfer helper wrappers")
51fe1b154e2f ("firmware: arm_scmi: Cleanup legacy protocol init code")
fe4894d968f4 ("firmware: arm_scmi: Port voltage protocol to new protocols interface")
b46d852718c1 ("firmware: arm_scmi: Port systempower protocol to new protocols interface")
9694a7f62359 ("firmware: arm_scmi: Port sensor protocol to new protocols interface")
7e0293442238 ("firmware: arm_scmi: Port reset protocol to new protocols interface")
887281c7519d ("firmware: arm_scmi: Port clock protocol to new protocols interface")
9bc8069c8567 ("firmware: arm_scmi: Port power protocol to new protocols interface")
1fec5e6b5233 ("firmware: arm_scmi: Port perf protocol to new protocols interface")

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 437a310b22244d4e0b78665c3042e5d1c0f45306 Mon Sep 17 00:00:00 2001
From: Cristian Marussi <cristian.marussi@xxxxxxx>
Date: Wed, 20 Dec 2023 17:21:12 +0000
Subject: [PATCH] firmware: arm_scmi: Check mailbox/SMT channel for consistency

On reception of a completion interrupt the shared memory area is accessed
to retrieve the message header at first and then, if the message sequence
number identifies a transaction which is still pending, the related
payload is fetched too.

When an SCMI command times out the channel ownership remains with the
platform until eventually a late reply is received and, as a consequence,
any further transmission attempt remains pending, waiting for the channel
to be relinquished by the platform.

Once that late reply is received the channel ownership is given back
to the agent and any pending request is then allowed to proceed and
overwrite the SMT area of the just delivered late reply; then the wait
for the reply to the new request starts.

It has been observed that the spurious IRQ related to the late reply can
be wrongly associated with the freshly enqueued request: when that happens
the SCMI stack in-flight lookup procedure is fooled by the fact that the
message header now present in the SMT area is related to the new pending
transaction, even though the real reply has still to arrive.

This race-condition on the A2P channel can be detected by looking at the
channel status bits: a genuine reply from the platform will have set the
channel free bit before triggering the completion IRQ.

Add a consistency check to validate such condition in the A2P ISR.

Reported-by: Xinglong Yang <xinglong.yang@xxxxxxxxxxx>
Closes: https://lore.kernel.org/all/PUZPR06MB54981E6FA00D82BFDBB864FBF08DA@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
Fixes: 5c8a47a5a91d ("firmware: arm_scmi: Make scmi core independent of the transport type")
Cc: stable@xxxxxxxxxxxxxxx # 5.15+
Signed-off-by: Cristian Marussi <cristian.marussi@xxxxxxx>
Tested-by: Xinglong Yang <xinglong.yang@xxxxxxxxxxx>
Link: https://lore.kernel.org/r/20231220172112.763539-1-cristian.marussi@xxxxxxx
Signed-off-by: Sudeep Holla <sudeep.holla@xxxxxxx>

diff --git a/drivers/firmware/arm_scmi/common.h b/drivers/firmware/arm_scmi/common.h
index c46dc5215af7..00b165d1f502 100644
--- a/drivers/firmware/arm_scmi/common.h
+++ b/drivers/firmware/arm_scmi/common.h
@@ -314,6 +314,7 @@ void shmem_fetch_notification(struct scmi_shared_mem __iomem *shmem,
 void shmem_clear_channel(struct scmi_shared_mem __iomem *shmem);
 bool shmem_poll_done(struct scmi_shared_mem __iomem *shmem,
 		     struct scmi_xfer *xfer);
+bool shmem_channel_free(struct scmi_shared_mem __iomem *shmem);
 
 /* declarations for message passing transports */
 struct scmi_msg_payld;
diff --git a/drivers/firmware/arm_scmi/mailbox.c b/drivers/firmware/arm_scmi/mailbox.c
index 19246ed1f01f..b8d470417e8f 100644
--- a/drivers/firmware/arm_scmi/mailbox.c
+++ b/drivers/firmware/arm_scmi/mailbox.c
@@ -45,6 +45,20 @@ static void rx_callback(struct mbox_client *cl, void *m)
 {
 	struct scmi_mailbox *smbox = client_to_scmi_mailbox(cl);
 
+	/*
+	 * An A2P IRQ is NOT valid when received while the platform still has
+	 * the ownership of the channel, because the platform at first releases
+	 * the SMT channel and then sends the completion interrupt.
+	 *
+	 * This addresses a possible race condition in which a spurious IRQ from
+	 * a previous timed-out reply which arrived late could be wrongly
+	 * associated with the next pending transaction.
+	 */
+	if (cl->knows_txdone && !shmem_channel_free(smbox->shmem)) {
+		dev_warn(smbox->cinfo->dev, "Ignoring spurious A2P IRQ !\n");
+		return;
+	}
+
 	scmi_rx_callback(smbox->cinfo, shmem_read_header(smbox->shmem), NULL);
 }
 
diff --git a/drivers/firmware/arm_scmi/shmem.c b/drivers/firmware/arm_scmi/shmem.c
index 87b4f4d35f06..517d52fb3bcb 100644
--- a/drivers/firmware/arm_scmi/shmem.c
+++ b/drivers/firmware/arm_scmi/shmem.c
@@ -122,3 +122,9 @@ bool shmem_poll_done(struct scmi_shared_mem __iomem *shmem,
 		(SCMI_SHMEM_CHAN_STAT_CHANNEL_ERROR |
 		 SCMI_SHMEM_CHAN_STAT_CHANNEL_FREE);
 }
+
+bool shmem_channel_free(struct scmi_shared_mem __iomem *shmem)
+{
+	return (ioread32(&shmem->channel_status) &
+			SCMI_SHMEM_CHAN_STAT_CHANNEL_FREE);
+}
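
For anyone preparing the backport, the consistency check described in the
commit message can be modelled in user space roughly as follows. This is
only a sketch and not kernel code: struct model_shmem, the MODEL_* bit
values, and model_irq_is_genuine() are hypothetical names invented for
illustration, and the status bit layout (bit 0 for "channel free") is an
assumption about the SMT status word. The real code uses ioread32() on
__iomem shared memory and the SCMI_SHMEM_CHAN_STAT_* definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the channel status word; illustration only. */
#define MODEL_CHAN_STAT_CHANNEL_FREE	(1u << 0)
#define MODEL_CHAN_STAT_CHANNEL_ERROR	(1u << 1)

/* Hypothetical stand-in for the shared memory (SMT) area. */
struct model_shmem {
	uint32_t channel_status;
	uint32_t msg_header;
};

/* Mirrors the idea of shmem_channel_free(): test only the "free" bit. */
static bool model_channel_free(const struct model_shmem *shmem)
{
	return (shmem->channel_status & MODEL_CHAN_STAT_CHANNEL_FREE) != 0;
}

/*
 * Decide whether a completion IRQ should be processed.
 *
 * A genuine reply is expected to arrive only after the platform has
 * released the channel, so an IRQ seen while the channel is still owned
 * by the platform is treated as spurious. The check is applied only when
 * knows_txdone is set, as in the patch.
 */
static bool model_irq_is_genuine(const struct model_shmem *shmem,
				 bool knows_txdone)
{
	if (knows_txdone && !model_channel_free(shmem))
		return false;	/* spurious: platform still owns the channel */

	return true;		/* safe to read the header and dispatch */
}
```

In this model, an IRQ delivered while channel_status has the free bit
clear is dropped when knows_txdone is true, which corresponds to the late
reply of a timed-out command being ignored instead of being matched
against the newly queued request.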