On 2/25/2022 1:02 PM, Quan, Evan wrote:
[AMD Official Use Only]
-----Original Message-----
From: Lazar, Lijo <Lijo.Lazar@xxxxxxx>
Sent: Friday, February 25, 2022 2:03 PM
To: Quan, Evan <Evan.Quan@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Zhang, Hawking <Hawking.Zhang@xxxxxxx>; Deucher, Alexander
<Alexander.Deucher@xxxxxxx>; Wang, Yang(Kevin)
<KevinYang.Wang@xxxxxxx>
Subject: Re: [PATCH] drm/amd/pm: Send message when resp status is 0xFC
On 2/25/2022 11:25 AM, Quan, Evan wrote:
-----Original Message-----
From: Lazar, Lijo <Lijo.Lazar@xxxxxxx>
Sent: Friday, February 25, 2022 1:47 PM
To: Quan, Evan <Evan.Quan@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Zhang, Hawking <Hawking.Zhang@xxxxxxx>; Deucher, Alexander
<Alexander.Deucher@xxxxxxx>; Wang, Yang(Kevin)
<KevinYang.Wang@xxxxxxx>
Subject: RE: [PATCH] drm/amd/pm: Send message when resp status is 0xFC
That is, the caller can do something like issuing the same
message again without the prerequisite check on PMFW busy.
This patch expects exactly that approach. The caller may try to resend the
message. As part of sending a message, the driver first checks the response
register. The current logic blocks sending any message if it sees 0xFC in
the response register; this patch addresses that.
[Quan, Evan] Yes, I know. But the caller here could be a different one. I mean
another caller may have stepped in.
That shouldn't cause an issue for the second caller if it holds the message mutex.
The second caller should also be able to send its message if PMFW has become free by
that time. The first caller can retry when it gets the message mutex back. FW
doesn't maintain any state for a 0xFC response. Any other message may be
sent after that. If the driver keeps state based on two callers, that is a logic
problem in the driver. I don't think we have any flow like that.
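The two-caller flow described above can be pictured with a small mock. This is only a sketch under stated assumptions: `struct mock_pmfw`, `mock_send_locked()` and the `busy_msgs` counter are hypothetical names invented for illustration, and only the 0xFC value comes from this thread. Each call stands for one send attempt made while holding the message mutex.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_RESP_OK         0x01u  /* assumed value for "message handled" */
#define MOCK_RESP_BUSY_OTHER 0xFCu  /* from the thread: PMFW was busy */
#define MOCK_LOG_MAX         8

/* Hypothetical stand-in for PMFW; not a driver structure. */
struct mock_pmfw {
    uint32_t resp;                   /* last response written by firmware */
    unsigned busy_msgs;              /* how many upcoming messages get 0xFC */
    uint16_t accepted[MOCK_LOG_MAX]; /* messages in the order they were accepted */
    size_t n_accepted;
};

/* One send attempt, imagined to run with the message mutex held.
 * The mock keeps no memory of which message was rejected earlier. */
static uint32_t mock_send_locked(struct mock_pmfw *fw, uint16_t msg)
{
    assert(fw != NULL);
    if (fw->busy_msgs > 0) {
        fw->busy_msgs--;
        fw->resp = MOCK_RESP_BUSY_OTHER;
    } else {
        assert(fw->n_accepted < MOCK_LOG_MAX);
        fw->accepted[fw->n_accepted++] = msg;
        fw->resp = MOCK_RESP_OK;
    }
    return fw->resp;
}
```

If the first caller's message is rejected with 0xFC, the second caller's message goes through, and the first caller then retries successfully, the mock records the second caller's message ahead of the first caller's. Nothing is lost, but the acceptance order differs from the order in which the callers first tried.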
[Quan, Evan] Yeah, but there may be cases where the messages issued by the two callers depend on each other.
That means the message issued by the 2nd caller must come only after the 1st one.
The one I can think of is that the "EnableAllSmuFeatures" message should come after the "SetAllowedFeatures" message.
Although that should not cause any problem, I'm not sure whether there are other similar cases.
What I suggest is something like below. We just send it again in smu_cmn_send_smc_msg_with_param() on PMFW busy.
int smu_cmn_send_smc_msg_with_param(struct smu_context *smu,
                                    enum smu_message_type msg,
                                    uint32_t param,
                                    uint32_t *read_arg)
{
...
...
        mutex_lock(&smu->message_lock);
        reg = __smu_cmn_poll_stat(smu);
        res = __smu_cmn_reg2errno(smu, reg);
        if (reg == SMU_RESP_NONE ||
            reg == SMU_RESP_BUSY_OTHER ||
            res == -EREMOTEIO) {
                __smu_cmn_reg_print_error(smu, reg, index, param, msg);
                goto Out;
        }
+retry:
        __smu_cmn_send_msg(smu, (uint16_t) index, param);
        reg = __smu_cmn_poll_stat(smu);
        res = __smu_cmn_reg2errno(smu, reg);
+       if (reg == SMU_RESP_BUSY_OTHER) {
+               mdelay(1);
+               goto retry;
+       }
I suppose the retry option should be left to the caller. Regardless of
whether we retry or not, the patch is still valid.
Example situation -
rocm-smi is trying to get metrics and another app is trying to set a
performance profile. If the metrics fetch fails, and even a retry of the
metrics fetch fails after some loops, rocm-smi is free to fetch the metrics
again after, say, 5s. That also shouldn't prevent the second app from sending
the performance profile message, and that app may also retry later.
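A caller-side retry along these lines might look like the sketch below. It is an illustration only, not driver code: `send_fn`, `send_with_retry()` and the `flaky` test double are hypothetical names, and it assumes a busy response reaches the caller as -EBUSY. The loop is bounded, so the caller gets the error back and can decide to try again later.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type: returns 0 on success or a negative errno. */
typedef int (*send_fn)(void *ctx);

/* Retry only while the send reports -EBUSY (assumed mapping for a busy
 * response), for at most max_tries attempts. Returns the last result,
 * or -EINVAL if no attempt was made. */
static int send_with_retry(send_fn send, void *ctx, unsigned max_tries)
{
    int ret = -EINVAL;

    assert(send != NULL);
    for (unsigned i = 0; i < max_tries; i++) {
        ret = send(ctx);
        if (ret != -EBUSY)
            break;
        /* A real caller would sleep or back off here before retrying. */
    }
    return ret;
}

/* Test double: reports -EBUSY until its counter runs out, then succeeds. */
struct flaky {
    unsigned busy_left;
    unsigned calls;
};

static int flaky_send(void *ctx)
{
    struct flaky *f = ctx;

    f->calls++;
    if (f->busy_left > 0) {
        f->busy_left--;
        return -EBUSY;
    }
    return 0;
}
```

Keeping the loop in the caller lets each caller choose its own attempt count and delay, which matches the idea that one app giving up for now should not hold back another.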
Thanks,
Lijo
...
...
}
BR
Evan
Basically, 0xFC is not a valid pre-condition check for sending any message. As
per the PMFW team, it only means that PMFW was busy when a previous
message was sent, and PMFW won't change the response status when it
becomes free.
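The stale-status behaviour can be modelled with a small mock. This is a sketch under assumptions, not the driver's implementation: `struct mock_pmfw`, `mock_pmfw_handle_msg()` and `mock_send_msg()` are hypothetical names, and every response value other than 0xFC is assumed. The `block_on_busy` flag mimics treating 0xFC as a pre-condition failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Response codes; only 0xFC is taken from the thread, the rest are assumed. */
enum {
    MOCK_RESP_NONE       = 0x00, /* assumed: no response written yet */
    MOCK_RESP_OK         = 0x01, /* assumed: message handled */
    MOCK_RESP_BUSY_OTHER = 0xFC  /* from the thread: PMFW was busy */
};

/* Hypothetical stand-in for PMFW and its response register. */
struct mock_pmfw {
    uint32_t resp; /* response register, written only when a message arrives */
    bool busy;     /* internal firmware state, invisible to the driver */
};

/* Firmware side: the register changes only as a reaction to a message. */
static void mock_pmfw_handle_msg(struct mock_pmfw *fw)
{
    assert(fw != NULL);
    fw->resp = fw->busy ? MOCK_RESP_BUSY_OTHER : MOCK_RESP_OK;
}

/* Driver side: returns true if the message was sent and acknowledged.
 * block_on_busy = true refuses to send while the register reads 0xFC. */
static bool mock_send_msg(struct mock_pmfw *fw, bool block_on_busy)
{
    assert(fw != NULL);
    if (block_on_busy && fw->resp == MOCK_RESP_BUSY_OTHER)
        return false; /* a stale 0xFC stops every later message */
    mock_pmfw_handle_msg(fw);
    return fw->resp == MOCK_RESP_OK;
}
```

In this model, once a message has been rejected with 0xFC, clearing `busy` does not touch `resp`. A sender that blocks on 0xFC never gets through again, while a sender that ignores it succeeds on the next attempt.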
Thanks,
Lijo
BR
Evan
Thanks,
Lijo
-----Original Message-----
From: Quan, Evan <Evan.Quan@xxxxxxx>
Sent: Friday, February 25, 2022 11:07 AM
To: Lazar, Lijo <Lijo.Lazar@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Zhang, Hawking <Hawking.Zhang@xxxxxxx>; Deucher, Alexander
<Alexander.Deucher@xxxxxxx>; Wang, Yang(Kevin)
<KevinYang.Wang@xxxxxxx>
Subject: RE: [PATCH] drm/amd/pm: Send message when resp status is 0xFC
This may introduce some problems in two-caller scenarios. That is,
the 2nd one will still proceed even if the 1st one was already blocked.
Maybe the logic here should be performed by the caller. That is, the
caller can do something like issuing the same message again
without the prerequisite check on PMFW busy.
Or we can just update the smu_cmn_send_smc_msg APIs to give it
another try on PMFW busy.
BR
Evan
-----Original Message-----
From: Lazar, Lijo <Lijo.Lazar@xxxxxxx>
Sent: Friday, February 25, 2022 12:22 PM
To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Zhang, Hawking <Hawking.Zhang@xxxxxxx>; Deucher, Alexander
<Alexander.Deucher@xxxxxxx>; Wang, Yang(Kevin)
<KevinYang.Wang@xxxxxxx>; Quan, Evan <Evan.Quan@xxxxxxx>
Subject: [PATCH] drm/amd/pm: Send message when resp status is 0xFC
When PMFW is really busy, it will respond with 0xFC. However, it
doesn't change the response register state when it becomes free.
Driver should retry and proceed to send message if the response
status is 0xFC.
Signed-off-by: Lijo Lazar <lijo.lazar@xxxxxxx>
---
drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index 590a6ed12d54..92161b9d8c1a 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -297,7 +297,6 @@ int smu_cmn_send_msg_without_waiting(struct smu_context *smu,
 	reg = __smu_cmn_poll_stat(smu);
 	res = __smu_cmn_reg2errno(smu, reg);
 	if (reg == SMU_RESP_NONE ||
-	    reg == SMU_RESP_BUSY_OTHER ||
 	    res == -EREMOTEIO)
 		goto Out;
 	__smu_cmn_send_msg(smu, msg_index, param);
@@ -391,7 +390,6 @@ int smu_cmn_send_smc_msg_with_param(struct smu_context *smu,
 	reg = __smu_cmn_poll_stat(smu);
 	res = __smu_cmn_reg2errno(smu, reg);
 	if (reg == SMU_RESP_NONE ||
-	    reg == SMU_RESP_BUSY_OTHER ||
 	    res == -EREMOTEIO) {
 		__smu_cmn_reg_print_error(smu, reg, index, param, msg);
 		goto Out;
--
2.25.1