[AMD Public Use] Hi Hawking, Not sure If there is a situation that we pass through the physical device to a vm, the condition check " if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) " in function amdgpu_mm_rreg should be true when we call RREG32_SOC15, is it correct? Brs Wenhui -----Original Message----- From: Zhang, Hawking <Hawking.Zhang@xxxxxxx> Sent: Monday, July 13, 2020 1:44 PM To: Sheng, Wenhui <Wenhui.Sheng@xxxxxxx>; Li, Dennis <Dennis.Li@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx Cc: Gao, Likun <Likun.Gao@xxxxxxx> Subject: RE: [PATCH 1/3] drm/amd/powerplay: add SMU mode1 reset [AMD Public Use] RE - [Dennis Li] It is better change to use RREG32_SOC15_NO_KIQ, because when GPU hang, RREG32_SOC15 will fail if it use RREG32_KIQ to read register RREG32_SOC15_NO_KIQ should have no difference from RREG32_SOC15 for this use scenario. This is the feature only supported in bare-metal environment and never run from guest environment. But this do remind us to exclude the feature from guest run-time environment by checking amdgpu_vf_sriov() Regards, Hawking -----Original Message----- From: Sheng, Wenhui <Wenhui.Sheng@xxxxxxx> Sent: Monday, July 13, 2020 11:43 To: Li, Dennis <Dennis.Li@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx Cc: Gao, Likun <Likun.Gao@xxxxxxx>; Zhang, Hawking <Hawking.Zhang@xxxxxxx> Subject: RE: [PATCH 1/3] drm/amd/powerplay: add SMU mode1 reset [AMD Official Use Only - Internal Distribution Only] Ok, will refine it. Brs Wenhui -----Original Message----- From: Li, Dennis <Dennis.Li@xxxxxxx> Sent: Monday, July 13, 2020 11:10 AM To: Sheng, Wenhui <Wenhui.Sheng@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx Cc: Gao, Likun <Likun.Gao@xxxxxxx>; Sheng, Wenhui <Wenhui.Sheng@xxxxxxx>; Zhang, Hawking <Hawking.Zhang@xxxxxxx> Subject: RE: [PATCH 1/3] drm/amd/powerplay: add SMU mode1 reset [AMD Official Use Only - Internal Distribution Only] -----Original Message----- From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Wenhui Sheng Sent: Friday, July 10, 2020 10:17 PM To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx Cc: Gao, Likun <Likun.Gao@xxxxxxx>; Sheng, Wenhui <Wenhui.Sheng@xxxxxxx>; Zhang, Hawking <Hawking.Zhang@xxxxxxx> Subject: [PATCH 1/3] drm/amd/powerplay: add SMU mode1 reset >From PM FW 58.26.0 for sienna cichlid, SMU mode1 reset is support, driver sends PPSMC_MSG_Mode1Reset message to PM FW could trigger this reset. v2: add mode1 reset dpm interface Signed-off-by: Likun Gao <Likun.Gao@xxxxxxx> Signed-off-by: Wenhui Sheng <Wenhui.Sheng@xxxxxxx> --- drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c | 20 +++++++++++ drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h | 3 ++ drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 34 +++++++++++++++++++ .../gpu/drm/amd/powerplay/inc/amdgpu_smu.h | 4 +++ drivers/gpu/drm/amd/powerplay/inc/smu_types.h | 1 + drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h | 2 ++ .../drm/amd/powerplay/sienna_cichlid_ppt.c | 31 +++++++++++++++-- drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 13 +++++++ 8 files changed, 105 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c index 65472b3dd815..16668fc52d0d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.c @@ -1141,6 +1141,26 @@ int amdgpu_dpm_baco_reset(struct amdgpu_device *adev) return 0; } +bool amdgpu_dpm_is_mode1_reset_supported(struct amdgpu_device *adev) { + struct smu_context *smu = &adev->smu; + + if (is_support_sw_smu(adev)) + return smu_mode1_reset_is_support(smu); + + return false; +} + +int amdgpu_dpm_mode1_reset(struct amdgpu_device *adev) { + struct smu_context *smu = &adev->smu; + + if (is_support_sw_smu(adev)) + return smu_mode1_reset(smu); + + return -EOPNOTSUPP; +} + int amdgpu_dpm_switch_power_profile(struct amdgpu_device *adev, enum PP_SMC_POWER_PROFILE type, bool en) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h index 6a8aae70a0e6..7f3cd7185650 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dpm.h @@ -529,6 +529,9 @@ int amdgpu_dpm_mode2_reset(struct amdgpu_device *adev); bool amdgpu_dpm_is_baco_supported(struct amdgpu_device *adev); +bool amdgpu_dpm_is_mode1_reset_supported(struct amdgpu_device *adev); +int amdgpu_dpm_mode1_reset(struct amdgpu_device *adev); + int amdgpu_dpm_set_mp1_state(struct amdgpu_device *adev, enum pp_mp1_state mp1_state); diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c index fe4948aa662f..b5a7422d9548 100644 --- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c @@ -2737,6 +2737,40 @@ int smu_baco_exit(struct smu_context *smu) return ret; } +bool smu_mode1_reset_is_support(struct smu_context *smu) { + bool ret = false; + + if (!smu->pm_enabled) + return false; + + mutex_lock(&smu->mutex); + + if (smu->ppt_funcs && smu->ppt_funcs->mode1_reset_is_support) + ret = smu->ppt_funcs->mode1_reset_is_support(smu); + + mutex_unlock(&smu->mutex); + + return ret; +} + +int smu_mode1_reset(struct smu_context *smu) { + int ret = 0; + + if (!smu->pm_enabled) + return -EOPNOTSUPP; + + mutex_lock(&smu->mutex); + + if (smu->ppt_funcs->mode1_reset) + ret = smu->ppt_funcs->mode1_reset(smu); + + mutex_unlock(&smu->mutex); + + return ret; +} + int smu_mode2_reset(struct smu_context *smu) { int ret = 0; diff --git a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h index 7b349e038972..ba59620950d7 100644 --- a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h +++ b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h @@ -561,6 +561,8 @@ struct pptable_funcs { int (*baco_set_state)(struct smu_context *smu, enum smu_baco_state state); int (*baco_enter)(struct smu_context *smu); int (*baco_exit)(struct smu_context *smu); + bool (*mode1_reset_is_support)(struct smu_context *smu); + int (*mode1_reset)(struct smu_context *smu); int (*mode2_reset)(struct smu_context *smu); int (*get_dpm_ultimate_freq)(struct smu_context *smu, enum smu_clk_type clk_type, uint32_t *min, uint32_t *max); int (*set_soft_freq_limited_range)(struct smu_context *smu, enum smu_clk_type clk_type, uint32_t min, uint32_t max); @@ -672,6 +674,8 @@ int smu_baco_get_state(struct smu_context *smu, enum smu_baco_state *state); int smu_baco_enter(struct smu_context *smu); int smu_baco_exit(struct smu_context *smu); +bool smu_mode1_reset_is_support(struct smu_context *smu); int +smu_mode1_reset(struct smu_context *smu); int smu_mode2_reset(struct smu_context *smu); extern int smu_get_atom_data_table(struct smu_context *smu, uint32_t table, diff --git a/drivers/gpu/drm/amd/powerplay/inc/smu_types.h b/drivers/gpu/drm/amd/powerplay/inc/smu_types.h index dff2295705be..7b585e205a5a 100644 --- a/drivers/gpu/drm/amd/powerplay/inc/smu_types.h +++ b/drivers/gpu/drm/amd/powerplay/inc/smu_types.h @@ -173,6 +173,7 @@ __SMU_DUMMY_MAP(GmiPwrDnControl), \ __SMU_DUMMY_MAP(DAL_DISABLE_DUMMY_PSTATE_CHANGE), \ __SMU_DUMMY_MAP(DAL_ENABLE_DUMMY_PSTATE_CHANGE), \ + __SMU_DUMMY_MAP(Mode1Reset), \ #undef __SMU_DUMMY_MAP #define __SMU_DUMMY_MAP(type) SMU_MSG_##type diff --git a/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h b/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h index d07bf4fe6e4a..38599112ae59 100644 --- a/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h +++ b/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h @@ -252,6 +252,8 @@ int smu_v11_0_baco_set_state(struct smu_context *smu, enum smu_baco_state state) int smu_v11_0_baco_enter(struct smu_context *smu); int smu_v11_0_baco_exit(struct smu_context *smu); +int smu_v11_0_mode1_reset(struct smu_context *smu); + int smu_v11_0_get_dpm_ultimate_freq(struct smu_context *smu, enum smu_clk_type clk_type, uint32_t *min, uint32_t *max); diff --git a/drivers/gpu/drm/amd/powerplay/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/powerplay/sienna_cichlid_ppt.c index dc5ca9121db5..319480550bb7 100644 --- a/drivers/gpu/drm/amd/powerplay/sienna_cichlid_ppt.c +++ b/drivers/gpu/drm/amd/powerplay/sienna_cichlid_ppt.c @@ -39,8 +39,8 @@ #include "nbio/nbio_2_3_sh_mask.h" #include "thm/thm_11_0_2_offset.h" #include "thm/thm_11_0_2_sh_mask.h" - -#include "asic_reg/mp/mp_11_0_sh_mask.h" +#include "mp/mp_11_0_offset.h" +#include "mp/mp_11_0_sh_mask.h" /* * DO NOT use these for err/warn/info/debug messages. @@ -116,6 +116,7 @@ static struct smu_11_0_cmn2aisc_mapping sienna_cichlid_message_map[SMU_MSG_MAX_C MSG_MAP(PowerDownJpeg, PPSMC_MSG_PowerDownJpeg), MSG_MAP(BacoAudioD3PME, PPSMC_MSG_BacoAudioD3PME), MSG_MAP(ArmD3, PPSMC_MSG_ArmD3), + MSG_MAP(Mode1Reset, PPSMC_MSG_Mode1Reset), }; static struct smu_11_0_cmn2aisc_mapping sienna_cichlid_clk_map[SMU_CLK_COUNT] = { @@ -1760,13 +1761,35 @@ static bool sienna_cichlid_is_baco_supported(struct smu_context *smu) struct amdgpu_device *adev = smu->adev; uint32_t val; - if (!smu_v11_0_baco_is_support(smu)) + if (amdgpu_sriov_vf(adev) || (!smu_v11_0_baco_is_support(smu))) return false; val = RREG32_SOC15(NBIO, 0, mmRCC_BIF_STRAP0); return (val & RCC_BIF_STRAP0__STRAP_PX_CAPABLE_MASK) ? true : false; } +static bool sienna_cichlid_is_mode1_reset_supported(struct smu_context +*smu) { + struct amdgpu_device *adev = smu->adev; + uint32_t val; + u32 smu_version; + + /** + * SRIOV env will not support SMU mode1 reset + * PM FW support mode1 reset from 58.26 + */ + smu_get_smc_version(smu, NULL, &smu_version); + if (amdgpu_sriov_vf(adev) || (smu_version < 0x003a1a00)) + return false; + + /** + * mode1 reset relies on PSP, so we should check if + * PSP is alive. + */ + val = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_81); + return val != 0x0; [Dennis Li] It is better to changed to use RREG32_SOC15_NO_KIQ, because when GPU hang, RREG32_SOC15 will fail if it use RREG32_KIQ to read register. +} + static int sienna_cichlid_set_thermal_range(struct smu_context *smu, struct smu_temperature_range range) { @@ -2538,6 +2561,8 @@ static const struct pptable_funcs sienna_cichlid_ppt_funcs = { .baco_set_state = smu_v11_0_baco_set_state, .baco_enter = smu_v11_0_baco_enter, .baco_exit = smu_v11_0_baco_exit, + .mode1_reset_is_support = sienna_cichlid_is_mode1_reset_supported, + .mode1_reset = smu_v11_0_mode1_reset, .get_dpm_ultimate_freq = sienna_cichlid_get_dpm_ultimate_freq, .set_soft_freq_limited_range = sienna_cichlid_set_soft_freq_limited_range, .override_pcie_parameters = smu_v11_0_override_pcie_parameters, diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c index 48e15885e9c3..c620dccb82e5 100644 --- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c +++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c @@ -63,6 +63,8 @@ MODULE_FIRMWARE("amdgpu/sienna_cichlid_smc.bin"); #define SMU11_VOLTAGE_SCALE 4 +#define SMU11_MODE1_RESET_WAIT_TIME 500 //500ms + static int smu_v11_0_send_msg_without_waiting(struct smu_context *smu, uint16_t msg) { @@ -1741,6 +1743,17 @@ int smu_v11_0_baco_exit(struct smu_context *smu) return ret; } +int smu_v11_0_mode1_reset(struct smu_context *smu) { + int ret = 0; + + ret = smu_send_smc_msg(smu, SMU_MSG_Mode1Reset, NULL); + if (!ret) + msleep(SMU11_MODE1_RESET_WAIT_TIME); + + return ret; +} + int smu_v11_0_get_dpm_ultimate_freq(struct smu_context *smu, enum smu_clk_type clk_type, uint32_t *min, uint32_t *max) { -- 2.17.1 _______________________________________________ amd-gfx mailing list amd-gfx@xxxxxxxxxxxxxxxxxxxxx https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=02%7C01%7CDennis.Li%40amd.com%7C5125fa5f71e245a34f4608d824dbec23%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637299874279933866&sdata=DDzKv9mwDCDUyGQUYfKDk4nM0kqt07Qt45Iyr6RQQQU%3D&reserved=0 _______________________________________________ amd-gfx mailing list amd-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/amd-gfx