On Wed, Aug 28, 2024 at 7:28 AM WangYuli <wangyuli@xxxxxxxxxxxxx> wrote:
>
> From: wenlunpeng <wenlunpeng@xxxxxxxxxxxxx>
>
> The quirk is for reboot-stability.
>
> A device reboot stress test has been observed to cause
> random system hangs when amdgpu_dpm is enabled.
>
> Disabling amdgpu_dpm can fix this.
>
> However, a boot-param can still overwrite it to enable
> amdgpu_dpm.
>
> Serial log when error occurs:
> ...
> Console: switching to colour frame buffer device 160x45
> amdgpu 0000:01:00.0: fb0: amdgpudrmfb frame buffer device
> [drm:amdgpu_device_ip_late_init] *ERROR* late_init of IP block <si_dpm> failed -22
> amdgpu 0000:01:00.0: amdgpu_device_ip_late_init failed
> amdgpu 0000:01:00.0: Fatal error during GPU init
> [drm] amdgpu: finishing device.
> Console: switching to colour dummy device 80x25
> ...
>
> Signed-off-by: wenlunpeng <wenlunpeng@xxxxxxxxxxxxx>
> Signed-off-by: WangYuli <wangyuli@xxxxxxxxxxxxx>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 094498a0964b..81716fcac7cd 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -32,6 +32,7 @@
>  #include <drm/drm_vblank.h>
>
>  #include <linux/cc_platform.h>
> +#include <linux/dmi.h>
>  #include <linux/dynamic_debug.h>
>  #include <linux/module.h>
>  #include <linux/mmu_notifier.h>
> @@ -3023,10 +3024,32 @@ static struct pci_driver amdgpu_kms_pci_driver = {
>  	.dev_groups = amdgpu_sysfs_groups,
>  };
>
> +static int quirk_set_amdgpu_dpm_0(const struct dmi_system_id *dmi)
> +{
> +	amdgpu_dpm = 0;

This will disable dpm on all devices that you might install on this
platform. If this is specific to a particular platform and board
combination, it might be better to check the platform in the
dpm_init() code for the specific chip that is problematic.
Additionally, disabling dpm will result in boot clocks, which means
performance will be very low.

Alex

> +	pr_info("Identified '%s', set amdgpu_dpm to 0.\n", dmi->ident);
> +	return 1;
> +}
> +
> +static const struct dmi_system_id amdgpu_quirklist[] = {
> +	{
> +		.ident = "DS25 Desktop",
> +		.matches = {
> +			DMI_MATCH(DMI_BOARD_NAME, "THTF-SW831-1W-DS25_MB"),
> +		},
> +		.callback = quirk_set_amdgpu_dpm_0,
> +	},
> +	{}
> +};
> +
>  static int __init amdgpu_init(void)
>  {
>  	int r;
>
> +	/* quirks for some hardware, applied only when it's untouched */
> +	if (amdgpu_dpm == -1)
> +		dmi_check_system(amdgpu_quirklist);
> +
>  	if (drm_firmware_drivers_only())
>  		return -EINVAL;
>
> --
> 2.43.4
>
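
To illustrate the review comment above, here is a rough userspace sketch of
scoping the quirk to a board-plus-chip pair instead of clearing amdgpu_dpm
for everything on the platform. It is not driver code. The enum values, the
struct dpm_quirk type and the dpm_enabled() helper are all hypothetical, and
the thread does not say which chip is affected, so CHIP_AFFECTED is only a
placeholder. The -1 / 0 / 1 handling follows the "applied only when it's
untouched" check in the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical chip identifiers; the thread does not name the real chip. */
enum chip_id {
	CHIP_AFFECTED,	/* placeholder for the chip that hangs on reboot */
	CHIP_OTHER,	/* any other GPU installed in the same board */
};

/* Hypothetical quirk entry: a board name paired with one specific chip. */
struct dpm_quirk {
	const char *board_name;
	enum chip_id chip;
};

static const struct dpm_quirk dpm_quirks[] = {
	{ "THTF-SW831-1W-DS25_MB", CHIP_AFFECTED },
};

/*
 * Decide whether dpm should be enabled for one device.
 * user_dpm: -1 = not set by the user, 0 = forced off, 1 = forced on.
 * An explicit user setting always wins; the quirk applies only to the
 * matching board and chip combination.
 */
static bool dpm_enabled(int user_dpm, const char *board_name, enum chip_id chip)
{
	size_t i;

	if (user_dpm != -1)
		return user_dpm != 0;

	for (i = 0; i < sizeof(dpm_quirks) / sizeof(dpm_quirks[0]); i++) {
		if (chip == dpm_quirks[i].chip &&
		    strcmp(board_name, dpm_quirks[i].board_name) == 0)
			return false;
	}

	return true;
}
```

With this shape, a different GPU plugged into the same DS25 board keeps dpm
(and therefore normal clocks), while the affected chip on that board falls
back to boot clocks unless the user explicitly asks for dpm.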