Hi Russell, This patch will read vbios, and parse vbios to get the baco reset feature bit. From the call trace, it shows error in " amdgpu_get_bios ", but this patch don't modify amdgpu_get_bios, and code before amdgpu_get_bios. Please first check why it will has error when read vbios. Best wishes Emily Deng >-----Original Message----- >From: Russell, Kent <Kent.Russell@xxxxxxx> >Sent: Thursday, June 13, 2019 7:11 PM >To: Quan, Evan <Evan.Quan@xxxxxxx>; Deng, Emily ><Emily.Deng@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx >Cc: Deng, Emily <Emily.Deng@xxxxxxx> >Subject: RE: [PATCH] drm/amdgpu: Need to set the baco cap before baco reset > >Hi Emily, > >This patch caused a regression on MI25 (Chip 6860, VBIOS 113-D0513700-001) >machines where the driver would not boot. Note that this was not seen on >Vega10 Frontier (Chip 6863, VBIOS 113-D0501100-X09) or Radeon64 (Chip >697f). Reverting this patch resolved the issue with no other work required and >was confirmed on all 3 machines. > >Here is the dmesg: > >[ 3.908653] amdgpu 0000:23:00.0: BAR 6: can't assign [??? 0x00000000 flags >0x20000000] (bogus alignment) >[ 3.908692] BUG: kernel NULL pointer dereference, address: >0000000000000008 >[ 3.908716] #PF: supervisor read access in kernel mode >[ 3.908734] #PF: error_code(0x0000) - not-present page >[ 3.908753] PGD 0 P4D 0 >[ 3.908767] Oops: 0000 [#1] SMP NOPTI >[ 3.909293] CPU: 8 PID: 409 Comm: kworker/8:1 Not tainted 5.2.0-rc1-kfd- >compute-roc-master-10734 #1 >[ 3.909753] Hardware name: Inventec P47 WC2071019001 >/P47 , BIOS 0.64 04/09/2018 >[ 3.910534] Workqueue: events work_for_cpu_fn >[ 3.910953] RIP: 0010:amdgpu_get_bios+0x3aa/0x580 [amdgpu] >[ 3.911314] Code: c0 48 c7 44 24 5c 00 00 00 00 48 c7 84 24 90 00 00 00 00 00 >00 00 48 89 d9 48 29 f9 83 c1 3c c1 e9 03 f3 48 ab 49 8b 44 24 38 <48> 8b 40 08 >48 85 c0 74 24 ba 3c 00 00 00 48 89 de 4c 89 e7 ff d0 >[ 3.912069] RSP: 0018:ffffa27dce28fc50 EFLAGS: 00010212 >[ 3.912502] RAX: 0000000000000000 RBX: ffffa27dce28fcac RCX: >0000000000000000 >[ 3.912980] RDX: 0000000000000000 RSI: 0000000000000082 RDI: >ffffa27dce28fce8 >[ 3.913467] RBP: 0000000000000000 R08: 0000000000000001 R09: >000000000000079a >[ 3.913940] R10: 0000000000000000 R11: 0000000000000001 R12: >ffff88d657af0000 >[ 3.914349] R13: ffffffffc0c38120 R14: ffffa27dce28fc68 R15: >ffff88d657af0000 >[ 3.914767] FS: 0000000000000000(0000) GS:ffff88d65f400000(0000) >knlGS:0000000000000000 >[ 3.915203] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >[ 3.915637] CR2: 0000000000000008 CR3: 0000003e7540a000 CR4: >00000000003406e0 >[ 3.916075] Call Trace: >[ 3.916522] ? pcie_capability_clear_and_set_word+0x53/0x80 >[ 3.917014] amdgpu_device_init+0x923/0x1820 [amdgpu] >[ 3.917515] amdgpu_driver_load_kms+0x71/0x310 [amdgpu] >[ 3.917997] drm_dev_register+0x113/0x1a0 [drm] >[ 3.918514] amdgpu_pci_probe+0xb8/0x150 [amdgpu] >[ 3.919003] ? __pm_runtime_resume+0x54/0x70 >[ 3.919270] usb 1-2: New USB device found, idVendor=14dd, idProduct=1005, >bcdDevice= 0.00 >[ 3.919498] local_pci_probe+0x3d/0x90 >[ 3.919503] ? __schedule+0x3de/0x690 >[ 3.920374] usb 1-2: New USB device strings: Mfr=1, Product=2, >SerialNumber=3 >[ 3.921137] work_for_cpu_fn+0x10/0x20 >[ 3.922028] usb 1-2: Product: D2CIM-VUSB >[ 3.922815] process_one_work+0x159/0x3e0 >[ 3.923633] usb 1-2: Manufacturer: Raritan >[ 3.923635] usb 1-2: SerialNumber: EFFB212D0A6E32F >[ 3.924416] worker_thread+0x22b/0x440 >[ 3.924419] ? rescuer_thread+0x350/0x350 >[ 3.927812] kthread+0xf6/0x130 >[ 3.928157] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) >[ 3.928365] ? kthread_destroy_worker+0x40/0x40 >[ 3.929401] ata1.00: ATA-10: INTEL SSDSC2KG960G7, SCV10100, max >UDMA/133 >[ 3.930101] ret_from_fork+0x1f/0x30 >[ 3.930103] Modules linked in: amdgpu(+) crct10dif_pclmul crc32_pclmul >ghash_clmulni_intel ast amd_iommu_v2 aesni_intel gpu_sched i2c_algo_bit >aes_x86_64 ttm crypto_simd drm_kms_helper cryptd glue_helper >syscopyarea sysfillrect ahci sysimgblt libahci fb_sys_fops ixgbe(+) drm dca >mdio >[ 3.930965] ata1.00: 1875385008 sectors, multi 1: LBA48 NCQ (depth 32) >[ 3.931085] ata1.00: configured for UDMA/133 >[ 3.931809] CR2: 0000000000000008 >[ 3.934723] scsi 0:0:0:0: Direct-Access ATA INTEL SSDSC2KG96 0100 PQ: >0 ANSI: 5 > > >Thanks! > > Kent > >-----Original Message----- >From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Quan, >Evan >Sent: Monday, May 27, 2019 9:17 PM >To: Deng, Emily <Emily.Deng@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx >Cc: Deng, Emily <Emily.Deng@xxxxxxx> >Subject: RE: [PATCH] drm/amdgpu: Need to set the baco cap before baco reset > >Reviewed-by: Evan Quan <evan.quan@xxxxxxx> > >> -----Original Message----- >> From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of >> Emily Deng >> Sent: Monday, May 27, 2019 3:43 PM >> To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx >> Cc: Deng, Emily <Emily.Deng@xxxxxxx> >> Subject: [PATCH] drm/amdgpu: Need to set the baco cap before baco >> reset >> >> For passthrough, after rebooted the VM, driver will do a baco reset >> before doing other driver initialization during loading driver. For >> doing the baco reset, it will first check the baco reset capability. >> So first need to set the cap from the vbios information or baco reset won't >be enabled. >> >> Signed-off-by: Emily Deng <Emily.Deng@xxxxxxx> >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 24 ++++++++++----- >> ------- >> drivers/gpu/drm/amd/amdgpu/soc15.c | 3 ++- >> drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 4 ++++ >> .../amd/powerplay/hwmgr/vega10_processpptables.c | 24 >> ++++++++++++++++++++++ >> .../amd/powerplay/hwmgr/vega10_processpptables.h | 1 + >> 5 files changed, 42 insertions(+), 14 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> index b6ded84..7a8c220 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> @@ -1541,6 +1541,17 @@ static int amdgpu_device_ip_early_init(struct >> amdgpu_device *adev) >> if (amdgpu_sriov_vf(adev)) >> adev->pm.pp_feature &= ~PP_GFXOFF_MASK; >> >> + /* Read BIOS */ >> + if (!amdgpu_get_bios(adev)) >> + return -EINVAL; >> + >> + r = amdgpu_atombios_init(adev); >> + if (r) { >> + dev_err(adev->dev, "amdgpu_atombios_init failed\n"); >> + amdgpu_vf_error_put(adev, >> AMDGIM_ERROR_VF_ATOMBIOS_INIT_FAIL, 0, 0); >> + return r; >> + } >> + >> for (i = 0; i < adev->num_ip_blocks; i++) { >> if ((amdgpu_ip_block_mask & (1 << i)) == 0) { >> DRM_ERROR("disabled ip block: %d <%s>\n", @@ - >> 2591,19 +2602,6 @@ int amdgpu_device_init(struct amdgpu_device *adev, >> goto fence_driver_init; >> } >> >> - /* Read BIOS */ >> - if (!amdgpu_get_bios(adev)) { >> - r = -EINVAL; >> - goto failed; >> - } >> - >> - r = amdgpu_atombios_init(adev); >> - if (r) { >> - dev_err(adev->dev, "amdgpu_atombios_init failed\n"); >> - amdgpu_vf_error_put(adev, >> AMDGIM_ERROR_VF_ATOMBIOS_INIT_FAIL, 0, 0); >> - goto failed; >> - } >> - >> /* detect if we are with an SRIOV vbios */ >> amdgpu_device_detect_sriov_bios(adev); >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c >> b/drivers/gpu/drm/amd/amdgpu/soc15.c >> index 78bd4fc..d9fdd95 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c >> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c >> @@ -764,7 +764,8 @@ static bool soc15_need_reset_on_init(struct >> amdgpu_device *adev) >> /* Just return false for soc15 GPUs. Reset does not seem to >> * be necessary. >> */ >> - return false; >> + if (!amdgpu_passthrough(adev)) >> + return false; >> >> if (adev->flags & AMD_IS_APU) >> return false; >> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c >> b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c >> index ce6aeb5..1d9bb29 100644 >> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c >> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c >> @@ -5311,8 +5311,12 @@ static const struct pp_hwmgr_func >> vega10_hwmgr_funcs = { >> >> int vega10_hwmgr_init(struct pp_hwmgr *hwmgr) { >> + struct amdgpu_device *adev = hwmgr->adev; >> + >> hwmgr->hwmgr_func = &vega10_hwmgr_funcs; >> hwmgr->pptable_func = &vega10_pptable_funcs; >> + if (amdgpu_passthrough(adev)) >> + return vega10_baco_set_cap(hwmgr); >> >> return 0; >> } >> diff --git >> a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_processpptables.c >> b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_processpptables.c >> index b6767d7..83d22cd 100644 >> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_processpptables.c >> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_processpptables.c >> @@ -1371,3 +1371,27 @@ int vega10_get_powerplay_table_entry(struct >> pp_hwmgr *hwmgr, >> >> return result; >> } >> + >> +int vega10_baco_set_cap(struct pp_hwmgr *hwmgr) { >> + int result = 0; >> + >> + const ATOM_Vega10_POWERPLAYTABLE *powerplay_table; >> + >> + powerplay_table = get_powerplay_table(hwmgr); >> + >> + PP_ASSERT_WITH_CODE((powerplay_table != NULL), >> + "Missing PowerPlay Table!", return -1); >> + >> + result = check_powerplay_tables(hwmgr, powerplay_table); >> + >> + PP_ASSERT_WITH_CODE((result == 0), >> + "check_powerplay_tables failed", return result); >> + >> + set_hw_cap( >> + hwmgr, >> + 0 != (le32_to_cpu(powerplay_table->ulPlatformCaps) >> & ATOM_VEGA10_PP_PLATFORM_CAP_BACO), >> + PHM_PlatformCaps_BACO); >> + return result; >> +} >> + >> diff --git >> a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_processpptables.h >> b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_processpptables.h >> index d83ed2a..da5fbec 100644 >> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_processpptables.h >> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_processpptables.h >> @@ -59,4 +59,5 @@ extern int >> vega10_get_number_of_powerplay_table_entries(struct pp_hwmgr >*hwmgr); >> extern int vega10_get_powerplay_table_entry(struct pp_hwmgr *hwmgr, >> uint32_t entry_index, >> struct pp_power_state *power_state, int >(*call_back_func)(struct >> pp_hwmgr *, void *, >> struct pp_power_state *, void *, uint32_t)); >> +extern int vega10_baco_set_cap(struct pp_hwmgr *hwmgr); >> #endif >> -- >> 2.7.4 >> >> _______________________________________________ >> amd-gfx mailing list >> amd-gfx@xxxxxxxxxxxxxxxxxxxxx >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx >_______________________________________________ >amd-gfx mailing list >amd-gfx@xxxxxxxxxxxxxxxxxxxxx >https://lists.freedesktop.org/mailman/listinfo/amd-gfx _______________________________________________ amd-gfx mailing list amd-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/amd-gfx