Hi Alex,

Thanks for your prompt response.

On Mon, Feb 24 2025, Alex Deucher wrote:
> On Mon, Feb 24, 2025 at 8:51 AM Baruch Siach <baruch@xxxxxxxxxx> wrote:
>> I see this failure on probe when trying to bring up amdgpu on a new arm64
>> platform. Kernel is v6.14-rc4, and aldebaran firmware is latest
>> (linux-firmware commit 4f47e84d06f9).
>>
>> Tested with these kernel command line parameters:
>>
>> amdgpu.vm_size=1 amdgpu.msi=1 amdgpu.gartsize=32 amdgpu.vramlimit=32 amdgpu.gttsize=32
>
> Why are you setting those? Does the driver load ok if you do not
> specify those driver options?

Driver probe fails the same way with or without these options. These options
were meant to fit the driver into limited address space.

>> I guess the "CP firmware version" warning is bogus. IP version for GC_HWIP is
>> 9.4.2.
>>
>> Any idea?
>
> Potentially the driver parameters combination is causing a problem, or
> your ARM SoC may not be PCIe compliant. A lot of small SoC's just
> throw a PCIe bridge on the SoC without proper coherency in place
> between the CPU and the PCIe bus. PCIe requires coherency with the
> CPU (i.e., the device can snoop the CPU's cache).

I'll take a look into PCIe coherency in this SoC. Thanks for the hint.
baruch

>> Relevant log snippets follows:
>>
>> [ 1.792949] pci 0000:05:00.0: [1002:740f] type 00 class 0x038000 PCIe Endpoint
>> [ 1.800652] pci 0000:05:00.0: BAR 0 [mem 0x00000000-0xfffffffff 64bit pref]
>> [ 1.807629] pci 0000:05:00.0: BAR 2 [mem 0x00000000-0x001fffff 64bit pref]
>> [ 1.814506] pci 0000:05:00.0: BAR 4 [io 0x0000-0x00ff]
>> [ 1.819729] pci 0000:05:00.0: BAR 5 [mem 0x00000000-0x0007ffff]
>> [ 1.825647] pci 0000:05:00.0: ROM [mem 0x00000000-0x0001ffff pref]
>> [ 1.833297] pci 0000:05:00.0: PME# supported from D1 D2 D3hot D3cold
>> [ 1.840118] pci 0000:05:00.0: 126.024 Gb/s available PCIe bandwidth, limited by 16.0 GT/s PCIe x8 link at 0000:02:00.0 (capable of 252.048 Gb/s with 16.0 GT/s PCIe x16 link)
>> [ 1.857150] pci_bus 0000:05: busn_res: [bus 05-ff] end is updated to 05
>> ...
>> [ 2.615336] pci 0000:05:00.0: BAR 0 [mem 0x1000000000-0x1fffffffff 64bit pref]: assigned
>> [ 2.623529] pci 0000:05:00.0: BAR 2 [mem 0x2000000000-0x20001fffff 64bit pref]: assigned
>> [ 2.631720] pci 0000:05:00.0: BAR 5 [mem 0x5d000000-0x5d07ffff]: assigned
>> [ 2.638544] pci 0000:05:00.0: ROM [mem 0x5d080000-0x5d09ffff pref]: assigned
>> [ 2.645583] pci 0000:05:00.0: BAR 4 [io size 0x0100]: can't assign; no space
>> [ 2.652707] pci 0000:05:00.0: BAR 4 [io size 0x0100]: failed to assign
>> ...
>> [ 3.153154] amdgpu 0000:05:00.0: enabling device (0000 -> 0002)
>> [ 3.159112] [drm] initializing kernel modesetting (ALDEBARAN 0x1002:0x740F 0x1002:0x0C34 0x02).
>> [ 3.167817] [drm] register mmio base: 0x5D000000
>> [ 3.172425] [drm] register mmio size: 524288
>> [ 3.176775] amdgpu 0000:05:00.0: amdgpu: detected ip block number 0 <soc15_common>
>> [ 3.184341] amdgpu 0000:05:00.0: amdgpu: detected ip block number 1 <gmc_v9_0>
>> [ 3.191558] amdgpu 0000:05:00.0: amdgpu: detected ip block number 2 <vega20_ih>
>> [ 3.198858] amdgpu 0000:05:00.0: amdgpu: detected ip block number 3 <psp>
>> [ 3.205639] amdgpu 0000:05:00.0: amdgpu: detected ip block number 4 <smu>
>> [ 3.212421] amdgpu 0000:05:00.0: amdgpu: detected ip block number 5 <gfx_v9_0>
>> [ 3.219635] amdgpu 0000:05:00.0: amdgpu: detected ip block number 6 <sdma_v4_0>
>> [ 3.226935] amdgpu 0000:05:00.0: amdgpu: detected ip block number 7 <vcn_v2_6>
>> [ 3.234149] amdgpu 0000:05:00.0: amdgpu: detected ip block number 8 <jpeg_v2_6>
>> [ 3.247351] amdgpu 0000:05:00.0: amdgpu: Fetched VBIOS from ROM BAR
>> [ 3.253626] amdgpu: ATOM BIOS: 113-D67301V-073
>> [ 3.259731] [drm] CP firmware version too old, please update!
>> [ 3.260400] amdgpu 0000:05:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
>> [ 3.274294] amdgpu 0000:05:00.0: amdgpu: PCIE atomic ops is not supported
>> [ 3.281115] amdgpu 0000:05:00.0: amdgpu: MEM ECC is active.
>> [ 3.286679] amdgpu 0000:05:00.0: amdgpu: SRAM ECC is active.
>> [ 3.292351] amdgpu 0000:05:00.0: amdgpu: RAS INFO: ras initialized successfully, hardware ability[7ff7f] ras_mask[7ff7f]
>> [ 3.303232] [drm] vm size is 1 GB, 2 levels, block size is 9-bit, fragment size is 9-bit
>> [ 3.311338] amdgpu 0000:05:00.0: amdgpu: VRAM: 65520M 0x0000020000000000 - 0x0000020FFEFFFFFF (32M used)
>> [ 3.320811] amdgpu 0000:05:00.0: amdgpu: GART: 32M 0x0000000000000000 - 0x0000000001FFFFFF
>> [ 3.329070] [drm] Detected VRAM RAM=65520M, BAR=65536M
>> [ 3.334199] [drm] RAM width 4096bits HBM
>> [ 3.338251] [drm] amdgpu: 32M of VRAM memory ready
>> [ 3.343039] [drm] amdgpu: 32M of GTT memory ready.
>> [ 3.347861] [drm] GART: num cpu pages 8192, num gpu pages 8192
>> [ 3.353779] [drm] PCIE GART of 32M enabled.
>> [ 3.357955] [drm] PTB located at 0x0000020001FF0000
>> [ 3.365901] [drm] Found VCN firmware Version ENC: 1.1 DEC: 1 VEP: 0 Revision: 28
>> [ 3.432199] amdgpu 0000:05:00.0: amdgpu: reserve 0x800000 from 0x20001000000 for PSP TMR
>> [ 3.504497] amdgpu 0000:05:00.0: amdgpu: smu driver if version = 0x00000008, smu fw if version = 0x00000009, smu fw program = 0, smu fw version = 0x00443f00 (68.63.0)
>> [ 3.519356] amdgpu 0000:05:00.0: amdgpu: SMU driver if version not matched
>> [ 3.526265] amdgpu 0000:05:00.0: amdgpu: use vbios provided pptable
>> [ 3.532523] amdgpu 0000:05:00.0: amdgpu: smc_dpm_info table revision(format.content): 4.10
>> [ 3.560964] amdgpu 0000:05:00.0: amdgpu: SMU is initialized successfully!
>> [ 3.568167] [drm] kiq ring mec 2 pipe 1 q 0
>> [ 3.785160] amdgpu 0000:05:00.0: [drm:amdgpu_ring_test_helper] *ERROR* ring kiq_0.2.1.0 test failed (-110)
>> [ 3.794825] [drm:amdgpu_gfx_enable_kcq] *ERROR* KCQ enable failed
>> [ 3.800929] [drm:amdgpu_device_init] *ERROR* hw_init of IP block <gfx_v9_0> failed -110
>> [ 3.808929] amdgpu 0000:05:00.0: amdgpu: amdgpu_device_ip_init failed
>> [ 3.815361] amdgpu 0000:05:00.0: amdgpu: Fatal error during GPU init
>> [ 3.821705] amdgpu 0000:05:00.0: amdgpu: amdgpu: finishing device.
>>
>> Thanks,
>> baruch
>>
>> --
>> ~. .~ Tk Open Systems
>> =}------------------------------------------------ooO--U--Ooo------------{=
>> - baruch@xxxxxxxxxx - tel: +972.52.368.4656, http://www.tkos.co.il -

--
~. .~ Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
- baruch@xxxxxxxxxx - tel: +972.52.368.4656, http://www.tkos.co.il -