Re: Trying to run AMD E9260 (Polaris 11) on NXP LS1012A-RDB

> The PCI Express controller as instantiated on this chip does not support hardware coherency. All incoming PCI Express transactions are made non IO-coherent.
>
> Would AMDGPU still work with that PCI Express controller, or is this a show-stopper?

I'm really wondering what this comment in the documentation means.

As far as I know, PCIe doesn't support cache coherency in the downstream direction, and supporting it in the upstream direction is a must-have.

So what exactly is meant by IO-coherent here?

Regards,
Christian.

On 10.01.19 at 11:55, Bas Vermeulen wrote:
Hi Alex,

I've managed to get a little further. I am currently running mainline (5.0.0-rc1) and am getting the errors below.
The LS1012A datasheet mentions in its PCI Express section that:

The PCI Express controller as instantiated on this chip does not support hardware coherency. All incoming PCI Express transactions are made non IO-coherent.

Would AMDGPU still work with that PCI Express controller, or is this a show-stopper?

[    5.727691] [drm] initializing kernel modesetting (POLARIS11 0x1002:0x67E8 0x1DA2:0xE362 0x80).
[    5.761767] [drm] register mmio base: 0x58000000
[    5.777272] [drm] register mmio size: 262144
[    5.825973] [drm] add ip block number 0 <vi_common>
[    5.832242] [drm] add ip block number 1 <gmc_v8_0>
[    5.837767] [drm] add ip block number 2 <tonga_ih>
[    5.843121] [drm] add ip block number 3 <gfx_v8_0>
[    5.848480] [drm] add ip block number 4 <sdma_v3_0>
[    5.853969] [drm] add ip block number 5 <powerplay>
[    5.859413] [drm] add ip block number 6 <dm>
[    5.864238] [drm] add ip block number 7 <uvd_v6_0>
[    5.869690] [drm] add ip block number 8 <vce_v3_0>
[    5.875067] [drm] UVD is enabled in VM mode
[    5.879858] [drm] UVD ENC is enabled in VM mode
[    5.884985] [drm] VCE enabled in VM mode
[    6.114020] ATOM BIOS: 113-C98511-U01
[    6.117757] [drm] GPU posting now...
[    6.238112] [drm] vm size is 64 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
[    6.247290] amdgpu 0000:01:00.0: BAR 2: releasing [mem 0x4050000000-0x40501fffff 64bit pref]
[    6.257354] amdgpu 0000:01:00.0: BAR 0: releasing [mem 0x4040000000-0x404fffffff 64bit pref]
[    6.267261] pcieport 0000:00:00.0: BAR 15: releasing [mem 0x4040000000-0x4057ffffff 64bit pref]
[    6.277288] pcieport 0000:00:00.0: BAR 15: no space for [mem size 0x300000000 64bit pref]
[    6.286843] pcieport 0000:00:00.0: BAR 15: failed to assign [mem size 0x300000000 64bit pref]
[    6.296708] amdgpu 0000:01:00.0: BAR 0: no space for [mem size 0x200000000 64bit pref]
[    6.306009] amdgpu 0000:01:00.0: BAR 0: failed to assign [mem size 0x200000000 64bit pref]
[    6.315635] amdgpu 0000:01:00.0: BAR 2: no space for [mem size 0x00200000 64bit pref]
[    6.324804] amdgpu 0000:01:00.0: BAR 2: failed to assign [mem size 0x00200000 64bit pref]
[    6.334366] pcieport 0000:00:00.0: PCI bridge to [bus 01-ff]
[    6.341323] pcieport 0000:00:00.0:   bridge window [io  0x1000-0x1fff]
[    6.349158] pcieport 0000:00:00.0:   bridge window [mem 0x4058000000-0x40580fffff]
[    6.358095] pcieport 0000:00:00.0: PCI bridge to [bus 01-ff]
[    6.365054] pcieport 0000:00:00.0:   bridge window [io  0x1000-0x1fff]
[    6.372981] pcieport 0000:00:00.0:   bridge window [mem 0x4058000000-0x40580fffff]
[    6.381917] pcieport 0000:00:00.0:   bridge window [mem 0x4040000000-0x4057ffffff 64bit pref]
[    6.391789] [drm] Not enough PCI address space for a large BAR.
[    6.391820] amdgpu 0000:01:00.0: BAR 0: assigned [mem 0x4040000000-0x404fffffff 64bit pref]
[    6.407776] amdgpu 0000:01:00.0: BAR 2: assigned [mem 0x4050000000-0x40501fffff 64bit pref]
[    6.417672] amdgpu 0000:01:00.0: VRAM: 8192M 0x000000F400000000 - 0x000000F5FFFFFFFF (8192M used)
[    6.426586] amdgpu 0000:01:00.0: GART: 256M 0x000000FF00000000 - 0x000000FF0FFFFFFF
[    6.436171] [drm] Detected VRAM RAM=8192M, BAR=256M
[    6.478952] [drm] GART: num cpu pages 65536, num gpu pages 65536
[    6.487871] [drm] PCIE GART of 256M enabled (table at 0x000000F400000000).
[    6.496316] amdgpu 0000:01:00.0: Direct firmware load for amdgpu/polaris11_pfp_2.bin failed with error -2
[    6.508078] amdgpu 0000:01:00.0: Direct firmware load for amdgpu/polaris11_me_2.bin failed with error -2
[    6.519538] amdgpu 0000:01:00.0: Direct firmware load for amdgpu/polaris11_ce_2.bin failed with error -2
[    6.531496] amdgpu 0000:01:00.0: Direct firmware load for amdgpu/polaris11_mec_2.bin failed with error -2
[    6.544214] amdgpu 0000:01:00.0: Direct firmware load for amdgpu/polaris11_mec2_2.bin failed with error -2
[    6.565121] [drm] Found UVD firmware Version: 1.79 Family ID: 16
[    6.571241] [drm] UVD ENC is disabled
[    6.580927] [drm] Found VCE firmware Version: 52.4 Binary ID: 3
[    6.656854] amdgpu 0000:01:00.0: GPU fault detected: 147 0x00004802 for process  pid 0 thread  pid 0
[    6.666013] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0017F000
[    6.673508] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x02048002
[    6.681006] amdgpu 0000:01:00.0: VM fault (0x02, vmid 1, pasid 0) at page 1568768, read from 'TC0' (0x54433000) (72)
[    6.691557] amdgpu 0000:01:00.0: GPU fault detected: 147 0x00004402 for process  pid 0 thread  pid 0
[    6.700706] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0017F000
[    6.708200] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x02048002
[    6.715697] amdgpu 0000:01:00.0: VM fault (0x02, vmid 1, pasid 0) at page 1568768, read from 'TC0' (0x54433000) (72)
[    6.726246] amdgpu 0000:01:00.0: GPU fault detected: 147 0x00000402 for process  pid 0 thread  pid 0
[    6.735395] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0017F000
[    6.742889] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x02048002
[    6.750386] amdgpu 0000:01:00.0: VM fault (0x02, vmid 1, pasid 0) at page 1568768, read from 'TC0' (0x54433000) (72)
[    6.984485] amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring gfx test failed (-110)
[    6.994562] [drm:amdgpu_device_init [amdgpu]] *ERROR* hw_init of IP block <gfx_v8_0> failed -110
[    7.004942] amdgpu 0000:01:00.0: amdgpu_device_ip_init failed
[    7.011963] amdgpu 0000:01:00.0: Fatal error during GPU init
[    7.018890] [drm] amdgpu: finishing device.
[    7.308898] WARNING: CPU: 0 PID: 2084 at /home/bas/linux/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:915 amdgpu_bo_unpin+0xe4/0x110 [amdgpu]
[    7.321451] Modules linked in: amdgpu(+) realtek chash gpu_sched ttm drm_kms_helper drm crct10dif_ce drm_panel_orientation_quirks pfe(C) ip_tables x_tables ipv6
[    7.335850] CPU: 0 PID: 2084 Comm: systemd-udevd Tainted: G         C        5.0.0-rc1-00001-g3bd6e94bec12-dirty #1
[    7.346303] Hardware name: LS1012A RDB Board (DT)
[    7.351014] pstate: 40000005 (nZcv daif -PAN -UAO)
[    7.356090] pc : amdgpu_bo_unpin+0xe4/0x110 [amdgpu]
[    7.361307] lr : amdgpu_bo_free_kernel+0x7c/0x148 [amdgpu]
[    7.366799] sp : ffff0000118c3730
[    7.370114] x29: ffff0000118c3730 x28: ffff000008e63710 
[    7.375434] x27: ffff0000118c3df0 x26: 0000000000000100 
[    7.380754] x25: ffff80003599cb60 x24: ffff8000359927d0 
[    7.386073] x23: ffff800035994750 x22: ffff8000359927d0 
[    7.391393] x21: ffff8000358cb800 x20: ffff0000111fd000 
[    7.396713] x19: ffff8000358cb800 x18: 000000000000001e 
[    7.402033] x17: 0000000000000000 x16: 0000000000000002 
[    7.407351] x15: 0000000000000400 x14: 0000000000000400 
[    7.412670] x13: 000000000000cb72 x12: 000000000000b308 
[    7.417989] x11: ffff7e0000d02488 x10: ffff800037798030 
[    7.423308] x9 : ffff0000118c36e4 x8 : 0000000000000000 
[    7.428627] x7 : 0000000000210d00 x6 : 0000000000000000 
[    7.433946] x5 : 0000000000000001 x4 : 0000000000000001 
[    7.439265] x3 : ffff8000358cb87c x2 : 0000000000000000 
[    7.444583] x1 : 0000000000000000 x0 : ffff8000358cb800 
[    7.449903] Call trace:
[    7.452583]  amdgpu_bo_unpin+0xe4/0x110 [amdgpu]
[    7.457437]  amdgpu_bo_free_kernel+0x7c/0x148 [amdgpu]
[    7.462812]  amdgpu_gfx_rlc_fini+0x50/0x78 [amdgpu]
[    7.467926]  gfx_v8_0_sw_fini+0xfc/0x1c0 [amdgpu]
[    7.472864]  amdgpu_device_fini+0x1e0/0x480 [amdgpu]
[    7.478058]  amdgpu_driver_unload_kms+0xa0/0x150 [amdgpu]
[    7.483686]  amdgpu_driver_load_kms+0x144/0x1f8 [amdgpu]
[    7.489110]  drm_dev_register+0x14c/0x1e0 [drm]
[    7.493906]  amdgpu_pci_probe+0xcc/0x188 [amdgpu]
[    7.498622]  local_pci_probe+0x3c/0xb0
[    7.502375]  pci_device_probe+0x150/0x1b8
[    7.506390]  really_probe+0x1f0/0x298
[    7.510055]  driver_probe_device+0x58/0x100
[    7.514242]  __driver_attach+0xd4/0xd8
[    7.517993]  bus_for_each_dev+0x74/0xc8
[    7.521832]  driver_attach+0x20/0x28
[    7.525410]  bus_add_driver+0x1ac/0x218
[    7.529250]  driver_register+0x60/0x110
[    7.533089]  __pci_register_driver+0x40/0x48
[    7.537599]  amdgpu_init+0x58/0x1000 [amdgpu]
[    7.541964]  do_one_initcall+0x5c/0x178
[    7.545804]  do_init_module+0x58/0x1b0
[    7.549556]  load_module+0x1dc8/0x2178
[    7.553308]  __se_sys_finit_module+0xbc/0xd0
[    7.557582]  __arm64_sys_finit_module+0x18/0x20
[    7.562119]  el0_svc_common+0x60/0x100
[    7.565872]  el0_svc_handler+0x2c/0x80
[    7.581715] amdgpu 0000:01:00.0: (____ptrval____) unpin not necessary
[    7.594743] [TTM] Finalizing pool allocator
[    7.599867] [TTM] Finalizing DMA pool allocator
[    7.605265] [TTM] Zone  kernel: Used memory at exit: 0 kiB
[    7.611482] [drm] amdgpu: ttm finalized
[    7.618028] amdgpu: probe of 0000:01:00.0 failed with error -110
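
The BAR messages in the log above are easier to read once the hex sizes are converted. The following is a minimal Python sketch that only does arithmetic on lines copied from that log; the helper names (`requested_sizes`, `range_sizes`) are made up for illustration and are not part of any kernel tooling.

```python
import re

# Lines copied verbatim (minus timestamps) from the log above.
LOG = """\
pcieport 0000:00:00.0: BAR 15: no space for [mem size 0x300000000 64bit pref]
amdgpu 0000:01:00.0: BAR 0: no space for [mem size 0x200000000 64bit pref]
pcieport 0000:00:00.0:   bridge window [mem 0x4040000000-0x4057ffffff 64bit pref]
amdgpu 0000:01:00.0: BAR 0: assigned [mem 0x4040000000-0x404fffffff 64bit pref]
amdgpu 0000:01:00.0: BAR 2: assigned [mem 0x4050000000-0x40501fffff 64bit pref]
"""

MIB = 1 << 20


def requested_sizes(log):
    """Sizes in bytes of every '[mem size 0x...]' request in the log text."""
    return [int(s, 16) for s in re.findall(r"\[mem size (0x[0-9a-f]+)", log)]


def range_sizes(log):
    """Sizes in bytes of every '[mem 0xSTART-0xEND' range (END is inclusive)."""
    pairs = re.findall(r"\[mem (0x[0-9a-f]+)-(0x[0-9a-f]+)", log)
    return [int(end, 16) - int(start, 16) + 1 for start, end in pairs]


if __name__ == "__main__":
    for size in requested_sizes(LOG):
        print(f"requested: {size // MIB} MiB")
    for size in range_sizes(LOG):
        print(f"assigned/window: {size // MIB} MiB")
```

Read this way, the resize attempt asks for an 8192 MiB BAR 0 (the full VRAM) behind a 12288 MiB bridge window, while the prefetchable window actually available is 384 MiB. The driver therefore falls back to the original 256 MiB BAR 0 and 2 MiB BAR 2, which matches the "Detected VRAM RAM=8192M, BAR=256M" line.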

Bas Vermeulen
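
A few fields in the failure lines of the log above can be decoded mechanically. The sketch below is only an illustration: the function names are hypothetical, the errno table lists just the two Linux values that appear in the log, and the 4 KiB GPU page size is an assumption inferred from the "GART: 256M" and "num gpu pages 65536" lines.

```python
# Linux errno values for the two codes seen in the log (-2 and -110).
LINUX_ERRNO = {2: "ENOENT", 110: "ETIMEDOUT"}

# Assumption: 256 MiB of GART split into 65536 GPU pages gives 4 KiB pages.
GPU_PAGE_SHIFT = 12


def errno_name(code):
    """Map a negative kernel return code such as -110 to its errno name."""
    return LINUX_ERRNO.get(-code, "unknown")


def client_tag(value):
    """Decode the 32-bit client field as big-endian ASCII, dropping NUL padding."""
    return value.to_bytes(4, "big").rstrip(b"\0").decode("ascii")


def page_to_byte_address(page, shift=GPU_PAGE_SHIFT):
    """Convert a GPU page number to a byte address in the GPU virtual space."""
    return page << shift


if __name__ == "__main__":
    print(errno_name(-2))                          # firmware files not found
    print(errno_name(-110))                        # ring test timed out
    print(client_tag(0x54433000))                  # the 'TC0' tag in the fault line
    print(hex(page_to_byte_address(0x0017F000)))   # faulting address in bytes
```

The value in `VM_CONTEXT1_PROTECTION_FAULT_ADDR` (0x0017F000) is the same number as the "page 1568768" in the VM fault line, so the register holds a page number rather than a byte address. The -2 results on the `_2.bin` firmware loads mean those files were not found, and the -110 on the gfx ring test is a timeout, which is what finally fails the probe.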


On Thu, Dec 20, 2018 at 5:11 PM Bas Vermeulen <bas@xxxxxxxxxxxx> wrote:
Hi Alex,

I already have a similar patch in, but it doesn't fix the problem yet. I'll investigate some more.

Bas Vermeulen

On Thu, Dec 20, 2018 at 4:27 PM Alex Deucher <alexdeucher@xxxxxxxxx> wrote:
On Thu, Dec 20, 2018 at 9:06 AM Bas Vermeulen <bas@xxxxxxxxxxxx> wrote:
>
> Hi all,
>
> I have connected an E9260 (Polaris 11 based) to a mini-PCIe slot on my NXP LS1012ARDB.
> The GPU is seen, all the BARs are correctly assigned (but there's not enough PCIe memory space for the big BARs).
>
> When I try to load the amdgpu module, I can't get the driver to enable the acceleration (the scratch register check fails).
>
> The E9260 is connected to a PCIe x1 (Gen2) slot.
>
> Does anyone have an idea where to look or how to fix this? This is a test bed before we get an LS1046ARDB with quad-core A72s.
>
> Any help would be appreciated,

Something like this patch should fix it, assuming this is an ARM-based platform:
https://patchwork.freedesktop.org/patch/269367/

Alex

>
> Bas Vermeulen
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

