Hello,

Context: trying to understand what happens with my Renoir passed through to a Xen domU [0] (starting with "VCN disabled" because I don't need it now (so let's postpone the problem with its _fini) and with "PSP disabled" because the alternative issue seems easier to solve, so ip_block_mask=0xF7).

I'm slowed down by a number of additional terms:

* KIQ: we have the acronym, but a few more words about it would be great: it seems to relate to a ring buffer provided by the GFX IP, but the name does not tell me much (e.g. it tells me less than the names of the "gfx" and "compute" ones)
* "me", "mec" = ? In some places at least, "me" stands for "micro engine", but what are those? A "mec" contains pipes, which contain queues. And in amdgpu_ring the "me" field seems to identify a "mec"
* "mes" looks like an IP/block family rather than the plural of "me". A specific list of those IPs / hw blocks would be useful (maybe with a diagram showing how they interact, much like what Rodrigo started for the DC pipeline, but a first components/subcomponents diagram would probably be helpful)
* RLC? Looks like a "micro engine" inside the GFX IPs?
* one starting point for enhancing the doc would be amdgpu.h, where a number of acronyms used in structs are not self-explanatory: IB, SS, CP, ACP, CAC, HPD, ...

Do we have somewhere a description of what the hardware expects to find in those queues?

About amdgpu_gfx_enable_kcq():

- Isn't the `DRM_INFO("kiq ring mec %d pipe %d q %d\n"` line rather meant as DRM_DEBUG?
- An error from amdgpu_ring_alloc() is reported as "failed to lock", but looks like "failed to allocate space on ring"?

amdgpu_ring_alloc() itself unconditionally sets count_dw, which looked suspicious to me, so I added the check shown below, and it does look like ring_alloc() gets called again too soon. Am I right in thinking this could be the cause of amdgpu_ring_test_helper() timing out?

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -70,6 +70,9 @@ int amdgpu_ring_alloc(struct amdgpu_ring *ring, unsigned ndw)
 	if (WARN_ON_ONCE(ndw > ring->max_dw))
 		return -ENOMEM;
 
+	/* check we're not allocating too fast */
+	WARN_ON_ONCE(ring->count_dw);
+
 	ring->count_dw = ndw;
 	ring->wptr_old = ring->wptr;

About gfx_v9_0_sw_fini():

- the 2 calls to bo_free are called here without condition, whereas they are allocated from rlc_init, not directly from sw_init. Is this asymmetry wanted?

Maybe such info should join the documentation at some point?

[0] https://lists.freedesktop.org/archives/amd-gfx/2021-November/071855.html
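
PS, on the count_dw check in amdgpu_ring_alloc() above: below is a toy userspace model of the bookkeeping as I understand it, not driver code. All toy_* names are hypothetical, and it assumes that count_dw is only decremented when a dword is written and is not reset on commit (my reading, to be confirmed). Under that assumption, a non-zero count_dw at the next alloc could also just mean that the previous caller reserved more dwords than it wrote, rather than that alloc was called "too soon".

```c
#include <assert.h>
#include <stdint.h>

#define TOY_RING_SIZE_DW 64u /* arbitrary power of two for the toy buffer */

/* Hypothetical stand-in for the ring bookkeeping; NOT struct amdgpu_ring. */
struct toy_ring {
	uint32_t buf[TOY_RING_SIZE_DW];
	unsigned wptr;
	unsigned wptr_old;
	int count_dw;           /* dwords reserved but not yet written */
	unsigned leftover_hits; /* times alloc saw a non-zero count_dw */
};

/* Reserve ndw dwords, recording whether a previous reservation was left over. */
static void toy_ring_alloc(struct toy_ring *ring, unsigned ndw)
{
	assert(ndw <= TOY_RING_SIZE_DW);

	/* plays the role of the WARN_ON_ONCE(ring->count_dw) from the patch */
	if (ring->count_dw != 0)
		ring->leftover_hits++;

	ring->count_dw = (int)ndw;
	ring->wptr_old = ring->wptr;
}

/* Write one dword and consume one unit of the reservation (assumed behaviour). */
static void toy_ring_write(struct toy_ring *ring, uint32_t v)
{
	ring->buf[ring->wptr++ & (TOY_RING_SIZE_DW - 1)] = v;
	ring->count_dw--;
}

/*
 * Reserve `reserved` dwords, write only `written` of them, then reserve again.
 * Returns how many times the second alloc found a leftover reservation.
 */
static unsigned toy_leftover_after(unsigned reserved, unsigned written)
{
	struct toy_ring ring = {0};

	toy_ring_alloc(&ring, reserved);
	for (unsigned i = 0; i < written; i++)
		toy_ring_write(&ring, i);
	/* a commit would go here; in this toy it does not touch count_dw */
	toy_ring_alloc(&ring, reserved);

	return ring.leftover_hits;
}
```

In this model toy_leftover_after(8, 8) returns 0 and toy_leftover_after(8, 6) returns 1, so the check fires purely because of an over-sized reservation. If the real code behaves the same way, the WARN alone would not prove the timeout comes from overlapping allocations.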