On 2018-12-01 9:11 a.m., Christian König wrote: >> Won't this break VM fault handling in KFD? > No, we still send all VM faults to KFD after processing them. Only > filtered retries are not send to the KFD any more. OK, I missed that src->funcs->process returning 0 means "not handled", >0 means "handled". Currently I don't see any interrupt processing callbacks returning >0. I think that gets added in patch 4. > >> As far as I can tell, the only code path that leave IRQs unhandled >> and passes them to KFD prints an error message in the kernel log. We >> can't have the kernel log flooded with error messages every time >> there are IRQs for KFD. We can get extremely high frequency >> interrupts for HSA signals. > Since the KFD didn't filtered the faults this would have a been a > problem before as well. I missed that r == 0 means not handled without being an error. > > So I'm pretty sure that we already have registered handlers for all > interrupts the KFD is interested in as well. No. As far as I can tell, you're missing these two: GFX_9_0__SRCID__CP_BAD_OPCODE_ERROR (183) GFX_9_0__SRCID__SQ_INTERRUPT_ID (239) 239 is used for signaling events from shaders and can be very frequent. Triggering an error message on those interrupts would be bad. Regards, Felix > > Regards, > Christian. > > Am 30.11.18 um 17:31 schrieb Kuehling, Felix: >> Won't this break VM fault handling in KFD? I don't see a way with the >> current code that you can leave some VM faults for KFD to process. If >> we could consider VM faults with VMIDs 8-15 as not handled in amdgpu >> and leave them for KFD to process, then this could work. >> >> As far as I can tell, the only code path that leave IRQs unhandled >> and passes them to KFD prints an error message in the kernel log. We >> can't have the kernel log flooded with error messages every time >> there are IRQs for KFD. We can get extremely high frequency >> interrupts for HSA signals. >> >> Regards, >> Felix >> >> -----Original Message----- >> From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of >> Alex Deucher >> Sent: Friday, November 30, 2018 10:03 AM >> To: Christian König <ckoenig.leichtzumerken@xxxxxxxxx> >> Cc: amd-gfx list <amd-gfx@xxxxxxxxxxxxxxxxxxxxx> >> Subject: Re: [PATCH 02/11] drm/amdgpu: send IVs to the KFD only after >> processing them v2 >> >> On Fri, Nov 30, 2018 at 7:36 AM Christian König >> <ckoenig.leichtzumerken@xxxxxxxxx> wrote: >>> This allows us to filter out VM faults in the GMC code. >>> >>> v2: don't filter out all faults >>> >>> Signed-off-by: Christian König <christian.koenig@xxxxxxx> >> Acked-by: Alex Deucher <alexander.deucher@xxxxxxx> >> >>> --- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 29 >>> +++++++++++++++---------- >>> 1 file changed, 17 insertions(+), 12 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c >>> index 6b6524f04ce0..6db4c58ddc13 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c >>> @@ -149,9 +149,6 @@ static void amdgpu_irq_callback(struct >>> amdgpu_device *adev, >>> if (!amdgpu_ih_prescreen_iv(adev)) >>> return; >>> >>> - /* Before dispatching irq to IP blocks, send it to amdkfd */ >>> - amdgpu_amdkfd_interrupt(adev, (const void *) >>> &ih->ring[ring_index]); >>> - >>> entry.iv_entry = (const uint32_t *)&ih->ring[ring_index]; >>> amdgpu_ih_decode_iv(adev, &entry); >>> >>> @@ -371,29 +368,31 @@ void amdgpu_irq_dispatch(struct amdgpu_device >>> *adev, >>> unsigned client_id = entry->client_id; >>> unsigned src_id = entry->src_id; >>> struct amdgpu_irq_src *src; >>> + bool handled = false; >>> int r; >>> >>> trace_amdgpu_iv(entry); >>> >>> if (client_id >= AMDGPU_IRQ_CLIENTID_MAX) { >>> - DRM_DEBUG("Invalid client_id in IV: %d\n", client_id); >>> + DRM_ERROR("Invalid client_id in IV: %d\n", client_id); >>> return; >>> } >>> >>> if (src_id >= AMDGPU_MAX_IRQ_SRC_ID) { >>> - DRM_DEBUG("Invalid src_id in IV: %d\n", src_id); >>> + DRM_ERROR("Invalid src_id in IV: %d\n", src_id); >>> return; >>> } >>> >>> if (adev->irq.virq[src_id]) { >>> >>> generic_handle_irq(irq_find_mapping(adev->irq.domain, src_id)); >>> - } else { >>> - if (!adev->irq.client[client_id].sources) { >>> - DRM_DEBUG("Unregistered interrupt client_id: >>> %d src_id: %d\n", >>> - client_id, src_id); >>> - return; >>> - } >>> + return; >>> + } >>> >>> + if (!adev->irq.client[client_id].sources) { >>> + DRM_DEBUG("Unregistered interrupt client_id: %d >>> src_id: %d\n", >>> + client_id, src_id); >>> + return; >>> + } else { >>> src = adev->irq.client[client_id].sources[src_id]; >>> if (!src) { >>> DRM_DEBUG("Unhandled interrupt src_id: %d\n", >>> src_id); @@ -401,9 +400,15 @@ void amdgpu_irq_dispatch(struct >>> amdgpu_device *adev, >>> } >>> >>> r = src->funcs->process(adev, src, entry); >>> - if (r) >>> + if (r < 0) >>> DRM_ERROR("error processing interrupt (%d)\n", >>> r); >>> + else if (r) >>> + handled = true; >>> } >>> + >>> + /* Send it to amdkfd as well if it isn't already handled */ >>> + if (!handled) >>> + amdgpu_amdkfd_interrupt(adev, entry->iv_entry); >>> } >>> >>> /** >>> -- >>> 2.17.1 >>> >>> _______________________________________________ >>> amd-gfx mailing list >>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx >>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx >> _______________________________________________ >> amd-gfx mailing list >> amd-gfx@xxxxxxxxxxxxxxxxxxxxx >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx > _______________________________________________ amd-gfx mailing list amd-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/amd-gfx