Ah! Never mind, now I see it what you mean! I accidentally added my debug change DRM_DEBUG->DRM_ERROR to this patch as well. Sorry for the noise, Christian. Am 03.12.18 um 17:35 schrieb Christian König: >> No. As far as I can tell, you're missing these two: >> >> GFX_9_0__SRCID__CP_BAD_OPCODE_ERROR (183) >> GFX_9_0__SRCID__SQ_INTERRUPT_ID (239) >> >> 239 is used for signaling events from shaders and can be very frequent. >> Triggering an error message on those interrupts would be bad. > > Mhm, then why didn't those trigger a message before? > > As far as I can see we actually didn't changed the handling for that. > > Christian. > > Am 03.12.18 um 17:31 schrieb Kuehling, Felix: >> On 2018-12-01 9:11 a.m., Christian König wrote: >>>> Won't this break VM fault handling in KFD? >>> No, we still send all VM faults to KFD after processing them. Only >>> filtered retries are not send to the KFD any more. >> OK, I missed that src->funcs->process returning 0 means "not handled", >>> 0 means "handled". Currently I don't see any interrupt processing >> callbacks returning >0. I think that gets added in patch 4. >> >> >>>> As far as I can tell, the only code path that leave IRQs unhandled >>>> and passes them to KFD prints an error message in the kernel log. We >>>> can't have the kernel log flooded with error messages every time >>>> there are IRQs for KFD. We can get extremely high frequency >>>> interrupts for HSA signals. >>> Since the KFD didn't filtered the faults this would have a been a >>> problem before as well. >> I missed that r == 0 means not handled without being an error. >> >> >>> So I'm pretty sure that we already have registered handlers for all >>> interrupts the KFD is interested in as well. >> No. As far as I can tell, you're missing these two: >> >> GFX_9_0__SRCID__CP_BAD_OPCODE_ERROR (183) >> GFX_9_0__SRCID__SQ_INTERRUPT_ID (239) >> >> 239 is used for signaling events from shaders and can be very frequent. >> Triggering an error message on those interrupts would be bad. >> >> Regards, >> Felix >> >> >>> Regards, >>> Christian. >>> >>> Am 30.11.18 um 17:31 schrieb Kuehling, Felix: >>>> Won't this break VM fault handling in KFD? I don't see a way with the >>>> current code that you can leave some VM faults for KFD to process. If >>>> we could consider VM faults with VMIDs 8-15 as not handled in amdgpu >>>> and leave them for KFD to process, then this could work. >>>> >>>> As far as I can tell, the only code path that leave IRQs unhandled >>>> and passes them to KFD prints an error message in the kernel log. We >>>> can't have the kernel log flooded with error messages every time >>>> there are IRQs for KFD. We can get extremely high frequency >>>> interrupts for HSA signals. >>>> >>>> Regards, >>>> Felix >>>> >>>> -----Original Message----- >>>> From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of >>>> Alex Deucher >>>> Sent: Friday, November 30, 2018 10:03 AM >>>> To: Christian König <ckoenig.leichtzumerken@xxxxxxxxx> >>>> Cc: amd-gfx list <amd-gfx@xxxxxxxxxxxxxxxxxxxxx> >>>> Subject: Re: [PATCH 02/11] drm/amdgpu: send IVs to the KFD only after >>>> processing them v2 >>>> >>>> On Fri, Nov 30, 2018 at 7:36 AM Christian König >>>> <ckoenig.leichtzumerken@xxxxxxxxx> wrote: >>>>> This allows us to filter out VM faults in the GMC code. >>>>> >>>>> v2: don't filter out all faults >>>>> >>>>> Signed-off-by: Christian König <christian.koenig@xxxxxxx> >>>> Acked-by: Alex Deucher <alexander.deucher@xxxxxxx> >>>> >>>>> --- >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 29 >>>>> +++++++++++++++---------- >>>>> 1 file changed, 17 insertions(+), 12 deletions(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c >>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c >>>>> index 6b6524f04ce0..6db4c58ddc13 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c >>>>> @@ -149,9 +149,6 @@ static void amdgpu_irq_callback(struct >>>>> amdgpu_device *adev, >>>>> if (!amdgpu_ih_prescreen_iv(adev)) >>>>> return; >>>>> >>>>> - /* Before dispatching irq to IP blocks, send it to amdkfd */ >>>>> - amdgpu_amdkfd_interrupt(adev, (const void *) >>>>> &ih->ring[ring_index]); >>>>> - >>>>> entry.iv_entry = (const uint32_t *)&ih->ring[ring_index]; >>>>> amdgpu_ih_decode_iv(adev, &entry); >>>>> >>>>> @@ -371,29 +368,31 @@ void amdgpu_irq_dispatch(struct amdgpu_device >>>>> *adev, >>>>> unsigned client_id = entry->client_id; >>>>> unsigned src_id = entry->src_id; >>>>> struct amdgpu_irq_src *src; >>>>> + bool handled = false; >>>>> int r; >>>>> >>>>> trace_amdgpu_iv(entry); >>>>> >>>>> if (client_id >= AMDGPU_IRQ_CLIENTID_MAX) { >>>>> - DRM_DEBUG("Invalid client_id in IV: %d\n", >>>>> client_id); >>>>> + DRM_ERROR("Invalid client_id in IV: %d\n", >>>>> client_id); >>>>> return; >>>>> } >>>>> >>>>> if (src_id >= AMDGPU_MAX_IRQ_SRC_ID) { >>>>> - DRM_DEBUG("Invalid src_id in IV: %d\n", src_id); >>>>> + DRM_ERROR("Invalid src_id in IV: %d\n", src_id); >>>>> return; >>>>> } >>>>> >>>>> if (adev->irq.virq[src_id]) { >>>>> generic_handle_irq(irq_find_mapping(adev->irq.domain, src_id)); >>>>> - } else { >>>>> - if (!adev->irq.client[client_id].sources) { >>>>> - DRM_DEBUG("Unregistered interrupt client_id: >>>>> %d src_id: %d\n", >>>>> - client_id, src_id); >>>>> - return; >>>>> - } >>>>> + return; >>>>> + } >>>>> >>>>> + if (!adev->irq.client[client_id].sources) { >>>>> + DRM_DEBUG("Unregistered interrupt client_id: %d >>>>> src_id: %d\n", >>>>> + client_id, src_id); >>>>> + return; >>>>> + } else { >>>>> src = adev->irq.client[client_id].sources[src_id]; >>>>> if (!src) { >>>>> DRM_DEBUG("Unhandled interrupt src_id: >>>>> %d\n", >>>>> src_id); @@ -401,9 +400,15 @@ void amdgpu_irq_dispatch(struct >>>>> amdgpu_device *adev, >>>>> } >>>>> >>>>> r = src->funcs->process(adev, src, entry); >>>>> - if (r) >>>>> + if (r < 0) >>>>> DRM_ERROR("error processing interrupt >>>>> (%d)\n", >>>>> r); >>>>> + else if (r) >>>>> + handled = true; >>>>> } >>>>> + >>>>> + /* Send it to amdkfd as well if it isn't already handled */ >>>>> + if (!handled) >>>>> + amdgpu_amdkfd_interrupt(adev, entry->iv_entry); >>>>> } >>>>> >>>>> /** >>>>> -- >>>>> 2.17.1 >>>>> >>>>> _______________________________________________ >>>>> amd-gfx mailing list >>>>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx >>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx >>>> _______________________________________________ >>>> amd-gfx mailing list >>>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx >>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx >> _______________________________________________ >> amd-gfx mailing list >> amd-gfx@xxxxxxxxxxxxxxxxxxxxx >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx > _______________________________________________ amd-gfx mailing list amd-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/amd-gfx