On 2018-11-02 9:48 a.m., Christian König wrote: > Vega10 has multiple interrupt rings, I don't think I've seen your code that implements multiple interrupt rings. So it's a bit hard to comment. As I understand it, the only way this could happen is, if the two interrupt rings are handled by different CPU cores. Alternatively, you may end up calling kgd2kfd_interrupt outside of interrupt context (e.g. in a kernel thread), which would invalidate an implicit assumption in kgd2kfd_interrupt. Can you clarify what's happening here? Anyway, this change shouldn't cause any other problems so feel free to add my Acked-by. Regards, Felix > so this can be called from multiple > calles at the same time resulting in: > > [ 71.779334] ================================ > [ 71.779406] WARNING: inconsistent lock state > [ 71.779478] 4.19.0-rc1+ #44 Tainted: G W > [ 71.779565] -------------------------------- > [ 71.779637] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. > [ 71.779740] kworker/6:1/120 [HC0[0]:SC0[0]:HE1:SE1] takes: > [ 71.779832] 00000000ad761971 (&(&kfd->interrupt_lock)->rlock){?...}, > at: kgd2kfd_interrupt+0x75/0x100 [amdgpu] > [ 71.780058] {IN-HARDIRQ-W} state was registered at: > [ 71.780115] _raw_spin_lock+0x2c/0x40 > [ 71.780180] kgd2kfd_interrupt+0x75/0x100 [amdgpu] > [ 71.780248] amdgpu_irq_callback+0x6c/0x150 [amdgpu] > [ 71.780315] amdgpu_ih_process+0x88/0x100 [amdgpu] > [ 71.780380] amdgpu_irq_handler+0x20/0x40 [amdgpu] > [ 71.780409] __handle_irq_event_percpu+0x49/0x2a0 > [ 71.780436] handle_irq_event_percpu+0x30/0x70 > [ 71.780461] handle_irq_event+0x37/0x60 > [ 71.780484] handle_edge_irq+0x83/0x1b0 > [ 71.780506] handle_irq+0x1f/0x30 > [ 71.780526] do_IRQ+0x53/0x110 > [ 71.780544] ret_from_intr+0x0/0x22 > [ 71.780566] cpuidle_enter_state+0xaa/0x330 > [ 71.780591] do_idle+0x203/0x280 > [ 71.780610] cpu_startup_entry+0x6f/0x80 > [ 71.780634] start_secondary+0x1b0/0x200 > [ 71.780657] secondary_startup_64+0xa4/0xb0 > > Fix this by always using irq save spin locks. > > Signed-off-by: Christian König <christian.koenig@xxxxxxx> > --- > drivers/gpu/drm/amd/amdkfd/kfd_device.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c > index d7e0f41f8edc..c004647c8cb4 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c > @@ -682,6 +682,7 @@ void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry) > { > uint32_t patched_ihre[KFD_MAX_RING_ENTRY_SIZE]; > bool is_patched = false; > + unsigned long flags; > > if (!kfd->init_complete) > return; > @@ -691,7 +692,7 @@ void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry) > return; > } > > - spin_lock(&kfd->interrupt_lock); > + spin_lock_irqsave(&kfd->interrupt_lock, flags); > > if (kfd->interrupts_active > && interrupt_is_wanted(kfd, ih_ring_entry, > @@ -700,7 +701,7 @@ void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry) > is_patched ? patched_ihre : ih_ring_entry)) > queue_work(kfd->ih_wq, &kfd->interrupt_work); > > - spin_unlock(&kfd->interrupt_lock); > + spin_unlock_irqrestore(&kfd->interrupt_lock, flags); > } > > int kgd2kfd_quiesce_mm(struct mm_struct *mm) _______________________________________________ amd-gfx mailing list amd-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/amd-gfx