On Mon, Mar 26, 2018 at 11:35:42AM -0500, Uma Krishnan wrote: > The following Oops can occur when there is heavy I/O traffic and the host > is reset by a tool such as sg_reset. > > [c000200fff3fbc90] c00800001690117c process_cmd_doneq+0x104/0x500 > [cxlflash] (unreliable) > [c000200fff3fbd80] c008000016901648 cxlflash_rrq_irq+0xd0/0x150 [cxlflash] > [c000200fff3fbde0] c000000000193130 __handle_irq_event_percpu+0xa0/0x310 > [c000200fff3fbea0] c0000000001933d8 handle_irq_event_percpu+0x38/0x90 > [c000200fff3fbee0] c000000000193494 handle_irq_event+0x64/0xb0 > [c000200fff3fbf10] c000000000198ea0 handle_fasteoi_irq+0xc0/0x230 > [c000200fff3fbf40] c00000000019182c generic_handle_irq+0x4c/0x70 > [c000200fff3fbf60] c00000000001794c __do_irq+0x7c/0x1c0 > [c000200fff3fbf90] c00000000002a390 call_do_irq+0x14/0x24 > [c000200e5828fab0] c000000000017b2c do_IRQ+0x9c/0x130 > [c000200e5828fb00] c000000000009b04 h_virt_irq_common+0x114/0x120 > > When a context is reset, the pending commands are flushed and the AFU > is notified. Before the AFU handles this request there could be command > completion interrupts queued to PHB which are yet to be delivered to the > context. In this scenario, a context could receive an interrupt for a > command that has been flushed, leading to a possible crash when the memory > for the flushed command is accessed. > > To resolve this problem, a boolean will indicate if the hardware queue is > ready to process interrupts or not. This can be evaluated in the interrupt > handler before proessing an interrupt. > > Signed-off-by: Uma Krishnan <ukrishn@xxxxxxxxxxxxxxxxxx> Acked-by: Matthew R. Ochs <mrochs@xxxxxxxxxxxxxxxxxx>