RE: [PATCH v3] dmaengine: idxd: fix submission race window

> Konstantin observed that a descriptor is added to the pending list only
> after it has been submitted. This creates a race window: there is a slight
> possibility that the descriptor completes before it is added to the
> pending list, in which case the completion handler misses it.
> 
> To address the issue, the descriptor must be added to the pending list
> before it is submitted to the hardware. However, submitting to a swq with
> the ENQCMDS instruction can fail if the wq is either full or not "active".
> 
> With the descriptor allocation being the gate to the wq capacity, it is not
> possible to hit a retry with ENQCMDS submission to the swq. The only
> failure that can happen is when the wq is no longer "active" due to a hw
> error, in which case we are moving towards taking down the portal. Given
> that this is a rare condition and I/O performance is no longer a concern,
> the driver can walk the completion lists in order to retrieve and abort
> the descriptor.
> 
> The error path sets the descriptor to aborted status. It takes the work
> list lock to prevent further processing of the work list. It then does a
> delete_all on the pending llist to retrieve all descriptors on it; the
> delete_all action does not require a lock. It walks through the acquired
> llist to find the aborted descriptor while adding all remaining
> descriptors to the work list, which is safe since it holds the lock. If it
> does not find the aborted descriptor on the llist, it walks through the
> work list. If it still does not find the descriptor, the interrupt handler
> has already removed the desc from the llist but is waiting on the work
> list lock, and will process it once the error path releases the lock.
> 
> Fixes: eb15e7154fbf ("dmaengine: idxd: add interrupt handle request and release support")
> Reported-by: Konstantin Ananyev <konstantin.ananyev@xxxxxxxxx>
> Signed-off-by: Dave Jiang <dave.jiang@xxxxxxxxx>
> ---
> 
> v3:
> - add missing init for var (Konstantin)
> 
> v2:
> - do abort callback outside of lock (Konstantin)
> - fix abort reason flag (Konstantin)
> - remove changes to spinlock
> 
>  drivers/dma/idxd/idxd.h   |   14 ++++++++
>  drivers/dma/idxd/irq.c    |   27 +++++++++++-----
>  drivers/dma/idxd/submit.c |   75 ++++++++++++++++++++++++++++++++++++++++-----
>  3 files changed, 99 insertions(+), 17 deletions(-)
> 

Acked-by: Konstantin Ananyev <konstantin.ananyev@xxxxxxxxx>
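
For readers following the thread, the ordering and error path described in
the quoted commit message can be sketched in user-space C roughly as below.
This is only an illustrative model, not the driver code: struct desc,
struct irq_entry, pending_add(), pending_del_all(), abort_desc() and
submit_desc() are hypothetical names, a C11 atomic pointer stands in for the
kernel llist, an atomic_flag spin loop stands in for the work list lock, and
the wq_active flag stands in for the outcome of the ENQCMDS submission.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

enum desc_status { DESC_SUBMITTED, DESC_ABORTED };

struct desc {
	int id;
	enum desc_status status;
	struct desc *llnode;	/* link on the lock-free pending list */
	struct desc *next;	/* link on the lock-protected work list */
};

/* Hypothetical stand-in for the per-interrupt completion lists. */
struct irq_entry {
	_Atomic(struct desc *) pending;	/* models the pending llist */
	struct desc *work;		/* models the work list */
	atomic_flag work_lock;		/* models the work list lock */
};

/* Lock-free push onto the pending list. */
static void pending_add(struct irq_entry *ie, struct desc *d)
{
	struct desc *old = atomic_load(&ie->pending);

	do {
		d->llnode = old;
	} while (!atomic_compare_exchange_weak(&ie->pending, &old, d));
}

/* Detach the whole pending list in one atomic step; no lock needed. */
static struct desc *pending_del_all(struct irq_entry *ie)
{
	return atomic_exchange(&ie->pending, (struct desc *)NULL);
}

/* Error path: returns true if the caller now owns the aborted descriptor. */
static bool abort_desc(struct irq_entry *ie, struct desc *target)
{
	struct desc *d, *next, **pp;
	bool found = false;

	assert(ie && target);
	target->status = DESC_ABORTED;

	while (atomic_flag_test_and_set(&ie->work_lock))
		;	/* spin: blocks the completion handler's list walk */

	/* Move everything except the target from pending to the work list. */
	for (d = pending_del_all(ie); d; d = next) {
		next = d->llnode;
		if (d == target) {
			found = true;
			continue;
		}
		d->next = ie->work;
		ie->work = d;
	}

	/* Not on the pending list: look for it on the work list instead. */
	for (pp = &ie->work; !found && *pp; pp = &(*pp)->next) {
		if (*pp == target) {
			*pp = target->next;
			found = true;
			break;
		}
	}

	atomic_flag_clear(&ie->work_lock);
	/* If not found, the completion handler already holds the descriptor. */
	return found;
}

/* Publish the descriptor first, then submit; abort on submission failure. */
static int submit_desc(struct irq_entry *ie, struct desc *d, bool wq_active)
{
	d->status = DESC_SUBMITTED;
	pending_add(ie, d);
	if (!wq_active) {
		/* Any abort callback would run here, outside the lock. */
		(void)abort_desc(ie, d);
		return -1;
	}
	return 0;
}
```

In this model, a caller would initialise the entry with
`struct irq_entry ie = { .work_lock = ATOMIC_FLAG_INIT };` and call
submit_desc() per descriptor. A failed submission leaves the aborted
descriptor on neither list, while descriptors that were pending at the time
end up on the work list for the completion handler to process.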




