On 25-06-22, 15:44, Jie Hai wrote: > When we get a DMA channel and try to use it in multiple threads it > will cause oops and hanging the system. > > % echo 100 > /sys/module/dmatest/parameters/threads_per_chan > % echo 100 > /sys/module/dmatest/parameters/iterations > % echo 1 > /sys/module/dmatest/parameters/run > [383493.327077] Unable to handle kernel paging request at virtual > address dead000000000108 > [383493.335103] Mem abort info: > [383493.335103] ESR = 0x96000044 > [383493.335105] EC = 0x25: DABT (current EL), IL = 32 bits > [383493.335107] SET = 0, FnV = 0 > [383493.335108] EA = 0, S1PTW = 0 > [383493.335109] FSC = 0x04: level 0 translation fault > [383493.335110] Data abort info: > [383493.335111] ISV = 0, ISS = 0x00000044 > [383493.364739] CM = 0, WnR = 1 > [383493.367793] [dead000000000108] address between user and kernel > address ranges > [383493.375021] Internal error: Oops: 96000044 [#1] PREEMPT SMP > [383493.437574] CPU: 63 PID: 27895 Comm: dma0chan0-copy2 Kdump: > loaded Tainted: GO 5.17.0-rc4+ #2 > [383493.457851] pstate: 204000c9 (nzCv daIF +PAN -UAO -TCO -DIT > -SSBS BTYPE=--) > [383493.465331] pc : vchan_tx_submit+0x64/0xa0 > [383493.469957] lr : vchan_tx_submit+0x34/0xa0 > > This happens because of data race. Each thread rewrite channels's > descriptor as soon as device_issue_pending is called. It leads to > the situation that the driver thinks that it uses the right > descriptor in interrupt handler while channels's descriptor has > been changed by other thread. > > With current fixes channels's descriptor changes it's value only > when it has been used. A new descriptor is acquired from > vc->desc_issued queue that is already filled with descriptors > that are ready to be sent. Threads have no direct access to DMA > channel descriptor. Now it is just possible to queue a descriptor > for further processing. > > Fixes: e9f08b65250d ("dmaengine: hisilicon: Add Kunpeng DMA engine support") > Signed-off-by: Jie Hai <haijie1@xxxxxxxxxx> > --- > drivers/dma/hisi_dma.c | 6 ++---- > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/drivers/dma/hisi_dma.c b/drivers/dma/hisi_dma.c > index 0a0f8a4d168a..0385419be8d5 100644 > --- a/drivers/dma/hisi_dma.c > +++ b/drivers/dma/hisi_dma.c > @@ -271,7 +271,6 @@ static void hisi_dma_start_transfer(struct hisi_dma_chan *chan) > > vd = vchan_next_desc(&chan->vc); > if (!vd) { > - dev_err(&hdma_dev->pdev->dev, "no issued task!\n"); how is this a fix? > chan->desc = NULL; > return; > } > @@ -303,7 +302,7 @@ static void hisi_dma_issue_pending(struct dma_chan *c) > > spin_lock_irqsave(&chan->vc.lock, flags); > > - if (vchan_issue_pending(&chan->vc)) > + if (vchan_issue_pending(&chan->vc) && !chan->desc) This looks good > hisi_dma_start_transfer(chan); > > spin_unlock_irqrestore(&chan->vc.lock, flags); > @@ -442,11 +441,10 @@ static irqreturn_t hisi_dma_irq(int irq, void *data) > chan->qp_num, chan->cq_head); > if (FIELD_GET(STATUS_MASK, cqe->w0) == STATUS_SUCC) { > vchan_cookie_complete(&desc->vd); > + hisi_dma_start_transfer(chan); Why should this fix the error reported? > } else { > dev_err(&hdma_dev->pdev->dev, "task error!\n"); > } > - > - chan->desc = NULL; > } > > spin_unlock(&chan->vc.lock); > -- > 2.33.0 -- ~Vinod