> Aaro Koskinen <aaro.koskinen@xxxxxx> wrote on 29 March 2019 at 20:19:
>
>
> Hi,
>
> On Fri, Mar 29, 2019 at 12:58:18AM +0200, Aaro Koskinen wrote:
> > On Sun, Mar 10, 2019 at 02:51:31AM +0200, Aaro Koskinen wrote:
> > > On Sat, Mar 09, 2019 at 11:57:57AM +0100, Stefan Wahren wrote:
> > > > > Aaro Koskinen <aaro.koskinen@xxxxxx> wrote on 27 February 2019 at 19:51:
> > > > > On Tue, Feb 26, 2019 at 09:31:14AM +0100, Stefan Wahren wrote:
> > > > > > it will take some time for me to set up this test scenario. Could you please
> > > > > > do me a favor and test 4.20.12 which has some backports of recent mmc /
> > > > > > DMA fixes.
> > > > >
> > > > > Both 4.20 and 4.20.12 work fine. Only 5.0-rc fails reliably.
> > > >
> > > > I tried to reproduce the issue by compiling gcc and using stress on
> > > > Raspberry Pi 3 (arm64/defconfig) with Arch Linux ARM without any luck.
> > > >
> > > > Were you able to reproduce the issue using stress?
> > >
> > > No, not yet. I'll let you know if I'm able to come up with a more reliable
> > > reproducer.
> >
> > I tried GCC bootstrap again with 5.1-rc2 and LOCKDEP enabled, and get
> > the below warning. Might be some different unrelated issue, however.
>
> So with 5.1-rc2, the GCC bootstrap & testsuite went fine (some 20 hours)
> without any MMC timeout errors or lockups.
> Also I think the below may
> be the cause of the earlier problems I had:
>
> > [ 1164.390902]
> > [ 1164.398302] ======================================================
> > [ 1164.416501] WARNING: possible circular locking dependency detected
> > [ 1164.434710] 5.1.0-rc2-rpi3-los_6ba38c+-00247-g9936328b41ce-dirty #1 Not tainted
> > [ 1164.454495] ------------------------------------------------------
> > [ 1164.472908] cc1plus/30873 is trying to acquire lock:
> > [ 1164.489750] 0000000040a8ff57 (&mq->complete_lock){+.+.}, at: mmc_blk_mq_complete_prev_req.part.12+0x3c/0x220
> > [ 1164.518548]
> > [ 1164.518548] but task is already holding lock:
> > [ 1164.541662] 0000000059d7e9bb (fs_reclaim){+.+.}, at: fs_reclaim_acquire.part.19+0x0/0x40
> > [ 1164.567105]
> > [ 1164.567105] which lock already depends on the new lock.
> > [ 1164.567105]
> > [ 1164.595691]
> > [ 1164.595691] the existing dependency chain (in reverse order) is:
> > [ 1164.616711]
> > [ 1164.616711] -> #2 (fs_reclaim){+.+.}:
> > [ 1164.630507]        lock_acquire+0xe8/0x250
> > [ 1164.638922]        fs_reclaim_acquire.part.19+0x34/0x40
> > [ 1164.652170]        fs_reclaim_acquire+0x20/0x28
> > [ 1164.665139]        __kmalloc+0x50/0x390
> > [ 1164.673717]        bcm2835_dma_create_cb_chain+0x70/0x270
>
> I think the bug is that it's using GFP_KERNEL here.

Hm, I'm not sure how to solve this properly. Could you try the patch below?
I wasn't able to reproduce the issue myself:

diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c
index ec8a291..54093ff 100644
--- a/drivers/dma/bcm2835-dma.c
+++ b/drivers/dma/bcm2835-dma.c
@@ -671,7 +671,7 @@ static struct dma_async_tx_descriptor *bcm2835_dma_prep_slave_sg(
 	d = bcm2835_dma_create_cb_chain(chan, direction, false,
 					info, extra,
 					frames, src, dst, 0, 0,
-					GFP_KERNEL);
+					GFP_NOWAIT);
 	if (!d)
 		return NULL;