On Mon, Dec 28 2015, Stanislav Samsonov wrote:

> On 24 December 2015 at 00:46, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>>
>> On Wed, Dec 23, 2015 at 2:39 PM, NeilBrown <neilb@xxxxxxxx> wrote:
>> > On Thu, Dec 24 2015, Dan Williams wrote:
>> >>> Changing the GFP_NOIO to GFP_ATOMIC in all the calls to
>> >>> dmaengine_get_unmap_data() in crypto/async_tx/ would probably fix the
>> >>> issue... or make it crash even worse :-)
>> >>>
>> >>> Dan: do you have any wisdom here? The xor is using the percpu data in
>> >>> raid5, so it cannot sleep, but GFP_NOIO allows sleep.
>> >>> Does the code handle failure to get_unmap_data() safely? It looks like
>> >>> it probably does.
>> >>
>> >> Those GFP_NOIO should move to GFP_NOWAIT. We don't want GFP_ATOMIC
>> >> allocations to consume emergency reserves for a performance
>> >> optimization. Longer term async_tx needs to be merged into md
>> >> directly as we can allocate this unmap data statically per-stripe
>> >> rather than per request. This async_tx re-write has been on the todo
>> >> list for years, but never seems to make it to the top.
>> >
>> > So the following maybe?
>> > If I could get an acked-by from you Dan, and a Tested-by: from you
>> > Slava, I'll submit upstream.
>> >
>> > Thanks,
>> > NeilBrown
>> >
>> > From: NeilBrown <neilb@xxxxxxxx>
>> > Date: Thu, 24 Dec 2015 09:35:18 +1100
>> > Subject: [PATCH] async_tx: use GFP_NOWAIT rather than GFP_IO
>> >
>> > These async_XX functions are called from md/raid5 in an atomic
>> > section, between get_cpu() and put_cpu(), so they must not sleep.
>> > So use GFP_NOWAIT rather than GFP_IO.
>> >
>> > Dan Williams writes: Longer term async_tx needs to be merged into md
>> > directly as we can allocate this unmap data statically per-stripe
>> > rather than per request.
>> >
>> > Reported-by: Stanislav Samsonov <slava@xxxxxxxxxxxxxxxxx>
>> > Signed-off-by: NeilBrown <neilb@xxxxxxxx>
>>
>> Acked-by: Dan Williams <dan.j.williams@xxxxxxxxx>
>
> Tested-by: Slava Samsonov <slava@xxxxxxxxxxxxxxxxx>

Thanks.

I guess this problem was introduced by
Commit: 7476bd79fc01 ("async_pq: convert to dmaengine_unmap_data")
in 3.13.
Do we think it deserves to go to -stable?

(I just realised that this is really Dan's code more than mine, so why
am I submitting it ??? But we are here now so it may as well go in
through the md tree.)

NeilBrown
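
For anyone reading this thread later, here is a rough userspace sketch of the reasoning above. It is not kernel code and not the patch itself: every mock_* and MOCK_* name is hypothetical, and the allocator behaviour is deliberately simplified (a "no wait" request fails fast under memory pressure, while a "no I/O" request is assumed to sleep for reclaim). It only illustrates why a sleeping allocation is wrong between get_cpu() and put_cpu(), and why a failed allocation is tolerable if the caller can fall back to doing the work synchronously.

```c
/* Userspace illustration only; none of these symbols exist in the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the two allocation modes discussed above. */
enum mock_gfp { MOCK_GFP_NOIO, MOCK_GFP_NOWAIT };

static bool in_atomic_section;  /* true between mock get_cpu()/put_cpu() */
static bool memory_is_tight;    /* simulates allocator pressure */
static int sleeps_in_atomic;    /* counts "sleeping while atomic" bugs */

/* Mock of an unmap-data allocation: NOWAIT may fail, NOIO may sleep. */
static void *mock_get_unmap_data(size_t size, enum mock_gfp gfp)
{
    assert(size > 0);
    if (memory_is_tight) {
        if (gfp == MOCK_GFP_NOWAIT)
            return NULL;            /* fail fast, never sleep */
        if (in_atomic_section)
            sleeps_in_atomic++;     /* a sleeping allocation here is the bug */
        /* pretend reclaim succeeded after sleeping */
    }
    return malloc(size);
}

/*
 * Mock of an async_XX-style caller. Returns true if the offload path was
 * taken, false if it fell back to the synchronous path.
 */
static bool mock_async_xor(enum mock_gfp gfp)
{
    bool offloaded = false;
    void *unmap;

    in_atomic_section = true;       /* stands in for get_cpu() */
    unmap = mock_get_unmap_data(64, gfp);
    if (unmap) {
        offloaded = true;           /* would hand the work to a DMA engine */
        free(unmap);
    }
    /* otherwise: compute the result on the CPU, no allocation needed */
    in_atomic_section = false;      /* stands in for put_cpu() */
    return offloaded;
}
```

With memory_is_tight set, mock_async_xor(MOCK_GFP_NOWAIT) takes the synchronous fallback and leaves sleeps_in_atomic at zero, whereas mock_async_xor(MOCK_GFP_NOIO) "succeeds" only by sleeping inside the atomic section. The sketch does not model Dan's point about GFP_ATOMIC dipping into emergency reserves.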