Re: raid5 async_xor: sleep in atomic

On Sun, Jan 3, 2016 at 5:33 PM, NeilBrown <neilb@xxxxxxxx> wrote:
> On Mon, Dec 28 2015, Stanislav Samsonov wrote:
>
>> On 24 December 2015 at 00:46, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>>>
>>> On Wed, Dec 23, 2015 at 2:39 PM, NeilBrown <neilb@xxxxxxxx> wrote:
>>> > On Thu, Dec 24 2015, Dan Williams wrote:
>>> >>> Changing the GFP_NOIO to GFP_ATOMIC in all the calls to
>>> >>> dmaengine_get_unmap_data() in crypto/async_tx/ would probably fix the
>>> >>> issue... or make it crash even worse :-)
>>> >>>
>>> >>> Dan: do you have any wisdom here?  The xor is using the percpu data in
>>> >>> raid5, so it cannot sleep, but GFP_NOIO allows sleeping.
>>> >>> Does the code handle failure to get_unmap_data() safely?  It looks like
>>> >>> it probably does.
>>> >>
>>> >> Those GFP_NOIO should move to GFP_NOWAIT.  We don't want GFP_ATOMIC
>>> >> allocations to consume emergency reserves for a performance
>>> >> optimization.  Longer term async_tx needs to be merged into md
>>> >> directly as we can allocate this unmap data statically per-stripe
>>> >> rather than per request. This async_tx re-write has been on the todo
>>> >> list for years, but never seems to make it to the top.
>>> >
>>> > So the following maybe?
>>> > If I could get an acked-by from you Dan, and a Tested-by: from you
>>> > Slava, I'll submit upstream.
>>> >
>>> > Thanks,
>>> > NeilBrown
>>> >
>>> > From: NeilBrown <neilb@xxxxxxxx>
>>> > Date: Thu, 24 Dec 2015 09:35:18 +1100
>>> > Subject: [PATCH] async_tx: use GFP_NOWAIT rather than GFP_IO
>>> >
>>> > These async_XX functions are called from md/raid5 in an atomic
>>> > section, between get_cpu() and put_cpu(), so they must not sleep.
>>> > So use GFP_NOWAIT rather than GFP_NOIO.
>>> >
>>> > Dan Williams writes: Longer term async_tx needs to be merged into md
>>> > directly as we can allocate this unmap data statically per-stripe
>>> > rather than per request.
>>> >
>>> > Reported-by: Stanislav Samsonov <slava@xxxxxxxxxxxxxxxxx>
>>> > Signed-off-by: NeilBrown <neilb@xxxxxxxx>
>>>
>>> Acked-by: Dan Williams <dan.j.williams@xxxxxxxxx>
>>
>> Tested-by: Slava Samsonov <slava@xxxxxxxxxxxxxxxxx>
>
> Thanks.
>
> I guess this problem was introduced by
> Commit: 7476bd79fc01 ("async_pq: convert to dmaengine_unmap_data")
> in 3.13.

Yes.

> Do we think it deserves to go to -stable?

I think so, yes.

> (I just realised that this is really Dan's code more than mine,
>  so why am I submitting it ???

True!  I was grateful for your offer, but I should have taken over
coordination...

> But we are here now so it may as well go
>  in through the md tree.)

That or Vinod is maintaining drivers/dma/ these days (added Cc's).
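
For readers following the thread, here is a rough user-space sketch of why
GFP_NOWAIT is the safe choice inside the get_cpu()/put_cpu() section. It is
only a model, not kernel code: every name in it (model_gfp, model_get_cpu(),
model_get_unmap_data(), model_async_xor(), memory_tight and so on) is
hypothetical and merely stands in for the real thing. It assumes, as the
discussion above suggests, that a failed unmap-data allocation makes the
caller fall back to the synchronous XOR path.

```c
/* User-space model only. Every name here is hypothetical, not a kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-ins for the GFP_NOIO and GFP_NOWAIT allocation modes. */
enum model_gfp { MODEL_NOIO, MODEL_NOWAIT };

static int atomic_depth;     /* > 0 models running between get_cpu() and put_cpu() */
static int sleep_in_atomic;  /* number of modelled "sleep in atomic" violations */
static bool memory_tight;    /* true: no memory is available without waiting */

static void model_get_cpu(void) { atomic_depth++; }

static void model_put_cpu(void)
{
    assert(atomic_depth > 0);
    atomic_depth--;
}

/* Models the unmap-data allocation under the two allocation modes. */
static void *model_get_unmap_data(enum model_gfp gfp)
{
    if (memory_tight) {
        if (gfp == MODEL_NOWAIT)
            return NULL;          /* fail fast, never sleep */
        if (atomic_depth > 0)
            sleep_in_atomic++;    /* a sleeping allocation in atomic context */
        memory_tight = false;     /* pretend memory was freed while we slept */
    }
    return malloc(64);
}

/* Plain CPU XOR of src into dst. */
static void xor_bytes(unsigned char *dst, const unsigned char *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] ^= src[i];
}

/*
 * Returns true if the modelled offload path ran, false if the allocation
 * failed and the synchronous fallback ran instead.
 */
static bool model_async_xor(unsigned char *dst, const unsigned char *src,
                            size_t len, enum model_gfp gfp)
{
    void *unmap = model_get_unmap_data(gfp);
    bool offloaded = (unmap != NULL);

    /* Both paths produce the same bytes; the model only tracks which ran. */
    xor_bytes(dst, src, len);
    free(unmap);                  /* free(NULL) is a no-op */
    return offloaded;
}
```

In this model, calling model_async_xor() with MODEL_NOWAIT between
model_get_cpu() and model_put_cpu() while memory_tight is set takes the
fallback path and leaves sleep_in_atomic at zero, whereas MODEL_NOIO in the
same situation increments it. The XOR result is the same either way, which is
why the allocation failure only costs performance.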