Re: [PATCH] libceph: avoid a __vmalloc() deadlock in ceph_kvmalloc()

On Wed, Sep 11, 2019 at 8:32 AM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
>
> On Tue, Sep 10, 2019 at 05:17:48PM +0200, Ilya Dryomov wrote:
> > The vmalloc allocator doesn't fully respect the specified gfp mask:
> > while the actual pages are allocated as requested, the page table pages
> > are always allocated with GFP_KERNEL.  ceph_kvmalloc() may be called
> > with GFP_NOFS and GFP_NOIO (for ceph and rbd respectively), so this may
> > result in a deadlock.
> >
> > There is no real reason for the current PAGE_ALLOC_COSTLY_ORDER logic,
> > it's just something that seemed sensible at the time (ceph_kvmalloc()
> > predates kvmalloc()).  kvmalloc() is smarter: in an attempt to reduce
> > long term fragmentation, it first tries to kmalloc non-disruptively.
> >
> > Switch to kvmalloc() and set the respective PF_MEMALLOC_* flag using
> > the scope API to avoid the deadlock.  Note that kvmalloc() needs to be
> > passed GFP_KERNEL to enable the fallback.
>
> If you can please just stop using GFP_NOFS altogether and set
> PF_MEMALLOC_* for the actual contexts.

Hi Christoph,

ceph_kvmalloc() is indirectly called from dozens of places, everywhere
a new RPC message is allocated.  Some of them are used for client setup
and don't need a scope (GFP_KERNEL is fine), but the vast majority do.
I don't think wrapping each call is practical.
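
For illustration, the approach from the patch description (set the
scope inside the one central helper, then call kvmalloc() with
GFP_KERNEL) can be sketched in userspace roughly as below.  This is a
mock under stated assumptions, not the patch itself: the gfp and PF_*
values are made up, current_flags stands in for current->flags,
mock_kvmalloc() stands in for kvmalloc(), and ceph_kvmalloc_sketch() is
a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins for kernel definitions; the values are made up. */
typedef unsigned int gfp_t;
#define __GFP_IO         0x1u
#define __GFP_FS         0x2u
#define GFP_NOIO         0x0u
#define GFP_NOFS         (__GFP_IO)
#define GFP_KERNEL       (__GFP_IO | __GFP_FS)
#define PF_MEMALLOC_NOIO 0x1u
#define PF_MEMALLOC_NOFS 0x2u

static unsigned int current_flags;   /* models current->flags */
static gfp_t last_requested_gfp;     /* mask the caller passed to the allocator */
static gfp_t last_effective_gfp;     /* mask after the scope is applied */

/* Scope helpers modelled on memalloc_{noio,nofs}_{save,restore}(). */
static unsigned int memalloc_noio_save(void)
{
	unsigned int old = current_flags & PF_MEMALLOC_NOIO;

	current_flags |= PF_MEMALLOC_NOIO;
	return old;
}

static void memalloc_noio_restore(unsigned int old)
{
	current_flags = (current_flags & ~PF_MEMALLOC_NOIO) | old;
}

static unsigned int memalloc_nofs_save(void)
{
	unsigned int old = current_flags & PF_MEMALLOC_NOFS;

	current_flags |= PF_MEMALLOC_NOFS;
	return old;
}

static void memalloc_nofs_restore(unsigned int old)
{
	current_flags = (current_flags & ~PF_MEMALLOC_NOFS) | old;
}

/* Mock allocator: records the mask that every nested allocation would see. */
static void *mock_kvmalloc(size_t size, gfp_t flags)
{
	/* The mock insists on a GFP_KERNEL-compatible mask (fallback enabled). */
	assert((flags & GFP_KERNEL) == GFP_KERNEL);

	last_requested_gfp = flags;
	if (current_flags & PF_MEMALLOC_NOIO)
		flags &= ~(__GFP_IO | __GFP_FS);
	else if (current_flags & PF_MEMALLOC_NOFS)
		flags &= ~__GFP_FS;
	last_effective_gfp = flags;

	return malloc(size);
}

/* Translate the caller's gfp mask into a scope around one GFP_KERNEL call. */
static void *ceph_kvmalloc_sketch(size_t size, gfp_t flags)
{
	void *p;

	if ((flags & (__GFP_IO | __GFP_FS)) == (__GFP_IO | __GFP_FS)) {
		/* GFP_KERNEL-like caller: no scope needed. */
		p = mock_kvmalloc(size, flags);
	} else if ((flags & (__GFP_IO | __GFP_FS)) == __GFP_IO) {
		/* GFP_NOFS caller. */
		unsigned int nofs_flag = memalloc_nofs_save();

		p = mock_kvmalloc(size, GFP_KERNEL);
		memalloc_nofs_restore(nofs_flag);
	} else {
		/* GFP_NOIO caller. */
		unsigned int noio_flag = memalloc_noio_save();

		p = mock_kvmalloc(size, GFP_KERNEL);
		memalloc_noio_restore(noio_flag);
	}

	return p;
}
```

The point of doing it in the helper is that the dozens of indirect
callers keep passing their gfp mask unchanged, while the scope is set
and restored in one place.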

As for getting rid of GFP_NOFS and GFP_NOIO entirely (i.e. dropping the
gfp mask from all libceph APIs and using scopes instead), it's something
that I have had in the back of my head for a while now because we cheat
in a few places and hard-code GFP_NOIO as the lowest common denominator
instead of properly propagating the gfp mask.  It's more of a project
though, and won't be backportable.

Thanks,

                Ilya


