Re: [PATCH] libceph: avoid a __vmalloc() deadlock in ceph_kvmalloc()

On Tue, 2019-09-10 at 17:17 +0200, Ilya Dryomov wrote:
> The vmalloc allocator doesn't fully respect the specified gfp mask:
> while the actual pages are allocated as requested, the page table pages
> are always allocated with GFP_KERNEL.  ceph_kvmalloc() may be called
> with GFP_NOFS and GFP_NOIO (for ceph and rbd respectively), so this may
> result in a deadlock.
> 
> There is no real reason for the current PAGE_ALLOC_COSTLY_ORDER logic;
> it's just something that seemed sensible at the time (ceph_kvmalloc()
> predates kvmalloc()).  kvmalloc() is smarter: in an attempt to reduce
> long-term fragmentation, it first tries to kmalloc non-disruptively.
> 
> Switch to kvmalloc() and set the respective PF_MEMALLOC_* flag using
> the scope API to avoid the deadlock.  Note that kvmalloc() needs to be
> passed GFP_KERNEL to enable the fallback.
> 
> Signed-off-by: Ilya Dryomov <idryomov@xxxxxxxxx>
> ---
>  net/ceph/ceph_common.c | 29 +++++++++++++++++++++++------
>  1 file changed, 23 insertions(+), 6 deletions(-)
> 
> diff --git a/net/ceph/ceph_common.c b/net/ceph/ceph_common.c
> index c41789154cdb..970e74b46213 100644
> --- a/net/ceph/ceph_common.c
> +++ b/net/ceph/ceph_common.c
> @@ -13,6 +13,7 @@
>  #include <linux/nsproxy.h>
>  #include <linux/fs_parser.h>
>  #include <linux/sched.h>
> +#include <linux/sched/mm.h>
>  #include <linux/seq_file.h>
>  #include <linux/slab.h>
>  #include <linux/statfs.h>
> @@ -185,18 +186,34 @@ int ceph_compare_options(struct ceph_options *new_opt,
>  }
>  EXPORT_SYMBOL(ceph_compare_options);
>  
> +/*
> + * kvmalloc() doesn't fall back to the vmalloc allocator unless flags are
> + * compatible with (a superset of) GFP_KERNEL.  This is because while the
> + * actual pages are allocated with the specified flags, the page table pages
> + * are always allocated with GFP_KERNEL.  map_vm_area() doesn't even take
> + * flags because GFP_KERNEL is hard-coded in {p4d,pud,pmd,pte}_alloc().
> + *
> + * ceph_kvmalloc() may be called with GFP_KERNEL, GFP_NOFS or GFP_NOIO.
> + */
>  void *ceph_kvmalloc(size_t size, gfp_t flags)
>  {
> -	if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
> -		void *ptr = kmalloc(size, flags | __GFP_NOWARN);
> -		if (ptr)
> -			return ptr;
> +	void *p;
> +
> +	if ((flags & (__GFP_IO | __GFP_FS)) == (__GFP_IO | __GFP_FS)) {
> +		p = kvmalloc(size, flags);
> +	} else if ((flags & (__GFP_IO | __GFP_FS)) == __GFP_IO) {
> +		unsigned int nofs_flag = memalloc_nofs_save();
> +		p = kvmalloc(size, GFP_KERNEL);
> +		memalloc_nofs_restore(nofs_flag);
> +	} else {
> +		unsigned int noio_flag = memalloc_noio_save();
> +		p = kvmalloc(size, GFP_KERNEL);
> +		memalloc_noio_restore(noio_flag);
>  	}
>  
> -	return __vmalloc(size, flags, PAGE_KERNEL);
> +	return p;
>  }
>  
> -
>  static int parse_fsid(const char *str, struct ceph_fsid *fsid)
>  {
>  	int i = 0;

Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>
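
For anyone reading along, the three-way dispatch in the new ceph_kvmalloc()
can be modelled in plain userspace C.  This is only a sketch of the flag
test, not kernel code: the SK_* names, the enum and sk_pick_scope() are
hypothetical, and the bit values are stand-ins chosen for illustration (the
real definitions live in <linux/gfp.h>).

```c
#include <assert.h>

/* Stand-in bit values, assumed for illustration only. */
#define SK_GFP_IO	0x40u
#define SK_GFP_FS	0x80u
#define SK_GFP_RECLAIM	0xc00u

/* Composite masks built the same way the patch's comment describes them. */
#define SK_GFP_KERNEL	(SK_GFP_RECLAIM | SK_GFP_IO | SK_GFP_FS)
#define SK_GFP_NOFS	(SK_GFP_RECLAIM | SK_GFP_IO)
#define SK_GFP_NOIO	(SK_GFP_RECLAIM)

/* Which memalloc scope the patched function would enter (hypothetical enum). */
enum sk_scope {
	SK_SCOPE_NONE,	/* flags passed straight through to kvmalloc() */
	SK_SCOPE_NOFS,	/* memalloc_nofs_save() around kvmalloc(GFP_KERNEL) */
	SK_SCOPE_NOIO,	/* memalloc_noio_save() around kvmalloc(GFP_KERNEL) */
};

/* Mirrors the if / else if / else chain on the IO and FS bits. */
static enum sk_scope sk_pick_scope(unsigned int flags)
{
	unsigned int io_fs = flags & (SK_GFP_IO | SK_GFP_FS);

	if (io_fs == (SK_GFP_IO | SK_GFP_FS))
		return SK_SCOPE_NONE;
	if (io_fs == SK_GFP_IO)
		return SK_SCOPE_NOFS;
	/* Neither bit set, or FS without IO: take the stricter scope. */
	return SK_SCOPE_NOIO;
}

/* Small self-check covering the three masks named in the patch comment. */
static void sk_self_check(void)
{
	assert(sk_pick_scope(SK_GFP_KERNEL) == SK_SCOPE_NONE);
	assert(sk_pick_scope(SK_GFP_NOFS) == SK_SCOPE_NOFS);
	assert(sk_pick_scope(SK_GFP_NOIO) == SK_SCOPE_NOIO);
}
```

Note that a mask with the FS bit set but the IO bit clear also lands in the
final branch, so the fallthrough is the conservative (NOIO) choice.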