On Tue 03-01-17 11:23:04, Vlastimil Babka wrote:
> On 01/02/2017 02:37 PM, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@xxxxxxxx>
> >
> > Using kmalloc with the vmalloc fallback for larger allocations is a
> > common pattern in the kernel code. Yet we do not have any common helper
> > for that and so users have invented their own helpers. Some of them are
> > really creative when doing so. Let's just add kv[mz]alloc and make sure
> > it is implemented properly. This implementation makes sure to not make
> > a large memory pressure for > PAGE_SIZE requests (__GFP_NORETRY) and also
> > to not warn about allocation failures. This also rules out the OOM
> > killer as the vmalloc is a more appropriate fallback than a disruptive
> > user visible action.
> >
> > This patch also changes some existing users and removes helpers which
> > are specific for them. In some cases this is not possible (e.g.
> > ext4_kvmalloc, libcfs_kvzalloc, __aa_kvmalloc) because those seem to be
> > broken and require GFP_NO{FS,IO} context which is not vmalloc compatible
> > in general (note that the page table allocation is GFP_KERNEL). Those
> > need to be fixed separately.
> >
> > apparmor has already claimed kv[mz]alloc so remove those and use
> > __aa_kvmalloc instead to prevent from the naming clashes.
> >
> > Changes since v1
> > - define __vmalloc_node_flags for CONFIG_MMU=n
> >
> > Cc: Anatoly Stepanov <astepanov@xxxxxxxxxxxxxx>
> > Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > Cc: Mike Snitzer <snitzer@xxxxxxxxxx>
> > Cc: "Michael S. Tsirkin" <mst@xxxxxxxxxx>
> > Cc: "Theodore Ts'o" <tytso@xxxxxxx>
> > Reviewed-by: Andreas Dilger <adilger@xxxxxxxxx> # ext4 part
> > Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
>
> Acked-by: Vlastimil Babka <vbabka@xxxxxxx>
> (but with a small fix and suggestion below)

Thanks!

> >
> > --- a/mm/util.c
> > +++ b/mm/util.c
> > @@ -346,6 +346,46 @@ unsigned long vm_mmap(struct file *file, unsigned long addr,
> >  }
> >  EXPORT_SYMBOL(vm_mmap);
> >
> > +/**
> > + * kvmalloc_node - allocate contiguous memory from SLAB with vmalloc fallback
> > + * @size: size of the request.
> > + * @flags: gfp mask for the allocation - must be compatible with GFP_KERNEL.
> > + * @node: numa node to allocate from
> > + *
> > + * Uses kmalloc to get the memory but if the allocation fails then falls back
> > + * to the vmalloc allocator. Use kvfree for freeing the memory.
> > + */
> > +void *kvmalloc_node(size_t size, gfp_t flags, int node)
> > +{
> > +	gfp_t kmalloc_flags = flags;
> > +	void *ret;
> > +
> > +	/*
> > +	 * vmalloc uses GFP_KERNEL for some internal allocations (e.g page tables)
> > +	 * so the given set of flags has to be compatible.
> > +	 */
> > +	WARN_ON((flags & GFP_KERNEL) != GFP_KERNEL);
>
> Wouldn't a _ONCE be sufficient? It's unlikely that multiple wrong call sites
> appear out of the blue, but we don't want to flood the log from a single
> frequently called site. No strong feelings though.

Fair enough, I will make it WARN_ON_ONCE. I wish WARN_ON_ONCE would be
more clever, though. We can lose information about different call sites.
I was thinking about how to deal with it and stackdepot sounds like it
could help here. But this is off-topic...

> > +
> > +	/*
> > +	 * Make sure that larger requests are not too disruptive - no OOM
> > +	 * killer and no allocation failure warnings as we have a fallback
> > +	 */
> > +	if (size > PAGE_SIZE)
> > +		kmalloc_flags |= __GFP_NORETRY | __GFP_NOWARN;
> > +
> > +	ret = kmalloc_node(size, kmalloc_flags, node);
> > +
> > +	/*
> > +	 * It doesn't really make sense to fallback to vmalloc for sub page
> > +	 * requests
> > +	 */
> > +	if (ret || size < PAGE_SIZE)
>
> This should be size <= PAGE_SIZE.

You are right of course!

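To make the boundary case concrete, here is a minimal userspace sketch of
the two size checks with the <= fix folded in. It is only a model of the
decision logic, not the kernel code: MODEL_PAGE_SIZE, the MODEL_GFP_* bits
and both function names are hypothetical stand-ins, and the 4096-byte page
is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for this model; not the kernel's definitions. */
#define MODEL_PAGE_SIZE   ((size_t)4096)
#define MODEL_GFP_NORETRY 0x1u
#define MODEL_GFP_NOWARN  0x2u

static_assert((MODEL_GFP_NORETRY & MODEL_GFP_NOWARN) == 0,
	      "model flag bits must not overlap");

/*
 * Flags used for the kmalloc attempt: only requests larger than one page
 * get the "do not retry, do not warn" treatment.
 */
unsigned int model_kmalloc_flags(size_t size, unsigned int flags)
{
	if (size > MODEL_PAGE_SIZE)
		flags |= MODEL_GFP_NORETRY | MODEL_GFP_NOWARN;
	return flags;
}

/*
 * Returns true when the vmalloc fallback should be tried.
 * kmalloc_ret is the result of the kmalloc attempt (NULL on failure).
 * Uses the corrected check: size <= PAGE_SIZE never falls back.
 */
bool model_wants_vmalloc(const void *kmalloc_ret, size_t size)
{
	if (kmalloc_ret || size <= MODEL_PAGE_SIZE)
		return false;
	return true;
}
```

With the original size < PAGE_SIZE test, a failed request of exactly one
page would have gone on to vmalloc even though it was not given
__GFP_NORETRY | __GFP_NOWARN by the size > PAGE_SIZE branch. With <= the
two checks agree on where the boundary is.
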
--
Michal Hocko
SUSE Labs