On Sat, Oct 16, 2021 at 02:51:30AM -0400, Paolo Bonzini wrote:
> Commit 7661809d493b ("mm: don't allow oversized kvmalloc() calls")
> restricted memory allocation with 'kvmalloc()' to sizes that fit
> in an 'int', to protect against trivial integer conversion issues.
>
> However, the WARN triggers with KVM, when it allocates ancillary page
> data whose size essentially depends on whatever userspace has passed to
> the KVM_SET_USER_MEMORY_REGION ioctl. The warnings are easily raised by
> syzkaller, but the largest allocation that KVM can do is 8 bytes per page
> of guest memory; therefore, a 1 TiB memslot will cause a warning even
> outside fuzzing, and those allocations are known to happen in the wild.
> Google for example already has VMs that create 1.5tb memslots (12tb of
> total guest memory spread across 8 virtual NUMA nodes).
>
> Use memcg accounting as evidence that the crazy large allocations are
> expected---in which case, it is indeed a good idea to have them
> properly accounted---and exempt them from the warning.

Will memcg always have a "sane" upper bound? If so, yeah, this seems a
better solution than dropping the WARN completely. :)

Reviewed-by: Kees Cook <keescook@xxxxxxxxxxxx>

-Kees

>
> Cc: Willy Tarreau <w@xxxxxx>
> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Reported-by: syzbot+e0de2333cbf95ea473e8@xxxxxxxxxxxxxxxxxxxxxxxxx
> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> ---
> Linus, what do you think of this? It is a bit of a hack,
> but the reasoning in the commit message does make at least
> some sense.
>
> The alternative would be to just use __vmalloc in KVM, and add
> __vcalloc too. The two underscores would suggest that something
> "different" is going on, but I wonder what you prefer between
> this and having a __vcalloc with 2-3 uses in the whole source.
>
>  mm/util.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/mm/util.c b/mm/util.c
> index 499b6b5767ed..31fca4a999c6 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -593,8 +593,12 @@ void *kvmalloc_node(size_t size, gfp_t flags, int node)
>  	if (ret || size <= PAGE_SIZE)
>  		return ret;
>
> -	/* Don't even allow crazy sizes */
> -	if (WARN_ON_ONCE(size > INT_MAX))
> +	/*
> +	 * Don't even allow crazy sizes unless memcg accounting is
> +	 * request. We take that as a sign that huge allocations
> +	 * are indeed expected.
> +	 */
> +	if (likely(!(flags & __GFP_ACCOUNT)) && WARN_ON_ONCE(size > INT_MAX))
>  		return NULL;
>
>  	return __vmalloc_node(size, 1, flags, node,
> --
> 2.27.0
>

--
Kees Cook
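
For reference, the arithmetic behind the "1 TiB memslot" claim in the quoted commit message can be checked with a small userspace model. This is only a sketch: it assumes 4 KiB guest pages and a 32-bit 'int', and MODEL_GFP_ACCOUNT is a hypothetical stand-in for the kernel's __GFP_ACCOUNT bit, not its real value. model_rejects() mirrors only the condition in the proposed hunk; it does not model the WARN_ON_ONCE() side effect or the rest of kvmalloc_node().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB guest pages. The 8 bytes/page figure is from the commit message. */
#define MODEL_PAGE_SIZE      4096ULL
#define MODEL_BYTES_PER_PAGE 8ULL

/* Hypothetical stand-in for __GFP_ACCOUNT; the real bit is defined in kernel headers. */
#define MODEL_GFP_ACCOUNT 0x1u

/* Size in bytes of the per-page ancillary array for a memslot of the given size. */
uint64_t model_ancillary_bytes(uint64_t memslot_bytes)
{
	return (memslot_bytes / MODEL_PAGE_SIZE) * MODEL_BYTES_PER_PAGE;
}

/* Condition from the proposed hunk: reject only unaccounted sizes above INT_MAX. */
bool model_rejects(uint64_t size, unsigned int flags)
{
	return !(flags & MODEL_GFP_ACCOUNT) && size > (uint64_t)INT_MAX;
}

/* Checks the 1 TiB case described in the commit message under the assumptions above. */
void model_self_check(void)
{
	uint64_t one_tib = 1ULL << 40;
	uint64_t need = model_ancillary_bytes(one_tib);

	/* 2^40 / 2^12 pages * 2^3 bytes = 2^31 bytes, one more than a 32-bit INT_MAX. */
	assert(need == (1ULL << 31));
	assert(model_rejects(need, 0));
	assert(!model_rejects(need, MODEL_GFP_ACCOUNT));
}
```

Under these assumptions a 1 TiB memslot needs exactly 2^31 bytes of ancillary data, which is INT_MAX + 1 for a 32-bit 'int', so the unaccounted path is refused while the accounted path falls through to the vmalloc fallback.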