On Sun, Oct 20, 2024 at 05:44:37PM +0100, Lorenzo Stoakes wrote: > +cc Linus, vmalloc reviewers > > On Sun, Oct 20, 2024 at 09:00:07AM -0400, Kent Overstreet wrote: > > On Sun, Oct 20, 2024 at 12:45:33PM +0100, Lorenzo Stoakes wrote: > > > On Sat, Oct 19, 2024 at 05:00:37PM -0400, Kent Overstreet wrote: > > > > A user with a 75 TB filesystem reported the following journal replay > > > > error: > > > > https://github.com/koverstreet/bcachefs/issues/769 > > > > > > > > In journal replay we have to sort and dedup all the keys from the > > > > journal, which means we need a large contiguous allocation. Given that > > > > the user has 128GB of ram, the 2GB limit on allocation size has become > > > > far too small. > > > > > > > > Cc: Vlastimil Babka <vbabka@xxxxxxx> > > > > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> > > > > Signed-off-by: Kent Overstreet <kent.overstreet@xxxxxxxxx> > > > > --- > > > > mm/util.c | 6 ------ > > > > 1 file changed, 6 deletions(-) > > > > > > > > diff --git a/mm/util.c b/mm/util.c > > > > index 4f1275023eb7..c60df7723096 100644 > > > > --- a/mm/util.c > > > > +++ b/mm/util.c > > > > @@ -665,12 +665,6 @@ void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node) > > > > if (!gfpflags_allow_blocking(flags)) > > > > return NULL; > > > > > > > > - /* Don't even allow crazy sizes */ > > > > - if (unlikely(size > INT_MAX)) { > > > > - WARN_ON_ONCE(!(flags & __GFP_NOWARN)); > > > > - return NULL; > > > > - } > > > > - > > > > > > Err, and not replace it with _any_ limit? That seems very unwise. > > > > large allocations will go to either the page allocator or vmalloc, and > > they have their own limits. > > Ah actually I misread it here, I see the allocation gets immediately sent > off to __kmalloc_node_noprof() and thus that'll apply its own limits before > doing this check prior to the vmalloc call. > > We actually do have a basic check in __vmalloc_node_range_noprof() that > prevents _totally_ insane requests, checking that size >> PAGE_SHIFT <= > totalram_pages(), so we shouldn't get anything too stupid here (I am > thinking especially of ptr + size overflow type situations). Yep, exactly. > But Linus explicitly introduced this INT_MAX check in commit 7661809d493b > ("mm: don't allow oversized kvmalloc() calls"), presumably for a reason, so > have cc'd him here in case he has an objection to this which amounts to a > revert of that patch. Ah, thanks for adding him. Yeah, the concern historically has been that integer truncation bugs are really nasty and entirely too common, and we have little in the way of tooling that can catch that. Unfortunately that's still the case today. Kees has been doing work on catching integer overflow, but I don't think we have anything coming yet for catching integer truncation. But given that vmalloc() already supports > INT_MAX requests, and memory sizes keep growing so 2GB is getting pretty small - I think it's time, this is going to come up in other places sooner or later.