Re: [PATCH 3/5] mm: BUG_ON to avoid NULL deference while __GFP_NOFAIL fails

Michal Hocko <mhocko@xxxxxxxx> · Wed, 24 Jul 2024 14:10:29 +0200

On Wed 24-07-24 20:55:42, Barry Song wrote:
> From: Barry Song <v-songbaohua@xxxxxxxx>
> 
> We have cases we still fail though callers might have __GFP_NOFAIL.
> Since they don't check the return, we are exposed to the security
> risks for NULL deference.
> 
> Though BUG_ON() is not encouraged by Linus, this is an unrecoverable
> situation.
> 
> Christoph Hellwig:
> The whole freaking point of __GFP_NOFAIL is that callers don't handle
> allocation failures.  So in fact a straight BUG is the right thing
> here.
> 
> Vlastimil Babka:
> It's just not a recoverable situation (WARN_ON is for recoverable
> situations). The caller cannot handle allocation failure and at the same
> time asked for an impossible allocation. BUG_ON() is a guaranteed oops
> with stracktrace etc. We don't need to hope for the later NULL pointer
> dereference (which might if really unlucky happen from a different
> context where it's no longer obvious what lead to the allocation failing).
> 
> Michal Hocko:
> Linus tends to be against adding new BUG() calls unless the failure is
> absolutely unrecoverable (e.g. corrupted data structures etc.). I am
> not sure how he would look at simply incorrect memory allocator usage to
> blow up the kernel. Now the argument could be made that those failures
> could cause subtle memory corruptions or even be exploitable which might
> be a sufficient reason to stop them early.

I think it is worth adding that size checks are not really actionable
because they either cause an unexpected failure or a BUG_ON. It is not
too much of a stretch to expect that some user-triggerable code paths
could hit this, e.g. when input is not checked properly. Silent failure
is then a potential security risk.

The page allocator, on the other hand, can choose to keep retrying even
if that means that there is no reclaim going on, essentially causing a
busy loop in kernel space. That would eventually cause the soft/hard
lockup detector to fire (if the architecture offers a reliable one).
So essentially there is a choice between two bad solutions and you have
chosen the one that reliably bugs on rather than relying on something
external to intervene. The reasoning for that should be mentioned in the
changelog.
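
To make the trade-off concrete, here is a rough userspace toy model of
the two options, not kernel code. All names (model_nofail, the policy
and result enums, watchdog_budget) are made up for illustration, and
the "watchdog" is just an attempt counter standing in for an external
lockup detector.

```c
/* Toy model only: hypothetical names, no relation to real mm/ code. */
enum nofail_policy { POLICY_RETRY_FOREVER, POLICY_BUG_EARLY };
enum nofail_result { RESULT_ALLOCATED, RESULT_BUG, RESULT_LOCKUP_DETECTED };

/*
 * succeeds_on_attempt: 1-based attempt at which the fake allocation
 *                      succeeds; 0 means it can never succeed.
 * watchdog_budget:     number of failed attempts after which the
 *                      (assumed) external lockup detector steps in.
 */
static enum nofail_result model_nofail(enum nofail_policy policy,
				       int succeeds_on_attempt,
				       int watchdog_budget)
{
	for (int attempt = 1; ; attempt++) {
		if (attempt == succeeds_on_attempt)
			return RESULT_ALLOCATED;
		/* Option 1: stop right at the failing allocation. */
		if (policy == POLICY_BUG_EARLY)
			return RESULT_BUG;
		/* Option 2: spin until something external notices. */
		if (attempt >= watchdog_budget)
			return RESULT_LOCKUP_DETECTED;
	}
}
```

For an allocation that can never succeed, the first policy reports the
problem at the call site, while the second one only terminates if the
external detector exists and fires.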

[...]
> diff --git a/mm/util.c b/mm/util.c
> index 0ff5898cc6de..a1be50c243f1 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -668,6 +668,7 @@ void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node)
>  	/* Don't even allow crazy sizes */
>  	if (unlikely(size > INT_MAX)) {
>  		WARN_ON_ONCE(!(flags & __GFP_NOWARN));
> +		BUG_ON(flags & __GFP_NOFAIL);

I guess you want to switch the ordering. A WARNING on top of a BUG_ON
seems rather pointless IMHO.
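
Roughly what I have in mind, written as a userspace sketch rather than
a tested kernel change. The MODEL_* flag bits and model_size_check()
are hypothetical stand-ins; the function only classifies which branch
would be taken once the NOFAIL check comes first.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical flag bits for this model; not the kernel's gfp_t values. */
#define MODEL_GFP_NOWARN 0x1u
#define MODEL_GFP_NOFAIL 0x2u

enum model_outcome {
	MODEL_ALLOC_OK,		/* size check passed */
	MODEL_FAIL_SILENT,	/* NULL returned, no warning */
	MODEL_FAIL_WARN,	/* NULL returned after a one-time warning */
	MODEL_BUG,		/* would hit BUG_ON() */
};

static enum model_outcome model_size_check(size_t size, unsigned int flags)
{
	if (size > INT_MAX) {
		/* The NOFAIL check goes first: no warning before the BUG. */
		if (flags & MODEL_GFP_NOFAIL)
			return MODEL_BUG;
		if (!(flags & MODEL_GFP_NOWARN))
			return MODEL_FAIL_WARN;
		return MODEL_FAIL_SILENT;
	}
	return MODEL_ALLOC_OK;
}
```

With this ordering a NOFAIL caller never gets the extra warning, and
everybody else keeps the existing behavior.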

>  		return NULL;
>  	}
>  
> -- 
> 2.34.1

-- 
Michal Hocko
SUSE Labs