Re: [PATCH v4 14/36] lib: add allocation tagging support for memory allocation profiling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Feb 21, 2024 at 07:34:44PM -0500, Kent Overstreet wrote:
> On Wed, Feb 21, 2024 at 04:25:02PM -0800, Kees Cook wrote:
> > On Wed, Feb 21, 2024 at 06:29:17PM -0500, Kent Overstreet wrote:
> > > On Wed, Feb 21, 2024 at 03:05:32PM -0800, Kees Cook wrote:
> > > > On Wed, Feb 21, 2024 at 11:40:27AM -0800, Suren Baghdasaryan wrote:
> > > > > [...]
> > > > > +struct alloc_tag {
> > > > > +	struct codetag			ct;
> > > > > +	struct alloc_tag_counters __percpu	*counters;
> > > > > +} __aligned(8);
> > > > > [...]
> > > > > +#define DEFINE_ALLOC_TAG(_alloc_tag)						\
> > > > > +	static DEFINE_PER_CPU(struct alloc_tag_counters, _alloc_tag_cntr);	\
> > > > > +	static struct alloc_tag _alloc_tag __used __aligned(8)			\
> > > > > +	__section("alloc_tags") = {						\
> > > > > +		.ct = CODE_TAG_INIT,						\
> > > > > +		.counters = &_alloc_tag_cntr };
> > > > > [...]
> > > > > +static inline struct alloc_tag *alloc_tag_save(struct alloc_tag *tag)
> > > > > +{
> > > > > +	swap(current->alloc_tag, tag);
> > > > > +	return tag;
> > > > > +}
> > > > 
> > > > Future security hardening improvement idea based on this infrastructure:
> > > > it should be possible to implement per-allocation-site kmem caches. For
> > > > example, we could create:
> > > > 
> > > > struct alloc_details {
> > > > 	u32 flags;
> > > > 	union {
> > > > 		u32 size; /* not valid after __init completes */
> > > > 		struct kmem_cache *cache;
> > > > 	};
> > > > };
> > > > 
> > > > - add struct alloc_details to struct alloc_tag
> > > > - move the tags section into .ro_after_init
> > > > - extend alloc_hooks() to populate flags and size:
> > > > 	.flags = __builtin_constant_p(size) ? KMALLOC_ALLOCATE_FIXED
> > > > 					    : KMALLOC_ALLOCATE_BUCKETS;
> > > > 	.size = __builtin_constant_p(size) ? size : SIZE_MAX;
> > > > - during kernel start or module init, walk the alloc_tag list
> > > >   and create either a fixed-size kmem_cache or to allocate a
> > > >   full set of kmalloc-buckets, and update the "cache" member.
> > > > - adjust kmalloc core routines to use current->alloc_tag->cache instead
> > > >   of using the global buckets.
> > > > 
> > > > This would get us fully separated allocations, producing better than
> > > > type-based levels of granularity, exceeding what we have currently with
> > > > CONFIG_RANDOM_KMALLOC_CACHES.
> > > > 
> > > > Does this look possible, or am I misunderstanding something in the
> > > > infrastructure being created here?
> > > 
> > > Definitely possible, but... would we want this?
> > 
> > Yes, very very much. One of the worst and mostly unaddressed weaknesses
> > with the kernel right now is use-after-free based type confusion[0], which
> > depends on merged caches (or cache reuse).
> > 
> > This doesn't solve cross-allocator (kmalloc/page_alloc) type confusion
> > (as terrifyingly demonstrated[1] by Jann Horn), but it does help with
> > what has been a very common case of "use msg_msg to impersonate your
> > target object"[2] exploitation.
> 
> We have a ton of code that references PAGE_SIZE and uses the page
> allocator completely unnecessarily - that's something worth harping
> about at conferences; if we could motivate people to clean that stuff up
> it'd have a lot of positive effects.
> 
> > > That would produce a _lot_ of kmem caches
> > 
> > Fewer than you'd expect, but yes, there is some overhead. However,
> > out-of-tree forks of Linux have successfully experimented with this
> > already and seen good results[3].
> 
> So in that case - I don't think there's any need for a separate
> alloc_details; we'd just add a kmem_cache * to alloc_tag and then hook
> into the codetag init/unload path to create and destroy the kmem caches.

Okay, sounds good. There needs to be a place to track "is this a fixed
size or a run-time size" choice.

> No need to adjust the slab code either; alloc_hooks() itself could
> dispatch to kmem_cache_alloc() instead of kmalloc() if this is in use.

Right, it'd go to either kmem_cache_alloc() directly, or to a modified
kmalloc() that used the passed-in cache is the base for an array of sized
buckets, rather than the global (or 16-way global) buckets.

Yay for the future!

-- 
Kees Cook




[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux