On Thu, Oct 03, 2024 at 05:50:39PM +0100, Matthew Wilcox wrote:
> On Thu, Oct 03, 2024 at 04:27:12PM +0200, Vlastimil Babka wrote:
> > On 9/25/24 21:46, Matthew Wilcox wrote:
> > > Kees and I had a fun discussion at Plumbers.
> > >
> > > We're trying to harden against type confusion, where we think we have
> > > a pointer to one thing, but it turns out to be a pointer to a different
> > > thing. There's various ways this can be harmful, which Kees has laid out
> > > before when adding slab buckets. eg see https://lwn.net/Articles/978976/
> > >
> > > Not all allocations come from slab though. If we free a slab object
> > > and the slab it was in gets freed back to the page allocator, it can
> > > turn into almost anything else _quickly_ as the page allocator fronts
> > > the buddy allocator with a stack of recently-freed pages (called PCP,
> > > not to be confused with percpu memory), so if the attacker can arrange
> > > for a page table allocation to come in soon after a slab free, it is
> > > very likely to be the memory they have access to.
> > >
> > > My proposal is that we resolve this "type confusion" by having separate
> > > PCP lists for different types of pages. We'll need to have this for
> > > memdescs anyway, so this is just shifting some of the work left.
> > >
> > > We'd reduce the exploitability of type confusion by using a per-CPU,
> > > per-type stack of recently used pages. To turn a slab page into a page
> > > table page, the attacker would have to cause a dozen slabs to be freed on
> > > this CPU, pushing this one into the buddy allocator. Then they'd have
> > > to cause the allocating task to empty its stack of page table pages,
> > > causing the attackable slab to be pulled from the buddy. It's still
> > > possible, but it's harder.
> > >
> > > Harder enough? I don't know, hence this email. We can get into the
> > > API design (and then the implementation design) if we have agreement
> > > that this is the right approach to be taking.
> >
> > Not a security expert but I doubt it's harder enough?
> >
> > I thought the robust mitigation here was SLAB_VIRTUAL
>
> Well, this is for allocations that _don't_ come from slab. Like page
> tables and page cache or anonymous memory.

I'd really like to hear Jann's thoughts on this. My instinct is that if
it makes it harder to attack but provides some better performance or
reliability characteristics, it's very worth it. :)

--
Kees Cook
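
To make the quoted per-type PCP proposal concrete, here is a toy userspace
model of it. This is only a sketch under loud assumptions: none of these
names (toy_cpu, toy_free, toy_alloc, PCP_DEPTH) are kernel APIs, the depth
of 12 just mirrors the "dozen" in the proposal, and a full stack spills its
oldest page one at a time, whereas the real PCP lists are drained in
batches.

```c
#include <assert.h>
#include <string.h>

/* Toy model only: hypothetical names, not kernel interfaces. */
enum page_type { PT_SLAB, PT_PGTABLE, PT_NR };

#define PCP_DEPTH 12	/* assumed depth, "a dozen" in the proposal */
#define BUDDY_MAX 64	/* capacity of the shared pool in this model */

struct pcp_stack {
	int pages[PCP_DEPTH];
	int count;
};

struct toy_cpu {
	struct pcp_stack pcp[PT_NR];	/* one stack per page type */
	int buddy[BUDDY_MAX];		/* shared pool behind the stacks */
	int buddy_count;
	int next_fresh;			/* id handed out when everything is empty */
};

/* Free a page onto its own type's stack, spilling the oldest if full. */
static void toy_free(struct toy_cpu *c, enum page_type t, int page)
{
	struct pcp_stack *s = &c->pcp[t];

	if (s->count == PCP_DEPTH) {
		assert(c->buddy_count < BUDDY_MAX);
		c->buddy[c->buddy_count++] = s->pages[0];
		memmove(&s->pages[0], &s->pages[1],
			(PCP_DEPTH - 1) * sizeof(s->pages[0]));
		s->count--;
	}
	s->pages[s->count++] = page;
}

/* Allocate from the type's own stack first, then from the shared pool. */
static int toy_alloc(struct toy_cpu *c, enum page_type t)
{
	struct pcp_stack *s = &c->pcp[t];

	if (s->count > 0)
		return s->pages[--s->count];
	if (c->buddy_count > 0)
		return c->buddy[--c->buddy_count];
	return c->next_fresh++;
}

/*
 * Count how many further slab frees are needed before a page table
 * allocation hands back a previously freed slab page (the "victim").
 */
static int frees_until_reuse(void)
{
	struct toy_cpu c = { .next_fresh = 1000 };
	const int victim = 1;
	int extra = 0;

	toy_free(&c, PT_SLAB, victim);
	for (;;) {
		if (toy_alloc(&c, PT_PGTABLE) == victim)
			return extra;
		toy_free(&c, PT_SLAB, 100 + extra);
		extra++;
	}
}
```

In this model frees_until_reuse() returns PCP_DEPTH: the freed slab page
only reaches a page table allocation after a dozen more slab frees push it
into the shared pool and the page table stack is empty. A free followed by
an allocation of the same type still reuses the page immediately, which is
the behaviour a single shared list gives every type today.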