Re: How does swsusp work with randomization features? (was: mm/slab: Initialise random_kmalloc_seed after initcalls)

On Sat, Feb 15, 2025 at 05:53:29PM +0800, Huacai Chen wrote:
> On Fri, Feb 14, 2025 at 8:45 PM Harry (Hyeonggon) Yoo
> <42.hyeyoo@xxxxxxxxx> wrote:
> >
> > On Fri, Feb 14, 2025 at 06:02:52PM +0800, Huacai Chen wrote:
> > > On Fri, Feb 14, 2025 at 5:33 PM Harry (Hyeonggon) Yoo
> > > <42.hyeyoo@xxxxxxxxx> wrote:
> > > >
> > > > On Thu, Feb 13, 2025 at 11:20:22AM +0800, Huacai Chen wrote:
> > > > > Hi, Harry,
> > > > >
> > > > > On Wed, Feb 12, 2025 at 11:39 PM Harry (Hyeonggon) Yoo
> > > > > <42.hyeyoo@xxxxxxxxx> wrote:
> > > > > > On Wed, Feb 12, 2025 at 11:17 PM Huacai Chen <chenhuacai@xxxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > Hibernation assumes the memory layout after resume be the same as that
> > > > > > > before sleep, but CONFIG_RANDOM_KMALLOC_CACHES breaks this assumption.
> > > > > >
> > > > > > Could you please elaborate what do you mean by
> > > > > > hibernation assumes 'the memory layout' after resume be the same as that
> > > > > > before sleep?
> > > > > >
> > > > > > I don't understand how updating random_kmalloc_seed breaks resuming from
> > > > > > hibernation. Changing random_kmalloc_seed affects which kmalloc caches
> > > > > > newly allocated objects are from, but it should not affect the objects that are
> > > > > > already allocated (before hibernation).
> > > > >
> > > > > When resuming, the booting kernel should switch to the target kernel,
> > > > > if the address of switch code (from the booting kernel) is the
> > > > > effective data of the target kernel, then the switch code may be
> > > > > overwritten.
> > > >
> > > > Hmm... I'm still missing some pieces.
> > > > How is the kernel binary overwritten when slab allocations are randomized?
> > > >
> > > > Also, I'm not sure if it's even safe to assume that the memory layout is the
> > > > same across boots. But I'm not an expert on swsusp anyway...
> > > >
> > > > It'd be really helpful for linux-pm folks to clarify 1) what the
> > > > (architecture-independent) assumptions are for swsusp to work, and
> > > > 2) how architectures deal with other randomization features like kASLR...
> > >
> >
> > [+Cc few more people that worked on slab hardening]
> >
> > > I'm sorry to confuse you. Binary overwriting is indeed caused by
> > > kASLR, so at least on LoongArch we should disable kASLR for
> > > hibernation.
> >
> > Understood.
> >
> > > Random kmalloc is another story, on LoongArch it breaks smpboot when
> > > resuming, the details are:
> > > 1, LoongArch uses kmalloc() family to allocate idle_task's
> > > stack/thread_info and other data structures.
> > > 2, If random kmalloc is enabled, idle_task's stack in the booting
> > > kernel may be other things in the target kernel.
> >
> > Slab hardening features try so hard to prevent such predictability.
> > For example, SLAB_FREELIST_RANDOM could also randomize the address
> > kmalloc objects are allocated at.
> >
> > Rather than hacking CONFIG_RANDOM_KMALLOC_CACHES like this, we could
> > have a single option to disable slab hardening features that makes
> > the address unpredictable.
> >
> > It'd be nice to have something like ARCH_SUPPORTS_SLAB_RANDOM which
> > some hardening features depend on. And then let some arches conditionally
> > not select ARCH_SUPPORTS_SLAB_RANDOM if hibernation's enabled
> > (at cost of less hardening)?
>
> This is not good, my patch doesn't disable RANDOM for hibernation, it
> just delays the initialization. When the system is running, all
> randomization is still usable.

I think at least we need a rule (like ARCH_SUPPORTS_SLAB_RANDOM)
for slab hardening features that prevents breaking hibernation
in the future. Without rules, introducing new hardening features could
break hibernation again.

But I'm not yet convinced that it's worth the complexity of hacking slab
hardening features (which exist for security) just to make hibernation
work on some arches, which have already disabled kASLR anyway...
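
To make the seed dependence concrete, here is a minimal userspace sketch
(not kernel code) of how, as far as I understand it,
CONFIG_RANDOM_KMALLOC_CACHES picks a cache copy: the caller address is
XORed with random_kmalloc_seed and hashed down to 4 bits. hash_64() below
mirrors the generic multiplicative hash in include/linux/hash.h;
cache_index() and NR_SLOTS_LOG2 are names made up for this example.

```c
#include <stdint.h>

/* Same multiplier as GOLDEN_RATIO_64 in include/linux/hash.h. */
#define GOLDEN_RATIO_64 0x61C8864680B583EBull

/* 15 random copies plus the normal cache give 16 slots, i.e. 4 hash bits.
 * (Hypothetical name, used only in this sketch.) */
#define NR_SLOTS_LOG2 4

/* Multiplicative hash that keeps the top `bits` bits of the product. */
static unsigned int hash_64(uint64_t val, unsigned int bits)
{
	return (unsigned int)((val * GOLDEN_RATIO_64) >> (64 - bits));
}

/* Hypothetical helper: the cache copy a call site maps to under a seed. */
static unsigned int cache_index(uint64_t caller, uint64_t seed)
{
	return hash_64(caller ^ seed, NR_SLOTS_LOG2);
}
```

For the same call site, two different seeds will generally give two
different indices: cache_index(3, 2) is 6 while cache_index(3, 1) is 12.
So an allocation made by the booting kernel and the "same" allocation in
the target kernel need not come from the same cache copy unless both
kernels use the same seed at that point.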

> For SLAB_FREELIST_RANDOM, I found that it doesn't break hibernation
> (at least on LoongArch), the reason is:
> 1. When I said "data overwritten" before, it doesn't mean that every
> byte shouldn't be overwritten, only some important parts matter.
> 2. On LoongArch, the important parts include: switch code, exception
> handlers, idle_task's stack/thread_info.
> 3. switch code and exception handlers are protected by automatically
> disabling kASLR from arch-specific code, idle_task's stack/thread_info
> is protected by delaying random seeds (this patch).
> 
> Why SLAB_FREELIST_RANDOM doesn't corrupt idle_task's
> stack/thread_info? Because the scope of randomization of
> SLAB_FREELIST_RANDOM is significantly less than RANDOM_KMALLOC_CACHES.
> When RANDOM_KMALLOC_CACHES enabled,

You mean when SLAB_FREELIST_RANDOM is enabled?
Assuming so...

> the CPU1's idle task stack from
> the booting kernel may be the CPU2's idle task stack from the target
> kernel, and CPU2's idle task stack from the booting kernel may be the
> CPU1's idle task stack from the target kernel

What happens if it's not the case?

> but idle task's stack
> from the booting kernel won't be other things from the target kernel
> (and won't be overwritten by switching kernel).

What guarantees that it won't be overwritten?
To me it seems to be a fragile assumption that could be broken.
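
As a rough model of what SLAB_FREELIST_RANDOM changes, the sketch below
shuffles the slot order of a single slab with a Fisher-Yates pass. It is
a userspace illustration only: xorshift64() stands in for the kernel's
random source, shuffle_freelist() is a made-up name, and the model says
nothing about which pages the allocator hands out for a slab, which is
exactly the part the question above is about.

```c
#include <stddef.h>
#include <stdint.h>

/* Small xorshift PRNG; a stand-in for the kernel's random source. */
static uint64_t xorshift64(uint64_t *state)
{
	uint64_t x = *state;

	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	*state = x;
	return x;
}

/*
 * Hypothetical helper: fill order[] with 0..n-1, then shuffle it.
 * order[k] is the slot handed out by the k-th allocation from the slab.
 */
static void shuffle_freelist(unsigned int *order, size_t n, uint64_t seed)
{
	uint64_t state = seed ? seed : 1; /* xorshift must not start at 0 */
	size_t i;

	for (i = 0; i < n; i++)
		order[i] = (unsigned int)i;

	/* Fisher-Yates: swap the last unshuffled slot with a random earlier one. */
	for (i = n; i > 1; i--) {
		size_t j = (size_t)(xorshift64(&state) % i);
		unsigned int tmp = order[i - 1];

		order[i - 1] = order[j];
		order[j] = tmp;
	}
}
```

In this model the permutation stays within one slab's slots, which matches
the "smaller scope" argument. It does not show that the slab backing the
idle task's stack sits at the same address in both kernels.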

Am I missing something?

-- 
Harry



