Re: [PATCH v2 0/5] implement lightweight guard pages

Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> · Wed, 23 Oct 2024 12:40:38 +0100

On Wed, Oct 23, 2024 at 01:36:10PM +0200, David Hildenbrand wrote:
> On 23.10.24 13:31, Marco Elver wrote:
> > On Wed, 23 Oct 2024 at 11:29, David Hildenbrand <david@xxxxxxxxxx> wrote:
> > >
> > > On 23.10.24 11:18, Lorenzo Stoakes wrote:
> > > > On Wed, Oct 23, 2024 at 11:13:47AM +0200, David Hildenbrand wrote:
> > > > > On 23.10.24 11:06, Vlastimil Babka wrote:
> > > > > > On 10/23/24 10:56, Dmitry Vyukov wrote:
> > > > > > > >
> > > > > > > > Overall while I sympathise with this, it feels dangerous and a pretty major
> > > > > > > > change, because there'll be something somewhere that will break because it
> > > > > > > > expects faults to be swallowed that we no longer do swallow.
> > > > > > > >
> > > > > > > > So I'd say it'd be something we should defer, but of course it's a highly
> > > > > > > > user-facing change so how easy that would be I don't know.
> > > > > > > >
> > > > > > > > > But I definitely don't think an 'introduce the ability to do cheap PROT_NONE
> > > > > > > > > guards' series is the place to also fundamentally change how user access
> > > > > > > > page faults are handled within the kernel :)
> > > > > > >
> > > > > > > Will delivering signals on kernel access be a backwards compatible
> > > > > > > change? Or will we need a different API? MADV_GUARD_POISON_KERNEL?
> > > > > > > It's just somewhat painful to detect/update all userspace if we add
> > > > > > > this feature in future. Can we say signal delivery on kernel accesses
> > > > > > > is unspecified?
> > > > > >
> > > > > > Would adding signal delivery to guard PTEs only help enough the ASAN etc
> > > > > > usecase? Wouldn't it be instead possible to add some prctl to opt-in the
> > > > > > whole ASANized process to deliver all existing segfaults as signals instead
> > > > > > of -EFAULT ?
> > > > >
> > > > > Not sure if it is an "instead", you might have to deliver the signal in
> > > > > addition to letting the syscall fail (not that I would be an expert on
> > > > > signal delivery :D ).
> > > > >
> > > > > prctl sounds better, or some way to configure the behavior on VMA ranges;
> > > > > otherwise we would need yet another marker, which is not the end of the
> > > > > world but would make it slightly more confusing.
> > > > >
> > > >
> > > > Yeah prctl() sounds sensible, and since we are explicitly adding a marker
> > > > for guard pages here we can do this as a follow up too without breaking any
> > > > userland expectations, i.e. 'new feature to make guard pages signal' is not
> > > > going to contradict the default behaviour.
> > > >
> > > > So all makes sense to me, but I do think best as a follow up! :)
> > >
> > > Yeah, fully agreed. And my gut feeling is that it might not be that easy
> > > ... :)
> > >
> > > In the end, what we want is *some* notification that a guard PTE was
> > > accessed. Likely the notification need not be completely synchronous
> > > (although that would be ideal), and it need not be a signal.
> > >
> > > Maybe having a different way to obtain that information from user space
> > > would work.
> >
> > For bug detection tools (like GWP-ASan [1]) it's essential to have
> > useful stack traces. As such, having this signal be synchronous would
> > be more useful. I don't see how one could get a useful stack trace (or
> > other information like what's stashed away in ucontext like CPU
> > registers) if this were asynchronous.
>
> Yes, I know. But it would be better than not getting *any* notification
> except for some syscalls simply failing with -EFAULT, and not having any idea
> which address was even accessed.
>
> Maybe the signal injection is easier than I think, but I somehow doubt it
> ...

Yeah I'm afraid I don't think this series is a place where I can
fundamentally change how something so sensitive works in the kernel.

It's especially sensitive because this is a uAPI change, and a wrong
decision here could result in guard pages being broken out of the gate. I
really don't want to risk that.
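
For anyone following along, here is a minimal userspace sketch of the
existing behaviour being discussed. It uses a plain PROT_NONE mapping as a
stand-in for a guard page (not the new madvise() operation from this
series), and the helper name read_into_prot_none() is made up for
illustration. The assumption is that the kernel's copy into the
inaccessible page fails, so the syscall returns -EFAULT and no SIGSEGV is
delivered:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <errno.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the series): ask the kernel to copy
 * one byte from a pipe into a PROT_NONE page and report what happened.
 *
 * Returns the errno set by read(), 0 if read() unexpectedly succeeded,
 * or -1 if the setup itself failed.
 */
int read_into_prot_none(void)
{
	long page = sysconf(_SC_PAGESIZE);
	int fds[2];
	int err = -1;
	char *guard;

	if (page <= 0)
		return -1;

	/* Stand-in for a guard page: mapped, but no access permitted. */
	guard = mmap(NULL, (size_t)page, PROT_NONE,
		     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (guard == MAP_FAILED)
		return -1;

	if (pipe(fds) == 0) {
		if (write(fds[1], "x", 1) == 1) {
			/* The kernel, not userspace, touches the page here. */
			errno = 0;
			err = (read(fds[0], guard, 1) < 0) ? errno : 0;
		}
		close(fds[0]);
		close(fds[1]);
	}

	munmap(guard, (size_t)page);
	return err;
}
```

On Linux I would expect this to return EFAULT. A direct load or store to
the same address from userspace would raise SIGSEGV instead, and that
difference is what the signal-on-kernel-access idea would change.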

>
> --
> Cheers,
>
> David / dhildenb
>