On 23.10.24 13:31, Marco Elver wrote:
On Wed, 23 Oct 2024 at 11:29, David Hildenbrand <david@xxxxxxxxxx> wrote:
On 23.10.24 11:18, Lorenzo Stoakes wrote:
On Wed, Oct 23, 2024 at 11:13:47AM +0200, David Hildenbrand wrote:
On 23.10.24 11:06, Vlastimil Babka wrote:
On 10/23/24 10:56, Dmitry Vyukov wrote:
Overall, while I sympathise with this, it feels dangerous and a pretty major
change, because there'll be something somewhere that will break because it
expects faults to be swallowed that we would no longer swallow.
So I'd say it'd be something we should defer, but of course it's a highly
user-facing change so how easy that would be I don't know.
But I definitely don't think an 'introduce the ability to do cheap PROT_NONE
guards' series is the place to also fundamentally change how user access
page faults are handled within the kernel :)
Will delivering signals on kernel access be a backwards compatible
change? Or will we need a different API? MADV_GUARD_POISON_KERNEL?
It's just somewhat painful to detect/update all userspace if we add
this feature in future. Can we say signal delivery on kernel accesses
is unspecified?
Would adding signal delivery only to guard PTEs help the ASAN etc. use case
enough? Wouldn't it instead be possible to add some prctl to opt the whole
ASANized process in to delivering all existing segfaults as signals instead
of -EFAULT?
Not sure if it is an "instead", you might have to deliver the signal in
addition to letting the syscall fail (not that I would be an expert on
signal delivery :D ).
prctl sounds better, or some way to configure the behavior on VMA ranges;
otherwise we would need yet another marker, which is not the end of the
world but would make it slightly more confusing.
Yeah prctl() sounds sensible, and since we are explicitly adding a marker
for guard pages here we can do this as a follow up too without breaking any
userland expectations, i.e. 'new feature to make guard pages signal' is not
going to contradict the default behaviour.
So all makes sense to me, but I do think best as a follow up! :)
Yeah, fully agreed. And my gut feeling is that it might not be that easy
... :)
In the end, what we want is *some* notification that a guard PTE was
accessed. The notification likely does not have to be completely
synchronous (although that would be ideal), and it does not have to be a
signal.
Maybe having a different way to obtain that information from user space
would work.
For bug detection tools (like GWP-ASan [1]) it's essential to have
useful stack traces. As such, having this signal be synchronous would
be more useful. I don't see how one could get a useful stack trace (or
other information like what's stashed away in ucontext like CPU
registers) if this were asynchronous.
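
To illustrate what a synchronous signal gives a bug detection tool, here
is a minimal sketch in C, assuming Linux with glibc. It uses an ordinary
PROT_NONE mapping as a stand-in for a guard page and a plain user-space
read, which already raises SIGSEGV today. The helper name
probe_user_fault() is hypothetical and not part of any series discussed
here.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on glibc */
#endif
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_jmp;
static void *volatile fault_addr;

/*
 * Runs on the faulting thread, at the faulting instruction. 'uctx' points
 * to a ucontext_t holding the CPU registers at that point, which is what a
 * tool would unwind from to produce a stack trace.
 */
static void on_segv(int sig, siginfo_t *info, void *uctx)
{
    (void)sig;
    (void)uctx;
    fault_addr = info->si_addr;     /* the address that was accessed */
    siglongjmp(fault_jmp, 1);       /* skip the faulting read */
}

/*
 * Hypothetical helper: touches a PROT_NONE page from user space and
 * returns the address reported by the signal. The page address is stored
 * in *guard_out. Returns NULL if setup fails or no fault is delivered.
 */
void *probe_user_fault(void **guard_out)
{
    long page = sysconf(_SC_PAGESIZE);
    struct sigaction sa, old;
    char *guard = mmap(NULL, (size_t)page, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (guard == MAP_FAILED)
        return NULL;
    *guard_out = guard;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap(guard, (size_t)page);
        return NULL;
    }

    fault_addr = NULL;
    if (sigsetjmp(fault_jmp, 1) == 0)
        (void)*(volatile char *)guard;  /* faults synchronously */

    sigaction(SIGSEGV, &old, NULL);     /* restore previous handler */
    munmap(guard, (size_t)page);
    return fault_addr;
}
```

The address and register state are only meaningful because the handler
runs before the faulting thread makes any further progress; an
asynchronous notification would arrive after that context is gone.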
Yes, I know. But it would be better than not getting *any* notification
other than some syscalls simply failing with -EFAULT, with no idea which
address was even accessed.
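
For comparison, a minimal sketch in C of what user space sees today when
the kernel, rather than user code, touches an inaccessible page. It
assumes Linux and again uses a PROT_NONE mapping as a stand-in for a
guard page; the helper name probe_kernel_fault() is hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on glibc */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: passes a PROT_NONE buffer to write(2) on a pipe so
 * that the kernel faults while copying from user memory. Returns the errno
 * from write(), 0 if the write unexpectedly succeeded, or -1 if setup
 * failed.
 */
int probe_kernel_fault(void)
{
    long page = sysconf(_SC_PAGESIZE);
    int fds[2];
    int err = -1;
    char *guard = mmap(NULL, (size_t)page, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (guard == MAP_FAILED)
        return -1;

    if (pipe(fds) == 0) {
        /* No signal is raised here; the syscall just fails. */
        ssize_t n = write(fds[1], guard, 1);

        err = (n < 0) ? errno : 0;
        close(fds[0]);
        close(fds[1]);
    }

    munmap(guard, (size_t)page);
    return err;
}
```

On Linux the call is expected to return EFAULT. Nothing in that result
says which address in the buffer was inaccessible, and no handler runs.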
Maybe the signal injection is easier than I think, but I somehow doubt
it ...
--
Cheers,
David / dhildenb