Re: F37 proposal: Add -fno-omit-frame-pointer to default compilation flags (System-Wide Change proposal)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> Why does the unwinding need to happen in the kernel?  The kernel can
> already asynchronously invoke userspace code in the form of signal
> handlers.  Is the problem that it is necessary to collect profiling
> information in the middle of a system call, where another syscall
> would see inconsistent (and potentially exploitable) kernel state?

One of the primary values of system-wide profiling is being able to see how a particular library call might have caused undesirable code paths due to a syscall, and where/what was reached (given high enough sampling rate).

Does that need to happen in kernel space? I don't know, perse, other than perf needs to be able to do that work as it is what gives us the array of instruction pointers back. There was some chatter a number of years ago in perf about how to handle ORC from user-space, and if I'm summing this up correctly, it was basically..

 - When sampling in PERF_CONTEXT_KERNEL, stop unwinding at the syscall boundary
 - Append stacktrace samples to perf buffer ring
 - Upon rescheduling, backtrace a single time into user-space, and expect the consumer to know that N previous samples with matching task-id all have the user-space backtrace.

That's a pretty significant behavior change, and all tools would need surgery to support it. I have no idea if that is paletable to either side of the debate, but it was the one possible direction I saw.

It does have a number of pros, in that you can save a lot of unwinding time on syscall-heavy workloads by doing user-space unwinding once, and from scheduler task queues (so you can take faults), and can avoid the NMI context being the cost-center for accounting. But the cons are significant in that the behavior change is expansive, effects all tooling, and will require ORC data across the platform.

> Ouch.  That is a serious problem for a number of reasons, not least
> of which is security.  Having the kernel parse even more complex
> untrusted input in C is a horrible idea.

It might seem that way by the description I gave, but we're just talking an array of intptr_t or similar. There is no dereferencing or state machines like you have with DWARF. Runtime resolution is also essentially bsearch() on interval arrays. I really don't think it's the sort of thing that requires Rust.

As for eBPF, we'd still probably be in NMI context with this route, and would fail if we had to page in ORC tables. So that means we'd either have to take a per-task memory overhead to maintain the mutated form (probably unreasonable) or find a way for that to be done from the task's space when returning from the syscall.

> Christian, would this be sufficient for your needs?

I don't think so without significant work. The best case I see here is for perf to support user-space unwinding within the task, be it ORC or DWARF, and that unwinder not have to necessarily be in-tree with the kernel because we know they won't accept a DWARF unwinder again.

It does pose some questions on what would happen with carefully crafted DWARF data.
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Fedora Announce]     [Fedora Users]     [Fedora Kernel]     [Fedora Testing]     [Fedora Formulas]     [Fedora PHP Devel]     [Kernel Development]     [Fedora Legacy]     [Fedora Maintainers]     [Fedora Desktop]     [PAM]     [Red Hat Development]     [Gimp]     [Yosemite News]

  Powered by Linux