Re: [RFC PATCH v6 3/3] arm64: Create a list of SYM_CODE functions, check return PC against list

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 7/29/21 10:48 AM, Mark Rutland wrote:
> On Thu, Jul 29, 2021 at 09:06:26AM -0500, Madhavan T. Venkataraman wrote:
>> Responses inline...
>>
>> On 7/28/21 12:25 PM, Mark Rutland wrote:
>>> On Wed, Jun 30, 2021 at 05:33:56PM -0500, madvenka@xxxxxxxxxxxxxxxxxxx wrote:
>>>> From: "Madhavan T. Venkataraman" <madvenka@xxxxxxxxxxxxxxxxxxx>
>>>> ... <snip> ...
>>>> +static struct code_range	*sym_code_functions;
>>>> +static int			num_sym_code_functions;
>>>> +
>>>> +int __init init_sym_code_functions(void)
>>>> +{
>>>> +	size_t size;
>>>> +
>>>> +	size = (unsigned long)__sym_code_functions_end -
>>>> +	       (unsigned long)__sym_code_functions_start;
>>>> +
>>>> +	sym_code_functions = kmalloc(size, GFP_KERNEL);
>>>> +	if (!sym_code_functions)
>>>> +		return -ENOMEM;
>>>> +
>>>> +	memcpy(sym_code_functions, __sym_code_functions_start, size);
>>>> +	/* Update num_sym_code_functions after copying sym_code_functions. */
>>>> +	smp_mb();
>>>> +	num_sym_code_functions = size / sizeof(struct code_range);
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +early_initcall(init_sym_code_functions);
>>>
>>> What's the point of copying this, given we don't even sort it?
>>>
>>> If we need to keep it around, it would be nicer to leave it where the
>>> linker put it, but make it rodata or ro_after_init.
>>>
>>
>> I was planning to sort it for performance. I have a comment to that effect.
>> But I can remove the copy and retain the info in linker data.
> 
> I think for now it's better to place it in .rodata. If we need to sort
> this, we can rework that later, preferably sorting at compile time as
> with extable entries.
> 
> That way this is *always* in a usable state, and there's a much lower
> risk of this being corrupted by a stray write.
> 

OK.

>>>> +	/*
>>>> +	 * Check the return PC against sym_code_functions[]. If there is a
>>>> +	 * match, then the consider the stack frame unreliable. These functions
>>>> +	 * contain low-level code where the frame pointer and/or the return
>>>> +	 * address register cannot be relied upon. This addresses the following
>>>> +	 * situations:
>>>> +	 *
>>>> +	 *  - Exception handlers and entry assembly
>>>> +	 *  - Trampoline assembly (e.g., ftrace, kprobes)
>>>> +	 *  - Hypervisor-related assembly
>>>> +	 *  - Hibernation-related assembly
>>>> +	 *  - CPU start-stop, suspend-resume assembly
>>>> +	 *  - Kernel relocation assembly
>>>> +	 *
>>>> +	 * Some special cases covered by sym_code_functions[] deserve a mention
>>>> +	 * here:
>>>> +	 *
>>>> +	 *  - All EL1 interrupt and exception stack traces will be considered
>>>> +	 *    unreliable. This is the correct behavior as interrupts and
>>>> +	 *    exceptions can happen on any instruction including ones in the
>>>> +	 *    frame pointer prolog and epilog. Unless stack metadata is
>>>> +	 *    available so the unwinder can unwind through these special
>>>> +	 *    cases, such stack traces will be considered unreliable.
>>>
>>> As mentioned previously, we *can* reliably unwind precisely one step
>>> across an exception boundary, as we can be certain of the PC value at
>>> the time the exception was taken, but past this we can't be certain
>>> whether the LR is legitimate.
>>>
>>> I'd like that we capture that precisely in the unwinder, and I'm
>>> currently reworking the entry assembly to make that possible.
>>>
>>>> +	 *
>>>> +	 *  - A task can get preempted at the end of an interrupt. Stack
>>>> +	 *    traces of preempted tasks will show the interrupt frame in the
>>>> +	 *    stack trace and will be considered unreliable.
>>>> +	 *
>>>> +	 *  - Breakpoints are exceptions. So, all stack traces in the break
>>>> +	 *    point handler (including probes) will be considered unreliable.
>>>> +	 *
>>>> +	 *  - All of the ftrace entry trampolines are considered unreliable.
>>>> +	 *    So, all stack traces taken from tracer functions will be
>>>> +	 *    considered unreliable.
>>>> +	 *
>>>> +	 *  - The Function Graph Tracer return trampoline (return_to_handler)
>>>> +	 *    and the Kretprobe return trampoline (kretprobe_trampoline) are
>>>> +	 *    also considered unreliable.
>>>
>>> We should be able to unwind these reliably if we specifically identify
>>> them. I think we need a two-step check here; we should assume that
>>> SYM_CODE() is unreliable by default, but in specific cases we should
>>> unwind that reliably.
>>>
>>>> +	 * Some of the special cases above can be unwound through using
>>>> +	 * special logic in unwind_frame().
>>>> +	 *
>>>> +	 *  - return_to_handler() is handled by the unwinder by attempting
>>>> +	 *    to retrieve the original return address from the per-task
>>>> +	 *    return address stack.
>>>> +	 *
>>>> +	 *  - kretprobe_trampoline() can be handled in a similar fashion by
>>>> +	 *    attempting to retrieve the original return address from the
>>>> +	 *    per-task kretprobe instance list.
>>>
>>> We don't do this today,
>>>
>>>> +	 *
>>>> +	 *  - I reckon optprobes can be handled in a similar fashion in the
>>>> +	 *    future?
>>>> +	 *
>>>> +	 *  - Stack traces taken from the FTrace tracer functions can be
>>>> +	 *    handled as well. ftrace_call is an inner label defined in the
>>>> +	 *    Ftrace entry trampoline. This is the location where the call
>>>> +	 *    to a tracer function is patched. So, if the return PC equals
>>>> +	 *    ftrace_call+4, it is reliable. At that point, proper stack
>>>> +	 *    frames have already been set up for the traced function and
>>>> +	 *    its caller.
>>>> +	 *
>>>> +	 * NOTE:
>>>> +	 *   If sym_code_functions[] were sorted, a binary search could be
>>>> +	 *   done to make this more performant.
>>>> +	 */
>>>
>>> Since some of the above is speculative (e.g. the bit about optprobes),
>>> and as code will change over time, I think we should have a much terser
>>> comment, e.g.
>>>
>>> 	/*
>>> 	 * As SYM_CODE functions don't follow the usual calling
>>> 	 * conventions, we assume by default that any SYM_CODE function
>>> 	 * cannot be unwound reliably.
>>> 	 *
>>> 	 * Note that this includes exception entry/return sequences and
>>> 	 * trampoline for ftrace and kprobes.
>>> 	 */
>>>
>>> ... and then if/when we try to unwind a specific SYM_CODE function
>>> reliably, we add the comment for that specifically.
>>>
>>
>> Just to confirm, are you suggesting that I remove the entire large comment
>> detailing the various cases and replace the whole thing with the terse comment?
> 
> Yes.
> 
> For clarity, let's take your bullet-point list above as a list of
> examples, and make that:
> 
> 	/*
> 	 * As SYM_CODE functions don't follow the usual calling
> 	 * conventions, we assume by default that any SYM_CODE function
> 	 * cannot be unwound reliably.
> 	 *
> 	 * Note that this includes:
> 	 *
> 	 * - Exception handlers and entry assembly
> 	 * - Trampoline assembly (e.g., ftrace, kprobes)
> 	 * - Hypervisor-related assembly
> 	 * - Hibernation-related assembly
> 	 * - CPU start-stop, suspend-resume assembly
> 	 * - Kernel relocation assembly
> 	 */
> 

OK.

>> I did the large comment because of Mark Brown's input that we must be
>> verbose about all the cases so that it is clear in the future what the
>> different cases are and how we handle them in this code. As the code
>> evolves, the comments would evolve.
> 
> The bulk of the comment just enumerates cases and says we treat them as
> unreliable, which I think is already clear from the terser comment with
> the list. The cases which mention special treatment (e.g. for unwinding
> through return_to_handler) aren't actually handled here (and the
> kretprobes case isn't handled at all today), so this isn't the right
> place for those -- they'll inevitably drift from the implementation.
> 
>> I can replace the comment if you want. Please confirm.
> 
> Yes please. If you can use the wording I've suggested immediately above
> (with your list folded in), that would be great.
> 

OK. I will use your suggested text.

Thanks.

Madhavan



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux Kernel]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux SCSI]

  Powered by Linux