Re: [RFC dwarves 1/4] dwarf_loader: mark functions that do not use expected registers for params

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 20/02/2023 22:30, Alexei Starovoitov wrote:
> On Mon, Feb 20, 2023 at 07:42:15PM +0000, Alan Maguire wrote:
>> On 20/02/2023 19:03, Alexei Starovoitov wrote:
>>> On Fri, Feb 17, 2023 at 11:10:30PM +0000, Alan Maguire wrote:
>>>> Calling conventions dictate which registers are used for
>>>> function parameters.
>>>>
>>>> When a function is optimized however, we need to ensure that
>>>> the non-optimized parameters do not violate expectations about
>>>> register use as this would violate expectations for tracing.
>>>> At CU initialization, create a mapping from parameter index
>>>> to expected DW_OP_reg, and use it to validate parameters
>>>> match with expectations.  A parameter which is passed via
>>>> the stack, as a constant, or uses an unexpected register,
>>>> violates these expectations and it (and the associated
>>>> function) are marked as having unexpected register mapping.
>>>>
>>>> Note though that there is as exception here that needs to
>>>> be handled; when a (typedef) struct is passed as a parameter,
>>>> it can use multiple registers so will throw off later register
>>>> expectations.  Exempt functions that have unexpected
>>>> register usage _and_ struct parameters (examples are found
>>>> in the "tracing_struct" test).
>>>>
>>>> Signed-off-by: Alan Maguire <alan.maguire@xxxxxxxxxx>
>>>> ---
>>>>  dwarf_loader.c | 109 ++++++++++++++++++++++++++++++++++++++++++++++---
>>>>  dwarves.h      |   5 +++
>>>>  2 files changed, 109 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/dwarf_loader.c b/dwarf_loader.c
>>>> index acdb68d..014e130 100644
>>>> --- a/dwarf_loader.c
>>>> +++ b/dwarf_loader.c
>>>> @@ -1022,6 +1022,51 @@ static int arch__nr_register_params(const GElf_Ehdr *ehdr)
>>>>  	return 0;
>>>>  }
>>>>  
>>>> +/* map from parameter index (0 for first, ...) to expected DW_OP_reg.
>>>> + * This will allow us to identify cases where optimized-out parameters
>>>> + * interfere with expectations about register contents on function
>>>> + * entry.
>>>> + */
>>>> +static void arch__set_register_params(const GElf_Ehdr *ehdr, struct cu *cu)
>>>> +{
>>>> +	memset(cu->register_params, -1, sizeof(cu->register_params));
>>>> +
>>>> +	switch (ehdr->e_machine) {
>>>> +	case EM_S390:
>>>> +		/* https://github.com/IBM/s390x-abi/releases/download/v1.6/lzsabi_s390x.pdf */
>>>> +		cu->register_params[0] = DW_OP_reg2;	// %r2
>>>> +		cu->register_params[1] = DW_OP_reg3;	// %r3
>>>> +		cu->register_params[2] = DW_OP_reg4;	// %r4
>>>> +		cu->register_params[3] = DW_OP_reg5;	// %r5
>>>> +		cu->register_params[4] = DW_OP_reg6;	// %r6
>>>> +		return;
>>>> +	case EM_X86_64:
>>>> +		/* //en.wikipedia.org/wiki/X86_calling_conventions#System_V_AMD64_ABI */
>>>> +		cu->register_params[0] = DW_OP_reg5;	// %rdi
>>>> +		cu->register_params[1] = DW_OP_reg4;	// %rsi
>>>> +		cu->register_params[2] = DW_OP_reg1;	// %rdx
>>>> +		cu->register_params[3] = DW_OP_reg2;	// %rcx
>>>> +		cu->register_params[4] = DW_OP_reg8;	// %r8
>>>> +		cu->register_params[5] = DW_OP_reg9;	// %r9
>>>> +		return;
>>>> +	case EM_ARM:
>>>> +		/* https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst#machine-registers */
>>>> +	case EM_AARCH64:
>>>> +		/* https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#machine-registers */
>>>> +		cu->register_params[0] = DW_OP_reg0;
>>>> +		cu->register_params[1] = DW_OP_reg1;
>>>> +		cu->register_params[2] = DW_OP_reg2;
>>>> +		cu->register_params[3] = DW_OP_reg3;
>>>> +		cu->register_params[4] = DW_OP_reg4;
>>>> +		cu->register_params[5] = DW_OP_reg5;
>>>> +		cu->register_params[6] = DW_OP_reg6;
>>>> +		cu->register_params[7] = DW_OP_reg7;
>>>> +		return;
>>>> +	default:
>>>> +		return;
>>>> +	}
>>>> +}
>>>> +
>>>>  static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
>>>>  					struct conf_load *conf, int param_idx)
>>>>  {
>>>> @@ -1075,18 +1120,28 @@ static struct parameter *parameter__new(Dwarf_Die *die, struct cu *cu,
>>>>  		if (parm->has_loc &&
>>>>  		    attr_location(die, &loc.expr, &loc.exprlen) == 0 &&
>>>>  			loc.exprlen != 0) {
>>>> +			int expected_reg = cu->register_params[param_idx];
>>>>  			Dwarf_Op *expr = loc.expr;
>>>>  
>>>>  			switch (expr->atom) {
>>>>  			case DW_OP_reg0 ... DW_OP_reg31:
>>>> +				/* mark parameters that use an unexpected
>>>> +				 * register to hold a parameter; these will
>>>> +				 * be problematic for users of BTF as they
>>>> +				 * violate expectations about register
>>>> +				 * contents.
>>>> +				 */
>>>> +				if (expected_reg >= 0 && expected_reg != expr->atom)
>>>> +					parm->unexpected_reg = 1;
>>>> +				break;
>>>
>>> Overall I guess it's a step forward, since it addresses the immediate issue,
>>> but probably too fragile long term.
>>>
>>> Your earlier example:
>>>  __bpf_kfunc void tcp_reno_cong_avoid(struct sock *sk, u32 ack, u32 acked)
>>>
>>> had
>>> 0x0891dabe:     DW_TAG_formal_parameter
>>>                   DW_AT_location        (indexed (0x7a) loclist = 0x00f50eb1:
>>>                      [0xffffffff82031185, 0xffffffff8203119e): DW_OP_reg5 RDI
>>>                      [0xffffffff8203119e, 0xffffffff820311cc): DW_OP_reg3 RBX
>>>                      [0xffffffff820311cc, 0xffffffff820311d1): DW_OP_reg5 RDI
>>>                      [0xffffffff820311d1, 0xffffffff820311d2): DW_OP_reg3 RBX
>>>                      [0xffffffff820311d2, 0xffffffff820311d8): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value)
>>>
>>> 0x0891dad4:     DW_TAG_formal_parameter
>>>                   DW_AT_location        (indexed (0x7b) loclist = 0x00f50eda:
>>>                      [0xffffffff82031185, 0xffffffff820311bc): DW_OP_reg1 RDX
>>>                      [0xffffffff820311bc, 0xffffffff820311c8): DW_OP_reg0 RAX
>>>                      [0xffffffff820311c8, 0xffffffff820311d1): DW_OP_reg1 RDX)
>>>                   DW_AT_name    ("acked")
>>>
>>> Both args will fail above check. If I'm reading above code correctly.
>>> It checks that every reg in DW_AT_location matches ?
>>
>> It checks location info for those that have it; so in this case the location
>> lists specify rdi on entry for the first parameter (sk)
>>
>>
>> 0x068a0f3b:     DW_TAG_formal_parameter
>>                   DW_AT_location        (indexed (0x74) loclist = 0x00a4c5a0:
>>                      [0xffffffff81b87849, 0xffffffff81b87866): DW_OP_reg5 RDI
>>                      [0xffffffff81b87866, 0xffffffff81b87899): DW_OP_reg3 RBX
>>                      [0xffffffff81b87899, 0xffffffff81b878a0): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value)
>>                   DW_AT_name    ("sk")
>>                   DW_AT_decl_file       ("/home/opc/src/clang/bpf-next/net/ipv4/tcp_cong.c")
>>                   DW_AT_decl_line       (446)
>>                   DW_AT_type    (0x06886461 "sock *")
>>
>>
>> no location info for the second (ack):
>>
>> 0x068a0f47:     DW_TAG_formal_parameter
>>                   DW_AT_name    ("ack")
>>                   DW_AT_decl_file       ("/home/opc/src/clang/bpf-next/net/ipv4/tcp_cong.c")
>>                   DW_AT_decl_line       (446)
>>                   DW_AT_type    (0x06886451 "u32")
>>
>> ...so matching it is skipped, and rdx as the first element in the location list
>> for the third parameter (acked):
>>
>> 0x068a0f52:     DW_TAG_formal_parameter
>>                   DW_AT_location        (indexed (0x75) loclist = 0x00a4c5bb:
>>                      [0xffffffff81b87849, 0xffffffff81b87884): DW_OP_reg1 RDX
>>                      [0xffffffff81b87884, 0xffffffff81b87890): DW_OP_reg0 RAX
>>                      [0xffffffff81b87890, 0xffffffff81b87898): DW_OP_reg1 RDX)
>>                   DW_AT_name    ("acked")
>>                   DW_AT_decl_file       ("/home/opc/src/clang/bpf-next/net/ipv4/tcp_cong.c")
>>                   DW_AT_decl_line       (446)
>>                   DW_AT_type    (0x06886451 "u32")
>>
>>
>> So this would be okay using the register-checking approach.
> 
> I meant in all that it's not clear that first and only first location info is used.
> Is that the behavior of attr_location() ?
>

I did a bit of additional debugging here; you're right, attr_location() does not
work for location lists so they are skipped. It uses dwarf_getlocation(), but
if we use dwarf_getlocations() (added in elfutils 0.157), single locations and 
location lists will be supported, and in the case of location lists, the first
expression is examined. See [1] for the details of the fix.

> 
>>> Or just first ?
>>>
>>>>  			case DW_OP_breg0 ... DW_OP_breg31:
>>>>  				break;
>>>>  			default:
>>>> -				parm->optimized = 1;
>>>> +				parm->unexpected_reg = 1;
>>>>  				break;
>>>>  			}
>>>>  		} else if (has_const_value) {
>>>> -			parm->optimized = 1;
>>>> +			parm->unexpected_reg = 1;
>>>
>>> Is this part too restrictive as well?
>>> Just because one arg is constant it doesn't mean that the calling convention
>>> is not correct for this and other args.
>>>
>>
>> Great catch; this part is wrong; should just be parm->optimized = 1 as it
>> was before.
> 
> Will this fix change the stats you've quoted earlier ?
> 
> "
> With these changes, running pahole on a gcc-built
> vmlinux skips
> 
> - 1164 functions due to multiple inconsistent function
>   prototypes.  Most of these are "."-suffixed optimized
>   fuctions.
> - 331 functions due to unexpected register usage
>

This changes to

- 1084 functions skipped due to inconsistent function prototypes,
  597 of these are not "."-suffixed functions.
- 255 functions skipped due to unexpected register usage, of which
  only 16 are not "."-suffixed functions.

Overall it's 1339 functions, which is slightly less than
the 1495 we skipped before.
 
> For a clang-built kernel, the numbers are
> 
> - 539 functions with inconsistent prototypes are skipped
> - 209 functions with unexpected register usage are skipped
> "
> 
> How does it compare before/after ?
> iirc there were ~2500 functions skipped in vmlinux-gcc and now it's down to 1164+331 ?
> 
For clang we see

- 539 functions skipped due to inconsistent prototypes (unchanged)
- 272 functions skipped due to unexpected register usage

The increase here appears to be a result of supporting location
lists so we can do more checking on which registers parameters use.

For many of the 272 we see %r15 unexpectedly being used for boolean 
parameters. In some cases the __cold attribute is applied to functions
and they end up using atypical registers. And in some cases -
for example ZSTD_execSequenceEnd() - we see a stack-passed
parameter leading to later parameters using unexpected registers.
In other cases we see the fact that a function does not use
a parameter, resulting in unexpected register use for other
parameters  - __ddebug_add_module() is an example of this, where
the second parameter is unused in the code, so the third uses %rsi.
These are I think the cases we'd really like to catch.

I've sent a small series of fixups to address these issues [1].

Testing indicates we see no issues with missing BTF ids, and the
tracing_struct test still passes.

[1] https://lore.kernel.org/bpf/1676994522-1557-1-git-send-email-alan.maguire@xxxxxxxxxx/



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux