Re: [PATCH dwarves 1/5] dwarves: help dwarf loader spot functions with optimized-out parameters

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2023-01-26 at 14:02 +0000, Alan Maguire wrote:
> On 26/01/2023 00:20, Eduard Zingerman wrote:
> > On Thu, 2023-01-26 at 01:42 +0200, Eduard Zingerman wrote:
> > > On Wed, 2023-01-25 at 22:52 +0000, Alan Maguire wrote:
> > > [...]
> > > > 
> > > > Thanks for this - I tried it, and we spot the optimization once we update
> > > > die__create_new_parameter() as follows:
> > > > 
> > > > diff --git a/dwarf_loader.c b/dwarf_loader.c
> > > > index f96b6ff..605ad45 100644
> > > > --- a/dwarf_loader.c
> > > > +++ b/dwarf_loader.c
> > > > @@ -1529,6 +1530,8 @@ static struct tag *die__create_new_parameter(Dwarf_Die *di
> > > >  
> > > >         if (ftype != NULL) {
> > > >                 ftype__add_parameter(ftype, parm);
> > > > +               if (parm->optimized)
> > > > +                       ftype->optimized_parms = 1;
> > > >                 if (param_idx >= 0) {
> > > >                         if (add_child_llvm_annotations(die, param_idx, conf, &(t
> > > >                                 return NULL;
> > > > 
> > > 
> > > Great, looks good.
> > > 
> > > > With that change, I see:
> > > > 
> > > > $ pahole --verbose --btf_encode_detached=test.btf test.o
> > > > btf_encoder__new: 'test.o' doesn't have '.data..percpu' section
> > > > Found 0 per-CPU variables!
> > > > Found 2 functions!
> > > > File test.o:
> > > > [1] INT int size=4 nr_bits=32 encoding=SIGNED
> > > > [2] PTR (anon) type_id=3
> > > > [3] PTR (anon) type_id=4
> > > > [4] INT char size=1 nr_bits=8 encoding=SIGNED
> > > > [5] FUNC_PROTO (anon) return=1 args=(1 argc, 2 argv)
> > > > [6] FUNC main type_id=5
> > > > added local function 'f', optimized-out params
> > > > skipping addition of 'f' due to optimized-out parameters
> > > 
> > > Sorry, I have one more silly program.
> > > 
> > > I talked to Yonghong today and we discussed if compiler can change a
> > > type of a function parameter as a result of some optimization.
> > > Consider the following example:
> > > 
> > >     $ cat test.c
> > >     struct st {
> > >       int a;
> > >       int b;
> > >     };
> > >     
> > >     __attribute__((noinline))
> > >     static int f(struct st *s) {
> > >       return s->a + s->b;
> > >     }
> > >     
> > >     int main(int argc, char *argv[]) {
> > >       struct st s = {
> > >         .a = (long)argv[0],
> > >         .b = (long)argv[1]
> > >       };
> > >       return f(&s);
> > >     }
> > > 
> > > When compiled by `clang` with -O3 the prototype of `f` is changed from
> > > `int f(struct *st)` to `int f(int, int)`:
> > > 
> > >     $ clang -O3 -g -c test.c -o test.o && llvm-objdump -d test.o
> > >     ...
> > >     0000000000000000 <main>:
> > >            0: 8b 3e                        	movl	(%rsi), %edi
> > >            2: 8b 76 08                     	movl	0x8(%rsi), %esi
> > >            5: eb 09                        	jmp	0x10 <f>
> > >            7: 66 0f 1f 84 00 00 00 00 00   	nopw	(%rax,%rax)
> > >     
> > >     0000000000000010 <f>:
> > >           10: 8d 04 37                     	leal	(%rdi,%rsi), %eax
> > >           13: c3                           	retq
> > >     
> > > But generated DWARF hides this information:
> > 
> > Actually, I'm not correct. The information is present because
> > `DW_AT_location` attribute is not present (just as 4.1.4 says).
> > So I think that the condition for optimized parameters detection has
> > to be adjusted one more time:
> > 
> > 			has_location = attr_location(die, &loc.expr, &loc.exprlen) == 0;
> > 			has_const_value = dwarf_attr(die, DW_AT_const_value, &attr) != NULL;
> > 
> > 			if (has_location && loc.exprlen != 0) {
> > 				Dwarf_Op *expr = loc.expr;
> > 
> > 				switch (expr->atom) {
> > 				case DW_OP_reg1 ... DW_OP_reg31:
> > 				case DW_OP_breg0 ... DW_OP_breg31:
> > 					break;
> > 				default:
> > 					parm->optimized = true;
> > 					break;
> > 				}
> > 			} else if (!has_location || has_const_value) {
> > 				parm->optimized = true;
> > 			}
> > 
> > (But again, the parameter is marked as optimized but the function is
> >  not skipped in the final BTF, so either I applied our last change
> >  incorrectly or something additional should be done).
> >  
> > wdyt?
> 
> I've been digging into this a bit, and the issue here is that for 
> gcc-generated DWARF at least, location info is often in the abstract 
> origin parameter references, so we have to combine observations across
> abstract origin reference and original parameter to determine for sure
> if the parameter really is missing location information. In the
> case of this program there are no abstract origin references, so
> it's a bit more straightforward, but we have to handle both cases
> I think.

Is it safe it ignore DW_TAG_subprogram's with DW_AT_abstract_origin's
and thus avoid the combine logic?
The way I read standard it looks like DW_AT_abstract_origin is only
present for instances that undergo some optimization.

> 
> I'll try and polish up a v2 series that incorporates this shortly;
> in testing it, it works on this case as desired I think:
> 
> LLVM_OBJCOPY=objcopy pahole --verbose -J ~/src/isra2/test2.o
> btf_encoder__new: '/home/alan/src/isra2/test2.o' doesn't have '.data..percpu' section
> Found 0 per-CPU variables!
> Found 13 functions!
> File /home/alan/src/isra2/test2.o:
> [1] INT long size=8 nr_bits=64 encoding=SIGNED
> [2] INT int size=4 nr_bits=32 encoding=SIGNED
> [3] PTR (anon) type_id=4
> [4] PTR (anon) type_id=5
> [5] INT char size=1 nr_bits=8 encoding=SIGNED
> [6] STRUCT st size=8
> 	a type_id=2 bits_offset=0
> 	b type_id=2 bits_offset=32
> [7] PTR (anon) type_id=6
> [8] FUNC_PROTO (anon) return=2 args=(2 argc, 3 argv)
> [9] FUNC main type_id=8
> added local function 'f', optimized-out params
> skipping addition of 'f' due to optimized-out parameters
> 
> Thanks!
> 
> Alan
> 
> > 
> > >     $ llvm-dwarfdump test.o
> > >     ...
> > >     0x0000005c:   DW_TAG_subprogram
> > >                     DW_AT_low_pc	(0x0000000000000010)
> > >                     DW_AT_high_pc	(0x0000000000000014)
> > >                     DW_AT_frame_base	(DW_OP_reg7 RSP)
> > >                     DW_AT_call_all_calls	(true)
> > >                     DW_AT_name	("f")
> > >                     DW_AT_decl_file	("/home/eddy/work/tmp/test.c")
> > >                     DW_AT_decl_line	(7)
> > >                     DW_AT_prototyped	(true)
> > >                     DW_AT_type	(0x00000074 "int")
> > >     
> > >     0x0000006b:     DW_TAG_formal_parameter
> > >                       DW_AT_name	("s")
> > >                       DW_AT_decl_file	("/home/eddy/work/tmp/test.c")
> > >                       DW_AT_decl_line	(7)
> > >                       DW_AT_type	(0x0000009e "st *")
> > >     
> > >     0x00000073:     NULL
> > >     ...
> > > 
> > > Is this important?
> > > (gcc does not do this for the particular example, but I don't know if
> > >  it could be tricked to under some conditions).
> > > 
> > > Thanks,
> > > Eduard
> > > 
> > > [...]
> > 





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux