Em Sun, Feb 17, 2019 at 09:32:54PM +0900, Taeung Song escreveu: > On 2/15/19 3:31 AM, Arnaldo Carvalho de Melo wrote: > > Em Fri, Feb 15, 2019 at 02:37:04AM +0900, Taeung Song escreveu: > > > I found two duplicate function prototypes 'wq_calc_node_cpumask' using > > > pfunct. > > > Is this a bug ? > > > $ pfunct -P /home/taeung/git/linux/kernel/workqueue.o | grep > > > wq_calc_node_cpumask > > > void wq_calc_node_cpumask(const struct workqueue_attrs * attrs, int node, > > > int cpu_going_down, cpumask_t * cpumask); > > > bool wq_calc_node_cpumask(const struct workqueue_attrs * attrs, int node, > > > int cpu_going_down, cpumask_t * cpumask); > > > So, I checked it like below: > > > $ readelf -s ~/git/linux/vmlinux | grep wq_calc_node_cpumask > > > 5887: ffffffff810815e0 143 FUNC LOCAL DEFAULT 1 > > > wq_calc_node_cpumask > > > $ addr2line -e ~/git/linux/vmlinux ffffffff810815e0 > > > /home/taeung/git/linux/kernel/workqueue.c:3650 > > > $ cd ~/git/linux > > > $ grep wq_calc_node_cpumask > > > kernel/workqueue.c > > > 3627: * wq_calc_node_cpumask - calculate a wq_attrs' cpumask for the > > > specified node > > > 3648:static bool wq_calc_node_cpumask(const struct workqueue_attrs *attrs, > > > int node, > > > 3766: if (wq_calc_node_cpumask(new_attrs, node, -1, tmp_attrs->cpumask)) { > > > 3938: if (wq_calc_node_cpumask(wq->dfl_pwq->pool->attrs, node, cpu_off, > > > cpumask)) { > > > > > > Or, the dwarf info of vmlinux was wrongly made up ? > > > > [root@quaco perf]# pfunct --decl_info --prototypes ../build/v5.0-rc3+/kernel/workqueue.o | grep -B2 wq_calc_node_cpumask > > /* Used at: /home/acme/git/linux/kernel/workqueue.c */ > > /* <3137e> /home/acme/git/linux/kernel/workqueue.c:3648 */ > > void wq_calc_node_cpumask(const struct workqueue_attrs * attrs, int node, int cpu_going_down, cpumask_t * cpumask); > > -- > > /* Used at: /home/acme/git/linux/kernel/workqueue.c */ > > /* <1ff33> /home/acme/git/linux/kernel/workqueue.c:3648 */ > > bool wq_calc_node_cpumask(const struct workqueue_attrs * attrs, int node, int cpu_going_down, cpumask_t * cpumask); > > [root@quaco perf]# > > > > Seem there is a difference in that dwarf_dieoffset(), so: > > > > [root@quaco perf]# readelf -wi ../build/v5.0-rc3+/kernel/workqueue.o | grep -B1 -A7 wq_calc_node_cpumask > > <1><1ff33>: Abbrev Number: 49 (DW_TAG_subprogram) > > <1ff34> DW_AT_name : (indirect string, offset: 0x498d): wq_calc_node_cpumask > > <1ff38> DW_AT_decl_file : 1 > > <1ff39> DW_AT_decl_line : 3648 > > <1ff3b> DW_AT_decl_column : 13 > > <1ff3c> DW_AT_prototyped : 1 > > <1ff3c> DW_AT_type : <0x2bb> > > <1ff40> DW_AT_inline : 1 (inlined) > > <1ff41> DW_AT_sibling : <0x1ffb1> > > [root@quaco perf]# > > > > So, see that DW_AT_inline? > > > > This is just a function that was not declared inline but was forced > > inline and inlined twice in that file. > > > > [root@quaco perf]# pfunct --cc_inlined ../build/v5.0-rc3+/kernel/workqueue.o | grep wq_calc_node_cpumask > > wq_calc_node_cpumask > > [root@quaco perf]# > > > > [root@quaco perf]# pfunct --help | grep -- --cc_inlined > > -H, --cc_inlined not declared inline, inlined by compiler > > [root@quaco perf]# > Thank you so much ! > I understood it. > But even though wq_calc_node_cpumask() was inlined by compiler, > (therefore, "DW_AT_inline : 1 (inlined)") > why are there its symbol and its code like below ? > The function can be compiled as two case ? > 1) inlined case > 2) normal function call case > so declared twice in that file ? > (as they seemed by "pfunct --decl_info") > $ readelf -s vmlinux | grep wq_calc_node_cpumask > 5887: ffffffff810815e0 143 FUNC LOCAL DEFAULT 1 > wq_calc_node_cpumask > $ objdump -d vmlinux | grep wq_calc_node_cpumask > ffffffff810815e0 <wq_calc_node_cpumask>: > ... > And wq_update_unbound_numa() calls the function wq_calc_node_cpumask() > $ objdump -d vmlinux > ... > ffffffff81082ec0 <wq_update_unbound_numa>: > ... > ffffffff81082f69: e8 72 e6 ff ff callq ffffffff810815e0 > <wq_calc_node_cpumask> I guess so, the compiler may have found a reason for inlining it in one case while no advantages on doing so in some other case. Looking at that ../build/v5.0-rc3+/kernel/workqueue.o case again, we first have the function, with that series of nops at its start (the function tracer area): 0000000000001b70 <wq_calc_node_cpumask>: * wq_calc_node_cpumask - calculate a wq_attrs' cpumask for the specified node 1b70: e8 00 00 00 00 callq 1b75 <wq_calc_node_cpumask+0x5> * @attrs: the wq_attrs of the default pwq of the target workqueue 1b75: 80 3d 00 00 00 00 00 cmpb $0x0,0x0(%rip) # 1b7c <wq_calc_node_cpumask+0xc> 1b7c: 48 8b 47 08 mov 0x8(%rdi),%rax 1b80: 74 4f je 1bd1 <wq_calc_node_cpumask+0x61> 1b82: 80 7f 10 00 cmpb $0x0,0x10(%rdi) 1b86: 75 49 jne 1bd1 <wq_calc_node_cpumask+0x61> <SNIP> Then later we have it being called from wq_update_unbound_numa(): 00000000000043a0 <wq_update_unbound_numa>: * wq_update_unbound_numa - update NUMA affinity of a wq for CPU hot[un]plug 43a0: e8 00 00 00 00 callq 43a5 <wq_update_unbound_numa+0x5> 43a5: 48 c7 c0 00 00 00 00 mov $0x0,%rax 43ac: 41 56 push %r14 <SNIP> 4439: 48 8b 00 mov (%rax),%rax 443c: 48 89 fb mov %rdi,%rbx 443f: 44 89 e6 mov %r12d,%esi 4442: 48 8b b8 c8 02 00 00 mov 0x2c8(%rax),%rdi 4449: e8 22 d7 ff ff callq 1b70 <wq_calc_node_cpumask> 444e: 84 c0 test %al,%al 4450: 0f 85 85 00 00 00 jne 44db <wq_update_unbound_numa+0x13b> 4456: 4d 63 ec movslq %r12d,%r13 4459: 48 8d 6b 20 lea 0x20(%rbx),%rbp <SNIP> And then there is an excerpt on the unlikely text section: Disassembly of section .text.unlikely: <SNIP> 00000000000000b3 <__queue_work.cold.49>: * Return %true iff I'm a worker execuing a work item on @wq. If b3: 48 c7 c7 00 00 00 00 mov $0x0,%rdi ba: e8 00 00 00 00 callq bf <__queue_work.cold.49+0xc> * I'm @worker, it's safe to dereference it without locking. bf: c6 05 00 00 00 00 01 movb $0x1,0x0(%rip) # c6 <__queue_work.cold.49+0x13> c6: 48 c7 c2 00 00 00 00 mov $0x0,%rdx cd: e9 00 00 00 00 jmpq d2 <wq_calc_node_cpumask.cold.50> 00000000000000d2 <wq_calc_node_cpumask.cold.50>: * stable. d2: 48 c7 c7 00 00 00 00 mov $0x0,%rdi d9: c6 05 00 00 00 00 01 movb $0x1,0x0(%rip) # e0 <wq_calc_node_cpumask.cold.50+0xe> e0: e8 00 00 00 00 callq e5 <wq_calc_node_cpumask.cold.50+0x13> * Return: %true if the resulting @cpumask is different from @attrs->cpumask, e5: 31 c0 xor %eax,%eax e7: c3 retq <SNIP> I haven't checked, but perhaps this can explain the two DWARF records for this function? But looking again with decl_info + -V I see something else: /* Used at: /home/acme/git/linux/kernel/workqueue.c */ /* <3137e> /home/acme/git/linux/kernel/workqueue.c:3648 */ void wq_calc_node_cpumask(const struct workqueue_attrs * attrs, int node, int cpu_going_down, cpumask_t * cpumask); /* definitions: 1 */ /* Used at: /home/acme/git/linux/kernel/workqueue.c */ /* <1ff33> /home/acme/git/linux/kernel/workqueue.c:3648 */ bool wq_calc_node_cpumask(const struct workqueue_attrs * attrs, int node, int cpu_going_down, cpumask_t * cpumask); /* definitions: 1 * Which is really strange, the return type changes from 'void' to 'bool', perhaps this is a bug in pfunct? I would have to dig deeper, but I'm busy right now, so just documenting what I found in this brief investigation, will try to come back to this when I find some time. So more things I saw, hey, it seems eu-readelf tells the whole story, see below: [root@quaco perf]# readelf -wi ../build/v5.0-rc3+/kernel/workqueue.o | grep wq_calc_node_cpumask <1ff34> DW_AT_name : (indirect string, offset: 0x498d): wq_calc_node_cpumask [root@quaco perf]# [root@quaco perf]# eu-readelf -winfo ../build/v5.0-rc3+/kernel/workqueue.o <SNIP> [ 1ff33] subprogram abbrev: 49 name (strp) "wq_calc_node_cpumask" decl_file (data1) workqueue.c (1) decl_line (data2) 3648 decl_column (data1) 13 prototyped (flag_present) yes type (ref4) [ 2bb] inline (data1) inlined (1) sibling (ref4) [ 1ffb1] [ 1ff45] formal_parameter abbrev: 23 name (strp) "attrs" decl_file (data1) workqueue.c (1) decl_line (data2) 3648 decl_column (data1) 64 type (ref4) [ 1f2ee] [ 1ff52] formal_parameter abbrev: 23 name (strp) "node" decl_file (data1) workqueue.c (1) decl_line (data2) 3648 decl_column (data1) 75 type (ref4) [ d2] [ 1ff5f] formal_parameter abbrev: 23 name (strp) "cpu_going_down" decl_file (data1) workqueue.c (1) decl_line (data2) 3649 decl_column (data1) 10 type (ref4) [ d2] [ 1ff6c] formal_parameter abbrev: 23 name (strp) "cpumask" decl_file (data1) workqueue.c (1) decl_line (data2) 3649 decl_column (data1) 37 And later: [ 3137e] subprogram abbrev: 235 abstract_origin (ref4) [ 1ff33] ranges (sec_offset) range list [ 3170] frame_base (exprloc) [ 0] call_frame_cfa GNU_all_call_sites (flag_present) yes sibling (ref4) [ 31675] [ 3138e] formal_parameter abbrev: 94 abstract_origin (ref4) [ 1ff45] location (sec_offset) location list [ 8cbb] GNU_locviews (sec_offset) location list [ 8cb7] [ 3139b] formal_parameter abbrev: 94 abstract_origin (ref4) [ 1ff52] location (sec_offset) location list [ 8cf5] GNU_locviews (sec_offset) location list [ 8cf1] [ 313a8] formal_parameter abbrev: 94 abstract_origin (ref4) [ 1ff5f] location (sec_offset) location list [ 8d31] GNU_locviews (sec_offset) location list [ 8d2b] [ 313b5] formal_parameter abbrev: 94 abstract_origin (ref4) [ 1ff6c] location (sec_offset) location list [ 8d7e] GNU_locviews (sec_offset) location list [ 8d7a] [ 313c2] inlined_subroutine abbrev: 48 abstract_origin (ref4) [ 2e987] entry_pc (addr) .text+0x0000000000001b88 <wq_calc_node_cpumask+0x18> GNU_entry_view (data2) 4 low_pc (addr) .text+0x0000000000001b88 <wq_calc_node_cpumask+0x18> high_pc (data8) 3 (.text+0x0000000000001b8b <wq_calc_node_cpumask+0x1b>) call_file (data1) workqueue.c (1) call_line (data2) 3655 call_column (data1) 2 sibling (ref4) [ 313ef] [ 313e9] formal_parameter abbrev: 1 abstract_origin (ref4) [ 2e998] [ 313ef] inlined_subroutine abbrev: 12 abstract_origin (ref4) [ 2fe89] entry_pc (addr) .text+0x0000000000001b8b <wq_calc_node_cpumask+0x1b> GNU_entry_view (data2) 1 ranges (sec_offset) range list [ 31a0] call_file (data1) workqueue.c (1) call_line (data2) 3655 call_column (data1) 2 sibling (ref4) [ 31452] [ 3140a] formal_parameter abbrev: 1 abstract_origin (ref4) [ 2feb5] [ 3140f] formal_parameter abbrev: 1 abstract_origin (ref4) [ 2fea8] [ 31414] formal_parameter abbrev: 1 abstract_origin (ref4) [ 2fe9b] [ 31419] inlined_subroutine abbrev: 44 abstract_origin (ref4) [ 300b7] entry_pc (addr) .text+0x0000000000001b8b <wq_calc_node_cpumask+0x1b> GNU_entry_view (data2) 3 low_pc (addr) .text+0x0000000000001b8b <wq_calc_node_cpumask+0x1b> high_pc (data8) 14 (.text+0x0000000000001b99 <wq_calc_node_cpumask+0x29>) call_file (data1) cpumask.h (13) call_line (data2) 405 call_column (data1) 9 [ 3143c] formal_parameter abbrev: 1 abstract_origin (ref4) [ 300f0] [ 31441] formal_parameter abbrev: 1 So back to what pfunct produces: /* Used at: /home/acme/git/linux/kernel/workqueue.c */ /* <3137e> /home/acme/git/linux/kernel/workqueue.c:3648 */ void wq_calc_node_cpumask(const struct workqueue_attrs * attrs, int node, int cpu_going_down, cpumask_t * cpumask); /* definitions: 1 */ /* Used at: /home/acme/git/linux/kernel/workqueue.c */ /* <1ff33> /home/acme/git/linux/kernel/workqueue.c:3648 */ bool wq_calc_node_cpumask(const struct workqueue_attrs * attrs, int node, int cpu_going_down, cpumask_t * cpumask); /* definitions: 1 * Above you can see the 3137e and 1ff33, which looks like it used it as a function and also inlined it somewhere else. And look at [ 3137e] subprogram abbrev: 235 abstract_origin (ref4) [ 1ff33] The DW_TAG_abstract_origin for 3137e points to 1ff33. Look at http://www.unix.org/reference/dwarf.2.06.pdf for the "3.3.8.2 Concrete Inlined Instances" section. - Arnaldo