Re: pfunct: duplicate function prototypes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Em Sun, Feb 17, 2019 at 09:32:54PM +0900, Taeung Song escreveu:
> On 2/15/19 3:31 AM, Arnaldo Carvalho de Melo wrote:
> > Em Fri, Feb 15, 2019 at 02:37:04AM +0900, Taeung Song escreveu:
> > > I found two duplicate function prototypes 'wq_calc_node_cpumask' using
> > > pfunct.
> > > Is this a bug ?

> > > $ pfunct -P /home/taeung/git/linux/kernel/workqueue.o | grep
> > > wq_calc_node_cpumask
> > > void wq_calc_node_cpumask(const struct workqueue_attrs  * attrs, int node,
> > > int cpu_going_down, cpumask_t * cpumask);
> > > bool wq_calc_node_cpumask(const struct workqueue_attrs  * attrs, int node,
> > > int cpu_going_down, cpumask_t * cpumask);

> > > So, I checked it like below:

> > > $ readelf -s ~/git/linux/vmlinux | grep wq_calc_node_cpumask
> > >    5887: ffffffff810815e0   143 FUNC    LOCAL  DEFAULT    1
> > > wq_calc_node_cpumask

> > > $ addr2line -e ~/git/linux/vmlinux ffffffff810815e0
> > > /home/taeung/git/linux/kernel/workqueue.c:3650

> > > $ cd ~/git/linux
> > > $ grep wq_calc_node_cpumask
> > > kernel/workqueue.c
> > > 3627: * wq_calc_node_cpumask - calculate a wq_attrs' cpumask for the
> > > specified node
> > > 3648:static bool wq_calc_node_cpumask(const struct workqueue_attrs *attrs,
> > > int node,
> > > 3766:		if (wq_calc_node_cpumask(new_attrs, node, -1, tmp_attrs->cpumask)) {
> > > 3938:	if (wq_calc_node_cpumask(wq->dfl_pwq->pool->attrs, node, cpu_off,
> > > cpumask)) {
> > > 
> > > Or, the dwarf info of vmlinux was wrongly made up ?
> > 
> > [root@quaco perf]# pfunct --decl_info --prototypes ../build/v5.0-rc3+/kernel/workqueue.o | grep -B2 wq_calc_node_cpumask
> > /* Used at: /home/acme/git/linux/kernel/workqueue.c */
> > /* <3137e> /home/acme/git/linux/kernel/workqueue.c:3648 */
> > void wq_calc_node_cpumask(const struct workqueue_attrs  * attrs, int node, int cpu_going_down, cpumask_t * cpumask);
> > --
> > /* Used at: /home/acme/git/linux/kernel/workqueue.c */
> > /* <1ff33> /home/acme/git/linux/kernel/workqueue.c:3648 */
> > bool wq_calc_node_cpumask(const struct workqueue_attrs  * attrs, int node, int cpu_going_down, cpumask_t * cpumask);
> > [root@quaco perf]#
> > 
> > Seem there is a difference in that dwarf_dieoffset(), so:
> > 
> > [root@quaco perf]# readelf -wi ../build/v5.0-rc3+/kernel/workqueue.o | grep -B1 -A7 wq_calc_node_cpumask
> >   <1><1ff33>: Abbrev Number: 49 (DW_TAG_subprogram)
> >      <1ff34>   DW_AT_name        : (indirect string, offset: 0x498d): wq_calc_node_cpumask
> >      <1ff38>   DW_AT_decl_file   : 1
> >      <1ff39>   DW_AT_decl_line   : 3648
> >      <1ff3b>   DW_AT_decl_column : 13
> >      <1ff3c>   DW_AT_prototyped  : 1
> >      <1ff3c>   DW_AT_type        : <0x2bb>
> >      <1ff40>   DW_AT_inline      : 1	(inlined)
> >      <1ff41>   DW_AT_sibling     : <0x1ffb1>
> > [root@quaco perf]#
> > 
> > So, see that DW_AT_inline?
> > 
> > This is just a function that was not declared inline but was forced
> > inline and inlined twice in that file.
> > 
> > [root@quaco perf]# pfunct --cc_inlined ../build/v5.0-rc3+/kernel/workqueue.o | grep wq_calc_node_cpumask
> > wq_calc_node_cpumask
> > [root@quaco perf]#
> > 
> > [root@quaco perf]# pfunct --help | grep -- --cc_inlined
> >    -H, --cc_inlined           not declared inline, inlined by compiler
> > [root@quaco perf]#
 
> Thank you so much !
> I understood it.
> But even though wq_calc_node_cpumask() was inlined by compiler,
> (therefore, "DW_AT_inline      : 1	(inlined)")
> why are there its symbol and its code like below ?
 
> The function can be compiled as two case ?
> 1) inlined case
> 2) normal function call case
> so declared twice in that file ?
> (as they seemed by "pfunct --decl_info")
 
> $ readelf -s vmlinux | grep wq_calc_node_cpumask
>   5887: ffffffff810815e0   143 FUNC    LOCAL  DEFAULT    1
> wq_calc_node_cpumask
 
> $ objdump -d vmlinux | grep wq_calc_node_cpumask
> ffffffff810815e0 <wq_calc_node_cpumask>:
> ...
 
> And wq_update_unbound_numa() calls the function wq_calc_node_cpumask()
> $ objdump -d vmlinux
> ...
> ffffffff81082ec0 <wq_update_unbound_numa>:
> ...
> ffffffff81082f69:       e8 72 e6 ff ff          callq  ffffffff810815e0
> <wq_calc_node_cpumask>

I guess so, the compiler may have found a reason for inlining it in one
case while no advantages on doing so in some other case.

Looking at that ../build/v5.0-rc3+/kernel/workqueue.o case again, we
first have the function, with that series of nops at its start (the
function tracer area):

0000000000001b70 <wq_calc_node_cpumask>:
 * wq_calc_node_cpumask - calculate a wq_attrs' cpumask for the specified node
    1b70:       e8 00 00 00 00          callq  1b75 <wq_calc_node_cpumask+0x5>
 * @attrs: the wq_attrs of the default pwq of the target workqueue
    1b75:       80 3d 00 00 00 00 00    cmpb   $0x0,0x0(%rip)        # 1b7c <wq_calc_node_cpumask+0xc>
    1b7c:       48 8b 47 08             mov    0x8(%rdi),%rax
    1b80:       74 4f                   je     1bd1 <wq_calc_node_cpumask+0x61>
    1b82:       80 7f 10 00             cmpb   $0x0,0x10(%rdi)
    1b86:       75 49                   jne    1bd1 <wq_calc_node_cpumask+0x61>
<SNIP>

Then later we have it being called from wq_update_unbound_numa():

00000000000043a0 <wq_update_unbound_numa>:
 * wq_update_unbound_numa - update NUMA affinity of a wq for CPU hot[un]plug
    43a0:       e8 00 00 00 00          callq  43a5 <wq_update_unbound_numa+0x5>
    43a5:       48 c7 c0 00 00 00 00    mov    $0x0,%rax
    43ac:       41 56                   push   %r14
    <SNIP>
    4439:       48 8b 00                mov    (%rax),%rax
    443c:       48 89 fb                mov    %rdi,%rbx
    443f:       44 89 e6                mov    %r12d,%esi
    4442:       48 8b b8 c8 02 00 00    mov    0x2c8(%rax),%rdi
    4449:       e8 22 d7 ff ff          callq  1b70 <wq_calc_node_cpumask>
    444e:       84 c0                   test   %al,%al
    4450:       0f 85 85 00 00 00       jne    44db <wq_update_unbound_numa+0x13b>
    4456:       4d 63 ec                movslq %r12d,%r13
    4459:       48 8d 6b 20             lea    0x20(%rbx),%rbp
<SNIP>

And then there is an excerpt on the unlikely text section:

Disassembly of section .text.unlikely:

<SNIP>
00000000000000b3 <__queue_work.cold.49>:
         * Return %true iff I'm a worker execuing a work item on @wq.  If
  b3:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
  ba:   e8 00 00 00 00          callq  bf <__queue_work.cold.49+0xc>
         * I'm @worker, it's safe to dereference it without locking.
  bf:   c6 05 00 00 00 00 01    movb   $0x1,0x0(%rip)        # c6 <__queue_work.cold.49+0x13>
  c6:   48 c7 c2 00 00 00 00    mov    $0x0,%rdx
  cd:   e9 00 00 00 00          jmpq   d2 <wq_calc_node_cpumask.cold.50>

00000000000000d2 <wq_calc_node_cpumask.cold.50>:
 * stable.
  d2:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
  d9:   c6 05 00 00 00 00 01    movb   $0x1,0x0(%rip)        # e0 <wq_calc_node_cpumask.cold.50+0xe>
  e0:   e8 00 00 00 00          callq  e5 <wq_calc_node_cpumask.cold.50+0x13>
 * Return: %true if the resulting @cpumask is different from @attrs->cpumask,
  e5:   31 c0                   xor    %eax,%eax
  e7:   c3                      retq
<SNIP>

I haven't checked, but perhaps this can explain the two DWARF records
for this function?

But looking again with decl_info + -V I see something else:

/* Used at: /home/acme/git/linux/kernel/workqueue.c */
/* <3137e> /home/acme/git/linux/kernel/workqueue.c:3648 */
void wq_calc_node_cpumask(const struct workqueue_attrs  * attrs, int node, int cpu_going_down, cpumask_t * cpumask);
/* definitions: 1 */

/* Used at: /home/acme/git/linux/kernel/workqueue.c */
/* <1ff33> /home/acme/git/linux/kernel/workqueue.c:3648 */
bool wq_calc_node_cpumask(const struct workqueue_attrs  * attrs, int node, int cpu_going_down, cpumask_t * cpumask);
/* definitions: 1 *

Which is really strange, the return type changes from 'void' to 'bool',
perhaps this is a bug in pfunct? I would have to dig deeper, but I'm
busy right now, so just documenting what I found in this brief
investigation, will try to come back to this when I find some time.

So more things I saw, hey, it seems eu-readelf tells the whole story,
see below:

[root@quaco perf]# readelf -wi ../build/v5.0-rc3+/kernel/workqueue.o | grep wq_calc_node_cpumask
    <1ff34>   DW_AT_name        : (indirect string, offset: 0x498d): wq_calc_node_cpumask
[root@quaco perf]#

[root@quaco perf]# eu-readelf -winfo ../build/v5.0-rc3+/kernel/workqueue.o 
<SNIP>
 [ 1ff33]    subprogram           abbrev: 49
             name                 (strp) "wq_calc_node_cpumask"
             decl_file            (data1) workqueue.c (1)
             decl_line            (data2) 3648
             decl_column          (data1) 13
             prototyped           (flag_present) yes
             type                 (ref4) [   2bb]
             inline               (data1) inlined (1)
             sibling              (ref4) [ 1ffb1]
 [ 1ff45]      formal_parameter     abbrev: 23
               name                 (strp) "attrs"
               decl_file            (data1) workqueue.c (1)
               decl_line            (data2) 3648
               decl_column          (data1) 64
               type                 (ref4) [ 1f2ee]
 [ 1ff52]      formal_parameter     abbrev: 23
               name                 (strp) "node"
               decl_file            (data1) workqueue.c (1)
               decl_line            (data2) 3648
               decl_column          (data1) 75
               type                 (ref4) [    d2]
 [ 1ff5f]      formal_parameter     abbrev: 23
               name                 (strp) "cpu_going_down"
               decl_file            (data1) workqueue.c (1)
               decl_line            (data2) 3649
               decl_column          (data1) 10
               type                 (ref4) [    d2]
 [ 1ff6c]      formal_parameter     abbrev: 23
               name                 (strp) "cpumask"
               decl_file            (data1) workqueue.c (1)
               decl_line            (data2) 3649
               decl_column          (data1) 37

And later:

[ 3137e]    subprogram           abbrev: 235
             abstract_origin      (ref4) [ 1ff33]
             ranges               (sec_offset) range list [  3170]
             frame_base           (exprloc)
              [ 0] call_frame_cfa
             GNU_all_call_sites   (flag_present) yes
             sibling              (ref4) [ 31675]
 [ 3138e]      formal_parameter     abbrev: 94
               abstract_origin      (ref4) [ 1ff45]
               location             (sec_offset) location list [  8cbb]
               GNU_locviews         (sec_offset) location list [  8cb7]
 [ 3139b]      formal_parameter     abbrev: 94
               abstract_origin      (ref4) [ 1ff52]
               location             (sec_offset) location list [  8cf5]
               GNU_locviews         (sec_offset) location list [  8cf1]
 [ 313a8]      formal_parameter     abbrev: 94
               abstract_origin      (ref4) [ 1ff5f]
               location             (sec_offset) location list [  8d31]
               GNU_locviews         (sec_offset) location list [  8d2b]
 [ 313b5]      formal_parameter     abbrev: 94
               abstract_origin      (ref4) [ 1ff6c]
               location             (sec_offset) location list [  8d7e]
               GNU_locviews         (sec_offset) location list [  8d7a]
 [ 313c2]      inlined_subroutine   abbrev: 48
               abstract_origin      (ref4) [ 2e987]
               entry_pc             (addr) .text+0x0000000000001b88 <wq_calc_node_cpumask+0x18>
               GNU_entry_view       (data2) 4
               low_pc               (addr) .text+0x0000000000001b88 <wq_calc_node_cpumask+0x18>
               high_pc              (data8) 3 (.text+0x0000000000001b8b <wq_calc_node_cpumask+0x1b>)
               call_file            (data1) workqueue.c (1)
               call_line            (data2) 3655
               call_column          (data1) 2
               sibling              (ref4) [ 313ef]
 [ 313e9]        formal_parameter     abbrev: 1
                 abstract_origin      (ref4) [ 2e998]
 [ 313ef]      inlined_subroutine   abbrev: 12
               abstract_origin      (ref4) [ 2fe89]
               entry_pc             (addr) .text+0x0000000000001b8b <wq_calc_node_cpumask+0x1b>
               GNU_entry_view       (data2) 1
               ranges               (sec_offset) range list [  31a0]
               call_file            (data1) workqueue.c (1)
               call_line            (data2) 3655
               call_column          (data1) 2
               sibling              (ref4) [ 31452]
 [ 3140a]        formal_parameter     abbrev: 1
                 abstract_origin      (ref4) [ 2feb5]
 [ 3140f]        formal_parameter     abbrev: 1
                 abstract_origin      (ref4) [ 2fea8]
 [ 31414]        formal_parameter     abbrev: 1
                 abstract_origin      (ref4) [ 2fe9b]
 [ 31419]        inlined_subroutine   abbrev: 44
                 abstract_origin      (ref4) [ 300b7]
                 entry_pc             (addr) .text+0x0000000000001b8b <wq_calc_node_cpumask+0x1b>
                 GNU_entry_view       (data2) 3
                 low_pc               (addr) .text+0x0000000000001b8b <wq_calc_node_cpumask+0x1b>
                 high_pc              (data8) 14 (.text+0x0000000000001b99 <wq_calc_node_cpumask+0x29>)
                 call_file            (data1) cpumask.h (13)
                 call_line            (data2) 405
                 call_column          (data1) 9
 [ 3143c]          formal_parameter     abbrev: 1
                   abstract_origin      (ref4) [ 300f0]
 [ 31441]          formal_parameter     abbrev: 1

 So back to what pfunct produces:

/* Used at: /home/acme/git/linux/kernel/workqueue.c */
/* <3137e> /home/acme/git/linux/kernel/workqueue.c:3648 */
void wq_calc_node_cpumask(const struct workqueue_attrs  * attrs, int node, int cpu_going_down, cpumask_t * cpumask);
/* definitions: 1 */

/* Used at: /home/acme/git/linux/kernel/workqueue.c */
/* <1ff33> /home/acme/git/linux/kernel/workqueue.c:3648 */
bool wq_calc_node_cpumask(const struct workqueue_attrs  * attrs, int node, int cpu_going_down, cpumask_t * cpumask);
/* definitions: 1 *

Above you can see the 3137e and 1ff33, which looks like it used it as a
function and also inlined it somewhere else.

And look at 

[ 3137e]    subprogram           abbrev: 235
             abstract_origin      (ref4) [ 1ff33]

The DW_TAG_abstract_origin for 3137e points to 1ff33.

Look at http://www.unix.org/reference/dwarf.2.06.pdf for the "3.3.8.2
Concrete Inlined Instances" section.

- Arnaldo



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux