Re: Bug: broken /proc/kcore in 6.13

[ Cc'ing the proper folks ]

-- Steve


On Fri, 17 Jan 2025 11:36:05 +0100
Alexandre Ferrieux <alexandre.ferrieux@xxxxxxxxx> wrote:

> Hi,
> 
> Somewhere in the 6.13 branch (not bisected yet, sorry), it stopped being
> possible to disassemble the running kernel from gdb through /proc/kcore.
> 
> More precisely:
> 
>  - look up a function in /proc/kallsyms => 0xADDRESS
>  - tell gdb to "core /proc/kcore"
>  - tell gdb to "disass 0xADDRESS,+LENGTH" (no need for a symbol table)
> 
>  * if the function is within the main kernel text, it is okay
>  * if the function is within a module's text, an infinite loop happens:
> 
> 
> Example:
> 
>  # egrep -w ice_process_skb_fields\|ksys_write /proc/kallsyms
>  ffffffffaf296c80 T ksys_write
>  ffffffffc0b67180 t ice_process_skb_fields       [ice]
> 
>  # gdb -ex "core /proc/kcore" -ex "disass 0xffffffffaf296c80,+256" -ex quit
>  ...
>  Dump of assembler code from 0xffffffffaf296c80 to 0xffffffffaf296d80:
>    ...
>  End of assembler dump.
> 
>  # gdb -ex "core /proc/kcore" -ex "disass 0xffffffffc0b67180,+256" -ex quit
>  ...
>  Dump of assembler code from 0xffffffffc0b67180 to 0xffffffffc0b67280:
>  (***NOTHING***)
>  ^C <= ineffective, need kill -9
> 
> 
> Ftrace (see below) shows that in this case read_kcore_iter() calls
> vread_iter() in an infinite loop:
> 
>         while (true) {
>                 read += vread_iter(iter, src, left);
>                 if (read == tsz)
>                         break;
> 
>                 src += read;
>                 left -= read;
> 
>                 if (fault_in_iov_iter_writeable(iter, left)) {
>                         ret = -EFAULT;
>                         goto out;
>                 }
>         }
> 
> As it turns out, in the offending situation, vread_iter() keeps returning 0,
> with "read" staying at its initial value of 0, and "tsz" nonzero. As a
> consequence, "src" stays stuck in a place where vread_iter() fails.
> 
> A cursory "git blame" shows that this interplay (vread_iter() legitimately
> returning zero, and read_kcore_iter() *not* testing for it) has been there
> for quite some time. So, while this is arguably fragile, the new element is
> possibly the actual memory layout that triggers the failing path.
> 
> Thanks for any insight, as this completely breaks debugging the running kernel
> in 6.13.
> 
> -Alex
> 
> 
> ------------
> # tracer: nop
> #
> # entries-in-buffer/entries-written: 0/0   #P:48
> #
> #           TASK-PID     CPU#     TIMESTAMP  FUNCTION
> #              | |         |         |         |
>            <...>-3304    [045]    487.295283: kprobe_read_kcore_iter: (read_kcore_iter+0x4/0xae0) pos=0x7fffc0b6b000
>            <...>-3304    [045]    487.295298: kprobe_vread_iter: (vread_iter+0x4/0x4e0) addr=0xffffffffc0b67000 len=384
>            <...>-3304    [045]    487.295326: kretprobe_vread_iter: (read_kcore_iter+0x3e6/0xae0 <- vread_iter) arg1=0
>            <...>-3304    [045]    487.295329: kprobe_vread_iter: (vread_iter+0x4/0x4e0) addr=0xffffffffc0b67000 len=384
>            <...>-3304    [045]    487.295338: kretprobe_vread_iter: (read_kcore_iter+0x3e6/0xae0 <- vread_iter) arg1=0
>            <...>-3304    [045]    487.295339: kprobe_vread_iter: (vread_iter+0x4/0x4e0) addr=0xffffffffc0b67000 len=384
>            <...>-3304    [045]    487.295345: kretprobe_vread_iter: (read_kcore_iter+0x3e6/0xae0 <- vread_iter) arg1=0
>            <...>-3304    [045]    487.295347: kprobe_vread_iter: (vread_iter+0x4/0x4e0) addr=0xffffffffc0b67000 len=384
>            <...>-3304    [045]    487.295352: kretprobe_vread_iter: (read_kcore_iter+0x3e6/0xae0 <- vread_iter) arg1=0
>            <...>-3304    [045]    487.295353: kprobe_vread_iter: (vread_iter+0x4/0x4e0) addr=0xffffffffc0b67000 len=384
> ...
> 




