Bug: broken /proc/kcore in 6.13

Hi,

Somewhere in the 6.13 branch (not bisected yet, sorry), it stopped being
possible to disassemble the running kernel from gdb through /proc/kcore.

More precisely:

 - look up a function in /proc/kallsyms => 0xADDRESS
 - tell gdb to "core /proc/kcore"
 - tell gdb to "disass 0xADDRESS,+LENGTH" (no need for a symbol table)

 * if the function is within the main kernel text, it is okay
 * if the function is within a module's text, an infinite loop happens:


Example:

 # egrep -w ice_process_skb_fields\|ksys_write /proc/kallsyms
 ffffffffaf296c80 T ksys_write
 ffffffffc0b67180 t ice_process_skb_fields       [ice]

 # gdb -ex "core /proc/kcore" -ex "disass 0xffffffffaf296c80,+256" -ex quit
 ...
 Dump of assembler code from 0xffffffffaf296c80 to 0xffffffffaf296d80:
   ...
 End of assembler dump.

 # gdb -ex "core /proc/kcore" -ex "disass 0xffffffffc0b67180,+256" -ex quit
 ...
 Dump of assembler code from 0xffffffffc0b67180 to 0xffffffffc0b67280:
 (***NOTHING***)
 ^C <= ineffective, kill -9 needed


Ftrace (see below) shows that, in this case, read_kcore_iter() calls
vread_iter() in an infinite loop:

        while (true) {
                read += vread_iter(iter, src, left);
                if (read == tsz)
                        break;

                src += read;
                left -= read;

                if (fault_in_iov_iter_writeable(iter, left)) {
                        ret = -EFAULT;
                        goto out;
                }
        }

As it turns out, in the offending situation, vread_iter() keeps returning 0,
with "read" staying at its initial value of 0, and "tsz" nonzero. As a
consequence, "src" stays stuck in a place where vread_iter() fails.

A cursory "git blame" shows that this interplay (vread_iter() legitimately
returning zero, and read_kcore_iter() *not* testing for it) has been there for
quite some time. So, while this is arguably fragile, what is new in 6.13 is
possibly the actual memory layout, which now triggers the failing path.

Thanks for any insight, as this completely breaks debugging the running kernel
in 6.13.

-Alex


------------
# tracer: nop
#
# entries-in-buffer/entries-written: 0/0   #P:48
#
#           TASK-PID     CPU#     TIMESTAMP  FUNCTION
#              | |         |         |         |
           <...>-3304    [045]    487.295283: kprobe_read_kcore_iter: (read_kcore_iter+0x4/0xae0) pos=0x7fffc0b6b000
           <...>-3304    [045]    487.295298: kprobe_vread_iter: (vread_iter+0x4/0x4e0) addr=0xffffffffc0b67000 len=384
           <...>-3304    [045]    487.295326: kretprobe_vread_iter: (read_kcore_iter+0x3e6/0xae0 <- vread_iter) arg1=0
           <...>-3304    [045]    487.295329: kprobe_vread_iter: (vread_iter+0x4/0x4e0) addr=0xffffffffc0b67000 len=384
           <...>-3304    [045]    487.295338: kretprobe_vread_iter: (read_kcore_iter+0x3e6/0xae0 <- vread_iter) arg1=0
           <...>-3304    [045]    487.295339: kprobe_vread_iter: (vread_iter+0x4/0x4e0) addr=0xffffffffc0b67000 len=384
           <...>-3304    [045]    487.295345: kretprobe_vread_iter: (read_kcore_iter+0x3e6/0xae0 <- vread_iter) arg1=0
           <...>-3304    [045]    487.295347: kprobe_vread_iter: (vread_iter+0x4/0x4e0) addr=0xffffffffc0b67000 len=384
           <...>-3304    [045]    487.295352: kretprobe_vread_iter: (read_kcore_iter+0x3e6/0xae0 <- vread_iter) arg1=0
           <...>-3304    [045]    487.295353: kprobe_vread_iter: (vread_iter+0x4/0x4e0) addr=0xffffffffc0b67000 len=384
...




