Re: Bug: broken /proc/kcore in 6.13

Thanks Alexandre for the bug report.  It looks like you're CC'ing a
bunch of networking people because you're debugging something
networking-related, but the actual bug is in read_kcore_iter(), so
let's CC Lorenzo instead.

The read_kcore_iter() code needs to be able to handle a zero return
from vread_iter().  The comments for vread_iter() say that zero is
returned in the following situation:

 * If [addr...addr+count) doesn't includes any intersects with alive
 * vm_struct area, returns 0. @buf should be kernel's buffer.

I don't know the code well enough to say whether the -EFAULT and goto out
that you wrote as a quick test are correct:

> +                               res = vread_iter(iter, src, left);
> +                               if (!res) {
> +                                       ret = -EFAULT;
> +                                       goto out;
> +                               }

Or if we should just break:

		res = vread_iter(iter, src, left);
		if (res == 0)
			break;
		read += res;
		if (read == tsz)
			break;

Either way, Lorenzo probably knows the answer, so this will be an
easy fix thanks to your excellent bug report.  ;)
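
A minimal user-space sketch of the two options, for illustration only.
It is not kernel code: struct fake_area, fake_vread(), the 128-byte
per-call cap, both read_loop_*() helpers and check_read_loops() are
hypothetical names made up here.  Only the general shape of the retry
loop is taken from read_kcore_iter(), and the sketch assumes that a
zero return means "nothing readable at this address".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a readable vmalloc area: [start, end). */
struct fake_area {
	unsigned long start;
	unsigned long end;
};

/*
 * Hypothetical stand-in for vread_iter().  Returns the number of bytes
 * "copied", or 0 when addr does not fall inside the area.  Capped at 128
 * bytes per call so that the caller's retry loop actually iterates.
 */
static size_t fake_vread(const struct fake_area *a, unsigned long addr,
			 size_t count)
{
	size_t n;

	if (addr < a->start || addr >= a->end)
		return 0;
	n = a->end - addr;
	if (n > count)
		n = count;
	if (n > 128)
		n = 128;
	return n;
}

/* Option 1: stop on a zero return and report the bytes read so far. */
static size_t read_loop_break(const struct fake_area *a, unsigned long src,
			      size_t tsz)
{
	size_t read = 0;

	while (read < tsz) {
		size_t res = fake_vread(a, src + read, tsz - read);

		if (res == 0)
			break;
		read += res;
	}
	return read;
}

/* Option 2: treat a zero return as an error, as in the quick test patch. */
static long read_loop_efault(const struct fake_area *a, unsigned long src,
			     size_t tsz)
{
	size_t read = 0;

	while (read < tsz) {
		size_t res = fake_vread(a, src + read, tsz - read);

		if (res == 0)
			return -EFAULT;
		read += res;
	}
	return (long)read;
}

/* Small self-check exercising both variants. */
void check_read_loops(void)
{
	struct fake_area a = { 1000, 2000 };

	assert(read_loop_break(&a, 1000, 384) == 384);
	assert(read_loop_break(&a, 5000, 384) == 0);
	assert(read_loop_efault(&a, 5000, 384) == -EFAULT);
}
```

Both variants terminate when the stand-in keeps returning 0.  They
differ only in what the caller sees: a short (possibly zero-length)
read versus -EFAULT.  Which of the two is right for /proc/kcore is
exactly the open question above.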

regards,
dan carpenter


On Fri, Jan 17, 2025 at 01:02:03PM +0100, Alexandre Ferrieux wrote:
> Hi,
> 
> Somewhere in the 6.13 branch (not bisected yet, sorry), it stopped being
> possible to disassemble the running kernel from gdb through /proc/kcore.
> 
> More precisely:
> 
>  - look up a function in /proc/kallsyms => 0xADDRESS
>  - tell gdb to "core /proc/kcore"
>  - tell gdb to "disass 0xADDRESS,+LENGTH" (no need for a symbol table)
> 
>  * if the function is within the main kernel text, it is okay
>  * if the function is within a module's text, an infinite loop happens:
> 
> 
> Example:
> 
>  # egrep -w ice_process_skb_fields\|ksys_write /proc/kallsyms
>  ffffffffaf296c80 T ksys_write
>  ffffffffc0b67180 t ice_process_skb_fields       [ice]
> 
>  # gdb -ex "core /proc/kcore" -ex "disass 0xffffffffaf296c80,+256" -ex quit
>  ...
>  Dump of assembler code from 0xffffffffaf296c80 to 0xffffffffaf296d80:
>    ...
>  End of assembler dump.
> 
>  # gdb -ex "core /proc/kcore" -ex "disass 0xffffffffc0b67180,+256" -ex quit
>  ...
>  Dump of assembler code from 0xffffffffc0b67180 to 0xffffffffc0b67280:
>  (***NOTHING***)
>  ^C <= ineffective, need kill -9
> 
> 
> Ftrace (see below) shows in this case read_kcore_iter() calls vread_iter() in an
> infinite loop:
> 
>         while (true) {
>                 read += vread_iter(iter, src, left);
>                 if (read == tsz)
>                         break;
> 
>                 src += read;
>                 left -= read;
> 
>                 if (fault_in_iov_iter_writeable(iter, left)) {
>                         ret = -EFAULT;
>                         goto out;
>                 }
>         }
> 
> As it turns out, in the offending situation, vread_iter() keeps returning 0,
> with "read" staying at its initial value of 0, and "tsz" nonzero. As a
> consequence, "src" stays stuck in a place where vread_iter() fails.
> 
> A cursory "git blame" shows that this interplay (vread_iter() legitimately
> returning zero, and read_kcore_iter() *not* testing it) has been there for
> quite some time. So, while this is arguably fragile, possibly the new situation
> lies in the actual memory layout that triggers the failing path.
> 
> To add weight to this hypothesis, I forced "breaking out" of the loop in that
> case, see patch below, but while this cures the loop, all such attempts (on
> module-text addresses) lead to a zero return from vread_iter(), as though some
> internal (in-kernel) permission barrier prevented access to those areas.
> 
> 
> diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
> index e376f48c4b8b..8c5f29240542 100644
> --- a/fs/proc/kcore.c
> +++ b/fs/proc/kcore.c
> @@ -531,7 +531,13 @@ static ssize_t read_kcore_iter(struct kiocb *iocb, struct iov_iter *iter)
>                          * again until we are done.
>                          */
>                         while (true) {
> -                               read += vread_iter(iter, src, left);
> +                               long res;
> +                               res = vread_iter(iter, src, left);
> +                               if (!res) {
> +                                       ret = -EFAULT;
> +                                       goto out;
> +                               }
> +                               read += res;
>                                 if (read == tsz)
>                                         break;
> 
> 
> 
> Thanks for any insight, as this completely breaks debugging the running kernel
> in 6.13.
> 
> -Alex
> 
> 
> ------------
> # tracer: nop
> #
> # entries-in-buffer/entries-written: 0/0   #P:48
> #
> #           TASK-PID     CPU#     TIMESTAMP  FUNCTION
> #              | |         |         |         |
>            <...>-3304    [045]    487.295283: kprobe_read_kcore_iter: (read_kcore_iter+0x4/0xae0) pos=0x7fffc0b6b000
>            <...>-3304    [045]    487.295298: kprobe_vread_iter: (vread_iter+0x4/0x4e0) addr=0xffffffffc0b67000 len=384
>            <...>-3304    [045]    487.295326: kretprobe_vread_iter: (read_kcore_iter+0x3e6/0xae0 <- vread_iter) arg1=0
>            <...>-3304    [045]    487.295329: kprobe_vread_iter: (vread_iter+0x4/0x4e0) addr=0xffffffffc0b67000 len=384
>            <...>-3304    [045]    487.295338: kretprobe_vread_iter: (read_kcore_iter+0x3e6/0xae0 <- vread_iter) arg1=0
>            <...>-3304    [045]    487.295339: kprobe_vread_iter: (vread_iter+0x4/0x4e0) addr=0xffffffffc0b67000 len=384
>            <...>-3304    [045]    487.295345: kretprobe_vread_iter: (read_kcore_iter+0x3e6/0xae0 <- vread_iter) arg1=0
>            <...>-3304    [045]    487.295347: kprobe_vread_iter: (vread_iter+0x4/0x4e0) addr=0xffffffffc0b67000 len=384
>            <...>-3304    [045]    487.295352: kretprobe_vread_iter: (read_kcore_iter+0x3e6/0xae0 <- vread_iter) arg1=0
>            <...>-3304    [045]    487.295353: kprobe_vread_iter: (vread_iter+0x4/0x4e0) addr=0xffffffffc0b67000 len=384
> ...
> 
> 
> 



