Re: [PATCH v2 1/3] kgdb: Add kgdb_mem2ebin function for converting memory to binary format

Doug Anderson <dianders@xxxxxxxxxxxx> · Thu, 12 Dec 2024 16:55:27 -0800

Hi,

On Wed, Dec 11, 2024 at 7:40 AM Amal Raj T <tjarlama@xxxxxxxxx> wrote:
>
> From: Amal Raj T <amalrajt@xxxxxxxx>
>
> Add a new function kgdb_mem2ebin that converts memory
> to binary format, escaping special characters
> ('$', '#', and '}'). kgdb_mem2ebin function ensures
> that memory data is properly formatted and escaped
> before being sent over the wire. Additionally, this
> function reduces the amount of data exchanged between
> debugger compared to hex.
>
> Signed-off-by: Amal Raj T <amalrajt@xxxxxxxx>
> ---
>  include/linux/kgdb.h   |  1 +
>  kernel/debug/gdbstub.c | 31 +++++++++++++++++++++++++++++++
>  2 files changed, 32 insertions(+)
>
> diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> index 76e891ee9e37..fa3cf38a14de 100644
> --- a/include/linux/kgdb.h
> +++ b/include/linux/kgdb.h
> @@ -322,6 +322,7 @@ extern struct kgdb_io *dbg_io_ops;
>
>  extern int kgdb_hex2long(char **ptr, unsigned long *long_val);
>  extern char *kgdb_mem2hex(char *mem, char *buf, int count);
> +extern char *kgdb_mem2ebin(char *mem, char *buf, int count);
>  extern int kgdb_hex2mem(char *buf, char *mem, int count);
>
>  extern int kgdb_isremovedbreak(unsigned long addr);
> diff --git a/kernel/debug/gdbstub.c b/kernel/debug/gdbstub.c
> index f625172d4b67..6198d2eb49c4 100644
> --- a/kernel/debug/gdbstub.c
> +++ b/kernel/debug/gdbstub.c
> @@ -257,6 +257,37 @@ char *kgdb_mem2hex(char *mem, char *buf, int count)
>         return buf;
>  }
>
> +/*
> + * Convert memory to binary format for GDB remote protocol
> + * transmission, escaping special characters ($, #, and }).

Why exactly are those characters special? What considers them special
and so why do you need to escape them? I guess you really just care
about avoiding # and $ and you're using '}' as your escape character
so you need to escape that too?

Your function comment should describe the escaping method and ideally
provide a few examples.

> + */
> +char *kgdb_mem2ebin(char *mem, char *buf, int count)

One of the two buffers seems like it should be "const", right? That
would help document which was input and which was output. I guess
"mem" is the input?

"count" should be "size_t"

Presumably there should be two counts talking about the sizes of each
buffer, or at least some documentation should be in the function
comment talking about the fact that "buf" needs to be twice the size?

> +{
> +       char *tmp;
> +       int err;
> +
> +       tmp = buf + count;

Could use a comment that the buffer needs to be 2x long to handle
escaping and that you'll use the 2nd half as a temp buffer.

> +
> +       err = copy_from_kernel_nofault(tmp, mem, count);
> +       if (err)
> +               return NULL;
> +       while (count > 0) {

If you change `count` to `size_t` the above test won't work because
it'll be unsigned. Still probably better to use `size_t`, but just a
warning that you'll have to change the condition.

> +               unsigned char c = *tmp;
> +
> +               if (c == 0x7d || c == 0x23 || c == 0x24) {
> +                       *buf++ = 0x7d;

Don't hardcode. Much better to use '}', '#', '$'

> +                       *buf++ = c ^ 0x20;
> +               } else {
> +                       *buf++ = c;
> +               }
> +               count -= 1;
> +               tmp += 1;

count--;
tmp++;

> +       }
> +       *buf = 0;

Why is the resulting buffer '\0' terminated when the source buffer
isn't? Adding this termination means that the destination buffer needs
to be 1 byte bigger, right?

...or maybe the source buffer doesn't actually have any embedded '\0'
characters and you're using this for termination to the other side? It
would be good to clarify.

In other words, if the input is 2 bytes big:
'}}'

The output buffer will be 5 bytes big:
'}]}]\0'

> +
> +       return buf;

What exactly is this return value? It points right past the end of the buffer?

You seem to just use it as an error value (NULL means error, non-NULL
means no error). Why not return an error code then?

-Doug