Hi, On Mon, Jan 13, 2025 at 8:09 AM Amal <tjarlama@xxxxxxxxx> wrote: > > On Fri, Dec 13, 2024 at 12:55 AM Doug Anderson <dianders@xxxxxxxxxxxx> wrote: > > > > Hi, > > > > On Wed, Dec 11, 2024 at 7:40 AM Amal Raj T <tjarlama@xxxxxxxxx> wrote: > > > > > > From: Amal Raj T <amalrajt@xxxxxxxx> > > > > > > Add a new function kgdb_mem2ebin that converts memory > > > to binary format, escaping special characters > > > ('$', '#', and '}'). kgdb_mem2ebin function ensures > > > that memory data is properly formatted and escaped > > > before being sent over the wire. Additionally, this > > > function reduces the amount of data exchanged between > > > debugger compared to hex. > > > > > > Signed-off-by: Amal Raj T <amalrajt@xxxxxxxx> > > > --- > > > include/linux/kgdb.h | 1 + > > > kernel/debug/gdbstub.c | 31 +++++++++++++++++++++++++++++++ > > > 2 files changed, 32 insertions(+) FWIW, it looks like Stephen has already responded to your V3 and the protocol might change a bunch. I'm not going to do a detailed review on it at this time. I'll comment on a few of the things below, though. One other note is that you should have done a "reply to all" when responding to my review feedback, not just a reply to me. The responses should be archived on the lists unless there was a specific reason to exclude them. Adding the lists back here since I don't see anything sensitive in your reply. > > > diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h > > > index 76e891ee9e37..fa3cf38a14de 100644 > > > --- a/include/linux/kgdb.h > > > +++ b/include/linux/kgdb.h > > > @@ -322,6 +322,7 @@ extern struct kgdb_io *dbg_io_ops; > > > > > > extern int kgdb_hex2long(char **ptr, unsigned long *long_val); > > > extern char *kgdb_mem2hex(char *mem, char *buf, int count); > > > +extern char *kgdb_mem2ebin(char *mem, char *buf, int count); > > > extern int kgdb_hex2mem(char *buf, char *mem, int count); > > > > > > extern int kgdb_isremovedbreak(unsigned long addr); > > > diff --git a/kernel/debug/gdbstub.c b/kernel/debug/gdbstub.c > > > index f625172d4b67..6198d2eb49c4 100644 > > > --- a/kernel/debug/gdbstub.c > > > +++ b/kernel/debug/gdbstub.c > > > @@ -257,6 +257,37 @@ char *kgdb_mem2hex(char *mem, char *buf, int count) > > > return buf; > > > } > > > > > > +/* > > > + * Convert memory to binary format for GDB remote protocol > > > + * transmission, escaping special characters ($, #, and }). > > > > Why exactly are those characters special? What considers them special > > and so why do you need to escape them? I guess you really just care > > about avoiding # and $ and you're using '}' as your escape character > > so you need to escape that too? > > > > Your function comment should describe the escaping method and ideally > > provide a few examples. > > > > > Added relevant links in the commit message and commented two examples > in code > > > + */ > > > +char *kgdb_mem2ebin(char *mem, char *buf, int count) > > > > One of the two buffers seems like it should be "const", right? That > > would help document which was input and which was output. I guess > > "mem" is the input? > > > > "count" should be "size_t" > > > This was copied from `kgdb_mem2hex` which uses a similar template. > Should we continue using `int` or just this function need to be migrated to > `size_t` In general the existing kdb/kgdb functions are not always the best examples. I think size_t is correct here. Indeed, if you look at what "kgdb_mem2hex" does with "count" it passes it to copy_from_kernel_nofault() which takes a size_t. Feel free to post a (separate) patch fixing kgdb_mem2hex(). > > Presumably there should be two counts talking about the sizes of each > > buffer, or at least some documentation should be in the function > > comment talking about the fact that "buf" needs to be twice the size? > > > > > > > +{ > > > + char *tmp; > > > + int err; > > > + > > > + tmp = buf + count; > > > > Could use a comment that the buffer needs to be 2x long to handle > > escaping and that you'll use the 2nd half as a temp buffer. > > > > > Done, commented in v3 > > > + > > > + err = copy_from_kernel_nofault(tmp, mem, count); > > > + if (err) > > > + return NULL; > > > + while (count > 0) { > > > > If you change `count` to `size_t` the above test won't work because > > it'll be unsigned. Still probably better to use `size_t`, but just a > > warning that you'll have to change the condition. > > > > > > > + unsigned char c = *tmp; > > > + > > > + if (c == 0x7d || c == 0x23 || c == 0x24) { > > > + *buf++ = 0x7d; > > > > Don't hardcode. Much better to use '}', '#', '$' > > > Fixed in v3 > > > > > + *buf++ = c ^ 0x20; > > > + } else { > > > + *buf++ = c; > > > + } > > > + count -= 1; > > > + tmp += 1; > > > > count--; > > tmp++; > > > Fixed in v3 > > > + } > > > + *buf = 0; > > > > Why is the resulting buffer '\0' terminated when the source buffer > > isn't? Adding this termination means that the destination buffer needs > > to be 1 byte bigger, right? > > > > ...or maybe the source buffer doesn't actually have any embedded '\0' > > characters and you're using this for termination to the other side? It > > would be good to clarify. > > > > In other words, if the input is 2 bytes big: > > '}}' > > > > The output buffer will be 5 bytes big: > > '}]}]\0' > > > Don't think null termination is required in this case. > In this case, the `gdb_serial_stub` already initializes the buffer with `\0`, > and `put_packet` iterates over the buffer until the first occurrence of \0. > This works for `vmcoreinfo` query. But for input buffers with additional > `\0` characters need to be escaped, to make sure `put_packet` parsed the > entire buffer. > > > + > > > + return buf; > > > > What exactly is this return value? It points right past the end of the buffer? > > > > You seem to just use it as an error value (NULL means error, non-NULL > > means no error). Why not return an error code then? > > > Other encoders also return a `char *`. However there is no error handling > anywhere, the return value is discarded at the caller side. As per above, existing kdb/kgdb code isn't always the best example to follow. Personally I'd rather see an error code returned unless there's a good reason not to. If you really want to return a pointer like this, a bare minimum would be to document it. The kgdb_mem2hex() function you pointed to _does_ at least document what is returned. -Doug