On Sat, Jan 25, 2025 at 6:02 AM Jordan Rome <linux@xxxxxxxxxxxxxx> wrote: > > On Fri, Jan 24, 2025 at 7:09 PM Andrii Nakryiko > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > On Fri, Jan 24, 2025 at 10:22 AM Jordan Rome <linux@xxxxxxxxxxxxxx> wrote: > > > > > > Similar to `access_process_vm` but specific to strings. > > > Also chunks reads by page and utilizes `strscpy` > > > for handling null termination. > > > > > > Signed-off-by: Jordan Rome <linux@xxxxxxxxxxxxxx> > > > --- > > > include/linux/mm.h | 3 ++ > > > mm/memory.c | 119 +++++++++++++++++++++++++++++++++++++++++++++ > > > mm/nommu.c | 68 ++++++++++++++++++++++++++ > > > 3 files changed, 190 insertions(+) > > > > > > > [...] > > > > > + maddr = kmap_local_page(page); > > > + retval = strscpy(buf, maddr + offset, bytes); > > > + unmap_and_put_page(page, maddr); > > > + > > > + if (retval > -1 && retval < bytes) { > > > + /* found the end of the string */ > > > + buf += retval; > > > + goto out; > > > + } > > > + > > > + if (retval == -E2BIG) { > > > > nit: strscpy() can't return any other error, so I'd structure result > > handling as: > > > > if (retval < 0) { > > /* that annoying last byte copy */ > > retval = bytes; > > } > > if (retval < bytes) { > > /* "we are done" handling */ > > } > > > > /* common len, buf, addr adjustment logic stays here */ > > > > Ack. Actually, I thought of a way to make this cleaner > and correct. > > > > > but also here's the question. If we get E2BIG, while bytes is exactly > > how many bytes we have left in the buffer, the last byte should be > > zero, no? So this should be cleanly handled, right? Or do we have a > > test for that and it works already? > > > > Ok, I found an inconsistency that gets handled in the BPF helper > `bpf_copy_from_user_task_str`, which I'm going to fix in the next > version of this patch. > > Let me explain how this function is SUPPOSED to work > and enumerate some different test cases (a lot of which are in commit 3). > > Note1: all of the target strings > are null terminated (if you try to copy a string that's not null terminated > you may accidentally copy junk). > > Note2: strscopy returns E2BIG if the len requested isn't as long > as the string INCLUDING the nul terminator. So if you want to copy > "test_data\0" you need to request 10 bytes not 9. And if you request > 10 or anything greater it returns 9. yeah, that's actually smart for strscpy() to have this protocol, it allows to read full string and know that it's full (if the string fits exactly into the dst buffer). But I think on the BPF side we have a convention to return the size *including* the NUL byte, so we get a bit of inefficiency. But oh well, I think consistency is better to be maintained. > > Note3: This function, as opposed to the bpf helper calling > this function, returns the number copied NOT including the nul terminator. > > So... the target string is "test_data". > > Request 10 bytes, return 9. > Request 2 bytes, return 1. > Request 20 bytes, return 9. > > Now, let's say this string falls across a page boundary > where "_" is the last character in the page. > > Request 10 bytes, which becomes a request > for 5 bytes for the first page. strscopy returns E2BIG > and copies "test\0" into the buffer. We copy the last > bytes of the page into the buffer, which is the "_", > and then request 5 more bytes on the next page, > copying "data\0" and strscopy returns 4. Return 9. > > Now let's say the last "a" is the last character on the page. > Request 10 bytes, which becomes a request > for 9 bytes. strscopy returns E2BIG and copies "test_dat\0" > into the buffer. Once again we copy the last byte > of the page into the buffer, which is "a" > and we request 1 more byte of the next page, which > is the nul terminator. strscopy returns 0 and this > function returns 9. so I was worried about the situation where the string is test_data|\0, like you describe (where | is denoting page boundary, NUL is right after the page end), and we request 9 bytes (not 10 like in your example). By manually copying that "a" (last byte of the page), we might not return zero terminated dst buf (that was my theory). Would be good to have a test proving otherwise. But I'll go and read the latest version, maybe I'm overthinking this. > > Again note, that this version of the code has a bug > that is "handled" by the bpf helper and I'm going to fix. > > > > + retval = bytes; > > > + /* > > > + * Because strscpy always null terminates we need to > > > + * copy the last byte in the page if we are going to > > > + * load more pages > > > + */ > > > + if (bytes < len) { > > > + end = bytes - 1; > > > + copy_from_user_page(vma, > > > + page, > > > + addr + end, > > > + buf + end, [...]