> On Nov 5, 2020, at 10:16 AM, Song Liu <songliubraving@xxxxxx> wrote:
>
>
>
>> On Nov 4, 2020, at 6:25 PM, Daniel Xu <dxu@xxxxxxxxx> wrote:
>>
>> do_strncpy_from_user() may copy some extra bytes after the NUL
>
> We have multiple use of "NUL" here, should be "NULL"?

Just realized strncpy_from_user.c uses "NUL", so nevermind...

>
>> terminator into the destination buffer. This usually does not matter for
>> normal string operations. However, when BPF programs key BPF maps with
>> strings, this matters a lot.
>>
>> A BPF program may read strings from user memory by calling the
>> bpf_probe_read_user_str() helper which eventually calls
>> do_strncpy_from_user(). The program can then key a map with the
>> resulting string. BPF map keys are fixed-width and string-agnostic,
>> meaning that map keys are treated as a set of bytes.
>>
>> The issue is when do_strncpy_from_user() overcopies bytes after the NUL
>> terminator, it can result in seemingly identical strings occupying
>> multiple slots in a BPF map. This behavior is subtle and totally
>> unexpected by the user.
>>
>> This commit uses the proper word-at-a-time APIs to avoid overcopying.
>>
>> Fixes: 6ae08ae3dea2 ("bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers")
>> Signed-off-by: Daniel Xu <dxu@xxxxxxxxx>
>> ---
>> lib/strncpy_from_user.c | 9 +++++++--
>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
>> index e6d5fcc2cdf3..d084189eb05c 100644
>> --- a/lib/strncpy_from_user.c
>> +++ b/lib/strncpy_from_user.c
>> @@ -35,17 +35,22 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src,
>> 		goto byte_at_a_time;
>>
>> 	while (max >= sizeof(unsigned long)) {
>> -		unsigned long c, data;
>> +		unsigned long c, data, mask, *out;
>>
>> 		/* Fall back to byte-at-a-time if we get a page fault */
>> 		unsafe_get_user(c, (unsigned long __user *)(src+res), byte_at_a_time);
>>
>> -		*(unsigned long *)(dst+res) = c;
>> 		if (has_zero(c, &data, &constants)) {
>> 			data = prep_zero_mask(c, data, &constants);
>> 			data = create_zero_mask(data);
>> +			mask = zero_bytemask(data);
>> +			out = (unsigned long *)(dst+res);
>> +			*out = (*out & ~mask) | (c & mask);
>> 			return res + find_zero(data);
>> +		} else {
>
> This else clause is not needed, as we return in the if clause.
>
>> +			*(unsigned long *)(dst+res) = c;
>> 		}
>> +
>> 		res += sizeof(unsigned long);
>> 		max -= sizeof(unsigned long);
>> 	}
>> --
>> 2.28.0
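
For anyone following the thread, here is a rough userspace sketch of the map-key problem the commit message describes and of the masked merge used in the hunk above. It is not kernel code. The names keep_mask(), copy_word() and keys_match() are made up for illustration, and the mask is built byte by byte (covering the string bytes plus the terminator) rather than with the kernel's word-at-a-time helpers, so its exact coverage may differ from what zero_bytemask() returns.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WORD_SIZE 8	/* one 64-bit word, standing in for a fixed-width map key */

/*
 * Hypothetical helper: build a byte mask that covers src up to and
 * including its first NUL. Built byte by byte, so it does not depend
 * on endianness.
 */
static uint64_t keep_mask(const unsigned char *src)
{
	unsigned char m[WORD_SIZE] = {0};
	uint64_t mask;
	size_t i;

	for (i = 0; i < WORD_SIZE; i++) {
		m[i] = 0xff;
		if (src[i] == '\0')
			break;
	}
	memcpy(&mask, m, sizeof(mask));
	return mask;
}

/* Copy one word from src to dst, either blindly or merged through the mask. */
static void copy_word(unsigned char *dst, const unsigned char *src, int masked)
{
	uint64_t c, out;

	assert(dst != NULL && src != NULL);
	memcpy(&c, src, sizeof(c));
	memcpy(&out, dst, sizeof(out));

	if (masked) {
		uint64_t mask = keep_mask(src);

		/* Same shape as the patch: keep dst bytes outside the mask. */
		out = (out & ~mask) | (c & mask);
	} else {
		/* Overcopy: bytes after the NUL land in dst as well. */
		out = c;
	}
	memcpy(dst, &out, sizeof(out));
}

/*
 * Read "abc" from two sources that differ only after the NUL, into
 * zero-initialized keys. Returns 1 if the keys compare equal bytewise.
 */
static int keys_match(int masked)
{
	static const unsigned char src_a[WORD_SIZE] = { 'a', 'b', 'c', '\0', 'X', 'X', 'X', 'X' };
	static const unsigned char src_b[WORD_SIZE] = { 'a', 'b', 'c', '\0', 'Y', 'Y', 'Y', 'Y' };
	unsigned char key_a[WORD_SIZE] = {0};
	unsigned char key_b[WORD_SIZE] = {0};

	copy_word(key_a, src_a, masked);
	copy_word(key_b, src_b, masked);
	return memcmp(key_a, key_b, WORD_SIZE) == 0;
}
```

Calling keys_match(0) models the overcopy case, where the two keys differ in the bytes after the terminator even though both hold "abc". Calling keys_match(1) models the masked merge, where the trailing bytes of the zeroed key are left alone and the keys compare equal.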