On Tue, Nov 03, 2020 at 09:45:41AM -0800, Daniel Xu wrote: > Hi, > > I recently received a bpftrace bug report [0] that identical strings > were being stored as separate entries in maps. I dug into the issue and > it turns out that bpf_probe_read*str() may store junk after the NUL > terminator due to how do_strncpy_from_user() does long-sized copies. > > Here is the code in question from lib/strncpy_from_user.c: > > *(unsigned long *)(dst+res) = c; > if (has_zero(c, &data, &constants)) { > data = prep_zero_mask(c, data, &constants); > data = create_zero_mask(data); > return res + find_zero(data); > } > > This behavior is likely to cause subtle issues in bpf programs so a > kernel fix may be necessary. Looks like progs/pyperf.h will hit this issue since it's doing: get_frame_data(frame_ptr, pidData, &frame, &sym)) { int32_t *symbol_id = bpf_map_lookup_elem(&symbolmap, &sym); where get_frame_data() is doing: bpf_probe_read_user_str(&symbol->file, sizeof(symbol->file), frame->co_filename + pidData->offsets.String_data); progs/profiler.inc.h and progs/strobemeta.h look ok, because they append to the end: size_t comm_length = bpf_core_read_str(payload, TASK_COMM_LEN, &task->comm); payload += comm_length; and as the last step do: unsigned long data_len = (void*)payload - (void*)data; bpf_perf_event_output( ... data_len); John, please review cilium uses of bpf_probe_read_str(). You might be hitting this issue as well. Certainly the kernel fix is necessary. premature optimization is the root of all evil.