On Wed, Feb 3, 2021 at 7:10 PM Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> This line gives a big clue:
>
>   [160676.608966][ C4] RIP: 0010:0xffffffffc17d814c
>
> That address, without a function name, most likely means that it was
> running in some generated code (most likely BPF) when it got
> interrupted.

We do have eBPF/XDP in our environment.

> Right now, the ORC unwinder tries to fall back to frame pointers when it
> encounters generated code:
>
>	orc = orc_find(state->signal ? state->ip : state->ip - 1);
>	if (!orc) {
>		/*
>		 * As a fallback, try to assume this code uses a frame pointer.
>		 * This is useful for generated code, like BPF, which ORC
>		 * doesn't know about. This is just a guess, so the rest of
>		 * the unwind is no longer considered reliable.
>		 */
>		orc = &orc_fp_entry;
>		state->error = true;
>	}
>
> Because the ORC unwinder is guessing from that point onward, it's
> possible for it to read the KASAN stack redzone, if the generated code
> hasn't set up frame pointers. So the best fix may be for the unwinder
> to just always bypass KASAN when reading the stack.
>
> The unwinder has a mechanism for detecting and warning about
> out-of-bounds, and KASAN is short-circuiting that.
>
> This should hopefully get rid of *all* the KASAN unwinder warnings, both
> crypto and networking.

It definitely worked on my dm-crypt case, and I tested it without your
previous AVX-related patch applied. I will apply it to our tree and deploy
it to the staging KASAN environment to see how it fares with the networking
stack.

Feel free to ping me if I don't get back to you with the results on Monday.

Thanks for looking into this!
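
For reference, the frame-pointer guess and the out-of-bounds detection discussed above can be modelled in user space. This is only a rough sketch, not the kernel's implementation: `toy_stack`, `toy_state`, `toy_deref()` and `toy_next_frame()` are hypothetical names, and the frame layout (saved frame pointer at `fp`, return address one word above it) is an assumption. KASAN itself is not modelled; the point is that the unwinder's own bounds check is what stops a bad guess.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model, not kernel code: a stack as an address range. */
struct toy_stack {
	uintptr_t begin;	/* lowest valid address */
	uintptr_t end;		/* one past the highest valid address */
};

struct toy_state {
	uintptr_t fp;		/* guessed frame pointer */
	uintptr_t ip;		/* last recovered return address */
	bool error;		/* set once the unwind relies on a guess */
};

/* Read one word, but only if it is aligned and entirely inside the stack. */
static bool toy_deref(const struct toy_stack *s, uintptr_t addr, uintptr_t *val)
{
	if (s->end - s->begin < sizeof(*val))
		return false;
	if (addr % sizeof(*val) != 0)
		return false;
	if (addr < s->begin || addr > s->end - sizeof(*val))
		return false;
	*val = *(const uintptr_t *)addr;
	return true;
}

/*
 * Step to the caller's frame assuming a frame pointer was set up:
 * the word at fp is the previous frame pointer and the next word is
 * the return address. This is a guess, so the state is marked unreliable.
 */
static bool toy_next_frame(const struct toy_stack *s, struct toy_state *st)
{
	uintptr_t prev_fp, ret;

	st->error = true;
	if (!toy_deref(s, st->fp, &prev_fp))
		return false;
	if (!toy_deref(s, st->fp + sizeof(uintptr_t), &ret))
		return false;
	st->fp = prev_fp;
	st->ip = ret;
	return true;
}
```

If the generated code never set up a frame pointer, `fp` holds an arbitrary value, and `toy_next_frame()` returns false as soon as that value falls outside the stack range instead of reading from it.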