On Mon, Sep 27, 2021 at 01:50:56PM -0700, Josh Poimboeuf wrote:
> On Mon, Sep 27, 2021 at 11:07:27AM -0700, Kees Cook wrote:
> > On Mon, Sep 27, 2021 at 10:03:51AM +0100, Mark Rutland wrote:
> > > On Fri, Sep 24, 2021 at 07:26:22AM -0700, Kees Cook wrote:
> > > > On Fri, Sep 24, 2021 at 02:54:24PM +0100, Mark Rutland wrote:
> > > > > On Thu, Sep 23, 2021 at 06:16:16PM -0700, Kees Cook wrote:
> > > > > > On Thu, Sep 23, 2021 at 05:22:30PM -0700, Vito Caputo wrote:
> > > > > > > Instead of unwinding stacks maybe the kernel should be sticking an
> > > > > > > entrypoint address in the current task struct for get_wchan() to
> > > > > > > access, whenever userspace enters the kernel?
> > > > > >
> > > > > > wchan is supposed to show where the kernel is at the instant the
> > > > > > get_wchan() happens. (i.e. recording it at syscall entry would just
> > > > > > always show syscall entry.)
> > > > >
> > > > > It's supposed to show where a blocked task is blocked; the "wait
> > > > > channel".
> > > > >
> > > > > I'd wanted to remove get_wchan since it requires cross-task stack
> > > > > walking, which is generally painful.
> > > >
> > > > Right -- this is the "fragile" part I'm worried about.
> >
> > I'd like to clarify this concern first -- is the proposed fix actually
> > fragile? Because I think we'd be better off just restoring behavior than
> > trying to invent new behavior...
> >
> > i.e. Josh, Jann, do you see any issues with Qi Zheng's fix here:
> > https://lore.kernel.org/all/20210924062006.231699-4-keescook@xxxxxxxxxxxx/
>
> Even with that patch, it doesn't lock the task's runqueue before reading
> the stack, so there's still the possibility of the task running on
> another CPU and the unwinder going off the rails a bit, which might be
> used by an attacker in creative ways similar to the /proc/<pid>/stack
> vulnerability Jann mentioned earlier.

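To make the race above concrete, here is a toy userspace model. It is
not kernel code, and every name in it (struct toy_task, toy_get_wchan(),
toy_in_sched_functions(), the address range) is made up for
illustration. A pthread mutex stands in for the runqueue lock, and the
saved frames are only read while that lock is held and the task is
off-CPU, so the stack cannot change underneath the walk:

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these names are kernel APIs. */
#define TOY_MAX_FRAMES 8

struct toy_task {
	pthread_mutex_t rq_lock;          /* stands in for the runqueue lock */
	bool on_cpu;                      /* true while the task is running */
	size_t nr_frames;                 /* valid entries in frames[] */
	uintptr_t frames[TOY_MAX_FRAMES]; /* return addresses, innermost first */
};

/* Hypothetical "scheduler text" range; the real check would differ. */
static bool toy_in_sched_functions(uintptr_t addr)
{
	return addr >= 0x1000 && addr < 0x2000;
}

/*
 * Return the first non-scheduler return address of a blocked task,
 * or 0 if the task is running or no such frame exists.
 */
static uintptr_t toy_get_wchan(struct toy_task *t)
{
	uintptr_t ip = 0;
	size_t i;

	pthread_mutex_lock(&t->rq_lock);
	if (!t->on_cpu) {
		/* Task is off-CPU and cannot be scheduled while we hold the lock. */
		for (i = 0; i < t->nr_frames; i++) {
			if (!toy_in_sched_functions(t->frames[i])) {
				ip = t->frames[i];
				break;
			}
		}
	}
	pthread_mutex_unlock(&t->rq_lock);

	return ip;
}
```

In this model a running task always reports 0, and a blocked task
reports the first frame outside the scheduler range. A real version
would need the scheduler's own locking and a real unwinder rather than
an array of saved addresses.
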
Since I think we're considering get_wchan() to be slow-path, can we just
lock the runqueue and use arch_stack_walk_reliable()?

--
Kees Cook