Re: [PATCH] proc: Disable /proc/$pid/wchan

Kees Cook <keescook@xxxxxxxxxxxx> · Wed, 29 Sep 2021 11:54:55 -0700

On Mon, Sep 27, 2021 at 01:50:56PM -0700, Josh Poimboeuf wrote:
> On Mon, Sep 27, 2021 at 11:07:27AM -0700, Kees Cook wrote:
> > On Mon, Sep 27, 2021 at 10:03:51AM +0100, Mark Rutland wrote:
> > > On Fri, Sep 24, 2021 at 07:26:22AM -0700, Kees Cook wrote:
> > > > On Fri, Sep 24, 2021 at 02:54:24PM +0100, Mark Rutland wrote:
> > > > > On Thu, Sep 23, 2021 at 06:16:16PM -0700, Kees Cook wrote:
> > > > > > On Thu, Sep 23, 2021 at 05:22:30PM -0700, Vito Caputo wrote:
> > > > > > > Instead of unwinding stacks maybe the kernel should be sticking an
> > > > > > > entrypoint address in the current task struct for get_wchan() to
> > > > > > > access, whenever userspace enters the kernel?
> > > > > > 
> > > > > > wchan is supposed to show where the kernel is at the instant the
> > > > > > get_wchan() happens. (i.e. recording it at syscall entry would just
> > > > > > always show syscall entry.)
> > > > > 
> > > > > It's supposed to show where a blocked task is blocked; the "wait
> > > > > channel".
> > > > > 
> > > > > I'd wanted to remove get_wchan since it requires cross-task stack
> > > > > walking, which is generally painful.
> > > > 
> > > > Right -- this is the "fragile" part I'm worried about.
> > 
> > I'd like to clarify this concern first -- is the proposed fix actually
> > fragile? Because I think we'd be better off just restoring behavior than
> > trying to invent new behavior...
> > 
> > i.e. Josh, Jann, do you see any issues with Qi Zheng's fix here:
> > https://lore.kernel.org/all/20210924062006.231699-4-keescook@xxxxxxxxxxxx/
> 
> Even with that patch, it doesn't lock the task's runqueue before reading
> the stack, so there's still the possibility of the task running on
> another CPU and the unwinder going off the rails a bit, which might be
> used by an attacker in creative ways similar to the /proc/<pid>/stack
> vulnerability Jann mentioned earlier.

Since I think we're considering get_wchan() to be slow-path, can we just
lock the runqueue and use arch_stack_walk_reliable()?
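
To make the question concrete, here is a rough userspace model of the
ordering I have in mind: take the lock, confirm the task is not running,
and only then walk the stack, refusing to answer if the walk is not
reliable. This is an illustrative sketch only, not kernel code. Every
name in it (model_task, get_wchan_model, in_sched_functions_model) is
made up, the pthread mutex merely stands in for the runqueue lock, and
the frames_reliable flag stands in for whatever
arch_stack_walk_reliable() would report.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_task {
	pthread_mutex_t rq_lock;  /* stands in for the task's runqueue lock */
	bool on_cpu;              /* true while the task is running */
	const char *frames[8];    /* saved call chain, innermost first */
	size_t nr_frames;
	bool frames_reliable;     /* false if the walk cannot be trusted */
};

/* Model of "is this frame inside the scheduler?" (assumed symbol names). */
static bool in_sched_functions_model(const char *sym)
{
	return strcmp(sym, "__schedule") == 0 || strcmp(sym, "schedule") == 0;
}

/*
 * Return the first non-scheduler frame of a blocked task, or NULL if the
 * task is running or its saved call chain is not reliable.
 */
const char *get_wchan_model(struct model_task *t)
{
	const char *wchan = NULL;

	/* Hold the lock so the task cannot start running mid-walk. */
	pthread_mutex_lock(&t->rq_lock);
	if (!t->on_cpu && t->frames_reliable) {
		for (size_t i = 0; i < t->nr_frames; i++) {
			if (!in_sched_functions_model(t->frames[i])) {
				wchan = t->frames[i];
				break;
			}
		}
	}
	pthread_mutex_unlock(&t->rq_lock);

	return wchan;
}
```

With a blocked task whose saved chain is __schedule, schedule, pipe_read,
the model reports pipe_read; for a running task or an unreliable chain it
reports nothing rather than guessing.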

-- 
Kees Cook