On Wed, Mar 29, 2023 at 07:58:51PM +0200, Oleg Nesterov wrote:
> On 03/29, Gregory Price wrote:
> >
> > On Wed, Mar 29, 2023 at 07:13:22PM +0200, Oleg Nesterov wrote:
> > >
> > > -		if (selector && !access_ok(selector, sizeof(*selector)))
> > > -			return -EFAULT;
> > > -
> > >  		break;
> > >  	default:
> > >  		return -EINVAL;
> > >
> >
> > The result of this would be either a task calling via prctl or a tracer
> > calling via ptrace would be capable of setting selector to a bad pointer
> > and producing a SIGSEGV on the next system call.
>
> Yes,
>
> > It's a pretty small footgun, but maybe that's reasonable?
>
> I hope this is reasonable,
>
> > From a user perspective, debugging this behavior would be nightmarish.
> > Your call to prctl/ptrace would succeed and the process would continue
> > to execute until the next syscall - at which point you incur a SIGSEGV,
>
> Yes. But how does this differ from the case when, for example, user
> does prtcl(PR_SET_SYSCALL_USER_DISPATCH, selector = 1) ? Or another
> bad address < TASK_SIZE?
>
> access_ok() will happily succeed, then later syscall_user_dispatch()
> will equally trigger SIGSEGV.
>
> Oleg.
>

Last note on this before I push up another patch set.

The change from __get_user to get_user also introduces a call to
might_fault() which adds a larger callstack for every syscall / dispatch.
This turns into a might_sleep and might_reschedule, which represent a very
different pattern of execution from before.

At the very least, syscall-user-dispatch will be less performant as the
selector is validated on every syscall. I have to assume that is why
they chose to validate it upon activating SUD - to avoid the overhead.

The current cost of a dispatch is about 3-5us (2 context switches + the
signal system). This could be a small amount of overhead comparatively.

However, this additional overhead would apply to ALL system calls,
regardless of whether they dispatch or not. That seems concerning for
syscall hotpath code.
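
For reference, here is a rough userspace model of the point raised above:
a range-only check at setup time cannot catch a selector that is in range
but unmapped. This is only a sketch, not kernel code. MODEL_TASK_SIZE, the
single-region "mapping", and all model_* names are hypothetical stand-ins
chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user address-space limit for this model (not a kernel value). */
#define MODEL_TASK_SIZE ((uint64_t)1 << 47)

/* Range-only check in the spirit of access_ok(): says nothing about mappings. */
bool model_access_ok(uint64_t addr, uint64_t size)
{
	return size <= MODEL_TASK_SIZE && addr <= MODEL_TASK_SIZE - size;
}

/* Hypothetical single mapped region standing in for the task's address space. */
struct model_mapping {
	uint64_t start;
	uint64_t len;
};

/* True only if [addr, addr + size) lies entirely inside the mapped region. */
bool model_is_mapped(const struct model_mapping *m, uint64_t addr, uint64_t size)
{
	assert(m != NULL);
	return addr >= m->start && size <= m->len &&
	       addr - m->start <= m->len - size;
}

enum model_outcome {
	MODEL_OK,                 /* selector byte is readable on each syscall */
	MODEL_EFAULT_AT_SETUP,    /* rejected when the selector is configured  */
	MODEL_SIGSEGV_AT_SYSCALL, /* accepted, then faults on the next syscall */
};

/* Model what happens to a selector with or without the setup-time check. */
enum model_outcome model_set_selector(const struct model_mapping *m,
				      uint64_t selector, bool check_at_setup)
{
	if (selector == 0)
		return MODEL_OK; /* NULL selector: no selector byte is read */
	if (check_at_setup && !model_access_ok(selector, 1))
		return MODEL_EFAULT_AT_SETUP;
	if (!model_is_mapped(m, selector, 1))
		return MODEL_SIGSEGV_AT_SYSCALL;
	return MODEL_OK;
}
```

In this model, selector = 1 passes the setup check and still faults later,
while an out-of-range selector is the only case where the setup check
changes the outcome (an early error instead of a later SIGSEGV).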

So given this, the three options I presently see available are:

1) drop access_ok on SUD setup, and validate the pointer on every
   syscall with get_user instead of __get_user, or

2) create task_access_ok and deal with the TASK_SIZE implications
   (or not? there seems to be some argument for and against), or

3) indiscriminately untag all pointers and allow tracers to set
   selector to what otherwise would be a bad value in a particularly
   degenerate case. (There seems to be some argument for/against this?)

Will leave this for comment for a day or so before I push another set.

Personally I fall on the side of untagging the pointer for the access_ok
check, as its only real effect is in the situation where a tracer has
tagging enabled and is tracing an untagged task. That seems extremely
narrow, and not particularly realistic, and the result is the tracee
firing a SIGSEGV - which is equivalent to allowing the pointer to be
invalid in the first place, but without the additional overhead.

~Gregory