From: Alexei Starovoitov > Sent: 17 July 2020 18:29 > On Fri, Jul 17, 2020 at 9:25 AM Christoph Hellwig <hch@xxxxxx> wrote: > > > > On Fri, Jul 17, 2020 at 09:13:07AM -0700, Alexei Starovoitov wrote: > > > On Thu, Jul 16, 2020 at 10:52 PM Christoph Hellwig <hch@xxxxxx> wrote: > > > > > > > > Hi Alexei, > > > > > > > > I've just been auditing the sockopt code, and bpfilter looks really > > > > odd. Both getsockopts and setsockopt eventually end up > > > > in__bpfilter_process_sockopt, which then passes record to the > > > > userspace helper containing the address of the optval buffer. > > > > Which depending on bpf-cgroup might be in user or kernel space. > > > > But even if it is in userspace it would be in a different process > > > > than the bpfiler helper. What makes all this work? > > > > > > Hmm. Good point. bpfilter assumes user addresses. It will break > > > if bpf cgroup sockopt messes with it. > > > We had a different issue with bpf-cgroup-sockopt and iptables in the past. > > > Probably the easiest way forward is to special case this particular one. > > > With your new series is there a way to tell in bpfilter_ip_get_sockopt() > > > whether addr is kernel or user? And if it's the kernel just return with error. > > > > Yes, I can send a fix. But how do even the user space addressed work? > > If some random process calls getsockopt or setsockopt, how does the > > bpfilter user mode helper attach to its address space? > > The actual bpfilter processing is in two patches that we didn't land: > https://lore.kernel.org/patchwork/patch/902785/ > https://lore.kernel.org/patchwork/patch/902783/ > UMD is using process_vm_readv(). > The target process is waiting for the sockopt syscall to return, > so from the toctou perspective it's the same as the kernel doing copy_from_user. You need to be doing the user-spaces accesses from the target process's context. If it is waiting for the syscall to return then you aren't running it its context. There is also at least one sockopt that has an embedded pointer. So even if you've copied the sockopt buffer into kernel space you are doomed. If the request has come from a kernel thread with kernel buffers then the embedded buffer will be kernel. (This is probably more common for ioctl() requests.) I'm not sure there is any sane was to handle this. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)