On Sun, Mar 24, 2024 at 3:30 PM David Laight <David.Laight@xxxxxxxxxx> wrote:
>
> From: Alexei Starovoitov
> > Sent: 24 March 2024 20:43
> >
> > On Sun, Mar 24, 2024 at 1:05 PM David Laight <David.Laight@xxxxxxxxxx> wrote:
> > >
> > > From: Alexei Starovoitov
> > > > Sent: 21 March 2024 06:08
> > > >
> > > > On Wed, Mar 20, 2024 at 3:55 AM Puranjay Mohan <puranjay12@xxxxxxxxx> wrote:
> > > > >
> > > > > The JITs need to implement bpf_arch_uaddress_limit() to define where
> > > > > the userspace addresses end for that architecture or TASK_SIZE is taken
> > > > > as default.
> > > > >
> > > > > The implementation is as follows:
> > > > >
> > > > > REG_AX = SRC_REG
> > > > > if(offset)
> > > > >     REG_AX += offset;
> > > > > REG_AX >>= 32;
> > > > > if (REG_AX <= (uaddress_limit >> 32))
> > > > >     DST_REG = 0;
> > > > > else
> > > > >     DST_REG = *(size *)(SRC_REG + offset);
> > > >
> > > > The patch looks good, but it seems to be causing s390 CI failures.
> > >
> > > I'm confused by the need for this check (and, IIRC, some other bpf
> > > code that does kernel copies that can fault - and return an error).
> > >
> > > I though that the entire point of bpf was that is sanitised and
> > > verified everything to limit what the 'program' could do in order
> > > to stop it overwriting (or even reading) kernel structures that
> > > is wasn't supposed to access.
> > >
> > > So it just shouldn't have a address that might be (in any way)
> > > invalid.
> >
> > bpf tracing progs can call bpf_probe_read_kernel() which
> > can read any kernel memory.
> > This is nothing but an inlined version of it.
>
> It was the getsockopt() code were I saw the copy_nocheck() calls.
> Those have to be broken.

No. If you mean csum_partial_copy_nocheck() then they're fine.

> Although the way some of the options use the ptr:len supplied by
> the application you stand no chance of do an in-kernel call
> without a proper buffer descriptor argument (with separate optlen
> and bufferlen fields.)
> > >
> > > The only possible address verify is access_ok() to ensure that
> > > a uses address really is a user address.
> >
> > access_ok() considerations don't apply.
> > We're not dealing with user memory access.
>
> If you do need a check for 'not a user address' don't you want to just
> require access_ok() fail?
> That would be architecture independent.

No. access_ok() can only be used on the user addr.
access_ok() == false on the kernel addr doesn't mean that it's a kernel addr.
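
For reference, the check from the quoted patch description can be modelled in plain C. This is only a userspace sketch under assumptions: guard_rejects(), example_checks() and EXAMPLE_UADDRESS_LIMIT are hypothetical names, and the limit is an illustrative 47-bit boundary, not the value bpf_arch_uaddress_limit() returns on any particular architecture. It models the comparison only and never dereferences anything.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: illustrative user/kernel boundary, not a real arch value. */
#define EXAMPLE_UADDRESS_LIMIT 0x00007fffffffffffULL

/*
 * Hypothetical model of the guard from the patch description:
 *   REG_AX = SRC_REG; if (offset) REG_AX += offset; REG_AX >>= 32;
 *   if (REG_AX <= (uaddress_limit >> 32)) DST_REG = 0; else load.
 * Returns 1 when the load would be skipped (DST_REG = 0), 0 when the
 * load would be attempted.
 */
static int guard_rejects(uint64_t src_reg, int16_t offset,
                         uint64_t uaddress_limit)
{
    uint64_t ax = src_reg;

    if (offset)
        ax += (uint64_t)(int64_t)offset; /* sign-extend, wraps mod 2^64 */
    ax >>= 32;                           /* compare upper 32 bits only */
    return ax <= (uaddress_limit >> 32);
}

/* A few worked cases against the illustrative limit. */
static void example_checks(void)
{
    /* Low address: upper bits are 0, treated as a user address. */
    assert(guard_rejects(0x0000000000001000ULL, 0, EXAMPLE_UADDRESS_LIMIT) == 1);
    /* Upper bits equal to limit >> 32: still treated as a user address. */
    assert(guard_rejects(0x00007fff00000000ULL, 0, EXAMPLE_UADDRESS_LIMIT) == 1);
    /* High address: upper bits exceed the limit, load is attempted. */
    assert(guard_rejects(0xffff888000000000ULL, 8, EXAMPLE_UADDRESS_LIMIT) == 0);
    /* Just above the limit: passes the guard, validity is not implied. */
    assert(guard_rejects(0x0000800000000000ULL, 0, EXAMPLE_UADDRESS_LIMIT) == 0);
    /* Negative offset pulling the address below 4 GiB: rejected. */
    assert(guard_rejects(0x0000000100000000ULL, -8, EXAMPLE_UADDRESS_LIMIT) == 1);
}
```

The case just above the limit is the interesting one: the guard only answers "is this at or below the user limit", so an address that passes it is not thereby shown to be a valid kernel address.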