On Tue, Jan 31, 2023 at 9:55 AM Joanne Koong <joannelkoong@xxxxxxxxx> wrote: > > On Mon, Jan 30, 2023 at 9:36 PM Alexei Starovoitov > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > On Mon, Jan 30, 2023 at 04:44:12PM -0800, Joanne Koong wrote: > > > On Sun, Jan 29, 2023 at 3:39 PM Alexei Starovoitov > > > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > > > > > On Fri, Jan 27, 2023 at 11:17:01AM -0800, Joanne Koong wrote: > > > > > Add skb dynptrs, which are dynptrs whose underlying pointer points > > > > > to a skb. The dynptr acts on skb data. skb dynptrs have two main > > > > > benefits. One is that they allow operations on sizes that are not > > > > > statically known at compile-time (eg variable-sized accesses). > > > > > Another is that parsing the packet data through dynptrs (instead of > > > > > through direct access of skb->data and skb->data_end) can be more > > > > > ergonomic and less brittle (eg does not need manual if checking for > > > > > being within bounds of data_end). > > > > > > > > > > For bpf prog types that don't support writes on skb data, the dynptr is > > > > > read-only (bpf_dynptr_write() will return an error and bpf_dynptr_data() > > > > > will return a data slice that is read-only where any writes to it will > > > > > be rejected by the verifier). > > > > > > > > > > For reads and writes through the bpf_dynptr_read() and bpf_dynptr_write() > > > > > interfaces, reading and writing from/to data in the head as well as from/to > > > > > non-linear paged buffers is supported. For data slices (through the > > > > > bpf_dynptr_data() interface), if the data is in a paged buffer, the user > > > > > must first call bpf_skb_pull_data() to pull the data into the linear > > > > > portion. > > > > > > > > Looks like there is an assumption in parts of this patch that > > > > linear part of skb is always writeable. That's not the case. > > > > See if (ops->gen_prologue || env->seen_direct_write) in convert_ctx_accesses(). > > > > For TC progs it calls bpf_unclone_prologue() which adds hidden > > > > bpf_skb_pull_data() in the beginning of the prog to make it writeable. > > > > > > I think we can make this assumption? For writable progs (referenced in > > > the may_access_direct_pkt_data() function), all of them have a > > > gen_prologue that unclones the buffer (eg tc_cls_act, lwt_xmit, sk_skb > > > progs) or their linear portion is okay to write into by default (eg > > > xdp, sk_msg, cg_sockopt progs). > > > > but the patch was preserving seen_direct_write in some cases. > > I'm still confused. > > seen_direct_write is used to determine whether to actually unclone or > not in the program's prologue function (eg tc_cls_act_prologue() -> > bpf_unclone_prologue() where in bpf_unclone_prologue(), if > direct_write was not true, then it can skip doing the actual > uncloning). > > I think the part of the patch you're talking about regarding > seen_direct_write is this in check_helper_call(): > > + if (func_id == BPF_FUNC_dynptr_data && > + dynptr_type == BPF_DYNPTR_TYPE_SKB) { > + bool seen_direct_write = env->seen_direct_write; > + > + regs[BPF_REG_0].type |= DYNPTR_TYPE_SKB; > + if (!may_access_direct_pkt_data(env, NULL, BPF_WRITE)) > + regs[BPF_REG_0].type |= MEM_RDONLY; > + else > + /* > + * Calling may_access_direct_pkt_data() will set > + * env->seen_direct_write to true if the skb is > + * writable. As an optimization, we can ignore > + * setting env->seen_direct_write. > + * > + * env->seen_direct_write is used by skb > + * programs to determine whether the skb's page > + * buffers should be cloned. Since data slice > + * writes would only be to the head, we can skip > + * this. > + */ > + env->seen_direct_write = seen_direct_write; > + } > > If the data slice for a skb dynptr is writable, then seen_direct_write > gets set to true (done internally in may_access_direct_pkt_data()) so > that the skb is actually uncloned, whereas if it's read-only, then > env->seen_direct_write gets reset to its original value (since the > may_access_direct_pkt_data() call will have set env->seen_direct_write > to true) I'm still confused. When may_access_direct_pkt_data() returns false it doesn't change seen_direct_write. When it returns true it also sets seen_direct_write=true. But the code above restores it to whatever value it had before. How is this correct? Are you saying that another may_access_direct_pkt_data() gets called somewhere in the verifier that sets seen_direct_write=true? But what's the harm in doing it twice or N times in all cases?