On Wed, 24 Aug 2022 at 00:27, Joanne Koong <joannelkoong@xxxxxxxxx> wrote: > > On Mon, Aug 22, 2022 at 7:31 PM Kumar Kartikeya Dwivedi > <memxor@xxxxxxxxx> wrote: > > > [...] > > > if (func_id == BPF_FUNC_dynptr_data && > > > - dynptr_type == BPF_DYNPTR_TYPE_SKB) { > > > + (dynptr_type == BPF_DYNPTR_TYPE_SKB || > > > + dynptr_type == BPF_DYNPTR_TYPE_XDP)) { > > > regs[BPF_REG_0].type = PTR_TO_PACKET | ret_flag; > > > regs[BPF_REG_0].range = meta.mem_size; > > > > It doesn't seem like this is safe. Since PTR_TO_PACKET's range can be > > modified by comparisons with packet pointers loaded from the xdp/skb > > ctx, how do we distinguish e.g. between a pkt slice obtained from some > > frag in a multi-buff XDP vs pkt pointer from a linear area? > > > > Someone can compare data_meta from ctx with PTR_TO_PACKET from > > bpf_dynptr_data on xdp dynptr (which might be pointing to a xdp mb > > frag). While MAX_PACKET_OFF is 0xffff, it can still be used to do OOB > > access for the linear area. reg_is_init_pkt_pointer will return true > > as modified range is not considered for it. Same kind of issues when > > doing comparison with data_end from ctx (though maybe you won't be > > able to do incorrect data access at runtime using that). > > > > I had a pkt_uid field in my patch [0] which disallowed comparisons > > among bpf_packet_pointer slices. Each call assigned a fresh pkt_uid, > > and that disabled comparisons for them. reg->id is used for var_off > > range propagation so it cannot be reused. > > > > Coming back to this: What we really want here is a PTR_TO_MEM with a > > mem_size, so maybe you should go that route instead of PTR_TO_PACKET > > (and add a type tag to maybe pretty print it also as a packet pointer > > in verifier log), or add some way to distinguish slice vs non-slice > > pkt pointers like I did in my patch. You might also want to add some > > tests for this corner case (there are some later in [0] if you want to > > reuse them). > > > > So TBH, I kinda dislike my own solution in [0] :). The complexity does > > not seem worth it. The pkt_uid distinction is more useful (and > > actually would be needed) in Toke's xdp queueing series, where in a > > dequeue program you have multiple xdp_mds and want scoped slice > > invalidations (i.e. adjust_head on one xdp_md doesn't invalidate > > slices of some other xdp_md). Here we can just get away with normal > > PTR_TO_MEM. > > > > ... Or just let me know if you handle this correctly already, or if > > this won't be an actual problem :). > > Ooh interesting, I hadn't previously taken a look at > try_match_pkt_pointers(), thanks for mentioning it :) > > The cleanest solution to me is to add the flag "DYNPTR_TYPE_{SKB/XDP}" > to PTR_TO_PACKET and change reg_is_init_pkt_pointer() to return false > if the DYNPTR_TYPE_{SKB/XDP} flag is present. I prefer this over > returning PTR_TO_MEM because it seems more robust (eg if in the future > we reject x behavior on the packet data reg types, this will > automatically apply to the data slices), and because it'll keep the > logic more efficient/simpler for the case when the pkt pointer has to > be cleared after any helper that changes pkt data is called (aka the > case where the data slice gets invalidated). > > What are your thoughts? > Thinking more deeply about it, probably not, we need more work here. I remember _now_ why I chose the pkt_uid approach (and this tells us my commit log lacks all the details about the motivation :( ). Consider how equivalency checking for packet pointers works in regsafe. It is checking type, then if old range > cur range, then offs, etc. The problem is, while we now don't prune on access to ptr_to_pkt vs ptr_to_pkt | dynptr_pkt types in same reg (since type differs we return false), we still prune if old range of ptr_to_pkt | dynptr_pkt > cur range of ptr_to_pkt | dynptr_pkt. Both could be pointing into separate frags, so this assumption would be incorrect. I would be able to trick the verifier into accessing data beyond the length of a different frag, by first making sure one line of exploration is verified, and then changing the register in another branch reaching the same branch target. Helpers can take packet pointers so the access can become a pruning point. It would think the rest of the stuff is safe, while they are not equivalent at all. It is ok if they are bit by bit equivalent (same type, range, off, etc.). If you start returning false whenever you see this type tag set, it will become too conservative (it considers reg copies of the same dynptr_data lookup as distinct). So you need some kind of id assigned during dynptr_data lookup to distinguish them.