Re: Contextually speaking...

As I understand it, from a C compiler's point of view ->data and
->data_end are just arbitrary pointers embedded in a struct. Where do
these semantics come from? That is, how does the eBPF verifier know
that the data ends where data_end points?

On Sun, May 14, 2017 at 3:36 AM, David Miller <davem@xxxxxxxxxxxxx> wrote:
>
> Every eBPF program has a type, and that type is important because it
> determines the kind of "context" which will be passed into your
> program so that it can do its work.
>
> The context is the argument passed into the main entry point of your
> eBPF program.
>
> The eBPF program type is specified when the program is loaded via the
> sys_bpf() system call.  For most of us this is usually achieved by
> calling bpf_load_program() in libbpf.  "enum bpf_prog_type" currently
> has the following values:
>
>         BPF_PROG_TYPE_SOCKET_FILTER
>         BPF_PROG_TYPE_KPROBE
>         BPF_PROG_TYPE_SCHED_CLS
>         BPF_PROG_TYPE_SCHED_ACT
>         BPF_PROG_TYPE_TRACEPOINT
>         BPF_PROG_TYPE_XDP
>         BPF_PROG_TYPE_PERF_EVENT
>         BPF_PROG_TYPE_CGROUP_SKB
>         BPF_PROG_TYPE_CGROUP_SOCK
>         BPF_PROG_TYPE_LWT_IN
>         BPF_PROG_TYPE_LWT_OUT
>         BPF_PROG_TYPE_LWT_XMIT
>
> More can appear in the future.
>
> For example, BPF_PROG_TYPE_SOCKET_FILTER takes a "struct __sk_buff *" as
> its context argument.  Programs of type BPF_PROG_TYPE_SCHED_CLS and
> BPF_PROG_TYPE_SCHED_ACT also take "struct __sk_buff *" as their
> context argument.
>
> These three program types have another thing in common: they are
> allowed to use the LD_ABS and LD_IND instructions to access packet
> data.  You cannot (currently) generate these from C code, only from
> hand-written eBPF assembler.  But they are important to understand
> in their historical context.
>
> LD_ABS and LD_IND simply allow byte, half-word, and word sized loads
> from the packet data.  The value returned is in cpu endianness.  These
> two instructions come from classical BPF, and are thus older than some
> of you reading this text right now.
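
To make the "value returned is in cpu endianness" point concrete, here
is a rough user-space model of an absolute load. It is only a sketch:
model_ld_abs() and its parameters are hypothetical names invented for
illustration, not kernel or libbpf API, and the real instructions deal
with an out-of-range offset themselves rather than returning an error
code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model of an LD_ABS-style load (not kernel API).
 * Reads 1, 2 or 4 bytes at offset "off" in the packet.  On success it
 * returns 0 and stores the value in *out; it returns -1 if the size is
 * unsupported or the load would run past the end of the packet. */
static int model_ld_abs(const uint8_t *pkt, size_t pkt_len,
                        size_t off, size_t size, uint32_t *out)
{
        uint32_t v = 0;
        size_t i;

        if (size != 1 && size != 2 && size != 4)
                return -1;
        if (off > pkt_len || size > pkt_len - off)
                return -1;

        /* Packet bytes are in network (big-endian) order.  Assembling
         * them arithmetically yields a value in the CPU's own
         * endianness, so the caller compares against plain constants. */
        for (i = 0; i < size; i++)
                v = (v << 8) | pkt[off + i];

        *out = v;
        return 0;
}
```

With this model, a half-word load at offset 12 of an IPv4 Ethernet
frame yields 0x0800 directly, with no byte swap needed in the caller.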
>
> Therefore, if you look at libpcap or any other piece of code that
> generates classical BPF, you will see that it makes use of LD_ABS and
> LD_IND.
>
> But from C code, you can load members of "struct __sk_buff" and access
> packet data directly using what you get from there.  We will refer to
> this as "direct packet access".  And this brings us to an important
> topic.
>
> Any direct packet access must be properly validated before it is
> performed.  We'll get into what that means exactly in just a second.
> If proper validation is not performed, the eBPF verifier will reject
> your program and refuse to load it.
>
> Here is how you do it.  Let's write a very simple program that returns
> "1" if we have an IPv4 Ethernet packet, and "0" otherwise.
>
> SEC("my_program")
> int my_main(struct __sk_buff *skb)
> {
>         void *data_end = (void *)(long)skb->data_end;
>         void *data = (void *)(long)skb->data;
>
> Here we load the extents of the packet data, basically the start and
> end pointers.  The casts in the assignments are necessary, so please
> just copy this pattern into your programs.
>
> The packet starts with the ethernet header, so let's get that going:
>
>         struct ethhdr *eth = (struct ethhdr *)(data);
>
> Now, we can't just go "eth->h_proto", that's illegal.  We have to
> explicitly test that such an access is in range and doesn't go
> beyond "data_end".
>
> So let's make that test:
>
>         if ((void *)(eth + 1) > data_end)
>                 return 0;
>
> The eBPF verifier will see that "eth" holds a packet pointer,
> and also that you have made sure that from "eth" to "eth + 1"
> is inside the valid access range for the packet.
>
> Therefore, from this point forward you may validly access any part of
> "struct ethhdr" via the variable "eth".  Let's do that.
>
>         if (eth->h_proto == bpf_htons(ETH_P_IP))
>                 return 1;
>         return 0;
> }
>
> And that's it.
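
The same check-then-access pattern can be tried outside the kernel.
The sketch below is a plain user-space model, not an eBPF program:
model_ethhdr, MODEL_ETH_P_IP and model_is_ipv4() are stand-ins invented
here for struct ethhdr, ETH_P_IP and my_main(), and htons() replaces
bpf_htons(). No verifier is involved, so it only illustrates the
logic of the range test.

```c
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>          /* htons() */

#define MODEL_ETH_P_IP 0x0800   /* IPv4 EtherType */

/* Stand-in for struct ethhdr (hypothetical name, same 14-byte layout). */
struct model_ethhdr {
        uint8_t  h_dest[6];
        uint8_t  h_source[6];
        uint16_t h_proto;       /* network byte order */
} __attribute__((packed));

/* Returns 1 for an IPv4 Ethernet frame, 0 otherwise.  "data" and
 * "data_end" play the role of the packet extents from the context. */
static int model_is_ipv4(const void *data, const void *data_end)
{
        const struct model_ethhdr *eth = data;

        /* Range test first: the whole header must lie before data_end. */
        if ((const void *)(eth + 1) > data_end)
                return 0;

        /* Only after the test is it safe to read header fields. */
        if (eth->h_proto == htons(MODEL_ETH_P_IP))
                return 1;
        return 0;
}
```

Passing a 13-byte buffer makes the range test fail and the function
returns 0 without ever touching h_proto, which is the behaviour the
verifier insists on.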
>
> The program type has another influence on your program.  It determines
> the meaning of your program's return value.
>
> A program of type BPF_PROG_TYPE_SOCKET_FILTER returns the number of
> bytes of the packet which should be accepted by the filter.  A return
> value of zero means drop the packet.  A non-zero return value means to
> truncate the packet to that many bytes, and accept it.
>
> So our example program above needs a little bit of an adjustment to
> make it suitable for BPF_PROG_TYPE_SOCKET_FILTER:
>
> SEC("my_program")
> int my_main(struct __sk_buff *skb)
> {
>         void *data_end = (void *)(long)skb->data_end;
>         void *data = (void *)(long)skb->data;
>         struct ethhdr *eth = (struct ethhdr *)(data);
>         int len = skb->len;
>
>         if ((void *)(eth + 1) > data_end)
>                 return 0;
>         if (eth->h_proto == bpf_htons(ETH_P_IP))
>                 return len;
>         return 0;
> }
>
> So what changed is that we load "len" from the context metadata and
> return "len" when we want to accept the packet.  This says "accept
> the packet and do not truncate it."
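
The return-value rule for socket filters can be summarised as a small
function. This is a user-space sketch only: model_apply_filter_result()
is a hypothetical helper, not something the kernel exports. It assumes
that a return value larger than the packet length leaves the packet
whole.

```c
#include <stddef.h>

/* Hypothetical model of how a socket filter's return value is applied.
 * Returns the number of packet bytes delivered to the socket. */
static size_t model_apply_filter_result(size_t pkt_len, unsigned int ret)
{
        if (ret == 0)
                return 0;                       /* drop the packet */

        /* Non-zero: accept, truncating to "ret" bytes if the packet
         * is longer than that (assumption: never grows the packet). */
        return ret < pkt_len ? (size_t)ret : pkt_len;
}
```

Returning skb->len therefore delivers the packet untruncated, while
returning, say, 64 on a 100-byte packet delivers only the first 64
bytes.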


