Re: XDP_REDIRECT with xsks_map and dev_map


 



On Fri, Jun 5, 2020 at 12:57 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>
> maharishi bhargava <bhargavamaharishi@xxxxxxxxx> writes:
>
> > On Wed, Jun 3, 2020 at 8:39 PM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
> >>
> >> maharishi bhargava <bhargavamaharishi@xxxxxxxxx> writes:
> >>
> >> > On Wed, Jun 3, 2020 at 4:41 PM Maciej Fijalkowski
> >> > <maciej.fijalkowski@xxxxxxxxx> wrote:
> >> >>
> >> >> On Wed, Jun 03, 2020 at 01:07:05PM +0200, Toke Høiland-Jørgensen wrote:
> >> >> > Maciej Fijalkowski <maciej.fijalkowski@xxxxxxxxx> writes:
> >> >> >
> >> >> > > On Wed, Jun 03, 2020 at 12:49:25PM +0200, Toke Høiland-Jørgensen wrote:
> >> >> > >> maharishi bhargava <bhargavamaharishi@xxxxxxxxx> writes:
> >> >> > >>
> >> >> > >> > On Tue, Jun 2, 2020 at 9:31 PM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
> >> >> > >> >>
> >> >> > >> >> maharishi bhargava <bhargavamaharishi@xxxxxxxxx> writes:
> >> >> > >> >>
> >> >> > >> >> > On Tue 2 Jun, 2020, 14:31 Toke Høiland-Jørgensen, <toke@xxxxxxxxxx> wrote:
> >> >> > >> >> >>
> >> >> > >> >> >> maharishi bhargava <bhargavamaharishi@xxxxxxxxx> writes:
> >> >> > >> >> >>
> >> >> > >> >> >> > Hi, in my XDP program, I want to redirect some packets using AF_XDP
> >> >> > >> >> >> > and redirect other packets directly from driver space.
> >> >> > >> >> >> > Redirection through AF_XDP works fine, but redirection through dev map
> >> >> > >> >> >> > stops after some packets are processed.
> >> >> > >> >> >>
> >> >> > >> >> >> Do you mean it stops even if you are *only* redirecting to a devmap, or
> >> >> > >> >> >> if you are first redirecting a few packets to AF_XDP, then to devmap?
> >> >> > >> >> >>
> >> >> > >> >> >> Also, which driver(s) are the physical NICs you're redirecting to/from
> >> >> > >> >> >> using, and which kernel version are you on?
> >> >> > >> >> >>
> >> >> > >> >> >> -Toke
> >> >> > >> >> >
> >> >> > >> >> >
> >> >> > >> >> >
> >> >> > >> >> > Currently, I'm trying to redirect packets only using devmap, but I also
> >> >> > >> >> > have code for redirection using AF_XDP (only when a given condition is
> >> >> > >> >> > satisfied). A DPDK program running in userspace receives the
> >> >> > >> >> > packets from AF_XDP.
> >> >> > >> >>
> >> >> > >> >> Right, so it's just devmap redirect that breaks. What do you mean
> >> >> > >> >> 'redirection stops', exactly? How are you seeing this? Does xdp_monitor
> >> >> > >> >> (from samples/bpf) report any exceptions?
> >> >> > >> >>
> >> >> > >> >> -Toke
> >> >> > >> >>
> >> >> > >> > In my setup there are three systems; let's call them A, B and C. System
> >> >> > >> > B acts as a forwarder between A and C, so I can see the number of
> >> >> > >> > packets received at system C. To be specific, only 1024 packets are
> >> >> > >> > received. If I remove the xsks_map part from the code and don't run
> >> >> > >> > DPDK in userspace, this problem does not occur. Also, if I forward all
> >> >> > >> > the packets using AF_XDP, there is no such issue.
> >> >> > >>
> >> >> > >> I thought you said you were seeing the problem when only redirecting to
> >> >> > >> a devmap? So why does the xsk_map code impact this? I think you may have
> >> >> > >> to share some code...
> >> >> > >
> >> >> > > Isn't it the case here that either xsk_map or dev_map consumes the frame and
> >> >> > > therefore the other one doesn't see it? So cloning might be needed here?
> >> >> >
> >> >> > Yeah, certainly you can't redirect *the same packet* to both xsk_map and
> >> >> > devmap - but that wasn't what I understood was the use case here?
> >> >>
> >> >> Maybe the best would be if Maharishi shared the code as you requested :)
> >> >>
> >> >> >
> >> >> > -Toke
> >> >> >
> >> > CODE:
> >> > BPF MAPS:
> >> >
> >> >
> >> > struct bpf_map_def SEC("maps") xsks_map = {
> >> >     .type = BPF_MAP_TYPE_XSKMAP,
> >> >     .key_size = sizeof(int),
> >> >     .value_size = sizeof(int),
> >> >     .max_entries = 64,  /* Assume netdev has no more than 64 queues */
> >> > };
> >> >
> >> > struct bpf_map_def SEC("maps") tx_port = {
> >> >     .type = BPF_MAP_TYPE_DEVMAP,
> >> >     .key_size = sizeof(int),
> >> >     .value_size = sizeof(int),
> >> >     .max_entries = 1024,
> >> > };
> >> >
> >> > struct Ingress_qos_lts_value{
> >> >     struct bpf_spin_lock lock;
> >> >     u64 timestamp;
> >> > };
> >> > struct bpf_map_def SEC("maps") Ingress_qos_lts = {
> >> >     .type = BPF_MAP_TYPE_ARRAY,
> >> >     .key_size = sizeof(u32),
> >> >     .value_size = sizeof(struct Ingress_qos_lts_value),
> >> >     .max_entries = 1025,
> >> > };
> >> > BPF_ANNOTATE_KV_PAIR(Ingress_qos_lts,u32,struct Ingress_qos_lts_value);
> >> >
> >> >
> >> > SEC("prog")
> >> > int ebpf_filter(struct xdp_md *ctx){
> >> >     struct xdp_output xout;
> >> >     xout.output_port = 1;
> >> >     void* ebpf_packetStart = ((void*)(long)ctx->data);
> >> >     void* ebpf_packetEnd = ((void*)(long)ctx->data_end);
> >> >     u64 rate = 100;//100 Kbps
> >> >     rate *= 1000*1000*100;//10 Gbps
> >> >     u32 key = 1;//some key
> >> >     u64 packet_length=(ebpf_packetEnd-ebpf_packetStart-42)*8;
> >> >     /* packet length * 10^9, to convert rate from per-second to per-nanosecond */
> >> >     packet_length *= 1000000000;
> >> >     struct Ingress_qos_lts_value* val;
> >> >     val = bpf_map_lookup_elem(&Ingress_qos_lts, &key);
> >> >     u64 now = bpf_ktime_get_ns();
> >> >     u64 lts;
> >> >     if (val) {
> >> >         bpf_spin_lock(&val->lock);
> >> >         lts = *(&val->timestamp)+(packet_length/rate);
> >> >         if(now>lts){
> >> >             lts = now;
> >> >         }
> >> >         *(&val->timestamp) = lts;
> >> >         bpf_spin_unlock(&val->lock);
> >> >                     // printk("Time : %x %x\n",lts,now);
> >> >         if(lts>now){
> >> >             return bpf_redirect_map(&xsks_map, ctx->rx_queue_index, 0);
> >> >         }
> >> >     }
> >> >     return  bpf_redirect_map(&tx_port,xout.output_port,0);
> >> > }
> >> >
> >> > So basically, this code either redirects the packet to another interface
> >> > or sends it to userspace, depending on the incoming packet rate.
> >>
> >> Well, if you say it goes away when you remove the xsk code, the obvious
> >> explanation would be that the packets are being redirected to userspace
> >> instead? What does xdp_monitor say?
> >>
> >> -Toke
> >>
> > No, packets are not going to userspace. The NIC stops processing any more
> > packets after 1024 redirected packets. I'll post the results of
> > xdp_monitor asap.
> >
> > Also, one piece of information that might be helpful: in DPDK's
> > default code for creating the xsk_socket, the value of bind_flags was 0.
> > When I changed it to XDP_COPY (1 << 1), everything started working
> > correctly. So is it the case that the socket was being created in
> > zero-copy mode? As far as I know, kernel version 5.3 does not
> > support zero-copy mode, and that would be why XDP redirect using DEVMAP
> > was not working as expected.
>
> Hmm, in zero-copy mode packets are DMA'ed directly into the
> userspace-provided buffer (hence zero copies). Pretty sure this is
> incompatible with forwarding them to the stack, whether or not the
> kernel supports zero-copy in the first place.

If you redirect to another netdev or XDP_PASS a packet to the kernel,
it will be copied and the copy is sent onwards to the system. As Toke
says, whatever you had in that packet will be visible to the process
you created/used the zero-copy AF_XDP socket in. But any further
modification to the packet will not be visible in the umem area, since
the packet was copied to another kernel internal buffer.
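
For reference, a minimal sketch of how the bind flags discussed above map
to the copy/zero-copy choice. The flag values mirror <linux/if_xdp.h>;
the enum and the helper names are hypothetical and exist only for this
illustration. As far as I understand, leaving the flags at 0 lets the
kernel try zero-copy first and fall back to copy mode, and setting both
flags at once is rejected at bind time.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirror <linux/if_xdp.h>; defined locally only so this sketch
 * builds without the kernel headers. */
#ifndef XDP_COPY
#define XDP_COPY     (1 << 1) /* force copy mode */
#endif
#ifndef XDP_ZEROCOPY
#define XDP_ZEROCOPY (1 << 2) /* force zero-copy mode */
#endif

/* Hypothetical enum for this sketch; not part of any library. */
enum xsk_mode { XSK_MODE_AUTO, XSK_MODE_COPY, XSK_MODE_ZEROCOPY };

/* Hypothetical helper: returns the bind flags for the requested mode. */
static uint16_t xsk_bind_flags_for(enum xsk_mode mode)
{
    switch (mode) {
    case XSK_MODE_COPY:
        return XDP_COPY;
    case XSK_MODE_ZEROCOPY:
        return XDP_ZEROCOPY;
    case XSK_MODE_AUTO:
    default:
        return 0; /* no mode forced; the kernel picks */
    }
}

/* Hypothetical helper: the two mode flags are mutually exclusive. */
static int xsk_bind_flags_valid(uint16_t flags)
{
    return !((flags & XDP_COPY) && (flags & XDP_ZEROCOPY));
}

/* Small self-check of the helpers above. */
static void xsk_bind_flags_selfcheck(void)
{
    assert(xsk_bind_flags_valid(xsk_bind_flags_for(XSK_MODE_COPY)));
    assert(!xsk_bind_flags_valid(XDP_COPY | XDP_ZEROCOPY));
}
```

The value returned for XSK_MODE_COPY is what you would put in bind_flags
(or sxdp_flags if you bind the socket by hand) to get the behaviour
Maharishi saw after switching to XDP_COPY.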

/Magnus

> Cc Magnus who will know for sure.
>
> -Toke
>



