Re: xdp_redirect ifindex vs port. Was: best API for returning/setting egress port?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 4/28/17 12:43 PM, Hannes Frederic Sowa wrote:
On 28.04.2017 07:30, Alexei Starovoitov wrote:
On 4/27/17 10:06 PM, John Fastabend wrote:
That is more or less what I was thinking as well. The other question
I have though is should we have a bpf_redirect() call for the simple
case where I use the ifindex directly. This will be helpful for taking
existing programs from tc_cls into xdp. I think it makes sense to have
both bpf_tx_allports(), bpf_tx_port(), and bpf_redirect().

I think so too.
Once netdevice is stored into netdev_array map the netdevice is pinned
and we need to figure out what to do if somebody tries to delete it.
Should we add a new netlink notifier that this netdev's refcnt is
almost zero and it's only in netdev_array(s) ?

We basically do that automatically in netdev_wait_allrefs:

pr_emerg("unregister_netdevice: waiting for %s to become free. Usage
count = %d\n",
         dev->name, refcnt);

It is a very unpleasant warning and users probably think about a bug in
the kernel at first.

I don't think we should wait for user space to clean that up but have to
do it automatically from the kernel. Maybe we can introduce a special
value that basically NOPs the transmission. The hash table itself would
install a netdevice notifier and would clean all tables. Could
definitely cause some storm in the kernel, if a lot of keys are mapped
to the same interface.

or should it be deleted from the array(s) automatically and
then user space will be notified post-deletion?
Both approaches have their pros and cons.

I am leaning more towards deleting it automatically. But walking all
tables and in there all keys might cause some unwanted load spikes.

yeah, when netdev is being unregistered we have to drop the ref
to avoid the warning.

Speaking of delete notifiers for maps.
Until recently all deletions were explicit and the program
could do extra perf_event_submit() if it needed to notify
user space of deletion.
But right now we have LRU map that deletes automatically and this
netdev_array will be deleting automatically too, so we probably
need some sort of generic map notifier for all map types.
It can be as simple as user space sleeping on read(map_fd, buf);
or waiting for epoll on map_fd and kernel wakes it up
when map element is deleted and pushes 'key/value' pair
into the buf. That should be generic for all map types,
but it needs some mechanism to avoid blocking, since number
of deletions can be huge. Something similar to perf's event record
'lost N records' should do the trick.




[Index of Archives]     [Linux Networking Development]     [Fedora Linux Users]     [Linux SCTP]     [DCCP]     [Gimp]     [Yosemite Campsites]

  Powered by Linux