Re: [PATCH RFC v4 0/9] tun: Introduce virtio-net hashing feature

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, 29 Sep 2024 16:10:47 +0900
Akihiko Odaki <akihiko.odaki@xxxxxxxxxx> wrote:

> On 2024/09/29 11:07, Jason Wang wrote:
> > On Fri, Sep 27, 2024 at 3:51 PM Akihiko Odaki <akihiko.odaki@xxxxxxxxxx> wrote:  
> >>
> >> On 2024/09/27 13:31, Jason Wang wrote:  
> >>> On Fri, Sep 27, 2024 at 10:11 AM Akihiko Odaki <akihiko.odaki@xxxxxxxxxx> wrote:  
> >>>>
> >>>> On 2024/09/25 12:30, Jason Wang wrote:  
> >>>>> On Tue, Sep 24, 2024 at 5:01 PM Akihiko Odaki <akihiko.odaki@xxxxxxxxxx> wrote:  
> >>>>>>
> >>>>>> virtio-net have two usage of hashes: one is RSS and another is hash
> >>>>>> reporting. Conventionally the hash calculation was done by the VMM.
> >>>>>> However, computing the hash after the queue was chosen defeats the
> >>>>>> purpose of RSS.
> >>>>>>
> >>>>>> Another approach is to use eBPF steering program. This approach has
> >>>>>> another downside: it cannot report the calculated hash due to the
> >>>>>> restrictive nature of eBPF.
> >>>>>>
> >>>>>> Introduce the code to compute hashes to the kernel in order to overcome
> >>>>>> thse challenges.
> >>>>>>
> >>>>>> An alternative solution is to extend the eBPF steering program so that it
> >>>>>> will be able to report to the userspace, but it is based on context
> >>>>>> rewrites, which is in feature freeze. We can adopt kfuncs, but they will
> >>>>>> not be UAPIs. We opt to ioctl to align with other relevant UAPIs (KVM
> >>>>>> and vhost_net).
> >>>>>>  
> >>>>>
> >>>>> I wonder if we could clone the skb and reuse some to store the hash,
> >>>>> then the steering eBPF program can access these fields without
> >>>>> introducing full RSS in the kernel?  
> >>>>
> >>>> I don't get how cloning the skb can solve the issue.
> >>>>
> >>>> We can certainly implement Toeplitz function in the kernel or even with
> >>>> tc-bpf to store a hash value that can be used for eBPF steering program
> >>>> and virtio hash reporting. However we don't have a means of storing a
> >>>> hash type, which is specific to virtio hash reporting and lacks a
> >>>> corresponding skb field.  
> >>>
> >>> I may miss something but looking at sk_filter_is_valid_access(). It
> >>> looks to me we can make use of skb->cb[0..4]?  
> >>
> >> I didn't opt to using cb. Below is the rationale:
> >>
> >> cb is for tail call so it means we reuse the field for a different
> >> purpose. The context rewrite allows adding a field without increasing
> >> the size of the underlying storage (the real sk_buff) so we should add a
> >> new field instead of reusing an existing field to avoid confusion.
> >>
> >> We are however no longer allowed to add a new field. In my
> >> understanding, this is because it is an UAPI, and eBPF maintainers found
> >> it is difficult to maintain its stability.
> >>
> >> Reusing cb for hash reporting is a workaround to avoid having a new
> >> field, but it does not solve the underlying problem (i.e., keeping eBPF
> >> as stable as UAPI is unreasonably hard). In my opinion, adding an ioctl
> >> is a reasonable option to keep the API as stable as other virtualization
> >> UAPIs while respecting the underlying intention of the context rewrite
> >> feature freeze.  
> > 
> > Fair enough.
> > 
> > Btw, I remember DPDK implements tuntap RSS via eBPF as well (probably
> > via cls or other). It might worth to see if anything we miss here.  
> 
> Thanks for the information. I wonder why they used cls instead of 
> steering program. Perhaps it may be due to compatibility with macvtap 
> and ipvtap, which don't steering program.
> 
> Their RSS implementation looks cleaner so I will improve my RSS 
> implementation accordingly.
> 

DPDK needs to support flow rules. The specific case is where packets
are classified by a flow, then RSS is done across a subset of the queues.
The support for flow in TUN driver is more academic than useful,
I fixed it for current BPF, but doubt anyone is using it really.

A full steering program would be good, but would require much more
complexity to take a general set of flow rules then communicate that
to the steering program.






[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux