On Mon, Oct 9, 2023 at 3:05 AM Akihiko Odaki <akihiko.odaki@xxxxxxxxxx> wrote: > > > > On 2023/10/09 18:54, Willem de Bruijn wrote: > > On Mon, Oct 9, 2023 at 3:44 AM Akihiko Odaki <akihiko.odaki@xxxxxxxxxx> wrote: > >> > >> On 2023/10/09 17:13, Willem de Bruijn wrote: > >>> On Sun, Oct 8, 2023 at 12:22 AM Akihiko Odaki <akihiko.odaki@xxxxxxxxxx> wrote: > >>>> > >>>> virtio-net have two usage of hashes: one is RSS and another is hash > >>>> reporting. Conventionally the hash calculation was done by the VMM. > >>>> However, computing the hash after the queue was chosen defeats the > >>>> purpose of RSS. > >>>> > >>>> Another approach is to use eBPF steering program. This approach has > >>>> another downside: it cannot report the calculated hash due to the > >>>> restrictive nature of eBPF. > >>>> > >>>> Introduce the code to compute hashes to the kernel in order to overcome > >>>> thse challenges. An alternative solution is to extend the eBPF steering > >>>> program so that it will be able to report to the userspace, but it makes > >>>> little sense to allow to implement different hashing algorithms with > >>>> eBPF since the hash value reported by virtio-net is strictly defined by > >>>> the specification. > >>>> > >>>> The hash value already stored in sk_buff is not used and computed > >>>> independently since it may have been computed in a way not conformant > >>>> with the specification. > >>>> > >>>> Signed-off-by: Akihiko Odaki <akihiko.odaki@xxxxxxxxxx> > >>> > >>>> @@ -2116,31 +2172,49 @@ static ssize_t tun_put_user(struct tun_struct *tun, > >>>> } > >>>> > >>>> if (vnet_hdr_sz) { > >>>> - struct virtio_net_hdr gso; > >>>> + union { > >>>> + struct virtio_net_hdr hdr; > >>>> + struct virtio_net_hdr_v1_hash v1_hash_hdr; > >>>> + } hdr; > >>>> + int ret; > >>>> > >>>> if (iov_iter_count(iter) < vnet_hdr_sz) > >>>> return -EINVAL; > >>>> > >>>> - if (virtio_net_hdr_from_skb(skb, &gso, > >>>> - tun_is_little_endian(tun), true, > >>>> - vlan_hlen)) { > >>>> + if ((READ_ONCE(tun->vnet_hash.flags) & TUN_VNET_HASH_REPORT) && > >>>> + vnet_hdr_sz >= sizeof(hdr.v1_hash_hdr) && > >>>> + skb->tun_vnet_hash) { > >>> > >>> Isn't vnet_hdr_sz guaranteed to be >= hdr.v1_hash_hdr, by virtue of > >>> the set hash ioctl failing otherwise? > >>> > >>> Such checks should be limited to control path where possible > >> > >> There is a potential race since tun->vnet_hash.flags and vnet_hdr_sz are > >> not read at once. > > > > It should not be possible to downgrade the hdr_sz once v1 is selected. > > I see nothing that prevents shrinking the header size. > > tun->vnet_hash.flags is read after vnet_hdr_sz so the race can happen > even for the case the header size grows though this can be fixed by > reordering the two reads. One option is to fail any control path that tries to re-negotiate header size once this hash option is enabled? There is no practical reason to allow feature re-negotiation at any arbitrary time.