On Wed, 2021-01-20 at 11:19 -0400, Jason Gunthorpe wrote:
> On Wed, Jan 20, 2021 at 04:04:44PM +0100, Martin Wilck wrote:
>
> > Anyway, Jason seems to agree with you that the way it worked until
> > 5.10, which was fine as far as I could tell, was wrong. I'd still
> > appreciate some hints explaining what exactly was wrong with the old
> > code, and how you guys reckon it should work instead. In particular
> > considering Mohammad's statement I quoted further down. Was Mohammad
> > wrong?
>
> In RDMA vlan support revolves around the gid_attr
>
> To have vlan support the device must copy the vlan from the gid_attrs
> associated with every tx packet, and match the gid_attr table on every
> rx, including the vlan.
>
> For instance, rxe never calls rdma_read_gid_l2_fields to get the
> gid_attr for tx, so it doesn't support vlan, at all.
>
> > What I got so far didn't help me much. I'd especially like to
> > understand how you think the high-level user experience should be.
>
> A single rxe device created on the physical netdev. The core code gid
> table stuff should import vlan entries of upper vlan net devices and
> the general machinery should select those gid table entries when a
> vlan is required.
>
> rxe should not be creatable on upper vlan net devices to emulate how
> real HW works.
>
> If your use case that work was creating a rxe on a upper vlan device
> and relying on the tx of vlan layer to stuff the vlan, then the
> problem is how the core code manages the gid table.

My use case was creating RXE on the physical device, creating VLANs on
top of the same physical device, and create RDMA connections over these
VLANs. This is what used to work. I have never observed e.g.
interference between RDMA connections over two different VLANs on the
same physical device, or RDMA connections directly on the physical
device.

Thanks for the explanations, anyway.

Regards,
Martin