On Wed, 2020-11-25 at 10:54 -0800, Jakub Kicinski wrote:
> On Tue, 24 Nov 2020 15:44:13 -0400 Jason Gunthorpe wrote:
> > On Tue, Nov 24, 2020 at 10:41:06AM -0800, Jakub Kicinski wrote:
> > > On Tue, 24 Nov 2020 14:02:10 -0400 Jason Gunthorpe wrote:

[snip]

> > > > >
> > > > > It has been like this for years, it is not some "act".
> > > >
> > > > It is long standing uABI that accelerators like RDMA/etc get to
> > > > take the traffic before netdev. This cannot be reverted. I don't
> > > > really understand what you are expecting here?
> > >
> > > Same. I don't really know what you expect me to do either. I don't
> > > think I can sign-off on kernel changes needed for DPDK.
> >
> > This patch is fine tuning the shared logic that splits the traffic to
> > accelerator subsystems, I don't think netdev should have a veto
> > here. This needs to be consensus among the various communities and
> > subsystems that rely on this.
> >
> > Eli did not explain this well in his commit message. When he said DPDK
> > he means RDMA which is the owner of the FLOW_NAMESPACE. Each
> > accelerator subsystem gets hooked into this, so here VDPA is getting
> > its own hook because re-using the same hook between two kernel
> > subsystems is buggy.
>
> I'm not so sure about this.
>
> The switchdev modeling is supposed to give users control over flow of
> traffic in a sane, well defined way, as opposed to magic flow filtering
> of the early SR-IOV implementations which every vendor had their own
> twist on.
>
> Now IIUC you're tapping traffic for DPDK/raw QPs _before_ all switching
> happens in the NIC? That breaks the switchdev model. We're back to
> per-vendor magic.

No, this is after switching; nothing can precede switching!
After switching and forwarding to the correct function/vport,
the HW demuxes rdma traffic to rdma and the rest (eth) to netdev.

>
> And why do you need a separate VDPA table in the first place?
> Forwarding to a VDPA device has different semantics than forwarding to
> any other VF/SF?

VDPA is yet another "RDMA" application, similar to raw QP; it is
different from a VF/SF. Switching can only forward to a PF/VF/SF; it
doesn't know or care about the endpoint app (netdev/rdma).

Jakub, this is how rdma works and has been working for the past 20
years :). Jason is well aware of the lack of visibility, and I am sure
the rdma folks will improve this; they have been improving a lot lately,
take rdma_tool for example.

Bottom line: the switching model is not the answer for rdma, another
model is required. rdma is by definition HW oriented from day one; you
can't think of it as an offloaded SW model. (Also, in a real switch you
can't define whether a port is rdma or eth :) )

Anyway, you have very valid points that Jason already raised in the
past, but the challenge is more complicated than the challenges we have
in netdev, simply because RDMA is RDMA, where the leading model is the
HW model and the rdma spec, and not the SW ..