On Sun, Jul 28, 2019 at 09:15:59PM +0200, Allan W. Nielsen wrote:
> If we assume that the SwitchDev driver implemented such that all multicast
> traffic goes to the CPU, then we should really have a way to install a HW
> offload path in the silicon, such that these packets does not go to the CPU (as
> they are known not to be use full, and a frame every 3 us is a significant load
> on small DMA connections and CPU resources).
>
> If we assume that the SwitchDev driver implemented such that only "needed"
> multicast packets goes to the CPU, then we need a way to get these packets in
> case we want to implement the DLR protocol.

I'm not familiar with the HW you're working with, so the below might not
be relevant.

In case you don't want to send all multicast traffic to the CPU (I'll
refer to it later), you can install an ingress tc filter that traps to
the CPU the packets you do want to receive. Something like:

# tc qdisc add dev swp1 clsact
# tc filter add dev swp1 pref 1 ingress flower skip_sw dst_mac \
	01:21:6C:00:00:01 action trap

If your HW supports sharing the same filter among multiple ports, then
you can install your filter in a tc shared block and bind multiple ports
to it.

Another option is to always send a *copy* of multicast packets to the
CPU, but make sure the HW uses a policer that prevents the CPU from
being overwhelmed. To avoid packets being forwarded twice (by HW and
SW), you will need to mark such packets in your driver with
'skb->offload_fwd_mark = 1'.

Now, in case the user wants to allow the CPU to receive certain packets
at a higher rate, a tc filter can be used. It will be identical to the
filter I mentioned earlier, but with a 'police' action chained before
'trap'.

I don't think this is currently supported by any driver, but I believe
it's the right way to go: by default the CPU receives all the traffic it
should receive and the user can fine-tune it using ACLs.
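
For reference, a rough sketch of how the shared block and the policed
trap could look from user space. The block index, port names, the
DRY_RUN switch and the police rate/burst values are all placeholders I
made up for illustration. As said above, I don't expect any driver to
accept the 'police' + 'trap' combination with skip_sw today, so treat
the second function as the intended syntax only. By default the script
just prints the commands; set DRY_RUN=0 to actually run them (needs
root).

```shell
#!/bin/sh
# Illustrative defaults; none of these names come from a real setup.
BLOCK="${BLOCK:-1}"
PORTS="${PORTS:-swp1 swp2}"
DLR_MAC="${DLR_MAC:-01:21:6C:00:00:01}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Bind the ingress of every port to one shared block, then install a
# single trap filter on that block so it applies to all bound ports.
setup_trap_block() {
    for port in $PORTS; do
        run tc qdisc add dev "$port" ingress_block "$BLOCK" clsact
    done
    run tc filter add block "$BLOCK" pref 1 flower skip_sw \
        dst_mac "$DLR_MAC" action trap
}

# Same match, but with 'police' chained before 'trap'. Rate and burst
# are arbitrary example values. 'conform-exceed drop/pipe' drops
# packets over the rate and passes conforming ones on to 'trap'.
setup_policed_trap_block() {
    run tc filter add block "$BLOCK" pref 2 flower skip_sw \
        dst_mac "$DLR_MAC" \
        action police rate 1mbit burst 16k conform-exceed drop/pipe \
        action trap
}

setup_trap_block
```

With the defaults this prints two 'qdisc add' lines (one per port)
followed by one 'filter add block' line. Using a block keeps a single
filter instance for all ports instead of one copy per port.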