On 29/05/2023 14:48, Ido Schimmel wrote:
> For EVPN non-DF (Designated Forwarder) filtering we need to be able to
> prevent decapsulated traffic from being flooded to a multi-homed host.
> Filtering of multicast and broadcast traffic can be achieved using the
> following flower filter:
>
> # tc filter add dev bond0 egress pref 1 proto all flower indev vxlan0 dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 action drop
>
> Unlike broadcast and multicast traffic, it is not currently possible to
> filter unknown unicast traffic. The classification into unknown unicast
> is performed by the bridge driver, but is not visible to other layers
> such as tc.
>
> Solve this by adding a new 'l2_miss' bit to the tc skb extension. Clear
> the bit whenever a packet enters the bridge (received from a bridge port
> or transmitted via the bridge) and set it if the packet did not match an
> FDB or MDB entry. If there is no skb extension and the bit needs to be
> cleared, then do not allocate one as no extension is equivalent to the
> bit being cleared. The bit is not set for broadcast packets as they
> never perform a lookup and therefore never incur a miss.
>
> A bit that is set for every flooded packet would also work for the
> current use case, but it does not allow us to differentiate between
> registered and unregistered multicast traffic, which might be useful in
> the future.
>
> To keep the performance impact to a minimum, the marking of packets is
> guarded by the 'tc_skb_ext_tc' static key. When 'false', the skb is not
> touched and an skb extension is not allocated. Instead, only a
> 5 bytes nop is executed, as demonstrated below for the call site in
> br_handle_frame().
>
> Before the patch:
>
> ```
>         memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
>   c37b09:       49 c7 44 24 28 00 00    movq   $0x0,0x28(%r12)
>   c37b10:       00 00
>
>         p = br_port_get_rcu(skb->dev);
>   c37b12:       49 8b 44 24 10          mov    0x10(%r12),%rax
>         memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
>   c37b17:       49 c7 44 24 30 00 00    movq   $0x0,0x30(%r12)
>   c37b1e:       00 00
>   c37b20:       49 c7 44 24 38 00 00    movq   $0x0,0x38(%r12)
>   c37b27:       00 00
> ```
>
> After the patch (when static key is disabled):
>
> ```
>         memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
>   c37c29:       49 c7 44 24 28 00 00    movq   $0x0,0x28(%r12)
>   c37c30:       00 00
>   c37c32:       49 8d 44 24 28          lea    0x28(%r12),%rax
>   c37c37:       48 c7 40 08 00 00 00    movq   $0x0,0x8(%rax)
>   c37c3e:       00
>   c37c3f:       48 c7 40 10 00 00 00    movq   $0x0,0x10(%rax)
>   c37c46:       00
>
> #ifdef CONFIG_HAVE_JUMP_LABEL_HACK
>
> static __always_inline bool arch_static_branch(struct static_key *key, bool branch)
> {
>         asm_volatile_goto("1:"
>   c37c47:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
>         br_tc_skb_miss_set(skb, false);
>
>         p = br_port_get_rcu(skb->dev);
>   c37c4c:       49 8b 44 24 10          mov    0x10(%r12),%rax
> ```
>
> Subsequent patches will extend the flower classifier to be able to match
> on the new 'l2_miss' bit and enable / disable the static key when
> filters that match on it are added / deleted.
>
> Signed-off-by: Ido Schimmel <idosch@xxxxxxxxxx>
> ---
>
> Notes:
>     v2:
>     * Use tc skb extension instead of adding a bit to the skb.
>     * Do not mark broadcast packets as they never perform a lookup and
>       therefore never incur a miss.
>
>  include/linux/skbuff.h  |  1 +
>  net/bridge/br_device.c  |  1 +
>  net/bridge/br_forward.c |  3 +++
>  net/bridge/br_input.c   |  1 +
>  net/bridge/br_private.h | 27 +++++++++++++++++++++++++++
>  5 files changed, 33 insertions(+)
>

Nice approach.

Acked-by: Nikolay Aleksandrov <razor@xxxxxxxxxxxxx>
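
For anyone reading along without the diff, here is a rough userspace sketch of the marking rules as the commit message describes them. It is not the kernel code: the body of br_tc_skb_miss_set() is not shown in this mail, so every name below (model_skb, model_miss_set(), model_bridge_rx(), tc_skb_ext_enabled) is made up for illustration, and a plain bool stands in for the 'tc_skb_ext_tc' static key.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the 'tc_skb_ext_tc' static key. In the kernel
 * this is a patched branch, not a variable read. */
static bool tc_skb_ext_enabled;

/* Hypothetical model of the tc skb extension carrying the l2_miss bit. */
struct model_skb_ext {
	bool l2_miss;
};

/* Hypothetical model of an skb: ext == NULL means "no extension". */
struct model_skb {
	struct model_skb_ext *ext;
	struct model_skb_ext ext_storage; /* backing store, avoids malloc */
};

enum model_dst { MODEL_BROADCAST, MODEL_MULTICAST, MODEL_UNICAST };

/* Set or clear the bit, following the rules in the commit message. */
static void model_miss_set(struct model_skb *skb, bool miss)
{
	if (!tc_skb_ext_enabled)
		return;			/* key off: skb is not touched */

	if (skb->ext) {
		skb->ext->l2_miss = miss;
		return;
	}

	if (!miss)
		return;			/* no extension == bit cleared */

	skb->ext = &skb->ext_storage;	/* "allocate" only when setting */
	skb->ext->l2_miss = true;
}

/* Read the bit; a missing extension reads as cleared. */
static bool model_l2_miss(const struct model_skb *skb)
{
	return skb->ext && skb->ext->l2_miss;
}

/* Packet enters the bridge: clear first, then mark on FDB/MDB miss. */
static void model_bridge_rx(struct model_skb *skb, enum model_dst dst,
			    bool entry_found)
{
	model_miss_set(skb, false);

	if (dst == MODEL_BROADCAST)
		return;			/* no lookup, so never a miss */

	if (!entry_found)
		model_miss_set(skb, true);
}
```

With the flag off, model_bridge_rx() leaves the skb untouched whatever the lookup result. With it on, an unknown unicast or unregistered multicast packet ends up with the bit set, while a broadcast packet or one that matched an entry does not.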