Re: [PATCH net 1/1] net: stmmac: Prevent DSA tags from breaking COE

On Fri, Dec 29, 2023 at 05:11:48PM +0100, Romain Gantois wrote:
> Thanks for telling me about DSA_LOOP, I've tested several DSA tagging protocols 
> with the RZN1 GMAC1 hardware using this method. Here's what I found in a 
> nutshell:

Good job exploring the complexity of the problem in depth.

> For tagging protocols that change the EtherType field in the MAC header (e.g. 
> DSA_TAG_PROTO_(DSA/EDSA/BRCM/MTK/RTL4C_A/SJA1105): On TX the tagged frames are 
> almost always ignored by the checksum offload engine and IP header checker of 
> the MAC device. I say "almost always" because there is an 
> unlikely but nasty corner case where a DSA tag can be identical to an IP 
> EtherType value. In these cases, the frame will likely fail IP header checks 
> and be dropped by the MAC.

Yes, there are a few poorly designed DSA tagging formats where arbitrary
fields overlap with what the conduit interface sees as the EtherType field.
We don't design the tagging formats, as they are proprietary (except for those
derived from tag_8021q), we just support them. In some cases where the
switch has permitted that, we have implemented dynamic changing of
tagging protocols (like 'echo edsa > /sys/class/net/eth0/dsa/tagging')
in order to increase the compatibility between a particular switch and
its conduit interface. And where the compatibility with the default
tagging protocol was beyond broken, we accepted an alternative one
through the 'dsa-tag-protocol' device tree property.

> Ignoring these corner cases, the DSA frames will egress with a partial 
> checksum and be dropped by the recipient. On RX, these frames will, once again, 
> not be detected as IP frames by the MAC. So they will be transmitted to the CPU. 
> However, the stmmac driver will assume (wrongly in this case) that
> these frames' checksums have been verified by the MAC. So it will set 
> CHECKSUM_UNNECESSARY:
> 
> https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c#L5493
>  
> And so the IP/TCP checksums will not be checked at all, which is not
> ideal.

Yup, this all stems from the fact that DSA inherits the checksum offload
features of the conduit (stmmac) from its vlan_features. People think
that vlan_features are inherited only by VLAN upper interfaces, but that
is not the case. Confusingly, in some cases, offloading NETIF_F_IP_CSUM |
NETIF_F_IPV6_CSUM really does work (Broadcom conduit + Broadcom switch,
Marvell conduit + Marvell switch, etc), so we can't remove this mechanism.
But it exposes a lack of API compliance in drivers such as stmmac,
which is what makes it a fragile mechanism.

> There are other DSA tagging protocols which cause different issues. For example 
> DSA_TAG_PROTO_BRCM_PREPEND, which seems to offset the whole MAC header, and 
> DSA_TAG_PROTO_LAN9303 which sets ETH_P_8021Q as its EtherType. I haven't dug too 
> deeply on these issues yet, since I'd rather deal with the checksumming issue 
> before getting distracted by VLAN offloading and other stuff.

I agree that what brcm-prepend does - shifting the entire frame to the
right by 4 octets - sounds problematic in general (making the conduit
see the EtherType as octets [3:2] of the original MAC SA). But you also
need to take a look at where those protocols are used, and if that is
relevant in any way to the stmmac.

	/* Broadcom BCM58xx chips have a flow accelerator on Port 8
	 * which requires us to use the prepended Broadcom tag type
	 */
	if (dev->chip_id == BCM58XX_DEVICE_ID && port == B53_CPU_PORT) {
		dev->tag_protocol = DSA_TAG_PROTO_BRCM_PREPEND;
		goto out;
	}

From what I understand, DSA_TAG_PROTO_BRCM_PREPEND is only used
internally within Broadcom SoCs, so it seems likely that it's not
designed with generic compatibility in mind.
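
To illustrate the shift described above, here is a minimal userspace
sketch (not kernel code). The helper names prepend_tag() and
ethertype_seen_by_conduit() are hypothetical, and the tag contents are
zeroed placeholders; only the 4-octet length comes from the discussion.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN		6
#define ETH_TYPE_OFFSET		(2 * ETH_ALEN)	/* where a MAC expects the EtherType */
#define PREPEND_TAG_LEN		4		/* tag length, per the discussion above */

/* Hypothetical helper: copy @frame into @out behind a zeroed 4-octet tag.
 * Returns the new length, or 0 if @out is too small.
 */
static size_t prepend_tag(const uint8_t *frame, size_t len,
			  uint8_t *out, size_t out_size)
{
	if (out_size < len + PREPEND_TAG_LEN)
		return 0;

	memset(out, 0, PREPEND_TAG_LEN);	/* placeholder tag contents */
	memcpy(out + PREPEND_TAG_LEN, frame, len);

	return len + PREPEND_TAG_LEN;
}

/* Hypothetical helper: the 16-bit value at the usual EtherType offset.
 * The caller must pass at least 14 octets.
 */
static uint16_t ethertype_seen_by_conduit(const uint8_t *frame)
{
	return (uint16_t)((frame[ETH_TYPE_OFFSET] << 8) |
			  frame[ETH_TYPE_OFFSET + 1]);
}
```

With a source address of 00:11:22:33:44:55, the conduit would read
0x2233 at the EtherType offset of the tagged frame, whatever the real
EtherType was.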

As for DSA_TAG_PROTO_LAN9303, let me guess what the problem was. TX was
fine, but on RX, the packets got dropped in hardware before they even
reached the stmmac driver, because it declares NETIF_F_HW_VLAN_CTAG_FILTER |
NETIF_F_HW_VLAN_STAG_FILTER as features, and the DSA tags effectively
look like unregistered VLAN traffic.

That is certainly an area where the lan9303 support can be improved.
Other VLAN-based taggers like tag_8021q perform vlan_vid_add() calls on
the conduit interface so that it won't drop the traffic even when it
uses hardware VLAN filtering.

> Among the tagging protocols I tested, the only one that didn't cause any issues 
> was DSA_TAG_PROTO_TRAILER, which only appends stuff to the frame.

It's very curious that you say this. Tail taggers are notoriously
problematic, because while the conduit will perform the checksum offload
function on the packets, the checksum calculation goes until the very end
of the frame. Thus, that checksum will be wrong after the switch consumes
the tail tag (and does not update the L4 checksum).

There is no way to overcome that except to not inherit any checksum
offload features for tail taggers. But that would break some other thing,
so we opted for having this line in the xmit procedure of tail taggers:

	if (skb->ip_summed == CHECKSUM_PARTIAL && skb_checksum_help(skb))
		return NULL;

But apparently we have been inconsistent about this, and trailer_xmit()
does not have the check. So DSA_TAG_PROTO_TRAILER should actually be a
case of "checksum is computed, but is incorrect after tag stripping",
yet you say that it was the only one that worked fine.
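
The tail tag problem can be reproduced in userspace with a plain
Internet checksum. This is only a sketch: inet_csum() is a hypothetical
helper written in the style of RFC 1071, not the kernel's csum code, and
it assumes the offload engine sums every octet up to the end of the
frame, as described above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 16-bit ones' complement checksum over @len octets. */
static uint16_t inet_csum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	/* Add up big-endian 16-bit words */
	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)((data[i] << 8) | data[i + 1]);

	/* An odd trailing octet is padded with zero on the right */
	if (len & 1)
		sum += (uint32_t)(data[len - 1] << 8);

	/* Fold the carries back into the low 16 bits */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

Summing a payload together with a non-zero trailer generally gives a
different result than summing the payload alone, so a checksum computed
over the tagged frame no longer matches once the switch strips the tag.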

> TLDR: The simplest solution seems to be to modify the stmmac TX and RX paths to 
> disable checksum offloading for frames that have a non-IP ethertype in 
> their MAC header. This will fix the checksum situation for DSA tagging protocols 
> that set non-IP and non-8021Q EtherTypes. Some edge cases like 
> DSA_TAG_PROTO_BRCM_PREPEND and DSA_TAG_PROTO_LAN9303 will require a completely 
> different solution if we want these MAC devices to handle them properly.
> Please share any thoughts you might have on this suggestion.

I think the overall idea is correct, with two small caveats: "let's
ignore brcm-prepend" and "lan9303 should work, maybe it's just a case of
disabling the VLAN filtering features through ethtool and testing again?".
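
For reference, the kind of EtherType check being proposed could look
roughly like this in userspace C. It is a sketch under assumptions, not
the stmmac patch: frame_looks_like_ip() is a hypothetical helper, and
skipping over VLAN tags before looking at the EtherType is my assumption
about what the checksum engine can parse.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN	14	/* DA + SA + EtherType */
#define VLAN_HLEN	4
#define ETH_P_IP	0x0800
#define ETH_P_IPV6	0x86DD
#define ETH_P_8021Q	0x8100
#define ETH_P_8021AD	0x88A8

static uint16_t read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Hypothetical helper: true if the frame carries an IPv4 or IPv6
 * EtherType, optionally behind VLAN tags (assumption, see above).
 */
static bool frame_looks_like_ip(const uint8_t *frame, size_t len)
{
	size_t off = ETH_HLEN - 2;	/* offset of the EtherType field */
	uint16_t proto;

	if (len < ETH_HLEN)
		return false;

	proto = read_be16(frame + off);

	/* Step over any VLAN tags to reach the encapsulated EtherType */
	while (proto == ETH_P_8021Q || proto == ETH_P_8021AD) {
		off += VLAN_HLEN;
		if (len < off + 2)
			return false;
		proto = read_be16(frame + off);
	}

	return proto == ETH_P_IP || proto == ETH_P_IPV6;
}
```

A driver following this idea would fall back to software checksumming
on TX, and would leave the checksum unverified on RX, whenever the
check returns false.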



