Felix Fietkau <nbd@xxxxxxxx> writes: > On 2020-12-17 18:26, Toke Høiland-Jørgensen wrote: >> Felix Fietkau <nbd@xxxxxxxx> writes: >>> If this becomes a problem, I think we should add a similar patch to >>> wireguard, which already calls skb_get_hash before encapsulating. >>> Other regular tunnels should already get a proper hash, since the flow >>> dissector will take care of it. >> >> But then we'd need to go around adding this to all the places that uses >> the hash just to work around a particular piece of broken(ish) hardware. >> And we're hard-coding a behaviour in mac80211 that means we'll *always* >> recompute the hash, even for hardware that's not similarly broken. >> >>> The reason I did this patch is because I have a patch to set the hw flow >>> hash in the skb on mtk_eth_soc, which does help GRO, but leads to >>> collisions on mac80211 fq. >> >> So wouldn't the right thing to do here be to put a flag into the RX >> device that makes the stack clear the hash after using it for GRO? > I don't think the hardware is broken, I think fq is simply making > assumptions about the hash that aren't met by the hw. > > The documentation in include/linux/skbuff.h mentions these requirements > for the skb hash: > * 1) Two packets in different flows have different hash values > * 2) Two packets in the same flow should have the same hash value > > FWIW, I think the 'should' from 2) probably belongs to 1), otherwise it > makes no sense. Two packets of the flow must return the same hash, > otherwise the hash is broken. I'm assuming this is a typo. There's some text further down indicating this is deliberate: * A driver may indicate a hash level which is less specific than the * actual layer the hash was computed on. For instance, a hash computed * at L4 may be considered an L3 hash. This should only be done if the * driver can't unambiguously determine that the HW computed the hash at * the higher layer. Note that the "should" in the second property above * permits this. So the way I'm reading that whole section, either the intent is that both properties should be fulfilled, or that the first one (being collision-free) is more important... > In addition to those properties, fq needs the hash to be > cryptographically secure, so that it can use reciprocal_scale to sort > flows into buckets without allowing an attacker to craft collisions. > That's also the reason why it used to use skb_get_hash_perturb with a > random perturbation until we got software hashes based on siphash. > > I think it's safe to assume that most hardware out there will not > provide collision resistant hashes, so in my opinion fq cannot rely on a > hardware hash. We don't need to go around and change all places that use > the hash, just those that assume a collision resistant one. I did a quick grep-based survey of uses of skb_get_hash() outside drivers - this is what I found (with my interpretations of what they're used for): net/core/dev.c : skb_tx_hash() - selecting TX queue w/reciprocal scale net/core/dev.c : RX flow steering, flow limiting net/core/dev.c : GRO net/core/filter.c : BPF helper include/net/ip_tunnels.h : flowi4_multipath_hash - so multipath selection? net/ipv{4,6}/route.c : multipath hashing (if l4) net/ipv6/seg6_iptunnel : building flow labels net/mac80211/tx.c : FQ net/mptcp/syncookies : storing cookies (XOR w/net_hash_mix()) net/netfilter/nft_hash.c : symhash input (seems to be load balancing) net/openvswitch : flow hashing and actions net/packet/af_packet.c : PACKET_FANOUT_HASH net/sched/sch_*.c : flow hashing for queueing Apart from GRO it's not obvious to me that a trivially attacker-controlled hash is safe in any of those uses? -Toke