On 7/20/22 8:03 AM, Russell King (Oracle) wrote:
> On Tue, Jul 19, 2022 at 07:49:50PM -0400, Sean Anderson wrote:
>> Second, to reduce packet loss it may be desirable to throttle packet
>> throughput. In past discussions [2-4], this behavior has been
>> controversial.
>
> It isn't controversial at all. It's something we need to support, but
> the point I've been making is that if we're adding rate adaption, then
> we need to do a better job when designing the infrastructure to cater
> for all currently known forms of rate adaption amongst the knowledge
> pool that we have, not just one. That's why I brought up the IPG-based
> method used by 88x3310.
>
> Phylink development is extremely difficult, and takes months or years
> for changes to get into mainline when updates to drivers are required -
> this is why I have a massive queue of changes all the time.
>
>> It is the opinion of several developers that it is the
>> responsibility of the system integrator or end user to set the link
>> settings appropriately for rate adaptation. In particular, it was argued
>> that it is difficult to determine whether a particular phy has rate
>> adaptation enabled, and it is simpler to keep such determinations out of
>> the kernel.
>
> I don't think I've ever said that...

You haven't. This mostly stems from

https://lore.kernel.org/netdev/DB8PR04MB6985139D4ABED85B701445A9EC050@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

where there was some discussion about whose responsibility it was to
determine whether rate adaptation was supported. The implication was
that we should delay support for rate adaptation until we could reliably
determine whether it was supported. This of course is not quite
implemented yet. While we can determine whether rate adaptation is
actually in use, I don't know if we can determine whether it is available
before trying to bring the link up.
>> Another criticism is that packet loss may happen anyway, such
>> as if a faster link is used with a switch or repeater that does not support
>> pause frames.
>
> That isn't what I've said. Packet loss may happen if (a) pause frames
> can not be sent by a PHY in rate adaption mode and (b) if the MAC can't
> pace its transmission for the media/line speed. This is a fundamental
> fact where a PHY will only have so much buffering capability, that if
> the MAC sends packets at a faster rate than the PHY can get them out, it
> runs out of buffer space. That isn't a criticism, it's a statement of
> fact.

You're right. I mainly wanted to bring up what you just noted: that we
may have packet loss anyway, and that higher-layer protocols already deal
with packet loss. So a MAC unaware of the rate adaptation is not
necessarily the worst thing.

>> I believe that our current approach is limiting, especially when
>> considering that rate adaptation (in two forms) has made it into IEEE
>> standards. In general, When we have appropriate information we should set
>> sensible defaults. To consider use a contrasting example, we enable pause
>> frames by default for switches which autonegotiate for them. When it's the
>> phy itself generating these frames, we don't even have to autonegotiate to
>> know that we should enable pause frames.
>
> I'm not sure I understand what you're saying, because it doesn't match
> what I've seen.
>

Sorry, I was unclear here. I meant link partners, not local (DSA) switches.

> "we enable pause frames by default for swithes which autonegotiate for
> them" - what are you talking about there? The "user" ports on the
> switch, or the DSA/CPU ports? It has been argued that pause frames
> should not be enabled for the CPU port, particularly when the CPU port
> runs at a slower speed than the switch - which happens e.g. on the VF610
> platforms.
>
> Most CPU ports to switches I'm aware of are specified either using a
> fixed link in firmware or default to a fixed link both without pause
> frames. Maybe this is just a quirk of the mv88e6xxx setup.
>
> "when it's the phy itself generating these frames, we don't even have to
> autonegotiate to know that we should enable pause frames." I'm not sure
> that's got any relevance. When a PHY is in rate adapting mode, there are
> two separate things that are going on. There's the media side link
> negotiation and parameters, and then there's the requirements of the
> host-side link. The parameters of the host-side link do not need to be
> negotiated with the link partner, but they do potentially affect what
> link modes we can negotiate with our link partner (for example, if the
> PHY can't handle HD on the media side with the MAC operating FD). In any
> case, if the PHY requires the MAC to receive pause frames for its rate
> adaption to work, then this doesn't affect the media side
> autonegotiation at all. Hence, I don't understand this comment.
>
>> Note that
>> even when we determine (e.g.) the pause settings based on whether rate
>> adaptation is enabled, they can still be overridden by userspace (using
>> ethtool). It might be prudent to allow disabling of rate adaptation
>> generally in ethtool as well.
>
> This is no longer true as this patch set overrides whatever receive
> pause state has been negotiated or requested by userspace so that rate
> adaption can still work.

Right, I forgot to edit this.

> The future work here is to work out whether we should disable rate
> adaption if userspace requests receive pause frames to be disabled, or
> whether switching to another form of controlling rate adaption would be
> appropriate and/or possible.
>

I'm not sure what the best course here is either.

--Sean