> On Tue, Nov 12, 2024 at 02:55:29AM +0000, Tristram.Ha@xxxxxxxxxxxxx wrote: > > > On Fri, Nov 08, 2024 at 05:56:33PM -0800, Tristram.Ha@xxxxxxxxxxxxx wrote: > > > > From: Tristram Ha <tristram.ha@xxxxxxxxxxxxx> > > > > > > > > The SGMII module of KSZ9477 switch can be setup in 3 ways: 0 for direct > > > > connect, 1 for 1000BaseT/1000BaseX SFP, and 2 for 10/100/1000BaseT SFP. > > > > > > This naming is rather odd. First off, i would drop 'SFP'. It does not > > > have to be an SFP on the other end, it could be another switch for > > > example. 1 is PHY_INTERFACE_MODE_1000BASEX and 2 is > > > PHY_INTERFACE_MODE_SGMII. > > > > > > > SFP is typically used so the default is 1. The driver can detect > > > > 10/100/1000BaseT SFP and change the mode to 2. > > > > > > phylink will tell you want mode to use. I would ignore what the > > > hardware detects, so this driver is just the same as every other > > > driver, making it easier to maintain. > > > > There are some issues I found that will need your advises. > > > > The phylink SFP code categorizes SFP using fiber cable as > > PHY_INTERFACE_MODE_1000BASEX and SFP using a regular RJ45 connector as > > PHY_INTERFACE_MODE_SGMII, which has a PHY that can be accessed through > > I2C connection with a PHY driver. > > Not quite correct, i think. If MDIO over I2C does not work, it will > still decide on 1000BaseX vs SGMII from the SFP eeprom contents. There > are some SFPs where the PHY is not accessible, and we have to live > with however it is configured. > > > Now when SGMII SFP is used the phylink > > cannot be created because it fails the validation in > > phylink_sfp_config_phy(). > > Please stop using 'SGMII SFP'. It should just be SGMII. The MAC should > not care what is on the other end, it could be a PHY, and SFP, or a > switch, all using Cisco SGMII. > > > The reason is the phydev has empty supported > > and advertising data fields as it is just created. > > Do you mean the phydev for the PHY in the SFP? Or do you have a second > phydev here? I'm confused. I am a little confused. There may be regular PHY using SGMII with MDIO access just like a RGMII PHY, but we are dealing specifically SFP here. The KSZ9477 switch board has a SFP cage where different SFP can be plugged in. The SFP driver has to be enabled in kernel configuration under Hardware Monitoring. The driver then can read the EEPROM of the SFP and access its PHY if available and provide support to the phylink driver. When the SFP EEPROM says it does not support 1000Base-T then the SFP bus code does not consider the SFP has a PHY and skips creating a MDIO bus for it and phylink_sfp_config_optical() is called to create the phylink. When the SFP says it supports 1000Base-T sfp_add_phy() is called by the SFP state machine and phylink_sfp_connect_phy() and phylink_sfp_config_phy() are run. It is in the last function that the validation fails as the just created phy device does not initialize its supported and advertising fields yet. The phy device has the opportunity later to fill them up if the phylink creation goes through, but that never happens. A fix is to fill those fields with sfp_support like this: @@ -3228,6 +3228,11 @@ static int phylink_sfp_config_phy(struct struct phylink_link_state config; int ret; + /* The newly created PHY device has empty settings. */ + if (linkmode_empty(phy->supported)) { + linkmode_copy(phy->supported, pl->sfp_support); + linkmode_copy(phy->advertising, pl->sfp_support); + } linkmode_copy(support, phy->supported); memset(&config, 0, sizeof(config)); The provided PCS driver from the DSA driver has an opportunity to change support with its validation check, but that does not look right as generally those checks remove certain bits from the link mode, but this requires completely copying new ones. And this still does not work as the advertising field passed to the PCS driver has a const modifier. > > I mentioned the SGMII module operates differently for two types of SFP: > > SGMII and 1000BASEX. The 1000Base-T SFP operates the same as 1000Base-SX > > fiber SFP, and the driver would like it to be assigned > > PHY_INTERFACE_MODE_1000BASEX, but it is always assigned > > PHY_INTERFACE_MODE_SGMII in sfp_select_interface because 1000baseT_Full > > is compared before 1000baseX_Full. > > > > Now I am not sure if those SFPs I tested have correct EEPROM. Some > > no-brand ones return 0xff value when the PHY driver reads the link status > > from them and so that driver cannot tell when the link is down. Other > > than that those SFPs operate correctly in forwarding traffic. > > There is no standardisation of how you access the PHY in an SFP. So > each manufacture can do their own thing. However, there are a small > number of PHYs actually used inside SFPs, and we have support for > those common ones. Now back to the discussion of the different modes used by the SGMII module. I think a better term like SerDes can be used to help understanding the operation, although I still cannot narrow down the precise definitions from looking at the internet. SGMII mode is said to support 10/100/1000Mbit. This is the default setting, so plugging such SFP allows the port to communicate without any register programming. The other mode is SerDes, which is fixed at 1000Mbit. This is typically used by SFP using fiber optics. This requires changing a register to make the port works. It seems those 1000Base-T SFPs all run in SerDes mode, at least from all SFPs I tried. The issue is then phylink assigns SGMII phy mode to such SFP as its EEPROM just says 1000Base-T support and not 1000BASEX phy mode so that the DSA driver can program the register correspondingly. Because of that the driver still needs to rely on its own detection to find out which mode to use. > Have you set pcs.poll? phylink will then poll the PCS every > second. You can report PCS status any time. I know about PCS polling. The SFP cage driver can provide link_up and link_down indications to the phylink driver. I thought this feature can be used without activating polling and when interrupt is not used. But that link_up indication can result with the port not connected as I mentioned. Anyway the SFP driver has its own state machine doing polling, so it is not like resources are not used. One more issue is if a SFP is not plugged in eventually the SFP driver says "please wait, module slow to respond." It may look like an error to regular users. The next error message definitely looks like it: "failed to read EEPROM: -EREMOTEIO." Do you know if anything can be done like defining some GPIO pins like setting up the tx_disable pin?