On Tue, Feb 20, 2024 at 04:50:37PM +0200, Vladimir Oltean wrote: > Hi Zachary, > > On Wed, Feb 14, 2024 at 05:21:54PM -0500, Zachary Goldstein via B4 Relay wrote: > > From: Zachary Goldstein <zachary.goldstein@xxxxxxxxxxxxxxxxx> > > > > The AQR107 family does not support XGMII, but USXGMII and > > 10GBASE-R instead. Since ce64c1f77a9d ("net: phy: aquantia: add USXGMII > > support and warn if XGMII mode is set") the kernel warns about XGMII > > being used. The LS1046A SoC does not support USXGMII, so use 10GBASE-R > > instead. > > > > Signed-off-by: Zachary Goldstein <zachary.goldstein@xxxxxxxxxxxxxxxxx> > > --- > > arch/arm64/boot/dts/freescale/fsl-ls1046a-rdb.dts | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1046a-rdb.dts b/arch/arm64/boot/dts/freescale/fsl-ls1046a-rdb.dts > > index 07f6cc6e354a..d2066f733dc5 100644 > > --- a/arch/arm64/boot/dts/freescale/fsl-ls1046a-rdb.dts > > +++ b/arch/arm64/boot/dts/freescale/fsl-ls1046a-rdb.dts > > @@ -149,7 +149,7 @@ ethernet@ea000 { > > > > ethernet@f0000 { /* 10GEC1 */ > > phy-handle = <&aqr106_phy>; > > - phy-connection-type = "xgmii"; > > + phy-connection-type = "10gbase-r"; > > }; > > > > ethernet@f2000 { /* 10GEC2 */ > > > > --- > > 2.40.1 > > base-commit: 841c35169323cd833294798e58b9bf63fa4fa1de > > > > > > You do not need this patch in upstream, and I strongly advise against > merging it as-is. I've just tested pure net-next on LS1046A-RDB, and I > can bring up fm1-mac9 without any warning. > > You'll notice that commit 5d93cfcf7360 ("net: dpaa: Convert to phylink") > did this in fman_memac.c: > > /* The internal connection to the serdes is XGMII, but this isn't > * really correct for the phy mode (which is the external connection). > * However, this is how all older device trees say that they want > * 10GBASE-R (aka XFI), so just convert it for them. > */ > if (mac_dev->phy_if == PHY_INTERFACE_MODE_XGMII) > mac_dev->phy_if = PHY_INTERFACE_MODE_10GBASER; > > So, what gets passed to the Aquantia PHY is PHY_INTERFACE_MODE_10GBASER, > even if in the device tree, phy-connection-type = "xgmii". > > Now, _if_ your patch were to be applied on top of upstream, it would > actually break fm1-mac9 like this (WARN_ON added by me for a call stack > of the phylink_validate() failure path): > > WARNING: CPU: 2 PID: 1 at drivers/net/phy/phylink.c:763 phylink_create+0x8f8/0x90c > Modules linked in: > CPU: 2 PID: 1 Comm: swapper/0 Tainted: G W 6.8.0-rc4-01058-g9e1deba407fb #1812 > Hardware name: LS1046A RDB Board (DT) > Call trace: > phylink_create+0x8f8/0x90c > dpaa_netdev_init+0x1a8/0x2c8 > dpaa_eth_probe+0xd70/0xf64 > platform_probe+0xa8/0xd0 > really_probe+0x130/0x2e4 > __driver_probe_device+0xa0/0x128 > driver_probe_device+0x3c/0x200 > __driver_attach+0xe8/0x1b4 > bus_for_each_dev+0xec/0x144 > driver_attach+0x24/0x30 > bus_add_driver+0x154/0x244 > driver_register+0x68/0x100 > __platform_driver_register+0x24/0x30 > dpaa_load+0x34/0x64 > do_one_initcall+0xf8/0x34c > ---[ end trace 0000000000000000 ]--- > fsl_dpaa_mac 1af0000.ethernet (unnamed net_device) (uninitialized): failed to validate link configuration for in-band status > fsl_dpaa_mac 1af0000.ethernet: error -EINVAL: Could not create phylink > fsl_dpa: probe of dpaa-ethernet.4 failed with error -22 > > It fails because of this in phylink_validate(): > > if (!test_bit(state->interface, interfaces)) > return -EINVAL; > > And state->interface (PHY_INTERFACE_MODE_10GBASER) is not in > mac_dev->phylink_config.supported_interfaces, because the fman_memac > code is not prepared to handle this combination. > > The device tree node for fm1-mac9 looks like this, generated by an > awkward merge between the following: > > ethernet@f0000 { > phy-connection-type = "xgmii"; // fsl-ls1046a-rdb.dts > local-mac-address = [...]; // U-Boot > cell-index = <0x8>; // qoriq-fman3-0-10g-0.dtsi > pcsphy-handle = <0x31>; // qoriq-fman3-0-10g-0.dtsi > compatible = "fsl,fman-memac"; // qoriq-fman3-0-10g-0.dtsi > reg = <0xf0000 0x1000>; // qoriq-fman3-0-10g-0.dtsi > phy-handle = <&aqr106_phy>; // fsl-ls1046a-rdb.dts > fsl,fman-ports = <0x2f 0x30>; // qoriq-fman3-0-10g-0.dtsi > }; > > Notice that unlike fm1-mac10 (node "ethernet@f2000"), there is no > pcs-handle-names property (fm1-mac10 has it defined in fsl-ls1046-post.dtsi, > whereas fm1-mac9 doesn't. Don't ask me why, I don't know....) > > So, knowing that of_property_match_string(mac_node, "pcs-handle-names", "xfi") > will return an error code for fm1-mac9, now please walk through memac_initialization() > and see what happens in the 2 cases: > > - mac_dev->phy_if == PHY_INTERFACE_MODE_XGMII (current device tree). The > code creates a default PCS and assigns it to memac->xfi_pcs like this: > if (err == -EINVAL || err == -ENODATA) > pcs = memac_pcs_create(mac_node, 0); > (...) > if (err && mac_dev->phy_if == PHY_INTERFACE_MODE_XGMII) > memac->xfi_pcs = pcs; > > - mac_dev->phy_if == PHY_INTERFACE_MODE_10GBASER (your modification). > The code will still create the default PCS, but assign it to > memac->sgmii_pcs instead! > > if (err && mac_dev->phy_if == PHY_INTERFACE_MODE_XGMII) // not XGMII, but 10GBASER! > memac->xfi_pcs = pcs; > else > memac->sgmii_pcs = pcs; > > And this is why, with a NULL memac->xfi_pcs, PHY_INTERFACE_MODE_10GBASER > will not be in phylink's supported_interfaces. > Admittedly, I had made the wrong assumption from looking at the dpaa driver code and fm1-mac10, that "xgmii" and "10gbase-r" were interchangeable. > To make your device tree patch work as intended with the current > mainline code, what you want is to also modify the driver like this: > > >From d6bda34b132d17d1c236d27436b9335fac22c062 Mon Sep 17 00:00:00 2001 > From: Vladimir Oltean <vladimir.oltean@xxxxxxx> > Date: Tue, 20 Feb 2024 16:12:27 +0200 > Subject: [PATCH] net: dpaa: fman_memac: accept phy-interface-type = > "10gbase-r" in the device tree > > We support the phy-interface-mode = "xgmii" conversion to "10gbase-r" > through code, but not actually through the device tree proper. This is > because boards such as LS1046A-RDB do not define pcs-handle-names in the > ethernet@f0000 device tree node, and the code only has fallback xfi_pcs > determination logic for "xgmii". > > By reversing the order between the fallback xfi_pcs assignment and the > "xgmii" overwrite with "10gbase-r", we are able to support both values > in the device tree, with identical behavior. > > Signed-off-by: Vladimir Oltean <vladimir.oltean@xxxxxxx> > --- > .../net/ethernet/freescale/fman/fman_memac.c | 18 +++++++++--------- > 1 file changed, 9 insertions(+), 9 deletions(-) > > diff --git a/drivers/net/ethernet/freescale/fman/fman_memac.c b/drivers/net/ethernet/freescale/fman/fman_memac.c > index e30bf75b1d48..0996759907a8 100644 > --- a/drivers/net/ethernet/freescale/fman/fman_memac.c > +++ b/drivers/net/ethernet/freescale/fman/fman_memac.c > @@ -1076,6 +1076,14 @@ int memac_initialization(struct mac_device *mac_dev, > unsigned long capabilities; > unsigned long *supported; > > + /* The internal connection to the serdes is XGMII, but this isn't > + * really correct for the phy mode (which is the external connection). > + * However, this is how all older device trees say that they want > + * 10GBASE-R (aka XFI), so just convert it for them. > + */ > + if (mac_dev->phy_if == PHY_INTERFACE_MODE_XGMII) > + mac_dev->phy_if = PHY_INTERFACE_MODE_10GBASER; > + > mac_dev->phylink_ops = &memac_mac_ops; > mac_dev->set_promisc = memac_set_promiscuous; > mac_dev->change_addr = memac_modify_mac_address; > @@ -1142,7 +1150,7 @@ int memac_initialization(struct mac_device *mac_dev, > * (and therefore that xfi_pcs cannot be set). If we are defaulting to > * XGMII, assume this is for XFI. Otherwise, assume it is for SGMII. > */ > - if (err && mac_dev->phy_if == PHY_INTERFACE_MODE_XGMII) > + if (err && mac_dev->phy_if == PHY_INTERFACE_MODE_10GBASER) > memac->xfi_pcs = pcs; > else > memac->sgmii_pcs = pcs; > @@ -1156,14 +1164,6 @@ int memac_initialization(struct mac_device *mac_dev, > goto _return_fm_mac_free; > } > > - /* The internal connection to the serdes is XGMII, but this isn't > - * really correct for the phy mode (which is the external connection). > - * However, this is how all older device trees say that they want > - * 10GBASE-R (aka XFI), so just convert it for them. > - */ > - if (mac_dev->phy_if == PHY_INTERFACE_MODE_XGMII) > - mac_dev->phy_if = PHY_INTERFACE_MODE_10GBASER; > - > /* TODO: The following interface modes are supported by (some) hardware > * but not by this driver: > * - 1000BASE-KX > > But! > > Device tree is stable ABI, and changes made to the device tree file are > meant to be backwards and forwards compatible with the code (it can be > provided separately and not necessarily in lockstep with the kernel > version. For example, I understand Arm SystemReady IR wants U-Boot to > provide its own device tree to Linux through EFI). So, in general, > device tree changes which only work with a corresponding kernel change > are frowned upon (unless maybe if the kernel change is a bug fix that is > backported to all relevant stable kernel branches). > > So at this stage we should take a step back and re-analyze the cost/benefit. > You said there is a stack trace in the Aquantia PHY driver, which there > is not (in current mainline kernels). I _think_ you are seeing the stack > trace with LSDK, which is currently distributed on top of linux-6.1.y > and has not yet integrated the fman_memac conversion to phylink - thus, > it does not contain commit 5d93cfcf7360 ("net: dpaa: Convert to phylink"). > At least, I do see this stack trace there. I think it can also be seen > with mainline kernels before the phylink conversion, but I did not test > those. > > The main take-away is: ALWAYS test the patch you are submitting to the > target kernel it is going to be applied to. Especially in the area of > device tree bindings for DPAA1, things are rarely as simple as they > appear :) If you don't, you will have an unintended negative effect > upon current mainline kernels (which must still work), and not the > intended effect upon LSDK (more below). This is my mistake, as at the time I had no way of testing this patch on 6.8. After looking at commit 5d93cfcf7360 ("net: dpaa: Convert to phylink"), I had made the (wrong) assumption that changing to "10gbase-r" was pretty harmless. > > There are a few options to go forward from here. > > As there is nothing broken in the mainline kernel where you are > submitting this patch, the simplest one, as bland as it may sound, is > simply to wait for a new LSDK release on top of linux-6.6.y. Even in > lf-6.1.y, AFAICS, nothing is broken except for the stack trace. You can > keep the patch in your local kernel tree copy to suppress that. > > The other option would be to submit the fman_memac change as a bug fix > for stable, wait for a while for it to have time to propagate, then > modify the device tree as well. But, it is much higher effort, and > there is no procedure in place, AFAIK, for LSDK to integrate your patch > (other than through upstream + a few months of waiting). The upcoming > LSDK release will be on top of linux-6.6.y, it will make your effort > irrelevant if it's only for suppressing the stack trace, and you are > racing against it. This path is only worth it if you have the dedication > to dig a bit deeper and tidy things up in the DPAA1 kernel support > (which would be appreciated though). > > _If_ you are using an older linux-stable branch but not LSDK, yes, the > feedback loop between your patch and its effect will close much sooner, > but you will have to convince the linux-stable people that it's worth > accepting your driver rework patches for a functional reason and not > just aesthetic (see Documentation/process/stable-kernel-rules.rst for > reference). Considering "xgmii" is a deprecated phy type for the AQR107 family, I still think there is some value to this change beyond aesthetics if support for "10gbase-r" is ever added.