Hi netdev, we are still stale in adding pcs driver here...can you please show us the way to go?? daniel posted another attempt but this seems also stale. https://patchwork.kernel.org/project/netdevbpf/patch/ba4e359584a6b3bc4b3470822c42186d5b0856f9.1721910728.git.daniel@xxxxxxxxxxxxxx/ @Angelo/Matthias, can you help solving this? left full quote below.... regards Frank > Gesendet: Mittwoch, 10. Juli 2024 um 13:17 Uhr > Von: "Frank Wunderlich" <frank-w@xxxxxxxxxxxxxxx> > An: "Russell King (Oracle)" <linux@xxxxxxxxxxxxxxx> > Cc: "David S. Miller" <davem@xxxxxxxxxxxxx>, "Eric Dumazet" <edumazet@xxxxxxxxxx>, "Jakub Kicinski" <kuba@xxxxxxxxxx>, "Paolo Abeni" <pabeni@xxxxxxxxxx>, "Rob Herring" <robh+dt@xxxxxxxxxx>, "Krzysztof Kozlowski" <krzysztof.kozlowski+dt@xxxxxxxxxx>, "Conor Dooley" <conor+dt@xxxxxxxxxx>, "Chunfeng Yun" <chunfeng.yun@xxxxxxxxxxxx>, "Vinod Koul" <vkoul@xxxxxxxxxx>, "Kishon Vijay Abraham I" <kishon@xxxxxxxxxx>, "Felix Fietkau" <nbd@xxxxxxxx>, "John Crispin" <john@xxxxxxxxxxx>, "Sean Wang" <sean.wang@xxxxxxxxxxxx>, "Mark Lee" <Mark-MC.Lee@xxxxxxxxxxxx>, "Lorenzo Bianconi" <lorenzo@xxxxxxxxxx>, "Matthias Brugger" <matthias.bgg@xxxxxxxxx>, "AngeloGioacchino Del Regno" <angelogioacchino.delregno@xxxxxxxxxxxxx>, "Andrew Lunn" <andrew@xxxxxxx>, "Heiner Kallweit" <hkallweit1@xxxxxxxxx>, "Alexander Couzens" <lynxis@xxxxxxx>, "Qingfang Deng" <dqfext@xxxxxxxxx>, "SkyLake Huang" <SkyLake.Huang@xxxxxxxxxxxx>, "Philipp Zabel" <p.zabel@xxxxxxxxxxxxxx>, netdev@xxxxxxxxxxxxxxx, devicetree@xxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx, linux-arm-kernel@xxxxxxxxxxxxxxxxxxx, linux-mediatek@xxxxxxxxxxxxxxxxxxx, linux-phy@xxxxxxxxxxxxxxxxxxx, "Daniel Golle" <daniel@xxxxxxxxxxxxxx> > Betreff: Re: [RFC PATCH net-next v3 3/8] net: pcs: pcs-mtk-lynxi: add platform driver for MT7988 > > Hi Russel / netdev > > how can we proceed here? development for mt7988/Bpi-R4 is stalled here > because it is unclear what's the right way to go... > > We want to avoid much work adding a complete new framework for a "small" > change (imho this affects all SFP-like sub-devices and their linking to > phylink/pcs driver)...if this really the way to go then some boundary > conditions will be helpful. > > regards Frank > > Am 22.04.24 um 18:24 schrieb Daniel Golle: > > Hi Russell, > > Hi netdev crew, > > > > I haven't received any reply for this email and am still waiting for > > clarification regarding how inter-driver dependencies should be modelled > > in that case of mtk_eth_soc. My search for good examples for that in the > > kernel code was quite frustrating and all I've found are supposedly bugs > > of that exact cathegory. > > > > Please see my questions mentioned in the previous email I've sent and > > also a summary of suggested solutions inline below: > > > > On Wed, Feb 07, 2024 at 01:29:18AM +0000, Daniel Golle wrote: > >> Hi Russell, > >> > >> sorry for the extended time it took me to get back to this patch and > >> the comments you made for it. Understanding the complete scope of the > >> problem took a while (plus there were holidays and other fun things), > >> and also brought up further questions which I hope you can help me > >> find good answers for, see below: > >> > >> On Wed, Dec 13, 2023 at 04:04:12PM +0000, Russell King (Oracle) wrote: > >>> On Tue, Dec 12, 2023 at 03:47:18AM +0000, Daniel Golle wrote: > >>>> Introduce a proper platform MFD driver for the LynxI (H)SGMII PCS which > >>>> is going to initially be used for the MT7988 SoC. > >>>> > >>>> Signed-off-by: Daniel Golle <daniel@xxxxxxxxxxxxxx> > >>> I made some specific suggestions about what I wanted to see for > >>> "getting" PCS in the previous review, and I'm disappointed that this > >>> patch set is still inventing its own solution. > >>> > >>>> +struct phylink_pcs *mtk_pcs_lynxi_get(struct device *dev, struct device_node *np) > >>>> +{ > >>>> + struct platform_device *pdev; > >>>> + struct mtk_pcs_lynxi *mpcs; > >>>> + > >>>> + if (!np) > >>>> + return NULL; > >>>> + > >>>> + if (!of_device_is_available(np)) > >>>> + return ERR_PTR(-ENODEV); > >>>> + > >>>> + if (!of_match_node(mtk_pcs_lynxi_of_match, np)) > >>>> + return ERR_PTR(-EINVAL); > >>>> + > >>>> + pdev = of_find_device_by_node(np); > >>>> + if (!pdev || !platform_get_drvdata(pdev)) { > >>> This is racy - as I thought I described before, userspace can unbind > >>> the device in one thread, while another thread is calling this > >>> function. With just the right timing, this check succeeds, but... > >>> > >>>> + if (pdev) > >>>> + put_device(&pdev->dev); > >>>> + return ERR_PTR(-EPROBE_DEFER); > >>>> + } > >>>> + > >>>> + mpcs = platform_get_drvdata(pdev); > >>> mpcs ends up being read as NULL here. Even if you did manage to get a > >>> valid pointer, "mpcs" being devm-alloced could be freed from under > >>> you at this point... > >>> > >>>> + device_link_add(dev, mpcs->dev, DL_FLAG_AUTOREMOVE_CONSUMER); > >>> resulting in this accessing memory which has been freed. > >>> > >>> The solution would be either to suppress the bind/unbind attributes > >>> (provided the underlying struct device can't go away, which probably > >>> also means ensuring the same of the MDIO bus. Aternatively, adding > >>> a lock around the remove path and around the checking of > >>> platform_get_drvdata() down to adding the device link would probably > >>> solve it. > >>> > >>> However, I come back to my general point - this kind of stuff is > >>> hairy. Do we want N different implementations of it in various drivers > >>> with subtle bugs, or do we want _one_ implemenatation. > >>> > >>> If we go with the one implemenation approach, then we need to think > >>> about whether we should be using device links or not. The problem > >>> could be for network interfaces where one struct device is > >>> associated with multiple network interfaces. Using device links has > >>> the unfortunate side effect that if the PCS for one of those network > >>> interfaces is removed, _all_ network interfaces disappear. > >> I agree, esp. on this MT7988 removal of a PCS which may then not > >> even be in use also impairs connectivity on the built-in gigE DSA > >> switch. It would be nice to try to avoid this. > >> > >>> My original suggestion was to hook into phylink to cause that to > >>> take the link down when an in-use PCS gets removed. > >> I took a deep dive into how this could be done correctly and how > >> similar things are done for other drivers. Apart from the PCS there > >> often also is a muxing-PHY involved, eg. MediaTek's XFI T-PHY in this > >> case (previously often called "pextp" for no apparent reason) or > >> Marvell's comphy (mvebu-comphy). > >> > >> So let's try something simple on an already well-supported platform, > >> I thought and grabbed Marvell Armada CN9131-DB running vanilla Linux, > >> and it didn't even take some something racy, but plain: > >> > >> ip link set eth6 up > >> cd /sys/bus/platform/drivers/mvebu-comphy > >> echo f2120000.phy > unbind > >> echo f4120000.phy > unbind > >> echo f6120000.phy > unbind > >> ip link set eth6 down > >> > >> > >> That was enough. The result is a kernel crash, and the same should > >> apply on popular platforms like the SolidRun MACCHIATOBin and other > >> similar boards. > >> > >> So this gets me to think that there is a wider problem around > >> non-phylink-managed resources which may disappear while in use by > >> network drivers, and I guess that the same applies in many other > >> places. I don't have a SATA drive connected to that Marvell board, but > >> I can imagine what happens when suddenly removing the comphy instance > >> in charge of the SATA link and then a subsequent SATA hotplug event > >> happens or stuff like that... > >> > >> Anyway, back to network subsystem and phylink: > >> > >> Do you agree that we need a way to register (and unregister) PCS > >> providers with phylink, so we don't need *_get_by_of_node implementations > >> in each driver? If so, can you sketch out what the basic requirements > >> for an acceptable solution would be? > >> > >> Also, do you agree that lack of handling of disappearing resources, > >> such as PHYs (*not* network PHYs, but PHYs as in drivers/phy/*) or > >> syscons, is already a problem right now which should be addressed? > >> > >> If you imagine phylink to take care of the life-cycle of all link- > >> resources, what is vision about those things other than classic MDIO- > >> connected PHYs? > >> > >> And well, of course, the easy fix for the example above would be: > >> diff --git a/drivers/phy/marvell/phy-mvebu-cp110-comphy.c b/drivers/phy/marvell/phy-mvebu-cp110-comphy.c > >> index b0dd133665986..9517c96319595 100644 > >> --- a/drivers/phy/marvell/phy-mvebu-cp110-comphy.c > >> +++ b/drivers/phy/marvell/phy-mvebu-cp110-comphy.c > >> @@ -1099,6 +1099,7 @@ static const struct of_device_id mvebu_comphy_of_match_table[] = { > >> MODULE_DEVICE_TABLE(of, mvebu_comphy_of_match_table); > >> > >> static struct platform_driver mvebu_comphy_driver = { > >> + .suppress_bind_attrs = true, > >> .probe = mvebu_comphy_probe, > >> .driver = { > >> .name = "mvebu-comphy", > >> > >> But that should then apply to every single driver in drivers/phy/... > >> > > My remaining questions are: > > - Is it ok to just use .suppress_bind_attrs = true for PCS and PHY > > drivers to avoid having to care out hot-removal? > > Those components are anyway built-into the SoC so removing them > > is physically not possible. > > > > - In case of driver removal (rmmod -f) imho using a device-link would > > be sufficient to prevent the worst. However, we would have to live > > with the all-or-nothing nature of that approach, ie. once e.g. the > > USXGMII driver is being removed, *all* Ethernet interfaces would > > disappear with it (even those not actually using USXGMII). > > > > If the solutions mentioned above do not sound agreeable, please suggest > > how a good solution would look like, ideally also share an example of > > any driver in the kernel where this is done in the way you would like > > to have things done. > > > >>>> + > >>>> + return &mpcs->pcs; > >>>> +} > >>>> +EXPORT_SYMBOL(mtk_pcs_lynxi_get); > >>>> + > >>>> +void mtk_pcs_lynxi_put(struct phylink_pcs *pcs) > >>>> +{ > >>>> + struct mtk_pcs_lynxi *cur, *mpcs = NULL; > >>>> + > >>>> + if (!pcs) > >>>> + return; > >>>> + > >>>> + mutex_lock(&instance_mutex); > >>>> + list_for_each_entry(cur, &mtk_pcs_lynxi_instances, node) > >>>> + if (pcs == &cur->pcs) { > >>>> + mpcs = cur; > >>>> + break; > >>>> + } > >>>> + mutex_unlock(&instance_mutex); > >>> I don't see what this loop gains us, other than checking that the "pcs" > >>> is still on the list and hasn't already been removed. If that is all > >>> that this is about, then I would suggest: > >>> > >>> bool found = false; > >>> > >>> if (!pcs) > >>> return; > >>> > >>> mpcs = pcs_to_mtk_pcs_lynxi(pcs); > >>> mutex_lock(&instance_mutex); > >>> list_for_each_entry(cur, &mtk_pcs_lynxi_instances, node) > >>> if (cur == mpcs) { > >>> found = true; > >>> break; > >>> } > >>> mutex_unlock(&instance_mutex); > >>> > >>> if (WARN_ON(!found)) > >>> return; > >>> > >>> which makes it more obvious why this exists. > >> The idea was not only to make sure it hasn't been removed, but also > >> to be sure that what ever the private data pointer points to has > >> actually been created by that very driver. > >> > >> But yes, doing it in the way you suggest will work in the same way, > >> just when having my idea in mind it looks more fishy to use > >> pcs_to_mtk_pcs_lynxi() before we are 100% sure that what we dealing > >> with has previously been created by this driver. > >> > >> > >> Cheers > >> > >> > >> Daniel > >> >