On Tue, Dec 12, 2023 at 03:47:18AM +0000, Daniel Golle wrote:
> Introduce a proper platform MFD driver for the LynxI (H)SGMII PCS which
> is going to initially be used for the MT7988 SoC.
>
> Signed-off-by: Daniel Golle <daniel@xxxxxxxxxxxxxx>

I made some specific suggestions about what I wanted to see for
"getting" PCS in the previous review, and I'm disappointed that this
patch set is still inventing its own solution.

> +struct phylink_pcs *mtk_pcs_lynxi_get(struct device *dev, struct device_node *np)
> +{
> +	struct platform_device *pdev;
> +	struct mtk_pcs_lynxi *mpcs;
> +
> +	if (!np)
> +		return NULL;
> +
> +	if (!of_device_is_available(np))
> +		return ERR_PTR(-ENODEV);
> +
> +	if (!of_match_node(mtk_pcs_lynxi_of_match, np))
> +		return ERR_PTR(-EINVAL);
> +
> +	pdev = of_find_device_by_node(np);
> +	if (!pdev || !platform_get_drvdata(pdev)) {

This is racy - as I thought I described before, userspace can unbind
the device in one thread, while another thread is calling this
function. With just the right timing, this check succeeds, but...

> +		if (pdev)
> +			put_device(&pdev->dev);
> +		return ERR_PTR(-EPROBE_DEFER);
> +	}
> +
> +	mpcs = platform_get_drvdata(pdev);

mpcs ends up being read as NULL here. Even if you did manage to get a
valid pointer, "mpcs" being devm-alloced could be freed from under
you at this point...

> +	device_link_add(dev, mpcs->dev, DL_FLAG_AUTOREMOVE_CONSUMER);

resulting in this accessing memory which has been freed.

The solution would be either to suppress the bind/unbind attributes
(provided the underlying struct device can't go away, which probably
also means ensuring the same of the MDIO bus). Alternatively, adding
a lock around the remove path and around the checking of
platform_get_drvdata() down to adding the device link would probably
solve it.

However, I come back to my general point - this kind of stuff is
hairy. Do we want N different implementations of it in various drivers
with subtle bugs, or do we want _one_ implementation?

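As a rough illustration of the locking alternative mentioned above, here
is a userspace model using pthreads. It is not kernel code: every name in
it (fake_dev, fake_pcs, fake_get, fake_remove) is hypothetical, "links"
merely stands in for device_link_add(), and a NULL return stands in for
-EPROBE_DEFER. The point is only that drvdata is read once, and the check
and the link creation happen in the same critical section that the remove
path also takes.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names come from the patch. */
struct fake_pcs {
	int id;
};

struct fake_dev {
	pthread_mutex_t lock;		/* serialises the get path against remove */
	struct fake_pcs *drvdata;	/* NULL once the driver has been unbound */
	int links;			/* stands in for device_link_add() */
};

/* Remove path: clear drvdata under the lock so get never sees a stale pointer. */
static void fake_remove(struct fake_dev *d)
{
	pthread_mutex_lock(&d->lock);
	d->drvdata = NULL;
	pthread_mutex_unlock(&d->lock);
}

/*
 * Get path: read drvdata exactly once, then check it and create the link
 * without dropping the lock in between.
 */
static struct fake_pcs *fake_get(struct fake_dev *d)
{
	struct fake_pcs *pcs;

	pthread_mutex_lock(&d->lock);
	pcs = d->drvdata;
	if (pcs)
		d->links++;
	pthread_mutex_unlock(&d->lock);

	return pcs;	/* NULL models the -EPROBE_DEFER case */
}
```

This only closes the window between the drvdata check and the link being
added. It says nothing about what happens to the consumer once the PCS
goes away later, which is the device link question.
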
If we go with the one implementation approach, then we need to think
about whether we should be using device links or not. The problem
could be for network interfaces where one struct device is associated
with multiple network interfaces. Using device links has the
unfortunate side effect that if the PCS for one of those network
interfaces is removed, _all_ network interfaces disappear.

My original suggestion was to hook into phylink to cause it to take
the link down when an in-use PCS gets removed.

> +
> +	return &mpcs->pcs;
> +}
> +EXPORT_SYMBOL(mtk_pcs_lynxi_get);
> +
> +void mtk_pcs_lynxi_put(struct phylink_pcs *pcs)
> +{
> +	struct mtk_pcs_lynxi *cur, *mpcs = NULL;
> +
> +	if (!pcs)
> +		return;
> +
> +	mutex_lock(&instance_mutex);
> +	list_for_each_entry(cur, &mtk_pcs_lynxi_instances, node)
> +		if (pcs == &cur->pcs) {
> +			mpcs = cur;
> +			break;
> +		}
> +	mutex_unlock(&instance_mutex);

I don't see what this loop gains us, other than checking that the "pcs"
is still on the list and hasn't already been removed. If that is all
that this is about, then I would suggest:

	bool found = false;

	if (!pcs)
		return;

	mpcs = pcs_to_mtk_pcs_lynxi(pcs);
	mutex_lock(&instance_mutex);
	list_for_each_entry(cur, &mtk_pcs_lynxi_instances, node)
		if (cur == mpcs) {
			found = true;
			break;
		}
	mutex_unlock(&instance_mutex);

	if (WARN_ON(!found))
		return;

which makes it more obvious why this exists.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!