Hi Russell,

thanks for going into my patch in depth and taking the time for an
elaborate and constructive answer.

On Wed, Dec 06, 2023 at 08:23:05PM +0000, Russell King (Oracle) wrote:
> On Wed, Dec 06, 2023 at 07:52:23PM +0000, Daniel Golle wrote:
> > On Wed, Dec 06, 2023 at 06:55:50PM +0000, Russell King (Oracle) wrote:
> > > On Wed, Dec 06, 2023 at 01:45:17AM +0000, Daniel Golle wrote:
> > > > @@ -516,6 +538,21 @@ static struct phylink_pcs *mtk_mac_select_pcs(struct phylink_config *config,
> > > >  	struct mtk_eth *eth = mac->hw;
> > > >  	unsigned int sid;
> > > >
> > > > +	if (mtk_is_netsys_v3_or_greater(eth)) {
> > > > +		switch (interface) {
> > > > +		case PHY_INTERFACE_MODE_1000BASEX:
> > > > +		case PHY_INTERFACE_MODE_2500BASEX:
> > > > +		case PHY_INTERFACE_MODE_SGMII:
> > > > +			return mtk_pcs_lynxi_select_pcs(mac->sgmii_pcs_of_node, interface);
> > > > +		case PHY_INTERFACE_MODE_5GBASER:
> > > > +		case PHY_INTERFACE_MODE_10GBASER:
> > > > +		case PHY_INTERFACE_MODE_USXGMII:
> > > > +			return mtk_usxgmii_select_pcs(mac->usxgmii_pcs_of_node, interface);
> > >
> > > From what I can see, neither of these two "select_pcs" methods that
> > > you're calling makes any use of the "interface" you pass to them.
> > > I'm not sure what they _could_ do with it either, given that what
> > > you're effectively doing here is getting the phylink_pcs structure from
> > > the driver, and each one only has a single phylink_pcs.
> >
> > Yes, you are right, the interface parameter isn't used, I will drop
> > it from both mtk_*_select_pcs() prototypes.
> >
> > In the long run we may want something like
> > struct phylink_pcs *of_pcs_get(struct device_node *np, phy_interface_t interface)
> > provided by a to-be-built drivers/net/pcs/core.c...
>
> Again... it's not as simple as that. As soon as we get into the
> situation that some _other_ driver becomes responsible for providing
> the struct phylink_pcs pointer, we _then_ need to have some way of
> dealing with that device going away.

I assume you are referring to struct phylink_pcs *of_pcs_get(...), right?
And true, it'd be quite a complex piece of infrastructure.

>
> By that I mean that the memory pointed to returned from such a function
> that you are proposing above could be freed - or worse could be unmapped
> from the kernel address space, and the same goes for the operations
> structure as well - even more so if the "ops" are part of module data
> and the module is unloaded.
>
> As I know how these discussions go (it's not my first time bringing up
> these kinds of multi-driver interations), no, locking the module into
> memory doesn't work, and shows a lack of a full understanding of the
> problem.
>
> We need to have a way that when a PCS device is removed, that is
> propagated up the management levels and causes the PCS to be gracefully
> removed from the network driver (in other words, from phylink).

I see -- spontaneous removal of the PCS may not be a practical problem on
that very hardware, but from the driver model point of view, it is.
Should the callback for removal be implemented as part of the network
driver, or are you suggesting adding such infrastructure to phylink?

>
> I won't accept a hack that sticky-plasters around the problem - not for
> code that I am involved in actively maintaining - and sticky-plastering
> around this class of problem seems to happen all too often in the
> kernel.

Oh yes, I can see that every day when testing various SFP modules with
exposed PHYs -- and I often get stackdumps thrown at me when I remove
them...