On Mon, Jul 01, 2024 at 10:46:48AM +0000, Omer Shpigelman wrote: > On 6/30/24 16:29, Leon Romanovsky wrote: > > On Fri, Jun 28, 2024 at 10:24:32AM +0000, Omer Shpigelman wrote: > >> On 6/19/24 13:52, Leon Romanovsky wrote: > >>> On Wed, Jun 19, 2024 at 09:27:54AM +0000, Omer Shpigelman wrote: > >>>> On 6/18/24 15:58, Leon Romanovsky wrote: > >>>>> On Tue, Jun 18, 2024 at 11:08:34AM +0000, Omer Shpigelman wrote: > >>>>>> On 6/17/24 22:04, Leon Romanovsky wrote: > >>>>>>> [Some people who received this message don't often get email from leon@xxxxxxxxxx. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] > >>>>>>> > >>>>>>> On Mon, Jun 17, 2024 at 05:43:49PM +0000, Omer Shpigelman wrote: > >>>>>>>> On 6/13/24 22:18, Leon Romanovsky wrote: > >>>>>>>>> [Some people who received this message don't often get email from leon@xxxxxxxxxx. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] > >>>>>>>>> > >>>>>>>>> On Thu, Jun 13, 2024 at 11:22:04AM +0300, Omer Shpigelman wrote: > >>>>>>>>>> Add an RDMA driver of Gaudi ASICs family for AI scaling. > >>>>>>>>>> The driver itself is agnostic to the ASIC in action, it operates according > >>>>>>>>>> to the capabilities that were passed on device initialization. > >>>>>>>>>> The device is initialized by the hbl_cn driver via auxiliary bus. > >>>>>>>>>> The driver also supports QP resource tracking and port/device HW counters. > >>>>>>>>>> > >>>>>>>>>> Signed-off-by: Omer Shpigelman <oshpigelman@xxxxxxxxx> > >>>>>>>>>> Co-developed-by: Abhilash K V <kvabhilash@xxxxxxxxx> > >>>>>>>>>> Signed-off-by: Abhilash K V <kvabhilash@xxxxxxxxx> > >>>>>>>>>> Co-developed-by: Andrey Agranovich <aagranovich@xxxxxxxxx> > >>>>>>>>>> Signed-off-by: Andrey Agranovich <aagranovich@xxxxxxxxx> > >>>>>>>>>> Co-developed-by: Bharat Jauhari <bjauhari@xxxxxxxxx> > >>>>>>>>>> Signed-off-by: Bharat Jauhari <bjauhari@xxxxxxxxx> > >>>>>>>>>> Co-developed-by: David Meriin <dmeriin@xxxxxxxxx> > >>>>>>>>>> Signed-off-by: David Meriin <dmeriin@xxxxxxxxx> > >>>>>>>>>> Co-developed-by: Sagiv Ozeri <sozeri@xxxxxxxxx> > >>>>>>>>>> Signed-off-by: Sagiv Ozeri <sozeri@xxxxxxxxx> > >>>>>>>>>> Co-developed-by: Zvika Yehudai <zyehudai@xxxxxxxxx> > >>>>>>>>>> Signed-off-by: Zvika Yehudai <zyehudai@xxxxxxxxx> > >>>>>>>>> > >> > >> <...> > >> > >>>> mlx5 IB driver doesn't export any symbol that is used by the core driver, > >>>> that's why the core driver can be loaded without the IB driver (althought > >>>> you'll get circular dependency if you would export). > >>> > >>> Yes, IB and ETH drivers are "users" of core driver. As RDMA maintainer, > >>> I'm reluctant to accept code that exports symbols from IB drivers to > >>> other subsystems. We have drivers/infiniband/core/ for that. > >>> > >> > >> We need the core driver to access the IB driver (and to the ETH driver as > >> well). As you wrote, we can't use exported symbols from our IB driver nor > >> rely on function pointers, but what about providing the core driver an ops > >> structure? meaning exporting a register function from the core driver that > >> should be called by the IB driver during auxiliary device probe. > >> Something like: > >> > >> int hbl_cn_register_ib_aux_dev(struct auxiliary_device *adev, > >> struct hbl_ib_ops *ops) > >> { > >> ... > >> } > >> EXPORT_SYMBOL(hbl_cn_register_ib_aux_dev); > >> > >> That's how only the parent driver exports symbols to the son driver so the > >> IB driver is a "user" of the core driver and so we count on the internal > >> module reference counter. But we also get the ability to access the IB > >> driver from the core driver (to report a HW error for example). > > > > Before you are talking about solutions, please explain in technical > > terms why you absolutely need to access IB from core driver and any > > other possible way is not possible. > > > > Thanks > > First of all, as a general assumption, everything we do today can also be > done with unidirectional drivers communication only. If the parent driver > cannot access the son driver directly, then we can have a blocking command > queue on the parent side that the parent driver will push to it and the > son driver will fetch from it, execute the command and unblock the parent. > That will work but it adds complexity which I'm not sure that is needed. > The second point is not necessarily about the direction of the > communication but more about generally using function pointers rather than > exported symbols - we have 2 flavors of functions for inter driver > communications: common functions and ASIC specific functions. The ASIC > specific functions are exposed and initialized per ASIC. If we convert > them to EXPORT_SYMBOLs then we expose ASIC specific functions regardless > of the ASIC in action. > Again, that will work but seems unnecessary. We can check the ASIC type > that was passed in each exported function and fail if a wrong ASIC type > was used, but it seems to me like an incorrect approach to use exported > symbols for ASIC specific communication. EXPORT_SYMBOLs were meant to be > used for driver level communication, not for utilizing device specific > capabilities. For that, an ops struct seems more appropriate. > That's why I'm suggesting to combine both exported symbols and function > pointers. Thanks for the explanation. I understand your concerns, but I don't see any technical justification for the need to access IB driver from the core. Thanks