On 7/17/24 15:33, Leon Romanovsky wrote: > On Wed, Jul 17, 2024 at 10:51:03AM +0000, Omer Shpigelman wrote: >> On 7/17/24 10:36, Leon Romanovsky wrote: >>> On Wed, Jul 17, 2024 at 07:08:59AM +0000, Omer Shpigelman wrote: >>>> On 7/16/24 16:40, Jason Gunthorpe wrote: >>>>> On Sun, Jul 14, 2024 at 10:18:12AM +0000, Omer Shpigelman wrote: >>>>>> On 7/12/24 16:08, Jason Gunthorpe wrote: >>>>>>> [You don't often get email from jgg@xxxxxxxx. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] >>>>>>> >>>>>>> On Fri, Jun 28, 2024 at 10:24:32AM +0000, Omer Shpigelman wrote: >>>>>>> >>>>>>>> We need the core driver to access the IB driver (and to the ETH driver as >>>>>>>> well). As you wrote, we can't use exported symbols from our IB driver nor >>>>>>>> rely on function pointers, but what about providing the core driver an ops >>>>>>>> structure? meaning exporting a register function from the core driver that >>>>>>>> should be called by the IB driver during auxiliary device probe. >>>>>>>> Something like: >>>>>>>> >>>>>>>> int hbl_cn_register_ib_aux_dev(struct auxiliary_device *adev, >>>>>>>> struct hbl_ib_ops *ops) >>>>>>>> { >>>>>>>> ... >>>>>>>> } >>>>>>>> EXPORT_SYMBOL(hbl_cn_register_ib_aux_dev); >>>>>>> >>>>>>> Definately do not do some kind of double-register like this. >>>>>>> >>>>>>> The auxiliary_device scheme can already be extended to provide ops for >>>>>>> each sub device. >>>>>>> >>>>>>> Like >>>>>>> >>>>>>> struct habana_driver { >>>>>>> struct auxiliary_driver base; >>>>>>> const struct habana_ops *ops; >>>>>>> }; >>>>>>> >>>>>>> If the ops are justified or not is a different question. >>>>>>> >>>>>> >>>>>> Well, I suggested this double-register option because I got a comment that >>>>>> the design pattern of embedded ops structure shouldn't be used. >>>>>> So I'm confused now... >>>>> >>>>> Yeah, don't stick ops in random places, but the device_driver is the >>>>> right place. >>>>> >>>> >>>> Sorry, let me explain again. My original code has an ops structure >>>> exactly like you are suggesting now (see struct hbl_aux_dev in the first >>>> patch of the series). But I was instructed not to use this ops structure >>>> and to rely on exported symbols for inter-driver communication. >>>> I'll be happy to use this ops structure like in your example rather than >>>> converting my code to use exported symbols. >>>> Leon - am I missing anything? what's the verdict here? >>> >>> You are missing the main sentence from Jason's response: "don't stick ops in random places". >>> >>> It is fine to have ops in device driver, so the core driver can call them. However, in your >>> original code, you added ops everywhere. It caused to the need to implement module reference >>> counting and crazy stuff like calls to lock and unlock functions from the aux driver to the core. >>> >>> Verdict is still the same. Core driver should provide EXPORT_SYMBOLs, so the aux driver can call >>> them directly and enjoy from proper module loading and unloading. >>> >>> The aux driver can have ops in the device driver, so the core driver can call them to perform something >>> specific for that aux driver. >>> >>> Calls between aux drivers should be done via the core driver. >>> >>> Thanks >> >> The only place we have an ops structure is in the device driver, >> similarly to Jason's example. In our code it is struct hbl_aux_dev. What >> other random places did you see? > > This is exactly random place. > > I suggest you to take time, learn how existing drivers in netdev and > RDMA uses auxbus infrastructure and follow the same pattern. There are > many examples already in the kernel. > > And no, if you do everything right, you won't need custom module > reference counting and other hacks. There is nothing special in your > device/driver which requires special treatment. > > Thanks How come it is a random place? Look at irdma/i40e - they have an ops struct (struct i40e_ops) embedded in their shared aux struct (struct i40e_auxiliary_device) which is the wrapper of the base aux struct (struct auxiliary_device). This is very similar to what we have - a pointer to an ops struct (void *aux_ops) embedded in our shared aux struct (struct hbl_aux_dev) which is the wrapper of the base struct (struct auxiliary_device). The only difference is that they put their ops struct inside some info struct (struct i40e_info) and we have a separate pointer for that info (void *aux_data). In addition, they use the ops struct for accessing the net driver from the RDMA driver, meaning son-to-parent communication instead of having an exported symbol e.g. i40e_client_request_reset(). They have only a single operation as EXPORT_SYMBOL function for (un)registering the son - i40e_client_device_register() and i40e_client_device_unregister(). So what is the problem with how we implemented it?