On 2024-02-21 15:37, Xu Yilun wrote: > On Tue, Feb 20, 2024 at 12:11:26PM +0100, Marco Pagani wrote: >> >> >> On 2024-02-18 11:05, Xu Yilun wrote: >>> On Mon, Feb 05, 2024 at 06:47:34PM +0100, Marco Pagani wrote: >>>> >>>> >>>> On 2024-02-04 06:15, Xu Yilun wrote: >>>>> On Fri, Feb 02, 2024 at 06:44:01PM +0100, Marco Pagani wrote: >>>>>> >>>>>> >>>>>> On 2024-01-30 05:31, Xu Yilun wrote: >>>>>>>> +#define fpga_mgr_register_full(parent, info) \ >>>>>>>> + __fpga_mgr_register_full(parent, info, THIS_MODULE) >>>>>>>> struct fpga_manager * >>>>>>>> -fpga_mgr_register_full(struct device *parent, const struct fpga_manager_info *info); >>>>>>>> +__fpga_mgr_register_full(struct device *parent, const struct fpga_manager_info *info, >>>>>>>> + struct module *owner); >>>>>>>> >>>>>>>> +#define fpga_mgr_register(parent, name, mops, priv) \ >>>>>>>> + __fpga_mgr_register(parent, name, mops, priv, THIS_MODULE) >>>>>>>> struct fpga_manager * >>>>>>>> -fpga_mgr_register(struct device *parent, const char *name, >>>>>>>> - const struct fpga_manager_ops *mops, void *priv); >>>>>>>> +__fpga_mgr_register(struct device *parent, const char *name, >>>>>>>> + const struct fpga_manager_ops *mops, void *priv, struct module *owner); >>>>>>>> + >>>>>>>> void fpga_mgr_unregister(struct fpga_manager *mgr); >>>>>>>> >>>>>>>> +#define devm_fpga_mgr_register_full(parent, info) \ >>>>>>>> + __devm_fpga_mgr_register_full(parent, info, THIS_MODULE) >>>>>>>> struct fpga_manager * >>>>>>>> -devm_fpga_mgr_register_full(struct device *parent, const struct fpga_manager_info *info); >>>>>>>> +__devm_fpga_mgr_register_full(struct device *parent, const struct fpga_manager_info *info, >>>>>>>> + struct module *owner); >>>>>>> >>>>>>> Add a line here. I can do it myself if you agree. >>>>>> >>>>>> Sure, that is fine by me. I also spotted a typo in the commit log body >>>>>> (in taken -> is taken). Do you want me to send a v6, or do you prefer >>>>>> to fix that in place? >>>>> >>>>> No need, I can fix it. >>>>> >>>>>> >>>>>>> >>>>>>> There is still a RFC prefix for this patch. Are you ready to get it merged? >>>>>>> If yes, Acked-by: Xu Yilun <yilun.xu@xxxxxxxxx> >>>>>> >>>>>> I'm ready for the patch to be merged. However, I recently sent an RFC >>>>>> to propose a safer implementation of try_module_get() that would >>>>>> simplify the code and may also benefit other subsystems. What do you >>>>>> think? >>>>>> >>>>>> https://lore.kernel.org/linux-modules/20240130193614.49772-1-marpagan@xxxxxxxxxx/ >>>>> >>>>> I suggest take your fix to linux-fpga/for-next now. If your try_module_get() >>>>> proposal is applied before the end of this cycle, we could re-evaluate >>>>> this patch. >>>> >>>> That's fine by me. >>> >>> Sorry, I still found issues about this solution. >>> >>> void fpga_mgr_unregister(struct fpga_manager *mgr) >>> { >>> dev_info(&mgr->dev, "%s %s\n", __func__, mgr->name); >>> >>> /* >>> * If the low level driver provides a method for putting fpga into >>> * a desired state upon unregister, do it. >>> */ >>> fpga_mgr_fpga_remove(mgr); >>> >>> mutex_lock(&mgr->mops_mutex); >>> >>> mgr->mops = NULL; >>> >>> mutex_unlock(&mgr->mops_mutex); >>> >>> device_unregister(&mgr->dev); >>> } >>> >>> Note that fpga_mgr_unregister() doesn't have to be called in module_exit(). >>> So if we do fpga_mgr_get() then fpga_mgr_unregister(), We finally had a >>> fpga_manager dev without mops, this is not what the user want and cause >>> problem when using this fpga_manager dev for other FPGA APIs. >> >> How about moving mgr->mops = NULL from fpga_mgr_unregister() to >> class->dev_release()? In that way, mops will be set to NULL only when the >> manager dev refcount reaches 0. > > I'm afraid it doesn't help. The lifecycle of the module and the fpga > mgr dev is different. > > We use mops = NULL to indicate module has been freed or will be freed in no > time. On the other hand mops != NULL means module is still there, so > that try_module_get() could be safely called. It is possible someone > has got fpga mgr dev but not the module yet, at that time the module is > unloaded, then try_module_get() triggers crash. > >> >> If fpga_mgr_unregister() is called from module_exit(), we are sure that nobody >> got the manager dev earlier using fpga_mgr_get(), or it would have bumped up > > No, someone may get the manager dev but not the module yet, and been > scheduled out. > You are right. Overall, it's a bad idea. How about then using an additional bool flag instead of "overloading" the mops pointer? Something like: get: if (!mgr->owner_valid || !try_module_get(mgr->mops_owner)) remove: mgr->owner_valid = false; Another possibility that comes to my mind would be to "overload" the owner pointer itself by using the ERR_PTR/IS_ERR macros. However, it looks ugly to me. Thanks, Marco [...]