On Tue, Feb 27, 2024 at 12:49:06PM +0100, Marco Pagani wrote: > > > On 2024-02-21 15:37, Xu Yilun wrote: > > On Tue, Feb 20, 2024 at 12:11:26PM +0100, Marco Pagani wrote: > >> > >> > >> On 2024-02-18 11:05, Xu Yilun wrote: > >>> On Mon, Feb 05, 2024 at 06:47:34PM +0100, Marco Pagani wrote: > >>>> > >>>> > >>>> On 2024-02-04 06:15, Xu Yilun wrote: > >>>>> On Fri, Feb 02, 2024 at 06:44:01PM +0100, Marco Pagani wrote: > >>>>>> > >>>>>> > >>>>>> On 2024-01-30 05:31, Xu Yilun wrote: > >>>>>>>> +#define fpga_mgr_register_full(parent, info) \ > >>>>>>>> + __fpga_mgr_register_full(parent, info, THIS_MODULE) > >>>>>>>> struct fpga_manager * > >>>>>>>> -fpga_mgr_register_full(struct device *parent, const struct fpga_manager_info *info); > >>>>>>>> +__fpga_mgr_register_full(struct device *parent, const struct fpga_manager_info *info, > >>>>>>>> + struct module *owner); > >>>>>>>> > >>>>>>>> +#define fpga_mgr_register(parent, name, mops, priv) \ > >>>>>>>> + __fpga_mgr_register(parent, name, mops, priv, THIS_MODULE) > >>>>>>>> struct fpga_manager * > >>>>>>>> -fpga_mgr_register(struct device *parent, const char *name, > >>>>>>>> - const struct fpga_manager_ops *mops, void *priv); > >>>>>>>> +__fpga_mgr_register(struct device *parent, const char *name, > >>>>>>>> + const struct fpga_manager_ops *mops, void *priv, struct module *owner); > >>>>>>>> + > >>>>>>>> void fpga_mgr_unregister(struct fpga_manager *mgr); > >>>>>>>> > >>>>>>>> +#define devm_fpga_mgr_register_full(parent, info) \ > >>>>>>>> + __devm_fpga_mgr_register_full(parent, info, THIS_MODULE) > >>>>>>>> struct fpga_manager * > >>>>>>>> -devm_fpga_mgr_register_full(struct device *parent, const struct fpga_manager_info *info); > >>>>>>>> +__devm_fpga_mgr_register_full(struct device *parent, const struct fpga_manager_info *info, > >>>>>>>> + struct module *owner); > >>>>>>> > >>>>>>> Add a line here. I can do it myself if you agree. > >>>>>> > >>>>>> Sure, that is fine by me. I also spotted a typo in the commit log body > >>>>>> (in taken -> is taken). Do you want me to send a v6, or do you prefer > >>>>>> to fix that in place? > >>>>> > >>>>> No need, I can fix it. > >>>>> > >>>>>> > >>>>>>> > >>>>>>> There is still a RFC prefix for this patch. Are you ready to get it merged? > >>>>>>> If yes, Acked-by: Xu Yilun <yilun.xu@xxxxxxxxx> > >>>>>> > >>>>>> I'm ready for the patch to be merged. However, I recently sent an RFC > >>>>>> to propose a safer implementation of try_module_get() that would > >>>>>> simplify the code and may also benefit other subsystems. What do you > >>>>>> think? > >>>>>> > >>>>>> https://lore.kernel.org/linux-modules/20240130193614.49772-1-marpagan@xxxxxxxxxx/ > >>>>> > >>>>> I suggest take your fix to linux-fpga/for-next now. If your try_module_get() > >>>>> proposal is applied before the end of this cycle, we could re-evaluate > >>>>> this patch. > >>>> > >>>> That's fine by me. > >>> > >>> Sorry, I still found issues about this solution. > >>> > >>> void fpga_mgr_unregister(struct fpga_manager *mgr) > >>> { > >>> dev_info(&mgr->dev, "%s %s\n", __func__, mgr->name); > >>> > >>> /* > >>> * If the low level driver provides a method for putting fpga into > >>> * a desired state upon unregister, do it. > >>> */ > >>> fpga_mgr_fpga_remove(mgr); > >>> > >>> mutex_lock(&mgr->mops_mutex); > >>> > >>> mgr->mops = NULL; > >>> > >>> mutex_unlock(&mgr->mops_mutex); > >>> > >>> device_unregister(&mgr->dev); > >>> } > >>> > >>> Note that fpga_mgr_unregister() doesn't have to be called in module_exit(). > >>> So if we do fpga_mgr_get() then fpga_mgr_unregister(), We finally had a > >>> fpga_manager dev without mops, this is not what the user want and cause > >>> problem when using this fpga_manager dev for other FPGA APIs. > >> > >> How about moving mgr->mops = NULL from fpga_mgr_unregister() to > >> class->dev_release()? In that way, mops will be set to NULL only when the > >> manager dev refcount reaches 0. > > > > I'm afraid it doesn't help. The lifecycle of the module and the fpga > > mgr dev is different. > > > > We use mops = NULL to indicate module has been freed or will be freed in no > > time. On the other hand mops != NULL means module is still there, so > > that try_module_get() could be safely called. It is possible someone > > has got fpga mgr dev but not the module yet, at that time the module is > > unloaded, then try_module_get() triggers crash. > > > >> > >> If fpga_mgr_unregister() is called from module_exit(), we are sure that nobody > >> got the manager dev earlier using fpga_mgr_get(), or it would have bumped up > > > > No, someone may get the manager dev but not the module yet, and been > > scheduled out. > > > > You are right. Overall, it's a bad idea. How about then using an additional > bool flag instead of "overloading" the mops pointer? Something like: > > get: > if (!mgr->owner_valid || !try_module_get(mgr->mops_owner)) > > remove: > mgr->owner_valid = false; I'm not quite sure which function is actually mentioned by "remove". I assume it should be fpga_mgr_unregister(). IIUC this flag means no more reference to fpga mgr, but existing references are still valid. It works for me. But the name of this flag could be reconsidered to avoid misunderstanding. The owner is still valid (we still need to put the owner) but allows no more reference. Maybe "owner_inactive"? I still wanna this owner reference change been splitted, so that we could simply revert it when the try_module_get_safe() got accepted. Thanks, Yilun > > Another possibility that comes to my mind would be to "overload" the owner > pointer itself by using the ERR_PTR/IS_ERR macros. However, it looks ugly > to me. > > Thanks, > Marco > > > [...] >