On Mon, 27 May 2024 19:10:55 +0300 Ido Schimmel wrote:
> On Wed, May 22, 2024 at 07:22:12AM -0700, Jakub Kicinski wrote:
> > On Wed, 22 May 2024 13:56:11 +0000 Danielle Ratson wrote:
> > > The event should match the below:
> > > event == NETLINK_URELEASE && notify->protocol == NETLINK_GENERIC
> > >
> > > Then iterate over the list to look for work that matches the dev and portid.
> > > The socket doesn’t close until the work is done in that case.
> >
> > Okay, good, yes. I think you can use one of the callbacks I mentioned
> > below to achieve the same thing with less complexity than the notifier.
>
> Danielle already has a POC with the notifier and it's not that
> complicated. I wasn't aware of the netlink notifier, but we found it
> when we tried to understand how other netlink families get notified
> about a socket being closed.
>
> Which advantages do you see in the sock_priv_destroy() approach? Are you
> against the notifier approach?

Notifier is not incorrect, but I worry it will result in more code,
and basically duplication of what genl_sk_priv* does. Perhaps you
managed to code it up very neatly - if so feel free to send the v6
and we can discuss further if needed?

> > > > Easiest way to "notice" the socket got closed would probably be to add some
> > > > info to genl_sk_priv_*(). ->sock_priv_destroy() will get called. But you can also
> > > > get a close notification in the family
> > > > ->unbind callback.
>
> Isn't the unbind callback only for multicast (whereas we are using
> unicast)?

True, should work in practice, I think. But sock_priv is much better.

> > > Is there a scenario that we hit this event and won't intend to cancel the work?
> >
> > I think it's up to us. I don't see any legit reason for user space to
> > intentionally cancel the flashing. So the only option is that user space
> > is either buggy or has crashed, and the socket got closed before
> > flashing finished. Right?
>
> We don't think that closing the socket / killing the process mid
> flashing is a legitimate scenario. We looked into it in order to avoid
> sending unicast notifications to a socket that did not ask for them but
> gets them because it was bound to the port ID that was used by the old
> socket.
>
> I agree that we don't need to cancel the work and can simply have the
> work item stop sending notifications. User space will get an error if it
> tries to flash a module that is already being flashed in the background.
> WDYT?

SGTM!