On Thu, Sep 26, 2024 at 10:06:14AM +0200, Alice Ryhl wrote:
> On Wed, Sep 25, 2024 at 8:06 PM Carlos Llamas <cmllamas@xxxxxxxxxx> wrote:
> >
> > On Wed, Sep 25, 2024 at 07:52:37PM +0200, Alice Ryhl wrote:
> > > > > I reviewed some other code paths to verify whether there are other
> > > > > problems with processes dying concurrently with operations on freeze
> > > > > notifications. I didn't notice any other memory safety issues, but I
> > > >
> > > > Yeah most other paths are protected with binder_procs_lock mutex.
> > > >
> > > > > noticed that binder_request_freeze_notification returns EINVAL if you
> > > > > try to use it with a node from a dead process. That seems problematic,
> > > > > as this means that there's no way to invoke that command without
> > > > > risking an EINVAL error if the remote process dies. We should not
> > > > > return EINVAL errors on correct usage of the driver.
> > > >
> > > > Agreed, this should probably be -ESRCH or something. I'll add it to v2,
> > > > thanks for the suggestion.
> > >
> > > Well, maybe? I think it's best to not return errnos from these
> > > commands at all, as they obscure how many commands were processed.
> >
> > This is problematic, particularly when it's a multi-command buffer.
> > Userspace doesn't really know which one failed and if any of them
> > succeeded. Agreed.
> >
> > >
> > > Since the node still exists even if the process dies, perhaps we can
> > > just let you create the freeze notification even if it's dead? We can
> > > make it end up in the same state as if you request a freeze
> > > notification and the process then dies afterwards.
> >
> > It's a dead node, there is no process associated with it. It would be
> > incorrect to setup the notification as it doesn't have a frozen status
> > anymore. We can't determine the ref->node->proc->is_frozen?
> >
> > We could silently fail and skip the notification, but I don't know if
> > userspace will attempt to release it later... and fail with EINVAL.
>
> I mean, userspace *has* to be able to deal with the case where the
> process dies *right after* the freeze notification is registered. If
> we make the behavior where it's already dead be the same as that case,
> then we're not giving userspace any new things it needs to care about.

This is a fair point. To make it happen though, we would need to modify
the behavior of the request a bit. If the node is dead, we could still
attach the freeze notification to the reference but then we would skip
sending the "current" frozen state. This last bit won't be guaranteed
anymore.

I _suppose_ this is ok, since as you mention, userspace should have to
deal with the process dying anyway.

I honestly don't really like this "fake successful" approach but then we
don't handle driver errors very well either. So it might be worth it to
avoid propagating this "dead node" error if we can.

I'll do this for v2.

Thanks,
Carlos Llamas
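
For illustration only, the behavior described above (attach the freeze
notification even when the node is dead, but skip sending the "current"
frozen state) could be modeled roughly as below. This is a userspace
sketch, not driver code and not the v2 patch: the model_* structs,
fields, and functions are hypothetical names made up for this example,
and rejecting a duplicate request with -1 is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; these are NOT the real binder structs. */
struct model_proc {
	bool is_frozen;
};

struct model_node {
	struct model_proc *proc;	/* NULL once the owning process died */
};

struct model_ref {
	struct model_node *node;
	bool freeze_attached;	/* a freeze notification is registered */
	bool state_sent;	/* the "current" frozen state was queued */
	bool sent_is_frozen;	/* queued value, valid only if state_sent */
};

/*
 * Attach a freeze notification to the reference.
 * Returns 0 on success, -1 if one is already attached (model assumption).
 */
int model_request_freeze_notification(struct model_ref *ref)
{
	assert(ref != NULL && ref->node != NULL);

	if (ref->freeze_attached)
		return -1;

	ref->freeze_attached = true;

	if (ref->node->proc == NULL) {
		/*
		 * Dead node: there is no is_frozen to read, so attach
		 * the notification but skip the initial state. This is
		 * the same state a caller ends up in when the process
		 * dies right after a successful request.
		 */
		ref->state_sent = false;
		return 0;
	}

	/* Live node: queue the current frozen state as usual. */
	ref->state_sent = true;
	ref->sent_is_frozen = ref->node->proc->is_frozen;
	return 0;
}

/*
 * Release the notification. Because the dead-node request attached it,
 * a later release succeeds instead of failing.
 */
int model_clear_freeze_notification(struct model_ref *ref)
{
	assert(ref != NULL);

	if (!ref->freeze_attached)
		return -1;

	ref->freeze_attached = false;
	ref->state_sent = false;
	return 0;
}
```

In this model a request against a dead node returns 0 with state_sent
left false, and the later clear also returns 0, so the caller sees no
"dead node" error on either command.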