On Wed, Sep 25, 2024 at 10:02:51AM +0200, 'Alice Ryhl' via kernel-team wrote:
> On Tue, Sep 24, 2024 at 8:44 PM Carlos Llamas <cmllamas@xxxxxxxxxx> wrote:
> >
> > In binder_add_freeze_work() we iterate over the proc->nodes with the
> > proc->inner_lock held. However, this lock is temporarily dropped to
> > acquire the node->lock first (lock nesting order). This can race with
> > binder_deferred_release() which removes the nodes from the proc->nodes
> > rbtree and adds them into binder_dead_nodes list. This leads to a broken
> > iteration in binder_add_freeze_work() as rb_next() will use data from
> > binder_dead_nodes, triggering an out-of-bounds access:
> >
> > ==================================================================
> > BUG: KASAN: global-out-of-bounds in rb_next+0xfc/0x124
> > Read of size 8 at addr ffffcb84285f7170 by task freeze/660
> >
> > CPU: 8 UID: 0 PID: 660 Comm: freeze Not tainted 6.11.0-07343-ga727812a8d45 #18
> > Hardware name: linux,dummy-virt (DT)
> > Call trace:
> >  rb_next+0xfc/0x124
> >  binder_add_freeze_work+0x344/0x534
> >  binder_ioctl+0x1e70/0x25ac
> >  __arm64_sys_ioctl+0x124/0x190
> >
> > The buggy address belongs to the variable:
> >  binder_dead_nodes+0x10/0x40
> > [...]
> > ==================================================================
> >
> > This is possible because proc->nodes (rbtree) and binder_dead_nodes
> > (list) share entries in binder_node through a union:
> >
> > struct binder_node {
> > 	[...]
> > 	union {
> > 		struct rb_node rb_node;
> > 		struct hlist_node dead_node;
> > 	};
> >
> > Fix the race by checking that the proc is still alive. If not, simply
> > break out of the iteration.
> >
> > Fixes: d579b04a52a1 ("binder: frozen notification")
> > Cc: stable@xxxxxxxxxxxxxxx
> > Signed-off-by: Carlos Llamas <cmllamas@xxxxxxxxxx>
>
> This change LGTM.
> Reviewed-by: Alice Ryhl <aliceryhl@xxxxxxxxxx>
>
> I reviewed some other code paths to verify whether there are other
> problems with processes dying concurrently with operations on freeze
> notifications. I didn't notice any other memory safety issues, but I

Yeah, most other paths are protected by the binder_procs_lock mutex.

> noticed that binder_request_freeze_notification returns EINVAL if you
> try to use it with a node from a dead process. That seems problematic,
> as this means that there's no way to invoke that command without
> risking an EINVAL error if the remote process dies. We should not
> return EINVAL errors on correct usage of the driver.

Agreed, this should probably be -ESRCH or something. I'll add it to v2,
thanks for the suggestion.

Cheers,
Carlos Llamas
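
P.S. For anyone following along, here is roughly where the check described
in the commit message would land inside binder_add_freeze_work(). This is
a simplified sketch, not the actual diff: the loop body is elided, and the
proc->is_dead test is just one way of expressing "the proc is still alive":

	struct rb_node *n;
	struct binder_node *node;

	binder_inner_proc_lock(proc);
	for (n = rb_first(&proc->nodes); n; n = rb_next(n)) {
		node = rb_entry(n, struct binder_node, rb_node);

		/* node->lock nests outside the inner lock, so drop it first */
		binder_inner_proc_unlock(proc);
		binder_node_lock(node);

		/* ... queue the freeze notification work (elided) ... */

		binder_node_unlock(node);
		binder_inner_proc_lock(proc);

		/*
		 * binder_deferred_release() may have run while the inner
		 * lock was dropped, moving the nodes onto binder_dead_nodes.
		 * Bail out before rb_next() walks into that list.
		 */
		if (proc->is_dead)
			break;
	}
	binder_inner_proc_unlock(proc);

Breaking out is enough here: once the proc is dead there is no more freeze
work worth queueing, and the remaining entries now belong to
binder_dead_nodes anyway.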