Hi,

I'd like to add one nesting scenario in which a binder work return error is wrongly consumed, tracked here: https://issuetracker.google.com/u/0/issues/79123724

My interpretation of the process (see the attached diagram; "+" stands for production, "-" stands for consumption, and BW_XXX is shorthand for BINDER_WORK_XXX):

1) The first oneway reportOneDeath() consumed the BINDER_WORK_RETURN_ERROR that was produced by bindApplication().
2) The second non-oneway reportOneDeath() consumed the BR_TRANSACTION_COMPLETE left behind by the first reportOneDeath().
3) The thread then hangs in bindApplication() forever, because the BINDER_WORK_RETURN_ERROR produced by bindApplication() was wrongly consumed by reportOneDeath().

________________________________________
From: gityuan@xxxxxxxxx <gityuan@xxxxxxxxx>
Sent: May 2, 2018, 22:30
To: gregkh@xxxxxxxxxxxxxxxxxxx; maco@xxxxxxxxxxx; tkjos@xxxxxxxxxx; arve@xxxxxxxxxxx
Cc: devel@xxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; 袁辉辉
Subject: [PATCH] ANDROID: binder: fix binder work return error is wrongly consumed

From: yuanhuihui <yuanhuihui@xxxxxxxxxx>

A transaction can consume a mismatched binder work item, as follows:

1) The client sends a transaction while binder work is still left on the thread->todo list.
2) In kernel space, the client consumes BC_TRANSACTION and, because the target proc is dead, produces a BINDER_WORK_RETURN_ERROR onto the thread->todo list.
3) The client should pick up the return error from thread->todo first, finish this transaction, and return to userspace immediately.
4) In fact, the client picks up a mismatched binder work item, which causes unexpected results because of the previously remaining binder work; the return error may then be consumed by another transaction context.

Recently I encountered this problem on some mobile phone builds: a oneway binder call can be blocked forever in the system_server process. The root cause is that the BINDER_WORK_RETURN_ERROR is consumed by a mismatched transaction context. The binder thread is waiting for BINDER_WORK_RETURN_ERROR:

"Binder:1253_C" prio=5 tid=107 Native
  at android.os.BinderProxy.transactNative(Native method)
  at android.os.BinderProxy.transact(Binder.java:767)
  at android.app.IApplicationThread$Stub$Proxy.bindApplication(IApplicationThread.java:1398)
  at com.android.server.am.ActivityManagerService.attachApplicationLocked(ActivityManagerService.java:7320)
  at com.android.server.am.ActivityManagerService.attachApplication(ActivityManagerService.java:7425)
  - locked <0x0bc6ee81> (a com.android.server.am.ActivityManagerService)
  at android.app.IActivityManager$Stub.onTransact(IActivityManager.java:291)
  at com.android.server.am.ActivityManagerService.onTransact(ActivityManagerService.java:2997)
  at android.os.Binder.execTransact(Binder.java:700)

I see the newest binder driver is fixed by "ANDROID: binder: don't enqueue death notifications to thread todo". That really does solve the BINDER_WORK_DEAD_BINDER nesting scenario :) But there is a potential risk for the future: any functional extension that pushes more work to thread->todo will need to consider these nesting issues again.

I think queueing the return error transaction at the head of the thread todo list is more appropriate, as follows (see the sketch after the patch below):

1) During a transaction, the client adds BINDER_WORK_RETURN_ERROR to the thread->todo list; it is better to pick up the BINDER_WORK_RETURN_ERROR first and finish the transaction immediately, jumping out of the nesting.
2) The client picks up the leftover binder work from thread->todo on the same thread, so there is no need to wake up another idle binder thread.
3) It works fine in the old binder version (before the big binder lock was split): binder_transaction() only set thread return_error when the target process had died, without adding a BINDER_WORK_RETURN_ERROR to thread->todo, and binder_thread_read() checked return_error first. If a return error had occurred, it finished the transaction and returned to userspace immediately.

So I prefer to put the BINDER_WORK_RETURN_ERROR at the head of the queue, the same as the old version of the binder driver, once and for all.

Signed-off-by: yuanhuihui <yuanhuihui@xxxxxxxxxx>
---
 drivers/android/binder.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 57 insertions(+), 2 deletions(-)

diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index 4eab5be3d00f..1ed1809b8769 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -861,6 +861,60 @@ binder_enqueue_thread_work(struct binder_thread *thread,
 	binder_inner_proc_unlock(thread->proc);
 }
 
+/**
+ * binder_enqueue_work_head_ilocked() - Add an item to the head of work list
+ * @work:         struct binder_work to add to list
+ * @target_list:  list to add work to
+ *
+ * Adds the work to the specified list. Asserts that work
+ * is not already on a list.
+ *
+ * Requires the proc->inner_lock to be held.
+ */
+static void
+binder_enqueue_work_head_ilocked(struct binder_work *work,
+				 struct list_head *target_list)
+{
+	BUG_ON(target_list == NULL);
+	BUG_ON(work->entry.next && !list_empty(&work->entry));
+	list_add(&work->entry, target_list);
+}
+
+/**
+ * binder_enqueue_thread_work_head_ilocked() - Add an item to the head of thread work list
+ * @thread:       thread to queue work to
+ * @work:         struct binder_work to add to list
+ *
+ * Adds the work to the head of thread todo list, and enables processing
+ * of the todo queue.
+ *
+ * Requires the proc->inner_lock to be held.
+ */
+static void
+binder_enqueue_thread_work_head_ilocked(struct binder_thread *thread,
+					struct binder_work *work)
+{
+	binder_enqueue_work_head_ilocked(work, &thread->todo);
+	thread->process_todo = true;
+}
+
+/**
+ * binder_enqueue_thread_work_head() - Add an item to the head of thread work list
+ * @thread:       thread to queue work to
+ * @work:         struct binder_work to add to list
+ *
+ * Adds the work to the head of thread todo list, and enables processing
+ * of the todo queue.
+ */
+static void
+binder_enqueue_thread_work_head(struct binder_thread *thread,
+				struct binder_work *work)
+{
+	binder_inner_proc_lock(thread->proc);
+	binder_enqueue_thread_work_head_ilocked(thread, work);
+	binder_inner_proc_unlock(thread->proc);
+}
+
 static void
 binder_dequeue_work_ilocked(struct binder_work *work)
 {
@@ -3287,11 +3341,11 @@ static void binder_transaction(struct binder_proc *proc,
 	BUG_ON(thread->return_error.cmd != BR_OK);
 	if (in_reply_to) {
 		thread->return_error.cmd = BR_TRANSACTION_COMPLETE;
-		binder_enqueue_thread_work(thread, &thread->return_error.work);
+		binder_enqueue_thread_work_head(thread, &thread->return_error.work);
 		binder_send_failed_reply(in_reply_to, return_error);
 	} else {
 		thread->return_error.cmd = return_error;
-		binder_enqueue_thread_work(thread, &thread->return_error.work);
+		binder_enqueue_thread_work_head(thread, &thread->return_error.work);
 	}
 }
 
@@ -3929,6 +3983,7 @@ static int binder_thread_read(struct binder_proc *proc,
 			ptr += sizeof(uint32_t);
 
 			binder_stat_br(proc, thread, e->cmd);
+			goto done; /* RETURN_ERROR notifications can finish transactions */
 		} break;
 		case BINDER_WORK_TRANSACTION_COMPLETE: {
 			binder_inner_proc_unlock(proc);
-- 
2.14.1
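To make the ordering argument concrete, here is a minimal userspace sketch of the thread->todo problem. This is not kernel code: the ring buffer and the enqueue_tail()/enqueue_head()/dequeue()/run() helpers are simplified stand-ins I made up for illustration, with enqueue_head() playing the role of binder_enqueue_thread_work_head() from the patch above.

/*
 * Userspace model of thread->todo ordering. A leftover work item is
 * already queued (step 1 of the commit message); then a transaction
 * fails and its BINDER_WORK_RETURN_ERROR is queued (step 2); then the
 * failing transaction reads exactly one item (steps 3/4).
 */
#include <stdio.h>

#define MAX_TODO 8

static const char *todo[MAX_TODO];
static int head, count;

static void enqueue_tail(const char *w)
{
	todo[(head + count++) % MAX_TODO] = w;
}

static void enqueue_head(const char *w)
{
	head = (head + MAX_TODO - 1) % MAX_TODO;
	todo[head] = w;
	count++;
}

static const char *dequeue(void)
{
	const char *w = todo[head];

	head = (head + 1) % MAX_TODO;
	count--;
	return w;
}

static void run(void (*enqueue_err)(const char *), const char *label)
{
	head = count = 0;
	/* 1) leftover work already sits on thread->todo */
	enqueue_tail("leftover BINDER_WORK_TRANSACTION_COMPLETE");
	/* 2) the new transaction fails; its return error is queued */
	enqueue_err("BINDER_WORK_RETURN_ERROR");
	/* 3) the failing transaction reads one work item */
	printf("%s: failing transaction consumes \"%s\"\n", label, dequeue());
}

int main(void)
{
	run(enqueue_tail, "tail queueing (current)");
	run(enqueue_head, "head queueing (patched)");
	return 0;
}

With tail queueing, the failing transaction pops the leftover TRANSACTION_COMPLETE (the mismatch from step 4), and its RETURN_ERROR is left for some other transaction context. With head queueing, it pops its own RETURN_ERROR first and can return to userspace immediately, which is what the "goto done" hunk in binder_thread_read() relies on.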
Attachment: oneway_binder_call_hang.jpg