On Mon, Sep 12, 2016 at 11:42 PM, Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, Sep 12, 2016 at 08:44:09PM -0700, Arve Hjønnevåg wrote:
>> On Sat, Sep 10, 2016 at 10:28 AM, Greg Kroah-Hartman
>> <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>> > On Sat, Sep 10, 2016 at 06:37:29PM +0200, Thomas Gleixner wrote:
>> >> On Sat, 10 Sep 2016, Peter Zijlstra wrote:
>> >>
>> >> > On Sat, Sep 10, 2016 at 09:16:59AM -0700, Christoph Hellwig wrote:
>> >> > > On Thu, Sep 08, 2016 at 09:12:50AM -0700, Todd Kjos wrote:
>> >> > > > In Android systems, the display pipeline relies on low
>> >> > > > latency binder transactions and is therefore sensitive to
>> >> > > > delays caused by contention for the global binder lock.
>> >> > > > Jank is significantly reduced by disabling preemption
>> >> > > > while the global binder lock is held.
>> >> > >
>> >> > > That's now how preempt_disable is supposed to use. It is for critical
>> >> >
>> >> > not, that's supposed to be _not_. Just to be absolutely clear, this is
>> >> > NOT how you're supposed to use preempt_disable().
>> >> >
>> >> > > sections that use per-cpu or similar resources.
>> >> > >
>> >> > > >
>> >> > > > Originally-from: Riley Andrews <riandrews@xxxxxxxxxx>
>> >> > > > Signed-off-by: Todd Kjos <tkjos@xxxxxxxxxx>
>> >> >
>> >> > > > @@ -389,7 +390,11 @@ static int task_get_unused_fd_flags(struct binder_proc *proc, int flags)
>> >> > > >  	rlim_cur = task_rlimit(proc->tsk, RLIMIT_NOFILE);
>> >> > > >  	unlock_task_sighand(proc->tsk, &irqs);
>> >> > > >
>> >> > > > -	return __alloc_fd(files, 0, rlim_cur, flags);
>> >> > > > +	preempt_enable_no_resched();
>> >> > > > +	ret = __alloc_fd(files, 0, rlim_cur, flags);
>> >> > > > +	preempt_disable();
>> >> >
>> >> > And the fact that people want to use preempt_enable_no_resched() shows
>> >> > that they're absolutely clueless.
>> >> >
>> >> > This is so broken it's not funny.
>> >> >
>> >> > NAK NAK NAK
>> >>
>> >> Indeed. Sprinkling random preempt_enable/disable() pairs all over the
>> >> place documents clearly that this is tinkering and not proper software
>> >> engineering.
>> >
>> > I have pointed out in the other thread for this patch (the one that had
>> > a patch that could be applied) that the single lock in the binder code
>> > is the main problem here, it should be solved instead of this messing
>> > around with priorities.
>> >
>>
>> While removing the single lock in the binder driver would help reduce
>> the problem that this patch tries to work around, it would not fix it.
>> The largest problems occur when a very low priority thread gets
>> preempted while holding the lock. When a high priority thread then
>> needs the same lock it can't get it. Changing the driver to use more
>> fine-grained locking would reduce the set of threads that can trigger
>> this problem, but there are processes that receive work from both high
>> and low priority threads and could still end up in the same situation.
>
> Ok, but how would this patch solve the problem?  It only reduces the
> window of when the preemption could occur, it doesn't stop it from
> happening entirely.
>

I agree, this patch does not _solve_ the problem either. It only
reduces it, as there are many places where it re-enables preemption for
significant work.

>> A previous attempt to fix this problem changed the lock to use
>> rt_mutex instead of mutex, but this apparently did not work as well as
>> this patch. I believe the added overhead was noticeable, and it did
>> not work when the preempted thread was in a different cgroup (I don't
>> know if this is still the case).
>
> Why would rt_mutex solve this?

If the task that holds the lock would run when you try to get the lock,
there would be no long delay to get the lock. rt_mutex seems to be
designed to solve this problem.

>
> I worry that this is only going to get worse as more portions of the
> Android userspace/HAL get rewritten to use binder more and more.  What
> about trying to get rid of the lock entirely?  That way the task
> priority levels would work "properly" here.
>

I think a good first step would be to switch to locks with a smaller
scope. I'm not sure how useful it would be to eliminate locks in this
driver, since it calls into other code that uses locks.

> thanks,
>
> greg k-h

-- 
Arve Hjønnevåg
_______________________________________________
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxx
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel