Re: [PATCH 14/31] sched_ext: Implement BPF extensible scheduler class

Tejun Heo <tj@xxxxxxxxxx> · Fri, 2 Dec 2022 08:01:37 -1000

Hello,

On Fri, Dec 02, 2022 at 12:08:27PM -0500, Barret Rhoden wrote:
> you might be able to avoid the double_lock_balance() by using
> move_queued_task(), which internally hands off the old rq lock and returns
> with the new rq lock.
> 
> the pattern for consume_dispatch_q() would be something like:
> 
> - kfunc from bpf, with this_rq lock held
> - notice p isn't on this_rq, goto remote_rq:
> - do sched_ext accounting, like the this_rq->dsq->nr--
> - unlock this_rq
> - p_rq = task_rq_lock(p)
> - double_check p->rq didn't change to this_rq during that unlock
> - new_rq = move_queued_task(p_rq, rf, p, new_cpu)
> - do sched_ext accounting like new_rq->dsq->nr++
> - unlock new_rq
> - relock the original this_rq
> - return to bpf
> 
> you still end up grabbing both locks, but just not at the same time.

Yeah, this would probably look better than the current double-lock dance,
especially in the finish_dispatch() path.
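
To make the lock ordering concrete, here is a rough userspace model of the
hand-off pattern quoted above. It is only a sketch and not kernel code:
pthread mutexes stand in for rq locks, every model_* name is made up for
illustration, and the accounting is collapsed into a per-rq nr counter,
which is not how the dsq accounting in the patch works. The only thing it
is meant to show is that both locks get taken, but never simultaneously.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define MODEL_NR_CPUS 2

/* Userspace stand-in for a runqueue: a lock plus a queued-task count. */
struct model_rq {
	pthread_mutex_t lock;
	int nr;			/* simplified accounting, protected by lock */
};

struct model_task {
	atomic_int cpu;		/* index of the rq the task is queued on */
};

static struct model_rq model_rqs[MODEL_NR_CPUS] = {
	{ PTHREAD_MUTEX_INITIALIZER, 0 },
	{ PTHREAD_MUTEX_INITIALIZER, 0 },
};

/* Track how many rq locks the single demo thread holds at once. */
static int model_held, model_max_held;

static void model_rq_lock(struct model_rq *rq)
{
	pthread_mutex_lock(&rq->lock);
	if (++model_held > model_max_held)
		model_max_held = model_held;
}

static void model_rq_unlock(struct model_rq *rq)
{
	model_held--;
	pthread_mutex_unlock(&rq->lock);
}

/* Models task_rq_lock(): lock p's rq, retrying if p moved before we locked. */
static struct model_rq *model_task_rq_lock(struct model_task *p)
{
	for (;;) {
		struct model_rq *rq = &model_rqs[atomic_load(&p->cpu)];

		model_rq_lock(rq);
		if (rq == &model_rqs[atomic_load(&p->cpu)])
			return rq;
		model_rq_unlock(rq);
	}
}

/* Models move_queued_task(): enter with src locked, return with dst locked. */
static struct model_rq *model_move_queued_task(struct model_rq *src,
					       struct model_task *p, int new_cpu)
{
	struct model_rq *dst = &model_rqs[new_cpu];

	src->nr--;
	atomic_store(&p->cpu, new_cpu);
	model_rq_unlock(src);		/* hand off: drop old lock first... */
	model_rq_lock(dst);		/* ...then take the new one */
	dst->nr++;
	return dst;
}

/* Called with this_rq locked; returns with this_rq locked. 1 if p was moved. */
static int model_consume_remote(struct model_rq *this_rq,
				struct model_task *p, int new_cpu)
{
	struct model_rq *p_rq, *new_rq;

	assert(model_held == 1);
	model_rq_unlock(this_rq);
	p_rq = model_task_rq_lock(p);
	if (p_rq == this_rq)
		return 0;	/* p arrived here meanwhile; lock already held */
	new_rq = model_move_queued_task(p_rq, p, new_cpu);
	model_rq_unlock(new_rq);
	model_rq_lock(this_rq);
	return 1;
}

/* One task queued on CPU 1, consumed from CPU 0. */
static int model_run_demo(void)
{
	struct model_rq *this_rq = &model_rqs[0];
	struct model_task p;
	int moved;

	atomic_init(&p.cpu, 1);
	model_rqs[0].nr = 0;
	model_rqs[1].nr = 1;
	model_held = model_max_held = 0;

	model_rq_lock(this_rq);
	moved = model_consume_remote(this_rq, &p, 0);
	model_rq_unlock(this_rq);
	return moved;
}
```

Calling model_run_demo() from a main() moves the one task over, and
model_max_held stays at 1 throughout, which is the "both locks, but not at
the same time" property. The price is the unlocked window on this_rq, which
is why the recheck after model_task_rq_lock() is there.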

> plus, task_rq_lock() takes the guesswork out of whether you're getting p's
> rq lock or not.  it looks like you're using the holding_cpu to handle the
> race where p moves cpus after you read task_rq(p) but before you lock that
> task_rq.  maybe you can drop the whole concept of the holding_cpu?

->holding_cpu is basically there to detect intervening dequeues, so if we
lock those out with TASK_ON_RQ_MIGRATING, we might be able to drop it. I
need to look into it more, though. Things get pretty subtle around there, so
I could easily be missing something. I'll try this and let you know how it
goes.
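
For reference, the detection idea boils down to roughly the following. This
is a simplified single-threaded model, not the code in the patch: the hc_*
names and the -1 sentinel are assumptions made for illustration, and the
locking around each step is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

struct hc_task {
	int holding_cpu;	/* CPU that is mid-move on this task, or -1 */
	bool queued;
};

/* Mover: stamp the task before dropping the lock that protected it. */
static void hc_begin_move(struct hc_task *p, int cpu)
{
	p->holding_cpu = cpu;
}

/* Dequeue path: may run in the mover's unlocked window; cancels the stamp. */
static void hc_dequeue(struct hc_task *p)
{
	p->queued = false;
	p->holding_cpu = -1;
}

/* Mover, after relocking: only proceed if nobody dequeued p in between. */
static bool hc_claim_intact(const struct hc_task *p, int cpu)
{
	return p->queued && p->holding_cpu == cpu;
}
```

If the mover stamps the task and then finds the stamp gone after relocking,
a dequeue got in between and the move has to be abandoned. If dequeues can't
happen in that window in the first place, the stamp has nothing left to
detect.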

Thanks.

-- 
tejun