Re: [RFC PATCH 00/86] Make the kernel preemptible

On Wed, Nov 08 2023 at 11:13, Peter Zijlstra wrote:
> On Wed, Nov 08, 2023 at 02:04:02AM -0800, Ankur Arora wrote:
> I'm not understanding, those should stay obviously.
>
> The current preempt_dynamic stuff has 5 toggles:
>
> /*
>  * SC:cond_resched
>  * SC:might_resched
>  * SC:preempt_schedule
>  * SC:preempt_schedule_notrace
>  * SC:irqentry_exit_cond_resched
>  *
>  *
>  * NONE:
>  *   cond_resched               <- __cond_resched
>  *   might_resched              <- RET0
>  *   preempt_schedule           <- NOP
>  *   preempt_schedule_notrace   <- NOP
>  *   irqentry_exit_cond_resched <- NOP
>  *
>  * VOLUNTARY:
>  *   cond_resched               <- __cond_resched
>  *   might_resched              <- __cond_resched
>  *   preempt_schedule           <- NOP
>  *   preempt_schedule_notrace   <- NOP
>  *   irqentry_exit_cond_resched <- NOP
>  *
>  * FULL:
>  *   cond_resched               <- RET0
>  *   might_resched              <- RET0
>  *   preempt_schedule           <- preempt_schedule
>  *   preempt_schedule_notrace   <- preempt_schedule_notrace
>  *   irqentry_exit_cond_resched <- irqentry_exit_cond_resched
>  */
>
> If you kill voluntary as we know it today, you can remove cond_resched
> and might_resched, but the remaining 3 are still needed to switch
> between NONE and FULL.

No. The whole point of LAZY is to keep preempt_schedule(),
preempt_schedule_notrace(), irqentry_exit_cond_resched() always enabled.

Look at my PoC: https://lore.kernel.org/lkml/87jzshhexi.ffs@tglx/

The idea is to always enable preempt count and keep _all_ preemption
points enabled.

For NONE/VOLUNTARY mode let the scheduler set TIF_NEED_RESCHED_LAZY
instead of TIF_NEED_RESCHED. In full mode set TIF_NEED_RESCHED.
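
As a standalone toy model of that selection (plain userspace C, all
names made up here; in the kernel this decision would live in the
scheduler's resched path, i.e. around resched_curr()):

#include <stdio.h>

#define RESCHED_NOW	(1u << 0)	/* stands in for TIF_NEED_RESCHED */
#define RESCHED_LAZY	(1u << 1)	/* stands in for TIF_NEED_RESCHED_LAZY */

enum preempt_mode { MODE_NONE, MODE_VOLUNTARY, MODE_FULL };

static unsigned int task_flags;

/* Pick the flag to set based on the current preemption mode. */
static void resched_curr_model(enum preempt_mode mode)
{
        if (mode == MODE_FULL)
                task_flags |= RESCHED_NOW;   /* preempt at the next preemption point */
        else
                task_flags |= RESCHED_LAZY;  /* preempt at the next return to user space */
}

int main(void)
{
        resched_curr_model(MODE_VOLUNTARY);
        printf("voluntary: flags %#x\n", task_flags);
        task_flags = 0;
        resched_curr_model(MODE_FULL);
        printf("full:      flags %#x\n", task_flags);
        return 0;
}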

Here is where the regular and the lazy flags are evaluated:

                Ret2user        Ret2kernel      PreemptCnt=0  need_resched()

NEED_RESCHED       Y                Y               Y         Y
LAZY_RESCHED       Y                N               N         Y
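
The same matrix as a toy model (made-up names; ret_to_user() and
ret_to_kernel() just mirror the columns above, not the actual entry
code):

#include <stdio.h>
#include <stdbool.h>

static bool resched_now;	/* stands in for TIF_NEED_RESCHED */
static bool resched_lazy;	/* stands in for TIF_NEED_RESCHED_LAZY */

/* Return to user space: both flags lead to schedule(). */
static void ret_to_user(void)
{
        if (resched_now || resched_lazy)
                printf("schedule() on return to user\n");
}

/* Return to kernel: only the real flag preempts. */
static void ret_to_kernel(void)
{
        if (resched_now)
                printf("preempt_schedule_irq() on return to kernel\n");
}

/* need_resched() style check: sees both flags. */
static bool need_resched_model(void)
{
        return resched_now || resched_lazy;
}

int main(void)
{
        resched_lazy = true;
        ret_to_kernel();	/* silent: LAZY does not preempt in the kernel */
        ret_to_user();		/* schedules */
        printf("need_resched() = %d\n", need_resched_model());
        return 0;
}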

The trick is that LAZY is not folded into preempt_count, so a 1->0
transition of the preemption count won't cause preempt_schedule() to be
invoked: the topmost bit (the folded NEED_RESCHED bit, which uses
inverted logic) is still set, so the raw preempt_count does not become
zero.

The scheduler can still decide to set TIF_NEED_RESCHED which will cause
an immediate preemption at the next preemption point.
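
A toy model of that counter behaviour (made-up names; the folded bit
uses inverted logic as on x86, i.e. the top bit is set while no
reschedule is needed so the whole thing can be tested against zero in
one go):

#include <stdio.h>
#include <stdbool.h>

#define NEED_RESCHED_FOLD	0x80000000u	/* top bit, inverted logic */

static unsigned int raw_count = NEED_RESCHED_FOLD;	/* count 0, no resched */
static bool lazy_pending;				/* LAZY: never folded */

static void set_need_resched(void)	{ raw_count &= ~NEED_RESCHED_FOLD; }
static void set_need_resched_lazy(void)	{ lazy_pending = true; }

static void preempt_disable_model(void)	{ raw_count++; }

static void preempt_enable_model(void)
{
        /* Single zero test: only a folded NEED_RESCHED can make this fire. */
        if (--raw_count == 0)
                printf("preempt_schedule()\n");
}

int main(void)
{
        preempt_disable_model();
        set_need_resched_lazy();
        preempt_enable_model();		/* LAZY alone: nothing happens here */

        preempt_disable_model();
        set_need_resched();
        preempt_enable_model();		/* prints preempt_schedule() */

        printf("LAZY still pending: %d\n", lazy_pending);
        return 0;
}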

This allows the scheduler to force out a task which loops, e.g. in a
massive copy or clear operation, but has not reached a point where
TIF_NEED_RESCHED_LAZY is evaluated within a time that the scheduler
itself defines.

For my PoC I did:

    1) Set TIF_NEED_RESCHED_LAZY

    2) Set TIF_NEED_RESCHED when the task did not react to
       TIF_NEED_RESCHED_LAZY within a tick

I know that's crude but it just works and obviously requires quite some
refinement.
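
In toy form the escalation is not more than this (a made-up
sched_tick() style hook; tick granularity as in the PoC):

#include <stdio.h>
#include <stdbool.h>

static bool resched_now;		/* TIF_NEED_RESCHED stand-in */
static bool resched_lazy;		/* TIF_NEED_RESCHED_LAZY stand-in */
static unsigned long lazy_set_at;	/* tick at which LAZY was requested */

/* Step 1: the scheduler asks politely. */
static void request_lazy_resched(unsigned long now)
{
        resched_lazy = true;
        lazy_set_at = now;
}

/* Step 2: per-tick check, get the hammer out after one full tick. */
static void sched_tick_model(unsigned long now)
{
        if (resched_lazy && !resched_now && now > lazy_set_at)
                resched_now = true;
}

int main(void)
{
        request_lazy_resched(100);
        sched_tick_model(100);
        printf("same tick:      NEED_RESCHED=%d\n", resched_now);
        sched_tick_model(101);
        printf("one tick later: NEED_RESCHED=%d\n", resched_now);
        return 0;
}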

So the way you switch between preemption modes is to select whether the
scheduler sets TIF_NEED_RESCHED or TIF_NEED_RESCHED_LAZY. No static
call switching at all.

In full preemption mode it always sets TIF_NEED_RESCHED. Otherwise it
uses the LAZY bit first, grants some time, and then gets out the hammer
and sets TIF_NEED_RESCHED when the task has not reached a LAZY
preemption point.

Which means that once the whole thing is in place, PREEMPT_DYNAMIC
along with NONE, VOLUNTARY, and FULL can go away, together with the
cond_resched() hackery.

So I think this series is backwards.

It should add the LAZY muck with a Kconfig switch like I did in my PoC
_first_. Once that is working and agreed on, the existing muck can be
removed.

Thanks,

        tglx



