Re: [PATCH v2 7/9] sched: define TIF_ALLOW_RESCHED

* Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Mon, Sep 11, 2023 at 02:16:18PM -0700, Linus Torvalds wrote:
> > On Mon, 11 Sept 2023 at 13:50, Linus Torvalds
> > <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > Except we've actually been *adding* to this whole mess, rather than
> > > removing it. So we have actively *expanded* on that preemption choice
> > > with PREEMPT_DYNAMIC.
> > 
> > Actually, that config option makes no sense.
> > 
> > It makes the cond_resched() behavior conditional with a static call.
> > 
> > But all the *real* overhead is still there and unconditional (ie all
> > the preempt count updates and the "did it go down to zero and we need
> > to check" code).
> > 
> > That just seems stupid. It seems to have all the overhead of a
> > preemptible kernel, just not doing the preemption.
> > 
> > So I must be mis-reading this, or just missing something important.
> > 
> > The real cost seems to be
> > 
> >    PREEMPT_BUILD -> PREEMPTION -> PREEMPT_COUNT
> > 
> > and PREEMPT vs PREEMPT_DYNAMIC makes no difference to that, since both
> > will end up with that, and thus both cases will have all the spinlock
> > preempt count stuff.
> > 
> > There must be some non-preempt_count cost that people worry about.
> > 
> > Or maybe I'm just mis-reading the Kconfig stuff entirely. That's
> > possible, because this seems *so* pointless to me.
> > 
> > Somebody please hit me with a clue-bat to the noggin.
> 
> Well, I was about to reply to your previous email explaining this, but 
> this one time I did read more email..
> 
> Yes, PREEMPT_DYNAMIC has all the preempt count twiddling and only nops 
> out the schedule()/cond_resched() calls where appropriate.
> 
> This work was done by a distro (SuSE) and if they're willing to ship this 
> I'm thinking the overheads are acceptable to them.
> 
> For a significant number of workloads the real overhead is the extra 
> preemptions themselves more than the counting -- but yes, the counting is 
> measurable, but probably in the noise compared to some of the other 
> horrible things we have done in the past years.
> 
> Anyway, if distros are fine shipping with PREEMPT_DYNAMIC, then yes, 
> deleting the other options is definitely an option.

Yes, so my understanding is that distros generally worry more about 
macro-overhead, for example material changes to a random subset of key 
benchmarks that specific enterprise customers care about, and distros are 
not nearly as sensitive about micro-overhead that preempt_count() 
maintenance causes.

PREEMPT_DYNAMIC is basically a reflection of that: the desire to ship only 
a single kernel image, with a boot-time toggle that selects between 
CONFIG_PREEMPT behavior (desktop) and PREEMPT_VOLUNTARY behavior (server), 
depending on the load.
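
To make the trade-off concrete, here is a rough userspace sketch of the
idea, not the kernel implementation. All model_* names are hypothetical,
a plain function pointer stands in for a static call, and the model
assumes a reschedule is always pending. The point it illustrates is that
the count update and the zero check run in every mode; only the action
taken is switched at boot.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
static int model_preempt_count;   /* maintained in every mode */
static int model_resched_calls;   /* times the "real" reschedule ran */

static void model_resched_real(void) { model_resched_calls++; }
static void model_resched_nop(void)  { }

/* Stand-in for a static call: an indirect call retargeted once at boot. */
static void (*model_cond_resched)(void) = model_resched_nop;
static bool model_full_preempt;

/* Pick a behavior from a boot-style string; returns -1 if unknown. */
static int model_set_mode(const char *mode)
{
    if (strcmp(mode, "none") == 0) {
        model_cond_resched = model_resched_nop;
        model_full_preempt = false;
    } else if (strcmp(mode, "voluntary") == 0) {
        model_cond_resched = model_resched_real;
        model_full_preempt = false;
    } else if (strcmp(mode, "full") == 0) {
        /* Explicit yield points become no-ops; preemption happens
         * when the count drops back to zero instead. */
        model_cond_resched = model_resched_nop;
        model_full_preempt = true;
    } else {
        return -1;
    }
    return 0;
}

static void model_preempt_disable(void) { model_preempt_count++; }

static void model_preempt_enable(void)
{
    assert(model_preempt_count > 0);
    /* The decrement and the zero check are unconditional: this is
     * the micro-overhead that stays regardless of the chosen mode. */
    if (--model_preempt_count == 0 && model_full_preempt)
        model_resched_real();
}
```

Calling model_set_mode("voluntary") makes model_cond_resched() do real
work while model_preempt_enable() only counts; model_set_mode("full")
flips that around. In both cases the counter is touched on every
disable/enable pair.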

There's also the view that PREEMPT kernels are a bit more QA-friendly, 
because atomic code sequences are much better defined & enforced via kernel 
warnings. Without preempt_count we only have irqs-off warnings, which cover 
only a small fraction of all critical sections in the kernel.
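
A small userspace sketch of why the count helps QA, in the spirit of a
might_sleep()-style check. The dbg_* names are hypothetical and this is
a model under simplifying assumptions, not kernel code: with the count
maintained, a lock-protected section is visibly atomic; without it, only
the irqs-off case can be detected.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of these names are kernel APIs. */
static bool dbg_have_preempt_count;  /* models a build with the count */
static int  dbg_preempt_count;
static bool dbg_irqs_off;
static int  dbg_warnings;            /* atomic-context warnings issued */

static void dbg_spin_lock(void)
{
    /* The lock only leaves a trace when the count is maintained. */
    if (dbg_have_preempt_count)
        dbg_preempt_count++;
}

static void dbg_spin_unlock(void)
{
    if (dbg_have_preempt_count) {
        assert(dbg_preempt_count > 0);
        dbg_preempt_count--;
    }
}

/* Warn if a potentially sleeping operation runs in atomic context. */
static void dbg_might_sleep(void)
{
    bool atomic = dbg_irqs_off;

    if (dbg_have_preempt_count)
        atomic = atomic || dbg_preempt_count > 0;
    if (atomic)
        dbg_warnings++;
}
```

With dbg_have_preempt_count set to false, calling dbg_might_sleep()
between dbg_spin_lock() and dbg_spin_unlock() goes unnoticed; with it
set to true, the same sequence bumps dbg_warnings.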

Ideally we'd be able to patch out most of the preempt_count maintenance 
overhead too - OTOH these days it's little more than noise on most CPUs, 
considering the kind of horrible security-workaround overhead we have on 
almost all x86 CPU types ... :-/

Thanks,

	Ingo



