From: Ankur Arora
> Sent: 07 November 2023 21:57
...
> There are four main sets of preemption points in the kernel:
>
>  1. return to user
>  2. explicit preemption points (cond_resched() and its ilk)
>  3. return to kernel (tick/IPI/irq at irqexit)
>  4. end of non-preemptible sections at (preempt_count() == preempt_offset)
>
...
> Policies:
>
> A - preemption=none: run to completion
> B - preemption=voluntary: run to completion, unless a task of higher
>     sched-class awaits
> C - preemption=full: optimized for low-latency. Preempt whenever a higher
>     priority task awaits.

If you remove cond_resched(), won't both B and C then require an extra IPI?
That is probably OK for RT tasks, but it could get expensive for normal
tasks that aren't bound to a specific CPU.

I suspect C could also lead to tasks being preempted just before they
sleep (e.g. after waking another task).
There might already be mitigation for that; I'm not sure whether a
voluntary sleep can be done in a non-preemptible section.

Certainly it should all help the scheduling of RT tasks, which can
currently get delayed by a non-RT task in a slow kernel path.
Although the worst one is the softint code...

	David