On 24/11/24 22:46, Frederic Weisbecker wrote:
> On Fri, Nov 22, 2024 at 03:56:59PM +0100, Valentin Schneider wrote:
>> On 20/11/24 18:30, Frederic Weisbecker wrote:
>> > On Wed, Nov 20, 2024 at 06:10:43PM +0100, Valentin Schneider wrote:
>> >> On 20/11/24 15:23, Frederic Weisbecker wrote:
>> >>
>> >> > Ah but there is CT_STATE_GUEST and I see the last patch also applies that to
>> >> > CT_STATE_IDLE.
>> >> >
>> >> > So that could be:
>> >> >
>> >> > bool ct_set_cpu_work(unsigned int cpu, unsigned int work)
>> >> > {
>> >> > 	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
>> >> > 	unsigned int old;
>> >> > 	bool ret = false;
>> >> >
>> >> > 	preempt_disable();
>> >> >
>> >> > 	old = atomic_read(&ct->state);
>> >> >
>> >> > 	/* CT_STATE_IDLE can be added to last patch here */
>> >> > 	if (!(old & (CT_STATE_USER | CT_STATE_GUEST))) {
>> >> > 		old &= ~CT_STATE_MASK;
>> >> > 		old |= CT_STATE_USER;
>> >> > 	}
>> >>
>> >> Hmph, so that lets us leverage the cmpxchg for a !CT_STATE_KERNEL check,
>> >> but we get an extra loop if the target CPU exits kernelspace not to
>> >> userspace (e.g. vcpu or idle) in the meantime - not great, not terrible.
>> >
>> > The thing is, what you read with atomic_read() should be close to reality.
>> > If it already is != CT_STATE_KERNEL then you're good (minus racy changes).
>> > If it is CT_STATE_KERNEL then you still must do a failing cmpxchg() in any
>> > case, at least to make sure you didn't miss a context tracking change. So
>> > the best you can do is a bet.
>> >
>> >> At the cost of one extra bit for the CT_STATE area, with CT_STATE_KERNEL=1
>> >> we could do:
>> >>
>> >> 	old = atomic_read(&ct->state);
>> >> 	old &= ~CT_STATE_KERNEL;
>> >
>> > And perhaps also old |= CT_STATE_IDLE (I'm seeing the last patch now),
>> > so you at least get a chance of making it right (only ~CT_STATE_KERNEL
>> > will always fail) and CPUs usually spend most of their time idle.
>> >
>>
>> I'm thinking with:
>>
>> 	CT_STATE_IDLE = 0,
>> 	CT_STATE_USER = 1,
>> 	CT_STATE_GUEST = 2,
>> 	CT_STATE_KERNEL = 4, /* Keep that as a standalone bit */
>
> Right!
>
>> we can stick with old &= ~CT_STATE_KERNEL; and that'll let the cmpxchg
>> succeed for any of IDLE/USER/GUEST.
>
> Sure, but if (old & CT_STATE_KERNEL), cmpxchg() will consistently fail.
> But you can make a bet that it has switched to CT_STATE_IDLE between
> the atomic_read() and the first atomic_cmpxchg(). This way you still have
> a tiny chance to succeed.
>
> That is:
>
> 	old = atomic_read(&ct->state);
> 	if (old & CT_STATE_KERNEL)
> 		old |= CT_STATE_IDLE;
> 	old &= ~CT_STATE_KERNEL;
>
> 	do {
> 		atomic_try_cmpxchg(...)
>
> Hmm?

But it could equally be CT_STATE_{USER, GUEST}, right? That is, if we have
all of this enabled, then we assume the isolated CPUs spend the least
amount of time in the kernel, and if they don't, we get to blame the user.
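
Concretely, here's a rough and entirely untested sketch of what I have in
mind, assuming the CT_STATE_* encoding above and that the @work bits are
encoded outside of CT_STATE_MASK (so OR-ing them in doesn't clobber the
state field):

bool ct_set_cpu_work(unsigned int cpu, unsigned int work)
{
	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
	unsigned int old;
	bool ret = false;

	preempt_disable();

	old = atomic_read(&ct->state);

	/*
	 * With CT_STATE_IDLE == 0, clearing CT_STATE_KERNEL turns a kernel
	 * snapshot into an idle one: the first cmpxchg bets the CPU has gone
	 * idle since the atomic_read(). A snapshot already in one of
	 * IDLE/USER/GUEST is left untouched and can succeed directly.
	 */
	old &= ~CT_STATE_KERNEL;

	/*
	 * Keep trying for as long as the CPU stays out of the kernel; a
	 * failed cmpxchg refreshes @old with the live value, and a live
	 * CT_STATE_KERNEL means the caller has to fall back to an IPI.
	 */
	do {
		ret = atomic_try_cmpxchg(&ct->state, &old, old | work);
	} while (!ret && !(old & CT_STATE_KERNEL));

	preempt_enable();
	return ret;
}

Note that with CT_STATE_IDLE == 0 the "if (old & CT_STATE_KERNEL) old |=
CT_STATE_IDLE;" step is a no-op, which is why it's folded into the mask
above. That does mean the bet is always on IDLE; betting on USER/GUEST
instead would need a different initial guess.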