Hi Finn,
On 23.03.2024 at 17:35, Finn Thain wrote:
On Fri, 22 Mar 2024, Michael Schmitz wrote:
On 22.03.2024 at 17:53, Michael Schmitz wrote:
On 22.03.2024 at 17:39, Finn Thain wrote:
I find that patch description to be a bit confusing, since
preempt_schedule_irq() requires that interrupts were disabled:
BUG_ON(preempt_count() || !irqs_disabled());
Looking more closely I see that you're testing the IPL bits from the
stack not the status register...
Yes - the problem appears to be that if we enter preemption when about
to return to kernel code that had interrupts disabled, bad things may
happen [...]
That's independent of having interrupts disabled in the currently
active exception.
Is it? If interrupts were disabled by the code we're returning to, surely
they must be disabled still (in the Status reg, for the BUG_ON above).
They would still be disabled, but by calling preempt_schedule_irq() they
will be reenabled before calling schedule(). At that time, another
interrupt may come in, and change things from under the piece of code
that had interrupts disabled.
That won't happen without preemption.
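To make the distinction between the live SR and the SR saved in the
exception frame concrete, here is a minimal, standalone C sketch of the
test on the saved frame. It is not the actual patch: sr_ipl() and
frame_had_irqs_enabled() are hypothetical names made up for illustration.
The only hard fact assumed is the m68k SR layout, where bits 8-10 hold the
interrupt priority mask.

```c
#include <stdbool.h>
#include <stdint.h>

/* m68k status register: bits 8-10 hold the interrupt priority mask (IPL). */
#define SR_IPL_MASK	0x0700u
#define SR_IPL_SHIFT	8

/* Return the IPL (0..7) encoded in a status register value. */
static unsigned int sr_ipl(uint16_t sr)
{
	return (sr & SR_IPL_MASK) >> SR_IPL_SHIFT;
}

/*
 * Hypothetical helper: true if the code we are about to return to was
 * running with all interrupt levels unmasked (IPL == 0 in the saved SR).
 * This looks at the SR stored in the exception frame, not at the live
 * SR, which must have interrupts disabled for preempt_schedule_irq().
 */
static bool frame_had_irqs_enabled(uint16_t saved_sr)
{
	return sr_ipl(saved_sr) == 0;
}
```

With this, a saved SR of 0x2700 (supervisor mode, IPL 7) would be treated
as "interrupted code had interrupts disabled", while 0x2000 would not.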
What's more, I suspect schedule() may cause another process to exit that
would otherwise first take a signal, and signal delivery then repeats
the vma teardown on process exit. At least that is how I read those
'table already free' stack traces.
It's not clear to me which pre-emption opportunities remain in
effect. E.g. is full preemption (in effect) the same as voluntary
preemption?
Good question - I'll have to instrument preempt_schedule_irq() and see
if it gets called at all with my patch.
Can't see preempt_schedule_irq() entered with my patch ...
Neither can I: A quick "stress-ng --zombie -1 -t 100" test passed without
ever calling preempt_schedule_irq() in QEMU.
Right. Whenever I allow preemption with IPL (in the saved frame's SR) >
0, I eventually get the 'table already free' panic. Trying to save the
whole frame for later printing in free_pointer_table() hasn't been too
helpful though. The saved frame's PC does not appear anywhere in the trace.
In general, full preemption is not the same as voluntary preemption.
Full preemption attempts to preempt on every return from interrupt or
syscall (given certain constraints are fulfilled such as we're not
currently in any interrupt or already preempting, and there's other work
that needs scheduling).
The constraint that interrupts must not be disabled in the saved stack
frame apparently is one constraint too many...
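As a rough illustration of how those constraints combine, the decision
could be modelled in plain C along the following lines. This is only a
sketch under assumptions: struct ret_state and may_preempt_on_return() are
hypothetical and do not exist in the kernel, and the real check lives in
the assembly return path rather than in a C helper like this.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the state seen on return from an exception. */
struct ret_state {
	int preempt_count;	/* non-zero: in interrupt or preemption disabled */
	bool need_resched;	/* other work is waiting to be scheduled */
	uint16_t saved_sr;	/* SR from the saved exception frame */
};

/* IPL mask bits (8-10) of the m68k status register. */
#define SAVED_SR_IPL_MASK	0x0700u

/*
 * Sketch of the preemption decision: preempt only if we are not in an
 * interrupt or already preempting, there is work to schedule, and the
 * interrupted code did not have any interrupt level masked.
 */
static bool may_preempt_on_return(const struct ret_state *s)
{
	if (s->preempt_count != 0)
		return false;
	if (!s->need_resched)
		return false;
	if (s->saved_sr & SAVED_SR_IPL_MASK)
		return false;	/* the extra constraint discussed here */
	return true;
}
```

Dropping the last test gives the behaviour that eventually ends in the
panic; keeping it seems to suppress preemption altogether in the tests
described above.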
I dropped the mailing list from the recipients in my previous message.
This might be a good time to follow up on the list in case Geert can shed
some light.
Adding Geert and list back in now ...
Cheers,
Michael