On 1/19/23 6:22 AM, Nicholas Piggin wrote:
On Thu Jan 19, 2023 at 8:22 AM AEST, Nadav Amit wrote:
On Jan 18, 2023, at 12:00 AM, Nicholas Piggin <npiggin@xxxxxxxxx> wrote:
+static void do_shoot_lazy_tlb(void *arg)
+{
+	struct mm_struct *mm = arg;
+
+	if (current->active_mm == mm) {
+		WARN_ON_ONCE(current->mm);
+		current->active_mm = &init_mm;
+		switch_mm(mm, &init_mm, current);
+	}
+}
I might be out of touch - doesn’t a flush already take place when we free
the page-tables, at least in the common cases on x86?
IIUC exit_mmap() would free page-tables, and whenever page-tables are
freed, on x86, we do a shootdown regardless of whether the target CPU's TLB
state is marked is_lazy. Then, flush_tlb_func() should call switch_mm_irqs_off() and
everything should be fine, no?
[ I understand you care about powerpc, just wondering on the effect on x86 ]
Now I come to think of it, Rik had done this for x86 a while back.
https://lore.kernel.org/all/20180728215357.3249-10-riel@xxxxxxxxxxx/
I didn't know about it when I wrote this, so I never dug into why it
didn't get merged. It might have missed the final __mmdrop races but
I'm not sure; x86 lazy TLB mode is too complicated to know at a
glance. I would check with him though.
My point was that naturally (i.e., as done today), when exit_mmap() is
done, you release the page tables (not just the pages). On x86 it means
that you also send a shootdown IPI to all the *lazy* CPUs to perform a
flush, so they exit lazy mode.
[ this should be true for 99% of the cases, excluding cases where there
were no page-tables, for instance ]
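
To make the point above concrete, here is a rough userspace sketch of the
behaviour being described. It is not kernel code: toy_mm, toy_cpu,
toy_flush_tlb_func() and toy_free_pgtables_shootdown() are hypothetical
names made up for illustration, and the per-CPU loaded_mm/is_lazy fields
only loosely mirror what x86 tracks. It assumes, as the paragraph above
states, that freeing page tables sends the shootdown to lazy CPUs too, and
that a lazy CPU reacts by switching to init_mm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct mm_struct; only pointer identity matters here. */
struct toy_mm {
	const char *name;
};

/* Hypothetical per-CPU state, loosely modelled on x86's TLB state. */
struct toy_cpu {
	struct toy_mm *loaded_mm;	/* mm whose page tables are loaded */
	bool is_lazy;			/* true if the mm is only borrowed */
};

static struct toy_mm toy_init_mm = { "init_mm" };

/* What one CPU does when the modelled shootdown IPI for @mm arrives. */
static void toy_flush_tlb_func(struct toy_cpu *cpu, struct toy_mm *mm)
{
	if (cpu->loaded_mm != mm)
		return;			/* not using this mm, nothing to do */

	if (cpu->is_lazy) {
		/* Lazy user: drop the borrowed mm rather than flushing. */
		cpu->loaded_mm = &toy_init_mm;
		cpu->is_lazy = false;
		return;
	}

	/* A non-lazy user would flush its local TLB here (not modelled). */
}

/*
 * Model of the shootdown sent when page tables of @mm are freed: every
 * CPU is targeted, lazy or not. Returns how many CPUs left lazy mode.
 */
static int toy_free_pgtables_shootdown(struct toy_cpu cpus[], int ncpus,
				       struct toy_mm *mm)
{
	int exited = 0;

	assert(cpus != NULL && mm != NULL);

	for (int i = 0; i < ncpus; i++) {
		bool was_lazy = cpus[i].is_lazy && cpus[i].loaded_mm == mm;

		toy_flush_tlb_func(&cpus[i], mm);
		if (was_lazy)
			exited++;
	}
	return exited;
}
```

In this model, a CPU that was lazily holding the exiting mm ends up on
toy_init_mm after toy_free_pgtables_shootdown(), with no separate
shoot-lazy pass needed; CPUs using a different mm are left alone. Note the
sketch says nothing about current->active_mm, which is what
do_shoot_lazy_tlb() in the patch changes.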
So Rik's patch, I think, does not help in the common cases,
although it may perhaps make implicit actions more explicit in the code.