On Wed, May 11 2022 at 05:27, Kirill A. Shutemov wrote: > diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c > index f9fe71d1f42c..b320556e1c22 100644 > --- a/arch/x86/mm/tlb.c > +++ b/arch/x86/mm/tlb.c > @@ -185,6 +185,34 @@ static u8 gen_lam(struct task_struct *tsk, struct mm_struct *mm) > if (!tsk) > return LAM_NONE; > > + if (tsk->flags & PF_KTHREAD) { > + /* > + * For kernel thread use the most permissive LAM > + * used by the mm. It's required to handle kernel thread > + * memory accesses on behalf of a process. > + * > + * Adjust thread flags accodringly, so untagged_addr() would > + * work correctly. > + */ > + > + tsk->thread.features &= ~(X86_THREAD_LAM_U48 | > + X86_THREAD_LAM_U57); > + > + switch (mm->context.lam) { > + case LAM_NONE: > + return LAM_NONE; > + case LAM_U57: > + tsk->thread.features |= X86_THREAD_LAM_U57; > + return LAM_U57; > + case LAM_U48: > + tsk->thread.features |= X86_THREAD_LAM_U48; > + return LAM_U48; Pretending that LAM is configurable per thread and then having a magic override in the per process mm when accessing that process' memory from a kernel thread is inconsistent, a horrible hack and a recipe for hard to diagnose problems. LAM has to be enabled by the process _before_ creating threads and then stay enabled until the whole thing dies. That's the only sensible use case. I understand that tsk->thread.features is conveniant for the untagging mechanism, but the whole setup should be: prctl(ENABLE, which) if (can_enable_lam(which)) { mm->lam.c3_mask = CR3_LAM(which); mm->lam.untag_mask = UNTAG_LAM(which); current->thread.lam_untag_mask = mm->lam.untag_mask; } and can_enable_lam(which) if (current_is_multithreaded()) return -ETOOLATE; if (current->mm->lam_cr3_mask) return -EBUSY; .... Now vs. kernel threads. Doing this like the above is just the wrong place. If a kernel thread accesses user space memory of a process then it has to invoke kthread_use_mm(), right? So the obvious point to cache that setting is in kthread_use_mm() and kthread_unuse_mm() clears it: kthread_use_mm() current->thread.lam_untag_mask = mm->lam.untag_mask; kthread_unuse_mm() current->thread.lam_untag_mask = 0; This makes all of the mechanics trivial because CR3 switch then simply does: new_cr3 |= mm->lam.c3_mask; No conditionals and evaluations, nothing. Just straight forward and comprehensible code. Thanks, tglx