Re: [PATCHv3 6/8] x86/mm: Provide ARCH_GET_UNTAG_MASK and ARCH_ENABLE_TAGGED_ADDR

"Kirill A. Shutemov" <kirill@xxxxxxxxxxxxx> · Mon, 20 Jun 2022 02:40:27 +0300

On Thu, Jun 16, 2022 at 08:05:10PM +0300, Kirill A. Shutemov wrote:
> On Mon, Jun 13, 2022 at 04:42:57PM +0200, Michal Hocko wrote:
> > On Fri 10-06-22 17:35:25, Kirill A. Shutemov wrote:
> > [...]
> > > diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
> > > index 1962008fe743..93c8eba1a66d 100644
> > > --- a/arch/x86/kernel/process_64.c
> > > +++ b/arch/x86/kernel/process_64.c
> > > @@ -742,6 +742,32 @@ static long prctl_map_vdso(const struct vdso_image *image, unsigned long addr)
> > >  }
> > >  #endif
> > >  
> > > +static int prctl_enable_tagged_addr(unsigned long nr_bits)
> > > +{
> > > +	struct mm_struct *mm = current->mm;
> > > +
> > > +	/* Already enabled? */
> > > +	if (mm->context.lam_cr3_mask)
> > > +		return -EBUSY;
> > > +
> > > +	/* LAM has to be enabled before spawning threads */
> > > +	if (get_nr_threads(current) > 1)
> > > +		return -EBUSY;
> > 
> > This will not be sufficient in general. You can have an mm shared with a
> > process without CLONE_THREAD, so you would also need to check
> > MMF_MULTIPROCESS. But I do remember that get_nr_threads in general is quite
> > tricky to use properly. Make sure to CC Oleg Nesterov for more details.
> > 
> > Also how does this work when the mm is shared with a kernel thread?
> 
> It seems we need to check mm_count to exclude kernel threads that use the
> mm. But I expect it to produce a bunch of false positives.
> 
> Or we can make all CPUs do
> 
> 	switch_mm(current->mm, current->mm, current);
> 
> and get the LAM bits updated regardless of which mm each CPU is running. It
> would also remove the limitation that LAM can only be enabled when there
> are no other threads.
> 
> I feel that is a bad idea, but I have no clue why. :P

Below is what I meant. Maybe it's not that bad. I dunno.

Any opinions?
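
To make the intent easier to discuss, here is a rough userspace model of
what the diff below is meant to do. It is only a sketch: every sim_* name
and SIM_LAM_MASK are made up for illustration, the loop stands in for
on_each_cpu_mask() (which in the kernel would only IPI the CPUs in
mm_cpumask(mm)), and sim_reload_cr3() stands in for the CR3 reload that
switch_mm() is expected to trigger.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SIM_NR_CPUS  4
#define SIM_LAM_MASK 0x1UL	/* placeholder value, not a real CR3 bit */

/* Hypothetical stand-in for the LAM state kept in mm->context */
struct sim_mm {
	unsigned long lam_cr3_mask;
};

/* Hypothetical per-CPU state: the loaded mm and the LAM bits in "CR3" */
struct sim_cpu {
	const struct sim_mm *loaded_mm;
	unsigned long cr3_lam_bits;
};

static struct sim_cpu sim_cpus[SIM_NR_CPUS];

/* Stand-in for the CR3 reload done via switch_mm(mm, mm, current) */
static void sim_reload_cr3(struct sim_cpu *cpu)
{
	cpu->cr3_lam_bits = cpu->loaded_mm->lam_cr3_mask;
}

/* Models enable_lam_func(): skip CPUs that run some other mm */
static void sim_enable_lam_func(struct sim_cpu *cpu, const struct sim_mm *mm)
{
	assert(cpu != NULL);

	if (cpu->loaded_mm != mm)
		return;

	sim_reload_cr3(cpu);
}

/* Models the tail of prctl_enable_tagged_addr() after the change */
static int sim_enable_lam(struct sim_mm *mm, unsigned long mask)
{
	int i;

	/* Already enabled? */
	if (mm->lam_cr3_mask)
		return -EBUSY;

	mm->lam_cr3_mask = mask;

	/* Stand-in for on_each_cpu_mask(mm_cpumask(mm), ..., true) */
	for (i = 0; i < SIM_NR_CPUS; i++)
		sim_enable_lam_func(&sim_cpus[i], mm);

	return 0;
}
```

In this model, CPUs that have the mm loaded pick up the new mask no matter
which task is running on them, and CPUs running a different mm are left
alone. The model does not cover lazy TLB mode or a CPU that loads the mm
while the mask is being set.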

diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 56822d313b96..69e6b11efa62 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -752,6 +752,16 @@ static bool lam_u48_allowed(void)
 	return find_vma(mm, DEFAULT_MAP_WINDOW) == NULL;
 }
 
+static void enable_lam_func(void *mm)
+{
+	struct mm_struct *loaded_mm = this_cpu_read(cpu_tlbstate.loaded_mm);
+
+	if (loaded_mm != mm)
+		return;
+
+	switch_mm(loaded_mm, loaded_mm, current);
+}
+
 static int prctl_enable_tagged_addr(unsigned long nr_bits)
 {
 	struct mm_struct *mm = current->mm;
@@ -760,10 +770,6 @@ static int prctl_enable_tagged_addr(unsigned long nr_bits)
 	if (mm->context.lam_cr3_mask)
 		return -EBUSY;
 
-	/* LAM has to be enabled before spawning threads */
-	if (get_nr_threads(current) > 1)
-		return -EBUSY;
-
 	if (!nr_bits) {
 		return -EINVAL;
 	} else if (nr_bits <= 6) {
@@ -785,8 +791,8 @@ static int prctl_enable_tagged_addr(unsigned long nr_bits)
 		return -EINVAL;
 	}
 
-	/* Update CR3 to get LAM active */
-	switch_mm(current->mm, current->mm, current);
+	on_each_cpu_mask(mm_cpumask(mm), enable_lam_func, mm, true);
+
 	return 0;
 }
 
-- 
 Kirill A. Shutemov