Re: [RFC PATCH 3/3] mm/migrate: Create move_phys_pages syscall

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Sep 10, 2023, at 14:52, Gregory Price wrote:
> On Sat, Sep 09, 2023 at 05:18:13PM +0200, Arnd Bergmann wrote:
>> 
>> I think a pointer to '__u64' is the most appropriate here,
>> that is compatible between 32-bit and 64-bit architectures
>> and covers all addresses until we get architectures with
>> 128-bit addressing.
>> 
>> Thinking about it more, I noticed an existing bug in
>> both sys_move_pages and your current version of
>> sys_move_phys_pages: the 'pages' array is in fact not
>> suitable for compat tasks and needs an is_compat_task
>> check to load a 32-bit address from compat userspace on
>> the "if (get_user(p, pages + i))" line.
>>
>
> I'll clean up the current implementation for what I have on a v2 of an
> RFC, and then look at adding some pull-ahead patches to fix both
> move_pages and move_phys_pages for compat processes.  Might take me a
> bit, I've only done compat work once before and I remember it being
> annoying to get right.

I think what you want is roughly this (untested):

--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2159,6 +2159,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
                         const int __user *nodes,
                         int __user *status, int flags)
 {
+       struct compat_uptr_t __user *compat_pages = (void __user *)pages;
        int current_node = NUMA_NO_NODE;
        LIST_HEAD(pagelist);
        int start, i;
@@ -2171,8 +2172,17 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
                int node;
 
                err = -EFAULT;
-               if (get_user(p, pages + i))
-                       goto out_flush;
+               if (in_compat_syscall() {
+                       compat_uptr_t cp;
+
+                       if (get_user(cp, compat_pages + i))
+                               goto out_flush;
+
+                       p = compat_ptr(cp);
+               } else {
+                       if (get_user(p, pages + i))
+                               goto out_flush;
+               }
                if (get_user(node, nodes + i))
                        goto out_flush;
 
alternatively you could use the get_compat_pages_array()
helper that is already used in the do_pages_stat()
function.

> I did see other work on migrate.c hanging around on the list, I'll
> double check this hasn't already been discovered/handled.

It looks like I broke it, and it was working before my own
5b1b561ba73c8 ("mm: simplify compat_sys_move_pages"), which
only handled the nodes=NULL path.

I suppose nobody noticed the regression because there are very
few 32-bit NUMA systems, and even fewer cases in which one
would run compat userspace to manage a 64-bit NUMA machine.

>
> This only requires plumbing new 2 flags through do_pages_move, and no
> new user-exposed types or information.
>
> Is there an ick-factor with the idea of adding the following?
>
> MPOL_MF_PHYS_ADDR : Treat page migration addresses as physical
> MPOL_MF_PFN : Treat page migration addresses as PFNs

I would strongly prefer supporting only one of the two, and
a 64-bit physical address seems like the logical choice here.

>> These do not seem to be problematic from the ASLR perspective, so
>> I guess it may still be useful without CAP_SYS_ADMIN.
>
> After reviewing the capabilities documentation it seems like
> CAP_SYS_NICE is the appropriate capability.  My last meassage I said
> CAP_SYS_ADMIN was probably correct, but I think using CAP_SYS_NICE
> is more appropriate unless there are strong feelings due to the use of
> PFN and Physcall Address.
>
> I'm not sure rowhammer is of great concern in this interface because you
> can't choose the destination address, only the destination node. Though
> I suppose someone could go nuts and try to "massage" a node in some way
> to get a statistical likelihood of placement (similar heap grooming).

I agree that this doesn't introduce any additional risk for rowhammer
attacks, but it seems slightly more logical to me to use CAP_SYS_ADMIN
if that is what the other interfaces use that handle physical addresses
and may leak address information.

     Arnd



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux