Re: [RFC PATCH 3/3] mm/migrate: Create move_phys_pages syscall

Thomas Gleixner <tglx@xxxxxxxxxxxxx> · Tue, 19 Sep 2023 02:17:15 +0200

On Thu, Sep 07 2023 at 03:54, Gregory Price wrote:
> Similar to the move_pages system call, instead of taking a pid and
> list of virtual addresses, this system call takes a list of physical
> addresses.

Silly question. Where are these physical addresses coming from?

In my naive understanding user space deals with virtual addresses for a
reason.

Exposing physical addresses to user space definitely helps in writing
more powerful exploits, so what restrictions are applied to this?

> +/*
> + * Move a list of pages in the address space of the currently executing
> + * process.
> + */
> +static int kernel_move_phys_pages(unsigned long nr_pages,
> +				  const void __user * __user *pages,
> +				  const int __user *nodes,
> +				  int __user *status, int flags)
> +{
> +	int err;
> +	nodemask_t target_nodes;
> +
> +	/* Check flags */

Documenting the obvious ...

> +	if (flags & ~(MPOL_MF_MOVE|MPOL_MF_MOVE_ALL))
> +		return -EINVAL;
> +
> +	if ((flags & MPOL_MF_MOVE_ALL) && !capable(CAP_SYS_NICE))
> +		return -EPERM;

According to this logic, MPOL_MF_MOVE is unrestricted, right?

But how does an unprivileged process know which physical addresses the
pages have? Confused....
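
For reference, the one user space interface I am aware of that reports
physical frame numbers is /proc/self/pagemap, and since Linux 4.2 the PFN
field reads back as zero unless the caller has CAP_SYS_ADMIN. A minimal
user space sketch of that lookup follows; the helper names (pagemap_read,
pagemap_pfn, pagemap_present) are made up for illustration and are not
part of the patch:

```c
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Layout of a 64-bit pagemap entry (Documentation/admin-guide/mm/pagemap.rst) */
#define PM_PFN_MASK	((UINT64_C(1) << 55) - 1)	/* bits 0-54: PFN if present */
#define PM_PRESENT	(UINT64_C(1) << 63)		/* bit 63: page present in RAM */

/* Hypothetical helper: non-zero if the entry describes a present page. */
static int pagemap_present(uint64_t entry)
{
	return (entry & PM_PRESENT) != 0;
}

/*
 * Hypothetical helper: extract the PFN field. Without CAP_SYS_ADMIN the
 * kernel zeroes this field, so an unprivileged caller gets 0 here.
 */
static uint64_t pagemap_pfn(uint64_t entry)
{
	return entry & PM_PFN_MASK;
}

/*
 * Hypothetical helper: read the pagemap entry covering @vaddr in the
 * calling process. Returns 0 on success, -1 on any failure.
 */
static int pagemap_read(const void *vaddr, uint64_t *entry)
{
	long psize = sysconf(_SC_PAGESIZE);
	ssize_t n;
	off_t off;
	int fd;

	if (psize <= 0)
		return -1;

	/* One 64-bit entry per virtual page, indexed by virtual page number */
	off = (off_t)((uintptr_t)vaddr / (uintptr_t)psize) * (off_t)sizeof(uint64_t);

	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;

	n = pread(fd, entry, sizeof(*entry), off);
	close(fd);

	return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

If that is the intended source of the addresses, an unprivileged caller
would only ever see a PFN of 0, which makes the unrestricted MPOL_MF_MOVE
path even harder to reason about.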

> +	/* All tasks mapping each page is checked in phys_page_migratable */
> +	nodes_setall(target_nodes);

How is the comment related to nodes_setall() and why is nodes_setall()
unconditional when target_nodes is only used in the @nodes != NULL case?

> +	if (nodes)
> +		err = do_pages_move(NULL, target_nodes, nr_pages, pages,
> +			nodes, status, flags);
> +	else
> +		err = do_pages_stat(NULL, nr_pages, pages, status);

Thanks,

        tglx