Re: [PATCH] mm: madvise: return correct bytes advised with process_madvise

Thanks Amit for the inputs!!

On 3/10/2022 12:20 AM, Nadav Amit wrote:
> ---
> mm/madvise.c | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 38d0f51..d3b49b3 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1426,15 +1426,21 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
> 
> 	while (iov_iter_count(&iter)) {
> 		iovec = iov_iter_iovec(&iter);
> +		/*
> +		 * Even when [start, end) passed to do_madvise covers
> +		 * some unmapped addresses, it continues processing with
> +		 * returning ENOMEM at the end. Thus consider the range
> +		 * as processed when do_madvise() returns ENOMEM.
> +		 * This makes process_madvise() never returns ENOMEM.
> +		 */
> 
> I fully understand and relate to the basic motivation of this
> patch.
> 
> The ENOMEM that this patch checks for, IIUC, is the ENOMEM that is
> returned on unmapped holes. Such ENOMEM does not appear, according to
> the man page, to be a valid reason to return ENOMEM to userspace.
> Presumably process_madvise() is expected to skip unmapped holes
> and not to fail because of them.
>
True, that ENOMEM indicates that the address range passed in contains
unmapped holes. Pasting the documentation of do_madvise():
 *  -ENOMEM - addresses in the specified range are not currently
 *              mapped, or are outside the AS of the process.

Internally, process_madvise() calls do_madvise() in a loop, passing each
address range it received in 'struct iovec'. I agree that
process_madvise() is expected to skip the unmapped holes rather than
fail because of them.
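
To make the intended behaviour concrete, here is a rough userspace model
of that loop. It is not the kernel code: advise_ranges() and the
results[] array are made up for illustration, with results[i] standing in
for whatever do_madvise() would return for the i-th range.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Userspace model only. results[i] is a hypothetical stand-in for what
 * do_madvise() would return for vec[i]: 0 or a negative errno.
 */
static ssize_t advise_ranges(const struct iovec *vec, const int *results,
			     size_t vlen)
{
	size_t advised = 0;
	int ret = 0;

	for (size_t i = 0; i < vlen; i++) {
		ret = results[i];
		/* Any error other than -ENOMEM stops the walk. */
		if (ret < 0 && ret != -ENOMEM)
			break;
		/* -ENOMEM (unmapped hole) still counts as processed. */
		ret = 0;
		advised += vec[i].iov_len;
	}

	/* Report bytes advised; fall back to the error if there were none. */
	return advised ? (ssize_t)advised : ret;
}
```

With this model, a hole in the middle range does not cut the byte count
short, while an error such as -EINVAL still ends the walk at that range.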

> Having said that, I do not think that the check that the patch does
> is clean or clearly documented.

If it is about the documentation, how about adding: "Since
process_madvise() is expected to skip unmapped holes, never return the
ENOMEM received from do_madvise() to the user". If the code changes can
be made cleaner, please suggest how.

> 
> In addition, this patch (and some work on process_madvise()) raise
> in my mind a couple of questions:
> 
> 1. There are other errors that process_madvise might encounter
>    and can be propagated back to userspace, but are not
>    documented. For instance if can_madv_lru_vma() fails on
>    MADV_COLD, userspace will get EINVAL. EINVAL is not documented
>    as a valid error-code for such case in either madvise() and
>    process_madvise() man pages.

I agree about the man pages too, and felt the same while going through
them. For the case you mention, the madvise[1] man page discusses the
EINVAL return value only for MADV_DONTNEED and MADV_REMOVE. It should
also cover MADV_PAGEOUT, MADV_COLD and MADV_FREE. The other missing
error returns that I came across in process_madvise() are:
EINVAL - return from process_madvise_behavior_valid().
EINTR  - from mm_access()
EACCES - from mm_access()
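
Until the man pages cover these, a caller could keep the mapping in a
small helper. The sketch below only restates the sources named in this
thread; process_madvise_errno_source() is a hypothetical name and the
list is not meant to be exhaustive.

```c
#include <errno.h>

/*
 * Hypothetical helper: maps an errno seen after process_madvise() to the
 * kernel-side source named in this thread. Not an exhaustive list.
 */
static const char *process_madvise_errno_source(int err)
{
	switch (err) {
	case EINVAL:
		return "process_madvise_behavior_valid(), "
		       "or can_madv_lru_vma() for MADV_COLD";
	case EINTR:
	case EACCES:
		return "mm_access()";
	case ENOMEM:
		return "do_madvise() on unmapped holes";
	default:
		return "not covered in this thread";
	}
}
```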

Thanks,
Charan



