On Wed, Mar 09, 2022 at 10:57:59AM +0530, Charan Teja Kalla wrote: > The process_madvise() system call returns error even after processing > some VMA's passed in the 'struct iovec' vector list which leaves the > user confused to know where to restart the advise next. It is also > against this syscall man page[1] documentation where it mentions that > "return value may be less than the total number of requested bytes, if > an error occurred after some iovec elements were already processed.". > > Consider a user passed 10 VMA's in the 'struct iovec' vector list of > which 9 are processed but one. Then it just returns the error caused on > that failed VMA despite the first 9 VMA's processed, leaving the user > confused about on which VMA it is failed. Returning the number of bytes > processed here can help the user to know which VMA it is failed on and > thus can retry/skip the advise on that VMA. > > [1]https://man7.org/linux/man-pages/man2/process_madvise.2.html. > > Fixes: ecb8ac8b1f14("mm/madvise: introduce process_madvise() syscall: an external memory hinting API" > Signed-off-by: Charan Teja Kalla <quic_charante@xxxxxxxxxxx> > --- > mm/madvise.c | 12 +++++++++--- > 1 file changed, 9 insertions(+), 3 deletions(-) > > diff --git a/mm/madvise.c b/mm/madvise.c > index 38d0f51..d3b49b3 100644 > --- a/mm/madvise.c > +++ b/mm/madvise.c > @@ -1426,15 +1426,21 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec, > > while (iov_iter_count(&iter)) { > iovec = iov_iter_iovec(&iter); > + /* > + * Even when [start, end) passed to do_madvise covers > + * some unmapped addresses, it continues processing with > + * returning ENOMEM at the end. Thus consider the range > + * as processed when do_madvise() returns ENOMEM. > + * This makes process_madvise() never returns ENOMEM. > + */ Looks like that this patch has two things. first, returns processed bytes instead of error in case of error. Second, keep working on rest vmas on -ENOMEM due to unmapped hole. First thing totally makes sense to me(that's exactly I wanted to do but somehow missed) so it should go stable tree. However, second stuff might be arguble so it would be great if you split the patch. > ret = do_madvise(mm, (unsigned long)iovec.iov_base, > iovec.iov_len, behavior); > - if (ret < 0) > + if (ret < 0 && ret != -ENOMEM) > break; > iov_iter_advance(&iter, iovec.iov_len); > } > > - if (ret == 0) > - ret = total_len - iov_iter_count(&iter); > + ret = (total_len - iov_iter_count(&iter)) ? : ret; > > release_mm: > mmput(mm); > -- > 2.7.4 >