On Fri 11-03-22 20:59:06, Charan Teja Kalla wrote:
> The process_madvise() system call is expected to skip holes in vma
> passed through 'struct iovec' vector list.

Where is this assumption coming from? From the man page I can see:
: The advice might be applied to only a part of iovec if one of its
: elements points to an invalid memory region in the remote
: process. No further elements will be processed beyond that
: point.

> But do_madvise, which
> process_madvise() calls for each vma, returns ENOMEM in case of unmapped
> holes, despite the VMA is processed.
> Thus process_madvise() should treat ENOMEM as expected and consider the
> VMA passed to as processed and continue processing other vma's in the
> vector list. Returning -ENOMEM to user, despite the VMA is processed,
> will be unable to figure out where to start the next madvise.

I am not sure I follow. With your previous patch and -ENOMEM from
do_madvise you get the answer you are looking for, no? With this applied
you are losing the information that some of the iters are not mapped or
have a hole. That might be useful information, especially when operating
on remote tasks, which are free to manipulate their address spaces.

> Fixes: ecb8ac8b1f14("mm/madvise: introduce process_madvise() syscall: an external memory hinting API")
> Cc: <stable@xxxxxxxxxxxxxxx> # 5.10+
> Signed-off-by: Charan Teja Kalla <quic_charante@xxxxxxxxxxx>
> ---
> Changes in V2:
>  -- Fixed handling of ENOMEM by process_madvise().
>  -- Patch doesn't exist in V1.
>
>  mm/madvise.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index e97e6a9..14fb76d 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1426,9 +1426,16 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
>
>  	while (iov_iter_count(&iter)) {
>  		iovec = iov_iter_iovec(&iter);
> +		/*
> +		 * do_madvise returns ENOMEM if unmapped holes are present
> +		 * in the passed VMA. process_madvise() is expected to skip
> +		 * unmapped holes passed to it in the 'struct iovec' list
> +		 * and not fail because of them. Thus treat -ENOMEM return
> +		 * from do_madvise as valid and continue processing.
> +		 */
>  		ret = do_madvise(mm, (unsigned long)iovec.iov_base,
>  					iovec.iov_len, behavior);
> -		if (ret < 0)
> +		if (ret < 0 && ret != -ENOMEM)
>  			break;
>  		iov_iter_advance(&iter, iovec.iov_len);
>  	}
> --
> 2.7.4

-- 
Michal Hocko
SUSE Labs
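
To make the concern about lost information concrete, here is a small
userspace model of the loop in the quoted hunk. It is only a sketch, not
kernel code: struct model_range, its has_hole flag, model_do_madvise() and
model_process_madvise() are hypothetical names made up for illustration.
It also assumes that the syscall reports the number of bytes advanced
before the first failure, which is how I read the man page text quoted
above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one iovec element. 'has_hole' is an
 * assumption of this model, not a kernel field. */
struct model_range {
	size_t len;
	int has_hole;
};

/* Model of do_madvise(): -ENOMEM if the range contains an unmapped hole. */
static int model_do_madvise(const struct model_range *r)
{
	return r->has_hole ? -ENOMEM : 0;
}

/*
 * Model of the process_madvise() loop.
 * ignore_enomem == 0: stop at the first error (the "if (ret < 0)" form).
 * ignore_enomem == 1: skip over -ENOMEM (the form proposed in the patch).
 * Returns the bytes advanced, or the error if nothing was advanced.
 */
static long model_process_madvise(const struct model_range *vec, size_t n,
				  int ignore_enomem)
{
	size_t total = 0;
	size_t i;
	int ret = 0;

	assert(vec != NULL || n == 0);

	for (i = 0; i < n; i++) {
		ret = model_do_madvise(&vec[i]);
		if (ret < 0 && !(ignore_enomem && ret == -ENOMEM))
			break;
		total += vec[i].len;
	}
	return total ? (long)total : ret;
}
```

For a vector of { 4096, no hole }, { 8192, hole }, { 4096, no hole }, the
model returns 4096 when it stops on the error, so the caller can tell that
the second element is where things went wrong. With -ENOMEM ignored it
returns 16384, the full length, and the hole is invisible to the caller.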