Hello Suren,

On 2/2/21 11:12 PM, Suren Baghdasaryan wrote:
> Hi Michael,
>
> On Tue, Feb 2, 2021 at 2:45 AM Michael Kerrisk (man-pages)
> <mtk.manpages@xxxxxxxxx> wrote:
>>
>> Hello Suren (and Minchan and Michal)
>>
>> Thank you for the revisions!
>>
>> I've applied this patch, and done a few light edits.
>
> Thanks!
>
>>
>> However, I have a question about undocumented pieces in *madvise(2)*,
>> as well as one other question. See below.
>>
>> On 2/2/21 6:30 AM, Suren Baghdasaryan wrote:
>>> Initial version of process_madvise(2) manual page. Initial text was
>>> extracted from [1], amended after fix [2] and more details added using
>>> man pages of madvise(2) and process_vm_read(2) as examples. It also
>>> includes the changes to required permission proposed in [3].
>>>
>>> [1] https://lore.kernel.org/patchwork/patch/1297933/
>>> [2] https://lkml.org/lkml/2020/12/8/1282
>>> [3] https://patchwork.kernel.org/project/selinux/patch/20210111170622.2613577-1-surenb@xxxxxxxxxx/#23888311
>>>
>>> Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
>>> Reviewed-by: Michal Hocko <mhocko@xxxxxxxx>
>>> ---
>>> changes in v2:
>>> - Changed description of MADV_COLD per Michal Hocko's suggestion
>>> - Applied fixes suggested by Michael Kerrisk
>>> changes in v3:
>>> - Added Michal's Reviewed-by
>>> - Applied additional fixes suggested by Michael Kerrisk
>>>
>>> NAME
>>> process_madvise - give advice about use of memory to a process
>>>
>>> SYNOPSIS
>>> #include <sys/uio.h>
>>>
>>> ssize_t process_madvise(int pidfd,
>>>                         const struct iovec *iovec,
>>>                         unsigned long vlen,
>>>                         int advice,
>>>                         unsigned int flags);
>>>
>>> DESCRIPTION
>>> The process_madvise() system call is used to give advice or directions
>>> to the kernel about the address ranges of another process or the calling
>>> process. It provides the advice to the address ranges described by iovec
>>> and vlen. The goal of such advice is to improve system or application
>>> performance.
>>>
>>> The pidfd argument is a PID file descriptor (see pidfd_open(2)) that
>>> specifies the process to which the advice is to be applied.
>>>
>>> The pointer iovec points to an array of iovec structures, defined in
>>> <sys/uio.h> as:
>>>
>>> struct iovec {
>>>     void *iov_base; /* Starting address */
>>>     size_t iov_len; /* Number of bytes to transfer */
>>> };
>>>
>>> The iovec structure describes address ranges beginning at iov_base address
>>> and with the size of iov_len bytes.
>>>
>>> The vlen represents the number of elements in the iovec structure.
>>>
>>> The advice argument is one of the values listed below.
>>>
>>> Linux-specific advice values
>>> The following Linux-specific advice values have no counterparts in the
>>> POSIX-specified posix_madvise(3), and may or may not have counterparts
>>> in the madvise(2) interface available on other implementations.
>>>
>>> MADV_COLD (since Linux 5.4.1)
>>
>> I just noticed these version numbers now, and thought: they can't be
>> right (because the system call appeared only in v5.11). So I removed
>> them. But, of course in another sense the version numbers are (nearly)
>> right, since these advice values were added for madvise(2) in Linux 5.4.
>> However, they are not documented in the madvise(2) manual page. Is it
>> correct to assume that MADV_COLD and MADV_PAGEOUT have exactly the same
>> meaning in madvise(2) (but just for the calling process, of course)?
>
> Correct. They should be added in the madvise(2) man page as well IMHO.

So, I decided to move the description of MADV_COLD and MADV_PAGEOUT
to madvise(2) and refer to that page from the process_madvise(2)
page. This avoids repeating the same information in two places.

>>> Deactive a given range of pages which will make them a more probable
>>
>> I changed: s/Deactive/Deactivate/
>
> thanks!
>
>>
>>> reclaim target should there be a memory pressure. This is a
>>> nondestructive operation.
>>> The advice might be ignored for some pages
>>> in the range when it is not applicable.
>>>
>>> MADV_PAGEOUT (since Linux 5.4.1)
>>> Reclaim a given range of pages. This is done to free up memory occupied
>>> by these pages. If a page is anonymous it will be swapped out. If a
>>> page is file-backed and dirty it will be written back to the backing
>>> storage. The advice might be ignored for some pages in the range when
>>> it is not applicable.
>>
>> [...]
>>
>>> The hint might be applied to a part of iovec if one of its elements points
>>> to an invalid memory region in the remote process. No further elements will
>>> be processed beyond that point.
>>
>> Is the above scenario the one that leads to the partial advice case described in
>> RETURN VALUE? If yes, perhaps I should add some words to make that clearer.
>
> Correct. This describes the case when partial advice happens.

Thanks. I added a few words to clarify this.

>> You can see the light edits that I made in
>> https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=e3ce016472a1b3ec5dffdeb23c98b9fef618a97b
>> and following that I restructured DESCRIPTION a little in
>> https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=3aac0708a9acee5283e091461de6a8410bc921a6
>
> The edits LGTM.

Thanks for checking them.

Cheers,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
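
For readers following the thread, here is a minimal sketch (not part of the patch) of how the interface described above might be called and how the partial-advice case could be detected. It assumes the call is made through syscall(2), since older C libraries provide no wrapper. The helper names iov_total() and advise_process() are hypothetical, and the fallback syscall numbers and MADV_COLD value are assumptions for headers that lack them; check <asm/unistd.h> on your architecture.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Assumed fallback values for older headers; verify for your architecture. */
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_process_madvise
#define SYS_process_madvise 440
#endif
#ifndef MADV_COLD
#define MADV_COLD 20
#endif

/* Hypothetical helper: total number of bytes described by the iovec array. */
static size_t iov_total(const struct iovec *iov, size_t vlen)
{
    size_t total = 0;
    for (size_t i = 0; i < vlen; i++)
        total += iov[i].iov_len;
    return total;
}

/*
 * Hypothetical helper: apply 'advice' to the ranges of process 'pid'
 * described by iov[0..vlen-1].
 * Returns 1 if every byte was advised, 0 if only part was advised
 * (for example, an element pointed to an invalid region and processing
 * stopped there), and -1 on error with errno set (for example ENOSYS on
 * a kernel without the system call, or EPERM).
 */
static int advise_process(pid_t pid, const struct iovec *iov, size_t vlen,
                          int advice)
{
    assert(iov != NULL || vlen == 0);

    /* Obtain a PID file descriptor for the target process. */
    int pidfd = (int) syscall(SYS_pidfd_open, pid, 0U);
    if (pidfd == -1)
        return -1;

    /* The flags argument is passed as 0 here. */
    long n = syscall(SYS_process_madvise, pidfd, iov,
                     (unsigned long) vlen, advice, 0U);
    int saved_errno = errno;
    close(pidfd);
    errno = saved_errno;

    if (n == -1)
        return -1;

    /* Fewer bytes than requested means the advice was applied partially. */
    return (size_t) n == iov_total(iov, vlen) ? 1 : 0;
}
```

A caller would fill an array of struct iovec with page-aligned ranges in the target process and call, for instance, advise_process(pid, iov, vlen, MADV_COLD). A return value of 0 corresponds to the partial-advice case discussed above.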