Hi Peter,

Please see a few comments below.

Thanks,

Alex

On 3/22/21 11:08 PM, Peter Xu wrote:
Userfaultfd write-protect mode is supported starting from Linux 5.7.

Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
---
 man2/ioctl_userfaultfd.2 | 84 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 81 insertions(+), 3 deletions(-)

diff --git a/man2/ioctl_userfaultfd.2 b/man2/ioctl_userfaultfd.2
index d4a8375b8..5419687a6 100644
--- a/man2/ioctl_userfaultfd.2
+++ b/man2/ioctl_userfaultfd.2
@@ -234,6 +234,11 @@ operation is supported.
 The
 .B UFFDIO_UNREGISTER
 operation is supported.
+.TP
+.B 1 << _UFFDIO_WRITEPROTECT
+The
+.B UFFDIO_WRITEPROTECT
+operation is supported.
 .PP
 This
 .BR ioctl (2)
@@ -322,9 +327,6 @@ Track page faults on missing pages.
 .B UFFDIO_REGISTER_MODE_WP
 Track page faults on write-protected pages.
 .PP
-Currently, the only supported mode is
-.BR UFFDIO_REGISTER_MODE_MISSING .
-.PP
 If the operation is successful, the kernel modifies the
 .I ioctls
 bit-mask field to indicate which
@@ -443,6 +445,16 @@ operation:
 .TP
 .B UFFDIO_COPY_MODE_DONTWAKE
 Do not wake up the thread that waits for page-fault resolution
+.TP
+.B UFFDIO_COPY_MODE_WP
+Copy the page with read-only permission.
+This allows the user to trap the next write to the page,
+which will block and generate another write-protect userfault message.

s/write-protect/write-protected/ ?

+This is only used when both
+.B UFFDIO_REGISTER_MODE_MISSING
+and
+.B UFFDIO_REGISTER_MODE_WP
+modes are enabled for the registered range.
 .PP
 The
 .I copy
@@ -654,6 +666,72 @@ field of the
 structure was not a multiple of the system page size; or
 .I len
 was zero; or the specified range was otherwise invalid.
+.SS UFFDIO_WRITEPROTECT (Since Linux 5.7)
+Write-protect or write-unprotect an userfaultfd registered memory range
+registered with mode
+.BR UFFDIO_REGISTER_MODE_WP .
+.PP
+The
+.I argp
+argument is a pointer to a
+.I uffdio_range
+structure as shown below:
+.PP
+.in +4n
+.EX
+struct uffdio_writeprotect {
+    struct uffdio_range range; /* Range to change write permission */
+    __u64 mode;                /* Mode to change write permission */
+};
+.EE
+.in
+There're two mode bits that are supported in this structure:
+.TP
+.B UFFDIO_WRITEPROTECT_MODE_WP
+When this mode bit is set, the ioctl will be a write-protect operation upon the
+memory range specified by
+.IR range .
+Otherwise it'll be a write-unprotect operation upon the specified range,
+which can be used to resolve an userfaultfd write-protect page fault.
+.TP
+.B UFFDIO_WRITEPROTECT_MODE_DONTWAKE
+When this mode bit is set,
+do not wake up any thread that waits for page-fault resolution after the operation.
+This could only be specified if
+.B UFFDIO_WRITEPROTECT_MODE_WP
+is not specified.
+.PP
+This
+.BR ioctl (2)
+operation returns 0 on success.
+On error, \-1 is returned and
+.I errno
+is set to indicate the error.
+Possible errors include:
+.TP
+.B EINVAL
+The
+.I start
+or the
+.I len
+field of the
+.I ufdio_range
+structure was not a multiple of the system page size; or
+.I len
+was zero; or the specified range was otherwise invalid.
+.TP
+.B EAGAIN
+The process was interrupted and need to retry.

Maybe: "The process was interrupted; retry this call."? I don't know what other pages say about this kind of error.

+.TP
+.B ENOENT
+The range specified in
+.I range
+is not valid.

I'm not sure how this is different from the wording above in EINVAL. An "otherwise invalid range" was already giving EINVAL?

+For example, the virtual address does not exist,
+or not registered with userfaultfd write-protect mode.
+.TP
+.B EFAULT
+Encountered a generic fault during processing.

What is a "generic fault"?

 .SH RETURN VALUE
 See descriptions of the individual operations, above.
 .SH ERRORS

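For reference while reading the new UFFDIO_WRITEPROTECT text, here is a minimal sketch of how a caller might use the operation as the patch describes it. The helper names make_wp_request() and uffd_writeprotect(), and the retry bound UFFD_WP_MAX_TRIES, are hypothetical and only for illustration; the structure and the mode bits are the ones from <linux/userfaultfd.h> on a system with write-protect support. The sketch assumes that simply reissuing the call is an acceptable reaction to EAGAIN, which is how I read the proposed wording.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

/* Hypothetical bound on EAGAIN retries; not defined by the kernel. */
#define UFFD_WP_MAX_TRIES 8

/*
 * Build a request for the range [start, start + len).
 * protect != 0: write-protect the range (MODE_WP).
 * protect == 0: write-unprotect it, optionally without waking waiters.
 * DONTWAKE is dropped when protecting, since the patch text says it
 * may only be specified when MODE_WP is not.
 */
static struct uffdio_writeprotect
make_wp_request(uint64_t start, uint64_t len, int protect, int dontwake)
{
    struct uffdio_writeprotect wp;

    memset(&wp, 0, sizeof(wp));
    wp.range.start = start;
    wp.range.len = len;
    if (protect)
        wp.mode = UFFDIO_WRITEPROTECT_MODE_WP;
    else if (dontwake)
        wp.mode = UFFDIO_WRITEPROTECT_MODE_DONTWAKE;

    /* The two mode bits are never combined by this helper. */
    assert(!((wp.mode & UFFDIO_WRITEPROTECT_MODE_WP) &&
             (wp.mode & UFFDIO_WRITEPROTECT_MODE_DONTWAKE)));
    return wp;
}

/*
 * Issue UFFDIO_WRITEPROTECT on an userfaultfd descriptor, reissuing the
 * call a bounded number of times on EAGAIN.
 * Returns 0 on success, or -1 with errno set by ioctl(2).
 */
static int
uffd_writeprotect(int uffd, uint64_t start, uint64_t len,
                  int protect, int dontwake)
{
    struct uffdio_writeprotect wp =
        make_wp_request(start, len, protect, dontwake);
    int ret = -1;

    for (int i = 0; i < UFFD_WP_MAX_TRIES; i++) {
        ret = ioctl(uffd, UFFDIO_WRITEPROTECT, &wp);
        if (ret == 0 || errno != EAGAIN)
            break;
    }
    return ret;
}
```

With these helpers, resolving a write-protect fault on one page would be uffd_writeprotect(uffd, page_addr, page_size, 0, 0), where page_addr is the page-aligned fault address and the range was registered with UFFDIO_REGISTER_MODE_WP.
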
--
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/