Re: [PATCH 1/1] ioctl_epoll.2: New page describing ioctl(2) operations for epoll fds

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Joe,

On Tue, Jun 04, 2024 at 06:17:40PM GMT, Joe Damato wrote:
> Signed-off-by: Joe Damato <jdamato@xxxxxxxxxx>

Thanks for the patch!  Please see a few comments below.

Have a lovely night!
Alex

> ---
>  man/man2/epoll_create.2 |   1 +
>  man/man2/epoll_ctl.2    |   1 +
>  man/man2/ioctl_epoll.2  | 203 ++++++++++++++++++++++++++++++++++++++++
>  man/man7/epoll.7        |   1 +
>  4 files changed, 206 insertions(+)
>  create mode 100644 man/man2/ioctl_epoll.2
> 
> diff --git a/man/man2/epoll_create.2 b/man/man2/epoll_create.2
> index f0327e8ba..2aa1745f5 100644
> --- a/man/man2/epoll_create.2
> +++ b/man/man2/epoll_create.2
> @@ -141,4 +141,5 @@ on overrun.
>  .BR close (2),
>  .BR epoll_ctl (2),
>  .BR epoll_wait (2),
> +.BR ioctl_epoll (2),
>  .BR epoll (7)
> diff --git a/man/man2/epoll_ctl.2 b/man/man2/epoll_ctl.2
> index 6d5bc032e..24bbe7405 100644
> --- a/man/man2/epoll_ctl.2
> +++ b/man/man2/epoll_ctl.2
> @@ -425,5 +425,6 @@ flag.
>  .SH SEE ALSO
>  .BR epoll_create (2),
>  .BR epoll_wait (2),
> +.BR ioctl_epoll (2),
>  .BR poll (2),
>  .BR epoll (7)
> diff --git a/man/man2/ioctl_epoll.2 b/man/man2/ioctl_epoll.2
> new file mode 100644
> index 000000000..1d53f458e
> --- /dev/null
> +++ b/man/man2/ioctl_epoll.2
> @@ -0,0 +1,203 @@
> +.\" Copyright (c) 2024, Joe Damato
> +.\" Written by Joe Damato <jdamato@xxxxxxxxxx>
> +.\"
> +.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> +.\"
> +.\"

Please use only one consecutive .\"

> +.TH ioctl_epoll 2 (date) "Linux man-pages (unreleased)"
> +.SH NAME
> +ioctl_epoll \- ioctl() operations for epoll file descriptors

Please add link pages <man2const/EPIOCSPARAMS.2const> and
<man2const/EPIOCGPARAMS.2const>.

> +.SH LIBRARY
> +Standard C library
> +.RI ( libc ", " \-lc )
> +.SH SYNOPSIS
> +.nf
> +.BR "#include <linux/eventpoll.h>" "  /* Definition of " EPIOC* " constants and struct epoll_params */"
> +.B "#include <sys/ioctl.h>"
> +.P
> +.BI "int ioctl(int " fd ", int " op ", void " *argp ");"

The '*' should be bold; not italics.

> +.fi
> +.SH DESCRIPTION
> +Various
> +.BR ioctl (2)
> +operations can be performed on an epoll file descriptor (created by a call
> +to
> +.BR epoll_create (2))
> +(since Linux 6.9 and glibc 2.40) using calls of the form:
> +.\" commit 18e2bf0edf4dd88d9656ec92395aa47392e85b61
> +.\" glibc commit 92c270d32caf3f8d5a02b8e46c7ec5d9d0315158
> +.P
> +.in +4n
> +.EX
> +ioctl(fd, op, argp);
> +.EE
> +.in
> +.P
> +In the above,
> +.I fd
> +is a file descriptor referring to an epoll file descriptor obtained with a
> +call to
> +.BR epoll_create (2).
> +.I op
> +is one of the operations listed below, and
> +.I argp
> +is a pointer to the data structure described below.

If argp is a pointer to a structure, then the prototype should document
that:

.BI "int ioctl(int " fd ", int " op ", struct epoll_params *" argp );

Also, I would document the two operations separately:

.BI "int ioctl(int " fd ", EPIOCSPARAMS, const struct epoll_params *" argp );
.BI "int ioctl(int " fd ", EPIOCGPARAMS, struct epoll_params *" argp );

This allows documenting that the 'S' one doesn't modify the argp
(AFAICS).

> +.\"
> +.P
> +All supported
> +.I op
> +values (described below) use an
> +.I argp
> +argument which is a pointer to a
> +.I epoll_params
> +structure, defined as:
> +.P
> +.in +4n
> +.EX
> +struct epoll_params {
> +    uint32_t busy_poll_usecs;   /* Number of usecs to busy poll */
> +    uint16_t busy_poll_budget;  /* Maximum number of packets to retrieve per poll */
> +    uint8_t prefer_busy_poll;   /* Boolean to enable or disable prefer busy poll  */
> +
> +    /* pad the struct to a multiple of 64bits */
> +    uint8_t __pad;              /* Must be zero */
> +};

You could add this type definition to the SYNOPSIS, below the function
prototypes.

> +.EE
> +.in
> +.P
> +The
> +.I busy_poll_usecs
> +field denotes the number of microseconds that the network stack will busy
> +poll. During this time period, the network device will be polled
> +repeatedly. This value cannot exceed
> +.B INT_MAX .
> +.in
> +.P
> +The
> +.I busy_poll_budget
> +field denotes the maximum number of packets that the network stack will
> +be retrieved on each poll attempt. This value cannot exceed
> +.B NAPI_POLL_WEIGHT
> +which, as of Linux 6.9, is 64, unless the process is run with
> +.B CAP_NET_ADMIN .
> +.P
> +The
> +.I prefer_busy_poll
> +field is a boolean field and must be either 0 (disabled) or 1 (enabled). If
> +enabled, this indicates to the network stack that busy poll is the
> +preferred method of processing network data and the network stack should
> +give the application the opportunity to busy poll. Without this option,
> +very busy systems may continue to do network processing via the normal
> +method of IRQs triggering softIRQ and NAPI.
> +.P
> +The supported
> +.I op
> +values:
> +.TP
> +.B EPIOCSPARAMS
> +This operation allows the caller to specify an
> +.I epoll_params
> +structure to configure the operation of epoll. Refer to the structure
> +description of the structure above to learn what configuration is
> +supported.
> +.TP
> +.B EPIOCGPARAMS
> +This operation allows the caller to retrieve the current
> +.I epoll_params
> +structure. This can be used to determine what the current settings are.
> +.SH RETURN VALUE
> +On success, 0 is returned.
> +On failure, \-1 is returned, and
> +.I errno
> +is set to indicate the error.
> +.SH ERRORS
> +.TP
> +.B EOPNOTSUPP
> +The kernel was not compiled with busy poll support.
> +.TP
> +.B ENOIOCTLCMD

Is this a thing?

	$ grep -rn ENOIOCTLCMD /usr/include/
	$ 

I suspect this is EINVAL in user space?  (Actually, you list the same
exact error condition for EINVAL below.)

> +The specified
> +.I op
> +is invalid.
> +.TP
> +.B EINVAL
> +The
> +.I fd
> +specified is not an epoll file descriptor, or the
> +.I op
> +specified is invalid, or the
> +.I __pad
> +was not initialized to zero, or
> +.I busy_poll_usecs
> +exceeds
> +.B INT_MAX ,
> +or
> +.I prefer_busy_poll
> +is not 0 or 1.

Please separate the different conditions for EINVAL into separate
entries.  The ones that are related can go in the same one, but the
unrelated ones are better split.

> +.TP
> +.B EPERM
> +The process is being run without
> +.I CAP_NET_ADMIN

This should be bold.

> +and the specified
> +.I busy_poll_budget

This should be

.I argp.busy_poll_budget

> +exceeds
> +.I NAPI_POLL_WEIGHT

This should be bold.

> +(which is 64 as of Linux 6.9).
> +.TP
> +.B EFAULT
> +.I argp
> +does not point to a valid memory address.
> +.SH EXAMPLES
> +.EX

Could you write an entire program, with a main()?

If not, it's fine; this is better than nothing.  But we prefer having
entire programs that users can paste and try.

> +/* Code to set the epoll params to enable busy polling */
> +\&
> +int epollfd = epoll_create1(0);
> +struct epoll_params params;
> +\&
> +if (epollfd == \-1) {
> +    perror("epoll_create1");
> +    exit(EXIT_FAILURE);
> +}
> +\&
> +memset(&params, 0, sizeof(struct epoll_params));
> +\&
> +params.busy_poll_usecs = 25;
> +params.busy_poll_budget = 8;
> +params.prefer_busy_poll = 1;
> +\&
> +if (ioctl(epollfd, EPIOCSPARAMS, &params) == \-1) {
> +    perror("ioctl");
> +    exit(EXIT_FAILURE);
> +}
> +\&
> +/* Code to show how to retrieve the current settings */
> +\&
> +memset(&params, 0, sizeof(struct epoll_params));
> +\&
> +if (ioctl(epollfd, EPIOCGPARAMS, &params) == \-1) {
> +    perror("ioctl");
> +    exit(EXIT_FAILURE);
> +}
> +\&
> +/* params struct now contains the current parameters */
> +\&
> +fprintf(stderr, "epoll usecs: %lu\\n", params.busy_poll_usecs);
> +fprintf(stderr, "epoll packet budget: %u\\n", params.busy_poll_budget);
> +fprintf(stderr, "epoll prefer busy poll: %u\\n", params.prefer_busy_poll);
> +\&
> +.SH History
> +Linux 6.9.
> +glibc 2.40.
> +.SH SEE ALSO
> +.BR ioctl (2),
> +.BR epoll_create (2),
> +.BR epoll_create1 (2),
> +.BR epoll (7)
> +.P
> +.I Documentation/networking/napi.rst
> +.P
> +and
> +.P
> +.I Documentation/admin-guide/sysctl/net.rst
> +.P
> +in the Linux kernel source tree

We could document that as

.I linux.git/Documentation/...

to not have to say "in the Linux kernel source tree".

> diff --git a/man/man7/epoll.7 b/man/man7/epoll.7
> index e7892922e..4ad032bdd 100644
> --- a/man/man7/epoll.7
> +++ b/man/man7/epoll.7
> @@ -606,5 +606,6 @@ is present in an epoll instance.
>  .BR epoll_create1 (2),
>  .BR epoll_ctl (2),
>  .BR epoll_wait (2),
> +.BR ioctl_epoll (2),
>  .BR poll (2),
>  .BR select (2)
> -- 
> 2.34.1
> 

-- 
<https://www.alejandro-colomar.es/>

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Kernel Documentation]     [Netdev]     [Linux Ethernet Bridging]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]

  Powered by Linux