On 07/05/2017 10:53 AM, Mauricio Faria de Oliveira wrote:
This patch uses the original value of nr_events (from userspace) to
increment aio-nr and count against aio-max-nr, which resolves those.
This has been tested with v4.12+ (commit 650fc870a2ef on Linus tree).
The test-case and test-suite validation steps are included later in
this message.
Example on a system with 64 CPUs:
# cat /sys/devices/system/cpu/possible
0-63
# grep . /proc/sys/fs/aio-*
/proc/sys/fs/aio-max-nr:65536
/proc/sys/fs/aio-nr:0
test 1) number of aio contexts available with nr_events == 1
-------------------------------------------------------------
This test calls io_setup(1, ..) up to 65536 times, exiting on error.
- original kernel:
Only 256 aio contexts could be created successfully,
quickly falling into the aio-max-nr exceeded error path (-EAGAIN).
# ./io_setup 1 65536 | grep -m1 . - /proc/sys/fs/aio-nr
(standard input):io_setup(1, ): 256 calls with rc 0, last call with rc -11.
/proc/sys/fs/aio-nr:131072
Notice the aio-nr value is twice the aio-max-nr limit,
an effect of how the current code handles the 'nr_events *= 2' doubling.
- patched kernel:
Almost the full limit of aio contexts could be allocated,
eventually falling into the insufficient resources error path
(-ENOMEM):
# ./io_setup 1 65536 | grep -m1 . - /proc/sys/fs/aio-nr
(standard input):io_setup(1, ): 65516 calls with rc 0, last call with rc -12.
/proc/sys/fs/aio-nr:65516
Notice the aio-nr value is now _under_ the aio-max-nr limit.
test 2) increment value for nr_events == 1
-------------------------------------------
This test calls io_setup(1, ..) only 1 time, to show the increment:
- original kernel:
# ./io_setup 1 1 | grep -m1 . - /proc/sys/fs/aio-nr
(standard input):io_setup(1, ): 1 calls with rc 0, last call with rc 0.
/proc/sys/fs/aio-nr:512
Notice the increment is 'num_possible_cpus() * 8' (64 * 8 = 512).
- patched kernel:
# ./io_setup 1 1 | grep -m1 . - /proc/sys/fs/aio-nr
(standard input):io_setup(1, ): 1 calls with rc 0, last call with rc 0.
/proc/sys/fs/aio-nr:1
Notice the increment is exactly 1 (matches nr_events from userspace).
test 3) more aio contexts available with large-enough nr_events
----------------------------------------------------------------
With a larger nr_events, the full aio-max-nr limit (65536) is available.
This test calls io_setup(1024, ..) exactly 64 times, without error.
- original kernel:
# ./io_setup 1024 64 | grep -m1 . - /proc/sys/fs/aio-nr
(standard input):io_setup(1024, ): 64 calls with rc 0, last call with rc 0.
/proc/sys/fs/aio-nr:131072
Notice the aio-nr value is twice the aio-max-nr limit.
- patched kernel:
# ./io_setup 1024 64 | grep -m1 . - /proc/sys/fs/aio-nr
(standard input):io_setup(1024, ): 64 calls with rc 0, last call with rc 0.
/proc/sys/fs/aio-nr:65536
Notice the aio-nr value is now _exactly_ the aio-max-nr limit.
Test-case: io_setup.c # gcc -o io_setup io_setup.c -laio
---------
"""
#include <libaio.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char *argv[])
{
	int nr_events, nr_calls, rc = 0, i;
	io_context_t *ioctx;

	/* usage: io_setup <nr_events for io_setup()> <max calls to io_setup()> */
	if (argc != 3)
		return -1;

	nr_events = atoi(argv[1]);
	nr_calls = atoi(argv[2]);

	/* io_setup() requires each context to be zero-initialized */
	ioctx = calloc(nr_calls, sizeof(*ioctx));
	if (!ioctx)
		return -2;

	/* create contexts until the first error (rc is a negated errno) */
	for (i = 0; i < nr_calls; i++) {
		rc = io_setup(nr_events, &ioctx[i]);
		if (rc)
			break;
	}

	printf("io_setup(%d, ): %d calls with rc 0, last call with rc %d.\n",
	       nr_events, i, rc);
	fflush(stdout);

	/* keep the contexts alive so the caller can read aio-nr */
	sleep(1);
	return 0;
}
"""
Test-suite: libaio
----------
# curl https://kojipkgs.fedoraproject.org/packages/libaio/0.3.110/7.fc26/src/libaio-0.3.110-7.fc26.src.rpm \
    | rpm2cpio | cpio -mid
# tar xf libaio-0.3.110.tar.gz
# cd libaio-0.3.110
# make
# make check 2>&1 | grep '^test cases'
test cases/2.t completed PASSED.
test cases/3.t completed PASSED.
test cases/4.t completed PASSED.
test cases/5.t completed PASSED.
test cases/6.t completed PASSED.
test cases/7.t completed PASSED.
test cases/11.t completed PASSED.
test cases/12.t completed PASSED.
test cases/13.t completed PASSED.
test cases/14.t completed PASSED.
test cases/15.t completed PASSED.
test cases/16.t completed PASSED.
test cases/10.t completed PASSED.
test cases/8.t completed PASSED.
--
Mauricio Faria de Oliveira
IBM Linux Technology Center