On Mon, Feb 03, 2025 at 01:40:39PM -0800, Joanne Koong wrote: > On Mon, Feb 3, 2025 at 12:01 PM Darrick J. Wong <djwong@xxxxxxxxxx> wrote: > > > > On Mon, Feb 03, 2025 at 11:57:23AM -0800, Joanne Koong wrote: > > > On Mon, Feb 3, 2025 at 11:41 AM Darrick J. Wong <djwong@xxxxxxxxxx> wrote: > > > > > > > > On Mon, Feb 03, 2025 at 11:23:20AM -0800, Joanne Koong wrote: > > > > > On Mon, Feb 3, 2025 at 10:59 AM Darrick J. Wong <djwong@xxxxxxxxxx> wrote: > > > > > > > > > > > > On Mon, Feb 03, 2025 at 10:04:04AM -0800, Joanne Koong wrote: > > > > > > > On Sun, Feb 2, 2025 at 6:25 AM Zorro Lang <zlang@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > On Tue, Jan 21, 2025 at 01:56:40PM -0800, Joanne Koong wrote: > > > > > > > > > Add support for reads/writes from buffers backed by hugepages. > > > > > > > > > This can be enabled through the '-h' flag. This flag should only be used > > > > > > > > > on systems where THP capabilities are enabled. > > > > > > > > > > > > > > > > > > This is motivated by a recent bug that was due to faulty handling of > > > > > > > > > userspace buffers backed by hugepages. This patch is a mitigation > > > > > > > > > against problems like this in the future. > > > > > > > > > > > > > > > > > > Signed-off-by: Joanne Koong <joannelkoong@xxxxxxxxx> > > > > > > > > > Reviewed-by: Brian Foster <bfoster@xxxxxxxxxx> > > > > > > > > > Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx> > > > > > > > > > --- > > > > > > > > > > > > > > > > Those two test cases fail on old system which doesn't support > > > > > > > > MADV_COLLAPSE: > > > > > > > > > > > > > > > > fsx -N 10000 -l 500000 -h > > > > > > > > -fsx -N 10000 -o 8192 -l 500000 -h > > > > > > > > -fsx -N 10000 -o 128000 -l 500000 -h > > > > > > > > +MADV_COLLAPSE not supported. Can't support -h > > > > > > > > > > > > > > > > and > > > > > > > > > > > > > > > > fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h > > > > > > > > -fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h > > > > > > > > -fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h > > > > > > > > +mapped writes DISABLED > > > > > > > > +MADV_COLLAPSE not supported. Can't support -h > > > > > > > > > > > > > > > > I'm wondering ... > > > > > > > > > > > > > > > > > ltp/fsx.c | 166 +++++++++++++++++++++++++++++++++++++++++++++++++----- > > > > > > > > > 1 file changed, 153 insertions(+), 13 deletions(-) > > > > > > > > > > > > > > > > > > diff --git a/ltp/fsx.c b/ltp/fsx.c > > > > > > > > > index 41933354..3be383c6 100644 > > > > > > > > > --- a/ltp/fsx.c > > > > > > > > > +++ b/ltp/fsx.c > > > > > > > > > static struct option longopts[] = { > > > > > > > > > {"replay-ops", required_argument, 0, 256}, > > > > > > > > > {"record-ops", optional_argument, 0, 255}, > > > > > > > > > @@ -2883,7 +3023,7 @@ main(int argc, char **argv) > > > > > > > > > setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */ > > > > > > > > > > > > > > > > > > while ((ch = getopt_long(argc, argv, > > > > > > > > > - "0b:c:de:fg:i:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ", > > > > > > > > > + "0b:c:de:fg:hi:j:kl:m:no:p:qr:s:t:uw:xyABD:EFJKHzCILN:OP:RS:UWXZ", > > > > > > > > > longopts, NULL)) != EOF) > > > > > > > > > switch (ch) { > > > > > > > > > case 'b': > > > > > > > > > @@ -2916,6 +3056,14 @@ main(int argc, char **argv) > > > > > > > > > case 'g': > > > > > > > > > filldata = *optarg; > > > > > > > > > break; > > > > > > > > > + case 'h': > > > > > > > > > +#ifndef MADV_COLLAPSE > > > > > > > > > + fprintf(stderr, "MADV_COLLAPSE not supported. " > > > > > > > > > + "Can't support -h\n"); > > > > > > > > > + exit(86); > > > > > > > > > +#endif > > > > > > > > > + hugepages = 1; > > > > > > > > > + break; > > > > > > > > > > > > > > > > ... > > > > > > > > if we could change this part to: > > > > > > > > > > > > > > > > #ifdef MADV_COLLAPSE > > > > > > > > hugepages = 1; > > > > > > > > #endif > > > > > > > > break; > > > > > > > > > > > > > > > > to avoid the test failures on old systems. > > > > > > > > > > > > > > > > Or any better ideas from you :) > > > > > > > > > > > > > > Hi Zorro, > > > > > > > > > > > > > > It looks like MADV_COLLAPSE was introduced in kernel version 6.1. What > > > > > > > do you think about skipping generic/758 and generic/759 if the kernel > > > > > > > version is older than 6.1? That to me seems more preferable than the > > > > > > > paste above, as the paste above would execute the test as if it did > > > > > > > test hugepages when in reality it didn't, which would be misleading. > > > > > > > > > > > > Now that I've gotten to try this out -- > > > > > > > > > > > > There's a couple of things going on here. The first is that generic/759 > > > > > > and 760 need to check if invoking fsx -h causes it to spit out the > > > > > > "MADV_COLLAPSE not supported" error and _notrun the test. > > > > > > > > > > > > The second thing is that userspace programs can ensure the existence of > > > > > > MADV_COLLAPSE in multiple ways. The first way is through sys/mman.h, > > > > > > which requires that the underlying C library headers are new enough to > > > > > > include a definition. glibc 2.37 is new enough, but even things like > > > > > > Debian 12 and RHEL 9 aren't new enough to have that. Other C libraries > > > > > > might not follow glibc's practice of wrapping and/or redefining symbols > > > > > > in a way that you hope is the same as... > > > > > > > > > > > > The second way is through linux/mman.h, which comes from the kernel > > > > > > headers package; and the third way is for the program to define it > > > > > > itself if nobody else does. > > > > > > > > > > > > So I think the easiest way to fix the fsx.c build is to include > > > > > > linux/mman.h in addition to sys/mman.h. Sorry I didn't notice these > > > > > > > > > > Thanks for your input. Do we still need sys/mman.h if linux/mman.h is added? > > > > > > > > Yes, because glibc provides the mmap() function that wraps > > > > syscall(__NR_mmap, ...); > > > > > > > > > For generic/758 and 759, does it suffice to gate this on whether the > > > > > kernel version if 6.1+ and _notrun if not? My understanding is that > > > > > any kernel version 6.1 or newer will have MADV_COLLAPSE in its kernel > > > > > headers package and support the feature. > > > > > > > > No, because some (most?) vendors backport new features into existing > > > > kernels without revving the version number of that kernel. > > > > > > Oh okay, I see. That makes sense, thanks for the explanation. > > > > > > > > > > > Maybe the following fixes things? > > > > > > > > --D > > > > > > > > generic/759,760: fix MADV_COLLAPSE detection and inclusion > > > > > > > > On systems with "old" C libraries such as glibc 2.36 in Debian 12, the > > > > MADV_COLLAPSE flag might not be defined in any of the header files > > > > pulled in by sys/mman.h, which means that the fsx binary might not get > > > > built with any of the MADV_COLLAPSE code. If the kernel supports THP, > > > > the test will fail with: > > > > > > > > > QA output created by 760 > > > > > fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h > > > > > -fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h > > > > > -fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W -h > > > > > +mapped writes DISABLED > > > > > +MADV_COLLAPSE not supported. Can't support -h > > > > > > > > Fix both tests to detect fsx binaries that don't support MADV_COLLAPSE, > > > > then fix fsx.c to include the mman.h from the kernel headers (aka > > > > linux/mman.h) so that we can actually test IOs to and from THPs if the > > > > kernel is newer than the rest of userspace. > > > > > > > > Cc: <fstests@xxxxxxxxxxxxxxx> # v2025.02.02 > > > > Fixes: 627289232371e3 ("generic: add tests for read/writes from hugepages-backed buffers") > > > > Signed-off-by: "Darrick J. Wong" <djwong@xxxxxxxxxx> > > > > --- > > > > ltp/fsx.c | 1 + > > > > tests/generic/759 | 3 +++ > > > > tests/generic/760 | 3 +++ > > > > 3 files changed, 7 insertions(+) > > > > > > > > diff --git a/ltp/fsx.c b/ltp/fsx.c > > > > index 634c496ffe9317..cf9502a74c17a7 100644 > > > > --- a/ltp/fsx.c > > > > +++ b/ltp/fsx.c > > > > @@ -20,6 +20,7 @@ > > > > #include <strings.h> > > > > #include <sys/file.h> > > > > #include <sys/mman.h> > > > > +#include <linux/mman.h> > > > > #include <sys/uio.h> > > > > #include <stdbool.h> > > > > #ifdef HAVE_ERR_H > > > > diff --git a/tests/generic/759 b/tests/generic/759 > > > > index 6c74478aa8a0e0..3549c5ed6a9299 100755 > > > > --- a/tests/generic/759 > > > > +++ b/tests/generic/759 > > > > @@ -14,6 +14,9 @@ _begin_fstest rw auto quick > > > > _require_test > > > > _require_thp > > > > > > > > +$here/ltp/fsx -N 0 -h $TEST_DIR 2>&1 | grep -q 'MADV_COLLAPSE not supported' && \ > > > > + _notrun "fsx binary does not support MADV_COLLAPSE" > > > > + > > > > run_fsx -N 10000 -l 500000 -h > > > > run_fsx -N 10000 -o 8192 -l 500000 -h > > > > run_fsx -N 10000 -o 128000 -l 500000 -h > > > > diff --git a/tests/generic/760 b/tests/generic/760 > > > > index c71a196222ad3b..2fbd28502ae678 100755 > > > > --- a/tests/generic/760 > > > > +++ b/tests/generic/760 > > > > @@ -15,6 +15,9 @@ _require_test > > > > _require_odirect > > > > _require_thp > > > > > > > > +$here/ltp/fsx -N 0 -h $TEST_DIR 2>&1 | grep -q 'MADV_COLLAPSE not supported' && \ > > > > + _notrun "fsx binary does not support MADV_COLLAPSE" > > > > + > > > > > > I tried this out locally and it works for me: > > > > > > generic/759 8s ... [not run] fsx binary does not support MADV_COLLAPSE > > > Ran: generic/759 > > > Not run: generic/759 > > > Passed all 1 tests > > > > > > SECTION -- fuse > > > ========================= > > > Ran: generic/759 > > > Not run: generic/759 > > > Passed all 1 tests > > > > Does the test actually run if you /do/ have kernel/libc headers that > > define MADV_COLLAPSE? And if so, does that count as a Tested-by? > > > > I'm not sure if I fully understand the subtext of your question but > yes, the test runs on my system when MADV_COLLAPSE is defined in the > kernel/libc headers. Yep, you understood me correctly. :) > For sanity checking the inverse case, (eg MADV_COLLAPSE not defined), > I tried this out by modifying the ifdef/ifndef checks in fsx to > MADV_COLLAPSE_ to see if the '$here/ltp/fsx -N 0 -h $TEST_DIR 2>&1 | > grep -q 'MADV_COLLAPSE not supported'' line catches that. <nod> Sounds good then; I'll add this to my test clod and run it overnight. --D > > > --D > > > > > > > > Thanks, > > > Joanne > > > > > > > psize=`$here/src/feature -s` > > > > bsize=`$here/src/min_dio_alignment $TEST_DIR $TEST_DEV` > > > > > > >