Re: [PATCH 7/7] tools: add skeleton code for userland testing of VMA logic

Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> · Thu, 4 Jul 2024 11:51:36 +0100

On Wed, Jul 03, 2024 at 10:59:56PM GMT, SeongJae Park wrote:
> Hi Lorenzo,
>
> On Wed,  3 Jul 2024 12:57:38 +0100 Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> wrote:
>
> > Establish a new userland VMA unit testing implementation under
> > tools/testing which utilises existing logic providing maple tree support in
> > userland utilising the now-shared code previously exclusive to radix tree
> > testing.
> >
> > This provides fundamental VMA operations whose API is defined in mm/vma.h,
> > while stubbing out superfluous functionality.
> >
> > This exists as a proof-of-concept, with the test implementation functional
> > and sufficient to allow userland compilation of vma.c, but containing only
> > cursory tests to demonstrate basic functionality.
>
> Overall, looks good to me.  Appreciate this work.  Nonetheless, I have some
> trivial questions and comments below.

Thanks, appreciate the review!

>
> >
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>
> > ---
> >  MAINTAINERS                            |   1 +
> >  include/linux/atomic.h                 |   2 +-
> >  include/linux/mmzone.h                 |   3 +-
>
> I doubt if changes to above two files are intentional.  Please read below
> comments.
>
> >  tools/testing/vma/.gitignore           |   6 +
> >  tools/testing/vma/Makefile             |  16 +
> >  tools/testing/vma/errors.txt           |   0
> >  tools/testing/vma/generated/autoconf.h |   2 +
>
> I'm also unsure if above two files are intentionally added.  Please read below
> comments.
>
> >  tools/testing/vma/linux/atomic.h       |  12 +
> >  tools/testing/vma/linux/mmzone.h       |  38 ++
> >  tools/testing/vma/vma.c                | 207 ++++++
> >  tools/testing/vma/vma_internal.h       | 882 +++++++++++++++++++++++++
> >  11 files changed, 1167 insertions(+), 2 deletions(-)
> >  create mode 100644 tools/testing/vma/.gitignore
> >  create mode 100644 tools/testing/vma/Makefile
> >  create mode 100644 tools/testing/vma/errors.txt
> >  create mode 100644 tools/testing/vma/generated/autoconf.h
> >  create mode 100644 tools/testing/vma/linux/atomic.h
> >  create mode 100644 tools/testing/vma/linux/mmzone.h
> >  create mode 100644 tools/testing/vma/vma.c
> >  create mode 100644 tools/testing/vma/vma_internal.h
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index ff3e113ed081..c21099d0a123 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -23983,6 +23983,7 @@ T:	git git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> >  F:	mm/vma.c
> >  F:	mm/vma.h
> >  F:	mm/vma_internal.h
> > +F:	tools/testing/vma/
>
> Thank you for addressing my comment on the previous version :)
>
> Btw, what do you think about moving the previous MAINTAINERS touching patch to
> the end of this patch series and making this change together at once?

Yeah I was thinking about separating that out actually, not hugely critical I
don't think, but if I end up respinning I can do that.

>
> >
> >  VMALLOC
> >  M:	Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > diff --git a/include/linux/atomic.h b/include/linux/atomic.h
> > index 8dd57c3a99e9..badfba2fd10f 100644
> > --- a/include/linux/atomic.h
> > +++ b/include/linux/atomic.h
> > @@ -81,4 +81,4 @@
> >  #include <linux/atomic/atomic-long.h>
> >  #include <linux/atomic/atomic-instrumented.h>
> >
> > -#endif /* _LINUX_ATOMIC_H */
> > +#endif	/* _LINUX_ATOMIC_H */
>
> Maybe unintended change?

Ugh, sorry my bad. Again, I don't think this is so big as to need a respin
in itself, but if larger stuff comes up I will fix if you don't think this
is too big a deal?

>
> > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > index 41458892bc8a..30a22e57fa50 100644
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -1,4 +1,5 @@
> > -/* SPDX-License-Identifier: GPL-2.0 */
> > +/* SPDX-License-Identifier: GPL-2.0-or-later */
> > +
> >  #ifndef _LINUX_MMZONE_H
> >  #define _LINUX_MMZONE_H
> >
>
> To my understanding, the test adds tools/testing/vma/linux/mmzone.h and uses it
> instead of this file.  If I'm not missing something here, above license change
> may not really needed?
>
> > diff --git a/tools/testing/vma/.gitignore b/tools/testing/vma/.gitignore
> > new file mode 100644
> > index 000000000000..d915f7d7fb1a
> > --- /dev/null
> > +++ b/tools/testing/vma/.gitignore
> > @@ -0,0 +1,6 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +generated/bit-length.h
> > +generated/map-shift.h
>
> I guess we should also have 'generated/autoconf.h' here?  Please read below
> comment for the file, too.
>
> > +idr.c
> > +radix-tree.c
> > +vma
> > diff --git a/tools/testing/vma/Makefile b/tools/testing/vma/Makefile
> > new file mode 100644
> > index 000000000000..70e728f2eee3
> > --- /dev/null
> > +++ b/tools/testing/vma/Makefile
> > @@ -0,0 +1,16 @@
> > +# SPDX-License-Identifier: GPL-2.0-or-later
> > +
> > +.PHONY: default
> > +
> > +default: vma
> > +
> > +include ../shared/shared.mk
> > +
> > +OFILES = $(SHARED_OFILES) vma.o maple-shim.o
> > +TARGETS = vma
> > +
> > +vma:	$(OFILES) vma_internal.h ../../../mm/vma.c ../../../mm/vma.h
> > +	$(CC) $(CFLAGS) -o $@ $(OFILES) $(LDLIBS)
> > +
> > +clean:
> > +	$(RM) $(TARGETS) *.o radix-tree.c idr.c generated/map-shift.h generated/bit-length.h
>
> If my assumption about generated/autoconf.h file is not wrong, I think we
> should also remove the file here, too.  'git' wouldn't care, but I think
> removing generated/ directory with files under it would be clearer for
> working space management.
>
> > diff --git a/tools/testing/vma/errors.txt b/tools/testing/vma/errors.txt
> > new file mode 100644
> > index 000000000000..e69de29bb2d1
>
> I'm not seeing who is really using this empty file.  Is this file intentionally
> added?

Ughhhh no, this was a pure accident! I guess we can ask Andrew to drop this
part of the patch if no further respin is needed? May do a fix-patch actually.

Obviously will remove on next respin otherwise.

Thanks for that, great spot!

>
> > diff --git a/tools/testing/vma/generated/autoconf.h b/tools/testing/vma/generated/autoconf.h
> > new file mode 100644
> > index 000000000000..92dc474c349b
> > --- /dev/null
> > +++ b/tools/testing/vma/generated/autoconf.h
> > @@ -0,0 +1,2 @@
> > +#include "bit-length.h"
> > +#define CONFIG_XARRAY_MULTI 1
>
> Seems this file is automatically generated by ../shared/shared.mk.  If I'm not
> wrong, I think removing this and adding changes I suggested to .gitignore and
> Makefile would be needed?

Can do the same with this :) good spot.

>
> Since share.mk just copies the file while setting -I flag so that
> tools/testing/vma/vma.c can include files from share/ directory, maybe another
> option is simply including the file from the share/ directory without copying
> it here.
>
> Also, the previous patch (tools: separate out shared radix-tree components)
> that adds this file at tools/testing/shared/ would need to add SPDX License
> identifier?

This file already existed in the radix tree code, I just moved it.

>
> > diff --git a/tools/testing/vma/linux/atomic.h b/tools/testing/vma/linux/atomic.h
> > new file mode 100644
> > index 000000000000..e01f66f98982
> > --- /dev/null
> > +++ b/tools/testing/vma/linux/atomic.h
> > @@ -0,0 +1,12 @@
> > +/* SPDX-License-Identifier: GPL-2.0-or-later */
> > +
> > +#ifndef _LINUX_ATOMIC_H
> > +#define _LINUX_ATOMIC_H
> > +
> > +#define atomic_t int32_t
> > +#define atomic_inc(x) uatomic_inc(x)
> > +#define atomic_read(x) uatomic_read(x)
> > +#define atomic_set(x, y) do {} while (0)
> > +#define U8_MAX UCHAR_MAX
> > +
> > +#endif	/* _LINUX_ATOMIC_H */
> > diff --git a/tools/testing/vma/linux/mmzone.h b/tools/testing/vma/linux/mmzone.h
> > new file mode 100644
> > index 000000000000..e6a96c686610
> > --- /dev/null
> > +++ b/tools/testing/vma/linux/mmzone.h
> > @@ -0,0 +1,38 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
>
> I'm not very familiar with the license stuffs, but based on the changes to
> other files including that to include/linux/mmazone.h above, I was thinking
> this file would also need to update the license to GP-2.0-or-later.  Should
> this be updated so?

This was copied from tools/testing/memblock/linux/mmzone.h directly
as-is. I didn't think it was worth reworking memblock testing to share this
(again, this is meant to be a skeleton rather than a complete rework of how
testing is done :) but we needed the header.

Whenever you bounce code around there's always a risk of somebody noticing
something previously broken which would not really make sense for you to
address as part of your change, I think this is one of those circumstances.

If considered critical for licensing of course I can change, but that does
make me wonder whether that'd be better as a whole-repo change for all such
instances?

>
> [...]
> > diff --git a/tools/testing/vma/vma.c b/tools/testing/vma/vma.c
> > new file mode 100644
> > index 000000000000..1f32bc4d60c2
> > --- /dev/null
> > +++ b/tools/testing/vma/vma.c
> > @@ -0,0 +1,207 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +
> > +#include <stdbool.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +
> > +#include "maple-shared.h"
> > +#include "vma_internal.h"
> > +
> > +/*
> > + * Directly import the VMA implementation here. Our vma_internal.h wrapper
> > + * provides userland-equivalent functionality for everything vma.c uses.
> > + */
> > +#include "../../../mm/vma.c"
> > +
> > +const struct vm_operations_struct vma_dummy_vm_ops;
> > +
> > +#define ASSERT_TRUE(_expr)						\
> > +	do {								\
> > +		if (!(_expr)) {						\
> > +			fprintf(stderr,					\
> > +				"Assert FAILED at %s:%d:%s(): %s is FALSE.\n", \
> > +				__FILE__, __LINE__, __FUNCTION__, #_expr); \
> > +			return false;					\
> > +		}							\
> > +	} while (0)
> > +#define ASSERT_FALSE(_expr) ASSERT_TRUE(!(_expr))
> > +#define ASSERT_EQ(_val1, _val2) ASSERT_TRUE((_val1) == (_val2))
> > +#define ASSERT_NE(_val1, _val2) ASSERT_TRUE((_val1) != (_val2))
> > +
> > +static struct vm_area_struct *alloc_vma(struct mm_struct *mm,
> > +					unsigned long start,
> > +					unsigned long end,
> > +					pgoff_t pgoff,
> > +					vm_flags_t flags)
> > +{
> > +	struct vm_area_struct *ret = vm_area_alloc(mm);
> > +
> > +	if (ret == NULL)
> > +		return NULL;
> > +
> > +	ret->vm_start = start;
> > +	ret->vm_end = end;
> > +	ret->vm_pgoff = pgoff;
> > +	ret->__vm_flags = flags;
> > +
> > +	return ret;
> > +}
> > +
> > +static bool test_simple_merge(void)
> > +{
> > +	struct vm_area_struct *vma;
> > +	unsigned long flags = VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE;
> > +	struct mm_struct mm = {};
> > +	struct vm_area_struct *vma_left = alloc_vma(&mm, 0, 0x1000, 0, flags);
> > +	struct vm_area_struct *vma_middle = alloc_vma(&mm, 0x1000, 0x2000, 1, flags);
> > +	struct vm_area_struct *vma_right = alloc_vma(&mm, 0x2000, 0x3000, 2, flags);
> > +	VMA_ITERATOR(vmi, &mm, 0x1000);
> > +
> > +	ASSERT_FALSE(vma_link(&mm, vma_left));
> > +	ASSERT_FALSE(vma_link(&mm, vma_middle));
> > +	ASSERT_FALSE(vma_link(&mm, vma_right));
>
> So, vma_link() returns the error if failed, or zero, and therefore above
> assertions check if the function calls success as expected?  It maybe too
> straighforward to people who familiar with the code, but I think adding some
> comment explaining the intent of the test would be nice for new comers.
>
> IMHO, 'ASSERT_EQ(vma_link(...), 0)' may be easier to read.

Yeah I did weigh this up, but the standard kernel pattern for this would be:

if (vma_link(...)) {
	/* ... error handing ... */
}

So to me this is consistent. I do take your point though, it's debatable,
but I think it's ok as-is unless you feel strongly about it?

>
> Also, in case of assertion failures, the assertion prints the error and return
> false, to indicate the failure of the test, right?  Then, would the memory
> allocated before, e.g., that for vma_{left,middle,right} above be leaked?  I
> know this is just a test program in the user-space, but...  If this is
> intentional, I think clarifying it somewhere would be nice.

Unwinding memory would make this code really horrible to implement, I don't
think it's a big deal to leak userland memory in failed tests (the point of
which is to, of course, to not encounter thousands and thousands of assert
fails :).

I'm not sure it's really important to point this out too, it's obvious, and
it's distracting to do so. And again, it's really just a wrapper
implementation. As discussed elsewhere moving forward it'd make sense to
implement some 'userland kunit' style shared libraries that take care of
all of this for us.

>
> > +
> > +	vma = vma_merge_new_vma(&vmi, vma_left, vma_middle, 0x1000,
> > +				0x2000, 1);
> > +	ASSERT_NE(vma, NULL);
> > +
> > +	ASSERT_EQ(vma->vm_start, 0);
> > +	ASSERT_EQ(vma->vm_end, 0x3000);
> > +	ASSERT_EQ(vma->vm_pgoff, 0);
> > +	ASSERT_EQ(vma->vm_flags, flags);
> > +
> > +	vm_area_free(vma);
> > +	mtree_destroy(&mm.mm_mt);
> > +
> > +	return true;
> > +}
> > +
> > +static bool test_simple_modify(void)
> > +{
> > +	struct vm_area_struct *vma;
> > +	unsigned long flags = VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE;
> > +	struct mm_struct mm = {};
> > +	struct vm_area_struct *init_vma = alloc_vma(&mm, 0, 0x3000, 0, flags);
> > +	VMA_ITERATOR(vmi, &mm, 0x1000);
> > +
> > +	ASSERT_FALSE(vma_link(&mm, init_vma));
> > +
> > +	/*
> > +	 * The flags will not be changed, the vma_modify_flags() function
> > +	 * performs the merge/split only.
> > +	 */
> > +	vma = vma_modify_flags(&vmi, init_vma, init_vma,
> > +			       0x1000, 0x2000, VM_READ | VM_MAYREAD);
> > +	ASSERT_NE(vma, NULL);
> > +	/* We modify the provided VMA, and on split allocate new VMAs. */
> > +	ASSERT_EQ(vma, init_vma);
> > +
> > +	ASSERT_EQ(vma->vm_start, 0x1000);
> > +	ASSERT_EQ(vma->vm_end, 0x2000);
> > +	ASSERT_EQ(vma->vm_pgoff, 1);
> > +
> > +	/*
> > +	 * Now walk through the three split VMAs and make sure they are as
> > +	 * expected.
> > +	 */
>
> I like these kind comments :)

Thanks :) I try to maintain a nice balance between not adding _too many_
explanatory comments but not having globs of code that are hard to follow
without giving an idea what's going on.

>
> > +
> > +	vma_iter_set(&vmi, 0);
> > +	vma = vma_iter_load(&vmi);
> > +
> > +	ASSERT_EQ(vma->vm_start, 0);
> > +	ASSERT_EQ(vma->vm_end, 0x1000);
> > +	ASSERT_EQ(vma->vm_pgoff, 0);
> > +
> > +	vm_area_free(vma);
> > +	vma_iter_clear(&vmi);
> > +
> > +	vma = vma_next(&vmi);
> > +
> > +	ASSERT_EQ(vma->vm_start, 0x1000);
> > +	ASSERT_EQ(vma->vm_end, 0x2000);
> > +	ASSERT_EQ(vma->vm_pgoff, 1);
> > +
> > +	vm_area_free(vma);
> > +	vma_iter_clear(&vmi);
> > +
> > +	vma = vma_next(&vmi);
> > +
> > +	ASSERT_EQ(vma->vm_start, 0x2000);
> > +	ASSERT_EQ(vma->vm_end, 0x3000);
> > +	ASSERT_EQ(vma->vm_pgoff, 2);
> > +
> > +	vm_area_free(vma);
> > +	mtree_destroy(&mm.mm_mt);
> > +
> > +	return true;
> > +}
> > +
> > +static bool test_simple_expand(void)
> > +{
> > +	unsigned long flags = VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE;
> > +	struct mm_struct mm = {};
> > +	struct vm_area_struct *vma = alloc_vma(&mm, 0, 0x1000, 0, flags);
> > +	VMA_ITERATOR(vmi, &mm, 0);
> > +
> > +	ASSERT_FALSE(vma_link(&mm, vma));
> > +
> > +	ASSERT_FALSE(vma_expand(&vmi, vma, 0, 0x3000, 0, NULL));
> > +
> > +	ASSERT_EQ(vma->vm_start, 0);
> > +	ASSERT_EQ(vma->vm_end, 0x3000);
> > +	ASSERT_EQ(vma->vm_pgoff, 0);
> > +
> > +	vm_area_free(vma);
> > +	mtree_destroy(&mm.mm_mt);
> > +
> > +	return true;
> > +}
> > +
> > +static bool test_simple_shrink(void)
> > +{
> > +	unsigned long flags = VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE;
> > +	struct mm_struct mm = {};
> > +	struct vm_area_struct *vma = alloc_vma(&mm, 0, 0x3000, 0, flags);
> > +	VMA_ITERATOR(vmi, &mm, 0);
> > +
> > +	ASSERT_FALSE(vma_link(&mm, vma));
> > +
> > +	ASSERT_FALSE(vma_shrink(&vmi, vma, 0, 0x1000, 0));
> > +
> > +	ASSERT_EQ(vma->vm_start, 0);
> > +	ASSERT_EQ(vma->vm_end, 0x1000);
> > +	ASSERT_EQ(vma->vm_pgoff, 0);
> > +
> > +	vm_area_free(vma);
> > +	mtree_destroy(&mm.mm_mt);
> > +
> > +	return true;
> > +}
> > +
> > +int main(void)
> > +{
> > +	int num_tests = 0, num_fail = 0;
> > +
> > +	maple_tree_init();
> > +
> > +#define TEST(name)							\
> > +	do {								\
> > +		num_tests++;						\
> > +		if (!test_##name()) {					\
> > +			num_fail++;					\
> > +			fprintf(stderr, "Test " #name " FAILED\n");	\
> > +		}							\
> > +	} while (0)
> > +
> > +	TEST(simple_merge);
> > +	TEST(simple_modify);
> > +	TEST(simple_expand);
> > +	TEST(simple_shrink);
> > +
> > +#undef TEST
> > +
> > +	printf("%d tests run, %d passed, %d failed.\n",
> > +	       num_tests, num_tests - num_fail, num_fail);
> > +
> > +	return EXIT_SUCCESS;
>
> What do you think about making the return value indicates if the overall test
> has pass or failed, for easy integration with other test frameworks or scripts
> in future?

Yeah this is a good idea, will change on next respin.

>
> [...]
>
> I didn't read all of this patch series in detail yet (I'm not sure if I'll have
> time to do that, so please don't wait for me), but looks nice work overall to
> me.  Thank you for your efforts on this.

Thanks!

>
>
> Thanks,
> SJ