On 12/17/20 10:52 AM, Pavel Tatashin wrote:
In gup_test both gup_flags and test_flags use the same flags field.
This is broken, because gup_flags can be passed as raw value (via -F hex),
which can overwrite all the test flags.
Thanks for finding and fixing the "stuck at 0x1" bug!
The test is not quite as broken as you think, though, because you're not
looking at the history and intent of the FOLL_WRITE, and how that is used.
And that is leading you to make wider "fixes" than I'd recommend.
The new -F command line switch is the latest, newest (and therefore
flakiest) addition, and *that* part is worth fixing up, because in
addition to being stuck at 0x1 as you noticed, it also misleads everyone
into thinking that it's intended for general gup flags. It's not.
It really was meant to be a subtest control flag (controlling the behavior
of the dump pages sub-test), as you can see in the commit log and documentation:
For example:
./gup_test -ct -F 1 0 19 0x1000
Meaning:
-c: dump pages sub-test
-t: use THP pages
-F 1: use pin_user_pages() instead of get_user_pages()
0 19 0x1000: dump pages 0, 19, and 4096
Farther, in the actual gup_test.c all the passed gup_flags are erased and
unconditionally replaced with FOLL_WRITE.
This is still desirable; Kirill's original intention of only allowing
FOLL_WRITE or nothing is still what we want. There is no desire to pass
in general gup_flags.
Which means that test_flags are ignored, and code like this always
performs pin dump test:
155 if (gup->flags & GUP_TEST_FLAG_DUMP_PAGES_USE_PIN)
156 nr = pin_user_pages(addr, nr, gup->flags,
157 pages + i, NULL);
158 else
159 nr = get_user_pages(addr, nr, gup->flags,
160 pages + i, NULL);
161 break;
Add a new test_flags field, to allow raw gup_flags to work.
I think .test_control_flags field would be a good name, to make it very
clear that it's not destined for gup_flags. Just .test_flags is not quite
as clear a distinction from .gup_flags, as .test_control_flags is, IMHO.
Add a new subcommand for DUMP_USER_PAGES_TEST to specify that pin test
should be performed.
It's arguably better to just keep using the -c option instead, plus the new
.test_control_flags field. Otherwise you get a multiplication of possibilities
(and therefore, of command line options).
Remove unconditional overwriting of gup_flags via FOLL_WRITE. But,
Nope, let's not do that.
preserve the previous behaviour where FOLL_WRITE was the default flag,
and add a new option "-W" to unset FOLL_WRITE.
Or that either.
With the fix, dump works like this:
root@virtme:/# gup_test -c
---- page #0, starting from user virt addr: 0x7f8acb9e4000
page:00000000d3d2ee27 refcount:2 mapcount:1 mapping:0000000000000000
index:0x0 pfn:0x100bcf
anon flags: 0x300000000080016(referenced|uptodate|lru|swapbacked)
raw: 0300000000080016 ffffd0e204021608 ffffd0e208df2e88 ffff8ea04243ec61
raw: 0000000000000000 0000000000000000 0000000200000000 0000000000000000
page dumped because: gup_test: dump_pages() test
DUMP_USER_PAGES_TEST: done
root@virtme:/# gup_test -c -p
---- page #0, starting from user virt addr: 0x7fd19701b000
page:00000000baed3c7d refcount:1025 mapcount:1 mapping:0000000000000000
index:0x0 pfn:0x108008
anon flags: 0x300000000080014(uptodate|lru|swapbacked)
raw: 0300000000080014 ffffd0e204200188 ffffd0e205e09088 ffff8ea04243ee71
raw: 0000000000000000 0000000000000000 0000040100000000 0000000000000000
page dumped because: gup_test: dump_pages() test
DUMP_USER_PAGES_TEST: done
Refcount shows the difference between pin vs no-pin case.
Also change type of nr from int to long, as it counts number of pages.
Why? Is there a bug? gup only handles int sized amounts of pages so this
is just adding to the churn, yes?
Signed-off-by: Pavel Tatashin <pasha.tatashin@xxxxxxxxxx>
---
mm/gup_test.c | 9 +++------
mm/gup_test.h | 1 +
tools/testing/selftests/vm/gup_test.c | 11 +++++++++--
3 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/mm/gup_test.c b/mm/gup_test.c
index e3cf78e5873e..24c70c5814ba 100644
--- a/mm/gup_test.c
+++ b/mm/gup_test.c
@@ -94,7 +94,7 @@ static int __gup_test_ioctl(unsigned int cmd,
{
ktime_t start_time, end_time;
unsigned long i, nr_pages, addr, next;
- int nr;
+ long nr;
struct page **pages;
int ret = 0;
bool needs_mmap_lock =
@@ -126,9 +126,6 @@ static int __gup_test_ioctl(unsigned int cmd,
nr = (next - addr) / PAGE_SIZE;
}
- /* Filter out most gup flags: only allow a tiny subset here: */
- gup->flags &= FOLL_WRITE;
-
switch (cmd) {
case GUP_FAST_BENCHMARK:
nr = get_user_pages_fast(addr, nr, gup->flags,
@@ -152,7 +149,7 @@ static int __gup_test_ioctl(unsigned int cmd,
pages + i, NULL);
break;
case DUMP_USER_PAGES_TEST:
- if (gup->flags & GUP_TEST_FLAG_DUMP_PAGES_USE_PIN)
+ if (gup->test_flags & GUP_TEST_FLAG_DUMP_PAGES_USE_PIN)
nr = pin_user_pages(addr, nr, gup->flags,
pages + i, NULL);
else
@@ -187,7 +184,7 @@ static int __gup_test_ioctl(unsigned int cmd,
start_time = ktime_get();
- put_back_pages(cmd, pages, nr_pages, gup->flags);
+ put_back_pages(cmd, pages, nr_pages, gup->test_flags);
end_time = ktime_get();
gup->put_delta_usec = ktime_us_delta(end_time, start_time);
diff --git a/mm/gup_test.h b/mm/gup_test.h
index 90a6713d50eb..b1b376b5d3b2 100644
--- a/mm/gup_test.h
+++ b/mm/gup_test.h
@@ -22,6 +22,7 @@ struct gup_test {
__u64 size;
__u32 nr_pages_per_call;
__u32 flags;
+ __u32 test_flags;
/*
* Each non-zero entry is the number of the page (1-based: first page is
* page 1, so that zero entries mean "do nothing") from the .addr base.
diff --git a/tools/testing/selftests/vm/gup_test.c b/tools/testing/selftests/vm/gup_test.c
index 6c6336dd3b7f..42c71483729f 100644
--- a/tools/testing/selftests/vm/gup_test.c
+++ b/tools/testing/selftests/vm/gup_test.c
@@ -37,13 +37,13 @@ int main(int argc, char **argv)
{
struct gup_test gup = { 0 };
unsigned long size = 128 * MB;
- int i, fd, filed, opt, nr_pages = 1, thp = -1, repeats = 1, write = 0;
+ int i, fd, filed, opt, nr_pages = 1, thp = -1, repeats = 1, write = 1;
Let's leave that alone, as noted above.
unsigned long cmd = GUP_FAST_BENCHMARK;
int flags = MAP_PRIVATE;
char *file = "/dev/zero";
char *p;
- while ((opt = getopt(argc, argv, "m:r:n:F:f:abctTLUuwSH")) != -1) {
+ while ((opt = getopt(argc, argv, "m:r:n:F:f:abctTLUuwWSHp")) != -1) {
And also as noted above I don't think we need this or any of the remaining
hunks below.
switch (opt) {
case 'a':
cmd = PIN_FAST_BENCHMARK;
@@ -65,6 +65,10 @@ int main(int argc, char **argv)
*/
gup.which_pages[0] = 1;
break;
+ case 'p':
+ /* works only with DUMP_USER_PAGES_TEST */
+ gup.test_flags |= GUP_TEST_FLAG_DUMP_PAGES_USE_PIN;
+ break;
case 'F':
/* strtol, so you can pass flags in hex form */
gup.flags = strtol(optarg, 0, 0);
@@ -93,6 +97,9 @@ int main(int argc, char **argv)
case 'w':
write = 1;
break;
+ case 'W':
+ write = 0;
+ break;
case 'f':
file = optarg;
break;
thanks,
--
John Hubbard
NVIDIA