
The patch titled
     Subject: mm/gup_benchmark: support pin_user_pages() and related calls
has been added to the -mm tree.  Its filename is
     mm-gup_benchmark-support-pin_user_pages-and-related-calls.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-gup_benchmark-support-pin_user_pages-and-related-calls.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-gup_benchmark-support-pin_user_pages-and-related-calls.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: John Hubbard <jhubbard@xxxxxxxxxx>
Subject: mm/gup_benchmark: support pin_user_pages() and related calls

Up until now, gup_benchmark supported testing of the following kernel
functions:

* get_user_pages(): via the '-U' command line option
* get_user_pages_longterm(): via the '-L' command line option
* get_user_pages_fast(): as the default (no options required)

Add test coverage for the new corresponding pin_*() functions:

* pin_user_pages_fast(): via the '-a' command line option
* pin_user_pages(): via the '-b' command line option

Also, add an option for clarity: '-u' for what is now (still) the default
choice: get_user_pages_fast().

Also, for the commands that set FOLL_PIN, verify that the pages really are
dma-pinned, via the new page_dma_pinned() routine.  Those commands are:

    PIN_FAST_BENCHMARK     : calls pin_user_pages_fast()
    PIN_BENCHMARK          : calls pin_user_pages()

In between the calls to pin_*() and unpin_user_pages(), check each page:
if page_dma_pinned() returns false, then WARN and return.
Do this outside of the benchmark timestamps, so that it doesn't affect
reported times.

Link: http://lkml.kernel.org/r/20191209225344.99740-26-jhubbard@xxxxxxxxxx
Signed-off-by: John Hubbard <jhubbard@xxxxxxxxxx>
Reviewed-by: Ira Weiny <ira.weiny@xxxxxxxxx>
Cc: Alex Williamson <alex.williamson@xxxxxxxxxx>
Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Cc: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx>
Cc: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
Cc: Björn Töpel <bjorn.topel@xxxxxxxxx>
Cc: Christoph Hellwig <hch@xxxxxx>
Cc: Daniel Vetter <daniel@xxxxxxxx>
Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
Cc: Dave Chinner <david@xxxxxxxxxxxxx>
Cc: David Airlie <airlied@xxxxxxxx>
Cc: "David S . Miller" <davem@xxxxxxxxxxxxx>
Cc: Hans Verkuil <hverkuil-cisco@xxxxxxxxx>
Cc: Jan Kara <jack@xxxxxxx>
Cc: Jason Gunthorpe <jgg@xxxxxxxxxxxx>
Cc: Jason Gunthorpe <jgg@xxxxxxxx>
Cc: Jens Axboe <axboe@xxxxxxxxx>
Cc: Jerome Glisse <jglisse@xxxxxxxxxx>
Cc: Jonathan Corbet <corbet@xxxxxxx>
Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
Cc: Leon Romanovsky <leonro@xxxxxxxxxxxx>
Cc: Magnus Karlsson <magnus.karlsson@xxxxxxxxx>
Cc: Mauro Carvalho Chehab <mchehab@xxxxxxxxxx>
Cc: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Cc: Mike Rapoport <rppt@xxxxxxxxxxxxx>
Cc: Paul Mackerras <paulus@xxxxxxxxx>
Cc: Shuah Khan <shuah@xxxxxxxxxx>
Cc: Vlastimil Babka <vbabka@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/gup_benchmark.c                         |   65 +++++++++++++++++--
 tools/testing/selftests/vm/gup_benchmark.c |   15 ++++
 2 files changed, 74 insertions(+), 6 deletions(-)

--- a/mm/gup_benchmark.c~mm-gup_benchmark-support-pin_user_pages-and-related-calls
+++ a/mm/gup_benchmark.c
@@ -8,6 +8,8 @@
 #define GUP_FAST_BENCHMARK	_IOWR('g', 1, struct gup_benchmark)
 #define GUP_LONGTERM_BENCHMARK	_IOWR('g', 2, struct gup_benchmark)
 #define GUP_BENCHMARK		_IOWR('g', 3, struct gup_benchmark)
+#define PIN_FAST_BENCHMARK	_IOWR('g', 4, struct gup_benchmark)
+#define PIN_BENCHMARK		_IOWR('g', 5, struct gup_benchmark)
 
 struct gup_benchmark {
 	__u64 get_delta_usec;
@@ -19,6 +21,42 @@ struct gup_benchmark {
 	__u64 expansion[10];	/* For future use */
 };
 
+static void put_back_pages(int cmd, struct page **pages, unsigned long nr_pages)
+{
+	int i;
+
+	switch (cmd) {
+	case GUP_FAST_BENCHMARK:
+	case GUP_LONGTERM_BENCHMARK:
+	case GUP_BENCHMARK:
+		for (i = 0; i < nr_pages; i++)
+			put_page(pages[i]);
+		break;
+
+	case PIN_FAST_BENCHMARK:
+	case PIN_BENCHMARK:
+		unpin_user_pages(pages, nr_pages);
+		break;
+	}
+}
+
+static void verify_dma_pinned(int cmd, struct page **pages,
+			      unsigned long nr_pages)
+{
+	int i;
+
+	switch (cmd) {
+	case PIN_FAST_BENCHMARK:
+	case PIN_BENCHMARK:
+		for (i = 0; i < nr_pages; i++) {
+			if (WARN(!page_dma_pinned(pages[i]),
+				 "pages[%d] is NOT dma-pinned\n", i))
+				break;
+		}
+		break;
+	}
+}
+
 static int __gup_benchmark_ioctl(unsigned int cmd,
 		struct gup_benchmark *gup)
 {
@@ -65,6 +103,14 @@ static int __gup_benchmark_ioctl(unsigne
 			nr = get_user_pages(addr, nr, gup->flags, pages + i,
 					    NULL);
 			break;
+		case PIN_FAST_BENCHMARK:
+			nr = pin_user_pages_fast(addr, nr, gup->flags,
+						 pages + i);
+			break;
+		case PIN_BENCHMARK:
+			nr = pin_user_pages(addr, nr, gup->flags, pages + i,
+					    NULL);
+			break;
 		default:
 			return -1;
 		}
@@ -75,15 +121,22 @@ static int __gup_benchmark_ioctl(unsigne
 	}
 	end_time = ktime_get();
 
+	/* Shifting the meaning of nr_pages: now it is actual number pinned: */
+	nr_pages = i;
+
 	gup->get_delta_usec = ktime_us_delta(end_time, start_time);
 	gup->size = addr - gup->addr;
 
+	/*
+	 * Take an un-benchmark-timed moment to verify DMA pinned
+	 * state: print a warning if any non-dma-pinned pages are found:
+	 */
+	verify_dma_pinned(cmd, pages, nr_pages);
+
 	start_time = ktime_get();
-	for (i = 0; i < nr_pages; i++) {
-		if (!pages[i])
-			break;
-		put_page(pages[i]);
-	}
+
+	put_back_pages(cmd, pages, nr_pages);
+
 	end_time = ktime_get();
 	gup->put_delta_usec = ktime_us_delta(end_time, start_time);
 
@@ -101,6 +154,8 @@ static long gup_benchmark_ioctl(struct f
 	case GUP_FAST_BENCHMARK:
 	case GUP_LONGTERM_BENCHMARK:
 	case GUP_BENCHMARK:
+	case PIN_FAST_BENCHMARK:
+	case PIN_BENCHMARK:
 		break;
 	default:
 		return -EINVAL;
--- a/tools/testing/selftests/vm/gup_benchmark.c~mm-gup_benchmark-support-pin_user_pages-and-related-calls
+++ a/tools/testing/selftests/vm/gup_benchmark.c
@@ -18,6 +18,10 @@
 #define GUP_LONGTERM_BENCHMARK	_IOWR('g', 2, struct gup_benchmark)
 #define GUP_BENCHMARK		_IOWR('g', 3, struct gup_benchmark)
 
+/* Similar to above, but use FOLL_PIN instead of FOLL_GET. */
+#define PIN_FAST_BENCHMARK	_IOWR('g', 4, struct gup_benchmark)
+#define PIN_BENCHMARK		_IOWR('g', 5, struct gup_benchmark)
+
 /* Just the flags we need, copied from mm.h: */
 #define FOLL_WRITE	0x01	/* check pte is writable */
 
@@ -40,8 +44,14 @@ int main(int argc, char **argv)
 	char *file = "/dev/zero";
 	char *p;
 
-	while ((opt = getopt(argc, argv, "m:r:n:f:tTLUwSH")) != -1) {
+	while ((opt = getopt(argc, argv, "m:r:n:f:abtTLUuwSH")) != -1) {
 		switch (opt) {
+		case 'a':
+			cmd = PIN_FAST_BENCHMARK;
+			break;
+		case 'b':
+			cmd = PIN_BENCHMARK;
+			break;
 		case 'm':
 			size = atoi(optarg) * MB;
 			break;
@@ -63,6 +73,9 @@ int main(int argc, char **argv)
 		case 'U':
 			cmd = GUP_BENCHMARK;
 			break;
+		case 'u':
+			cmd = GUP_FAST_BENCHMARK;
+			break;
 		case 'w':
 			write = 1;
 			break;
_

Patches currently in -mm which might be from jhubbard@xxxxxxxxxx are

  mm-gup-factor-out-duplicate-code-from-four-routines.patch
  mm-gup-move-try_get_compound_head-to-top-fix-minor-issues.patch
  mm-devmap-refactor-1-based-refcounting-for-zone_device-pages.patch
  goldish_pipe-rename-local-pin_user_pages-routine.patch
  mm-fix-get_user_pages_remotes-handling-of-foll_longterm.patch
  vfio-fix-foll_longterm-use-simplify-get_user_pages_remote-call.patch
  mm-gup-allow-foll_force-for-get_user_pages_fast.patch
  ib-umem-use-get_user_pages_fast-to-pin-dma-pages.patch
  mm-gup-introduce-pin_user_pages-and-foll_pin.patch
  goldish_pipe-convert-to-pin_user_pages-and-put_user_page.patch
  ib-corehwumem-set-foll_pin-via-pin_user_pages-fix-up-odp.patch
  mm-process_vm_access-set-foll_pin-via-pin_user_pages_remote.patch
  drm-via-set-foll_pin-via-pin_user_pages_fast.patch
  fs-io_uring-set-foll_pin-via-pin_user_pages.patch
  net-xdp-set-foll_pin-via-pin_user_pages.patch
  media-v4l2-core-set-pages-dirty-upon-releasing-dma-buffers.patch
  media-v4l2-core-pin_user_pages-foll_pin-and-put_user_page-conversion.patch
  vfio-mm-pin_user_pages-foll_pin-and-put_user_page-conversion.patch
  powerpc-book3s64-convert-to-pin_user_pages-and-put_user_page.patch
  powerpc-book3s64-convert-to-pin_user_pages-and-put_user_page-fix.patch
  mm-gup_benchmark-use-proper-foll_write-flags-instead-of-hard-coding-1.patch
  mm-tree-wide-rename-put_user_page-to-unpin_user_page.patch
  mm-gup-pass-flags-arg-to-__gup_device_-functions.patch
  mm-gup-track-foll_pin-pages.patch
  mm-gup_benchmark-support-pin_user_pages-and-related-calls.patch
  selftests-vm-run_vmtests-invoke-gup_benchmark-with-basic-foll_pin-coverage.patch