On giovedì 19 gennaio 2023 17:20:55 CEST Fabio M. De Francesco wrote: > The use of kmap() and kmap_atomic() are being deprecated in favor of > kmap_local_page(). > > There are two main problems with kmap(): (1) It comes with an overhead as > the mapping space is restricted and protected by a global lock for > synchronization and (2) it also requires global TLB invalidation when the > kmap’s pool wraps and it might block when the mapping space is fully > utilized until a slot becomes available. > > With kmap_local_page() the mappings are per thread, CPU local, can take > page faults, and can be called from any context (including interrupts). > It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore, > the tasks can be preempted and, when they are scheduled to run again, the > kernel virtual addresses are restored and still valid. > > The use of kmap_local_page() in fs/aio.c is "safe" in the sense that the > code don't hands the returned kernel virtual addresses to other threads > and there are no nesting which should be handled with the stack based > (LIFO) mappings/un-mappings order. Furthermore, the code between the old > kmap_atomic()/kunmap_atomic() did not depend on disabling page-faults > and/or preemption, so that there is no need to call pagefault_disable() > and/or preempt_disable() before the mappings. > > Therefore, replace kmap() and kmap_atomic() with kmap_local_page() in > fs/aio.c. > > Tested with xfstests on a QEMU/KVM x86_32 VM, 6GB RAM, booting a kernel > with HIGHMEM64GB enabled. > Hi Al, I see that this patch is here since Jan 19, 2023. Is there anything that prevents its merging? Am I expected to do further changes? Please notice that it already had three "Reviewed-by:" tags (again thanks to Ira, Jeff and Kent). Can you please take it in your three? Thanks, Fabio > > Cc: "Venkataramanan, Anirudh" <anirudh.venkataramanan@xxxxxxxxx> > Suggested-by: Ira Weiny <ira.weiny@xxxxxxxxx> > Reviewed-by: Ira Weiny <ira.weiny@xxxxxxxxx> > Reviewed-by: Jeff Moyer <jmoyer@xxxxxxxxxx> > Reviewed-by: Kent Overstreet <kent.overstreet@xxxxxxxxx> > Signed-off-by: Fabio M. De Francesco <fmdefrancesco@xxxxxxxxx> > --- > > I've tested with "./check -g aio". The tests in this group fail 3/26 > times, with and without my patch. Therefore, these changes don't introduce > further errors. I'm not aware of any other tests which I may run, so that > any suggestions would be precious and much appreciated :-) > > I'm resending this patch because some recipients were missing in the > previous submissions. In the meantime I'm also adding some more information > in the commit message. There are no changes in the code. > > Changes from v1: > Add further information in the commit message, and the > "Reviewed-by" tags from Ira and Jeff (thanks!). > > Changes from v2: > Rewrite a block of code between mapping/un-mapping to improve > readability in aio_setup_ring() and add a missing call to > flush_dcache_page() in ioctx_add_table() (thanks to Al Viro); > Add a "Reviewed-by" tag from Kent Overstreet (thanks). > > fs/aio.c | 46 +++++++++++++++++++++------------------------- > 1 file changed, 21 insertions(+), 25 deletions(-) > > diff --git a/fs/aio.c b/fs/aio.c > index 562916d85cba..9b39063dc7ac 100644 > --- a/fs/aio.c > +++ b/fs/aio.c > @@ -486,7 +486,6 @@ static const struct address_space_operations aio_ctx_aops > = { > > static int aio_setup_ring(struct kioctx *ctx, unsigned int nr_events) > { > - struct aio_ring *ring; > struct mm_struct *mm = current->mm; > unsigned long size, unused; > int nr_pages; > @@ -567,16 +566,12 @@ static int aio_setup_ring(struct kioctx *ctx, unsigned > int nr_events) ctx->user_id = ctx->mmap_base; > ctx->nr_events = nr_events; /* trusted copy */ > > - ring = kmap_atomic(ctx->ring_pages[0]); > - ring->nr = nr_events; /* user copy */ > - ring->id = ~0U; > - ring->head = ring->tail = 0; > - ring->magic = AIO_RING_MAGIC; > - ring->compat_features = AIO_RING_COMPAT_FEATURES; > - ring->incompat_features = AIO_RING_INCOMPAT_FEATURES; > - ring->header_length = sizeof(struct aio_ring); > - kunmap_atomic(ring); > - flush_dcache_page(ctx->ring_pages[0]); > + memcpy_to_page(ctx->ring_pages[0], 0, (const char *)&(struct aio_ring) { > + .nr = nr_events, .id = ~0U, .magic = AIO_RING_MAGIC, > + .compat_features = AIO_RING_COMPAT_FEATURES, > + .incompat_features = AIO_RING_INCOMPAT_FEATURES, > + .header_length = sizeof(struct aio_ring) }, > + sizeof(struct aio_ring)); > > return 0; > } > @@ -678,9 +673,10 @@ static int ioctx_add_table(struct kioctx *ctx, struct > mm_struct *mm) * we are protected from page migration > * changes ring_pages by - >ring_lock. > */ > - ring = kmap_atomic(ctx- >ring_pages[0]); > + ring = kmap_local_page(ctx- >ring_pages[0]); > ring->id = ctx->id; > - kunmap_atomic(ring); > + kunmap_local(ring); > + flush_dcache_page(ctx- >ring_pages[0]); > return 0; > } > > @@ -1021,9 +1017,9 @@ static void user_refill_reqs_available(struct kioctx > *ctx) * against ctx->completed_events below will make sure we do the > * safe/right thing. > */ > - ring = kmap_atomic(ctx->ring_pages[0]); > + ring = kmap_local_page(ctx->ring_pages[0]); > head = ring->head; > - kunmap_atomic(ring); > + kunmap_local(ring); > > refill_reqs_available(ctx, head, ctx->tail); > } > @@ -1129,12 +1125,12 @@ static void aio_complete(struct aio_kiocb *iocb) > if (++tail >= ctx->nr_events) > tail = 0; > > - ev_page = kmap_atomic(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]); > + ev_page = kmap_local_page(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]); > event = ev_page + pos % AIO_EVENTS_PER_PAGE; > > *event = iocb->ki_res; > > - kunmap_atomic(ev_page); > + kunmap_local(ev_page); > flush_dcache_page(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]); > > pr_debug("%p[%u]: %p: %p %Lx %Lx %Lx\n", ctx, tail, iocb, > @@ -1148,10 +1144,10 @@ static void aio_complete(struct aio_kiocb *iocb) > > ctx->tail = tail; > > - ring = kmap_atomic(ctx->ring_pages[0]); > + ring = kmap_local_page(ctx->ring_pages[0]); > head = ring->head; > ring->tail = tail; > - kunmap_atomic(ring); > + kunmap_local(ring); > flush_dcache_page(ctx->ring_pages[0]); > > ctx->completed_events++; > @@ -1211,10 +1207,10 @@ static long aio_read_events_ring(struct kioctx *ctx, > mutex_lock(&ctx->ring_lock); > > /* Access to ->ring_pages here is protected by ctx->ring_lock. */ > - ring = kmap_atomic(ctx->ring_pages[0]); > + ring = kmap_local_page(ctx->ring_pages[0]); > head = ring->head; > tail = ring->tail; > - kunmap_atomic(ring); > + kunmap_local(ring); > > /* > * Ensure that once we've read the current tail pointer, that > @@ -1246,10 +1242,10 @@ static long aio_read_events_ring(struct kioctx *ctx, > avail = min(avail, nr - ret); > avail = min_t(long, avail, AIO_EVENTS_PER_PAGE - pos); > > - ev = kmap(page); > + ev = kmap_local_page(page); > copy_ret = copy_to_user(event + ret, ev + pos, > sizeof(*ev) * avail); > - kunmap(page); > + kunmap_local(ev); > > if (unlikely(copy_ret)) { > ret = -EFAULT; > @@ -1261,9 +1257,9 @@ static long aio_read_events_ring(struct kioctx *ctx, > head %= ctx->nr_events; > } > > - ring = kmap_atomic(ctx->ring_pages[0]); > + ring = kmap_local_page(ctx->ring_pages[0]); > ring->head = head; > - kunmap_atomic(ring); > + kunmap_local(ring); > flush_dcache_page(ctx->ring_pages[0]); > > pr_debug("%li h%u t%u\n", ret, head, tail); > -- > 2.39.0