On Sun, Mar 19, 2023 at 01:10:47PM -0700, Andrew Morton wrote:
> On Sun, 19 Mar 2023 07:09:31 +0000 Lorenzo Stoakes <lstoakes@xxxxxxxxx> wrote:
>
> > vmalloc() is, by design, not permitted to be used in atomic context and
> > already contains components which may sleep, so avoiding spin locks is not
> > a problem from the perspective of atomic context.
> >
> > The global vmap_area_lock is held when the red/black tree rooted in
> > vmap_area_root is accessed and thus is rather long-held and under
> > potentially high contention. It is likely to be under contention for reads
> > rather than write, so replace it with a rwsem.
> >
> > Each individual vmap_block->lock is likely to be held for less time but
> > under low contention, so a mutex is not an outrageous choice here.
> >
> > A subset of test_vmalloc.sh performance results:-
> >
> > fix_size_alloc_test             0.40%
> > full_fit_alloc_test             2.08%
> > long_busy_list_alloc_test       0.34%
> > random_size_alloc_test         -0.25%
> > random_size_align_alloc_test    0.06%
> > ...
> > all tests cycles                0.2%
> >
> > This represents a tiny reduction in performance that sits barely above
> > noise.
> >
> > The reason for making this change is to build a basis for vread() to be
> > usable asynchronously, thus eliminating the need for a bounce buffer when
> > copying data to userland in read_kcore() and allowing that to be converted
> > to an iterator form.
> >
>
> I'm not understanding the final paragraph. How and where is vread()
> used "asynchronously"?

I said "asynchronous" because Documentation/filesystems/vfs.rst describes
read_iter() as 'possibly asynchronous read with iov_iter as destination', and
read_iter() is what is (now) invoked when accessing /proc/kcore.

However, I agree this is vague. It is clearer to say that we are now writing
directly to user memory and therefore want to avoid spinlocks, since we may
need to fault in user memory while doing so.

Would it be ok for you to go ahead and replace that final paragraph with the
below?:-

The reason for making this change is to build a basis for vread() to write
to user memory directly via an iterator; as a result we may cause page
faults during which we must not hold a spinlock. Doing this eliminates the
need for a bounce buffer in read_kcore() and thus permits that to be
converted to also use an iterator, as a read_iter() handler.