On Sun, 19 Mar 2023 07:09:31 +0000 Lorenzo Stoakes <lstoakes@xxxxxxxxx> wrote:

> vmalloc() is, by design, not permitted to be used in atomic context and
> already contains components which may sleep, so avoiding spin locks is not
> a problem from the perspective of atomic context.
>
> The global vmap_area_lock is held when the red/black tree rooted in
> vmap_area_root is accessed and thus is rather long-held and under
> potentially high contention. It is likely to be under contention for reads
> rather than writes, so replace it with a rwsem.
>
> Each individual vmap_block->lock is likely to be held for less time but
> under low contention, so a mutex is not an outrageous choice here.
>
> A subset of test_vmalloc.sh performance results:-
>
> fix_size_alloc_test             0.40%
> full_fit_alloc_test             2.08%
> long_busy_list_alloc_test       0.34%
> random_size_alloc_test         -0.25%
> random_size_align_alloc_test    0.06%
> ...
> all tests cycles                0.2%
>
> This represents a tiny reduction in performance that sits barely above
> noise.
>
> The reason for making this change is to build a basis for vread() to be
> usable asynchronously, thus eliminating the need for a bounce buffer when
> copying data to userland in read_kcore() and allowing that to be converted
> to an iterator form.
>

I'm not understanding the final paragraph. How and where is vread()
used "asynchronously"?