On Mon, Feb 24, 2025 at 1:12 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
>
> On Mon, Feb 24, 2025 at 12:53 PM Vlastimil Babka <vbabka@xxxxxxx> wrote:
> >
> > On 2/24/25 02:36, Suren Baghdasaryan wrote:
> > > On Sat, Feb 22, 2025 at 8:44 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> > >>
> > >> Don't know about this particular part but testing sheaves with maple
> > >> node cache and stress testing mmap/munmap syscalls shows performance
> > >> benefits as long as there is some delay to let kfree_rcu() do its job.
> > >> I'm still gathering results and will most likely post them tomorrow.
> >
> > Without such delay, the perf is same or worse?
>
> The perf is about the same if there is no delay.
>
> >
> > > Here are the promised test results:
> > >
> > > First I ran an Android app cycle test comparing the baseline against sheaves
> > > used for maple tree nodes (as this patchset implements). I registered about
> > > 3% improvement in app launch times, indicating improvement in mmap syscall
> > > performance.
> >
> > There was no artificial 500us delay added for this test, right?
>
> Correct. No artificial changes in this test.
>
> >
> > > Next I ran an mmap stress test which maps 5 1-page readable file-backed
> > > areas, faults them in and finally unmaps them, timing mmap syscalls.
> > > Repeats that 200000 cycles and reports the total time. Average of 10 such
> > > runs is used as the final result.
> > > 3 configurations were tested:
> > >
> > > 1. Sheaves used for maple tree nodes only (this patchset).
> > >
> > > 2. Sheaves used for maple tree nodes with vm_lock to vm_refcnt conversion [1].
> > > This patchset avoids allocating additional vm_lock structure on each mmap
> > > syscall and uses TYPESAFE_BY_RCU for vm_area_struct cache.
> > >
> > > 3. Sheaves used for maple tree nodes and for vm_area_struct cache with vm_lock
> > > to vm_refcnt conversion [1].
> > > For the vm_area_struct cache I had to replace
> > > TYPESAFE_BY_RCU with sheaves, as we can't use both for the same cache.
> >
> > Hm why we can't use both? I don't think any kmem_cache_create check makes
> > them exclusive? TYPESAFE_BY_RCU only affects how slab pages are freed, it
> > doesn't e.g. delay reuse of individual objects, and caching in a sheaf
> > doesn't write to the object. Am I missing something?
>
> Ah, I was under impression that to use sheaves I would have to ensure
> the freeing happens via kfree_rcu()->kfree_rcu_sheaf() path but now
> that you mentioned that, I guess I could keep using kmem_cache_free()
> and that would use free_to_pcs() internally... When time comes to free
> the page, TYPESAFE_BY_RCU will free it after the grace period.
> I can try that combination as well and see if anything breaks.

This seems to be working fine. The new configuration is:

4. Sheaves used for maple tree nodes and for vm_area_struct cache with
vm_lock to vm_refcnt conversion [1]. vm_area_struct cache uses both
TYPESAFE_BY_RCU and sheaves (but obviously not kfree_rcu_sheaf()).

>
> >
> > > The values represent the total time it took to perform mmap syscalls, less is
> > > better.
> > >
> > > (1)            baseline    control
> > > Little core    7.58327     6.614939 (-12.77%)
> > > Medium core    2.125315    1.428702 (-32.78%)
> > > Big core       0.514673    0.422948 (-17.82%)
> > >
> > > (2)            baseline    control
> > > Little core    7.58327     5.141478 (-32.20%)
> > > Medium core    2.125315    0.427692 (-79.88%)
> > > Big core       0.514673    0.046642 (-90.94%)
> > >
> > > (3)            baseline    control
> > > Little core    7.58327     4.779624 (-36.97%)
> > > Medium core    2.125315    0.450368 (-78.81%)
> > > Big core       0.514673    0.037776 (-92.66%)

(4)            baseline    control
Little core    7.58327     4.642977 (-38.77%)
Medium core    2.125315    0.373692 (-82.42%)
Big core       0.514673    0.043613 (-91.53%)

I think the difference between (3) and (4) is noise.
Thanks,
Suren.
> > >
> > > Results in (3) vs (2) indicate that using sheaves for vm_area_struct
> > > yields slightly better averages and I noticed that this was mostly due
> > > to sheaves results missing occasional spikes that worsened
> > > TYPESAFE_BY_RCU averages (the results seemed more stable with
> > > sheaves).
> >
> > Thanks a lot, that looks promising!
>
> Indeed, that looks better than I expected :)
> Cheers!
>
> >
> > >
> > > [1] https://lore.kernel.org/all/20250213224655.1680278-1-surenb@xxxxxxxxxx/
> > >
> >
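
For reference, the percentages in tables (1) through (4) are consistent
with the relative change of the control time against the baseline, i.e.
(control - baseline) / baseline * 100, so a negative value means the
control configuration spent less total time in mmap syscalls. A minimal
Python sketch, assuming that formula (the helper name pct_change is
hypothetical and not from the thread), recomputes the figures for
configuration (4):

```python
def pct_change(baseline: float, control: float) -> float:
    """Relative change of control vs. baseline in percent (negative = faster)."""
    return (control - baseline) / baseline * 100.0


# Configuration (4) totals, copied from the results table in this thread.
results_4 = {
    "Little core": (7.58327, 4.642977),
    "Medium core": (2.125315, 0.373692),
    "Big core": (0.514673, 0.043613),
}

for core, (baseline, control) in results_4.items():
    delta = pct_change(baseline, control)
    print(f"{core:<12} {baseline:<10} {control:<10} ({delta:+.2f}%)")
```

The same helper can be applied to the (3) and (4) control columns directly
to see how small the gap between those two configurations is relative to
the baseline.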