Re: [PATCH v7 04/12] mm: multigenerational LRU: groundwork

Yu Zhao <yuzhao@xxxxxxxxxx> · Wed, 16 Mar 2022 15:37:06 -0600

On Wed, Mar 16, 2022 at 12:06 AM Barry Song <21cnbao@xxxxxxxxx> wrote:

<snipped>
> > The cost is not the point; the fairness is:
> >
> > 1) Ramdisk is fair to both LRU algorithms.
> > 2) Zram punishes the LRU algorithm that chooses incompressible pages.
> > IOW, this algorithm needs to compress more pages in order to save the
> > same amount of memory.
>
> I see your point, but my point is that with a higher I/O cost to swap
> pages in and out, more major faults (a lower hit ratio) will
> contribute to the loss of final performance.
>
> So for this particular case, if we move to a real disk as the swap
> device, we might see the same result as with the zRAM I was using,
> since you also reported more page faults.

If we wanted to talk about I/O cost, we would need to consider the
number of writes and writeback patterns as well. The LRU algorithm
that *unconsciously* picks more clean pages has an advantage because
writes are usually slower than reads. Similarly, the LRU algorithm
that *unconsciously* picks a cluster of cold pages that would later be
faulted in together also has an advantage because sequential reads
are faster than random reads. Do we want to go down this rabbit hole?
I think not. That's exactly why I suggested we focus on fairness.
But, just out of curiosity, I tried it: MGLRU was faster when swapping
to a slow MMC disk.

# mmc cid read /sys/class/mmc_host/mmc1/mmc1:0001
type: 'MMC'
manufacturer: 'SanDisk-Toshiba Corporation' ''
product: 'DA4064' 1.24400152
serial: 0x00000000
manfacturing date: 2006 aug

# baseline + THP=never
0 records/s
real 872.00 s
user 51.69 s
sys  483.09 s

    13.07%  __memcpy_neon
    11.37%  __pi_clear_page
     9.35%  _raw_spin_unlock_irq
     5.52%  mod_delayed_work_on
     5.17%  _raw_spin_unlock_irqrestore
     3.95%  do_raw_spin_lock
     3.87%  rmqueue_pcplist
     3.60%  local_daif_restore
     3.17%  free_unref_page_list
     2.74%  zap_pte_range
     2.00%  handle_mm_fault
     1.19%  do_anonymous_page

# MGLRU + THP=never
0 records/s
real 821.00 s
user 44.45 s
sys  428.21 s

    13.28%  __memcpy_neon
    12.78%  __pi_clear_page
     9.14%  _raw_spin_unlock_irq
     5.95%  _raw_spin_unlock_irqrestore
     5.08%  mod_delayed_work_on
     4.45%  do_raw_spin_lock
     3.86%  local_daif_restore
     3.81%  rmqueue_pcplist
     3.32%  free_unref_page_list
     2.89%  zap_pte_range
     1.89%  handle_mm_fault
     1.10%  do_anonymous_page

# baseline + THP=madvise
0 records/s
real 1341.00 s
user 68.15 s
sys  681.42 s

    12.33%  __memcpy_neon
    11.78%  _raw_spin_unlock_irq
     8.79%  __pi_clear_page
     7.63%  mod_delayed_work_on
     5.49%  _raw_spin_unlock_irqrestore
     3.23%  local_daif_restore
     3.00%  do_raw_spin_lock
     2.83%  rmqueue_pcplist
     2.21%  handle_mm_fault
     2.00%  zap_pte_range
     1.51%  free_unref_page_list
     1.33%  do_swap_page
     1.17%  do_anonymous_page

# MGLRU + THP=madvise
0 records/s
real 1315.00 s
user 60.59 s
sys  620.56 s

    12.34%  __memcpy_neon
    12.17%  _raw_spin_unlock_irq
     9.33%  __pi_clear_page
     7.33%  mod_delayed_work_on
     6.01%  _raw_spin_unlock_irqrestore
     3.27%  local_daif_restore
     3.23%  do_raw_spin_lock
     2.98%  rmqueue_pcplist
     2.12%  handle_mm_fault
     2.04%  zap_pte_range
     1.65%  free_unref_page_list
     1.27%  do_swap_page
     1.11%  do_anonymous_page
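
For a quick side-by-side, the relative differences can be worked out
from the real and sys times above. This is only a sketch of the
arithmetic; the dictionary layout and the helper name are mine, and the
numbers are copied from the four runs:

```python
# (real, sys) times in seconds, copied from the runs above.
runs = {
    "THP=never": {"baseline": (872.00, 483.09), "MGLRU": (821.00, 428.21)},
    "THP=madvise": {"baseline": (1341.00, 681.42), "MGLRU": (1315.00, 620.56)},
}


def reduction_pct(baseline, mglru):
    """Percentage by which the MGLRU time is lower than the baseline time."""
    return (baseline - mglru) / baseline * 100.0


for config, times in runs.items():
    base_real, base_sys = times["baseline"]
    mg_real, mg_sys = times["MGLRU"]
    print(
        f"{config}: real -{reduction_pct(base_real, mg_real):.1f}%, "
        f"sys -{reduction_pct(base_sys, mg_sys):.1f}%"
    )
```

That works out to roughly 5.8% less wall-clock time and 11.4% less sys
time with THP=never, and roughly 1.9% and 8.9% with THP=madvise. These
are single runs, so treat the percentages as indicative only.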