On (25/02/13 15:25), Yosry Ahmed wrote:
> On Thu, Feb 13, 2025 at 05:22:20PM +0900, Sergey Senozhatsky wrote:
> > On (25/02/13 16:21), Sergey Senozhatsky wrote:
> > > BASE
> > > ====
> > >
> > > 1363.64user 157.08system 1:30.89elapsed 1673%CPU (0avgtext+0avgdata 825692maxresident)k
> > >
> > > lock stats
> > >
> > > class name                              con-bounces  contentions  waittime-min  waittime-max  waittime-total  waittime-avg  acq-bounces  acquisitions  holdtime-min  holdtime-max  holdtime-total  holdtime-avg
> > > &pool->migrate_lock-R:                            0            0          0.00          0.00            0.00          0.00        10001        702081          0.14        104.74       125571.64          0.18
> > > &class->lock:                                     1            1          0.25          0.25            0.25          0.25         6320        840542          0.06        809.72       191214.87          0.23
> > > &zspage->lock-R:                                  0            0          0.00          0.00            0.00          0.00         6452        664129          0.12        660.24       201888.61          0.30
> > > &zram->table[index].lock:                         0            0          0.00          0.00            0.00          0.00      1716362      3096466          0.07        811.10       365551.24          0.12
> > > &zstrm->lock:                                     0            0          0.00          0.00            0.00          0.00            0        664129          1.68       1004.80     14853571.32         22.37
> > >
> > > PATCHED
> > > =======
> > >
> > > 1366.50user 154.89system 1:30.33elapsed 1684%CPU (0avgtext+0avgdata 825692maxresident)k
> > >
> > > lock stats
> > >
> > > class name                              con-bounces  contentions  waittime-min  waittime-max  waittime-total  waittime-avg  acq-bounces  acquisitions  holdtime-min  holdtime-max  holdtime-total  holdtime-avg
> > > &pool->lock#3-R:                                  0            0          0.00          0.00            0.00          0.00         3648       701979          0.12         44.09       107333.02          0.15
> > > &class->lock:                                     0            0          0.00          0.00            0.00          0.00         5038       840434          0.06       1245.90       211814.60          0.25
> > > zsmalloc-page-R:                                  0            0          0.00          0.00            0.00          0.00            0       664078          0.05        699.35       236641.75          0.36
> > > zram-entry->lock:                                 0            0          0.00          0.00            0.00          0.00            0      3098328          0.06       2987.02       313339.11          0.10
> > > &per_cpu_ptr(comp->stream, cpu)->lock:            0            0          0.00          0.00            0.00          0.00           23       664078          1.77       7071.30     14838397.61         22.34
>
> > So...
> >
> > I added lock-stat handling to zspage->lock and to zram (in zram it's
> > only trylock that we can track, but it doesn't really bother me).
> > I also renamed zsmalloc-page-R to old zspage->lock-R and
> > zram-entry->lock to old zram->table[index].lock, just in case anyone
> > cares.
> >
> > Now the bounces stats for zspage->lock and zram->table[index].lock
> > look pretty much like in the BASE case.
> >
> > PATCHED
> > =======
> >
> > class name                              con-bounces  contentions  waittime-min  waittime-max  waittime-total  waittime-avg  acq-bounces  acquisitions  holdtime-min  holdtime-max  holdtime-total  holdtime-avg
> > &pool->lock#3-R:                                  0            0          0.00          0.00            0.00          0.00         2702       703841          0.22        873.90       197110.49          0.28
> > &class->lock:                                     0            0          0.00          0.00            0.00          0.00         4590       842336          0.10       3329.63       256595.70          0.30
> > zspage->lock-R:                                   0            0          0.00          0.00            0.00          0.00         4750       665011          0.08       3360.60       258402.21          0.39
> > zram->table[index].lock:                          0            0          0.00          0.00            0.00          0.00      1722291      3099346          0.12       6943.09       721282.34          0.23
> > &per_cpu_ptr(comp->stream, cpu)->lock:            0            0          0.00          0.00            0.00          0.00           23       665011          2.84       7062.18     14896206.16         22.40
>
> holdtime-max and holdtime-total are higher in the patched kernel. Not
> sure if this is just an artifact of lock holders being preemptible.

Hmm, pool->lock shouldn't be affected at all, yet BASE holds it much
longer than PATCHED:

             holdtime-max   holdtime-total
   BASE            104.74        125571.64
   PATCHED          44.09        107333.02

Doesn't make sense. I can understand zspage->lock and
zram->table[index].lock, but zram->table[index] also looks strange
(comparing run #1 and run #2):

             holdtime-total
   BASE           365551.24
   PATCHED        313339.11

And run #3 is in a league of its own. Very likely just a very bad way
to test things.

Re-based on 6.14.0-rc2-next-20250213.
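For anyone who wants to reproduce these tables: they come from the
kernel's /proc/lock_stat interface, which needs a kernel built with
CONFIG_LOCK_STAT=y and root. A rough collection sketch (the workload
command and the grep pattern are placeholders, not part of the patch):

```shell
# Sketch: collect /proc/lock_stat deltas around a workload.
# Assumes CONFIG_LOCK_STAT=y and root; degrades gracefully otherwise.
run_with_lock_stat() {
    if [ ! -w /proc/lock_stat ]; then
        echo "lock_stat unavailable (need CONFIG_LOCK_STAT=y and root)"
    else
        echo 0 > /proc/lock_stat              # clear accumulated stats
        echo 1 > /proc/sys/kernel/lock_stat   # enable collection
        "$@"                                  # run the workload under test
        echo 0 > /proc/sys/kernel/lock_stat   # disable collection
        # keep only the zram/zsmalloc lock classes of interest
        grep -E 'zspage|zram|zstrm|pool->|class->lock' /proc/lock_stat
    fi
    echo "collection done"
}

run_with_lock_stat sleep 1
```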
BASE
====

PREEMPT_NONE

class name                              con-bounces  contentions  waittime-min  waittime-max  waittime-total  waittime-avg  acq-bounces  acquisitions  holdtime-min  holdtime-max  holdtime-total  holdtime-avg
&pool->migrate_lock-R:                            0            0          0.00          0.00            0.00          0.00         3624       702276          0.15         35.96       126562.90          0.18
&class->lock:                                     0            0          0.00          0.00            0.00          0.00         5084       840733          0.06        795.26       183238.22          0.22
&zspage->lock-R:                                  0            0          0.00          0.00            0.00          0.00         5358       664228          0.12         43.71       192732.71          0.29
&zram->table[index].lock:                         0            0          0.00          0.00            0.00          0.00      1528645      3095862          0.07        764.76       370881.23          0.12
&zstrm->lock:                                     0            0          0.00          0.00            0.00          0.00            0       664228          2.52       2033.81     14605911.45         21.99

PREEMPT_VOLUNTARY

class name                              con-bounces  contentions  waittime-min  waittime-max  waittime-total  waittime-avg  acq-bounces  acquisitions  holdtime-min  holdtime-max  holdtime-total  holdtime-avg
&pool->migrate_lock-R:                            0            0          0.00          0.00            0.00          0.00         3039       699556          0.14         50.78       125553.59          0.18
&class->lock:                                     0            0          0.00          0.00            0.00          0.00         5259       838005          0.06        943.43       177108.05          0.21
&zspage->lock-R:                                  0            0          0.00          0.00            0.00          0.00         5581       664096          0.12         81.56       190235.48          0.29
&zram->table[index].lock:                         0            0          0.00          0.00            0.00          0.00      1731706      3098570          0.07        796.87       366934.54          0.12
&zstrm->lock:                                     0            0          0.00          0.00            0.00          0.00            0       664096          3.38       5074.72     14472697.91         21.79

PREEMPT

class name                              con-bounces  contentions  waittime-min  waittime-max  waittime-total  waittime-avg  acq-bounces  acquisitions  holdtime-min  holdtime-max  holdtime-total  holdtime-avg
&pool->migrate_lock-R:                            0            0          0.00          0.00            0.00          0.00         2545       701827          0.14        773.56       125463.37          0.18
&class->lock:                                     0            0          0.00          0.00            0.00          0.00         4697       840281          0.06       1701.18       231657.38          0.28
&zspage->lock-R:                                  0            0          0.00          0.00            0.00          0.00         4778       664002          0.12        755.62       181215.17          0.27
&zram->table[index].lock:                         0            0          0.00          0.00            0.00          0.00      1731737      3096937          0.07       1703.92       384633.29          0.12
&zstrm->lock:                                     0            0          0.00          0.00            0.00          0.00            0       664002          2.85       3603.20     14586900.58         21.97

So somehow holdtime-max for the per-CPU stream lock is 2.5x higher under
PREEMPT_VOLUNTARY than under PREEMPT_NONE. And class->lock
holdtime-total is much higher under PREEMPT than under any other
preemption model.
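For reading the tables: holdtime-avg is just holdtime-total divided by
acquisitions (all times in microseconds). A quick sanity check against
the &zstrm->lock rows above (illustration only, not part of the patch):

```python
# Sanity check: holdtime-avg == holdtime-total / acquisitions.
# Values are the &zstrm->lock rows from the three BASE tables (usecs).
zstrm = {
    "PREEMPT_NONE":      (14605911.45, 664228, 21.99),
    "PREEMPT_VOLUNTARY": (14472697.91, 664096, 21.79),
    "PREEMPT":           (14586900.58, 664002, 21.97),
}

for model, (total, acqs, reported_avg) in zstrm.items():
    avg = total / acqs
    print(f"{model:17s} avg={avg:.2f}us (reported {reported_avg:.2f}us)")
    # computed average must match the reported column
    assert abs(avg - reported_avg) < 0.01
```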
And that's the BASE kernel, which runs fully atomic zsmalloc and zram.
I call this rubbish.