On Sun, 2023-07-02 at 17:13 -0700, David Rientjes wrote:
> Thanks very much for looking at this, Jay!
>
> My colleague, Binder, has also been looking at opportunities to optimize
> memory usage when using SLUB.  We're preparing to deprecate SLAB
> internally and shift toward SLUB since SLAB is scheduled for removal after
> the next LTS kernel.
>
> Binder, do you have an evaluation with this patch similar to what Jay did?
>
> Also, tangentially: we are looking at other opportunities for reduction in
> memory overhead when using SLUB.  If you or anybody else are interested in
> being involved in a working group with this shared goal, please let me
> know.  We could brainstorm, collaborate, and share data.
>
> Thanks again!
>

Hi David,

Thank you for keeping me informed. I'm interested in working together
towards our shared goal.

Thanks
Jay Patel

> On Wed, 28 Jun 2023, Jay Patel wrote:
>
> > In the previous version [1], we were able to reduce slub memory
> > wastage, but the total memory was also increasing. To solve this
> > problem, I have modified the patch as follows:
> >
> > 1) If min_objects * object_size > PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER,
> >    then it will return with PAGE_ALLOC_COSTLY_ORDER.
> > 2) Similarly, if min_objects * object_size <= PAGE_SIZE, then it will
> >    return with slub_min_order.
> > 3) Additionally, I changed slub_max_order to 2. There is no specific
> >    reason for using the value 2, but it provided the best results in
> >    terms of performance without any noticeable impact.
> >
> > [1]
> > https://lore.kernel.org/linux-mm/20230612085535.275206-1-jaypatel@xxxxxxxxxxxxx/
> >
> > I have conducted tests on systems with 160 CPUs and 16 CPUs using 4K
> > and 64K page sizes. The tests showed that the patch successfully
> > reduces both the total slab memory and the slab memory wastage without
> > any noticeable performance degradation in the hackbench test.
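
To make rules 1) and 2) above concrete, here is a minimal userspace
sketch of the two early returns. It is not the kernel code: the function
name order_shortcut(), the SKETCH_* macros and the -1 "no shortcut"
sentinel are made up for illustration, and the page size is passed in as
a parameter so both 4K and 64K can be tried. The value 3 for
PAGE_ALLOC_COSTLY_ORDER matches the mainline definition.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the kernel's PAGE_ALLOC_COSTLY_ORDER (3 in mainline). */
#define SKETCH_PAGE_ALLOC_COSTLY_ORDER 3u

/* Hypothetical sentinel: neither rule applies, caller must search orders. */
#define SKETCH_NO_SHORTCUT (-1)

/*
 * Illustrative only: returns the order chosen by rule 1 or rule 2 from the
 * patch description, or SKETCH_NO_SHORTCUT when neither rule applies.
 */
static int order_shortcut(size_t page_size, unsigned int min_objects,
			  size_t object_size, unsigned int slub_min_order)
{
	size_t want = (size_t)min_objects * object_size;

	assert(page_size > 0 && object_size > 0);

	/* Rule 1: the request exceeds a costly-order slab, so cap the order. */
	if (want > (page_size << SKETCH_PAGE_ALLOC_COSTLY_ORDER))
		return (int)SKETCH_PAGE_ALLOC_COSTLY_ORDER;

	/* Rule 2: everything fits in one page, so use the minimum order. */
	if (want <= page_size)
		return (int)slub_min_order;

	return SKETCH_NO_SHORTCUT;
}
```

For example, with min_objects = 36 (what 4 * (fls(160) + 1) evaluates
to) and a 4K page, 1024-byte objects hit rule 1, 64-byte objects hit
rule 2, and 256-byte objects fall through to the order search.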
> >
> > Test Results are as follows:
> > 1) On 160 CPUs with 4K Page size
> >
> > +----------------+----------------+----------------+
> > |           Total wastage in slub memory           |
> > +----------------+----------------+----------------+
> > |                | After Boot     | After Hackbench|
> > | Normal         | 2090 Kb        | 3204 Kb        |
> > | With Patch     | 1825 Kb        | 3088 Kb        |
> > | Wastage reduce | ~12%           | ~4%            |
> > +----------------+----------------+----------------+
> >
> > +-----------------+----------------+----------------+
> > |                 Total slub memory                 |
> > +-----------------+----------------+----------------+
> > |                 | After Boot     | After Hackbench|
> > | Normal          | 500572         | 713568         |
> > | With Patch      | 482036         | 688312         |
> > | Memory reduce   | ~4%            | ~3%            |
> > +-----------------+----------------+----------------+
> >
> > hackbench-process-sockets
> > +-------+-----+----------+----------+-----------+
> > |             | Normal   |With Patch|           |
> > +-------+-----+----------+----------+-----------+
> > | Amean | 1   | 1.3237   | 1.2737   | (  3.78%) |
> > | Amean | 4   | 1.5923   | 1.6023   | ( -0.63%) |
> > | Amean | 7   | 2.3727   | 2.4260   | ( -2.25%) |
> > | Amean | 12  | 3.9813   | 4.1290   | ( -3.71%) |
> > | Amean | 21  | 6.9680   | 7.0630   | ( -1.36%) |
> > | Amean | 30  | 10.1480  | 10.2170  | ( -0.68%) |
> > | Amean | 48  | 16.7793  | 16.8780  | ( -0.59%) |
> > | Amean | 79  | 28.9537  | 28.8187  | (  0.47%) |
> > | Amean | 110 | 39.5507  | 40.0157  | ( -1.18%) |
> > | Amean | 141 | 51.5670  | 51.8200  | ( -0.49%) |
> > | Amean | 172 | 62.8710  | 63.2540  | ( -0.61%) |
> > | Amean | 203 | 74.6417  | 75.2520  | ( -0.82%) |
> > | Amean | 234 | 86.0853  | 86.5653  | ( -0.56%) |
> > | Amean | 265 | 97.9203  | 98.4617  | ( -0.55%) |
> > | Amean | 296 | 108.6243 | 109.8770 | ( -1.15%) |
> > +-------+-----+----------+----------+-----------+
> >
> > 2) On 160 CPUs with 64K Page size
> > +-----------------+----------------+----------------+
> > |           Total wastage in slub memory            |
> > +-----------------+----------------+----------------+
> > |                 | After Boot     | After Hackbench|
> > | Normal          | 919 Kb         | 1880 Kb        |
> > | With Patch      | 807 Kb         | 1684 Kb        |
> > | Wastage reduce  | ~12%           | ~10%           |
> > +-----------------+----------------+----------------+
> >
> > +-----------------+----------------+----------------+
> > |                 Total slub memory                 |
> > +-----------------+----------------+----------------+
> > |                 | After Boot     | After Hackbench|
> > | Normal          | 1862592        | 3023744        |
> > | With Patch      | 1644416        | 2675776        |
> > | Memory reduce   | ~12%           | ~11%           |
> > +-----------------+----------------+----------------+
> >
> > hackbench-process-sockets
> > +-------+-----+----------+----------+-----------+
> > |             | Normal   |With Patch|           |
> > +-------+-----+----------+----------+-----------+
> > | Amean | 1   | 1.2547   | 1.2677   | ( -1.04%) |
> > | Amean | 4   | 1.5523   | 1.5783   | ( -1.67%) |
> > | Amean | 7   | 2.4157   | 2.3883   | (  1.13%) |
> > | Amean | 12  | 3.9807   | 3.9793   | (  0.03%) |
> > | Amean | 21  | 6.9687   | 6.9703   | ( -0.02%) |
> > | Amean | 30  | 10.1403  | 10.1297  | (  0.11%) |
> > | Amean | 48  | 16.7477  | 16.6893  | (  0.35%) |
> > | Amean | 79  | 27.9510  | 28.0463  | ( -0.34%) |
> > | Amean | 110 | 39.6833  | 39.5687  | (  0.29%) |
> > | Amean | 141 | 51.5673  | 51.4477  | (  0.23%) |
> > | Amean | 172 | 62.9643  | 63.1647  | ( -0.32%) |
> > | Amean | 203 | 74.6220  | 73.7900  | (  1.11%) |
> > | Amean | 234 | 85.1783  | 85.3420  | ( -0.19%) |
> > | Amean | 265 | 96.6627  | 96.7903  | ( -0.13%) |
> > | Amean | 296 | 108.2543 | 108.2253 | (  0.03%) |
> > +-------+-----+----------+----------+-----------+
> >
> > 3) On 16 CPUs with 4K Page size
> > +-----------------+----------------+----------------+
> > |           Total wastage in slub memory            |
> > +-----------------+----------------+----------------+
> > |                 | After Boot     | After Hackbench|
> > | Normal          | 491 Kb         | 727 Kb         |
> > | With Patch      | 483 Kb         | 670 Kb         |
> > | Wastage reduce  | ~1%            | ~8%            |
> > +-----------------+----------------+----------------+
> >
> > +-----------------+----------------+----------------+
> > |                 Total slub memory                 |
> > +-----------------+----------------+----------------+
> > |                 | After Boot     | After Hackbench|
> > | Normal          | 105340         | 153116         |
> > | With Patch      | 103620         | 147412         |
> > | Memory reduce   | ~1.6%          | ~4%            |
> > +-----------------+----------------+----------------+
> >
> > hackbench-process-sockets
> > +-------+-----+----------+----------+-----------+
> > |             | Normal   |With Patch|           |
> > +-------+-----+----------+----------+-----------+
> > | Amean | 1   | 1.0963   | 1.1070   | ( -0.97%) |
> > | Amean | 4   | 3.7963   | 3.7957   | (  0.02%) |
> > | Amean | 7   | 6.5947   | 6.6017   | ( -0.11%) |
> > | Amean | 12  | 11.1993  | 11.1730  | (  0.24%) |
> > | Amean | 21  | 19.4097  | 19.3647  | (  0.23%) |
> > | Amean | 30  | 27.7023  | 27.6040  | (  0.35%) |
> > | Amean | 48  | 44.1287  | 43.9630  | (  0.38%) |
> > | Amean | 64  | 58.8147  | 58.5753  | (  0.41%) |
> > +-------+-----+----------+----------+-----------+
> >
> > 4) On 16 CPUs with 64K Page size
> > +----------------+----------------+----------------+
> > |           Total wastage in slub memory           |
> > +----------------+----------------+----------------+
> > |                | After Boot     | After Hackbench|
> > | Normal         | 194 Kb         | 349 Kb         |
> > | With Patch     | 191 Kb         | 344 Kb         |
> > | Wastage reduce | ~1%            | ~1%            |
> > +----------------+----------------+----------------+
> >
> > +-----------------+----------------+----------------+
> > |                 Total slub memory                 |
> > +-----------------+----------------+----------------+
> > |                 | After Boot     | After Hackbench|
> > | Normal          | 330304         | 472960         |
> > | With Patch      | 319808         | 458944         |
> > | Memory reduce   | ~3%            | ~3%            |
> > +-----------------+----------------+----------------+
> >
> > hackbench-process-sockets
> > +-------+-----+----------+----------+-----------+
> > |             | Normal   |With Patch|           |
> > +-------+-----+----------+----------+-----------+
> > | Amean | 1   | 1.9030   | 1.8967   | (  0.33%) |
> > | Amean | 4   | 7.2117   | 7.1283   | (  1.16%) |
> > | Amean | 7   | 12.5247  | 12.3460  | (  1.43%) |
> > | Amean | 12  | 21.7157  | 21.4753  | (  1.11%) |
> > | Amean | 21  | 38.2693  | 37.6670  | (  1.57%) |
> > | Amean | 30  | 54.5930  | 53.8657  | (  1.33%) |
> > | Amean | 48  | 87.6700  | 86.3690  | (  1.48%) |
> > | Amean | 64  | 117.1227 | 115.4893 | (  1.39%) |
> > +-------+-----+----------+----------+-----------+
> >
> > Signed-off-by: Jay Patel <jaypatel@xxxxxxxxxxxxx>
> > ---
> >  mm/slub.c | 52 +++++++++++++++++++++++++---------------------------
> >  1 file changed, 25 insertions(+), 27 deletions(-)
> >
> > diff --git a/mm/slub.c b/mm/slub.c
> > index c87628cd8a9a..0a1090c528da 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -4058,7 +4058,7 @@ EXPORT_SYMBOL(kmem_cache_alloc_bulk);
> >   */
> >  static unsigned int slub_min_order;
> >  static unsigned int slub_max_order =
> > -	IS_ENABLED(CONFIG_SLUB_TINY) ? 1 : PAGE_ALLOC_COSTLY_ORDER;
> > +	IS_ENABLED(CONFIG_SLUB_TINY) ? 1 : 2;
> >  static unsigned int slub_min_objects;
> >  
> >  /*
> > @@ -4087,11 +4087,10 @@ static unsigned int slub_min_objects;
> >   * the smallest order which will fit the object.
> >   */
> >  static inline unsigned int calc_slab_order(unsigned int size,
> > -		unsigned int min_objects, unsigned int max_order,
> > -		unsigned int fract_leftover)
> > +		unsigned int min_objects, unsigned int max_order)
> >  {
> >  	unsigned int min_order = slub_min_order;
> > -	unsigned int order;
> > +	unsigned int order, min_wastage = size, min_wastage_order = MAX_ORDER+1;
> >  
> >  	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
> >  		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
> > @@ -4104,11 +4103,17 @@ static inline unsigned int calc_slab_order(unsigned int size,
> >  
> >  		rem = slab_size % size;
> >  
> > -		if (rem <= slab_size / fract_leftover)
> > -			break;
> > +		if (rem < min_wastage) {
> > +			min_wastage = rem;
> > +			min_wastage_order = order;
> > +		}
> >  	}
> >  
> > -	return order;
> > +	if (min_wastage_order <= slub_max_order)
> > +		return min_wastage_order;
> > +	else
> > +		return order;
> > +
> >  }
> >  
> >  static inline int calculate_order(unsigned int size)
> > @@ -4142,35 +4147,28 @@ static inline int calculate_order(unsigned int size)
> >  		nr_cpus = nr_cpu_ids;
> >  		min_objects = 4 * (fls(nr_cpus) + 1);
> >  	}
> > +
> > +	if ((min_objects * size) > (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
> > +		return PAGE_ALLOC_COSTLY_ORDER;
> > +
> > +	if ((min_objects * size) <= PAGE_SIZE)
> > +		return slub_min_order;
> > +
> >  	max_objects = order_objects(slub_max_order, size);
> >  	min_objects = min(min_objects, max_objects);
> >  
> > -	while (min_objects > 1) {
> > -		unsigned int fraction;
> > -
> > -		fraction = 16;
> > -		while (fraction >= 4) {
> > -			order = calc_slab_order(size, min_objects,
> > -					slub_max_order, fraction);
> > -			if (order <= slub_max_order)
> > -				return order;
> > -			fraction /= 2;
> > -		}
> > +	while (min_objects >= 1) {
> > +		order = calc_slab_order(size, min_objects,
> > +				slub_max_order);
> > +		if (order <= slub_max_order)
> > +			return order;
> >  		min_objects--;
> >  	}
> >  
> > -	/*
> > -	 * We were unable to place multiple objects in a slab. Now
> > -	 * lets see if we can place a single object there.
> > -	 */
> > -	order = calc_slab_order(size, 1, slub_max_order, 1);
> > -	if (order <= slub_max_order)
> > -		return order;
> > -
> >  	/*
> >  	 * Doh this slab cannot be placed using slub_max_order.
> >  	 */
> > -	order = calc_slab_order(size, 1, MAX_ORDER, 1);
> > +	order = calc_slab_order(size, 1, MAX_ORDER);
> >  	if (order <= MAX_ORDER)
> >  		return order;
> >  	return -ENOSYS;
> > -- 
> > 2.39.1
> >
> >
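
For anyone who wants to play with the numbers outside the kernel, the
core idea of the reworked calc_slab_order() loop (keep the order with
the smallest leftover instead of stopping at the first order whose
leftover is below slab_size / fract_leftover) can be approximated in
userspace as below. This is only a sketch under stated assumptions: the
name least_waste_order() is hypothetical, the page size is a parameter,
the caller supplies the starting order because the hunk above does not
show the loop header, and orders too small to hold a single object are
skipped explicitly.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative only: scan orders min_order..max_order and return the one
 * whose slab leaves the smallest remainder after packing objects of
 * `size` bytes. Ties go to the lower order because the comparison is
 * strict. Returns max_order + 1 when no order can hold even one object.
 */
static unsigned int least_waste_order(size_t page_size, size_t size,
				      unsigned int min_order,
				      unsigned int max_order)
{
	unsigned int order;
	unsigned int best_order = max_order + 1;	/* "not found" marker */
	size_t best_rem = size;		/* any real remainder is < size */

	assert(page_size > 0 && size > 0);

	for (order = min_order; order <= max_order; order++) {
		size_t slab_size = page_size << order;
		size_t rem;

		/* Assumption of this sketch: skip slabs that fit no object. */
		if (slab_size < size)
			continue;

		rem = slab_size % size;
		if (rem < best_rem) {
			best_rem = rem;
			best_order = order;
		}
	}

	return best_order;
}
```

With a 4K page and orders 0..2, a 1536-byte object leaves 1024, 512 and
1024 bytes unused at orders 0, 1 and 2, so order 1 is picked; a
3072-byte object ties at orders 0 and 2 (1024 bytes each) and the lower
order wins.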