On Fri, 2023-08-18 at 14:11 +0900, Hyeonggon Yoo wrote: > On Fri, Aug 11, 2023 at 3:52 PM Jay Patel <jaypatel@xxxxxxxxxxxxx> > wrote: > > On Fri, 2023-08-11 at 02:54 +0900, Hyeonggon Yoo wrote: > > > On Thu, Jul 20, 2023 at 7:24 PM Jay Patel <jaypatel@xxxxxxxxxxxxx > > > > > > > wrote: > > > > In the current implementation of the slub memory allocator, the > > > > slab > > > > order selection process follows these criteria: > > > > > > > > 1) Determine the minimum order required to serve the minimum > > > > number > > > > of > > > > objects (min_objects). This calculation is based on the formula > > > > (order > > > > = min_objects * object_size / PAGE_SIZE). > > > > 2) If the minimum order is greater than the maximum allowed > > > > order > > > > (slub_max_order), set slub_max_order as the order for this > > > > slab. > > > > 3) If the minimum order is less than the slub_max_order, > > > > iterate > > > > through a loop from minimum order to slub_max_order and check > > > > if > > > > the > > > > condition (rem <= slab_size / fract_leftover) holds true. Here, > > > > slab_size is calculated as (PAGE_SIZE << order), rem is > > > > (slab_size > > > > % > > > > object_size), and fract_leftover can have values of 16, 8, or > > > > 4. If > > > > the condition is true, select that order for the slab. > > > > > > > > > > > > However, in point 3, when calculating the fraction left over, > > > > it > > > > can > > > > result in a large range of values (like 1 Kb to 256 bytes on 4K > > > > page > > > > size & 4 Kb to 16 Kb on 64K page size with order 0 and goes on > > > > increasing with higher order) when compared to the remainder > > > > (rem). > > > > This > > > > can lead to the selection of an order that results in more > > > > memory > > > > wastage. To mitigate such wastage, we have modified point 3 as > > > > follows: > > > > To adjust the value of fract_leftover based on the page size, > > > > while > > > > retaining the current value as the default for a 4K page size. > > > > > > > > Test results are as follows: > > > > > > > > 1) On 160 CPUs with 64K Page size > > > > > > > > +-----------------+----------------+----------------+ > > > > > Total wastage in slub memory | > > > > +-----------------+----------------+----------------+ > > > > > | After Boot |After Hackbench | > > > > > Normal | 932 Kb | 1812 Kb | > > > > > With Patch | 729 Kb | 1636 Kb | > > > > > Wastage reduce | ~22% | ~10% | > > > > +-----------------+----------------+----------------+ > > > > > > > > +-----------------+----------------+----------------+ > > > > > Total slub memory | > > > > +-----------------+----------------+----------------+ > > > > > | After Boot | After Hackbench| > > > > > Normal | 1855296 | 2944576 | > > > > > With Patch | 1544576 | 2692032 | > > > > > Memory reduce | ~17% | ~9% | > > > > +-----------------+----------------+----------------+ > > > > > > > > hackbench-process-sockets > > > > +-------+-----+----------+----------+-----------+ > > > > > Amean | 1 | 1.2727 | 1.2450 | ( 2.22%) | > > > > > Amean | 4 | 1.6063 | 1.5810 | ( 1.60%) | > > > > > Amean | 7 | 2.4190 | 2.3983 | ( 0.86%) | > > > > > Amean | 12 | 3.9730 | 3.9347 | ( 0.97%) | > > > > > Amean | 21 | 6.9823 | 6.8957 | ( 1.26%) | > > > > > Amean | 30 | 10.1867 | 10.0600 | ( 1.26%) | > > > > > Amean | 48 | 16.7490 | 16.4853 | ( 1.60%) | > > > > > Amean | 79 | 28.1870 | 27.8673 | ( 1.15%) | > > > > > Amean | 110 | 39.8363 | 39.3793 | ( 1.16%) | > > > > > Amean | 141 | 51.5277 | 51.4907 | ( 0.07%) | > > > > > Amean | 172 | 62.9700 | 62.7300 | ( 0.38%) | > > > > > Amean | 203 | 74.5037 | 74.0630 | ( 0.59%) | > > > > > Amean | 234 | 85.6560 | 85.3587 | ( 0.35%) | > > > > > Amean | 265 | 96.9883 | 96.3770 | ( 0.63%) | > > > > > Amean | 296 | 108.6893 | 108.0870 | ( 0.56%) | > > > > +-------+-----+----------+----------+-----------+ > > > > > > > > 2) On 16 CPUs with 64K Page size > > > > > > > > +----------------+----------------+----------------+ > > > > > Total wastage in slub memory | > > > > +----------------+----------------+----------------+ > > > > > | After Boot | After Hackbench| > > > > > Normal | 273 Kb | 544 Kb | > > > > > With Patch | 260 Kb | 500 Kb | > > > > > Wastage reduce | ~5% | ~9% | > > > > +----------------+----------------+----------------+ > > > > > > > > +-----------------+----------------+----------------+ > > > > > Total slub memory | > > > > +-----------------+----------------+----------------+ > > > > > | After Boot | After Hackbench| > > > > > Normal | 275840 | 412480 | > > > > > With Patch | 272768 | 406208 | > > > > > Memory reduce | ~1% | ~2% | > > > > +-----------------+----------------+----------------+ > > > > > > > > hackbench-process-sockets > > > > +-------+----+---------+---------+-----------+ > > > > > Amean | 1 | 0.9513 | 0.9250 | ( 2.77%) | > > > > > Amean | 4 | 2.9630 | 2.9570 | ( 0.20%) | > > > > > Amean | 7 | 5.1780 | 5.1763 | ( 0.03%) | > > > > > Amean | 12 | 8.8833 | 8.8817 | ( 0.02%) | > > > > > Amean | 21 | 15.7577 | 15.6883 | ( 0.44%) | > > > > > Amean | 30 | 22.2063 | 22.2843 | ( -0.35%) | > > > > > Amean | 48 | 36.0587 | 36.1390 | ( -0.22%) | > > > > > Amean | 64 | 49.7803 | 49.3457 | ( 0.87%) | > > > > +-------+----+---------+---------+-----------+ > > > > > > > > Signed-off-by: Jay Patel <jaypatel@xxxxxxxxxxxxx> > > > > --- > > > > Changes from V3 > > > > 1) Resolved error and optimise logic for all arch > > > > > > > > Changes from V2 > > > > 1) removed all page order selection logic for slab cache base > > > > on > > > > wastage. > > > > 2) Increasing fraction size base on page size (keeping current > > > > value > > > > as default to 4K page) > > > > > > > > Changes from V1 > > > > 1) If min_objects * object_size > PAGE_ALLOC_COSTLY_ORDER, then > > > > it > > > > will return with PAGE_ALLOC_COSTLY_ORDER. > > > > 2) Similarly, if min_objects * object_size < PAGE_SIZE, then it > > > > will > > > > return with slub_min_order. > > > > 3) Additionally, I changed slub_max_order to 2. There is no > > > > specific > > > > reason for using the value 2, but it provided the best results > > > > in > > > > terms of performance without any noticeable impact. > > > > > > > > mm/slub.c | 17 +++++++---------- > > > > 1 file changed, 7 insertions(+), 10 deletions(-) > > > > > > > > diff --git a/mm/slub.c b/mm/slub.c > > > > index c87628cd8a9a..8f6f38083b94 100644 > > > > --- a/mm/slub.c > > > > +++ b/mm/slub.c > > > > @@ -287,6 +287,7 @@ static inline bool > > > > kmem_cache_has_cpu_partial(struct kmem_cache *s) > > > > #define OO_SHIFT 16 > > > > #define OO_MASK ((1 << OO_SHIFT) - 1) > > > > #define MAX_OBJS_PER_PAGE 32767 /* since slab.objects is > > > > u15 > > > > */ > > > > +#define SLUB_PAGE_FRAC_SHIFT 12 > > > > > > > > /* Internal SLUB flags */ > > > > /* Poison object */ > > > > @@ -4117,6 +4118,7 @@ static inline int > > > > calculate_order(unsigned > > > > int size) > > > > unsigned int min_objects; > > > > unsigned int max_objects; > > > > unsigned int nr_cpus; > > > > + unsigned int page_size_frac; > > > > > > > > /* > > > > * Attempt to find best configuration for a slab. This > > > > @@ -4145,10 +4147,13 @@ static inline int > > > > calculate_order(unsigned > > > > int size) > > > > max_objects = order_objects(slub_max_order, size); > > > > min_objects = min(min_objects, max_objects); > > > > > > > > - while (min_objects > 1) { > > > > + page_size_frac = ((PAGE_SIZE >> SLUB_PAGE_FRAC_SHIFT) > > > > == 1) > > > > ? 0 > > > > + : PAGE_SIZE >> SLUB_PAGE_FRAC_SHIFT; > > > > + > > > > + while (min_objects >= 1) { > > > > unsigned int fraction; > > > > > > > > - fraction = 16; > > > > + fraction = 16 + page_size_frac; > > > > while (fraction >= 4) { > > > > > > Sorry I'm a bit late for the review. > > > > > > IIRC hexagon/powerpc can have ridiculously large page sizes (1M > > > or > > > 256KB) > > > (but I don't know if such config is actually used, tbh) so I > > > think > > > there should be > > > an upper bound. > > > > Hi, > > I think that might not be required as arch with larger page size > > will required larger fraction value as per this exit condition (rem > > <= > > slab_size / fract_leftover) during calc_slab_order. > > Okay, with 256KB pages the fraction will start from 80, and then 40, > 20, 10, 5, ... > and 1/80 of 256KB is about 3KB. So it's to waste less even when the > machine uses large page sizes, > because 1/16 of 256KB is still large, right? Yes correct, so with this approach we can save memory wastage and total memory for slub when using larger page size :) > > > > > order = calc_slab_order(size, > > > > min_objects, > > > > slub_max_order, > > > > fraction); > > > > @@ -4159,14 +4164,6 @@ static inline int > > > > calculate_order(unsigned > > > > int size) > > > > min_objects--; > > > > } > > > > - /* > > > > - * We were unable to place multiple objects in a slab. > > > > Now > > > > - * lets see if we can place a single object there. > > > > - */ > > > > - order = calc_slab_order(size, 1, slub_max_order, 1); > > > > - if (order <= slub_max_order) > > > > - return order; > > > > > > I'm not sure if it's okay to remove this? > > > It was fine in v2 because the least wasteful order was chosen > > > regardless of fraction but that's not true anymore. > > > > > Ok, So my though are like if single object in slab with slab_size = > > PAGE_SIZE << slub_max_order and it wastage more then 1\4th of > > slab_size > > then it's better to skip this part and use MAX_ORDER instead of > > slub_max_order. > > Could you kindly share your perspective on this part? > > I simply missed that part! :) > That looks fine to me. > > > > Tha > > nks > > Jay Patel > > > Otherwise, everything looks fine to me. I'm too dumb to > > > anticipate > > > the outcome of increasing the slab order :P but this patch does > > > not > > > sound crazy to me. > > > > > > Thanks! > > > -- > > > Hyeonggon