From: Wonhyuk Yang <vvghjk1234@xxxxxxxxx>

The best upa is determined by iterating from 1 to max_upa. If alloc_size
is a power of 2, the number of iterations drops to a logarithmic level.

Prime factorization of alloc_size makes it easy to enumerate the possible
upas. When alloc_size is a power of 2, we can avoid the cost of the prime
factorization, and the possible upas are 1, 2, 4, ... max_upa.

Signed-off-by: Wonhyuk Yang <vvghjk1234@xxxxxxxxx>
---
 mm/percpu.c | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/mm/percpu.c b/mm/percpu.c
index 66a93f096394..a24f3973744f 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -2689,18 +2689,17 @@ static struct pcpu_alloc_info * __init pcpu_build_alloc_info(
 
 	/*
 	 * Determine min_unit_size, alloc_size and max_upa such that
-	 * alloc_size is multiple of atom_size and is the smallest
-	 * which can accommodate 4k aligned segments which are equal to
-	 * or larger than min_unit_size.
+	 * alloc_size is the maximum value of min_unit_size, atom_size.
+	 * Also, alloc_size is power of 2 because both min_unit_size
+	 * and atom_size are power of 2.
 	 */
 	min_unit_size = max_t(size_t, size_sum, PCPU_MIN_UNIT_SIZE);
+	min_unit_size = roundup_pow_of_two(min_unit_size);
 
 	/* determine the maximum # of units that can fit in an allocation */
-	alloc_size = roundup(min_unit_size, atom_size);
-	upa = alloc_size / min_unit_size;
-	while (alloc_size % upa || (offset_in_page(alloc_size / upa)))
-		upa--;
-	max_upa = upa;
+	alloc_size = max_t(size_t, min_unit_size, atom_size);
+	max_upa = alloc_size / min_unit_size;
+
 
 	/* group cpus according to their proximity */
 	for_each_possible_cpu(cpu) {
@@ -2727,12 +2726,9 @@ static struct pcpu_alloc_info * __init pcpu_build_alloc_info(
 	 * Related to atom_size, which could be much larger than the unit_size.
 	 */
 	last_allocs = INT_MAX;
-	for (upa = max_upa; upa; upa--) {
+	for (upa = max_upa; upa; upa >>= 1) {
 		int allocs = 0, wasted = 0;
 
-		if (alloc_size % upa || (offset_in_page(alloc_size / upa)))
-			continue;
-
 		for (group = 0; group < nr_groups; group++) {
 			int this_allocs = DIV_ROUND_UP(group_cnt[group], upa);
 			allocs += this_allocs;
-- 
2.17.1
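
To illustrate the effect described in the commit message, here is a small
userspace sketch of the new max_upa calculation and of the halving loop. It
is not kernel code: roundup_pow2(), max_upa_pow2() and count_candidates()
are hypothetical helper names, min_unit stands in for PCPU_MIN_UNIT_SIZE,
and atom_size is assumed to be a power of 2, as the patch comment states.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace stand-in for the kernel's roundup_pow_of_two().
 * Assumes n >= 1 and that the result fits in size_t.
 */
size_t roundup_pow2(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Mirror of the patched max_upa calculation. min_unit is a hypothetical
 * parameter standing in for PCPU_MIN_UNIT_SIZE.
 */
size_t max_upa_pow2(size_t size_sum, size_t min_unit, size_t atom_size)
{
	size_t min_unit_size = size_sum > min_unit ? size_sum : min_unit;
	size_t alloc_size;

	/* assumption taken from the patch comment: atom_size is a power of 2 */
	assert(atom_size && !(atom_size & (atom_size - 1)));

	min_unit_size = roundup_pow2(min_unit_size);
	alloc_size = min_unit_size > atom_size ? min_unit_size : atom_size;

	/* both values are powers of 2, so the division is exact */
	return alloc_size / min_unit_size;
}

/* Count the upa candidates visited by the halving loop in the patch. */
unsigned int count_candidates(size_t max_upa)
{
	unsigned int n = 0;
	size_t upa;

	for (upa = max_upa; upa; upa >>= 1)
		n++;
	return n;
}
```

With size_sum = 40000, min_unit = 32768 and a 2 MiB atom_size, min_unit_size
rounds up to 65536 and max_upa is 32. The halving loop then visits 6
candidates (32, 16, 8, 4, 2, 1) rather than stepping through all 32 values.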