On 2016/9/23 11:30, Nicholas Piggin wrote:
> On Fri, 23 Sep 2016 00:30:20 +0800
> zijun_hu <zijun_hu@xxxxxxxx> wrote:
>
>> On 2016/9/22 20:37, Michal Hocko wrote:
>>> On Thu 22-09-16 09:13:50, zijun_hu wrote:
>>>> On 09/22/2016 08:35 AM, David Rientjes wrote:
>>> [...]
>>>>> The intent is as it is implemented; with your change, lazy_max_pages() is
>>>>> potentially increased depending on the number of online cpus.  This is
>>>>> only a heuristic, changing it would need justification on why the new
>>>>> value is better.  It is opposite to what the comment says: "to be
>>>>> conservative and not introduce a big latency on huge systems, so go with
>>>>> a less aggressive log scale."  NACK to the patch.
>>>>>
>>>> my change potentially makes lazy_max_pages() decrease, not increase; it seems
>>>> to conform with the comment
>>>>
>>>> if the number of online CPUs is not a power of 2, the two do not differ;
>>>> otherwise, my change keeps the power of 2 value, and the original code rounds
>>>> up to the next power of 2 value, for instance
>>>>
>>>> my change        : (32, 64] -> 64
>>>>                    32 -> 32, 64 -> 64
>>>> the original code: [32, 63] -> 64
>>>>                    32 -> 64, 64 -> 128
>>>
>>> You still completely failed to explain _why_ this is an improvement/fix
>>> or why it matters. This all should be in the changelog.
>>>
>>
>> Hi npiggin,
>> could you give some comments on this patch, since lazy_max_pages() was
>> introduced by you?
>>
>> my patch is mainly based on the difference between fls() and
>> get_count_order(), which is shown below
>> more MM experts may help to decide which is more suitable
>>
>> if parameter > 1, the two return different values only when the parameter
>> is a power of two, for example
>>
>> fls(32) = 6 VS get_count_order(32) = 5
>> fls(33) = 6 VS get_count_order(33) = 6
>> fls(63) = 6 VS get_count_order(63) = 6
>> fls(64) = 7 VS get_count_order(64) = 6
>>
>> @@ -594,7 +594,9 @@ static unsigned long lazy_max_pages(void)
>>  {
>>  	unsigned int log;
>>
>> -	log = fls(num_online_cpus());
>> +	log = num_online_cpus();
>> +	if (log > 1)
>> +		log = (unsigned int)get_count_order(log);
>>
>>  	return log * (32UL * 1024 * 1024 / PAGE_SIZE);
>>  }
>>
>
> To be honest, I don't think I chose it with a lot of analysis.
> It will depend on the kernel usage patterns, the arch code,
> and the CPU microarchitecture, all of which would have changed
> significantly.
>
> I wouldn't bother changing it unless you do some benchmarking
> on different system sizes to see where the best performance is.
> (If performance is equal, fewer lazy pages would be better.)
>
> Good to see you taking a look at this vmalloc stuff. Don't be
> discouraged if you run into some dead ends.
>
> Thanks,
> Nick
>
thanks for your reply
please don't pay attention to this patch any more, since i am not in a
position to do many tests and comparisons

i just feel my change may be consistent with the operation of rounding up
to a power of 2
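For reference, the fls() versus get_count_order() comparison quoted above can
be reproduced in user space. What follows is only a minimal sketch, not the
kernel implementation: fls_sketch(), count_order_sketch(), log_original() and
log_patched() are hypothetical stand-ins written for illustration. It assumes
that fls() returns the 1-based position of the most significant set bit (0 for
an input of 0) and that get_count_order() returns the ceiling of log2 for
inputs >= 1.

```c
/* Stand-in for the kernel's fls(): 1-based index of the highest set bit,
 * 0 when no bit is set. Hypothetical helper, not the kernel code. */
static int fls_sketch(unsigned int x)
{
	int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/* Stand-in for get_count_order(): ceil(log2(count)) for count >= 1.
 * Hypothetical helper, assumed to mirror the kernel's behaviour. */
static int count_order_sketch(unsigned int count)
{
	int order = fls_sketch(count) - 1;

	/* Not a power of two: round the order up. */
	if (count & (count - 1))
		order++;
	return order;
}

/* The "log" factor as computed by the original lazy_max_pages(). */
static unsigned int log_original(unsigned int cpus)
{
	return (unsigned int)fls_sketch(cpus);
}

/* The "log" factor as computed by the proposed patch. */
static unsigned int log_patched(unsigned int cpus)
{
	unsigned int log = cpus;

	if (log > 1)
		log = (unsigned int)count_order_sketch(log);
	return log;
}
```

With these helpers, log_original(32) is 6 and log_patched(32) is 5, while both
return 6 for 33 and 63, matching the table quoted above. For a single CPU both
return 1, so the two versions differ only when the CPU count is a power of two
greater than 1.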