On 9/6/18 1:03 PM, Michal Hocko wrote:
> On Thu 06-09-18 08:41:52, Alexander Duyck wrote:
>> On Thu, Sep 6, 2018 at 8:13 AM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>>>
>>> On Thu 06-09-18 07:59:03, Dave Hansen wrote:
>>>> On 09/05/2018 10:47 PM, Michal Hocko wrote:
>>>>> why do you have to keep DEBUG_VM enabled for workloads where the boot
>>>>> time matters so much that few seconds matter?
>>>>
>>>> There are a number of distributions that run with it enabled in the
>>>> default build. Fedora, for one. We've basically assumed for a while
>>>> that we have to live with it in production environments.
>>>>
>>>> So, where does that leave us? I think we either need a _generic_ debug
>>>> option like:
>>>>
>>>> CONFIG_DEBUG_VM_SLOW_AS_HECK
>>>>
>>>> under which we can put this and other really slow VM debugging. Or, we
>>>> need some kind of boot-time parameter to trigger the extra checking
>>>> instead of a new CONFIG option.
>>>
>>> I strongly suspect nobody will ever enable such a scary looking config
>>> TBH. Besides I am not sure what should go under that config option.
>>> Something that takes few cycles but it is called often or one time stuff
>>> that takes quite a long but less than aggregated overhead of the former?
>>>
>>> Just consider this particular case. It basically re-adds an overhead
>>> that has always been there before the struct page init optimization
>>> went in. The poisoning just returns it in a different form to catch
>>> potential left overs. And we would like to have as many people willing
>>> to run in debug mode to test for those paths because they are
>>> basically impossible to review by the code inspection. More importantly
>>> the major overhead is boot time so my question still stands. Is this
>>> worth a separate config option almost nobody is going to enable?
>>>
>>> Enabling DEBUG_VM by Fedora and others serves us a very good testing
>>> coverage and I appreciate that because it has generated some useful bug
>>> reports.
>>> Those people are paying quite a lot of overhead in runtime
>>> which can aggregate over time is it so much to ask about one time boot
>>> overhead?
>>
>> The kind of boot time add-on I saw as a result of this was about 170
>> seconds, or 2 minutes and 50 seconds on a 12TB system.
>
> Just curious. How long does it take to get from power on to even reach
> boot loader on that machine... ;)
>
>> I spent a
>> couple minutes wondering if I had built a bad kernel or not as I was
>> staring at a dead console the entire time after the grub prompt since
>> I hit this so early in the boot. That is the reason why I am so eager
>> to slice this off and make it something separate. I could easily see
>> this as something that would get in the way of other debugging that is
>> going on in a system.
>
> But you would get the same overhead a kernel release ago when the
> memmap init optimization was merged. So you are basically back to what
> we used to have for years. Unless I misremember.

You remember this correctly:

2f47a91f4dab19aaaa05cdcfced9dfcaf3f5257e
has data for before vs. after zeroing memory in the memblock allocator.

On SPARC with 15T we saved 55.4s. The saving is comparatively small
because SPARC has larger base pages, and thus fewer struct pages.

On x86 with 1T we saved 15.8s, which scales to 189.6s for the 12T
machine Alexander is using. That is close to the 170s he is seeing;
his CPU must be faster.

Pavel
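
For reference, the 189.6s figure above is a plain linear extrapolation by
memory size. A minimal sketch in Python, assuming the cost is proportional
to the amount of memory (same base page size, same CPU speed); the function
and variable names are made up for illustration:

```python
# Figures quoted in this thread; the linear model itself is an assumption.
X86_SAVED_SECONDS = 15.8   # boot time saved on the 1T x86 machine
X86_MEMORY_TB = 1          # memory size of that machine
TARGET_MEMORY_TB = 12      # Alexander's machine
OBSERVED_SECONDS = 170     # overhead Alexander reported


def scale_linearly(seconds, measured_tb, target_tb):
    """Extrapolate a per-boot cost, assuming it grows linearly with memory."""
    return seconds * target_tb / measured_tb


estimate = scale_linearly(X86_SAVED_SECONDS, X86_MEMORY_TB, TARGET_MEMORY_TB)
print(f"estimated: {estimate:.1f}s, observed: {OBSERVED_SECONDS}s")
```

The gap between the estimate and the observed value is what a faster CPU
on the 12T machine would account for.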