On Thu, Nov 16, 2017 at 09:54:33AM +0100, Michal Hocko wrote: > On Wed 15-11-17 14:17:52, YASUAKI ISHIMATSU wrote: > > Hi Michal and Mel, > > > > To reproduce the issue, I specified the large trace buffer. The issue also occurs with > > trace_buf_size=12M and movable_node on 4.14.0. > > This is still 10x more than the default. Why do you need it in the first > place? You can of course find a size that will not fit into the initial > memory but I am questioning why do you want something like that during > early boot in the first place. > This confused me as well. I couldn't think of a sensible use-case for increasing the buffer other than stuffing trace_printk in multiple places for debugging purposes. Even in such cases, it would be feasible to disable the option in Kconfig just to have the large buffer. Otherwise, just wait until the system is booted and set if from userspace. The lack of a sensible use-case is why I took a fairly blunt approach to the problem. Keep it (relatively) simple and all that. > The whole deferred struct page allocation operates under assumption > that there are no large page allocator consumers that early during > the boot process. Yes. > If this assumption is not correct then we probably > need a generic way to describe this. Add-hoc trace specific thing is > far from idea, imho. If anything the first approach to disable the > deferred initialization via kernel command line option sounds much more > appropriate and simpler to me. So while the first approach was blunt, there are multiple other options. 1. Parse trace_buf_size in __setup to record the information before page alloc init starts. Take that into account in reset_deferred_meminit to increase the amount of memory that is serially initialised 2. Have tracing init with a small buffer and then resize it after page_alloc_init_late. This modifies tracing a bit but most of the helpers that are required are there. It would be more complex but it's doable 3. Add a kernel command line parameter that explicitly disables deferred meminit. We used to have something like this but it was never merged as we should be able to estimate how much memory is needed to boot. 4. Put a check into the page allocator slowpath that triggers serialised init if the system is booting and an allocation is about to fail. It would be such a cold path that it would never be noticable although it would leave dead code in the kernel image once boot had completed However, it would be preferable by far to have a sensible use-case as to why trace_buf_size= would be specified with a large buffer. It's hard to know what level of complexity is justified without it. -- Mel Gorman SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>