Hi Baoquan, On Tue, May 11, 2021 at 09:36:41PM +0800, Baoquan He wrote: > On 05/10/21 at 01:56pm, David Hildenbrand wrote: > > On 10.05.21 13:44, Dave Young wrote: > > > Hi David, > > > > Hi Dave, > > > > > On 05/10/21 at 01:01pm, David Hildenbrand wrote: > > > [snip] > > > > It also bugged me for quite a bit that we don't have a sane way to achieve > > > > what we're doing here upstream. It somewhat feels like "this doesn't belong > > > > in the kernel and is user policy" but then, the existing kernel support is > > > > suboptimal. > > > > > > > > Maybe reserving some "maybe too big but okayish to boot the system in a sane > > > > environment -- e.g., X% of system RAM and at least Y" size first and > > > > shrinking it later as triggered by user space early (where we do seem to > > > > have a way to pre-calculate things now) might actually be a good direction > > > > to look into. > > > > > > Hmm, that is also an option we considered before. Even for your > > > suggestion we still need a kernel option to set the default ratio/value. > > > and the ratio/value should be another patch which expands crashkernel > > > syntax. > > > > Right. > > > > > > > > Actually the kconfig help text in this patch is indeed misleading, it is > > > not introducing crashkernel=a:b... and no need to explain about the > > > crashkernel syntax, the config option is actually just some interface we > > > can add any valid crashkernel settings to be used by default. So current > > > patch help text describes the default value of crash auto str, instead > > > of describes what crash auto str is. > > > > Right. And I would much rather prefer either > > > > a) handling "auto" completely in the kernel, not just setting some > > questionable default at compile time > > Thanks for the suggestions. > > If the way adding default value into kernel config is disliked, > this a) option looks good. We can get value with x% of system RAM, but > clamp it with CRASH_KERNEL_MIN/MAX. The CRASH_KERNEL_MIN/MAX may need be > defined with a default value for different ARCHes. It's very close to > our current implementation, and handling 'auto' in kernel. > > And kernel config provided so that people can tune the MIN/MAX value, > but no need to post patch to do the tuning each time if have to? Maybe I'm missing something, but the whole point is to avoid kernel configuration option at all. If the crashkernel=auto works good for 99% of the cases, there is no need to provide build time configuration along with it. There are plenty of ways users can control crashkernel reservations with the existing 2-4 (depending on architecture) command line options. Simply hard coding a reasonable defaults (e.g. "1G-64G:128M,64G-1T:256M,1T-:512M"), and using these defaults when crashkernel=auto is set would cover the same 99% of users you referred to. If we can resize the reservation later during boot this will also address David's concern about the wasted memory. You mentioned that amount of memory that is required for crash kernel reservation depends on the devices present on the system. Is is possible to detect how much memory is required at late stages of boot? > > b) passing it explicitly in via the cmdline > > > > > > > > And crashkernel=auto makes this more flexibly. We can tune the values > > > easily when upgrading. But if we pass a fixed value in userspace we > > > can not know if the value is set by distribution automatically or by user > > > manually thus we can not blindly update it. > > > > I think there are two different cases: > > > > > > 1. kernel space updates the value later during boot. "crashkernel=auto" > > really does the right thing, meaning > > > > a) allocate something reasonable and safe during early boot > > b) update the allocation during late boot when we know what kind of system > > we're running on > > > > Then, we indeed care about "crashkernel=auto" in the kernel and I think it > > would be a nice thing to have. The only question is on how to make that a > > little configurable, depending on different thingies we might want to run in > > the crashkernel (assuming someone doesn't want kdump). > > > > > > 2. user space updates the value later during boot > > > > IMHO we don't really car who decided on the value as we do the update from > > user space. If an admin messes with crashkernel=, the admin can also mess > > with kdump not doing any overwrites (e.g., make that configurable, or detect > > the overwrite in kdump somehow). > > > > -- > > Thanks, > > > > David / dhildenb > > > > -- Sincerely yours, Mike.