On Fri, Sep 29, 2023 at 07:56:37AM +0800, Baoquan He wrote:
> On 09/27/23 at 07:46pm, Stanislav Kinsburskii wrote:
> > On Thu, Sep 28, 2023 at 12:16:31PM -0700, Dave Hansen wrote:
> > > On 9/27/23 17:38, Stanislav Kinsburskii wrote:
> > > > On Thu, Sep 28, 2023 at 11:00:12AM -0700, Dave Hansen wrote:
> > > >> On 9/27/23 17:02, Stanislav Kinsburskii wrote:
> > > >>> On Thu, Sep 28, 2023 at 10:29:32AM -0700, Dave Hansen wrote:
> > > >> ...
> > > >>> Well, not exactly. That's something I'd like to have indeed, but from my
> > > >>> POV this goal is out of scope of discussion at the moment.
> > > >>> Let me try to express it the same way you did above:
> > > >>>
> > > >>> 1. Boot some kernel
> > > >>> 2. Grow the deposited memory a bunch
> > > >>> 3. Kexec
> > > >>> 4. Kernel panic due to GPF upon accessing the memory deposited to
> > > >>>    hypervisor.
> > > >>
> > > >> I basically consider this a bug in the first kernel. It *can't* kexec
> > > >> when it's left RAM in shambles. It doesn't know what features the new
> > > >> kernel has and whether this is even safe.
> > > >>
> > > >
> > > > Could you elaborate more on why this is a bug in the first kernel?
> > > > Say, kernel memory can be allocated in big physically consecutive
> > > > chunks by the first kernel for depositing. The information about these
> > > > chunks is then passed to the second kernel via FDT or even command
> > > > line, so the second kernel can reserve this region during booting.
> > > > What's wrong with this approach?
> > >
> > > How do you know the second kernel can parse the FDT entry or the
> > > command-line you pass to it?
> > >
> > > >> Can the new kernel even read the new device tree data?
> > > >
> > > > I'm not sure I understand the question, to be honest.
> > > > Why can't it? This series contains code parts for both first and second
> > > > kernels.
> > >
> > > How do you know the second kernel isn't the version *before* this series
> > > gets merged?
> > >
> >
> > The answer to both questions above is the following: the feature is deployed
> > fleet-wide first, and enabled only upon the next deployment.
> > It's worth mentioning that fleet-wide deployments usually don't need to support
> > updates to a version older than the previous one.
> > Also, since kexec is initiated by user space, it can always be
> > enlightened about kernel capabilities and simply not kexec to an
> > incompatible kernel version.
> > One more thing to mention: in real life this problem exists only
> > during the initial transition, as once the upgrade to a kernel with the
> > feature has happened, there won't be a revert to a version without it.
> >
> > > ...
> > > >> I still think the only way this will possibly work when kexec'ing both
> > > >> old and new kernels is to do it with the memory maps that *all* kernels
> > > >> can read.
> > > >
> > > > Could you elaborate more on this?
> > > > The available memory map actually stays the same for both kernels. The
> > > > difference here can be in a different list of memory regions to reserve,
> > > > when the first kernel allocated and deposited another chunk, and thus
> > > > the second kernel needs to reserve this memory as a new region upon
> > > > booting.
> > >
> > > Please take a step back from your implementation for a moment. There
> > > are two basic design points that need to be considered.
> > >
> > > First, *must* "System RAM" (according to the memory map) be persisted
> > > across kexec? If no, then there's no problem to solve and we can stop
> > > this thread. If yes, then some mechanism must be used to tell the new
> > > kernel that the "System RAM" in the memory map is not normal RAM.
> > >
> > > Second, *if* we agree that some data must communicate across kexec, then
> > > what mechanism should be used? You're arguing for a new mechanism that
> > > only new kernels can use.
> > > I'm arguing that you should likely reuse an
> > > existing mechanism (probably the UEFI/e820 maps) so that *ALL* kernels
> > > can consume the information, old and new.
> > >
> >
> > I'd answer yes, "System MAP" must be persisted across kexec.
> > Could you elaborate on why there should be a mechanism to tell the
> > kernel anything special about the existing "System map" in this context?
> > Say, one can reserve a CMA region (or a crash kernel region, etc), store
> > some data there, and then pass it across kexec. Reserved CMA region will
> > still be a part of the "System MAP", won't it?
>
> Well, I haven't gone through all the discussion thread and clearly got
> your intention and motivation. But here I have to say there's a
> misunderstanding. At least I am astonished when I heard the above
> description. Who said a CMA region or a crash kernel region need be
> passed across kexec. Think kexec as a bootloader, in essence it's no
> different than any other bootloader. When it jumps to 2nd kernel, the
> whole system will be booted up and reconstructed on the system resources.
> All the difference kexec has is it won't go through firmware to do those
> detecting/testing/init. If the intention is to preserve any state or
> region in 1st kernel, you absolutely got it wrong.
>
> This is not the first time people want to put burden on kexec because
> of a specific scenario, and this is not the 2nd time, and not 3rd time
> in the recent 2 years. But I would say please think about what is kexec
> reboot, what we expect it to do, whether the problem be fixed in its own
> side.

Frankly, I'm confused, as I don't really understand what exactly you are
arguing against. Maybe I triggered some pain point, but I don't think you
are reacting to what I actually said.
I never said that either a CMA or a crash kernel region needs to be passed
across kexec: I said they may be (and actually are) passed in real-world
scenarios. Also, it's not just CMA, but pmem backed by RAM as well.
What am I missing here?

It looks to me like I do think about kexec as a bootloader, just as you
described: the proposal in this series is to construct a device tree
exactly the same way as it's constructed by (for example) U-Boot, for
both x86 and arm64.

So, if we think about kexec as a bootloader, why can U-Boot pass a
resource to the new kernel while the previous kernel can't do the same,
and why would that be considered an additional burden?

Thanks,
Stanislav