On Mon, Oct 21, 2024 at 09:33:17AM -0500, Eric W. Biederman wrote: > Yan Zhao <yan.y.zhao@xxxxxxxxx> writes: > > > The kexec destination addresses (incluing those for purgatory, the new > > kernel, boot params/cmdline, and initrd) are searched from the free area of > > memblock or RAM resources. Since they are not allocated by the currently > > running kernel, it is not guaranteed that they are accepted before > > relocating the new kernel. > > > > Accept the destination addresses for the new kernel, as the new kernel may > > not be able to or may not accept them by itself. > > > > Place the "accept" code immediately after the destination addresses pass > > sanity checks, so the code can be shared by both users of the kexec_load > > and kexec_file_load system calls. > > I am not at all certain this is sufficient, and I am a bit flummoxed > about the need to ever ``accept'' memory lazily. > > In a past life I wrote bootup firmware, and as part of that was the code > to initialize the contents of memory. When properly tuned and setup it > would never take more than a second to just blast initial values into > memory. That is because the ratio of memory per memory controller to > memory bandwidth stayed roughly constant while I was paying attention. > I expect that ratio to continue staying roughly constant or systems > will quickly start developing unacceptable boot times. > > As I recall Intel TDX is where the contents of memory are encrypted per > virtual machine. Which implies that you have the same challenge as > bootup initializing memory, and that is what ``accepting'' memory is. Yes, the kernel actually will accept initial memory used by itself in extract_kernel(), as in arch/x86/boot/compressed/misc.c. But the target kernel may not be able to accept memory for purgatory. And it's currently does not accept memory for boot params/cmdline, and initrd . > > I am concerned that an unfiltered accept_memory may result in memory > that has already been ``accepted'' being accepted again. This has > the potential to be wasteful in the best case, and the potential to > cause memory that is in use to be reinitialized losing the values > that are currently stored there. accept_memory() will not accept memory that has already been accepted. An unaccepted->bitmap is maintained and queried before accepting. (this is at least the implementation in drivers/firmware/efi/unaccepted_memory.c) If it's still a concern to you, is it better to add a check like this? if (range_contains_unaccepted_memory(mstart, size)) accept_memory(mstart, size); > > I am concerned that the target kernel won't know about about accepting > memory, or might not perform the work early enough and try to use memory > without accepting it first. The target kernel does accept memory before use it. But not including those in kexec segments for purgatory, boot params/cmdline, and initrd. > I would much prefer if getting into kexec_load would force the memory > acceptance out of lazy mode (or possibly not even work in lazy mode). > That keeps things simple for now. > > Once enough people have machines requiring the use of accept_memory > we can worry about optimizing things and pushing the accept_memory call > down into kexec_load. > > > Ugh. I just noticed another issue. Unless the memory we are talking > about is the memory reserved for kexec on panic kernels the memory needs > struct pages and everything setup so it can be allocated from anyway. > > Which is to say I think this is has the potential to conflict with > the accounting in try_to_accept_memory. Then could we put the accept into machine_kexec(), given that accept_memory() will not fail? > > Please just make memory acceptance ``eager'' non-lazy when using kexec. > Unless someone has messed their implementation badly it won't be a > significant amount of time in human terms, and it makes the code > so much easier to understand and think about. > yes, it's also an approach if the above cannot convince you. > > > > Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx> > > Reviewed-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx> > > Signed-off-by: Yan Zhao <yan.y.zhao@xxxxxxxxx> > > --- > > kernel/kexec_core.c | 10 ++++++++++ > > 1 file changed, 10 insertions(+) > > > > diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c > > index c0caa14880c3..d97376eafc1a 100644 > > --- a/kernel/kexec_core.c > > +++ b/kernel/kexec_core.c > > @@ -210,6 +210,16 @@ int sanity_check_segment_list(struct kimage *image) > > } > > #endif > > > > + /* > > + * The destination addresses are searched from free memory ranges rather > > + * than being allocated from the current kernel, so they are not > > + * guaranteed to be accepted by the current kernel. > > + * Accept those initial pages for the new kernel since it may not be > > + * able to accept them by itself. > > + */ > > + for (i = 0; i < nr_segments; i++) > > + accept_memory(image->segment[i].mem, image->segment[i].memsz); > > + > > return 0; > > } > > > > base-commit: 8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b _______________________________________________ kexec mailing list kexec@xxxxxxxxxxxxxxxxxxx http://lists.infradead.org/mailman/listinfo/kexec