On Tue, 14 Jan 2025 at 11:57, Dave Young <dyoung@xxxxxxxxxx> wrote: > > Hi, > > On Tue, 14 Jan 2025 at 11:26, Chen, Yu C <yu.c.chen@xxxxxxxxx> wrote: > > > > Hi Dave, > > Thanks for bringing this up, > > > > > -----Original Message----- > > > From: Dave Young <dyoung@xxxxxxxxxx> > > > Sent: Friday, January 10, 2025 2:30 PM > > > To: Chen, Yu C <yu.c.chen@xxxxxxxxx> > > > Cc: x86@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; kexec@xxxxxxxxxxxxxxxxxxx; > > > Ingo Molnar <mingo@xxxxxxxxxx>; Borislav Petkov <bp@xxxxxxxxx> > > > Subject: Re: [PATCH] x86/e820: update code comment about > > > e820_table_kexec > > > > > > > BTW, the conversation below drived me to read the e820 code: > > > > https://lore.kernel.org/all/CAMj1kXG1hbiafKRyC5qM1Vj5X7x- > > > dmLndqqo2AYnH > > > > MRxDz-80w@xxxxxxxxxxxxxx/T/#u > > > > > > > > It could be better to clean up the e820 tables, anyway the comment fix > > > > in this patch itself is good for now. > > > > > > > > Basically e820_table_firmware is used by kexec-tools kexec_load > > > > implementation, e820_table_kexec is used by kexec_file_load code to > > > > pass to the 2nd kernel in boot params. > > > > > > > > The e820_table_firmware is said to not be modified by the kernel in > > > > the code comment, but this is not true, at least the sev code updates > > > > the table. The hibernate code generates crc32 checksum and verifies > > > > it, but since AFAIK e820 table update only happens in boot phase, it > > > > will be stable on runtime. So we can just use e820_table_firmware for > > > > kexec use and drop the e820_table_kexec. With the change the > > > > kexec_file_load and kexec_load see the same memory ranges. > > > > > > > > Otherwise I thought we can use just one e820 table, dropping > > > > e820_table_firmware and e820_table_kexec, but then there will be > > > > fragments and memory waste due to the setup_data ranges are reserved > > > > and updated in e820_table, so the e820_table_firmware being still be > > > > used for kexec makes sense. > > > > > > > > Anyway I need to think more about it, please let me know if you have > > > > any concerns. > > > > > > Reread the old commit 12df216c61c89e31e27e74146115a9728880ca6f > > > x86/boot/e820: Introduce the bootloader provided e820_table_firmware[] > > > table > > > > > > It seems the e820_table_firmware was changed to be exported in sysfs > > > instead of e820_table_kexec, but I suspect this is wrong. > > > kexec-tools (kexec_load) use the exported sysfs memmap info, but > > > e820_table_firmware was created by above commit to be used by hibernation > > > and the table should not be changed, the fact is there are changes happen > > > from time to time. > > > > > > > This is not expected from the original introducing of e820_table_firmware, it > > should not be changed by OS. I suppose the changes to e820_table_firmware > > are because of the kexec requirements for /sys/firmware/memmap? > > Yes, I think so at least for the AMD part. Just maybe nobody noticed > this table is meant to keep not changed. > > > Another question is that, why does kexec_load() get the memory layout > > from /sys/firmware/memmap, but kexec_file_load() relies on the in-kernel > > e820_table_kexec? > > I forgot the details, searching the git log, the history is like this: > > Originally the table name is e820_table_saved, both kexec-tools and > kernel kexec_file_load use the data from e820_saved. Kexec-tools used > it via the sysfs memmap. > > Later commit 544a0f47e7803443980496d6c9ae78b6c2b3dbcb changed the > table name to e820_table_firmware, so both kernel and kexec-tools used > e820_table_firmware. > > And then the commit a09bae0f8aa08d4d76d0ebece26062a49ec51ac9 changed > the name to e820_table_kexec (kernel kexec_file_load use it) and later > the other commit created a new table e820_table_firmware. > > The key here is the kexec-tools usage of sysfs was missed. > > > > > > Question is does hibernation use the sysfs entries from its userspace tools? > > > > Hibernation does not use this sysfs entries in userspace(or uswsusp )as far > > as I know. > > Good to know, thanks! > > > > > > If yes, then we should have both kexec table and firmware table exported > > > because they are for different purposes and one may change, another not. > > > > > > If hibernation only uses the table within the kernel then it makes no sense to > > > export it to sysfs, we should only export the kexec table for kexec-tools use. > > > In this way both kexec_load (userspace) and kexec_file_load (kernel load) can > > > use same e820 table, it will reduce the bugs and be easier to maintain. > > > > I'm OK with not exposing e820_table_firmware to /sys/firmware/memmap, if > > kexec is the only user of /sys/firmware/memmap. > > Since ingo was involved in the table changes. Ingo, any suggestions? > Hi all, I just posted another patch to export kexec tables instead and included the comments update, please have a look. Thanks Dave