[PATCH] x86/e820: update code comment about e820_table_kexec

Dave Young dyoung at redhat.com
Tue Jan 28 05:48:08 PST 2025


On Tue, 14 Jan 2025 at 11:57, Dave Young <dyoung at redhat.com> wrote:
>
> Hi,
>
> On Tue, 14 Jan 2025 at 11:26, Chen, Yu C <yu.c.chen at intel.com> wrote:
> >
> > Hi Dave,
> > Thanks for bringing this up,
> >
> > > -----Original Message-----
> > > From: Dave Young <dyoung at redhat.com>
> > > Sent: Friday, January 10, 2025 2:30 PM
> > > To: Chen, Yu C <yu.c.chen at intel.com>
> > > Cc: x86 at kernel.org; linux-kernel at vger.kernel.org; kexec at lists.infradead.org;
> > > Ingo Molnar <mingo at redhat.com>; Borislav Petkov <bp at alien8.de>
> > > Subject: Re: [PATCH] x86/e820: update code comment about
> > > e820_table_kexec
> > >
> > > > BTW,  the conversation below drived me to read the e820 code:
> > > > https://lore.kernel.org/all/CAMj1kXG1hbiafKRyC5qM1Vj5X7x-
> > > dmLndqqo2AYnH
> > > > MRxDz-80w at mail.gmail.com/T/#u
> > > >
> > > > It could be better to clean up the e820 tables, anyway the comment fix
> > > > in this patch itself is good for now.
> > > >
> > > > Basically e820_table_firmware is used by kexec-tools kexec_load
> > > > implementation,  e820_table_kexec is used by kexec_file_load code to
> > > > pass to the 2nd kernel in boot params.
> > > >
> > > > The e820_table_firmware is said to not be modified by the kernel in
> > > > the code comment, but this is not true, at least the sev code updates
> > > > the table.    The hibernate code generates crc32 checksum and verifies
> > > > it, but since AFAIK e820 table update only happens in boot phase, it
> > > > will be stable on runtime.  So we can just use e820_table_firmware for
> > > > kexec use and drop the e820_table_kexec. With the change the
> > > > kexec_file_load and kexec_load see the same memory ranges.
> > > >
> > > > Otherwise I thought we can use just one e820 table, dropping
> > > > e820_table_firmware and e820_table_kexec, but then there will be
> > > > fragments and memory waste  due to the setup_data ranges are reserved
> > > > and updated in e820_table, so the e820_table_firmware being still be
> > > > used for kexec makes sense.
> > > >
> > > > Anyway I need to think more about it,  please let me know if you have
> > > > any concerns.
> > >
> > > Reread the old commit 12df216c61c89e31e27e74146115a9728880ca6f
> > >     x86/boot/e820: Introduce the bootloader provided e820_table_firmware[]
> > > table
> > >
> > > It seems the e820_table_firmware was changed to be exported in sysfs
> > > instead of e820_table_kexec,   but I suspect this is wrong.
> > > kexec-tools (kexec_load) use the exported sysfs memmap info,  but
> > > e820_table_firmware was created by above commit to be used by hibernation
> > > and the table should not be changed,  the fact is there are changes happen
> > > from time to time.
> > >
> >
> > This is not expected from the original introducing of e820_table_firmware, it
> > should not be changed by OS. I suppose the changes to e820_table_firmware
> > are because of the kexec requirements for /sys/firmware/memmap?
>
> Yes, I think so at least for the AMD part.  Just maybe nobody noticed
> this table is meant to keep not changed.
>
> > Another question is that, why does kexec_load() get the memory layout
> > from /sys/firmware/memmap, but  kexec_file_load() relies on the in-kernel
> > e820_table_kexec?
>
> I forgot the details, searching the git log, the history is like this:
>
> Originally the table name is e820_table_saved,  both kexec-tools and
> kernel kexec_file_load use the data from e820_saved.  Kexec-tools used
> it via the sysfs memmap.
>
> Later commit 544a0f47e7803443980496d6c9ae78b6c2b3dbcb changed the
> table name to e820_table_firmware, so both kernel and kexec-tools used
> e820_table_firmware.
>
> And then the commit a09bae0f8aa08d4d76d0ebece26062a49ec51ac9 changed
> the name to e820_table_kexec (kernel kexec_file_load use it) and later
> the other commit created a new table e820_table_firmware.
>
> The key here is the kexec-tools usage of sysfs was missed.
>
> >
> > > Question is does hibernation use the sysfs entries from its userspace tools?
> >
> > Hibernation does not use this sysfs entries in  userspace(or uswsusp )as far
> > as I know.
>
> Good to know, thanks!
>
> >
> > > If yes, then we should have both kexec table and firmware table exported
> > > because they are for different purposes and one may change, another not.
> > >
> > > If hibernation only uses the table within the kernel then it makes no sense to
> > > export it to sysfs, we should only export the kexec table for kexec-tools use.
> > > In this way  both kexec_load (userspace) and kexec_file_load (kernel load) can
> > > use same e820 table, it will reduce the bugs and be easier to maintain.
> >
> > I'm OK with not exposing e820_table_firmware  to /sys/firmware/memmap, if
> > kexec is the only user of /sys/firmware/memmap.
>
> Since ingo was involved in the table changes. Ingo,  any suggestions?
>

Hi all, I just posted another patch to export kexec tables instead and
included the comments update, please have a look.

Thanks
Dave




More information about the kexec mailing list