[PATCH] kexec_core: Accept unaccepted kexec destination addresses

Yan Zhao yan.y.zhao at intel.com
Mon Oct 21 20:12:24 PDT 2024


On Mon, Oct 21, 2024 at 09:33:17AM -0500, Eric W. Biederman wrote:
> Yan Zhao <yan.y.zhao at intel.com> writes:
> 
> > The kexec destination addresses (incluing those for purgatory, the new
> > kernel, boot params/cmdline, and initrd) are searched from the free area of
> > memblock or RAM resources. Since they are not allocated by the currently
> > running kernel, it is not guaranteed that they are accepted before
> > relocating the new kernel.
> >
> > Accept the destination addresses for the new kernel, as the new kernel may
> > not be able to or may not accept them by itself.
> >
> > Place the "accept" code immediately after the destination addresses pass
> > sanity checks, so the code can be shared by both users of the kexec_load
> > and kexec_file_load system calls.
> 
> I am not at all certain this is sufficient, and I am a bit flummoxed
> about the need to ever ``accept'' memory lazily.
> 
> In a past life I wrote bootup firmware, and as part of that was the code
> to initialize the contents of memory.  When properly tuned and setup it
> would never take more than a second to just blast initial values into
> memory.  That is because the ratio of memory per memory controller to
> memory bandwidth stayed roughly constant while I was paying attention.
> I expect that ratio to continue staying roughly constant or systems
> will quickly start developing unacceptable boot times.
> 
> As I recall Intel TDX is where the contents of memory are encrypted per
> virtual machine.  Which implies that you have the same challenge as
> bootup initializing memory, and that is what ``accepting'' memory is.
Yes, the kernel actually will accept initial memory used by itself in
extract_kernel(), as in arch/x86/boot/compressed/misc.c.

But the target kernel may not be able to accept memory for purgatory.
And it's currently does not accept memory for boot params/cmdline,
and initrd .

> 
> I am concerned that an unfiltered accept_memory may result in memory
> that has already been ``accepted'' being accepted again.  This has
> the potential to be wasteful in the best case, and the potential to
> cause memory that is in use to be reinitialized losing the values
> that are currently stored there.
accept_memory() will not accept memory that has already been accepted.
An unaccepted->bitmap is maintained and queried before accepting.
(this is at least the implementation in
drivers/firmware/efi/unaccepted_memory.c)

If it's still a concern to you, is it better to add a check like this?

if (range_contains_unaccepted_memory(mstart, size))
	accept_memory(mstart, size);

> 
> I am concerned that the target kernel won't know about about accepting
> memory, or might not perform the work early enough and try to use memory
> without accepting it first.
The target kernel does accept memory before use it. But not including those
in kexec segments for purgatory, boot params/cmdline, and initrd.


> I would much prefer if getting into kexec_load would force the memory
> acceptance out of lazy mode (or possibly not even work in lazy mode).
> That keeps things simple for now.
> 
> Once enough people have machines requiring the use of accept_memory
> we can worry about optimizing things and pushing the accept_memory call
> down into kexec_load.
> 
> 
> Ugh.  I just noticed another issue.  Unless the memory we are talking
> about is the memory reserved for kexec on panic kernels the memory needs
> struct pages and everything setup so it can be allocated from anyway.
> 
> Which is to say I think this is has the potential to conflict with
> the accounting in try_to_accept_memory.
Then could we put the accept into machine_kexec(), given that accept_memory()
will not fail?

> 
> Please just make memory acceptance ``eager'' non-lazy when using kexec.
> Unless someone has messed their implementation badly it won't be a
> significant amount of time in human terms, and it makes the code
> so much easier to understand and think about.
> 
yes, it's also an approach if the above cannot convince you.

> 
> 
> > Cc: Kirill A. Shutemov <kirill.shutemov at linux.intel.com>
> > Reviewed-by: Kirill A. Shutemov <kirill.shutemov at linux.intel.com>
> > Signed-off-by: Yan Zhao <yan.y.zhao at intel.com>
> > ---
> >  kernel/kexec_core.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> > index c0caa14880c3..d97376eafc1a 100644
> > --- a/kernel/kexec_core.c
> > +++ b/kernel/kexec_core.c
> > @@ -210,6 +210,16 @@ int sanity_check_segment_list(struct kimage *image)
> >  	}
> >  #endif
> >  
> > +	/*
> > +	 * The destination addresses are searched from free memory ranges rather
> > +	 * than being allocated from the current kernel, so they are not
> > +	 * guaranteed to be accepted by the current kernel.
> > +	 * Accept those initial pages for the new kernel since it may not be
> > +	 * able to accept them by itself.
> > +	 */
> > +	for (i = 0; i < nr_segments; i++)
> > +		accept_memory(image->segment[i].mem, image->segment[i].memsz);
> > +
> >  	return 0;
> >  }
> >  
> > base-commit: 8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b



More information about the kexec mailing list