[PATCH v5 07/16] kexec: add Kexec HandOver (KHO) generation helpers
Mike Rapoport
rppt at kernel.org
Wed Mar 26 04:59:40 PDT 2025
On Tue, Mar 25, 2025 at 02:56:52PM -0700, Frank van der Linden wrote:
> On Tue, Mar 25, 2025 at 12:19 PM Mike Rapoport <rppt at kernel.org> wrote:
> >
> > On Mon, Mar 24, 2025 at 11:40:43AM -0700, Frank van der Linden wrote:
> [...]
> > > Thanks for the work on this.
> > >
> > > Obviously it needs to happen while memblock is still active - but why
> > > as close as possible to buddy initialization?
> >
> > One reason is to have all memblock allocations done to autoscale the
> > scratch area. Another reason is to keep memblock structures small as long
> > as possible as memblock_reserve()ing the preserved memory would quite
> > inflate them.
> >
> > And it's overall simpler if memblock only allocates from scratch rather
> > than doing some of early allocations from scratch and some elsewhere and
> > still making sure they avoid the preserved ranges.
>
> Ah, thanks, I see the argument for the scratch area sizing.
>
> >
> > > Ordering is always a sticky issue when it comes to doing things during
> > > boot, of course. In this case, I can see scenarios where code that
> > > runs a little earlier may want to use some preserved memory. The
> >
> > Can you elaborate about such scenarios?
>
> There has, for example, been some talk about making hugetlbfs
> persistent. You could have hugetlb_cma active. The hugetlb CMA areas
> are set up quite early, quite some time before KHO restores memory. So
> that would have to be changed somehow if the location of the KHO init
> call would remain as close to buddy init as possible. I
> suspect there may be other uses.

I think we can address this when/if we implement preservation for hugetlbfs,
and it will be tricky.

If hugetlb in the first kernel uses a lot of memory, we just won't have
enough scratch space for early hugetlb reservations in the second kernel
regardless of hugetlb_cma. On the other hand, we already have the preserved
hugetlbfs memory, so we'd probably need to reserve less memory in the
second kernel.

But anyway, how to preserve hugetlbfs is a completely different discussion.

> > > current requirement in the patch set seems to be "after sparse/page
> > > init", but I'm not sure why it needs to be as close as possible to
> > > buddy init.
> >
> > Why would you say that sparse/page init would be a requirement here?
>
> At least in its current form, the KHO code expects vmemmap to be
> initialized, as it does its restore based on page structures, as
> deserialize_bitmap expects them. I think the use of the page->private
> field was discussed in a separate thread. If that is done
> differently, it wouldn't rely on vmemmap being initialized.

In its current form KHO does rely on vmemmap being allocated, but it does
not rely on it being initialized. Marking memblock ranges NOINIT ensures
nothing touches the corresponding struct pages and KHO can use their fields
up to the point the memory is returned to KHO callers.

> A few more things I've noticed (not sure if these were discussed before):
>
> * Should KHO depend on CONFIG_DEFERRED_STRUCT_PAGE_INIT? Essentially,
> marking memblock ranges as NOINIT doesn't work without
> DEFERRED_STRUCT_PAGE_INIT. Although, if the page->private use
> disappears, this wouldn't be an issue anymore.

It does work without DEFERRED_STRUCT_PAGE_INIT.
memmap_init_reserved_pages() is always called, whether or not
CONFIG_DEFERRED_STRUCT_PAGE_INIT is set, and it skips initialization
of NOINIT regions.

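To illustrate the skipping behaviour, here is a rough userspace model, not
the actual mm/memblock.c code. Every name in it (MODEL_RSRV_NOINIT,
struct model_page, struct model_region, model_init_reserved_pages()) is
made up for this sketch. It only shows the idea that a pass over the reserved
regions initializes struct pages everywhere except in regions flagged NOINIT,
so whatever was stashed in those pages survives the pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel API. */
#define MODEL_RSRV_NOINIT 0x1u
#define MODEL_NR_PAGES    16

struct model_page {
	bool initialized;
	unsigned long private;	/* stands in for page->private */
};

struct model_region {
	size_t start_pfn;
	size_t nr_pages;
	unsigned int flags;
};

static struct model_page model_memmap[MODEL_NR_PAGES];

/*
 * Walk the reserved regions and initialize their pages, skipping any
 * region flagged NOINIT. Returns the number of pages initialized.
 */
static size_t model_init_reserved_pages(const struct model_region *regions,
					size_t nr_regions)
{
	size_t done = 0;

	assert(regions != NULL || nr_regions == 0);

	for (size_t i = 0; i < nr_regions; i++) {
		const struct model_region *r = &regions[i];

		/* NOINIT: leave the pages exactly as their owner left them. */
		if (r->flags & MODEL_RSRV_NOINIT)
			continue;

		for (size_t pfn = r->start_pfn;
		     pfn < r->start_pfn + r->nr_pages; pfn++) {
			assert(pfn < MODEL_NR_PAGES);
			model_memmap[pfn].initialized = true;
			model_memmap[pfn].private = 0;
			done++;
		}
	}

	return done;
}
```

In this model there is no config switch around the NOINIT check, which is
the point being made above: the skip happens in the pass itself.
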
> * As a future extension, it could be nice to store vmemmap init
> information in the KHO FDT. Then you can use that to init ranges in an
> optimized way (HVO hugetlb or DAX-style persisted ranges) straight
> away.
These days the memmap contents are unstable because of the folio/memdesc
project, but in general carrying memory map data from kernel to kernel is
indeed something to consider.
> - Frank
--
Sincerely yours,
Mike.