[RFC PATCH v2 0/7] Introduce persistent memory pool
Stanislav Kinsburskii
skinsburskii at linux.microsoft.com
Wed Sep 27 09:13:19 PDT 2023
On Wed, Sep 27, 2023 at 01:44:38PM +0800, Baoquan He wrote:
> Hi Stanislav,
>
> On 09/25/23 at 02:27pm, Stanislav Kinsburskii wrote:
> > This patch introduces a memory allocator specifically tailored for
> > persistent memory within the kernel. The allocator maintains
> > kernel-specific states like DMA passthrough device states, IOMMU state, and
> > more across kexec.
>
> Can you give more details about how this persistent memory pool will be
> utilized in an actual scenario? I mean, what problem have you met so that
> you have to introduce persistent memory pool to solve it?
>
The major reason we have at the moment is that the Linux root partition
running on top of the Microsoft hypervisor needs to deposit pages to the
hypervisor at runtime, when the hypervisor runs out of memory.
"Depositing" here means that Linux passes a set of its PFNs to the
hypervisor via a hypercall, and the hypervisor then uses these pages for
its own needs.
Once deposited, these pages can't be accessed by Linux anymore and thus
must be preserved in the "used" state across kexec, as the hypervisor is
unaware of kexec. At the same time, these pages can be withdrawn when
unused. Thus, an allocator that is persistent across kexec looks
reasonable for this particular matter.
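To illustrate why the "used" state has to live somewhere that survives
kexec, here is a minimal userspace sketch. It is not the code from the
series: the names (struct pool, pool_init, pool_attach and so on) and the
layout are hypothetical, and a static buffer stands in for the preserved
physical region. The allocation bitmap is kept in the first page(s) of the
region itself, so a later "kernel" can re-attach to the region and still
see which pages were handed out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SZ 4096u

/* Hypothetical pool descriptor; only the region contents are "persistent". */
struct pool {
	uint8_t *base;        /* start of the preserved region */
	size_t nr_pages;      /* total pages, bitmap pages included */
	size_t bitmap_pages;  /* leading pages occupied by the bitmap */
};

static inline size_t bitmap_pages_for(size_t nr_pages)
{
	size_t bytes = (nr_pages + 7) / 8;

	return (bytes + PAGE_SZ - 1) / PAGE_SZ;
}

static inline int page_used(const struct pool *p, size_t i)
{
	return (p->base[i / 8] >> (i % 8)) & 1;
}

/* First boot: clear the in-region bitmap and reserve the bitmap pages. */
static inline void pool_init(struct pool *p, void *region, size_t nr_pages)
{
	size_t i;

	p->base = region;
	p->nr_pages = nr_pages;
	p->bitmap_pages = bitmap_pages_for(nr_pages);
	assert(nr_pages > p->bitmap_pages);

	memset(p->base, 0, p->bitmap_pages * PAGE_SZ);
	for (i = 0; i < p->bitmap_pages; i++)
		p->base[i / 8] |= (uint8_t)(1u << (i % 8));
}

/* After "kexec": rebuild the descriptor, leave the bitmap untouched. */
static inline void pool_attach(struct pool *p, void *region, size_t nr_pages)
{
	p->base = region;
	p->nr_pages = nr_pages;
	p->bitmap_pages = bitmap_pages_for(nr_pages);
}

/* Allocate one page; returns its index in the region, or -1 if full. */
static inline long pool_alloc_page(struct pool *p)
{
	size_t i;

	for (i = 0; i < p->nr_pages; i++) {
		if (!page_used(p, i)) {
			p->base[i / 8] |= (uint8_t)(1u << (i % 8));
			return (long)i;
		}
	}
	return -1;
}

/* Release a page (e.g. once it has been withdrawn and is unused again). */
static inline void pool_free_page(struct pool *p, size_t idx)
{
	if (idx < p->bitmap_pages || idx >= p->nr_pages)
		return; /* never free the bitmap itself or out-of-range pages */
	p->base[idx / 8] &= (uint8_t)~(1u << (idx % 8));
}
```

In this sketch, a page allocated through one descriptor stays marked as
used when a second descriptor is attached to the same region with
pool_attach(), which is the property the deposited pages need.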
Also, the last patch in the series is meant to demonstrate the usage
described above.
Thanks,
Stanislav
> Thanks
> Baoquan
>
> >
> > The current implementation provides a foundation for custom solutions that
> > may be developed in the future. Although the design is kept concise and
> > straightforward to encourage discussion and feedback, it remains fully
> > functional.
> >
> > The persistent memory pool builds upon the contiguous memory allocator
> > (CMA) and ensures CMA state persistency across kexec by incorporating the
> > CMA bitmap into the memory region instead of allocating it from kernel
> > memory.
> >
> > Persistent memory pool metadata is passed across kexec by using Flattened
> > Device Tree, which is added as another kexec segment for x86 architecture.
> >
> > Potential applications include:
> >
> > 1. Enabling various in-kernel entities to allocate persistent pages from
> > a unified memory pool, obviating the need for reserving multiple
> > regions.
> >
> > 2. For in-kernel components that need the allocation address to be
> > retained on kernel kexec, this address can be exposed to user space
> > and subsequently passed through the command line.
> >
> > 3. Distinct subsystems or drivers can set aside their region, allocating
> > a segment for their persistent memory pool, suitable for uses such as
> > file systems, key-value stores, and other applications.
> >
> > Notes:
> >
> > 1. The last patch of the series represents a use case for the feature.
> > However, the patch won't compile and is for illustrative purposes only
> > as the code being patched hasn't been merged yet.
> >
> > 2. The code being patched is currently under review by the community. The
> > series is named "Introduce /dev/mshv drivers":
> >
> > https://lkml.org/lkml/2023/9/22/1117
> >
> >
> > Changes since v1:
> >
> > 1. Persistent memory pool is now a wrapper on top of CMA instead of being a
> > new allocator.
> >
> > 2. Persistent memory pool metadata doesn't belong to the pool anymore and
> > is instead passed to the new kernel across kexec via Flattened Device
> > Tree.
> >
> > The following series implements...
> >
> > ---
> >
> > Stanislav Kinsburskii (7):
> > kexec_file: Add fdt modification callback support
> > x86: kexec: Transfer existing fdt to the new kernel
> > x86: kexec: Enable fdt modification in callbacks
> > pmpool: Introduce persistent memory pool
> > pmpool: Update device tree on kexec
> > pmpool: Restore state from device tree post-kexec
> > Drivers: hv: Allocate persistent pages for root partition
> >
> >
> > arch/x86/Kconfig | 16 +++
> > arch/x86/kernel/kexec-bzimage64.c | 97 +++++++++++++++++
> > drivers/hv/hv_common.c | 13 ++
> > include/linux/kexec.h | 7 +
> > include/linux/pmpool.h | 22 ++++
> > kernel/kexec_file.c | 24 ++++
> > mm/Kconfig | 9 ++
> > mm/Makefile | 1
> > mm/pmpool.c | 208 +++++++++++++++++++++++++++++++++++++
> > 9 files changed, 394 insertions(+), 3 deletions(-)
> > create mode 100644 include/linux/pmpool.h
> > create mode 100644 mm/pmpool.c
> >
> >
> > _______________________________________________
> > kexec mailing list
> > kexec at lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/kexec
> >