[PATCH makedumpfile] Add confidential VM unaccepted free pages support on Linux 6.16 and later
HAGIO KAZUHITO(萩尾 一仁)
k-hagio-ab at nec.com
Sun Jun 29 21:05:12 PDT 2025
Hi,
thank you for the patch, and sorry for the long delay.
On 2025/06/09 14:02, Zhiquan Li wrote:
> UEFI Specification version 2.9 introduces the concept of memory
> acceptance: some Virtual Machine platforms, such as Intel TDX or AMD
> SEV-SNP, require memory to be accepted before it can be used by the
> guest. Accepting happens via a protocol specific to the Virtual
> Machine platform[1].
>
> Before unaccepted memory is accepted by the guest, any access to it
> from the guest will cause the guest to fail, and likewise the kexec'ed
> kernel. So, the kexec'ed kernel skips these pages and fills in zero
> data for the reader of vmcore.
>
> However, this introduces a problem. When a zero-filled page is
> excluded, a pd_zero (sizeof(page_desc_t) is 24 bytes) is written into
> cd_header, and this part is not compressed by design. Because the
> unaccepted pages are exported as zero pages, ~1/170 of the capacity of
> unaccepted memory is added to the vmcore. In fact, they should be
> considered free pages and, most of the time, should be excluded.
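
For reference, the ~1/170 figure can be reproduced with a few lines of
arithmetic. This is only a sketch: the 4 KiB page size is an assumption,
the 24 bytes come from the description above, and the helper name is
made up for illustration. It gives the upper bound on the descriptor
overhead if all of the given memory were unaccepted.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096ULL
/* sizeof(page_desc_t) as stated in the patch description. */
#define SKETCH_PAGE_DESC_SIZE 24ULL

/*
 * Hypothetical helper: bytes of uncompressed page descriptors written
 * when every page in mem_bytes is dumped as a zero page.
 * One descriptor per page, so the ratio is 24 / 4096, roughly 1/170.
 */
static uint64_t zero_page_desc_overhead(uint64_t mem_bytes)
{
    assert(mem_bytes % SKETCH_PAGE_SIZE == 0);
    return (mem_bytes / SKETCH_PAGE_SIZE) * SKETCH_PAGE_DESC_SIZE;
}
```

With these assumptions, zero_page_desc_overhead(512ULL << 30) is exactly
3 GiB and zero_page_desc_overhead(256ULL << 30) is 1.5 GiB, which is the
worst case for a guest whose memory is entirely unaccepted.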
>
> Unaccepted memory is unusable free memory, so it is not managed by
> buddy; instead, by convention, it is added to a new list,
> zone.unaccepted_pages, only when the order is MAX_PAGE_ORDER. The new
> page type, PGTY_unaccepted, can be used to identify whether a page is
> unaccepted[2]. Therefore, add the following changes to exclude them
> like free pages:
>
> 1. Add NUMBER(PAGE_UNACCEPTED_MAPCOUNT_VALUE) to identify whether a
> page is unaccepted. A kernel patch[3] exports the value of page type
> PAGE_UNACCEPTED_MAPCOUNT_VALUE since kernel 6.16.
>
> 2. Add a condition to exclude these unaccepted free pages.
>
> Dumping a host kernel is not affected by the modification, because it
> cannot enable CONFIG_UNACCEPTED_MEMORY, so the page type
> PAGE_UNACCEPTED_MAPCOUNT_VALUE is not found in vmcoreinfo and the
> step is skipped.
>
> Here are the vmcore sizes of a freshly booted TD VM with different
> memory sizes:
>
> VM.mem | Before | After
> -------+--------+-------
> 512G   | ~4.9G  | ~2.0G
> 256G   | ~2.0G  | ~1.1G
>
> [1] https://lore.kernel.org/all/20230606142637.5171-1-kirill.shutemov@linux.intel.com/
> [2] https://lore.kernel.org/all/20240809114854.3745464-5-kirill.shutemov@linux.intel.com/
> [3] https://lore.kernel.org/all/20250405060610.860465-1-zhiquan1.li@intel.com/
>
> Signed-off-by: Zhiquan Li <zhiquan1.li at intel.com>
> ---
> makedumpfile.c | 14 ++++++++++++++
> makedumpfile.h | 3 +++
> 2 files changed, 17 insertions(+)
>
> diff --git a/makedumpfile.c b/makedumpfile.c
> index 2d3b08bb5d52..97ec2f06108b 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -2533,6 +2533,8 @@ write_vmcoreinfo_data(void)
> WRITE_NUMBER("PAGE_BUDDY_MAPCOUNT_VALUE", PAGE_BUDDY_MAPCOUNT_VALUE);
> WRITE_NUMBER("PAGE_OFFLINE_MAPCOUNT_VALUE",
> PAGE_OFFLINE_MAPCOUNT_VALUE);
> + WRITE_NUMBER("PAGE_UNACCEPTED_MAPCOUNT_VALUE",
> + PAGE_UNACCEPTED_MAPCOUNT_VALUE);
> WRITE_NUMBER("phys_base", phys_base);
> WRITE_NUMBER("KERNEL_IMAGE_SIZE", KERNEL_IMAGE_SIZE);
>
> @@ -2990,6 +2992,7 @@ read_vmcoreinfo(void)
> READ_NUMBER("PAGE_HUGETLB_MAPCOUNT_VALUE", PAGE_HUGETLB_MAPCOUNT_VALUE);
> READ_NUMBER("PAGE_OFFLINE_MAPCOUNT_VALUE", PAGE_OFFLINE_MAPCOUNT_VALUE);
> READ_NUMBER("PAGE_SLAB_MAPCOUNT_VALUE", PAGE_SLAB_MAPCOUNT_VALUE);
> + READ_NUMBER("PAGE_UNACCEPTED_MAPCOUNT_VALUE", PAGE_UNACCEPTED_MAPCOUNT_VALUE);
> READ_NUMBER("phys_base", phys_base);
> READ_NUMBER("KERNEL_IMAGE_SIZE", KERNEL_IMAGE_SIZE);
>
> @@ -6630,6 +6633,17 @@ check_order:
> nr_pages = 1 << private;
> pfn_counter = &pfn_free;
> }
> + /*
> + * Exclude the unaccepted free pages not managed by buddy.
> + * By convention, pages can be added to the zone.unaccepted_pages list
> + * only when the order is MAX_ORDER_NR_PAGES. Otherwise, the page is
> + * accepted immediately without being on the list.
> + */
> + else if ((info->dump_level & DL_EXCLUDE_FREE)
> + && isUnaccepted(_mapcount)) {
> + nr_pages = 1 << (ARRAY_LENGTH(zone.free_area) - 1);
Just to clarify, does this mean that the order of unaccepted pages is
MAX_PAGE_ORDER but it is not set in struct page, so we need to set the
order here?
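
To make the question concrete, here is a small sketch of what I
understand the expression to compute. It assumes zone.free_area has one
entry per order from 0 to MAX_PAGE_ORDER, so the array length minus one
is the highest order. The length of 11 and the helper names are
illustrative only, not values read from a vmcore.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for ARRAY_LENGTH(zone.free_area): 11 entries
 * would correspond to orders 0..10. The real value comes from the
 * dumped kernel's vmcoreinfo.
 */
#define SKETCH_FREE_AREA_LENGTH 11U

/* Highest order tracked by the array: one entry per order, from 0. */
static unsigned int max_order_from_free_area(unsigned int free_area_length)
{
    assert(free_area_length >= 1);
    return free_area_length - 1;
}

/* Number of pages skipped for one unaccepted block of the highest order. */
static unsigned long pages_in_unaccepted_block(unsigned int free_area_length)
{
    return 1UL << max_order_from_free_area(free_area_length);
}
```

Under that assumption, pages_in_unaccepted_block(SKETCH_FREE_AREA_LENGTH)
is 1024, so each matching page would advance the scan by 1024 pages
without reading an order from struct page.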
Thanks,
Kazu
> + pfn_counter = &pfn_free;
> + }
> /*
> * Exclude the non-private cache page.
> */
> diff --git a/makedumpfile.h b/makedumpfile.h
> index 944397a6f865..26940e7a3f81 100644
> --- a/makedumpfile.h
> +++ b/makedumpfile.h
> @@ -163,6 +163,8 @@ test_bit(int nr, unsigned long addr)
> && (NUMBER(PG_hwpoison) != NOT_FOUND_NUMBER))
> #define isAnon(mapping, flags, _mapcount) \
> (((unsigned long)mapping & PAGE_MAPPING_ANON) != 0 && !isSlab(flags, _mapcount))
> +#define isUnaccepted(_mapcount) (_mapcount == (int)NUMBER(PAGE_UNACCEPTED_MAPCOUNT_VALUE) \
> + && (NUMBER(PAGE_UNACCEPTED_MAPCOUNT_VALUE) != NOT_FOUND_NUMBER))
>
> #define PTOB(X) (((unsigned long long)(X)) << PAGESHIFT())
> #define BTOP(X) (((unsigned long long)(X)) >> PAGESHIFT())
> @@ -2257,6 +2259,7 @@ struct number_table {
> long PAGE_HUGETLB_MAPCOUNT_VALUE;
> long PAGE_OFFLINE_MAPCOUNT_VALUE;
> long PAGE_SLAB_MAPCOUNT_VALUE;
> + long PAGE_UNACCEPTED_MAPCOUNT_VALUE;
> long SECTION_SIZE_BITS;
> long MAX_PHYSMEM_BITS;
> long HUGETLB_PAGE_DTOR;
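
The isUnaccepted() check above can be read as the following standalone
sketch, which shows why a vmcore without the exported number skips the
new branch. The sentinel and the page type value below are hypothetical
stand-ins (the real ones come from makedumpfile.h and vmcoreinfo), and
the function name is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for NOT_FOUND_NUMBER. */
#define SKETCH_NOT_FOUND_NUMBER (-1L)
/* Made-up page type value, for illustration only. */
#define SKETCH_UNACCEPTED_VALUE 0x0800L

/*
 * Function form of the macro's logic: a page matches only when the
 * number was found in vmcoreinfo AND _mapcount equals it.
 */
static bool sketch_is_unaccepted(int mapcount, long exported_value)
{
    if (exported_value == SKETCH_NOT_FOUND_NUMBER)
        return false; /* e.g. kernel built without CONFIG_UNACCEPTED_MEMORY */
    return mapcount == (int)exported_value;
}
```

When the number is missing, the function returns false for every page,
including one whose _mapcount happens to equal the sentinel, so the
existing exclusion paths are left unchanged in that case.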