[RFC 0/3] kdump: Check mem_map of CMA area in kdump
Michal Hocko
mhocko at suse.com
Mon Dec 18 07:19:20 PST 2023
On Mon 18-12-23 13:23:22, Pingfan Liu wrote:
> From: Pingfan Liu <piliu at redhat.com>
>
>
> First of all, this series is only for proof of concept. It only passes compilation.
>
> For years, CMA has been proposed for use as crashkernel reserved memory.
> But DIO prevents us from doing so, since DMA may be in flight and ruin the
> kdump kernel.
>
> This series exports the crash kernel's CMA area information through
> device-tree, and the kdump kernel skips any page whose refcnt != mapcount,
> i.e. any page with potential DMA activity.
I haven't had time to look deeper into the implementation (and I will get
back to it only in early January), but mapcount-based checks are really tricky
and unreliable. folio_maybe_dma_pinned() sounds like a better test. You
definitely want to have that checked by more MM people, so please CC linux-mm.
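
To illustrate the difference between the two tests, here is a minimal
userspace sketch, not kernel code. It assumes an order-0 page, models the
small-folio case of folio_maybe_dma_pinned() as a comparison of the reference
count against GUP_PIN_COUNTING_BIAS (1024), and uses a hypothetical
struct model_page in place of struct page. The helper names are made up for
this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bias added to the refcount per FOLL_PIN pin of a small page. */
#define GUP_PIN_COUNTING_BIAS 1024

/* Hypothetical stand-in for an order-0 page; this is not struct page. */
struct model_page {
	int refcount;	/* models page->_refcount */
	int mapcount;	/* models the number of page-table mappings */
};

/* The cover letter's test: any reference beyond the mappings is suspect. */
static bool refcnt_differs_from_mapcount(const struct model_page *p)
{
	assert(p != NULL);
	return p->refcount != p->mapcount;
}

/* Models the order-0 branch of folio_maybe_dma_pinned(). */
static bool model_maybe_dma_pinned(const struct model_page *p)
{
	assert(p != NULL);
	return (unsigned int)p->refcount >= GUP_PIN_COUNTING_BIAS;
}
```

In this model an unmapped page cache page (refcount 1, mapcount 0) trips the
refcnt != mapcount test although nothing pins it, while the bias-based test
stays quiet. The bias-based test is itself only a heuristic: a page with 1024
or more ordinary references would also be reported, hence "maybe".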
> The exported information includes:
> u64 kdump_cma_pfn;
> u64 kdump_cma_pg_cnt;
> u64 kdump_cma_pg_paddr;
>
> And they should be filled with Jiri's series "[PATCH 0/4] kdump:
> crashkernel reservation from CMA"
>
> After combining the two series, the CMA used for kdump carries only the
> following risk, which requires both of these conditions to hold:
> 1. buggy code forges _refcnt and mapcount to the same value
> 2. the page is also used by DIO
>
>
> Is this acceptable, or is there any remedy, e.g. a CRC on the page?
We already have vm_debug=P, which enables init-time poisoning
of all struct pages. The value is then checked when the page is
allocated.
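
The poison-then-check idea can be sketched in userspace as follows. This is
only a loose model under stated assumptions: struct poison_model_page, the
0xff fill byte and all helper names are hypothetical, and the check looks at
a single flags word rather than reproducing the kernel's actual code paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for per-page metadata; this is not struct page. */
struct poison_model_page {
	unsigned long flags;
	int refcount;
};

/* Fill byte assumed for this model. */
#define MODEL_POISON_BYTE 0xff

/* Init-time poisoning: overwrite the whole descriptor with the pattern. */
static void model_init_poison(struct poison_model_page *p)
{
	assert(p != NULL);
	memset(p, MODEL_POISON_BYTE, sizeof(*p));
}

/* Proper initialisation replaces the pattern with real values. */
static void model_init_page(struct poison_model_page *p)
{
	assert(p != NULL);
	p->flags = 0;
	p->refcount = 1;
}

/* Allocation-time check: a still-poisoned descriptor was never initialised. */
static bool model_is_poisoned(const struct poison_model_page *p)
{
	assert(p != NULL);
	return p->flags == ~0UL;
}
```

A descriptor that reaches the allocation-time check with the pattern still in
place was never set up properly, which is the kind of corruption such a
sanity check is meant to catch.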
> Please share your thoughts.
Having a sanity check on exported CMA pages makes some sense to me. The
exact check might be more involved and may produce false positives, but
those shouldn't be a major problem unless there are too many of them.
--
Michal Hocko
SUSE Labs