[PATCH 4/7] x86: Remove custom definition of mk_pte()
Dave Hansen
dave.hansen at intel.com
Wed Feb 19 12:53:41 PST 2025
On 2/17/25 11:08, Matthew Wilcox (Oracle) wrote:
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index 593f10aabd45..9f480bdafd20 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -784,6 +784,9 @@ static inline pgprotval_t check_pgprot(pgprot_t pgprot)
> static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
> {
> phys_addr_t pfn = (phys_addr_t)page_nr << PAGE_SHIFT;
> + /* This bit combination is used to mark shadow stacks */
> + WARN_ON_ONCE((pgprot_val(pgprot) & (_PAGE_DIRTY | _PAGE_RW)) ==
> + _PAGE_DIRTY);
Looks sane to me. Good riddance to unnecessary arch-specific code.
Acked-by: Dave Hansen <dave.hansen at linux.intel.com>
Just one note (in case anyone ever trips over that WARN_ON_ONCE()): this
affects both the existing code and your patch, because the 'pgprot'
going to mk_pte() or pfn_pte() can come from a hardware PTE that ended
up with Write=0,Dirty=1.
Old, pre-shadow-stack hardware could race between:
1. Software transitioning _PAGE_RW 1=>0
2. The CPU page walker trying to set
_PAGE_DIRTY in response to a write
and end up with a Write=0,Dirty=1 PTE.
That doesn't happen for kernel memory because most or all of the PTEs
that the kernel establishes have _PAGE_DIRTY=1, so the page walker never
needs to set _PAGE_DIRTY itself. It's also generally some kind of bug if
one CPU is in the kernel trying to write to a page while another is
making the page read-only.
Anyway, I can see cases where this warning might trip, but they would
most likely be on older hardware, and when the kernel is already doing
something else silly.
More information about the linux-um mailing list