[PATCH v2] riscv: mm: Implement pmdp_collapse_flush for THP

Andrew Jones ajones at ventanamicro.com
Mon Jan 30 00:03:35 PST 2023


On Mon, Jan 30, 2023 at 01:18:15PM +0530, Mayuresh Chitale wrote:
> When THP is enabled, 4K pages are collapsed into a single huge
> page using the generic pmdp_collapse_flush() which will further
> use flush_tlb_range() to shoot-down stale TLB entries. Unfortunately,
> the generic pmdp_collapse_flush() only invalidates cached leaf PTEs
> using address specific SFENCEs which results in repetitive (or
> unpredictable) page faults on RISC-V implementations which cache
> non-leaf PTEs.
> 
> Provide a RISC-V specific pmdp_collapse_flush() which ensures both
> cached leaf and non-leaf PTEs are invalidated by using non-address
> specific SFENCEs as recommended by the RISC-V privileged specification.
> 
> Fixes: e88b333142e4 ("riscv: mm: add THP support on 64-bit")
> Signed-off-by: Mayuresh Chitale <mchitale at ventanamicro.com>
> ---

Please add a changelog here under the --- to explain the differences
between this version and the last version. b4-diff shows me what
changed, but the changelog should be present to explain why it
changed.

>  arch/riscv/include/asm/pgtable.h |  4 ++++
>  arch/riscv/mm/pgtable.c          | 26 ++++++++++++++++++++++++++
>  2 files changed, 30 insertions(+)
> 
> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> index 4eba9a98d0e3..3e01f4f3ab08 100644
> --- a/arch/riscv/include/asm/pgtable.h
> +++ b/arch/riscv/include/asm/pgtable.h
> @@ -721,6 +721,10 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
>  	page_table_check_pmd_set(vma->vm_mm, address, pmdp, pmd);
>  	return __pmd(atomic_long_xchg((atomic_long_t *)pmdp, pmd_val(pmd)));
>  }
> +
> +#define pmdp_collapse_flush pmdp_collapse_flush
> +extern pmd_t pmdp_collapse_flush(struct vm_area_struct *vma,
> +				 unsigned long address, pmd_t *pmdp);
>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>  
>  /*
> diff --git a/arch/riscv/mm/pgtable.c b/arch/riscv/mm/pgtable.c
> index 6645ead1a7c1..5da1916c231e 100644
> --- a/arch/riscv/mm/pgtable.c
> +++ b/arch/riscv/mm/pgtable.c
> @@ -81,3 +81,29 @@ int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
>  }
>  
>  #endif /* CONFIG_HAVE_ARCH_HUGE_VMAP */
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +pmd_t pmdp_collapse_flush(struct vm_area_struct *vma,
> +					unsigned long address, pmd_t *pmdp)
> +{
> +	pmd_t pmd = pmdp_huge_get_and_clear(vma->vm_mm, address, pmdp);
> +
> +	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
> +	VM_BUG_ON(pmd_trans_huge(*pmdp));

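These checks should come before the variables being checked are used,
as is done in the common version of the function. As written, *pmdp has
already been cleared by pmdp_huge_get_and_clear() when
pmd_trans_huge(*pmdp) is evaluated, so that check can never trigger.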
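
A minimal sketch of the ordering meant above, written as a user-space
model so it can be compiled on its own. The types, the MODEL_PMD_LEAF
bit, and the helper bodies are stand-ins invented for illustration, not
the kernel definitions; only the order of statements inside
pmdp_collapse_flush() is the point, and it has not been tested against
a kernel tree.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for kernel types and helpers: hypothetical, model only. */
typedef struct { unsigned long pmd; } pmd_t;
struct mm_struct { int tlb_flushes; };
struct vm_area_struct { struct mm_struct *vm_mm; };

#define HPAGE_PMD_SHIFT	21
#define HPAGE_PMD_MASK	(~((1UL << HPAGE_PMD_SHIFT) - 1))
#define MODEL_PMD_LEAF	0x2UL	/* model-only "huge leaf" marker */
#define VM_BUG_ON(cond)	assert(!(cond))

static bool pmd_trans_huge(pmd_t pmd)
{
	return (pmd.pmd & MODEL_PMD_LEAF) != 0;
}

/* Model: return the old entry and leave the slot empty. */
static pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
				     unsigned long address, pmd_t *pmdp)
{
	pmd_t old = *pmdp;

	(void)mm;
	(void)address;
	pmdp->pmd = 0;
	return old;
}

/* Model: count full-mm flushes instead of issuing SFENCE.VMA. */
static void flush_tlb_mm(struct mm_struct *mm)
{
	mm->tlb_flushes++;
}

pmd_t pmdp_collapse_flush(struct vm_area_struct *vma,
			  unsigned long address, pmd_t *pmdp)
{
	pmd_t pmd;

	/* Check the inputs first, while *pmdp still holds the old entry. */
	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
	VM_BUG_ON(pmd_trans_huge(*pmdp));
	pmd = pmdp_huge_get_and_clear(vma->vm_mm, address, pmdp);

	/* Full flush, as in the patch, to drop cached non-leaf PTEs too. */
	flush_tlb_mm(vma->vm_mm);
	return pmd;
}
```

With this order the pmd_trans_huge() check inspects the entry that is
about to be collapsed rather than the already-cleared slot.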

> +	/*
> +	 * When leaf PTE entries (regular pages) are collapsed into a leaf
> +	 * PMD entry (huge page), a valid non-leaf PTE is converted into a
> +	 * valid leaf PTE at the level 1 page table. The RISC-V privileged v1.12
> +	 * specification allows implementations to cache valid non-leaf PTEs,
> +	 * but the section "4.2.1 Supervisor Memory-Management Fence
> +	 * Instruction" recommends the following:
> +	 * "If software modifies a non-leaf PTE, it should execute SFENCE.VMA
> +	 * with rs1=x0. If any PTE along the traversal path had its G bit set,
> +	 * rs2 must be x0; otherwise, rs2 should be set to the ASID for which
> +	 * the translation is being modified."
> +	 * Based on the above recommendation, we should do full flush whenever
> +	 * leaf PTE entries are collapsed into a leaf PMD entry.
> +	 */
> +	flush_tlb_mm(vma->vm_mm);
> +	return pmd;
> +}
> +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
> -- 
> 2.34.1
>

Thanks,
drew


