[PATCH] arm64: mm: Fix pte_mkclean, pte_mkdirty semantics

Steve Capper steve.capper at arm.com
Fri Dec 1 09:22:14 PST 2017


On systems with hardware dirty bit management, the ltp madvise09 unit
test fails due to dirty bit information being lost and pages being
incorrectly freed.

This was bisected to:
	arm64: Ignore hardware dirty bit updates in ptep_set_wrprotect()

Reverting this commit leads to a separate problem, that the unit test
retains pages that should have been dropped due to the function
madvise_free_pte_range(.) not cleaning pte's properly.

Currently pte_mkclean only clears the software dirty bit, thus the
following code sequence can appear:

	pte = pte_mkclean(pte);
	if (pte_dirty(pte))
		// this condition can return true with HW DBM!

This patch also adjusts pte_mkclean to set PTE_RDONLY thus effectively
clearing both the SW and HW dirty information.

In order for this to function on systems without HW DBM, we need to
also adjust pte_mkdirty to remove the read only bit from writable pte's
to avoid infinite fault loops.

Reported-by: Bhupinder Thakur <bhupinder.thakur at linaro.org>
Fixes: 64c26841b349 ("arm64: Ignore hardware dirty bit updates in
ptep_set_wrprotect()")
Signed-off-by: Steve Capper <steve.capper at arm.com>
---
 arch/arm64/include/asm/pgtable.h | 33 ++++++++++++++++++---------------
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index b46e54c2399b..0fa7305765ab 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -135,12 +135,20 @@ static inline pte_t pte_mkwrite(pte_t pte)
 
 static inline pte_t pte_mkclean(pte_t pte)
 {
-	return clear_pte_bit(pte, __pgprot(PTE_DIRTY));
+	pte = clear_pte_bit(pte, __pgprot(PTE_DIRTY));
+	pte = set_pte_bit(pte, __pgprot(PTE_RDONLY));
+
+	return pte;
 }
 
 static inline pte_t pte_mkdirty(pte_t pte)
 {
-	return set_pte_bit(pte, __pgprot(PTE_DIRTY));
+	pte = set_pte_bit(pte, __pgprot(PTE_DIRTY));
+
+	if (pte_write(pte))
+		pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY));
+
+	return pte;
 }
 
 static inline pte_t pte_mkold(pte_t pte)
@@ -628,28 +636,23 @@ static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
 /*
- * ptep_set_wrprotect - mark read-only while preserving the hardware update of
- * the Access Flag.
+ * ptep_set_wrprotect - mark read-only while trasferring potential hardware
+ * dirty status (PTE_DBM && !PTE_RDONLY) to the software PTE_DIRTY bit.
  */
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
 static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long address, pte_t *ptep)
 {
 	pte_t old_pte, pte;
 
-	/*
-	 * ptep_set_wrprotect() is only called on CoW mappings which are
-	 * private (!VM_SHARED) with the pte either read-only (!PTE_WRITE &&
-	 * PTE_RDONLY) or writable and software-dirty (PTE_WRITE &&
-	 * !PTE_RDONLY && PTE_DIRTY); see is_cow_mapping() and
-	 * protection_map[]. There is no race with the hardware update of the
-	 * dirty state: clearing of PTE_RDONLY when PTE_WRITE (a.k.a. PTE_DBM)
-	 * is set.
-	 */
-	VM_WARN_ONCE(pte_write(*ptep) && !pte_dirty(*ptep),
-		     "%s: potential race with hardware DBM", __func__);
 	pte = READ_ONCE(*ptep);
 	do {
 		old_pte = pte;
+		/*
+		 * If hardware-dirty (PTE_WRITE/DBM bit set and PTE_RDONLY
+		 * clear), set the PTE_DIRTY bit.
+		 */
+		if (pte_hw_dirty(pte))
+			pte = pte_mkdirty(pte);
 		pte = pte_wrprotect(pte);
 		pte_val(pte) = cmpxchg_relaxed(&pte_val(*ptep),
 					       pte_val(old_pte), pte_val(pte));
-- 
2.11.0




More information about the linux-arm-kernel mailing list