[PATCH 2/4] arm64: hugetlbpages: Correctly handle swap entries in huge_pte_offset()

Punit Agrawal punit.agrawal at arm.com
Tue Apr 4 11:47:15 PDT 2017


Hi Mark,

Mark Rutland <mark.rutland at arm.com> writes:

> Hi Punit,
>
> On Thu, Mar 30, 2017 at 05:38:47PM +0100, Punit Agrawal wrote:
>> huge_pte_offset() does not correctly handle poisoned or migration page
>> table entries. 
>
> What exactly does it do wrong?
>
> Judging by the patch, we return NULL in some cases we shouldn't, right?

huge_pte_offset() returns NULL when it comes across swap entries for any
of the supported hugepage sizes.

>
> What can result from this? e.g. can we see data corruption?

In the tests I am running, it results in an error in the log -

[  344.165544] mm/pgtable-generic.c:33: bad pmd 000000083af00074.

when unmapping the page tables for the process that owns the poisoned
page.

In some instances, returning NULL instead of swap entries could lead to
data corruption - especially when the page tables contain migration swap
entries. But since hugepage migration is not enabled on arm64 I haven't
seen any corruption.

I've updated the commit log with more details locally.

>
>> Not knowing the size of the hugepage entry being
>> requested only compounded the problem.
>> 
>> The recently added hstate parameter can be used to determine the size of
>> hugepage being accessed. Use the size to find the correct page table
>> entry to return when coming across a swap page table entry.
>> 
>> Signed-off-by: Punit Agrawal <punit.agrawal at arm.com>
>> Cc: David Woods <dwoods at mellanox.com>
>
> Given this is a fix for a bug, it sounds like it should have a fixes
> tag, or a Cc stable...

The problem doesn't occur until we enable memory failure handling. So
there shouldn't be a problem on earlier kernels.

Thanks,
Punit

>
> Thanks,
> Mark.
>
>> ---
>>  arch/arm64/mm/hugetlbpage.c | 31 ++++++++++++++++---------------
>>  1 file changed, 16 insertions(+), 15 deletions(-)
>> 
>> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
>> index 9ca742c4c1ab..44014403081f 100644
>> --- a/arch/arm64/mm/hugetlbpage.c
>> +++ b/arch/arm64/mm/hugetlbpage.c
>> @@ -192,38 +192,39 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
>>  pte_t *huge_pte_offset(struct mm_struct *mm,
>>  		       unsigned long addr, struct hstate *h)
>>  {
>> +	unsigned long sz = huge_page_size(h);
>>  	pgd_t *pgd;
>>  	pud_t *pud;
>> -	pmd_t *pmd = NULL;
>> -	pte_t *pte = NULL;
>> +	pmd_t *pmd;
>> +	pte_t *pte;
>>  
>>  	pgd = pgd_offset(mm, addr);
>>  	pr_debug("%s: addr:0x%lx pgd:%p\n", __func__, addr, pgd);
>>  	if (!pgd_present(*pgd))
>>  		return NULL;
>> +
>>  	pud = pud_offset(pgd, addr);
>> -	if (!pud_present(*pud))
>> +	if (pud_none(*pud) && sz != PUD_SIZE)
>>  		return NULL;
>> -
>> -	if (pud_huge(*pud))
>> +	else if (!pud_table(*pud))
>>  		return (pte_t *)pud;
>> +
>> +	if (sz == CONT_PMD_SIZE)
>> +		addr &= CONT_PMD_MASK;
>> +
>>  	pmd = pmd_offset(pud, addr);
>> -	if (!pmd_present(*pmd))
>> +	if (pmd_none(*pmd) &&
>> +	    !(sz == PMD_SIZE || sz == CONT_PMD_SIZE))
>>  		return NULL;
>> -
>> -	if (pte_cont(pmd_pte(*pmd))) {
>> -		pmd = pmd_offset(
>> -			pud, (addr & CONT_PMD_MASK));
>> -		return (pte_t *)pmd;
>> -	}
>> -	if (pmd_huge(*pmd))
>> +	else if (!pmd_table(*pmd))
>>  		return (pte_t *)pmd;
>> -	pte = pte_offset_kernel(pmd, addr);
>> -	if (pte_present(*pte) && pte_cont(*pte)) {
>> +
>> +	if (sz == CONT_PTE_SIZE) {
>>  		pte = pte_offset_kernel(
>>  			pmd, (addr & CONT_PTE_MASK));
>>  		return pte;
>>  	}
>> +
>>  	return NULL;
>>  }
>>  
>> -- 
>> 2.11.0
>> 
>> 
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel at lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel



More information about the linux-arm-kernel mailing list