[PATCH 1/2] arm64: trans_pgd: fix incorrect use of pmd_populate_kernel in copy_pte()

Rongwei Wang rongwei.wang at linux.alibaba.com
Sun Oct 31 19:14:22 PDT 2021



On 10/31/21 8:25 PM, Ard Biesheuvel wrote:
> On Sat, 30 Oct 2021 at 20:32, Rongwei Wang
> <rongwei.wang at linux.alibaba.com> wrote:
>>
>> Commit 5de59884ac0e ("arm64: trans_pgd: pass NULL instead
>> of init_mm to *_populate functions") simply replaced init_mm
>> with NULL in the call to pmd_populate_kernel(). However, commit
>> 59511cfd08f3 ("arm64: mm: use XN table mapping attributes for
>> user/kernel mappings") added a check of the mm context to
>> pmd_populate_kernel(). Together, these changes cause a crash in
>> copy_pte() in trans_pgd.c, as follows:
>>
>> kernel BUG at arch/arm64/include/asm/pgalloc.h:79!
>> Internal error: Oops - BUG: 0 [#1] SMP
>> Modules linked in: rfkill(E) aes_ce_blk(E) aes_ce_cipher(E) ...
>> CPU: 21 PID: 1617 Comm: a.out Kdump: loaded Tainted: ... 5.15.0-rc7-mm1+ #8
>> Hardware name: ECS, BIOS 0.0.0 02/06/2015
>> pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>> pc : trans_pgd_create_copy+0x4ac/0x4f0
>> lr : trans_pgd_create_copy+0x34c/0x4f0
>> sp : ffff80001bf2bc50
>> x29: ffff80001bf2bc50 x28: ffff0010067f1000 x27: ffff800011072000
>> x26: ffff001fffff8000 x25: ffff008000000000 x24: 0040000000000041
>> x23: 0040000000000001 x22: ffff80001bf2bd68 x21: ffff80001188ded8
>> x20: ffff800000000000 x19: ffff000000000000 x18: 0000000000000000
>> x17: 0000000000000000 x16: 0000000000000000 x15: 00000000200004c0
>> x14: ffff00003fffffff x13: ffff007fffffffff x12: ffff800010f882a8
>> x11: 0000000000face57 x10: 0000000000000001 x9 : 0000000000000000
>> x8 : ffff00100cece000 x7 : ffff001001c9f000 x6 : ffff00100ae40000
>> x5 : 0000000000000040 x4 : 0000000000000000 x3 : ffff001fffff7000
>> x2 : ffff000000200000 x1 : ffff000040000000 x0 : ffff00100cecd000
>> Call trace:
>>   trans_pgd_create_copy+0x4ac/0x4f0
>>   machine_kexec_post_load+0x94/0x3bc
>>   do_kexec_load+0x11c/0x2e0
>>   __arm64_sys_kexec_load+0xa8/0xf4
>>   invoke_syscall+0x50/0x120
>>   el0_svc_common.constprop.0+0x58/0x190
>>   do_el0_svc+0x2c/0x90
>>   el0_svc+0x28/0xe0
>>   el0t_64_sync_handler+0xb0/0xb4
>>   el0t_64_sync+0x180/0x184
>> Code: f90000c0 d5033a9f d5033fdf 17ffff7b (d4210000)
>> ---[ end trace cc5461ffe1a085db ]---
>> Kernel panic - not syncing: Oops - BUG: Fatal exception
>>
>> This bug can be reproduced with the following userspace test case:
>>
>> #include <stdint.h>
>> #include <sys/syscall.h>
>> #include <unistd.h>
>>
>> void execute_kexec_load(void)
>> {
>>          syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
>>          syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
>>          syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
>>
>>          *(uint64_t*)0x200004c0 = 0;
>>          *(uint64_t*)0x200004c8 = 0;
>>          *(uint64_t*)0x200004d0 = 0;
>>          *(uint64_t*)0x200004d8 = 0;
>>          syscall(__NR_kexec_load, 0ul, 1ul, 0x200004c0ul, 0ul);
>> }
>>
>> This patch makes a small change: it replaces pmd_populate_kernel()
>> with pmd_populate() in copy_pte().
>>
>> Fixes: 59511cfd08f3 ("arm64: mm: use XN table mapping attributes for user/kernel mappings")
>> Reported-by: Abaci <abaci at linux.alibaba.com>
>> Signed-off-by: Rongwei Wang <rongwei.wang at linux.alibaba.com>
>> ---
>>   arch/arm64/mm/trans_pgd.c | 7 ++++---
>>   1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
>> index d7da8ca40d2e..3f1fc6cb9c9d 100644
>> --- a/arch/arm64/mm/trans_pgd.c
>> +++ b/arch/arm64/mm/trans_pgd.c
>> @@ -62,12 +62,13 @@ static int copy_pte(struct trans_pgd_info *info, pmd_t *dst_pmdp,
>>   {
>>          pte_t *src_ptep;
>>          pte_t *dst_ptep;
>> +       struct page *page;
>>          unsigned long addr = start;
>>
>> -       dst_ptep = trans_alloc(info);
>> -       if (!dst_ptep)
>> +       page = virt_to_page(trans_alloc(info));
>> +       if (!page)
>>                  return -ENOMEM;
>> -       pmd_populate_kernel(NULL, dst_pmdp, dst_ptep);
>> +       pmd_populate(NULL, dst_pmdp, page);
> 
> Are you sure this truly fixes the underlying issue rather than the symptom?
> 
Hi Ard

I traced the BUG to the 'VM_BUG_ON(mm != &init_mm)' line shown below,
which looks like an obviously incorrect use of 'pmd_populate_kernel' in
copy_pte(). Both changes appear to have been introduced this year.

static inline void
pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp, pte_t *ptep)
{
	VM_BUG_ON(mm != &init_mm);
	__pmd_populate(pmdp, __pa(ptep), PMD_TYPE_TABLE | PMD_TABLE_UXN);
}

I have run some test cases with this patch applied, and the bug no
longer triggers. If I am missing something, please let me know and I
will check again.
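
To make the failing condition concrete, here is a minimal userspace
sketch of the check. It is not kernel code: struct mm_struct and init_mm
are stand-ins, and populate_kernel_would_bug() is a hypothetical helper
that only mirrors the comparison inside VM_BUG_ON(). It shows that a
NULL mm, which is what copy_pte() passes, fails the test.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's struct mm_struct (illustration only). */
struct mm_struct {
	int unused;
};

/* Stand-in for the kernel's init_mm object. */
static struct mm_struct init_mm;

/*
 * Hypothetical helper: returns true when the condition inside
 * VM_BUG_ON(mm != &init_mm) holds, i.e. when the kernel would BUG.
 */
static bool populate_kernel_would_bug(const struct mm_struct *mm)
{
	return mm != &init_mm;
}
```

Calling populate_kernel_would_bug(&init_mm) returns false, while
populate_kernel_would_bug(NULL) returns true, matching the crash above.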

Thanks!

> pmd_populate() will create a table entry with the PXN attribute set,
> which means nothing below it will be executable by the kernel,
> regardless of the executable permissions at the PTE level.
> 
> 
>>          dst_ptep = pte_offset_kernel(dst_pmdp, start);
>>
>>          src_ptep = pte_offset_kernel(src_pmdp, start);
>> --
>> 2.27.0
>>
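
Ard's point about the PXN table attribute can be illustrated with a
small userspace model. This is only a sketch under simplifying
assumptions: it looks at a single table level and a single PTE, ignores
valid bits and every other permission control, and
kernel_can_execute() is a hypothetical helper, not a kernel function.
The bit positions are the ones arm64 Linux uses for PMD_TABLE_PXN,
PMD_TABLE_UXN, PTE_PXN and PTE_UXN.

```c
#include <stdbool.h>
#include <stdint.h>

/* Table-descriptor attributes (hierarchical, apply to everything below). */
#define PMD_TABLE_PXN	(UINT64_C(1) << 59)	/* PXNTable */
#define PMD_TABLE_UXN	(UINT64_C(1) << 60)	/* UXNTable */

/* Leaf (PTE) attributes. */
#define PTE_PXN		(UINT64_C(1) << 53)
#define PTE_UXN		(UINT64_C(1) << 54)

/*
 * Hypothetical helper: can the kernel (privileged level) execute from a
 * page mapped by 'pte' under the table descriptor 'table_desc'?
 * PXNTable at the table level wins regardless of the PTE bits.
 */
static bool kernel_can_execute(uint64_t table_desc, uint64_t pte)
{
	if (table_desc & PMD_TABLE_PXN)
		return false;
	return !(pte & PTE_PXN);
}
```

With a table entry as written by pmd_populate() (PMD_TABLE_PXN set),
kernel_can_execute() is false even for a PTE without PTE_PXN. With a
table entry as written by pmd_populate_kernel() (PMD_TABLE_UXN set),
the result depends only on the PTE.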