BUG: Bad page map in process/Bad Swap file entry, RPI CM4 on clone syscall

Max Schulze max.schulze at online.de
Thu Aug 18 10:14:12 PDT 2022


Hello,


> On 15.08.22 16:22, Will Deacon wrote:
>>
>>>
[...]
>>>
>>>
>>> [20:47:09] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
>>> [20:48:46] BUG: Bad page map in process projecta  pte:1110111111111111 pmd:800000001c40003
>>> [20:48:46] addr:0000007fa1c00000 vm_flags:00100073 anon_vma:ffffff805bf80d08 mapping:0000000000000000 index:7fa1c00
>>> [20:48:46] file:(null) fault:0x0 mmap:0x0 read_folio:0x0

> 
>> I hate to say it, but this all looks like memory corruption hitting the
>> page table and possibly the 'struct page' array to me :/
> 
> Perhaps a note on the occcurence: across devices, the "bad page map" differs at pte, but somehow is mostly consistent at pmd:800000001c40003 (though I have seen 800000001c02003 and 800000001c40003). Is this some "magic value"? Because when not, I think it would be highly unlikely to be the hardware.
> 
> It is not only my program that has the problem, I have seen
> 
> [Sun Aug 14 17:30:38 2022] BUG: Bad page map in process llvmpipe-3  pte:262d2626292a2627 pmd:800000001c01003
> 
> and
> [Sat Aug 13 11:53:43 2022] BUG: Bad page map in process Xorg:disk$1  pte:a098a09aa29ea8a4 pmd:800000001c01003
> [Sat Aug 13 11:53:43 2022] addr:00000055a961e000 vm_flags:200100073 anon_vma:ffffff804c07d8f8 mapping:0000000000000000 index:55a961e
> [Sat Aug 13 11:53:43 2022] file:(null) fault:0x0 mmap:0x0 read_folio:0x0
> 
> 
[..]

I am able to reproduce this on 6.0.0-rc1 . 
It looks like vm_normal_page does not recognize the page as being "normal" (?).
(mm/memory.c)
>               if (likely(!pte_special(pte)))
> 			goto check_pfn;
> 		if (vma->vm_ops && vma->vm_ops->find_special_page)
> 			return vma->vm_ops->find_special_page(vma, addr);
> 		if (vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
> 			return NULL;
> 		if (is_zero_pfn(pfn))
> 			return NULL;
> 		if (pte_devmap(pte))
>[...]
> 			return NULL;
> 
> 		print_bad_pte(vma, addr, pte, NULL);



What would be helpful to do next? Is the KASAN warning a consequent error or the cause?

[ 18:42:59] 
[ 18:44:17] BUG: Bad page map in process projecta  pte:212725231f242323 pmd:800000001c01003
[ 18:44:17] addr:0000007fa1000000 vm_flags:00100073 anon_vma:ffffff8054090c38 mapping:0000000000000000 index:7fa1000
[ 18:44:17] file:(null) fault:0x0 mmap:0x0 read_folio:0x0
[ 18:44:17] CPU: 3 PID: 1135 Comm: projecta Tainted: G         C         6.0.0-rc1-v8-gc8f41281d1f4 #2
[ 18:44:17] Hardware name: Raspberry Pi Compute Module 4 Rev 1.0 (DT)
[ 18:44:17] Call trace:
[ 18:44:17] dump_backtrace.part.0 (arch/arm64/kernel/stacktrace.c:184) 
[ 18:44:17] show_stack (arch/arm64/kernel/stacktrace.c:191) 
[ 18:44:17] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4)) 
[ 18:44:17] dump_stack (lib/dump_stack.c:114) 
[ 18:44:17] print_bad_pte (mm/memory.c:567 (discriminator 12)) 
[ 18:44:17] vm_normal_page (mm/memory.c:638) 
[ 18:44:17] copy_page_range (mm/memory.c:951 mm/memory.c:1085 mm/memory.c:1171 mm/memory.c:1208 mm/memory.c:1232 mm/memory.c:1330) 
[ 18:44:17] dup_mm (kernel/fork.c:699 kernel/fork.c:1524) 
[ 18:44:17] copy_process (kernel/fork.c:1576 kernel/fork.c:2256) 
[ 18:44:17] kernel_clone (kernel/fork.c:2673) 
[ 18:44:17] __do_sys_clone (kernel/fork.c:2808) 
[ 18:44:17] __arm64_sys_clone (kernel/fork.c:2775) 
[ 18:44:17] invoke_syscall (arch/arm64/kernel/syscall.c:38 arch/arm64/kernel/syscall.c:52) 
[ 18:44:17] el0_svc_common.constprop.0 (./arch/arm64/include/asm/daifflags.h:28 arch/arm64/kernel/syscall.c:150) 
[ 18:44:17] do_el0_svc (arch/arm64/kernel/syscall.c:207) 
[ 18:44:17] el0_svc (arch/arm64/kernel/entry-common.c:133 arch/arm64/kernel/entry-common.c:142 arch/arm64/kernel/entry-common.c:625) 
[ 18:44:17] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:643) 
[ 18:44:17] el0t_64_sync (arch/arm64/kernel/entry.S:581) 
[ 18:44:17] Disabling lock debugging due to kernel taint
[ 18:44:17] BUG: Bad page map in process projecta  pte:2626262023222323 pmd:800000001c01003
[ 18:44:17] addr:0000007fa1001000 vm_flags:00100073 anon_vma:ffffff8054090c38 mapping:0000000000000000 index:7fa1001
[ 18:44:17] file:(null) fault:0x0 mmap:0x0 read_folio:0x0
[ 18:44:17] CPU: 3 PID: 1135 Comm: projecta Tainted: G    B    C         6.0.0-rc1-v8-gc8f41281d1f4 #2
[ 18:44:17] Hardware name: Raspberry Pi Compute Module 4 Rev 1.0 (DT)
[ 18:44:17] Call trace:
[ 18:44:17] dump_backtrace.part.0 (arch/arm64/kernel/stacktrace.c:184) 
[ 18:44:17] show_stack (arch/arm64/kernel/stacktrace.c:191) 
[ 18:44:17] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4)) 
[ 18:44:17] dump_stack (lib/dump_stack.c:114) 
[ 18:44:17] print_bad_pte (mm/memory.c:567 (discriminator 12)) 
[ 18:44:17] vm_normal_page (mm/memory.c:638) 
[ 18:44:17] copy_page_range (mm/memory.c:951 mm/memory.c:1085 mm/memory.c:1171 mm/memory.c:1208 mm/memory.c:1232 mm/memory.c:1330) 
[ 18:44:17] dup_mm (kernel/fork.c:699 kernel/fork.c:1524) 
[ 18:44:17] copy_process (kernel/fork.c:1576 kernel/fork.c:2256) 
[ 18:44:17] kernel_clone (kernel/fork.c:2673) 
[ 18:44:17] __do_sys_clone (kernel/fork.c:2808) 
[ 18:44:17] __arm64_sys_clone (kernel/fork.c:2775) 
[ 18:44:17] invoke_syscall (arch/arm64/kernel/syscall.c:38 arch/arm64/kernel/syscall.c:52) 
[ 18:44:17] el0_svc_common.constprop.0 (./arch/arm64/include/asm/daifflags.h:28 arch/arm64/kernel/syscall.c:150) 
[ 18:44:17] do_el0_svc (arch/arm64/kernel/syscall.c:207) 
[ 18:44:17] el0_svc (arch/arm64/kernel/entry-common.c:133 arch/arm64/kernel/entry-common.c:142 arch/arm64/kernel/entry-common.c:625) 
[ 18:44:17] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:643) 
[ 18:44:17] el0t_64_sync (arch/arm64/kernel/entry.S:581) 
[ 18:44:17] ==================================================================
[ 18:44:17] BUG: KASAN: wild-memory-access in __sync_icache_dcache (./include/asm-generic/bitops/generic-non-atomic.h:127 arch/arm64/mm/flush.c:62) 
[ 18:44:17] Read of size 8 at addr 00000096808c8880 by task projecta/1135

[ 18:44:17] CPU: 3 PID: 1135 Comm: projecta Tainted: G    B    C         6.0.0-rc1-v8-gc8f41281d1f4 #2
[ 18:44:17] Hardware name: Raspberry Pi Compute Module 4 Rev 1.0 (DT)
[ 18:44:17] Call trace:
[ 18:44:17] dump_backtrace.part.0 (arch/arm64/kernel/stacktrace.c:184) 
[ 18:44:17] show_stack (arch/arm64/kernel/stacktrace.c:191) 
[ 18:44:17] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4)) 
[ 18:44:17] print_report (mm/kasan/report.c:438) 
[ 18:44:17] kasan_report (mm/kasan/report.c:162 mm/kasan/report.c:497) 
[ 18:44:17] __asan_load8 (mm/kasan/generic.c:256) 
[ 18:44:17] __sync_icache_dcache (./include/asm-generic/bitops/generic-non-atomic.h:127 arch/arm64/mm/flush.c:62) 
[ 18:44:17] copy_page_range (./arch/arm64/include/asm/pgtable.h:327 ./arch/arm64/include/asm/pgtable.h:358 mm/memory.c:994 mm/memory.c:1085 mm/memory.c:1171 mm/memory.c:1208 mm/memory.c:1232 mm/memory.c:1330) 
[ 18:44:17] dup_mm (kernel/fork.c:699 kernel/fork.c:1524) 
[ 18:44:17] copy_process (kernel/fork.c:1576 kernel/fork.c:2256) 
[ 18:44:17] kernel_clone (kernel/fork.c:2673) 
[ 18:44:17] __do_sys_clone (kernel/fork.c:2808) 
[ 18:44:17] __arm64_sys_clone (kernel/fork.c:2775) 
[ 18:44:17] invoke_syscall (arch/arm64/kernel/syscall.c:38 arch/arm64/kernel/syscall.c:52) 
[ 18:44:17] el0_svc_common.constprop.0 (./arch/arm64/include/asm/daifflags.h:28 arch/arm64/kernel/syscall.c:150) 
[ 18:44:17] do_el0_svc (arch/arm64/kernel/syscall.c:207) 
[ 18:44:17] el0_svc (arch/arm64/kernel/entry-common.c:133 arch/arm64/kernel/entry-common.c:142 arch/arm64/kernel/entry-common.c:625) 
[ 18:44:17] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:643) 
[ 18:44:17] el0t_64_sync (arch/arm64/kernel/entry.S:581) 
[ 18:44:17] ==================================================================
[ 18:44:17] Unable to handle kernel paging request at virtual address 00000096808c8880
[ 18:44:17] Mem abort info:
[ 18:44:17]   ESR = 0x0000000096000004
[ 18:44:17]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 18:44:17]   SET = 0, FnV = 0
[ 18:44:17]   EA = 0, S1PTW = 0
[ 18:44:17]   FSC = 0x04: level 0 translation fault
[ 18:44:17] Data abort info:
[ 18:44:17]   ISV = 0, ISS = 0x00000004
[ 18:44:17]   CM = 0, WnR = 0
[ 18:44:17] [00000096808c8880] address between user and kernel address ranges
[ 18:44:17] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[ 18:44:17] Modules linked in: rtc_pcf85063 regmap_i2c ov9281 rfkill bcm2835_unicam v4l2_dv_timings v4l2_fwnode v3d bcm2835_v4l2(C) v4l2_async bcm2835_codec(C) bcm2835_isp(C) videobuf2_vmalloc rpivid_hevc(C) v4l2_mem2mem drm_shmem_helper bcm2835_mmal_vchiq(C) gpu_sched videobuf2_dma_contig videobuf2_memops i2c_mux_pinctrl videobuf2_v4l2 videobuf2_common raspberrypi_hwmon i2c_mux videodev i2c_brcmstb i2c_bcm2835 vc_sm_cma(C) mc uio_pdrv_genirq nvmem_rmem uio drm fuse drm_panel_orientation_quirks backlight ipv6
[ 18:44:17] CPU: 3 PID: 1135 Comm: projecta Tainted: G    B    C         6.0.0-rc1-v8-gc8f41281d1f4 #2
[ 18:44:17] Hardware name: Raspberry Pi Compute Module 4 Rev 1.0 (DT)
[ 18:44:17] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 18:44:17] pc : __sync_icache_dcache (./include/asm-generic/bitops/generic-non-atomic.h:127 arch/arm64/mm/flush.c:62) 
[ 18:44:17] lr : __sync_icache_dcache (./include/asm-generic/bitops/generic-non-atomic.h:127 arch/arm64/mm/flush.c:62) 
[ 18:44:17] sp : ffffffc00d067630
[ 18:44:17] x29: ffffffc00d067630 x28: 0400000000000001 x27: 2626262023222323
[ 18:44:17] x26: 0000007fa1001000 x25: fffffffe010f2ce8 x24: 0000000000000000
[ 18:44:17] x23: fffffffe00000000 x22: 00000096808c8880 x21: 1ffffff801a0cece
[ 18:44:17] x20: 0000000000000000 x19: 00000098808c8880 x18: 0000000000000000
[ 18:44:17] x17: 3d3d3d3d3d3d3d3d x16: 3d3d3d3d3d3d3d3d x15: 3d3d3d3d3d3d3d3d
[ 18:44:17] x14: 3d3d3d3d3d3d3d3d x13: 3d3d3d3d3d3d3d3d x12: ffffffb8014cd81d
[ 18:44:17] x11: 1ffffff8014cd81c x10: ffffffb8014cd81c x9 : dfffffc000000000
[ 18:44:17] x8 : ffffffc00a66c0e7 x7 : 00000047feb327e4 x6 : 0000000000000001
[ 18:44:17] x5 : ffffffc00a66c0e0 x4 : ffffffb8014cd81d x3 : ffffffc0080b68e4
[ 18:44:17] x2 : 0000000000000000 x1 : ffffff804f3e0040 x0 : 0000000000000001
[ 18:44:17] Call trace:
[ 18:44:17] __sync_icache_dcache (./include/asm-generic/bitops/generic-non-atomic.h:127 arch/arm64/mm/flush.c:62) 
[ 18:44:17] copy_page_range (./arch/arm64/include/asm/pgtable.h:327 ./arch/arm64/include/asm/pgtable.h:358 mm/memory.c:994 mm/memory.c:1085 mm/memory.c:1171 mm/memory.c:1208 mm/memory.c:1232 mm/memory.c:1330) 
[ 18:44:17] dup_mm (kernel/fork.c:699 kernel/fork.c:1524) 
[ 18:44:17] copy_process (kernel/fork.c:1576 kernel/fork.c:2256) 
[ 18:44:17] kernel_clone (kernel/fork.c:2673) 
[ 18:44:17] __do_sys_clone (kernel/fork.c:2808) 
[ 18:44:17] __arm64_sys_clone (kernel/fork.c:2775) 
[ 18:44:17] invoke_syscall (arch/arm64/kernel/syscall.c:38 arch/arm64/kernel/syscall.c:52) 
[ 18:44:17] el0_svc_common.constprop.0 (./arch/arm64/include/asm/daifflags.h:28 arch/arm64/kernel/syscall.c:150) 
[ 18:44:17] do_el0_svc (arch/arm64/kernel/syscall.c:207) 
[ 18:44:17] el0_svc (arch/arm64/kernel/entry-common.c:133 arch/arm64/kernel/entry-common.c:142 arch/arm64/kernel/entry-common.c:625) 
[ 18:44:17] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:643) 
[ 18:44:17] el0t_64_sync (arch/arm64/kernel/entry.S:581) 
[ 18:44:17] Code: d37ae673 8b170276 aa1603e0 940f8ac1 (f8776a60)
All code
========
   0:	d37ae673 	lsl	x19, x19, #6
   4:	8b170276 	add	x22, x19, x23
   8:	aa1603e0 	mov	x0, x22
   c:	940f8ac1 	bl	0x3e2b10
  10:*	f8776a60 	ldr	x0, [x19, x23]		<-- trapping instruction

Code starting with the faulting instruction
===========================================
   0:	f8776a60 	ldr	x0, [x19, x23]
[ 18:44:17] ---[ end trace 0000000000000000 ]---
[ 18:44:17] note: projecta[1135] exited with preempt_count 2

Thanks,
Max



More information about the linux-arm-kernel mailing list