BUG: Bad page map in process/Bad Swap file entry, RPI CM4 on clone syscall

Will Deacon will at kernel.org
Wed Aug 24 08:30:47 PDT 2022


On Thu, Aug 18, 2022 at 07:14:12PM +0200, Max Schulze wrote:
> > On 15.08.22 16:22, Will Deacon wrote:
> >>> [20:47:09] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> >>> [20:48:46] BUG: Bad page map in process projecta  pte:1110111111111111 pmd:800000001c40003
> >>> [20:48:46] addr:0000007fa1c00000 vm_flags:00100073 anon_vma:ffffff805bf80d08 mapping:0000000000000000 index:7fa1c00
> >>> [20:48:46] file:(null) fault:0x0 mmap:0x0 read_folio:0x0
> 
> > 
> >> I hate to say it, but this all looks like memory corruption hitting the
> >> page table and possibly the 'struct page' array to me :/
> > 
> > Perhaps a note on the occurrence: across devices, the "bad page map"
> > differs at pte, but somehow is mostly consistent at pmd:800000001c40003
> > (though I have seen 800000001c02003 and 800000001c40003). Is this some
> > "magic value"? Because when not, I think it would be highly unlikely to
> > be the hardware.
> > 
> > It is not only my program that has the problem, I have seen
> > 
> > [Sun Aug 14 17:30:38 2022] BUG: Bad page map in process llvmpipe-3  pte:262d2626292a2627 pmd:800000001c01003
> > 
> > and
> > [Sat Aug 13 11:53:43 2022] BUG: Bad page map in process Xorg:disk$1  pte:a098a09aa29ea8a4 pmd:800000001c01003
> > [Sat Aug 13 11:53:43 2022] addr:00000055a961e000 vm_flags:200100073 anon_vma:ffffff804c07d8f8 mapping:0000000000000000 index:55a961e
> > [Sat Aug 13 11:53:43 2022] file:(null) fault:0x0 mmap:0x0 read_folio:0x0
> > 
> > 
> [..]
> 
> I am able to reproduce this on 6.0.0-rc1.
> It looks like vm_normal_page does not recognize the page as being "normal" (?).
> (mm/memory.c)

I think the issue is much more fundamental than that; you appear to have
page-table corruption (for example, "pte:262d2626292a2627" and
"pte:1110111111111111" are definitely corrupted) and so anything dealing
with 'struct page' derived from the physical address in the pte is going to
go wonky.

From the logs here, the pmds look ok but these are the pte values I spotted:

0x1110111111111111
0x262d2626292a2627
0xa098a09aa29ea8a4
0x212725231f242323
0x2626262023222323

which don't seem to correspond to any sort of poison, but are possibly
artifacts of repeated patterns with random bits cleared?
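
To make the byte-level repetition easier to see, here is a minimal
sketch in Python that splits each of the pte values listed above into its
eight bytes and reports how far apart the largest and smallest byte are.
The helper names (pte_bytes, byte_spread) are made up for illustration
and are not kernel APIs; the script only describes the values and says
nothing about where the corruption comes from.

```python
# Illustrative only: helper names are hypothetical, not kernel interfaces.
# The values are the pte entries quoted from the logs in this thread.
PTES = [
    0x1110111111111111,
    0x262d2626292a2627,
    0xa098a09aa29ea8a4,
    0x212725231f242323,
    0x2626262023222323,
]


def pte_bytes(pte):
    """Return the eight bytes of a 64-bit value, most significant first."""
    return list(pte.to_bytes(8, "big"))


def byte_spread(pte):
    """Difference between the largest and smallest byte in the value."""
    b = pte_bytes(pte)
    return max(b) - min(b)


if __name__ == "__main__":
    for pte in PTES:
        hex_bytes = " ".join(f"{x:02x}" for x in pte_bytes(pte))
        print(f"{pte:#018x}: {hex_bytes}  spread={byte_spread(pte):#04x}")
```

In every one of these values the bytes sit within 0x10 of each other,
which is what makes them look like a repeated pattern with small
per-byte differences rather than a plausible page-table entry.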

Will


