[PATCH -next v4 3/4] arm64: mm: add support for page table check

Tong Tiangen tongtiangen at huawei.com
Mon Apr 18 08:47:34 PDT 2022



On 2022/4/18 17:28, Anshuman Khandual wrote:
> On 4/18/22 09:14, Tong Tiangen wrote:
>> From: Kefeng Wang <wangkefeng.wang at huawei.com>
>>
[...]
>>   #endif
> 
> Ran this series on arm64 platform after enabling
> 
> - CONFIG_PAGE_TABLE_CHECK
> - CONFIG_PAGE_TABLE_CHECK_ENFORCED (avoiding kernel command line option)
> 
> After some time, the following error came up
> 
> [   23.266013] ------------[ cut here ]------------
> [   23.266807] kernel BUG at mm/page_table_check.c:90!
> [   23.267609] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> [   23.268503] Modules linked in:
> [   23.269012] CPU: 1 PID: 30 Comm: khugepaged Not tainted 5.18.0-rc3-00004-g60aa8e363a91 #2
> [   23.270383] Hardware name: linux,dummy-virt (DT)
> [   23.271210] pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [   23.272445] pc : page_table_check_clear.isra.6+0x114/0x148
> [   23.273429] lr : page_table_check_clear.isra.6+0x64/0x148
> [   23.274395] sp : ffff80000afb3ca0
> [   23.274994] x29: ffff80000afb3ca0 x28: fffffc00022558e8 x27: ffff80000a27f628
> [   23.276260] x26: ffff800009f9f2b0 x25: ffff00008a8d5000 x24: ffff800009f09fa0
> [   23.277527] x23: 0000ffff89e00000 x22: ffff800009f09fb8 x21: ffff000089414cc0
> [   23.278798] x20: 0000000000000200 x19: fffffc00022a0000 x18: 0000000000000001
> [   23.280066] x17: 0000000000000001 x16: 0000000000000000 x15: 0000000000000003
> [   23.281331] x14: 0000000000000068 x13: 00000000000000c0 x12: 0000000000000010
> [   23.282602] x11: fffffc0002320008 x10: fffffc0002320000 x9 : ffff800009fa1000
> [   23.283868] x8 : 00000000ffffffff x7 : 0000000000000001 x6 : ffff800009fa1f08
> [   23.285135] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
> [   23.286406] x2 : 00000000ffffffff x1 : ffff000080f2800c x0 : ffff000080f28000
> [   23.287673] Call trace:
> [   23.288123]  page_table_check_clear.isra.6+0x114/0x148
> [   23.289043]  __page_table_check_pmd_clear+0x3c/0x50
> [   23.289918]  pmdp_collapse_flush+0x114/0x370
> [   23.290692]  khugepaged+0x1170/0x19e0
> [   23.291356]  kthread+0x110/0x120
> [   23.291945]  ret_from_fork+0x10/0x20
> [   23.292596] Code: 91001041 b8e80024 51000482 36fffd62 (d4210000)
> [   23.293678] ---[ end trace 0000000000000000 ]---
> [   23.294511] note: khugepaged[30] exited with preempt_count 2
> 
> Looking into file mm/page_table_check.c where this problem occurred.
> 
> /*
>   * An enty is removed from the page table, decrement the counters for that page
>   * verify that it is of correct type and counters do not become negative.
>   */
> static void page_table_check_clear(struct mm_struct *mm, unsigned long addr,
>                                     unsigned long pfn, unsigned long pgcnt)
> {
>          struct page_ext *page_ext;
>          struct page *page;
>          unsigned long i;
>          bool anon;
> 
>          if (!pfn_valid(pfn))
>                  return;
> 
>          page = pfn_to_page(pfn);
>          page_ext = lookup_page_ext(page);
>          anon = PageAnon(page);
> 
>          for (i = 0; i < pgcnt; i++) {
>                  struct page_table_check *ptc = get_page_table_check(page_ext);
> 
>                  if (anon) {
>                          BUG_ON(atomic_read(&ptc->file_map_count));
>                          BUG_ON(atomic_dec_return(&ptc->anon_map_count) < 0);
>                  } else {
>                          BUG_ON(atomic_read(&ptc->anon_map_count));
>   Triggered here ====>>  BUG_ON(atomic_dec_return(&ptc->file_map_count) < 0);
>                  }
>                  page_ext = page_ext_next(page_ext);
>          }
> }
> 
> Could you explain what was expected during pmdp_collapse_flush() which when
> failed, triggered this BUG_ON() ? This counter seems to be page table check
> specific, could it just go wrong ? I have not looked into the details about
> page table check mechanism.
> 
> - Anshuman
> .

Hi Anshuman:

Thanks for testing this series.

Let me briefly explain how page table check (PTC) works.

PTC introduces the following struct to count the mappings of a page by type:
struct page_table_check {
         atomic_t anon_map_count;
         atomic_t file_map_count;
};
This structure is reached through lookup_page_ext(page) followed by
get_page_table_check(page_ext), as in the code you quoted.

When page table entries are set (pud/pmd/pte), page_table_check_set() is
called to increment the page mapping count and to check for errors (e.g.
if a page is used for an anonymous mapping, it cannot be used for a file
mapping at the same time).

When page table entries are cleared (pud/pmd/pte), page_table_check_clear()
is called to decrement the page mapping count and to check for errors.

The error checking rules are described in
Documentation/vm/page_table_check.rst

Setting and clearing of page table entries are expected to be symmetrical:
every clear should match an earlier set.
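
To illustrate the accounting, here is a minimal user-space sketch, not the
kernel implementation. The names ptc_model, ptc_model_set(),
ptc_model_clear() and ptc_model_demo() are hypothetical. The clear side
follows the page_table_check_clear() code quoted above; the set side is
simplified to the single rule mentioned above (a page cannot be mapped as
anon and file at the same time). Instead of BUG_ON(), the helpers return
false.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct page_table_check. */
struct ptc_model {
	atomic_int anon_map_count;
	atomic_int file_map_count;
};

/* Model of "an entry is set": returns false where a check would fire. */
static bool ptc_model_set(struct ptc_model *ptc, bool anon)
{
	if (anon) {
		if (atomic_load(&ptc->file_map_count) != 0)
			return false;	/* already mapped as file */
		atomic_fetch_add(&ptc->anon_map_count, 1);
	} else {
		if (atomic_load(&ptc->anon_map_count) != 0)
			return false;	/* already mapped as anon */
		atomic_fetch_add(&ptc->file_map_count, 1);
	}
	return true;
}

/* Model of "an entry is cleared": the counter must not become negative. */
static bool ptc_model_clear(struct ptc_model *ptc, bool anon)
{
	if (anon) {
		if (atomic_load(&ptc->file_map_count) != 0)
			return false;
		/* atomic_fetch_sub() returns the old value */
		return atomic_fetch_sub(&ptc->anon_map_count, 1) - 1 >= 0;
	}
	if (atomic_load(&ptc->anon_map_count) != 0)
		return false;
	return atomic_fetch_sub(&ptc->file_map_count, 1) - 1 >= 0;
}

/* A balanced set/clear pair passes; a clear with no matching set fails. */
static void ptc_model_demo(void)
{
	struct ptc_model page = {0};

	assert(ptc_model_set(&page, false));	/* file mapping set */
	assert(ptc_model_clear(&page, false));	/* matching clear */
	assert(!ptc_model_clear(&page, false));	/* unmatched clear */
}
```

In this model the last call leaves file_map_count at -1, which corresponds
to the condition checked on the line marked "Triggered here" in the quoted
code.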

Here __page_table_check_pmd_clear() triggers the BUG_ON(), which indicates
that the file mapping count for a page mapped by the pmd entry has become
negative.

I wonder whether there would have been any problems if PTC had not
detected this exception.

Thanks,
Tong.


