[PATCH v5 2/6] mm: remap unused subpages to shared zeropage when splitting isolated thp
Lance Yang
lance.yang at linux.dev
Fri Sep 19 01:14:11 PDT 2025
On 2025/9/19 15:55, David Hildenbrand wrote:
>>> I think where possible we really only want to identify problematic
>>> (tagged) pages and skip them. And we should either look into fixing KSM
>>> as well or finding out why KSM is not affected.
>>
>> Yeah. Seems like we could introduce a new helper,
>> folio_test_mte_tagged(struct folio *folio). By default, it would return
>> false, and architectures like arm64 can override it.
>
> If we add a new helper it should instead express the semantics that we
> cannot deduplicate.
Agreed.
>
> For THP, I recall that only some pages might be tagged. So likely we
> want to check per page.
Yes, a per-page check would be simpler.
>
>>
>> Looking at the code, the PG_mte_tagged flag is not set for regular THP.
>
> I think it's supported for THP per page. Only for hugetlb we tag the
> whole thing through the head page instead of individual pages.
Right. That's exactly what I meant.
>
>> The MTE status actually comes from the VM_MTE flag in the VMA that
>> maps it.
>>
>
> During the rmap walk we could check the VMA flag, but there would be no
> way to just stop the THP shrinker scanning this page early.
>
>> static inline bool folio_test_hugetlb_mte_tagged(struct folio *folio)
>> {
>> 	bool ret = test_bit(PG_mte_tagged, &folio->flags.f);
>>
>> 	VM_WARN_ON_ONCE(!folio_test_hugetlb(folio));
>>
>> 	/*
>> 	 * If the folio is tagged, ensure ordering with a likely subsequent
>> 	 * read of the tags.
>> 	 */
>> 	if (ret)
>> 		smp_rmb();
>> 	return ret;
>> }
>>
>> static inline bool page_mte_tagged(struct page *page)
>> {
>> 	bool ret = test_bit(PG_mte_tagged, &page->flags.f);
>>
>> 	VM_WARN_ON_ONCE(folio_test_hugetlb(page_folio(page)));
>>
>> 	/*
>> 	 * If the page is tagged, ensure ordering with a likely subsequent
>> 	 * read of the tags.
>> 	 */
>> 	if (ret)
>> 		smp_rmb();
>> 	return ret;
>> }
>>
>> contpte_set_ptes()
>>   __set_ptes()
>>     __set_ptes_anysz()
>>       __sync_cache_and_tags()
>>         mte_sync_tags()
>>           set_page_mte_tagged()
>>
>> Then, we could have the THP shrinker skip any folios identified as
>> MTE-tagged.
>
> Likely we should just do something like (maybe we want better naming)
>
> #ifndef page_is_mergable
> #define page_is_mergable(page) (true)
> #endif
Maybe something like page_is_optimizable()? Just a thought ;p
>
> And for arm64 have it be
>
> #define page_is_mergable(page) (!page_mte_tagged(page))
>
>
> And then do
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 1f0813b956436..1cac9093918d6 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -4251,7 +4251,8 @@ static bool thp_underused(struct folio *folio)
>
>  	for (i = 0; i < folio_nr_pages(folio); i++) {
>  		kaddr = kmap_local_folio(folio, i * PAGE_SIZE);
> -		if (!memchr_inv(kaddr, 0, PAGE_SIZE)) {
> +		if (page_is_mergable(folio_page(folio, i)) &&
> +		    !memchr_inv(kaddr, 0, PAGE_SIZE)) {
>  			num_zero_pages++;
>  			if (num_zero_pages > khugepaged_max_ptes_none) {
>  				kunmap_local(kaddr);
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 946253c398072..476a9a9091bd3 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -306,6 +306,8 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
>
>  	if (PageCompound(page))
>  		return false;
> +	if (!page_is_mergable(page))
> +		return false;
>  	VM_BUG_ON_PAGE(!PageAnon(page), page);
>  	VM_BUG_ON_PAGE(!PageLocked(page), page);
>  	VM_BUG_ON_PAGE(pte_present(ptep_get(pvmw->pte)), page);
Looks good to me!
>
>
> For KSM, similarly just bail out early. But still wondering if this is
> already checked somehow for KSM.
+1. I'm looking for a machine to test it on.
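
If I recall correctly, KSM may already be covered on arm64: merging goes
through pages_identical() -> memcmp_pages(), and arm64 overrides
memcmp_pages() in arch/arm64/kernel/mte.c so that tagged pages never
compare as identical. From memory, the relevant bit is roughly:

	/*
	 * If the page content is identical but at least one of the pages is
	 * tagged, return non-zero to avoid KSM merging.
	 */
	if (page_mte_tagged(page1) || page_mte_tagged(page2))
		return addr1 != addr2;

Will double-check once I find one.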