[PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
Balbir Singh
balbirs at nvidia.com
Sat Nov 22 04:04:54 PST 2025
On 11/11/25 02:28, Catalin Marinas wrote:
> On Mon, Nov 10, 2025 at 10:53:33AM +0100, David Hildenbrand (Red Hat) wrote:
>> On 10.11.25 10:48, Jan Polensky wrote:
>>> On Mon, Nov 10, 2025 at 10:09:31AM +0100, David Hildenbrand (Red Hat) wrote:
>>>> On 09.11.25 01:36, Jan Polensky wrote:
>>>>> The previous change added __GFP_ZEROTAGS when allocating the huge zero
>>>>> folio to ensure tag initialization for arm64 with MTE enabled. However,
>>>>> on s390 this flag is unnecessary and triggers a regression
>>>>> (observed as a crash during repeated 'dnf makecache').
> [...]
>>>> I think the problem is that post_alloc_hook() does
>>>>
>>>> 	if (zero_tags) {
>>>> 		/* Initialize both memory and memory tags. */
>>>> 		for (i = 0; i != 1 << order; ++i)
>>>> 			tag_clear_highpage(page + i);
>>>>
>>>> 		/* Take note that memory was initialized by the loop above. */
>>>> 		init = false;
>>>> 	}
>>>>
>>>> And tag_clear_highpage() is a NOP on other architectures.
>
> Hmm, another thing I missed. Sorry about this.
>
>>> Which works by the way for our arch (s390).
>>>
>>> include/linux/gfp_types.h | 4 ++++
>>> 1 file changed, 4 insertions(+)
>>>
>>> diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
>>> index 65db9349f905..c12d8a601bb3 100644
>>> --- a/include/linux/gfp_types.h
>>> +++ b/include/linux/gfp_types.h
>>> @@ -85,7 +85,11 @@ enum {
>>> #define ___GFP_HARDWALL BIT(___GFP_HARDWALL_BIT)
>>> #define ___GFP_THISNODE BIT(___GFP_THISNODE_BIT)
>>> #define ___GFP_ACCOUNT BIT(___GFP_ACCOUNT_BIT)
>>> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>> #define ___GFP_ZEROTAGS BIT(___GFP_ZEROTAGS_BIT)
>>> +#else
>>> +#define ___GFP_ZEROTAGS 0
>>> +#endif
>>> #ifdef CONFIG_KASAN_HW_TAGS
>>> #define ___GFP_SKIP_ZERO BIT(___GFP_SKIP_ZERO_BIT)
>>> #define ___GFP_SKIP_KASAN BIT(___GFP_SKIP_KASAN_BIT)
>>>
>>> This solution would be sufficient from my side, and I would appreciate a
>>> quick application if there are no objections.
>>
>> As raised, to be sure that __HAVE_ARCH_TAG_CLEAR_HIGHPAGE is always seen
>> early in that file, it should likely become a CONFIG_ thing.
>
> I'm fine with either option above but I'll throw one more in the mix:
>
> --------------------8<--------------------------------
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 2312e6ee595f..dcff91533590 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -33,6 +33,7 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
> unsigned long vaddr);
> #define vma_alloc_zeroed_movable_folio vma_alloc_zeroed_movable_folio
>
> +bool arch_has_tag_clear_highpage(void);
> void tag_clear_highpage(struct page *to);
> #define __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 125dfa6c613b..318d091db843 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -967,18 +967,13 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>  	return vma_alloc_folio(flags, 0, vma, vaddr);
>  }
>
> +bool arch_has_tag_clear_highpage(void)
> +{
> +	return system_supports_mte();
> +}
> +
>  void tag_clear_highpage(struct page *page)
>  {
> -	/*
> -	 * Check if MTE is supported and fall back to clear_highpage().
> -	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
> -	 * post_alloc_hook() will invoke tag_clear_highpage().
> -	 */
> -	if (!system_supports_mte()) {
> -		clear_highpage(page);
> -		return;
> -	}
> -
>  	/* Newly allocated page, shouldn't have been tagged yet */
>  	WARN_ON_ONCE(!try_page_mte_tagging(page));
>  	mte_zero_clear_page_tags(page_address(page));
> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
> index 105cc4c00cc3..7aa56179ccef 100644
> --- a/include/linux/highmem.h
> +++ b/include/linux/highmem.h
> @@ -251,6 +251,11 @@ static inline void clear_highpage_kasan_tagged(struct page *page)
>
> #ifndef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>
> +static inline bool arch_has_tag_clear_highpage(void)
> +{
> +	return false;
> +}
> +
>  static inline void tag_clear_highpage(struct page *page)
>  {
>  }
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index e4efda1158b2..5ab15431bc06 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1798,7 +1798,8 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
>  {
>  	bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags) &&
>  			!should_skip_init(gfp_flags);
> -	bool zero_tags = init && (gfp_flags & __GFP_ZEROTAGS);
> +	bool zero_tags = init && (gfp_flags & __GFP_ZEROTAGS) &&
> +		arch_has_tag_clear_highpage();
>  	int i;
>
>  	set_page_private(page, 0);
> --------------------8<--------------------------------
>
> Reasoning: with MTE on arm64, you can't have kasan-tagged pages in the
> kernel which are also exposed to user because the tags are shared (same
> physical location). The 'zero_tags' initialisation in post_alloc_hook()
> makes sense for this behaviour. With virtual tagging (briefly announced
> in [1], full specs not public yet), both the user and the kernel can
> have their own tags - more like KASAN_SW_TAGS but without the compiler
> instrumentation. The kernel won't be able to zero the tags for the user
> since they are in virtual space. It can, however, continue to use Kasan
> tags even if the pages are mapped in user space. In this case, I'd
> rather use the kernel_init_pages() call further down in
> post_alloc_hook() than replicating it in tag_clear_highpage(). When we
> get to upstreaming virtual tagging (informally vMTE, sometime next
> year), I'd like to have a kernel image that supports both, so the
> decision on whether to call tag_clear_highpage() will need to be
> dynamic.
>
> [1] https://developer.arm.com/community/arm-community-blogs/b/architectures-and-processors-blog/posts/future-architecture-technologies-poe2-and-vmte
>
I've run into this issue as well: because init is set to false whenever
zero_tags is set, the system does not clear the zero folio. I just spent
a lot of time debugging it :)

Catalin, were you going to send out this patch as a fix to be included in
mm-unstable? For now I've reverted your __GFP_ZEROTAGS change to
get_huge_zero_folio() for my testing. I am on the current mm-new branch.
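
For anyone following along, the failure mode can be modelled in user space.
This is only a rough sketch under stated assumptions, not the kernel code:
struct model_page, MODEL_GFP_ZEROTAGS, post_alloc_hook_model() and its
check_arch parameter are hypothetical stand-ins, init-on-alloc is assumed to
be requested, and the architecture is assumed to have no
__HAVE_ARCH_TAG_CLEAR_HIGHPAGE, so tag_clear_highpage() is the empty generic
version.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u
#define MODEL_GFP_ZEROTAGS 0x1u	/* hypothetical stand-in for __GFP_ZEROTAGS */

/* Hypothetical stand-in for struct page: just the page contents. */
struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
};

/* Models an architecture without __HAVE_ARCH_TAG_CLEAR_HIGHPAGE. */
static bool arch_has_tag_clear_highpage(void)
{
	return false;
}

/* Generic fallback: does nothing, so it zeroes neither tags nor data. */
static void tag_clear_highpage(struct model_page *page)
{
	(void)page;
}

/* Stand-in for kernel_init_pages(): zero the data of each page. */
static void kernel_init_pages(struct model_page *page, int numpages)
{
	for (int i = 0; i < numpages; i++)
		memset(page[i].data, 0, sizeof(page[i].data));
}

/*
 * Simplified model of the init logic in post_alloc_hook().
 * check_arch == false: the behaviour discussed in this thread.
 * check_arch == true:  the behaviour with the proposed arch check.
 */
static void post_alloc_hook_model(struct model_page *page, unsigned int order,
				  unsigned int gfp_flags, bool check_arch)
{
	/* Assumption: init-on-alloc is requested and not skipped. */
	bool init = true;
	bool zero_tags = init && (gfp_flags & MODEL_GFP_ZEROTAGS);
	int i;

	assert(page != NULL);

	if (check_arch)
		zero_tags = zero_tags && arch_has_tag_clear_highpage();

	if (zero_tags) {
		for (i = 0; i != 1 << order; ++i)
			tag_clear_highpage(page + i);
		/* The loop above is assumed to have initialized the memory. */
		init = false;
	}

	if (init)
		kernel_init_pages(page, 1 << order);
}

/* Returns true if every byte of the page is zero. */
static bool page_is_zeroed(const struct model_page *page)
{
	for (size_t i = 0; i < sizeof(page->data); i++)
		if (page->data[i] != 0)
			return false;
	return true;
}
```

Filling a model_page with a non-zero pattern and calling
post_alloc_hook_model(&p, 0, MODEL_GFP_ZEROTAGS, false) leaves the stale
bytes in place, because init is cleared after a loop that did nothing. With
check_arch set to true, zero_tags stays false and the page is zeroed through
kernel_init_pages().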
Balbir