[PATCH] mm/huge_memory: restrict __GFP_ZEROTAGS to HW tagging architectures
David Hildenbrand (Red Hat)
david at kernel.org
Mon Nov 24 02:57:09 PST 2025
On 11/22/25 13:04, Balbir Singh wrote:
> On 11/11/25 02:28, Catalin Marinas wrote:
>> On Mon, Nov 10, 2025 at 10:53:33AM +0100, David Hildenbrand (Red Hat) wrote:
>>> On 10.11.25 10:48, Jan Polensky wrote:
>>>> On Mon, Nov 10, 2025 at 10:09:31AM +0100, David Hildenbrand (Red Hat) wrote:
>>>>> On 09.11.25 01:36, Jan Polensky wrote:
>>>>>> The previous change added __GFP_ZEROTAGS when allocating the huge zero
>>>>>> folio to ensure tag initialization for arm64 with MTE enabled. However,
>>>>>> on s390 this flag is unnecessary and triggers a regression
>>>>>> (observed as a crash during repeated 'dnf makecache').
>> [...]
>>>>> I think the problem is that post_alloc_hook() does
>>>>>
>>>>>         if (zero_tags) {
>>>>>                 /* Initialize both memory and memory tags. */
>>>>>                 for (i = 0; i != 1 << order; ++i)
>>>>>                         tag_clear_highpage(page + i);
>>>>>
>>>>>                 /* Take note that memory was initialized by the loop above. */
>>>>>                 init = false;
>>>>>         }
>>>>>
>>>>> And tag_clear_highpage() is a NOP on other architectures.
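>>>>>
>>>>> For reference, the generic fallback in include/linux/highmem.h is
>>>>> roughly just an empty stub when the architecture does not define
>>>>> __HAVE_ARCH_TAG_CLEAR_HIGHPAGE:
>>>>>
>>>>> #ifndef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>>>> /* No HW tagging: nothing to clear, and the page is NOT zeroed here. */
>>>>> static inline void tag_clear_highpage(struct page *page)
>>>>> {
>>>>> }
>>>>> #endif
>>>>>
>>>>> So on s390 the loop above writes nothing, yet 'init' is still cleared,
>>>>> leaving the huge zero folio uninitialized.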
>>
>> Hmm, another thing I missed. Sorry about this.
>>
>>>> Which works by the way for our arch (s390).
>>>>
>>>> include/linux/gfp_types.h | 4 ++++
>>>> 1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h
>>>> index 65db9349f905..c12d8a601bb3 100644
>>>> --- a/include/linux/gfp_types.h
>>>> +++ b/include/linux/gfp_types.h
>>>> @@ -85,7 +85,11 @@ enum {
>>>> #define ___GFP_HARDWALL BIT(___GFP_HARDWALL_BIT)
>>>> #define ___GFP_THISNODE BIT(___GFP_THISNODE_BIT)
>>>> #define ___GFP_ACCOUNT BIT(___GFP_ACCOUNT_BIT)
>>>> +#ifdef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>>> #define ___GFP_ZEROTAGS BIT(___GFP_ZEROTAGS_BIT)
>>>> +#else
>>>> +#define ___GFP_ZEROTAGS 0
>>>> +#endif
>>>> #ifdef CONFIG_KASAN_HW_TAGS
>>>> #define ___GFP_SKIP_ZERO BIT(___GFP_SKIP_ZERO_BIT)
>>>> #define ___GFP_SKIP_KASAN BIT(___GFP_SKIP_KASAN_BIT)
>>>>
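>>>> For illustration, with ___GFP_ZEROTAGS defined to 0 the existing test
>>>> in post_alloc_hook() becomes constant false on such architectures, so
>>>> the normal zeroing path is taken instead (sketch of the current
>>>> mainline line, not part of this patch):
>>>>
>>>>         /* (gfp_flags & 0) == 0, so zero_tags is never set without HW tags */
>>>>         bool zero_tags = init && (gfp_flags & __GFP_ZEROTAGS);
>>>>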
>>>> This solution would be sufficient from my side, and I would appreciate a
>>>> quick application if there are no objections.
>>>
>>> As raised, to be sure that __HAVE_ARCH_TAG_CLEAR_HIGHPAGE is always seen
>>> early in that file, it should likely become a CONFIG_ thing.
>>
>> I'm fine with either option above but I'll throw one more in the mix:
>>
>> --------------------8<--------------------------------
>> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
>> index 2312e6ee595f..dcff91533590 100644
>> --- a/arch/arm64/include/asm/page.h
>> +++ b/arch/arm64/include/asm/page.h
>> @@ -33,6 +33,7 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>>                                               unsigned long vaddr);
>> #define vma_alloc_zeroed_movable_folio vma_alloc_zeroed_movable_folio
>>
>> +bool arch_has_tag_clear_highpage(void);
>> void tag_clear_highpage(struct page *to);
>> #define __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>
>> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
>> index 125dfa6c613b..318d091db843 100644
>> --- a/arch/arm64/mm/fault.c
>> +++ b/arch/arm64/mm/fault.c
>> @@ -967,18 +967,13 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>>  	return vma_alloc_folio(flags, 0, vma, vaddr);
>>  }
>>  
>> +bool arch_has_tag_clear_highpage(void)
>> +{
>> +	return system_supports_mte();
>> +}
>> +
>>  void tag_clear_highpage(struct page *page)
>>  {
>> -	/*
>> -	 * Check if MTE is supported and fall back to clear_highpage().
>> -	 * get_huge_zero_folio() unconditionally passes __GFP_ZEROTAGS and
>> -	 * post_alloc_hook() will invoke tag_clear_highpage().
>> -	 */
>> -	if (!system_supports_mte()) {
>> -		clear_highpage(page);
>> -		return;
>> -	}
>> -
>>  	/* Newly allocated page, shouldn't have been tagged yet */
>>  	WARN_ON_ONCE(!try_page_mte_tagging(page));
>>  	mte_zero_clear_page_tags(page_address(page));
>> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
>> index 105cc4c00cc3..7aa56179ccef 100644
>> --- a/include/linux/highmem.h
>> +++ b/include/linux/highmem.h
>> @@ -251,6 +251,11 @@ static inline void clear_highpage_kasan_tagged(struct page *page)
>>
>> #ifndef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
>>
>> +static inline bool arch_has_tag_clear_highpage(void)
>> +{
>> +	return false;
>> +}
>> +
>>  static inline void tag_clear_highpage(struct page *page)
>>  {
>>  }
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index e4efda1158b2..5ab15431bc06 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -1798,7 +1798,8 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
>>  {
>>  	bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags) &&
>>  			!should_skip_init(gfp_flags);
>> -	bool zero_tags = init && (gfp_flags & __GFP_ZEROTAGS);
>> +	bool zero_tags = init && (gfp_flags & __GFP_ZEROTAGS) &&
>> +			 arch_has_tag_clear_highpage();
>>  	int i;
>>  
>>  	set_page_private(page, 0);
>> --------------------8<--------------------------------
>>
>> Reasoning: with MTE on arm64, you can't have KASAN-tagged pages in the
>> kernel which are also exposed to userspace, because the tags are shared
>> (same physical location). The 'zero_tags' initialisation in
>> post_alloc_hook() makes sense for this behaviour. With virtual tagging
>> (briefly announced in [1], full specs not public yet), both user and
>> kernel can have their own tags - more like KASAN_SW_TAGS but without
>> the compiler instrumentation. The kernel won't be able to zero the tags
>> for userspace since they live in virtual space. It can, however,
>> continue to use KASAN tags even if the pages are mapped in userspace.
>> In this case, I'd rather use the kernel_init_pages() call further down
>> in post_alloc_hook() than replicate it in tag_clear_highpage(). When we
>> get to upstreaming virtual tagging (informally vMTE, sometime next
>> year), I'd like to have a kernel image that supports both, so the
>> decision on whether to call tag_clear_highpage() will need to be
>> dynamic.
>>
>> [1] https://developer.arm.com/community/arm-community-blogs/b/architectures-and-processors-blog/posts/future-architecture-technologies-poe2-and-vmte
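>>
>> (For completeness, the path that takes over the zeroing when zero_tags
>> stays false is the generic one further down in post_alloc_hook(),
>> roughly:)
>>
>>         if (init)
>>                 kernel_init_pages(page, 1 << order);
>>
>> i.e. with the arch_has_tag_clear_highpage() check above, non-MTE
>> systems keep init == true and the folio is zeroed here as usual.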
>>
>
> I've run into the issue where, because init is set to false when zero_tags is set,
> the system does not clear the zero folio. I just spent a lot of time debugging it :)
>
> Catalin, were you going to send out this patch as a fix to be included in mm-unstable?
> For now, I've reverted your __GFP_ZEROTAGS change to get_huge_zero_folio() for my testing.
>
> I am on the current mm-new branch.
We have a fix upstream now:
commit 5bebe8de19264946d398ead4e6c20c229454a552
Author: Linus Torvalds <torvalds at linux-foundation.org>
Date:   Tue Nov 18 08:21:27 2025 -0800

    mm/huge_memory: Fix initialization of huge zero folio
Andrew could consider picking it up as well temporarily to fix the issue
until we rebase on top of the new kernel.
--
Cheers
David