[PATCH 2/3] arm64: pageattr: Use walk_page_range_novma() to change memory permissions

Dev Jain dev.jain at arm.com
Fri Jun 6 04:08:18 PDT 2025


On 06/06/25 4:26 pm, Lorenzo Stoakes wrote:
> On Fri, Jun 06, 2025 at 04:09:51PM +0530, Dev Jain wrote:
>> On 06/06/25 3:19 pm, Lorenzo Stoakes wrote:
>>> On Fri, May 30, 2025 at 02:34:06PM +0530, Dev Jain wrote:
>>>> Move away from apply_to_page_range(), which does not honour leaf mappings,
>>>> to walk_page_range_novma(). The callbacks emit a warning and return EINVAL
>>>> if a partial range is detected.
>>> Hm a follow up question here - why not just improve apply_to_page_range() to
>>> honour leaf mappings?
>>>
>>> What does honouring leaf mappings actually mean? You mean handling huge pages?
>> Sorry, I always confuse between block, page and leaf mappings :) I mean to say
>> block mappings, yes, huge pages.
> Sometimes I think we like to give different names to things just to make life
> confusing ;)
>
>>> Would it be all that difficult to implement?
>> That is how I did it initially. But I think we get into the same problem
>> which you are describing w.r.t extending walk_page_range_novma - currently we
>> return EINVAL in case we encounter a block mapping in apply_to_page_range,
>> basically asserting that the callers always operate on page mappings. Removing this
>> assertion and generalizing apply_to_page_range kind of sounds the same as
>> removing the locking assertion and generalizing walk_page_range_novma...
> (Again keep in mind walk_page_range_novma no longer exists :)

Ya I mean the pagewalk API.

>
> Yeah it's problematic I guess in that you have a pte_fn_t and would have to get
> into gross stuff like pretending a PMD entry is a PTE entry etc.

Yes, since the pagewalk API has level callbacks it makes life easier.

>
> Ugh god why do we do this to ourselves.

I know right :)

>
>>> It seems like you're pushing a bunch of the 'applying' logic over from there to
>>> a walker that isn't maybe best suited to it and having to introduce an iffy new
>>> form of locking...
>> IMHO I think it is the reverse. Commit aee16b3cee2746880e40945a9b5bff4f309cfbc4
>> introduces apply_to_page_range, and commit e6473092bd9116583ce9ab8cf1b6570e1aa6fc83
>> introduces pagewalk. The commit messages say that the former is meant to be used
>> on page mappings, while the latter is generic. The latter implies that the former
>> was actually never meant to exist...
> What a mess.
>
> Maybe the least-worst solution is to just add a new
> walk_kernel_page_table_range_unlocked() function without an assert and in the
> comment heavily underline that _you must have made sure this is safe_.
>
> This needs revisting in general, I find the use of init_mm.mmap_lock pretty
> gross.

There you said it! I have been reading kernel mapping code for some time and
the entire point of using the init_mm.mmap_lock has been mutual exclusion,
completely throwing out of the window what the mmap_lock actually means.
For example, we take init_mm write lock for ptdump_walk_pgd(), which does
not sound right to me since the only thing ptdump actually does is walk
the pagetables, yet we take the most restrictive lock.

>>> Can we go vice-versa? :)
>>>
>>> Also obviously walk_page_range_novma() doesn't exist any more :P
>>> walk_kernel_page_table_range() is the preferred solution.
>>>
>>>> Signed-off-by: Dev Jain <dev.jain at arm.com>
>>>> ---
>>>>    arch/arm64/mm/pageattr.c | 69 +++++++++++++++++++++++++++++++++++++---
>>>>    1 file changed, 64 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
>>>> index 39fd1f7ff02a..a5c829c64969 100644
>>>> --- a/arch/arm64/mm/pageattr.c
>>>> +++ b/arch/arm64/mm/pageattr.c
>>>> @@ -8,6 +8,7 @@
>>>>    #include <linux/mem_encrypt.h>
>>>>    #include <linux/sched.h>
>>>>    #include <linux/vmalloc.h>
>>>> +#include <linux/pagewalk.h>
>>>>
>>>>    #include <asm/cacheflush.h>
>>>>    #include <asm/pgtable-prot.h>
>>>> @@ -20,6 +21,67 @@ struct page_change_data {
>>>>    	pgprot_t clear_mask;
>>>>    };
>>>>
>>>> +static pteval_t set_pageattr_masks(unsigned long val, struct mm_walk *walk)
>>>> +{
>>>> +	struct page_change_data *masks = walk->private;
>>>> +	unsigned long new_val = val;
>>>> +
>>>> +	new_val &= ~(pgprot_val(masks->clear_mask));
>>>> +	new_val |= (pgprot_val(masks->set_mask));
>>>> +
>>>> +	return new_val;
>>>> +}
>>>> +
>>>> +static int pageattr_pud_entry(pud_t *pud, unsigned long addr,
>>>> +			      unsigned long next, struct mm_walk *walk)
>>>> +{
>>>> +	pud_t val = pudp_get(pud);
>>>> +
>>>> +	if (pud_leaf(val)) {
>>>> +		if (WARN_ON_ONCE((next - addr) != PUD_SIZE))
>>>> +			return -EINVAL;
>>>> +		val = __pud(set_pageattr_masks(pud_val(val), walk));
>>>> +		set_pud(pud, val);
>>>> +		walk->action = ACTION_CONTINUE;
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int pageattr_pmd_entry(pmd_t *pmd, unsigned long addr,
>>>> +			      unsigned long next, struct mm_walk *walk)
>>>> +{
>>>> +	pmd_t val = pmdp_get(pmd);
>>>> +
>>>> +	if (pmd_leaf(val)) {
>>>> +		if (WARN_ON_ONCE((next - addr) != PMD_SIZE))
>>>> +			return -EINVAL;
>>>> +		val = __pmd(set_pageattr_masks(pmd_val(val), walk));
>>>> +		set_pmd(pmd, val);
>>>> +		walk->action = ACTION_CONTINUE;
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int pageattr_pte_entry(pte_t *pte, unsigned long addr,
>>>> +			      unsigned long next, struct mm_walk *walk)
>>>> +{
>>>> +	pte_t val = ptep_get(pte);
>>>> +
>>>> +	val = __pte(set_pageattr_masks(pte_val(val), walk));
>>>> +	set_pte(pte, val);
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static const struct mm_walk_ops pageattr_ops = {
>>>> +	.pud_entry	= pageattr_pud_entry,
>>>> +	.pmd_entry	= pageattr_pmd_entry,
>>>> +	.pte_entry	= pageattr_pte_entry,
>>>> +	.walk_lock	= PGWALK_NOLOCK,
>>>> +};
>>>> +
>>>>    bool rodata_full __ro_after_init = IS_ENABLED(CONFIG_RODATA_FULL_DEFAULT_ENABLED);
>>>>
>>>>    bool can_set_direct_map(void)
>>>> @@ -49,9 +111,6 @@ static int change_page_range(pte_t *ptep, unsigned long addr, void *data)
>>>>    	return 0;
>>>>    }
>>>>
>>>> -/*
>>>> - * This function assumes that the range is mapped with PAGE_SIZE pages.
>>>> - */
>>>>    static int __change_memory_common(unsigned long start, unsigned long size,
>>>>    				pgprot_t set_mask, pgprot_t clear_mask)
>>>>    {
>>>> @@ -61,8 +120,8 @@ static int __change_memory_common(unsigned long start, unsigned long size,
>>>>    	data.set_mask = set_mask;
>>>>    	data.clear_mask = clear_mask;
>>>>
>>>> -	ret = apply_to_page_range(&init_mm, start, size, change_page_range,
>>>> -					&data);
>>>> +	ret = walk_page_range_novma(&init_mm, start, start + size,
>>>> +				    &pageattr_ops, NULL, &data);
>>>>
>>>>    	/*
>>>>    	 * If the memory is being made valid without changing any other bits
>>>> --
>>>> 2.30.2
>>>>



More information about the linux-arm-kernel mailing list