[PATCH v2 14/14] arm64/mm: Add ptep_get_and_clear_full() to optimize process teardown

Ryan Roberts ryan.roberts at arm.com
Tue Nov 28 03:15:19 PST 2023


On 28/11/2023 07:32, Barry Song wrote:
>> +#define __HAVE_ARCH_PTEP_GET_AND_CLEAR_FULL
>> +static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
>> +				unsigned long addr, pte_t *ptep, int full)
>> +{
>> +	pte_t orig_pte = __ptep_get(ptep);
>> +
>> +	if (!pte_valid_cont(orig_pte) || !full) {
>> +		contpte_try_unfold(mm, addr, ptep, orig_pte);
>> +		return __ptep_get_and_clear(mm, addr, ptep);
>> +	} else
>> +		return contpte_ptep_get_and_clear_full(mm, addr, ptep);
>> +}
>> +
> 
> Hi Ryan,
> 
> I feel quite hard to understand the code. when !pte_valid_cont(orig_pte),
> we will call contpte_try_unfold(mm, addr, ptep, orig_pte);
> 
> but in contpte_try_unfold(), we call unfold only if pte_valid_cont()
> is true:
> static inline void contpte_try_unfold(struct mm_struct *mm, unsigned long addr,
>                                         pte_t *ptep, pte_t pte) 
> {
>         if (contpte_is_enabled(mm) && pte_valid_cont(pte))
>                 __contpte_try_unfold(mm, addr, ptep, pte);
> }
> 
> so do you mean the below?
> 
> if (!pte_valid_cont(orig_pte))
> 	return __ptep_get_and_clear(mm, addr, ptep);
> 
> if (!full) {
> 	contpte_try_unfold(mm, addr, ptep, orig_pte);
> 	return __ptep_get_and_clear(mm, addr, ptep);	
> } else {
> 	return contpte_ptep_get_and_clear_full(mm, addr, ptep);
> }

Yes, this is equivalent. In general, I was trying not to spray `if
(pte_valid_cont(orig_pte))` checks everywhere to guard contpte_try_unfold() and
instead put the checks into contpte_try_unfold() (hence the 'try'). I figured
just calling it unconditionally and letting the compiler optimize as it sees fit
was the cleanest approach.

But in this instance I can see this is confusing. I'll modify as you suggest.
Thanks!

> 
> Thanks
> Barry
> 
> 




More information about the linux-arm-kernel mailing list