[PATCH] KVM: arm64: Consider NUMA affinity when allocating per-CPU stack_page

Marc Zyngier maz at kernel.org
Mon Apr 15 00:36:14 PDT 2024


On Mon, 15 Apr 2024 04:36:14 +0100,
Li RongQing <lirongqing at baidu.com> wrote:
> 
> per-CPU stack_page are dominantly accessed from their own local CPUs,
> so allocate them node-local to improve performance.

Do you have any performance data to back this up?

Given that this is only used in the non-VHE case, and that by doing so
you have left quite a lot of performance on the floor already, I'm
even more surprised to see the performance argument.

Furthermore, you don't address the allocation of per-CPU data, which
has a much larger potential impact, given how the HYP code is
structured.

> 
> Signed-off-by: Li RongQing <lirongqing at baidu.com>
> ---
>  arch/arm64/kvm/arm.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index c4a0a35..d745d01 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -2330,15 +2330,15 @@ static int __init init_hyp_mode(void)
>  	 * Allocate stack pages for Hypervisor-mode
>  	 */
>  	for_each_possible_cpu(cpu) {
> -		unsigned long stack_page;
> +		struct page *page;
>  
> -		stack_page = __get_free_page(GFP_KERNEL);
> -		if (!stack_page) {
> +		page = alloc_pages_node(cpu_to_node(cpu), GFP_KERNEL, 0);
> +		if (!page) {
>  			err = -ENOMEM;
>  			goto out_err;
>  		}
>  
> -		per_cpu(kvm_arm_hyp_stack_page, cpu) = stack_page;
> +		per_cpu(kvm_arm_hyp_stack_page, cpu) = page_address(page);
>  	}
>  
>  	/*

The patch itself looks correct, however partial and lacking some
convincing data.

	M.

-- 
Without deviation from the norm, progress is not possible.



More information about the linux-arm-kernel mailing list