[resend PATCH 2/2] dim: pass dim_sample to net_dim() by reference

Kiyanovski, Arthur akiyano at amazon.com
Thu Oct 31 11:28:19 PDT 2024


> -----Original Message-----
> From: Caleb Sander Mateos <csander at purestorage.com>
> Sent: Wednesday, October 30, 2024 5:23 PM
> 
> net_dim() is currently passed a struct dim_sample argument by value.
> struct dim_sample is 24 bytes. Since this is greater than 16 bytes, x86-64 passes it
> on the stack. All callers have already initialized dim_sample on the stack, so
> passing it by value requires pushing a duplicated copy to the stack. Either
> writing to the stack and immediately reading it, or perhaps dereferencing
> addresses relative to the stack pointer in a chain of push instructions, seems
> to perform quite poorly.
> 
> In a heavy TCP workload, mlx5e_handle_rx_dim() consumes 3% of CPU time,
> 94% of which is attributed to the first push instruction to copy dim_sample on
> the stack for the call to net_dim():
> // Call ktime_get()
>   0.26 |4ead2:   call   4ead7 <mlx5e_handle_rx_dim+0x47>
> // Pass the address of struct dim in %rdi
>        |4ead7:   lea    0x3d0(%rbx),%rdi
> // Set dim_sample.pkt_ctr
>        |4eade:   mov    %r13d,0x8(%rsp)
> // Set dim_sample.byte_ctr
>        |4eae3:   mov    %r12d,0xc(%rsp)
> // Set dim_sample.event_ctr
>   0.15 |4eae8:   mov    %bp,0x10(%rsp)
> // Duplicate dim_sample on the stack
>  94.16 |4eaed:   push   0x10(%rsp)
>   2.79 |4eaf1:   push   0x10(%rsp)
>   0.07 |4eaf5:   push   %rax
> // Call net_dim()
>   0.21 |4eaf6:   call   4eafb <mlx5e_handle_rx_dim+0x6b>
> 
> To allow the caller to reuse the struct dim_sample already on the stack, pass
> the struct dim_sample by reference to net_dim().
> 
> Signed-off-by: Caleb Sander Mateos <csander at purestorage.com>
> ---
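
For readers following the quoted description, here is a minimal user-space
sketch of the two calling styles. It is not the kernel code: the struct name,
the function names and the comp_ctr field are assumptions made for
illustration. The field offsets mirror those visible in the disassembly above
(0x8, 0xc, 0x10), and the 24-byte size is assumed to hold on x86-64 and other
LP64 targets.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct dim_sample (layout is an assumption). */
struct dim_sample_sketch {
	int64_t  time;       /* offset 0x00, stands in for ktime_t */
	uint32_t pkt_ctr;    /* offset 0x08 */
	uint32_t byte_ctr;   /* offset 0x0c */
	uint16_t event_ctr;  /* offset 0x10, followed by 2 bytes of padding */
	uint32_t comp_ctr;   /* offset 0x14, assumed field */
};

/* Larger than 16 bytes, so the x86-64 SysV ABI passes it in memory. */
static_assert(sizeof(struct dim_sample_sketch) == 24,
	      "sketch assumes a 24-byte sample");

/* By value: the caller must build a second copy of the struct on the stack. */
static uint64_t sum_by_value(struct dim_sample_sketch s)
{
	return (uint64_t)s.pkt_ctr + s.byte_ctr + s.event_ctr + s.comp_ctr;
}

/* By reference: the caller passes one pointer to the struct it already has. */
static uint64_t sum_by_ref(const struct dim_sample_sketch *s)
{
	return (uint64_t)s->pkt_ctr + s->byte_ctr + s->event_ctr + s->comp_ctr;
}
```

A caller that has already filled in a local sample would call
sum_by_ref(&sample) rather than sum_by_value(sample). Both return the same
result, but only the by-value call needs the extra stack copy.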

Thank you for this patch.

For the ENA part:

Reviewed-by: Arthur Kiyanovski <akiyano at amazon.com>

Thanks,
Arthur


