[resend PATCH 2/2] dim: pass dim_sample to net_dim() by reference
Kiyanovski, Arthur
akiyano at amazon.com
Thu Oct 31 11:28:19 PDT 2024
> -----Original Message-----
> From: Caleb Sander Mateos <csander at purestorage.com>
> Sent: Wednesday, October 30, 2024 5:23 PM
>
> net_dim() is currently passed a struct dim_sample argument by value.
> struct dim_sample is 24 bytes. Since this is greater than 16 bytes, x86-64 passes it
> on the stack. All callers have already initialized dim_sample on the stack, so
> passing it by value requires pushing a duplicated copy to the stack. Either
> writing to the stack and immediately reading it, or perhaps dereferencing
> addresses relative to the stack pointer in a chain of push instructions, seems
> to perform quite poorly.
>
> In a heavy TCP workload, mlx5e_handle_rx_dim() consumes 3% of CPU time,
> 94% of which is attributed to the first push instruction to copy dim_sample on
> the stack for the call to net_dim():
> // Call ktime_get()
> 0.26 |4ead2: call 4ead7 <mlx5e_handle_rx_dim+0x47>
> // Pass the address of struct dim in %rdi
> |4ead7: lea 0x3d0(%rbx),%rdi
> // Set dim_sample.pkt_ctr
> |4eade: mov %r13d,0x8(%rsp)
> // Set dim_sample.byte_ctr
> |4eae3: mov %r12d,0xc(%rsp)
> // Set dim_sample.event_ctr
> 0.15 |4eae8: mov %bp,0x10(%rsp)
> // Duplicate dim_sample on the stack
> 94.16 |4eaed: push 0x10(%rsp)
> 2.79 |4eaf1: push 0x10(%rsp)
> 0.07 |4eaf5: push %rax
> // Call net_dim()
> 0.21 |4eaf6: call 4eafb <mlx5e_handle_rx_dim+0x6b>
>
> To allow the caller to reuse the struct dim_sample already on the stack, pass
> the struct dim_sample by reference to net_dim().
>
> Signed-off-by: Caleb Sander Mateos <csander at purestorage.com>
> ---
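For readers following the size argument in the commit message, here is a minimal userspace sketch of it. The struct below mirrors struct dim_sample as I understand it from include/linux/dim.h (int64_t stands in for ktime_t); treat the field list as an assumption and check it against your tree. The two sum_* helpers are hypothetical and exist only to contrast the calling conventions; they are not the net_dim() API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of struct dim_sample (assumed layout, see note above). */
struct dim_sample {
	int64_t  time;      /* ktime_t in the kernel */
	uint32_t pkt_ctr;
	uint32_t byte_ctr;
	uint16_t event_ctr; /* followed by 2 bytes of padding */
	uint32_t comp_ctr;
};

/* 24 bytes is above the 16-byte limit for passing a struct in registers
 * under the x86-64 System V ABI, so a by-value argument is copied to the
 * stack at the call site. */
static_assert(sizeof(struct dim_sample) == 24, "expected a 24-byte sample");

/* These offsets match the stores shown in the quoted disassembly. */
static_assert(offsetof(struct dim_sample, pkt_ctr) == 0x8, "pkt_ctr offset");
static_assert(offsetof(struct dim_sample, byte_ctr) == 0xc, "byte_ctr offset");
static_assert(offsetof(struct dim_sample, event_ctr) == 0x10, "event_ctr offset");

/* Hypothetical callee taking the sample by value: the caller must
 * duplicate all 24 bytes for the call. */
uint64_t sum_by_value(struct dim_sample s)
{
	return (uint64_t)s.pkt_ctr + s.byte_ctr + s.event_ctr;
}

/* Hypothetical callee taking the sample by reference: the caller passes
 * one pointer and the callee reads the caller's existing copy. */
uint64_t sum_by_ref(const struct dim_sample *s)
{
	return (uint64_t)s->pkt_ctr + s->byte_ctr + s->event_ctr;
}
```

Both helpers return the same result; the only difference is whether the caller has to copy the struct before the call.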
Thank you for this patch.
For the ENA part:
Reviewed-by: Arthur Kiyanovski <akiyano at amazon.com>
Thanks,
Arthur