[linux-next:master] [randomize_kstack] a96ef5848c: will-it-scale.per_thread_ops 7.7% improvement

kernel test robot oliver.sang at intel.com
Tue Mar 31 01:41:40 PDT 2026



Hello,

kernel test robot noticed a 7.7% improvement of will-it-scale.per_thread_ops on:


commit: a96ef5848cb096226bf6aff31a90d8b136d99b71 ("randomize_kstack: Unify random source across arches")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master


testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E  CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:

	nr_task: 100%
	mode: thread
	test: lseek1
	cpufreq_governor: performance


The commit also has a significant impact on the following test:

+------------------+--------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops 4.7% improvement |
| test parameters  | cpufreq_governor=performance                                 |
|                  | mode=thread                                                  |
|                  | nr_task=100%                                                 |
|                  | test=getppid1                                                |
+------------------+--------------------------------------------------------------+



Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260331/202603311659.6aa92f2c-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-14/performance/x86_64-rhel-9.4/thread/100%/debian-13-x86_64-20250902.cgz/lkp-srf-2sp2/lseek1/will-it-scale

commit: 
  37beb42560 ("randomize_kstack: Maintain kstack_offset per task")
  a96ef5848c ("randomize_kstack: Unify random source across arches")

37beb42560165869 a96ef5848cb096226bf6aff31a9 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 1.474e+09            +7.7%  1.588e+09        will-it-scale.192.threads
   7675604            +7.7%    8270154        will-it-scale.per_thread_ops
 1.474e+09            +7.7%  1.588e+09        will-it-scale.workload
     37.77           -16.4%      31.57        vmstat.cpu.us
     60.95            +6.2       67.17        mpstat.cpu.all.sys%
     38.17            -6.3       31.90        mpstat.cpu.all.usr%
      0.96           +17.7%       1.13        turbostat.IPC
    442.92            +3.6%     459.07        turbostat.PkgWatt
      0.01 ± 22%     -28.9%       0.01 ± 20%  perf-stat.i.MPKI
 1.295e+11           +11.4%  1.442e+11        perf-stat.i.branch-instructions
   7450278 ± 77%    +144.3%   18200516 ± 35%  perf-stat.i.branch-misses
   4646936 ±  5%     -10.0%    4180940 ±  5%  perf-stat.i.cache-references
      1.04           -15.2%       0.88        perf-stat.i.cpi
 5.856e+11           +18.0%  6.907e+11        perf-stat.i.instructions
      0.96           +18.0%       1.13        perf-stat.i.ipc
      0.00 ±  3%     -15.2%       0.00 ±  2%  perf-stat.overall.MPKI
      0.01 ± 77%      +0.0        0.01 ± 35%  perf-stat.overall.branch-miss-rate%
     10.98 ±  5%      +1.3       12.24 ±  5%  perf-stat.overall.cache-miss-rate%
      1.04           -15.3%       0.88        perf-stat.overall.cpi
      0.96           +18.0%       1.13        perf-stat.overall.ipc
    119899            +9.5%     131347        perf-stat.overall.path-length
  1.29e+11           +11.4%  1.437e+11        perf-stat.ps.branch-instructions
   7425104 ± 77%    +144.1%   18126281 ± 35%  perf-stat.ps.branch-misses
   4734855 ±  5%     -10.3%    4248990 ±  5%  perf-stat.ps.cache-references
 5.837e+11           +18.0%  6.885e+11        perf-stat.ps.instructions
 1.767e+14           +18.0%  2.086e+14        perf-stat.total.instructions
      8.77 ±  2%      -8.8        0.00        perf-profile.calltrace.cycles-pp.arch_exit_to_user_mode_prepare.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
     39.72            -8.2       31.51 ±  5%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
     42.72            -8.0       34.73 ±  5%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.llseek
     26.12            -2.5       23.64 ±  4%  perf-profile.calltrace.cycles-pp.__x64_sys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
     13.12 ±  3%      -1.7       11.38 ±  6%  perf-profile.calltrace.cycles-pp.fdget_pos.__x64_sys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
      7.17 ±  3%      -1.4        5.74 ±  7%  perf-profile.calltrace.cycles-pp.__fget_files.fdget_pos.__x64_sys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe
      4.86            -0.3        4.53 ±  4%  perf-profile.calltrace.cycles-pp.mutex_unlock.__x64_sys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
      1.39            +0.3        1.68 ±  7%  perf-profile.calltrace.cycles-pp.lseek@plt
      2.48 ±  2%      +0.5        3.02 ±  8%  perf-profile.calltrace.cycles-pp.testcase
      1.20 ±  5%      +1.2        2.39 ±  8%  perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
      0.00            +1.4        1.40 ±  8%  perf-profile.calltrace.cycles-pp.prandom_u32_state.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
     39.88            +7.5       47.39 ±  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.llseek
      8.77 ±  2%      -8.5        0.29 ±  5%  perf-profile.children.cycles-pp.arch_exit_to_user_mode_prepare
     39.86            -8.2       31.69 ±  5%  perf-profile.children.cycles-pp.do_syscall_64
     42.77            -8.0       34.78 ±  5%  perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     26.26            -2.4       23.83 ±  4%  perf-profile.children.cycles-pp.__x64_sys_lseek
     13.22 ±  3%      -1.8       11.44 ±  6%  perf-profile.children.cycles-pp.fdget_pos
      7.24 ±  3%      -1.4        5.79 ±  7%  perf-profile.children.cycles-pp.__fget_files
     98.29            -0.4       97.92        perf-profile.children.cycles-pp.llseek
      4.92            -0.3        4.59 ±  4%  perf-profile.children.cycles-pp.mutex_unlock
      0.20 ±  2%      -0.0        0.18 ±  2%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.19            -0.0        0.16 ±  2%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      0.18            -0.0        0.15 ±  3%  perf-profile.children.cycles-pp.hrtimer_interrupt
      0.18 ±  2%      -0.0        0.16 ±  2%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      0.10 ±  3%      -0.0        0.08 ±  4%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.10            -0.0        0.08 ±  4%  perf-profile.children.cycles-pp.tick_nohz_handler
      0.08            -0.0        0.06 ±  7%  perf-profile.children.cycles-pp.update_process_times
      0.76 ±  2%      +0.2        0.92 ±  7%  perf-profile.children.cycles-pp.lseek@plt
      2.43 ±  2%      +0.5        2.95 ±  8%  perf-profile.children.cycles-pp.testcase
      1.22 ±  4%      +1.2        2.45 ±  8%  perf-profile.children.cycles-pp.x64_sys_call
      0.00            +1.4        1.40 ±  8%  perf-profile.children.cycles-pp.prandom_u32_state
     26.94            +4.7       31.60 ±  4%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      8.72 ±  2%      -8.5        0.23 ±  5%  perf-profile.self.cycles-pp.arch_exit_to_user_mode_prepare
      7.18 ±  3%      -1.4        5.78 ±  7%  perf-profile.self.cycles-pp.__fget_files
      4.84            -0.3        4.52 ±  4%  perf-profile.self.cycles-pp.mutex_unlock
      1.07 ±  3%      -0.1        0.94 ±  7%  perf-profile.self.cycles-pp.fdget_pos
      0.06 ±  7%      +0.0        0.08 ± 10%  perf-profile.self.cycles-pp.lseek@plt
      3.62            +0.2        3.83 ±  4%  perf-profile.self.cycles-pp.do_syscall_64
      1.68 ±  2%      +0.4        2.05 ±  8%  perf-profile.self.cycles-pp.testcase
      1.16 ±  5%      +1.3        2.41 ±  8%  perf-profile.self.cycles-pp.x64_sys_call
      0.00            +1.3        1.33 ±  8%  perf-profile.self.cycles-pp.prandom_u32_state
     13.18 ±  2%      +1.7       14.89 ±  6%  perf-profile.self.cycles-pp.entry_SYSCALL_64
     18.28            +3.9       22.16 ±  5%  perf-profile.self.cycles-pp.llseek
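
As a reading aid for the table above, the following minimal sketch recomputes two of the reported figures from the raw columns. It assumes that %change is the relative difference between the two commits and that perf-stat.overall.path-length is total instructions divided by will-it-scale.workload; the second assumption is inferred from the numbers in this report, not from a stated definition. The helper name pct_change is hypothetical.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0

# will-it-scale.per_thread_ops for 37beb42560 (base) and a96ef5848c (new)
base_ops, new_ops = 7675604, 8270154
ops_change = pct_change(base_ops, new_ops)

# Assumption: path-length = perf-stat.total.instructions / will-it-scale.workload
base_path = 1.767e14 / 1.474e9
new_path = 2.086e14 / 1.588e9

print(f"per_thread_ops change: {ops_change:+.1f}%")
print(f"instructions per operation: {base_path:.0f} -> {new_path:.0f}")
```

Under these assumptions the recomputed values agree with the reported +7.7% and with the reported path-length values to within rounding of the table inputs. Rows whose change column has no % sign (mpstat and perf-profile) are absolute differences in percentage points, not relative changes.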


***************************************************************************************************

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-14/performance/x86_64-rhel-9.4/thread/100%/debian-13-x86_64-20250902.cgz/lkp-srf-2sp2/getppid1/will-it-scale

commit: 
  37beb42560 ("randomize_kstack: Maintain kstack_offset per task")
  a96ef5848c ("randomize_kstack: Unify random source across arches")

37beb42560165869 a96ef5848cb096226bf6aff31a9 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 1.987e+09            +4.7%  2.079e+09        will-it-scale.192.threads
  10346487            +4.7%   10828131        will-it-scale.per_thread_ops
 1.987e+09            +4.7%  2.079e+09        will-it-scale.workload
      0.79           +20.3%       0.95        turbostat.IPC
     53.28            +3.8       57.11        mpstat.cpu.all.sys%
     45.85            -3.8       42.01        mpstat.cpu.all.usr%
 1.111e+11           +10.2%  1.225e+11        perf-stat.i.branch-instructions
   4803948 ±  2%      +9.6%    5267473 ±  5%  perf-stat.i.cache-references
      1.27           -17.3%       1.05        perf-stat.i.cpi
 4.821e+11           +21.0%  5.833e+11        perf-stat.i.instructions
      0.79           +20.9%       0.95        perf-stat.i.ipc
      0.00 ±  4%     -18.4%       0.00 ±  3%  perf-stat.overall.MPKI
      0.01 ± 60%      -0.0        0.00        perf-stat.overall.branch-miss-rate%
      1.27           -17.3%       1.05        perf-stat.overall.cpi
      0.79           +21.0%       0.95        perf-stat.overall.ipc
     73248           +15.6%      84666        perf-stat.overall.path-length
 1.107e+11           +10.2%  1.221e+11        perf-stat.ps.branch-instructions
   4903095 ±  2%      +9.2%    5356604 ±  4%  perf-stat.ps.cache-references
 4.806e+11           +21.0%  5.814e+11        perf-stat.ps.instructions
 1.455e+14           +21.0%   1.76e+14        perf-stat.total.instructions
      4.78 ± 16%      -2.7        2.06 ±  5%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.getppid
      5.09 ±  9%      -2.5        2.58 ± 10%  perf-profile.calltrace.cycles-pp.__task_pid_nr_ns.__x64_sys_getppid.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
      5.69 ±  8%      -2.5        3.18 ± 10%  perf-profile.calltrace.cycles-pp.__x64_sys_getppid.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
      6.88 ±  3%      -2.2        4.72 ± 16%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.getppid
      1.36 ±  2%      +1.0        2.38 ± 13%  perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
      0.00            +1.5        1.50 ± 15%  perf-profile.calltrace.cycles-pp.getppid@plt
      0.00            +1.9        1.88 ± 14%  perf-profile.calltrace.cycles-pp.prandom_u32_state.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
      5.46 ± 17%      +2.1        7.56 ±  5%  perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.getppid
     10.78 ±  5%      +3.8       14.62 ± 10%  perf-profile.calltrace.cycles-pp.testcase
     44.13 ±  7%      -5.9       38.20 ±  5%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      4.84 ± 16%      -2.7        2.16 ±  6%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      5.85 ±  9%      -2.6        3.28 ± 11%  perf-profile.children.cycles-pp.__x64_sys_getppid
      5.17 ±  9%      -2.5        2.64 ± 11%  perf-profile.children.cycles-pp.__task_pid_nr_ns
      6.04 ±  3%      -1.8        4.22 ± 15%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      2.12 ± 10%      -1.7        0.38 ± 12%  perf-profile.children.cycles-pp.arch_exit_to_user_mode_prepare
     98.57            -0.6       97.99        perf-profile.children.cycles-pp.getppid
      0.51 ±  7%      +0.3        0.83 ± 14%  perf-profile.children.cycles-pp.getppid@plt
      1.44 ±  2%      +1.0        2.42 ± 13%  perf-profile.children.cycles-pp.x64_sys_call
      0.00            +1.9        1.88 ± 14%  perf-profile.children.cycles-pp.prandom_u32_state
      6.32 ±  5%      +2.4        8.68 ± 11%  perf-profile.children.cycles-pp.testcase
     20.95 ±  9%     +10.1       31.04 ±  9%  perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
     23.16 ±  6%      -7.0       16.11 ± 16%  perf-profile.self.cycles-pp.entry_SYSCALL_64
      4.79 ± 16%      -2.6        2.16 ±  6%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      5.09 ±  9%      -2.5        2.60 ± 11%  perf-profile.self.cycles-pp.__task_pid_nr_ns
      2.04 ± 10%      -1.7        0.33 ±  7%  perf-profile.self.cycles-pp.arch_exit_to_user_mode_prepare
      4.56 ±  3%      -1.3        3.25 ± 15%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      4.30 ±  2%      -1.1        3.19 ± 15%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      1.40 ±  7%      +0.6        1.98 ± 12%  perf-profile.self.cycles-pp.testcase
      1.35 ±  2%      +1.0        2.35 ± 13%  perf-profile.self.cycles-pp.x64_sys_call
      0.00            +1.7        1.68 ± 15%  perf-profile.self.cycles-pp.prandom_u32_state
     20.89 ±  9%     +10.1       30.94 ± 10%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
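
The getppid1 table shows throughput and instructions per operation both rising. A short consistency check, assuming (as the numbers suggest, though the report does not state it) that path-length is instructions per operation, is that the two ratios should multiply out to the ratio of total instructions. All variable names below are illustrative.

```python
# Values copied from the getppid1 table (base commit, then new commit)
workload = (1.987e9, 2.079e9)            # will-it-scale.workload
path_length = (73248, 84666)             # perf-stat.overall.path-length
total_instructions = (1.455e14, 1.76e14) # perf-stat.total.instructions

ops_ratio = workload[1] / workload[0]
path_ratio = path_length[1] / path_length[0]
instr_ratio = total_instructions[1] / total_instructions[0]

# If path-length is instructions per operation, then
# instr_ratio should be close to ops_ratio * path_ratio.
combined = ops_ratio * path_ratio

print(f"ops ratio:               {ops_ratio:.3f}")
print(f"path-length ratio:       {path_ratio:.3f}")
print(f"ops x path-length:       {combined:.3f}")
print(f"total instruction ratio: {instr_ratio:.3f}")
```

The product comes out at roughly 1.21, in line with the reported +21.0% in perf-stat.total.instructions. Read this way, the new commit executes more instructions per syscall but retires them at a higher IPC (0.79 to 0.95), which is consistent with the net throughput gain.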





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
