[linus:master] [nvme] 63dfa10043: fsmark.files_per_sec 6.4% improvement

kernel test robot oliver.sang at intel.com
Fri Mar 15 01:21:13 PDT 2024



Hello,

kernel test robot noticed a 6.4% improvement of fsmark.files_per_sec on:


commit: 63dfa1004322d596417f23da43cdc43cf6298c71 ("nvme: move NVME_QUIRK_DEALLOCATE_ZEROES out of nvme_config_discard")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: fsmark
test machine: 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory
parameters:

	iterations: 8
	disk: 1SSD
	nr_threads: 16
	fs: ext4
	filesize: 8K
	test_size: 75G
	sync_method: fsyncBeforeClose
	nr_directories: 16d
	nr_files_per_directory: 256fpd
	cpufreq_governor: performance






Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240315/202403151552.e3809b61-oliver.sang@intel.com

=========================================================================================
compiler/cpufreq_governor/disk/filesize/fs/iterations/kconfig/nr_directories/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase:
  gcc-12/performance/1SSD/8K/ext4/8/x86_64-rhel-8.3/16d/256fpd/16/debian-12-x86_64-20240206.cgz/fsyncBeforeClose/lkp-csl-2sp3/75G/fsmark

commit: 
  152694c829 ("nvme: set max_hw_sectors unconditionally")
  63dfa10043 ("nvme: move NVME_QUIRK_DEALLOCATE_ZEROES out of nvme_config_discard")

152694c82950a093 63dfa1004322d596417f23da43c 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    492322 ±  8%     +15.1%     566574 ±  2%  meminfo.Active(anon)
    501325 ±  8%     +15.0%     576573 ±  2%  meminfo.Shmem
    458144 ± 18%     +22.6%     561659 ±  2%  numa-meminfo.node1.Active(anon)
    462634 ± 18%     +22.6%     567357 ±  2%  numa-meminfo.node1.Shmem
    114517 ± 18%     +22.6%     140395 ±  2%  numa-vmstat.node1.nr_active_anon
    115654 ± 18%     +22.6%     141838 ±  2%  numa-vmstat.node1.nr_shmem
    114517 ± 18%     +22.6%     140395 ±  2%  numa-vmstat.node1.nr_zone_active_anon
    396.50          +745.6%       3353 ±181%  vmstat.memory.buff
    201414            +6.0%     213473        vmstat.system.cs
     57760            +5.4%      60879        vmstat.system.in
     22022 ±  2%      +6.4%      23432        fsmark.files_per_sec
    502.56            -5.9%     472.94        fsmark.time.elapsed_time
    502.56            -5.9%     472.94        fsmark.time.elapsed_time.max
    243.62 ±  2%      +5.0%     255.75        fsmark.time.percent_of_cpu_this_job_got
    123079 ±  8%     +15.1%     141624 ±  2%  proc-vmstat.nr_active_anon
      8462            +2.1%       8637        proc-vmstat.nr_mapped
    125342 ±  8%     +15.0%     144138 ±  2%  proc-vmstat.nr_shmem
    123079 ±  8%     +15.1%     141624 ±  2%  proc-vmstat.nr_zone_active_anon
    140970 ±  7%     +14.1%     160889 ±  2%  proc-vmstat.pgactivate
 3.617e+08            -3.7%  3.483e+08        proc-vmstat.pgpgout
      2.10 ±  9%      -0.2        1.85 ±  3%  perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
      2.08 ±  9%      -0.2        1.84 ±  3%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
      0.99 ± 20%      +0.4        1.37 ± 11%  perf-profile.calltrace.cycles-pp.jbd2__journal_start.ext4_do_writepages.ext4_writepages.do_writepages.filemap_fdatawrite_wbc
      0.50 ± 60%      +0.4        0.89 ± 14%  perf-profile.calltrace.cycles-pp.add_transaction_credits.start_this_handle.jbd2__journal_start.ext4_do_writepages.ext4_writepages
      2.50 ± 10%      -0.3        2.20 ±  3%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      2.48 ± 10%      -0.3        2.19 ±  3%  perf-profile.children.cycles-pp.hrtimer_interrupt
      0.24 ±  6%      +0.0        0.27 ±  6%  perf-profile.children.cycles-pp.ext4_dirty_inode
      0.19 ± 11%      +0.1        0.24 ±  6%  perf-profile.children.cycles-pp.ext4_block_bitmap_csum_set
 1.107e+09            +6.6%   1.18e+09        perf-stat.i.branch-instructions
    202521            +6.1%     214902        perf-stat.i.context-switches
 1.322e+10 ±  2%      +6.7%   1.41e+10        perf-stat.i.cpu-cycles
  5.46e+09            +6.6%  5.818e+09        perf-stat.i.instructions
      2.11            +6.2%       2.24        perf-stat.i.metric.K/sec
 1.105e+09            +6.6%  1.178e+09        perf-stat.ps.branch-instructions
    202013            +6.1%     214333        perf-stat.ps.context-switches
 1.319e+10 ±  2%      +6.7%  1.407e+10        perf-stat.ps.cpu-cycles
 5.448e+09            +6.5%  5.805e+09        perf-stat.ps.instructions
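
The %change column in the table above can be re-derived from the two per-commit
means. The following is a minimal sketch, not part of the robot's tooling: the
helper name pct_change is hypothetical, and the formula (new - base) / base is
an assumption that happens to match the reported figures for the rows checked.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


# (base, new) means copied from the comparison table:
# base = 152694c829, new = 63dfa10043
rows = {
    "fsmark.files_per_sec": (22022, 23432),
    "fsmark.time.elapsed_time": (502.56, 472.94),
    "vmstat.system.cs": (201414, 213473),
}

for name, (base, new) in rows.items():
    # One decimal place, with an explicit sign, as in the report
    print(f"{pct_change(base, new):+6.1f}%  {name}")
```

The perf-profile rows do not seem to follow this formula: their change column
appears to be an absolute difference in percentage points (for example 0.99 to
1.37 is shown as +0.4), so the helper above should not be applied to them.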




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



