[v6 0/9] Improve RISC-V Perf support using SBI PMU and sscofpmf extension

Nikita Shubin nikita.shubin at maquefel.me
Fri Mar 11 23:51:28 PST 2022


Hello Atish, thank you for your patches.

Tested your series on top of v5.17-rc7, both on Qemu and HiFive
Unmatched.

Got similar results, everything is working fine.

HiFive Unmatched:
=================

perf stat -e cycles -e instructions -e L1-icache-load-misses -e branches -e branch-misses \
-e r0000000000000200 -e r0000000000000400 \
-e r0000000000000800 perf bench sched messaging -g 25 -l 15

# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 25 groups == 1000 processes run

     Total time: 0.780 [sec]

 Performance counter stats for 'perf bench sched messaging -g 25 -l 15':

        3046119040      cycles                                                (68.35%)
        1117366934      instructions            #    0.37  insn per cycle     (77.53%)
                 0      L1-icache-load-misses                                 (72.86%)
         149748628      branches                                              (69.44%)
          40937064      branch-misses           #   27.34% of all branches    (35.17%)
         175959749      r0000000000000200                                     (43.85%)
         117885368      r0000000000000400                                     (45.45%)
           6396494      r0000000000000800                                     (49.92%)

       1.714602000 seconds time elapsed

       1.091126000 seconds user
       3.391938000 seconds sys
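
As a rough cross-check of "similar results", the derived figures perf
prints (insn per cycle, branch-miss percentage) can be recomputed from
the counts of both HiFive Unmatched runs in this thread. This is only a
sketch: the dictionary keys and the function name are made up for
illustration, and the counts are the multiplexing-scaled values perf
reported, not exact hardware totals.

```python
# Counts copied from the two HiFive Unmatched runs in this thread.
# The run labels and the helper name are illustrative only.
RUNS = {
    "tested on v5.17-rc7": {
        "cycles": 3046119040,
        "instructions": 1117366934,
        "branches": 149748628,
        "branch-misses": 40937064,
    },
    "cover letter": {
        "cycles": 3426710073,
        "instructions": 1348772808,
        "branches": 201133996,
        "branch-misses": 44663584,
    },
}


def derived_metrics(counts):
    """Return (instructions per cycle, branch-miss percentage)."""
    ipc = counts["instructions"] / counts["cycles"]
    miss_pct = 100.0 * counts["branch-misses"] / counts["branches"]
    return ipc, miss_pct


if __name__ == "__main__":
    for name, counts in RUNS.items():
        ipc, miss_pct = derived_metrics(counts)
        print(f"{name}: {ipc:.2f} insn per cycle, "
              f"{miss_pct:.2f}% of all branches missed")
```

Both runs land at a bit under 0.4 insn per cycle, which is what I would
call similar for two separate runs of a scheduler benchmark.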

The series was tested on top of the "Provide a framework for RISC-V ISA
extensions" series:

https://patchwork.kernel.org/project/linux-riscv/list/?series=616887

with Mario's "Introduce pmu-events support for HiFive Unmatched" series
applied:

https://patchwork.kernel.org/project/linux-riscv/list/?series=581071

With the "riscv: Fix fill_callchain return value" patch I was even able
to produce some flame graphs with "perf record".

"perf list pmu" shows exactly the mapped u740 events, so

Tested-by: Nikita Shubin <n.shubin at yadro.com>

This applies to all the series mentioned above.

Yours, 
Nikita Shubin

> This series adds improved perf support for RISC-V based system using
> SBI PMU extension[1] and Sscofpmf extension[2]. The SBI PMU extension
> allows the kernel to program the counters for different events and
> start/stop counters while the sscofpmf extension allows the counter
> overflow interrupt and privilege mode filtering. A hardware platform
> can leverage SBI PMU extension without the sscofpmf extension if it
> supports mcounteren at least. Perf stat will work but record won't
> work as sscofpmf & mcountinhibit are required to support that. A
> platform can support both features, event counting and sampling, using
> the perf tool only if sscofpmf is supported.
> 
> This series introduces a platform perf driver instead of the existing
> arch specific implementation. The new perf implementation has adopted
> a modular approach where most of the generic event handling is done
> in the core library while individual PMUs need to only implement
> necessary features specific to the PMU. This is easily extensible and
> any future RISC-V PMU implementation can leverage this. Currently,
> SBI PMU driver & legacy PMU driver are implemented as a part of this
> series.
> 
> The legacy driver tries to reimplement the existing minimal perf
> under a new config to maintain backward compatibility. This
> implementation only allows monitoring of always running
> cycle/instruction counters. Moreover, they can not be started or
> stopped. In general, this is very limited and not very useful. That's
> why, I am not very keen to carry the support into the new driver.
> However, I don't want to break perf for any existing hardware
> platforms. If everybody agrees that we don't need legacy perf
> implementation for older implementation, I will be happy to drop
> PATCH 4.
> 
> This series has been tested in Qemu (RV64 & RV32) and HiFive
> Unmatched. Qemu patches[4] and OpenSBI v1.0 are required to test it on
> Qemu and a DT patch is required in U-Boot[5] for HiFive Unmatched. Qemu
> changes are not backward compatible. That means, you can not use perf
> anymore on older Qemu versions with latest OpenSBI and/or Kernel.
> However, newer kernel will just use legacy pmu driver if old OpenSBI
> is detected.
> 
> The U-Boot patch is just an example that encodes a few of the events
> defined in fu740 documentation [6] in the DT. We can update the DT to
> include all the events defined if required.
> 
> This series depends on the ISA extension parsing series[7].
> 
> Here is an output of perf stat/report while running perf benchmark
> with OpenSBI, Linux kernel and U-Boot patches applied.
> 
> HiFive Unmatched:
> =================
> perf stat -e cycles -e instructions -e L1-icache-load-misses -e branches -e branch-misses \
> -e r0000000000000200 -e r0000000000000400 \
> -e r0000000000000800 perf bench sched messaging -g 25 -l 15
> 
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 25 groups == 1000 processes run
> 
>      Total time: 0.826 [sec]
> 
>  Performance counter stats for 'perf bench sched messaging -g 25 -l 15':
> 
>         3426710073      cycles                (65.92%)
>         1348772808      instructions          #0.39  insn per cycle    (75.44%)
>                  0      L1-icache-load-misses (72.28%)
>          201133996      branches              (67.88%)
>           44663584      branch-misses         #22.21% of all branches  (35.01%)
>          248194747      r0000000000000200     (41.94%) --> Integer load instruction retired
>          156879950      r0000000000000400     (43.58%) --> Integer store instruction retired
>            6988678      r0000000000000800     (47.91%) --> Atomic memory operation retired
> 
>        1.931335000 seconds time elapsed
> 
>        1.100415000 seconds user
>        3.755176000 seconds sys
> 
> 
> QEMU:
> =========
> Perf stat:
> =========
> 
> [root at fedora-riscv riscv]# perf stat -e r8000000000000005 -e r8000000000000007 \
> -e r8000000000000006 -e r0000000000020002 -e r0000000000020004 -e branch-misses \
> -e cache-misses -e dTLB-load-misses -e dTLB-store-misses -e iTLB-load-misses \
> -e cycles -e instructions perf bench sched messaging -g 15 -l 10 \
> Running with 15*40 (== 600) tasks.
> Time: 6.578
> 
>  Performance counter stats for './hackbench -pipe 15 process':
> 
>              1,794      r8000000000000005      (52.59%) --> SBI_PMU_FW_SET_TIMER
>              2,859      r8000000000000007      (60.74%) --> SBI_PMU_FW_IPI_RECVD
>              4,205      r8000000000000006      (68.71%) --> SBI_PMU_FW_IPI_SENT
>                  0      r0000000000020002      (81.69%)
>      <not counted>      r0000000000020004      (0.00%)
>      <not counted>      branch-misses          (0.00%)
>      <not counted>      cache-misses           (0.00%)
>          7,878,328      dTLB-load-misses       (15.60%)
>            680,270      dTLB-store-misses      (28.45%)
>          8,287,931      iTLB-load-misses       (39.24%)
>     20,008,506,675      cycles                 (48.60%)
>     21,484,427,932      instructions   # 1.07  insn per cycle (56.60%)
> 
>        1.681344735 seconds time elapsed
> 
>        0.614460000 seconds user
>        8.313254000 seconds sys
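
The annotated raw codes in the quoted output above suggest a simple
pattern: the codes starting with 8 have the top bit (bit 63) set and are
labelled as SBI firmware events, with the low bits giving the event
number. A small sketch of decoding them follows. Treating bit 63 as the
firmware marker is an assumption read off those annotations, not
something stated in the cover letter, and the function name is made up;
only the three names annotated in the output are included.

```python
# Assumption inferred from the annotated codes: bit 63 set means the raw
# event is an SBI firmware event rather than a hardware event.
FIRMWARE_FLAG = 1 << 63

# Names exactly as annotated in the quoted perf stat output.
FIRMWARE_EVENT_NAMES = {
    5: "SBI_PMU_FW_SET_TIMER",
    6: "SBI_PMU_FW_IPI_SENT",
    7: "SBI_PMU_FW_IPI_RECVD",
}


def decode_raw_event(spec):
    """Decode a perf raw event string such as 'r8000000000000005'.

    Returns a (kind, description) tuple.
    """
    if not spec.startswith("r"):
        raise ValueError(f"not a raw event spec: {spec!r}")
    config = int(spec[1:], 16)
    if config & FIRMWARE_FLAG:
        code = config & ~FIRMWARE_FLAG
        name = FIRMWARE_EVENT_NAMES.get(code, f"unknown firmware event {code}")
        return "firmware", name
    # Hardware raw events are platform specific; leave them undecoded.
    return "hardware", f"raw config {config:#x}"


if __name__ == "__main__":
    for spec in ("r8000000000000005", "r8000000000000007",
                 "r8000000000000006", "r0000000000000200"):
        print(spec, "->", decode_raw_event(spec))
```

Hardware raw codes such as r0000000000000200 are left undecoded on
purpose, since their meaning depends on the platform documentation.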
> 
> 
> [root at fedora-riscv ~]# perf stat -e cycles -e instructions -e
> dTLB-load-misses -e dTLB-store-misses -e iTLB-load-misses \
> > perf bench sched messaging -g 1 -l 10  
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 1 groups == 40 processes run
> 
>      Total time: 0.218 [sec]
> 
>  Performance counter stats for 'perf bench sched messaging -g 1 -l 10':
> 
>      3,685,401,394      cycles
>      3,684,529,388      instructions              #    1.00  insn per cycle
>          3,006,042      dTLB-load-misses
>            258,144      dTLB-store-misses
>          1,992,860      iTLB-load-misses
>      
> 
>        0.588717389 seconds time elapsed
> 
>        0.324009000 seconds user
>        0.937087000 seconds sys
> 
> [root at fedora-riscv ~]# perf record -e cycles -e instructions -e dTLB-load-misses -e dTLB-store-misses \
> -e iTLB-load-misses -c 10000 perf bench sched messaging -g 1 -l 10
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 1 groups == 40 processes run
> 
>      Total time: 2.160 [sec]
> [ perf record: Woken up 11 times to write data ]
> Warning:
> Processed 291769 events and lost 1 chunks!
> 
> [root at fedora-riscv ~]# perf report
> 
> Available samples
> 146K cycles
> 146K instructions
> 298 dTLB-load-misses
> 8 dTLB-store-misses
> 211 iTLB-load-misses
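
Since the quoted perf record run used a fixed sample period (-c 10000),
each sample stands for roughly 10000 events, so the sample counts above
can be turned into approximate event counts and compared with the
earlier perf stat run of the same benchmark. The sketch below does that
for the TLB events only, because their sample counts are exact (146K is
rounded). The names are made up for illustration, and the two numbers
come from separate runs (one of which lost a chunk), so only rough
agreement should be expected.

```python
SAMPLE_PERIOD = 10000  # from "-c 10000" in the quoted perf record command

# Sample counts from the quoted "perf report" summary.
SAMPLES = {
    "dTLB-load-misses": 298,
    "dTLB-store-misses": 8,
    "iTLB-load-misses": 211,
}

# Counts from the quoted "perf stat ... -g 1 -l 10" run, for comparison.
STAT_COUNTS = {
    "dTLB-load-misses": 3006042,
    "dTLB-store-misses": 258144,
    "iTLB-load-misses": 1992860,
}


def estimate_count(samples, period=SAMPLE_PERIOD):
    """Approximate event count: one sample is taken every `period` events."""
    return samples * period


if __name__ == "__main__":
    for event, samples in SAMPLES.items():
        estimate = estimate_count(samples)
        print(f"{event}: ~{estimate} from samples, "
              f"{STAT_COUNTS[event]} from perf stat")
```

The dTLB-load-misses and iTLB-load-misses estimates come out close to
the perf stat counts, which is a reasonable sign that sampling behaves
as expected.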
> 
> [1] https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/riscv-sbi.adoc
> [2] https://drive.google.com/file/d/171j4jFjIkKdj5LWcExphq4xG_2sihbfd/edit
> [3] https://github.com/atishp04/linux/tree/riscv_pmu_v6
> [4] https://github.com/atishp04/qemu/tree/riscv_pmu_v5
> [5] https://github.com/atishp04/u-boot/tree/hifive_unmatched_dt_pmu
> [6] https://sifive.cdn.prismic.io/sifive/de1491e5-077c-461d-9605-e8a0ce57337d_fu740-c000-manual-v1p3.pdf
> [7] https://lkml.org/lkml/2022/2/15/1604
> 
> Changes from v5->v6:
> 1. Split the used counters bitmap to firmware and hardware so that we
> don't have to generate a hardware only used counter mask during
> overflow handling.
> 2. Stopped all the counters during cpu hotplug to restore correct
> behavior.
> 3. Addressed all the comments during the v5 review.
> 4. Rebased on top of the isa extension framework series and 5.17-rc4.
> 
> Changes from v4->v5:
> 1. Fixed few corner case issues in perf interrupt handling.
> 2. Changed the set_period API so that the caller can compute the
> initial value.
> 3. Fixed the per cpu interrupt enablement issue.
> 4. Fixed a bug for the privilege mode filtering.
> 5. Modified the sbi driver independent of the DT.
> 6. Removed any DT related modifications.
> 
> Changes from v3->v4:
> 1. Do not proceed with the overflow handler if the event is not set
> for sampling.
> 2. The overflow status register is only read after counters are stopped.
> 3. Added the PMU DT node for HiFive Unmatched.
> 
> Changes from v2->v3:
> 1. Added interrupt overflow support.
> 2. Cleaned up legacy driver initialization.
> 3. Supports perf record now.
> 4. Added the DT binding and maintainers file.
> 5. Changed cpu hotplug notifier to be multi-state.
> 6. OpenSBI doesn't disable cycle/instret counter during boot. Update
> the perf code to disable all the counters during the boot.
> 
> Changes from v1->v2
> 1. Implemented the latest SBI PMU extension specification.
> 2. The core platform driver was changed to operate as a library while
> only sbi based PMU is built as a driver. The legacy one is just a
> fallback if SBI PMU extension is not available.
> 
> Atish Patra (9):
> RISC-V: Remove the current perf implementation
> RISC-V: Add CSR encodings for all HPMCOUNTERS
> RISC-V: Add a perf core library for pmu drivers
> RISC-V: Add a simple platform driver for RISC-V legacy perf
> RISC-V: Add RISC-V SBI PMU extension definitions
> RISC-V: Add perf platform driver based on SBI PMU extension
> RISC-V: Add sscofpmf extension support
> Documentation: riscv: Remove the old documentation
> MAINTAINERS: Add entry for RISC-V PMU drivers
> 
> Documentation/riscv/pmu.rst         | 255 ---------
> MAINTAINERS                         |   9 +
> arch/riscv/Kconfig                  |  13 -
> arch/riscv/include/asm/csr.h        |  66 ++-
> arch/riscv/include/asm/hwcap.h      |   1 +
> arch/riscv/include/asm/perf_event.h |  72 ---
> arch/riscv/include/asm/sbi.h        |  95 ++++
> arch/riscv/kernel/Makefile          |   1 -
> arch/riscv/kernel/cpu.c             |   1 +
> arch/riscv/kernel/cpufeature.c      |   2 +
> arch/riscv/kernel/perf_event.c      | 485 -----------------
> drivers/perf/Kconfig                |  30 ++
> drivers/perf/Makefile               |   3 +
> drivers/perf/riscv_pmu.c            | 324 ++++++++++++
> drivers/perf/riscv_pmu_legacy.c     | 142 +++++
> drivers/perf/riscv_pmu_sbi.c        | 790 ++++++++++++++++++++++++++++
> include/linux/cpuhotplug.h          |   1 +
> include/linux/perf/riscv_pmu.h      |  75 +++
> 18 files changed, 1538 insertions(+), 827 deletions(-)
> delete mode 100644 Documentation/riscv/pmu.rst
> delete mode 100644 arch/riscv/kernel/perf_event.c
> create mode 100644 drivers/perf/riscv_pmu.c
> create mode 100644 drivers/perf/riscv_pmu_legacy.c
> create mode 100644 drivers/perf/riscv_pmu_sbi.c
> create mode 100644 include/linux/perf/riscv_pmu.h
> 
> --
> 2.30.2
> 
> 
> _______________________________________________
> linux-riscv mailing list
> linux-riscv at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv



