[PATCH v4 0/2] Make sysFS functional on topologies with per core sink

Linu Cherian linuc.decode at gmail.com
Thu Nov 12 03:57:52 EST 2020


Hi Suzuki,

On Tue, Nov 10, 2020 at 8:27 PM Suzuki K Poulose <suzuki.poulose at arm.com> wrote:
>
> Hi Linu
>
> On 11/10/20 12:57 PM, Linu Cherian wrote:
> > Hi Suzuki,
> >
> ...
>
> >> We are facing some issues while trying out perf. This doesn't appear
> >> to be related to your patch though. Will share the details once we
> >> do some initial analysis on it.
> >>
> >> Thanks.
> >
> > # ./perf record -vvv -e cs_etm// --per-thread uname -a
> > Using CPUID 0x00000000430f0b40
> > Attempting to add event pmu 'cs_etm' with '' that may result in non-fatal errors
> > nr_cblocks: 0
> > affinity: SYS
> > mmap flush: 1
> > comp level: 0
> > maps__set_modules_path_dir: cannot open
> > /lib/modules/5.9.0-rc5-00116-g91c9ea890e1a dir
> > Problems setting modules path maps, continuing anyway...
> > ------------------------------------------------------------
> > perf_event_attr:
> >    type                             8
> >    size                             120
> >    { sample_period, sample_freq }   1
> >    sample_type                      IP|TID|IDENTIFIER
> >    read_format                      ID
> >    disabled                         1
> >    enable_on_exec                   1
> >    sample_id_all                    1
> >    exclude_guest                    1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid 3333  cpu -1  group_fd -1  flags 0x8 = 5
> > ------------------------------------------------------------
> > perf_event_attr:
> >    type                             1
> >    size                             120
> >    config                           0x9
> >    { sample_period, sample_freq }   1
> >    sample_type                      IP|TID|IDENTIFIER
> >    read_format                      ID
> >    disabled                         1
> >    exclude_kernel                   1
> >    exclude_hv                       1
> >    mmap                             1
> >    comm                             1
> >    enable_on_exec                   1
> >    task                             1
> >    sample_id_all                    1
> >    exclude_guest                    1
> >    mmap2                            1
> >    comm_exec                        1
> >    context_switch                   1
> >    ksymbol                          1
> >    bpf_event                        1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid 3333  cpu -1  group_fd -1  flags 0x8 = 6
> > mmap size 589824B
> > AUX area mmap length 4194304
> > ------------------------------------------------------------
> > perf_event_attr:
> >    type                             1
> >    size                             120
> >    config                           0x9
> >    watermark                        1
> >    sample_id_all                    1
> >    bpf_event                        1
> >    { wakeup_events, wakeup_watermark } 1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid -1  cpu -1  group_fd -1  flags 0x8
> > sys_perf_event_open failed, error -22
> > switching off bpf_event
> > ------------------------------------------------------------
> > perf_event_attr:
> >    type                             1
> >    size                             120
> >    config                           0x9
> >    watermark                        1
> >    sample_id_all                    1
> >    { wakeup_events, wakeup_watermark } 1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid -1  cpu -1  group_fd -1  flags 0x8
> > sys_perf_event_open failed, error -22
> > switching off cloexec flag
> > ------------------------------------------------------------
> > perf_event_attr:
> >    type                             1
> >    size                             120
> >    config                           0x9
> >    watermark                        1
> >    sample_id_all                    1
> >    { wakeup_events, wakeup_watermark } 1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid -1  cpu -1  group_fd -1  flags 0
> > sys_perf_event_open failed, error -22
> > switching off sample_id_all
> > ------------------------------------------------------------
> > perf_event_attr:
> >    type                             1
> >    size                             120
> >    config                           0x9
> >    watermark                        1
> >    { wakeup_events, wakeup_watermark } 1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid -1  cpu -1  group_fd -1  flags 0
> > sys_perf_event_open failed, error -22
> > Couldn't start the BPF side band thread:
> > BPF programs starting from now on won't be annotatable
> > Synthesizing auxtrace information
> > cannot find cgroup mount point
> > Couldn't synthesize cgroup events.
> > Control descriptor is not initialized
> > Linux marvell 5.9.0-rc5-00116-g91c9ea890e1a #823 SMP PREEMPT Tue Nov
> > 10 10:49:15 IST 2020 aarch64 aarch64 aarch64 GNU/Linux
>
> > auxtrace idx 0 old 0 head 0xdd50 diff 0xdd50
>
> I haven't seen this in the normal verbose output.
>
> > [ perf record: Woken up 1 times to write data ]
> > symbol:init_start file:(null) line:0 offset:0 return:0 lazy:(null)
> > snip ..
> > symbol:memory_mallopt file:(null) line:0 offset:0 return:0 lazy:(null)
> > failed to write feature CPUDESC
> > failed to write feature MEM_TOPOLOGY
> > failed to write feature CPU_PMU_CAPS
> > [ perf record: Captured and wrote 0.056 MB perf.data ]
> >
> > # ./perf report
> > 0x368 [0x50]: failed to process type: 1 [Cannot allocate memory]
> > Error:
> > failed to process sample
>
> I have no clue about it. Are you able to run it under GDB ? (Looks like
> you have built the perf, so if you have sources, it may be a good idea
> to run under the GDB and figure out where that error is coming from).
>

Yes, gdb helped me figure out the issue.
The issue is in OpenCSD, which doesn't seem to support multiple streams
when the formatter is not enabled.
Note: Our silicon has the formatter disabled, and we already had changes in the
perf tool to take care of the formatter status.

The hack below helped.

diff --git a/decoder/source/ocsd_dcd_tree.cpp b/decoder/source/ocsd_dcd_tree.cpp
index be15e36..0210dec 100644
--- a/decoder/source/ocsd_dcd_tree.cpp
+++ b/decoder/source/ocsd_dcd_tree.cpp
@@ -401,7 +401,7 @@ ocsd_err_t DecodeTree::createDecoder(const std::string &decoderName, const int c
     int crtFlags = createFlags;

    uint8_t CSID = 0;   // default for single stream decoder (no deformatter) - we ignore the ID
-    if(usingFormatter())
+    //if(usingFormatter())
     {
         CSID = pConfig->getTraceID();
         crtFlags |= OCSD_CREATE_FLG_INST_ID;


Not sure if this is the right fix though.
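
To illustrate why taking the trace ID unconditionally helps, here is a toy
model. It is not OpenCSD code: ToyDecodeTree and its members are made-up
names, and it assumes decoders are registered under their CSID, so that
with the formatter off every per-core decoder would land on CSID 0 and a
second one could not be added.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for a decode tree; NOT the OpenCSD API.
class ToyDecodeTree {
public:
    explicit ToyDecodeTree(bool usingFormatter)
        : usingFormatter_(usingFormatter) {}

    // alwaysUseTraceId == false models the original condition,
    // alwaysUseTraceId == true models the hack (condition commented out).
    // Returns false if a decoder is already registered under the chosen CSID.
    bool createDecoder(std::uint8_t traceId, bool alwaysUseTraceId) {
        std::uint8_t csid = 0;  // default for a single-stream decoder
        if (usingFormatter_ || alwaysUseTraceId)
            csid = traceId;
        // emplace() reports in .second whether a new entry was inserted
        return decoders_.emplace(csid, "decoder").second;
    }

    std::size_t count() const { return decoders_.size(); }

private:
    bool usingFormatter_;
    std::map<std::uint8_t, std::string> decoders_;  // assumed keyed by CSID
};
```

In this model, with the formatter disabled and the original condition, the
second createDecoder() call fails because both streams map to CSID 0; with
the hack, each stream is registered under its own trace ID.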

This is how I tested:

1. # taskset 0x2 ./perf record -e cs_etm//u -F 10 --per-thread ping -c 30 127.0.0.1

2. # Ctrl-Z // Suspend the process

3. # taskset -p 0x4 <pid of ping process> // Move the ping process to core 2

4. # fg // Get the process to foreground

5. # ./perf report
snip ...

# Samples: 66K of event 'branches:uH'
# Event count (approx.): 66953
#
# Children      Self  Command  Shared Object          Symbol
# ........  ........  .......  .....................  ........................................
#
    15.94%    15.94%  ping     ld-2.31.so             [.] _dl_lookup_symbol_x
    14.93%    14.93%  ping     ld-2.31.so             [.] do_lookup_x
    10.68%    10.68%  ping     libc-2.31.so           [.] _dl_addr
     9.87%     9.87%  ping     ld-2.31.so             [.] _dl_relocate_object
     6.75%     6.75%  ping     ld-2.31.so             [.] strcmp
     3.62%     3.62%  ping     ld-2.31.so             [.] check_match
     2.72%     2.72%  ping     libc-2.31.so           [.] __vfprintf_internal
     1.90%     1.90%  ping     libc-2.31.so           [.] _int_malloc
     1.29%     1.29%  ping     libc-2.31.so           [.] getenv
     1.28%     1.28%  ping     libc-2.31.so           [.] strcmp
     1.17%     1.17%  ping     libc-2.31.so           [.] _IO_file_xsputn@@GLIBC_2.17
     1.16%     1.16%  ping     ld-2.31.so             [.] _dl_name_match_p

snip ...

I could also verify, using prints in the tmc-etr driver, that the trace
buffer gets reused across cores as well.
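
For reference, the taskset masks used in the test steps are plain CPU
bitmasks, where bit N selects CPU N. A small sketch of the decoding
(cpusInMask is a made-up helper name, and it assumes at most 64 CPUs):

```cpp
#include <cstdint>
#include <vector>

// Return the CPU numbers selected by an affinity bitmask (bit N -> CPU N).
// Hypothetical helper for illustration; limited to 64 CPUs.
std::vector<int> cpusInMask(std::uint64_t mask) {
    std::vector<int> cpus;
    for (int cpu = 0; cpu < 64; ++cpu) {
        if (mask & (std::uint64_t{1} << cpu))
            cpus.push_back(cpu);
    }
    return cpus;
}
```

So 0x2 pins perf and ping to CPU 1 at the start, and 0x4 moves the ping
process to CPU 2, which is what forces the trace session to migrate
between per-core sinks.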


> Also, what is
>
> perf --version ?

perf version 5.9.0-rc5


>
>
> > # To display the perf.data header info, please use --header/--header-only option
> >
> > ============================================================================
> >
> > Appreciate your help on getting some debug hints on what is going wrong.
>
>
> >
> > One strange thing noted here is sys_perf_event_open, passing cpu = -1
> > and pid = -1,
> > which doesnt appear to be valid as per tools/perf/design.txt
>
> I see that on my Juno, but it still works. I believe that is for the
> generic PMU (pmu.type == 1) and not the coresight PMU, which I believe
> is (type == 8) in your case (the first event).
>
> Suzuki


