[PATCH v2 1/7] perf: Increase the maximum number of samples to 256.
Ian Rogers
irogers at google.com
Thu Feb 20 10:51:16 PST 2025
On Thu, Jan 16, 2025 at 3:10 PM Rajnesh Kanwal <rkanwal at rivosinc.com> wrote:
>
> RISCV CTR extension support a maximum depth of 256 last branch records.
> The 127 entries limit results in corrupting CTR entries for RISC-V if
> configured to be 256 entries. This will not impact any other architectures
> as it is just increasing maximum limit of possible entries.
I wonder if rather than a constant this code should just use the auto
resizing hashmap code?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/hashmap.h
I assume the value of 127 comes from perf_event.h's PERF_MAX_STACK_DEPTH:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/perf_event.h#n1252
Perhaps these constants shouldn't exist. The perf-record man page mentions:
sysctl.kernel.perf_event_max_stack
which I believe gets a value from
/proc/sys/kernel/perf_event_max_stack, so maybe these should be
runtime determined constants rather than compile time.
Thanks,
Ian
> Signed-off-by: Rajnesh Kanwal <rkanwal at rivosinc.com>
> ---
> tools/perf/util/machine.c | 21 ++++++++++++++-------
> 1 file changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index 27d5345d2b30..f2eb3c20274e 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -2174,25 +2174,32 @@ static void save_iterations(struct iterations *iter,
> iter->cycles += be[i].flags.cycles;
> }
>
> -#define CHASHSZ 127
> -#define CHASHBITS 7
> -#define NO_ENTRY 0xff
> +#define CHASHBITS 8
> +#define NO_ENTRY 0xffU
>
> -#define PERF_MAX_BRANCH_DEPTH 127
> +#define PERF_MAX_BRANCH_DEPTH 256
>
> /* Remove loops. */
> +/* Note: Last entry (i==ff) will never be checked against NO_ENTRY
> + * so it's safe to have an unsigned char array to process 256 entries
> + * without causing clash between last entry and NO_ENTRY value.
> + */
> static int remove_loops(struct branch_entry *l, int nr,
> struct iterations *iter)
> {
> int i, j, off;
> - unsigned char chash[CHASHSZ];
> + unsigned char chash[PERF_MAX_BRANCH_DEPTH];
>
> memset(chash, NO_ENTRY, sizeof(chash));
>
> - BUG_ON(PERF_MAX_BRANCH_DEPTH > 255);
> + BUG_ON(PERF_MAX_BRANCH_DEPTH > 256);
>
> for (i = 0; i < nr; i++) {
> - int h = hash_64(l[i].from, CHASHBITS) % CHASHSZ;
> + /* Remainder division by PERF_MAX_BRANCH_DEPTH is not
> + * needed as hash_64 will anyway limit the hash
> + * to CHASHBITS
> + */
> + int h = hash_64(l[i].from, CHASHBITS);
>
> /* no collision handling for now */
> if (chash[h] == NO_ENTRY) {
> --
> 2.34.1
>
More information about the linux-riscv
mailing list