[PATCH v2 8/8] perf record: Directly bail out for compat case

Adrian Hunter adrian.hunter at intel.com
Wed Jun 9 01:23:25 PDT 2021


On 7/06/21 6:09 pm, Leo Yan wrote:
> On Mon, Jun 07, 2021 at 01:23:43PM +0300, Adrian Hunter wrote:
>> On 2/06/21 3:38 pm, Leo Yan wrote:
>>> Hi Adrain,
>>>
>>> On Wed, Jun 02, 2021 at 02:18:47PM +0300, Adrian Hunter wrote:
>>>> On 2/06/21 1:30 pm, Leo Yan wrote:
>>>>> Since the 64-bit atomicity is not promised in 32-bit perf, directly
>>>>> report the error and bail out for this case.
>>>>>
>>>>> Now only applies on x86_64 and Arm64 platforms.
>>>>>
>>>>> Suggested-by: Adrian Hunter <adrian.hunter at intel.com>
>>>>
>>>> Maybe we can do better for the compat case.
>>>>
>>>> We can assume the upper 32-bits change very seldom,
>>>> and always increase. So for the 'read' case:
>>>>
>>>> 	u64 first, second, last;
>>>> 	u64 mask = (u64)((u32)-1) << 32;
>>>>
>>>> 	do {
>>>> 		first = READ_ONCE(pc->aux_head);
>>>> 		rmb();
>>>> 		second = READ_ONCE(pc->aux_head);
>>>> 		rmb();
>>>> 		last = READ_ONCE(pc->aux_head);
>>>> 	} while ((first & mask) != (last & mask));
>>>> 	return second;
>>>>
>>>> For the write case, we can cause a fatal error only if the new
>>>> tail has non-zero upper 32-bits.  That gives up to 4GiB of data
>>>> before aborting:
>>>>
>>>> 	if (tail & mask)
>>>> 		return -1;
>>>> 	smp_mb();
>>>> 	WRITE_ONCE(pc->aux_tail, tail);
>>>
>>> Seems to me, it's pointless to only support aux_head for 64-bit and
>>> support aux_tail for 32-bit.  I understand this can be helpful for the
>>> snapshot mode which only uses aux_head, but it still fails to support
>>> the normal case for AUX ring buffer using 64-bit head/tail.
>>
>> I am not sure why you say it is pointless.  'perf record' would still be
>> able to capture up to 4GiB of data. Do you mean you usually capture more
>> than 4GiB of data?
> 
> Okay, understand.  We can support 32-bit perf for compat mode when the
> trace data is less than 4GiB.
> 
>> I was thinking we would separate out the compat case:
>>
>> #if BITS_PER_LONG == 32
>> 	if (kernel_is_64_bit)
>> 		return compat_auxtrace_mmap__[read_head/write_tail]()
>> #endif
>>
>> So the non-compat cases would not be affected.
> 
> Because I don't want to introduce the complexity for read/write head
> and tail, and we also need to handle the same issue for the perf ring
> buffer.  So how about below change?
> 
> The main idea for below change is it allows the perf to run normally
> on the compat mode and exitly if detects the buffer head is close to
> the low 32-bit's overflow: when detect the low 32-bit value is bigger
> than 0xf0000000 (so we have 256MiB margin to the overflow), it reports
> error and exit.
> 
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index 1b4091a3b508..2a9965bfeab4 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -1693,6 +1693,14 @@ static int __auxtrace_mmap__read(struct mmap *map,
>  	pr_debug3("auxtrace idx %d old %#"PRIx64" head %#"PRIx64" diff %#"PRIx64"\n",
>  		  mm->idx, old, head, head - old);
>  
> +#ifdef BITS_PER_LONG == 32
> +	if (kernel_is_64bit() && head >= 0xf0000000) {

You are assuming the head never increases by more than 256MiB which
means you should limit the buffer size to 256MiB maximum.

To me this seems a bit too far from an ideal solution.

I would have thought separating out the compat case makes things
simpler to understand.

> +		pr_err("32-bit perf cannot read 64-bit value atomically;\n");
> +		pr_err("exit to avoid the 4GB (32-bit) AUX buffer overflow on compat mode.\n");
> +		return -ENOMEM;
> +	}
> +#endif
> +
>  	if (mm->mask) {
>  		head_off = head & mm->mask;
>  		old_off = old & mm->mask;
> diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
> index 9130f6fad8d5..823b69895b85 100644
> --- a/tools/perf/util/env.c
> +++ b/tools/perf/util/env.c
> @@ -405,3 +405,20 @@ int perf_env__numa_node(struct perf_env *env, int cpu)
>  
>  	return cpu >= 0 && cpu < env->nr_numa_map ? env->numa_map[cpu] : -1;
>  }
> +
> +int perf_kernel_is_64bit(void)
> +{
> +	struct utsname uts;
> +	int ret;
> +
> +	ret = uname(&uts);
> +	if (ret < 0)
> +		return 0;
> +
> +	if (!strncmp(uts.machine, "x86_64", 6) ||
> +	    !strncmp(uts.machine, "aarch64", 7) ||
> +	    !strncmp(uts.machine, "arm64", 5))
> +		return 1;
> +
> +	return 0;
> +}

Obviously, we don't need to keep checking uname. It could be a global variable
that is always set up early.


> diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
> index ca249bf5e984..c6c034fc08f6 100644
> --- a/tools/perf/util/env.h
> +++ b/tools/perf/util/env.h
> @@ -147,4 +147,6 @@ void perf_env__insert_btf(struct perf_env *env, struct btf_node *btf_node);
>  struct btf_node *perf_env__find_btf(struct perf_env *env, __u32 btf_id);
>  
>  int perf_env__numa_node(struct perf_env *env, int cpu);
> +
> +int perf_kernel_is_64bit(void);
>  #endif /* __PERF_ENV_H */
> diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
> index ab7108d22428..f1d3725d599a 100644
> --- a/tools/perf/util/mmap.c
> +++ b/tools/perf/util/mmap.c
> @@ -323,6 +323,14 @@ int perf_mmap__push(struct mmap *md, void *to,
>  	if (rc < 0)
>  		return (rc == -EAGAIN) ? 1 : -1;
>  
> +#ifdef BITS_PER_LONG == 32
> +	if (kernel_is_64bit() && head >= 0xf0000000) {
> +		pr_err("32-bit perf cannot read 64-bit value atomically;\n");
> +		pr_err("exit to avoid the 4GB (32-bit) buffer overflow on compat mode.\n");
> +		return -ENOMEM;
> +	}
> +#endif
> +
>  	size = md->core.end - md->core.start;
>  
>  	if ((md->core.start & md->core.mask) + size != (md->core.end & md->core.mask)) {
> 
> Thanks,
> Leo
> 




More information about the linux-arm-kernel mailing list