[PATCH v2] arm64: Allow CONFIG_AUTOFDO_CLANG to be selected
Rong Xu
xur at google.com
Mon Nov 18 15:49:35 PST 2024
This patch looks good to me.
I assume the profile format change in the Android doc will be submitted soon.
Since "extbinary" is a superset of "binary", using the "extbinary" format
profile in Android shouldn't cause any compatibility issues.
Reviewed-by: Rong Xu <xur at google.com>
-Rong
On Mon, Nov 18, 2024 at 2:25 PM Yabin Cui <yabinc at google.com> wrote:
>
> Select ARCH_SUPPORTS_AUTOFDO_CLANG to allow AUTOFDO_CLANG to be
> selected.
>
> On ARM64, ETM traces can be recorded and converted to AutoFDO profiles.
> Experiments on Android show 4% improvement in cold app startup time
> and 13% improvement in binder benchmarks.
>
> Signed-off-by: Yabin Cui <yabinc at google.com>
> ---
>
> Change-Logs in V2:
>
> 1. Use "For ARM platforms with ETM trace" in autofdo.rst.
> 2. Create an issue and a change to use extbinary format in instructions:
> https://github.com/Linaro/OpenCSD/issues/65
> https://android-review.googlesource.com/c/platform/system/extras/+/3362107
>
> Documentation/dev-tools/autofdo.rst | 18 +++++++++++++++++-
> arch/arm64/Kconfig | 1 +
> 2 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/dev-tools/autofdo.rst b/Documentation/dev-tools/autofdo.rst
> index 1f0a451e9ccd..a890e84a2fdd 100644
> --- a/Documentation/dev-tools/autofdo.rst
> +++ b/Documentation/dev-tools/autofdo.rst
> @@ -55,7 +55,7 @@ process consists of the following steps:
> workload to gather execution frequency data. This data is
> collected using hardware sampling, via perf. AutoFDO is most
> effective on platforms supporting advanced PMU features like
> - LBR on Intel machines.
> + LBR on Intel machines, ETM traces on ARM machines.
>
> #. AutoFDO profile generation: Perf output file is converted to
> the AutoFDO profile via offline tools.
> @@ -141,6 +141,22 @@ Here is an example workflow for AutoFDO kernel:
>
> $ perf record --pfm-events RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k -a -N -b -c <count> -o <perf_file> -- <loadtest>
>
> + - For ARM platforms with ETM trace:
> +
> + Follow the instructions in the `Linaro OpenCSD document
> + <https://github.com/Linaro/OpenCSD/blob/master/decoder/tests/auto-fdo/autofdo.md>`_
> + to record ETM traces for AutoFDO::
> +
> + $ perf record -e cs_etm/@tmc_etr0/k -a -o <etm_perf_file> -- <loadtest>
> + $ perf inject -i <etm_perf_file> -o <perf_file> --itrace=i500009il
> +
> + For ARM platforms running Android, follow the instructions in the
> + `Android simpleperf document
> + <https://android.googlesource.com/platform/system/extras/+/main/simpleperf/doc/collect_etm_data_for_autofdo.md>`_
> + to record ETM traces for AutoFDO::
> +
> + $ simpleperf record -e cs-etm:k -a -o <perf_file> -- <loadtest>
> +
> 4) (Optional) Download the raw perf file to the host machine.
>
> 5) To generate an AutoFDO profile, two offline tools are available:
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index fd9df6dcc593..c3814df5e391 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -103,6 +103,7 @@ config ARM64
> select ARCH_SUPPORTS_PER_VMA_LOCK
> select ARCH_SUPPORTS_HUGE_PFNMAP if TRANSPARENT_HUGEPAGE
> select ARCH_SUPPORTS_RT
> + select ARCH_SUPPORTS_AUTOFDO_CLANG
> select ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
> select ARCH_WANT_COMPAT_IPC_PARSE_VERSION if COMPAT
> select ARCH_WANT_DEFAULT_BPF_JIT
> --
> 2.47.0.338.g60cca15819-goog
>
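
For anyone trying the patch, a minimal sketch of the configuration it makes
selectable on arm64. This is an illustrative .config fragment, not part of the
patch; it assumes the tree already carries the AUTOFDO_CLANG Kconfig option and
that the kernel is built with a Clang/LLVM toolchain.

```shell
# Illustrative .config fragment (assumption: arm64 tree with this patch applied).
# With ARCH_SUPPORTS_AUTOFDO_CLANG selected by ARM64, this option becomes available.
CONFIG_AUTOFDO_CLANG=y
```

Per Documentation/dev-tools/autofdo.rst, the generated profile is then passed at
build time with "make LLVM=1 CLANG_AUTOFDO_PROFILE=<profile_file>".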