[PATCH 0/4] arm64: an optimization for AmpereOne
Huang Shijie
shijie at os.amperecomputing.com
Wed Nov 22 01:28:51 PST 2023
0) Background:
We found that AmpereOne benefits from aggressive prefetches when
using 4K page size.
1) This patch set:
1.1) adds a new WORKAROUND_AMPERE_AC03_PREFETCH capability.
1.2) uses MIDR_AMPERE1 to match the processor.
1.3) uses alternative_if to patch in the alternative code
for AmpereOne.
1.4) adds software prefetches to the specific copy loop,
and adds a macro, add_prefetch.
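
The steps above can be sketched in userspace C. This is only an
illustration, not the code in this series: the real change lives in
assembly in copy_template.S and is selected by boot-time patching via
alternative_if, which the run-time branch below merely stands in for.
The MIDR field layout follows the Arm architecture; the implementer and
part-number values for AmpereOne are assumptions to be checked against
asm/cputype.h, and PREFETCH_DISTANCE, is_ampere1(), copy_128_loop() and
copy_for_cpu() are hypothetical names and values chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* MIDR_EL1 fields: implementer in bits [31:24], part number in bits [15:4]. */
#define MIDR_IMPLEMENTER(midr)	(((midr) >> 24) & 0xffu)
#define MIDR_PARTNUM(midr)	(((midr) >> 4) & 0xfffu)

/* Assumed AmpereOne identifiers; verify against asm/cputype.h. */
#define IMP_AMPERE	0xC0u
#define PART_AMPERE1	0xAC3u

/* Hypothetical prefetch distance in bytes; not taken from the patch. */
#define PREFETCH_DISTANCE	1024

/* Match on implementer and part number only, ignoring variant/revision. */
static int is_ampere1(uint32_t midr)
{
	return MIDR_IMPLEMENTER(midr) == IMP_AMPERE &&
	       MIDR_PARTNUM(midr) == PART_AMPERE1;
}

/* Copy in 128-byte chunks, optionally prefetching ahead of the loads. */
static void copy_128_loop(void *dst, const void *src, size_t len, int prefetch)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (len >= 128) {
		/* Only prefetch addresses that are still inside the source. */
		if (prefetch && len > PREFETCH_DISTANCE)
			__builtin_prefetch(s + PREFETCH_DISTANCE, 0, 3);
		memcpy(d, s, 128);
		d += 128;
		s += 128;
		len -= 128;
	}
	memcpy(d, s, len);	/* copy the remaining tail, if any */
}

/* Enable the prefetching variant only when the CPU matches. */
static void copy_for_cpu(void *dst, const void *src, size_t len, uint32_t midr)
{
	copy_128_loop(dst, src, len, is_ampere1(midr));
}
```

With boot-time patching, the check on the CPU model is paid once rather
than on every call, which is why the series extends the alternative
framework instead of branching inside the copy routine.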
2) Test result:
In hugetlb or tmpfs, we see a large sequential read performance
improvement, up to 1.3x ~ 1.4x.
Huang Shijie (4):
  extable: add __sort_main_extable
  arm64: alternative: handle the kernel exception table
  arm64: copy_template.S: add loop_for_copy_128_bytes macro
  arm64: add software prefetches for AmpereOne

 arch/arm64/Kconfig.platforms    |  7 +++
 arch/arm64/kernel/alternative.c | 18 +++++++
 arch/arm64/kernel/cpu_errata.c  |  9 ++++
 arch/arm64/lib/copy_template.S  | 87 +++++++++++++++++++++++----------
 arch/arm64/tools/cpucaps        |  1 +
 include/linux/extable.h         |  2 +
 kernel/extable.c                |  8 ++-
 7 files changed, 105 insertions(+), 27 deletions(-)
--
2.40.1