[PATCH 0/3] arm64: proton-pack: Add Spectre-BSE mitigation for Cortex-A7{2,3,5}
Geoff Blake
blakgeof at amazon.com
Wed Jan 22 14:36:54 PST 2025
On Wed, 22 Jan 2025, Doebel, Bjoern wrote:
> Hi,
>
> On 22.01.25 18:47, James Morse wrote:
> > Hello!
> >
> > Spectre-BSE is a variant of Spectre-BHB that abuses a power-saving mode
> > on some older cores to dodge the BHB mitigation applied to the branch
> > predictor.
> >
> > Only A72r0 actually needs anything doing - this is basically a bug in the
> > BHB mitigation sequence that was published for A72r0. This
> > series moves A72r0 to use the WA1 firmware call for mitigation, and adds
> > the necessary reporting parts for user-space to discover which parts of
> > BHB/BSE are mitigated or vulnerable.
> >
> > WA1 is used instead of WA3, which was new for BHB, because we can't rely
> > on hypervisors not to use the 'local' workaround, and for Spectre-BSE
> > we don't need to worry about discovery via. (Which is why WA3 exists -
> > for cores not vulnerable to the issue mitigated by WA1).
> >
> > Arm's description of this vulnerability can be found here:
> > https://developer.arm.com/Arm%20Security%20Center/Spectre-BSE
> >
> > This series is based on arm64/for-next/core, and can be retrieved from:
> > https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/log/?h=spectre_bse/v1
> >
> > Backports of this version can also be found under spectre_bse/backports
> > of the above repo.
> >
> > Because this vulnerability is hard to exploit, but the cost of mitigating
> > it is high - the mitigation is disabled by default. (see the last
> > patch). To enable the mitigation, a command-line argument is needed:
> > 'spectre_bse'.
>
> The Amazon Linux kernel team evaluated these patches on EC2 A1 instances
> running Amazon Linux 2 and UnixBench. We can confirm that patch impact is
> significant, especially for syscall overhead.
>
> UnixBench results in comparison to disabled mitigations (AL2, kernel 5.15, EC2
> A1.4xlarge instance):
>
> Dhrystone 2            -- +0.01%
> 2prec Whetstone        -- +0.01%
> Execl throughput       -- +21.39%
> File Copy 1024/2000    -- +45.40%
> File Copy 256/500      -- +46.52%
> File Copy 4096/8000    -- +25.68%
> Pipe Throughput        -- +51.46%
> Pipe based ctx switch  -- +10.91%
> Process creation       -- +4.35%
> Shell Scripts x1       -- +20.00%
> Shell Scripts x8       -- +26.68%
> System Call Overhead   -- +55.82%
> Total Score            -- +28.36%
>
>
> Best,
> Bjoern
>
We also conducted full-sized workload tests that we consider
representative of common use cases for A1 instances. The data we see
shows the impact can be significant, depending on the workload:

  NGINX server configured as a load-balancer: -20%
  Memcached loaded so P99 response latency <10ms: -29%
  Memcached loaded so P99 response latency <1ms: -2%
  WordPress blog server: -2%

Thanks,
Geoff