[PATCH 0/3] arm64: proton-pack: Add Spectre-BSE mitigation for Cortex-A7{2,3,5}

Geoff Blake blakgeof at amazon.com
Wed Jan 22 14:36:54 PST 2025


On Wed, 22 Jan 2025, Doebel, Bjoern wrote:

> Hi,
> 
> On 22.01.25 18:47, James Morse wrote:
> > Hello!
> > 
> > Spectre-BSE is a variant of Spectre-BHB that abuses a power-saving mode
> > on some older cores to dodge the BHB mitigation applied to the branch
> > predictor.
> > 
> > Only A72r0 actually needs anything doing - this is basically a bug in the
> > BHB mitigation sequence that was published for A72r0. This
> > series moves A72r0 to use the WA1 firmware call for mitigation, and adds
> > the necessary reporting parts for user-space to discover which parts of
> > BHB/BSE are mitigated or vulnerable.
> > 
> > WA1 is used instead of WA3 which was new for BHB because we can't rely
> > on hypervisors not to use the 'local' workaround, and for Spectre-BSE
> > we don't need to worry about discovery via. (Which is why WA3 exists -
> > for cores not vulnerable to the issue mitigated by WA1).
> > 
> > Arm's description of this vulnerability can be found here:
> > https://developer.arm.com/Arm%20Security%20Center/Spectre-BSE
> > 
> > This series is based on arm64/for-next/core, and can be retrieved from:
> > https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/log/?h=spectre_bse/v1
> > 
> > Backports of this version can also be found under spectre_bse/backports
> > of the above repo.
> > 
> > Because this vulnerability is hard to exploit and the cost of mitigating
> > it is high, the mitigation is disabled by default (see the last
> > patch). To enable the mitigation, a command-line argument is needed:
> > 'spectre_bse'.
> 
> The Amazon Linux kernel team evaluated these patches on EC2 A1 instances
> running Amazon Linux 2 and UnixBench. We can confirm that patch impact is
> significant, especially for syscall overhead.
> 
> UnixBench results in comparison to disabled mitigations (AL2, kernel 5.15, EC2
> A1.4xlarge instance):
> 
> Dhrystone 2           --  +0.01%
> 2prec Whetstone       --  +0.01%
> Execl throughput      -- +21.39%
> File Copy 1024/2000   -- +45.40%
> File Copy 256/500     -- +46.52%
> File Copy 4096/8000   -- +25.68%
> Pipe Throughput       -- +51.46%
> Pipe based ctx switch -- +10.91%
> Process creation      --  +4.35%
> Shell Scripts x1      -- +20.00%
> Shell Scripts x8      -- +26.68%
> System Call Overhead  -- +55.82%
> Total Score           -- +28.36%
> 
> 
> Best,
> Bjoern
> 

We also conducted full-sized workload tests that we consider
representative of common use cases for A1 instances. The data we see
shows that the impact can be significant, depending on the workload:

NGINX server configured as a load-balancer: -20%
Memcached loaded so P99 response latency <10ms: -29%
Memcached loaded so P99 response latency <1ms: -2%
WordPress blog server: -2%
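
For anyone reproducing comparisons like these, it helps to confirm that a
given boot actually requested the mitigation. Below is a minimal sketch,
not part of the series: it checks the boot command line for the
'spectre_bse' argument and dumps the kernel's existing sysfs vulnerability
reports. The helper names are hypothetical, matching 'spectre_bse=value' is
an assumption (the series describes only the bare argument), and which
sysfs file, if any, reflects Spectre-BSE state is not stated in this
thread.

```python
from pathlib import Path


def cmdline_has_flag(cmdline: str, flag: str = "spectre_bse") -> bool:
    """Return True if `flag` is a whole token of the kernel command line.

    Also accepts a `flag=value` form; that form is an assumption, since the
    thread only mentions the bare 'spectre_bse' argument.
    """
    for token in cmdline.split():
        if token == flag or token.startswith(flag + "="):
            return True
    return False


def read_vulnerabilities(base: str = "/sys/devices/system/cpu/vulnerabilities") -> dict:
    """Map each sysfs vulnerability file name to its reported status string."""
    directory = Path(base)
    if not directory.is_dir():
        # Not Linux, or sysfs is not mounted: nothing to report.
        return {}
    report = {}
    for entry in sorted(directory.iterdir()):
        try:
            report[entry.name] = entry.read_text().strip()
        except OSError:
            report[entry.name] = "<unreadable>"
    return report


if __name__ == "__main__":
    cmdline_path = Path("/proc/cmdline")
    cmdline = cmdline_path.read_text() if cmdline_path.exists() else ""
    print("spectre_bse requested:", cmdline_has_flag(cmdline))
    for name, status in read_vulnerabilities().items():
        print(f"{name}: {status}")
```

Recording this output alongside each benchmark run makes it easier to tell
later which kernel configuration a given result came from.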

Thanks,
Geoff


