[PATCH v3 0/6] 32bit ARM branch predictor hardening

Wed Jan 31 11:54:24 PST 2018

On 31/01/18 19:07, Marc Zyngier wrote:
> On 31/01/18 18:53, Florian Fainelli wrote:
>> On 01/31/2018 04:45 AM, Hanjun Guo wrote:
>>> On 2018/1/29 22:58, Nishanth Menon wrote:
>>>> On Mon, Jan 29, 2018 at 5:36 AM, Hanjun Guo <guohanjun at huawei.com> wrote:
>>>> [...]
>>>>
>>>>> By the way, this patch set just enables branch predictor hardening
>>>>> on arm32 unconditionally, but some machines (such as wireless
>>>>> network base stations) are not exposed to users who could take
>>>>> advantage of variant 2, and those machines are quite sensitive to
>>>>> performance, so can we introduce a Kconfig or boot option to disable
>>>>> branch predictor hardening?
>>>>
>>>> I am curious: Have you seen performance degradation with this series?
>>>> If yes, is it possible to share the information?
>>>
>>> Sorry for the late reply. The performance data for context switch (CFS)
>>> shows a drop of about 6% to 12% (A9-based machine) in the first round of
>>> tests, but the data is not stable; I need to retest and will then update here.
>>
>> What tool did you use to measure this? On a Brahma-B15 platform clocked
>> at 1.5GHz, across kernels 4.1 and 4.9 (4.15 in progress as we speak), I
>> measured the following with two memory configurations, one giving 256MB
>> of usable memory and another giving 3GB. The results below are only for
>> the most extreme case, 256MB. This is running 13 groups because the
>> ASID space is 8 bits (256 ASIDs), so this should force at least two full
>> ASID generation rollovers (assuming the logic is correct here).
>>
>> for i in $(seq 0 9)
>> do
>> 	hackbench 13 process 10000
>> done
>>
>> Average values, in seconds:
>>
>> 1) 4.1.45, ACTLR[0] = 0, no spectre variant 2 patches: 114,2666
>> 2) 4.1.45, ACTLR[0] = 1, no spectre variant 2 patches: 114,2952
>> 3) 4.1.45, ACTLR[0] = 1, spectre variant 2 patches: 115,5853
>>
>> => 3) is a 1.15% degradation against 1)
>>
>> 1) 4.9.51, ACTLR[0] = 0, no spectre variant 2 patches: 130,7676
>> 2) 4.9.51, ACTLR[0] = 1, no spectre variant 2 patches: 130,6848
>> 3) 4.9.51, ACTLR[0] = 1, spectre variant 2 patches: 132,4274
>>
>> => 3) is a 1.26% degradation against 1)
>>
>> The relative differences between 4.1 and 4.9 appear consistent (with 4.9
>> being slower for a reason I don't know).
>>
>> Marc, are there any performance tests/results that you ran that you
>> could share?
> 
> None. I usually don't run benchmarks, because they are not
> representative of a real workload. I urge people to run their own real
> workload, as it is very unlikely to have hackbench's profile...

Very true.

Out of curiosity (and to prove that the patches and my home-baked
firmware fix actually had an effect), I also ran hackbench (of course!)
on a Calxeda Midway (4*Cortex-A15, 8GB RAM). Native runs showed only
very little degradation, not unlike Florian's numbers.
Running hackbench in a KVM guest however showed a bigger impact, which
is of course somewhat expected.

Cheers,
Andre.