[PATCH] KVM: arm64: Disable TRBE Trace Buffer Unit when running in guest context
James Clark
james.clark at linaro.org
Mon Feb 16 07:05:10 PST 2026
On 16/02/2026 2:29 pm, Marc Zyngier wrote:
> On Mon, 16 Feb 2026 13:09:59 +0000,
> Will Deacon <will at kernel.org> wrote:
>>
>> The nVHE world-switch code relies on zeroing TRFCR_EL1 to disable trace
>> generation in guest context when self-hosted TRBE is in use by the host.
>>
>> Per D3.2.1 ("Controls to prohibit trace at Exception levels"), clearing
>> TRFCR_EL1 means that trace generation is prohibited at EL1 and EL0 but
>> per R_YCHKJ the Trace Buffer Unit will still be enabled if
>> TRBLIMITR_EL1.E is set. R_SJFRQ goes on to state that, when enabled, the
>> Trace Buffer Unit can perform address translation for the "owning
>> exception level" even when it is out of context.
>
> Great. So TRBE violates all the principles that we hold true in the
> architecture. Does SPE suffer from the same level of brokenness?
>
>> Consequently, we can end up in a state where TRBE performs speculative
>> page-table walks for a host VA/IPA in guest/hypervisor context depending
>> on the value of MDCR_EL2.E2TB, which changes over world-switch. The
>> result appears to be a heady mixture of data corruption and hardware
>> lockups.
>>
>> Extend the TRBE world-switch code to clear TRBLIMITR_EL1.E after
>> draining the buffer, restoring the register on return to the host.
>>
>> Cc: Marc Zyngier <maz at kernel.org>
>> Cc: Oliver Upton <oupton at kernel.org>
>> Cc: James Clark <james.clark at linaro.org>
>> Cc: Leo Yan <leo.yan at arm.com>
>> Cc: Suzuki K Poulose <suzuki.poulose at arm.com>
>> Cc: Fuad Tabba <tabba at google.com>
>> Fixes: a1319260bf62 ("arm64: KVM: Enable access to TRBE support for host")
>> Signed-off-by: Will Deacon <will at kernel.org>
>> ---
>>
>> NOTE: This is *untested* as I don't have a TRBE-capable device that can
>> run upstream but I noticed this by inspection when triaging occasional
>> hardware lockups on systems using a 6.12-based kernel with TRBE running
>> at the same time as a vCPU is loaded. This code has changed quite a bit
>> over time, so stable backports are not entirely straightforward.
>> Hopefully James/Leo/Suzuki can help us test if folks agree with the
>> general approach taken here.
>>
>> arch/arm64/include/asm/kvm_host.h | 1 +
>> arch/arm64/kvm/hyp/nvhe/debug-sr.c | 36 ++++++++++++++++++++++--------
>> 2 files changed, 28 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index ac7f970c7883..a932cf043b83 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -746,6 +746,7 @@ struct kvm_host_data {
>> u64 pmscr_el1;
>> /* Self-hosted trace */
>> u64 trfcr_el1;
>> + u64 trblimitr_el1;
>> /* Values of trap registers for the host before guest entry. */
>> u64 mdcr_el2;
>> u64 brbcr_el1;
>> diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
>> index 2a1c0f49792b..fd389a26bc59 100644
>> --- a/arch/arm64/kvm/hyp/nvhe/debug-sr.c
>> +++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
>> @@ -57,12 +57,27 @@ static void __trace_do_switch(u64 *saved_trfcr, u64 new_trfcr)
>> write_sysreg_el1(new_trfcr, SYS_TRFCR);
>> }
>>
>> -static bool __trace_needs_drain(void)
>> +static void __trace_drain_and_disable(void)
>> {
>> - if (is_protected_kvm_enabled() && host_data_test_flag(HAS_TRBE))
>> - return read_sysreg_s(SYS_TRBLIMITR_EL1) & TRBLIMITR_EL1_E;
>> + u64 *trblimitr_el1 = host_data_ptr(host_debug_state.trblimitr_el1);
>>
>> - return host_data_test_flag(TRBE_ENABLED);
>> + *trblimitr_el1 = 0;
>> +
>> + if (is_protected_kvm_enabled()) {
>> + if (!host_data_test_flag(HAS_TRBE))
>> + return;
>> + } else {
>> + if (!host_data_test_flag(TRBE_ENABLED))
>> + return;
>> + }
>> +
>> + *trblimitr_el1 = read_sysreg_s(SYS_TRBLIMITR_EL1);
>> + if (*trblimitr_el1 & TRBLIMITR_EL1_E) {
>> + isb();
>> + tsb_csync();
>> + write_sysreg_s(0, SYS_TRBLIMITR_EL1);
>> + isb();
The TRBE driver might do an extra drain here as a workaround. Hard to
tell if it's actually required in this case (seems like probably not)
but it might be worth doing it anyway to avoid hitting the issue.
Especially if we add guest support later where some of the affected
registers might start being used. See:
	if (trbe_needs_drain_after_disable(cpudata))
		trbe_drain_buffer();
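
To make the ordering concrete, here is a rough user-space sketch of the
sequence I have in mind. Everything in it is a mock: struct mock_cpu,
mock_drain(), mock_drain_and_disable() and mock_restore() are hypothetical
names that only stand in for the sysreg accesses and barriers in the hunk
above. The restore helper is my guess at what "restoring the register on
return to the host" looks like, and it is not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* TRBLIMITR_EL1.E is bit 0 of the register. */
#define TRBLIMITR_EL1_E (UINT64_C(1) << 0)

/* Hypothetical stand-in for the per-CPU state touched by the world switch. */
struct mock_cpu {
	uint64_t trblimitr_el1;         /* models SYS_TRBLIMITR_EL1 */
	bool needs_drain_after_disable; /* models trbe_needs_drain_after_disable() */
	unsigned int drains;            /* counts modelled drain operations */
};

/* Models a drain (tsb_csync() or trbe_drain_buffer() in the real code). */
static void mock_drain(struct mock_cpu *cpu)
{
	cpu->drains++;
}

/* Drain, clear the whole register, then optionally drain again. */
static uint64_t mock_drain_and_disable(struct mock_cpu *cpu)
{
	uint64_t saved;

	assert(cpu);
	saved = cpu->trblimitr_el1;

	if (saved & TRBLIMITR_EL1_E) {
		mock_drain(cpu);          /* isb(); tsb_csync(); */
		cpu->trblimitr_el1 = 0;   /* write_sysreg_s(0, ...); isb(); */

		/* The extra workaround drain suggested above. */
		if (cpu->needs_drain_after_disable)
			mock_drain(cpu);
	}

	return saved;
}

/* Assumed restore path: only write back if the host had the unit enabled. */
static void mock_restore(struct mock_cpu *cpu, uint64_t saved)
{
	if (saved & TRBLIMITR_EL1_E)
		cpu->trblimitr_el1 = saved;
}
```

On an affected CPU this performs two drains around the disable, and one
otherwise. Whether the second drain is really needed in hyp context is
still an open question.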
>> + }
>
> Doesn't this mean we should be able to get rid of most of the TRFCR
> messing about that litters the entry/exit code and leave that to VHE
Technically you could have ETMs that are connected to sinks other
than TRBE. Unless you somehow switch off those sinks you still need to
do the TRFCR switching stuff.
> only? And even then, I'm tempted to simply get rid of any sort of
> guest-only tracing, given that TRBE is not capable of representing
> exceptions that are synthesised by the host, making it the resulting
> traces useless.
I haven't heard of anyone tracing a guest from the host, but until we
add support for guests to be able to trace themselves it's the only way
of doing it, so it could be useful. Although all the messing around with
TRFCR before was from a request to disable guest trace rather than
enable it, as it was always on in nVHE.
>
> I'm still trying to get my hands on a TRBE-enabled system that has
> some actual firmware tables (my O6 seems to have the HW, but no
> description of the required coresight infra).
Leo has a guide. I haven't tried it myself yet, but it should work.
I'll send it to you.
James
>
> Thanks,
>
> M.
>