qemu riscv, thead c906, Linux boot regression

Daniel Henrique Barboza dbarboza at ventanamicro.com
Wed Jan 24 12:02:47 PST 2024



On 1/24/24 16:26, Björn Töpel wrote:
> Daniel Henrique Barboza <dbarboza at ventanamicro.com> writes:
> 
>> On 1/24/24 09:49, Björn Töpel wrote:
>>> Hi!
>>>
>>> I bumped the RISC-V Linux kernel CI to use qemu 8.2.0, and realized that
>>> thead c906 didn't boot anymore. Bisection points to commit d6a427e2c0b2
>>> ("target/riscv/cpu.c: restrict 'marchid' value")
>>>
>>> Reverting that commit, or the hack below solves the boot issue:
>>>
>>> --8<--
>>> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
>>> index 8cbfc7e781ad..e18596c8a55a 100644
>>> --- a/target/riscv/cpu.c
>>> +++ b/target/riscv/cpu.c
>>> @@ -505,6 +505,9 @@ static void rv64_thead_c906_cpu_init(Object *obj)
>>>        cpu->cfg.ext_xtheadsync = true;
>>>    
>>>        cpu->cfg.mvendorid = THEAD_VENDOR_ID;
>>> +    cpu->cfg.marchid = ((QEMU_VERSION_MAJOR << 16) |
>>> +                        (QEMU_VERSION_MINOR << 8)  |
>>> +                        (QEMU_VERSION_MICRO));
>>>    #ifndef CONFIG_USER_ONLY
>>>        set_satp_mode_max_supported(cpu, VM_1_10_SV39);
>>>    #endif
>>> --8<--
>>>
>>> I'm unsure what the correct qemu way of adding a default value is,
>>> or if c906 should have a proper marchid.
>>
>> In case you need to set a 'marchid' different than zero for c906, this hack would
>> be a proper fix. As mentioned in the commit msg of the patch you mentioned:
>>
>> "Named CPUs should set 'marchid' to a meaningful value instead, and generic
>>    CPUs can set to any valid value."
>>
>> That means that any specific marchid value that the CPU uses must to be set
>> in its own cpu_init() function.
> 
> Got it. Thanks, Daniel!
> 
> For completeness (since it came up on the weekly PW call); Conor pointed
> out that zero *is* indeed the right marchid for c906, and in fact, the
> non-zero marchid pre commit d6a427e2c0b2 was incorrect.
> 
> Post commit d6a427e2c0b2, the correct alternative is picked up, and
> ERRATA_THEAD_PBMT (using non-standard memory type bits in
> page-table-entries) kicks in. AFAIU, that's not implemented by qemu's
> c906 support, which then traps.


This looks like a very good reason to actually push what you called 'hack' as
a fix. Yeah, in theory that commit did nothing wrong, but the side effect
(missing support for non-standard memory type bits) is kind of a QEMU problem.

You're welcome to format that hack into a patch, explaining in the commit msg why
we need to set marchid for c906 to that specific value. I'd even add a TODO
tag in rv64_thead_c906_cpu_init() to remind us that this is a band-aid and
that we should remove it once we implement the needed support.



> 
> That's the theory. Maybe Christoph knows if the non-standard bits are
> implemented or not?
> 
> Regardless; I removed booting Qemu T-head c906 from the CI, and the
> build/boot passes nicely ;-) [1]

I vote for setting marchid in c906 cpu_init and re-enable it in the CI.


Thanks,

Daniel

> 
> 
> Björn
> 
> [1] https://github.com/linux-riscv/linux-riscv/actions/runs/7641764759/job/20819801235?pr=447



More information about the linux-riscv mailing list