Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)

Anup Patel apatel at ventanamicro.com
Wed Jul 19 09:10:46 PDT 2023


Hi Bin,

On Wed, Jul 19, 2023 at 9:15 PM Bin Meng <bmeng.cn at gmail.com> wrote:
>
> On Wed, Jul 19, 2023 at 11:22 PM Anup Patel <anup at brainfault.org> wrote:
> >
> > On Wed, Jul 19, 2023 at 3:23 PM Alistair Francis <alistair23 at gmail.com> wrote:
> > >
> > > On Wed, Jul 19, 2023 at 3:39 PM Anup Patel <anup at brainfault.org> wrote:
> > > >
> > > > On Wed, Jul 19, 2023 at 7:03 AM Alistair Francis <alistair23 at gmail.com> wrote:
> > > > >
> > > > > On Sat, Jul 15, 2023 at 7:14 PM Atish Patra <atishp at atishpatra.org> wrote:
> > > > > >
> > > > > > On Fri, Jul 14, 2023 at 5:29 AM Conor Dooley <conor at kernel.org> wrote:
> > > > > > >
> > > > > > > On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> > > > > > > > On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> > > > > > > >
> > > > > > > > > > > OpenSBI v1.3
> > > > > > > > > > >    ____                    _____ ____ _____
> > > > > > > > > > >   / __ \                  / ____|  _ \_   _|
> > > > > > > > > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > > > > > > > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > > > > > > > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > > > > > > > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > > > > > > > > >         | |
> > > > > > > > > > >         |_|
> > > > > > > > > > >
> > > > > > > > > > > init_coldboot: ipi init failed (error -1009)
> > > > > > > > > > >
> > > > > > > > > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > > > > > > > > and compiles only a significantly cut down number of files from it, we
> > > > > > > > > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > > > > > > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > > > > > > > > platform firmware to vendor v1.3 either.
> > > > > > > > > > >
> > > > > > > > > > > > Unless there's something obvious to you, it sounds like I will need to
> > > > > > > > > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > > > > > > > > time.
> > > > > > > > > > >
> > > > > > > > >
> > > > > > > > > The real issue is some CPU/HART DT nodes marked as disabled in the
> > > > > > > > > DT passed to OpenSBI 1.3.
> > > > > > > > >
> > > > > > > > > This issue does not exist in any of the DTs generated by QEMU but some
> > > > > > > > > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > > > > > > > > the E-core disabled.
> > > > > > > > >
> > > > > > > > > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > > > > > > > > release happened. This issue is already fixed in the latest OpenSBI by the
> > > > > > > > > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > > > > > > > > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> > > > > > > >
> > > > > > > > Great, thanks Anup! I thought I had tested tip-of-tree too, but
> > > > > > > > obviously not.
> > > > > > > >
> > > > > > > > > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > > > > > > > > QEMU microchip-icicle-kit machine but I guess that's not true.
> > > > > > > >
> > > > > > > > Unfortunately the HSS has not worked in QEMU for a long time, and while
> > > > > > > > > I would love to fix it, I am pretty stretched for spare time to begin
> > > > > > > > with.
> > > > > > > > I usually just do direct kernel boots, which use the OpenSBI that comes
> > > > > > > > with QEMU, as I am sure you already know :)
> > > > > > > >
> > > > > > > > > At this point, you can either:
> > > > > > > > > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
> > > > > > >
> > > > > > > I forgot to reply to this point, wondering what should be done with
> > > > > > > QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
> > > > > > > of whether I can go and build a fixed version of OpenSBI.
> > > > > > >
> > > > > > FYI: The no-map fix went in OpenSBI v1.3. Without the upgrade, any
> > > > > > user using the latest kernel (> v6.4)
> > > > > > may hit those random linear map related issues (in hibernation or EFI
> > > > > > booting path).
> > > > > >
> > > > > > There are three possible scenarios:
> > > > > >
> > > > > > > 1. Upgrade to OpenSBI v1.3: any user of the microchip-icicle-kit
> > > > > > > machine or the sifive fu540 machine may hit this issue if the
> > > > > > > device tree has the disabled hart (E-core).
> > > > > > > 2. No upgrade, stay on OpenSBI v1.2: any user using hibernation or
> > > > > > > UEFI may have issues [1]
> > > > > > > 3. Include a non-release version of OpenSBI in QEMU with the fix as an exception.
> > > > > >
> > > > > > #3 probably deviates from policy and sets a bad precedent. So I am not
> > > > > > advocating for it though ;)
> > > > > > For both #1 & #2, the solution would be to use the latest OpenSBI in
> > > > > > -bios argument instead of the stock one.
> > > > > > I could be wrong but my guess is the number of users facing #2 would
> > > > > > be higher than #1.
> > > > >
> > > > > Thanks for that info Atish!
> > > > >
> > > > > We are stuck in a bad situation.
> > > > >
> > > > > The best solution would be if OpenSBI can release a 1.3.1, @Anup Patel
> > > > > do you think you could do that?
> > > >
> > > > OpenSBI has a major number and minor number in the version but it does
> > > > not have release/patch number so best would be to treat OpenSBI vX.Y.Z
> > > > as bug fixes on-top-of OpenSBI vX.Y. In other words, supervisor software
> > > > won't be able to differentiate between OpenSBI vX.Y.Z and OpenSBI vX.Y
> > > > using sbi_get_impl_version().
> > > >
> > > > > There are only three commits between the ACLINT fix and OpenSBI v1.3
> > > > > so as a one-off case I will go ahead and create OpenSBI v1.3.1 containing
> > > > > only four commits on top of OpenSBI v1.3.
> > > >
> > > > Does this sound okay ?
> > >
> > > That sounds fine to me. It fixes the issue for the Microsemi board and
> > > it's a very small change between 1.3 and 1.3.1
> >
> > Please check
> > https://github.com/riscv-software-src/opensbi/releases/tag/v1.3.1
> >
> > I hope this helps.
>
> Hi Alistair,
>
> Do we need to update QEMU's opensbi binaries to v1.3.1?
>
> Hi Anup,
>
> Somehow I cannot see the 'tag' v1.3.1 being populated in the opensbi
> git repo. Am I missing anything?

There is a v1.3.1 tag in https://github.com/riscv-software-src/opensbi
(Try cloning the repo again?)

The commit history of v1.3.1 is the v1.3 tag + 5 cherry-picked commits,
which means the commit history of the master branch is not the same
as the commit history of v1.3.1.
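
Because v1.3.1 sits on a history that master does not contain, one possible reason for not seeing the tag in an existing clone is that a plain "git fetch" only auto-follows tags pointing into the branch history it downloads. The sketch below reproduces that situation with throwaway local repositories; the directory names, commit messages, and the g() helper are made up for illustration, and nothing here touches the real opensbi repository.

```shell
#!/bin/sh
# Throwaway demo: every repository, commit, and tag here is created locally.
set -e
work=$(mktemp -d)
cd "$work"

# Pass an identity per command so the demo does not depend on global git config.
g() { git -c user.name=demo -c user.email=demo@example.invalid "$@"; }

# "upstream" stands in for the real repository.
g init -q upstream
cd upstream
g commit -q --allow-empty -m "release"
g tag v1.3
g commit -q --allow-empty -m "later development on the main branch"
cd ..

# Clone before the point release exists.
g clone -q upstream clone

# Upstream now tags a point release built on top of v1.3, off the main branch.
cd upstream
g checkout -q --detach v1.3
g commit -q --allow-empty -m "cherry-picked fix"
g tag v1.3.1
g checkout -q -
cd ../clone

# A plain fetch only auto-follows tags that point into fetched branch history.
g fetch -q origin
before=$(g tag -l v1.3.1)

# Asking for all tags explicitly brings the point-release tag in.
g fetch -q --tags origin
after=$(g tag -l v1.3.1)

echo "after plain fetch:  '${before}'"
echo "after fetch --tags: '${after}'"

# Commits that the point release adds on top of v1.3.
g log --oneline v1.3..v1.3.1
```

In an existing opensbi clone the equivalent steps would be "git fetch --tags origin" followed by "git log --oneline v1.3..v1.3.1", assuming the remote is named origin.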

Regards,
Anup


