[BUG] New arm scmi check in linux-next causing rk3568 not to boot due to firmware bug

Cristian Marussi cristian.marussi at arm.com
Thu May 5 07:21:53 PDT 2022


On Thu, May 05, 2022 at 11:47:41AM +0100, Cristian Marussi wrote:
> On Thu, May 05, 2022 at 11:40:09AM +0200, Etienne Carriere wrote:
> > Hello Nicolas, Cristian,
> > 
> > On Thu, 5 May 2022 at 10:03, Cristian Marussi <cristian.marussi at arm.com> wrote:
> > >
> > > On Wed, May 04, 2022 at 07:51:45PM +0200, Nicolas Frattaroli wrote:
> > > > On Mittwoch, 4. Mai 2022 15:21:30 CEST Sudeep Holla wrote:
> > > > > + Cristian
> > >
> > > +Etieenne
> > >
> > > Hi Nicolas,
> > >
> 
> Hi Etienne,
> 
> [snip]
> 
> > > Having a quick look at TF-A SCMI code in charge of this message (at least in
> > > the upstream):
> > >
> > > https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/tree/drivers/scmi-msg/base.c#n136
> > >
> > > it seems to me that the bug lies in the fact that the BASE_DISCOVER_PROTOCOLS
> > > response message built by the function above is not properly sized: the trailing
> > > message payload carrying the list of protocols (after returned_protos field) returns
> > > always a fixed arbitrarily sized payload, possibly zeroed if fewer protocols are
> > > carried.
> > >
> > > IOW, even though the answer in this case carries 3 items/protocols, the payload
> > > is anyway 8 bytes (with the last 5 bytes zeroed), while by the spec it should have
> > > been just 4 bytes.
> > >
> > > (in fact testing the kernel fix on a JUNO with last SCP fw release did NOT expose
> > > any issue...)
> > >
> > > I think a fix FW side could be something along these lines (UNTESTED NOR
> > > BUILT ! ... I Cc'ed Etienne that seems the author of this bit)
> > >
> > > This basically mirrors the same checks introduced in kernel...if someone
> > > is fancy/able to test it....
> > 
> > Indeed the firmware implementation is wrong in TF-A.
> > And also in OP-TEE by the way:
> > https://github.com/OP-TEE/optee_os/blob/3.17.0/core/drivers/scmi-msg/base.c#L163-L166
> > 
> > @Nicoals, do you want to send a patch to TF-A, or do you want me to do it?
> > 
> > I can fix the optee_os implementation. I'll tell you when I'll have
> > created a P-R.
> > The fix is the same for TF-A and OP-TEE.
> > Proposal from Cristian looks good to me, maybe simplified:
> > 
> > ```patch
> >          memcpy(outargs, &p2a, sizeof(p2a));
> >          memcpy(outargs + sizeof(p2a), list + a2p->skip, count);
> > 
> > -        scmi_write_response(msg, outargs, sizeof(outargs));
> > +        list_sz = (1 + (count - 1) / sizeof(uint32_t)) * sizeof(uint32_t);
> > +        scmi_write_response(msg, outargs, sizeof(p2a) + list_sz);
> > ```
> > 
> 
> I don't think list_sz is properly calculated if you don't rule out the
> count = 0 case (did the same mistake in Kernel at first :D)...if count is
> zero list_sz ends up being 4 [(1 + (0 - 1)  / 4 ) * 4] while it should be
> zero. (...and 'if (count)' also avoid an unneeded memcpy of zero bytes)
> 
> Moreover reviewing my own proposal code below it's probably easier to use
> some sort of macro like ALIGN(count, 4) if there's one in TF-A/OP-TEE.
> (...checking anyway that can handle correctly the zero case..)
> 
> Thanks,
> Etienne
> 
 Sorry this was meant to be
 
 Thanks,
 Cristian

:P but I messed up the snipping

Cristian




More information about the Linux-rockchip mailing list