[PATCH v2 08/13] firmware: arm_scmi: Harden clock protocol initialization
Cristian Marussi
cristian.marussi at arm.com
Sun Apr 26 16:10:47 PDT 2026
On Fri, Apr 24, 2026 at 03:55:08PM +0200, Geert Uytterhoeven wrote:
> Hi Cristian,
Hi,
>
> On Fri, 24 Apr 2026 at 15:32, Cristian Marussi <cristian.marussi at arm.com> wrote:
> > On Fri, Apr 24, 2026 at 02:07:59PM +0200, Nicolas Frattaroli wrote:
> > > On Tuesday, 10 March 2026 19:40:25 Central European Summer Time Cristian Marussi wrote:
> > > > Add proper error handling on failure to enumerate clocks features or
> > > > rates.
>
> > > > Signed-off-by: Cristian Marussi <cristian.marussi at arm.com>
> > > > ---
> > > > drivers/firmware/arm_scmi/clock.c | 22 ++++++++++++++++------
> > > > 1 file changed, 16 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/drivers/firmware/arm_scmi/clock.c b/drivers/firmware/arm_scmi/clock.c
> > > > index c9b62edce4fd..bf956305a8fe 100644
> > > > --- a/drivers/firmware/arm_scmi/clock.c
> > > > +++ b/drivers/firmware/arm_scmi/clock.c
> > > > @@ -402,10 +402,16 @@ static int scmi_clock_attributes_get(const struct scmi_protocol_handle *ph,
> > > > SUPPORTS_RATE_CHANGE_REQUESTED_NOTIF(attributes))
> > > > clk->rate_change_requested_notifications = true;
> > > > if (PROTOCOL_REV_MAJOR(ph->version) >= 0x3) {
> > > > - if (SUPPORTS_PARENT_CLOCK(attributes))
> > > > - scmi_clock_possible_parents(ph, clk_id, cinfo);
> > > > - if (SUPPORTS_GET_PERMISSIONS(attributes))
> > > > - scmi_clock_get_permissions(ph, clk_id, clk);
> > > > + if (SUPPORTS_PARENT_CLOCK(attributes)) {
> > > > + ret = scmi_clock_possible_parents(ph, clk_id, cinfo);
> > > > + if (ret)
> > > > + return ret;
> > > > + }
> > > > + if (SUPPORTS_GET_PERMISSIONS(attributes)) {
> > > > + ret = scmi_clock_get_permissions(ph, clk_id, clk);
> > > > + if (ret)
> > > > + return ret;
> > > > + }
> > > > if (SUPPORTS_EXTENDED_CONFIG(attributes))
> > > > clk->extended_config = true;
> > > > }
> > > > @@ -1143,8 +1149,12 @@ static int scmi_clock_protocol_init(const struct scmi_protocol_handle *ph)
> > > > for (clkid = 0; clkid < cinfo->num_clocks; clkid++) {
> > > > cinfo->clkds[clkid].id = clkid;
> > > > ret = scmi_clock_attributes_get(ph, clkid, cinfo);
> > > > - if (!ret)
> > > > - scmi_clock_describe_rates_get(ph, clkid, cinfo);
> > > > + if (ret)
> > > > + return ret;
> > > > +
> > > > + ret = scmi_clock_describe_rates_get(ph, clkid, cinfo);
> > > > + if (ret)
> > > > + return ret;
> > > > }
> > > >
> > > > if (PROTOCOL_REV_MAJOR(ph->version) >= 0x3) {
> > > >
> > >
> > > I see that a quirk is being added for this, but I thought I should chime
> > > in with my opinion for future approaches in this direction.
> > >
> > > I don't see how this hardens anything. All this does is break platforms
> > > that were previously working by returning early. At most, this should
> >
> > Certainly the naming in the subject was chosen badly (by me!)...indeed it
> > should be more something like "Enforce strict protocol compliance",
> > because at the end all of the broken platforms really run a slighly odd
> > out of spec SCMI firmware that does NOT implement one or more of the SCMI
> > mandatory command...
> >
> > > be a warning (as in not WARN but pr_warn/dev_warn/...). If firmware
> > > returns nonsense, a clock driver should imho try its best to work
> > > around the nonsense in a safe way, because the alternative is that
> > > a major part of the system (and thus likely the entire system) no
> >
> > ..well yes we definitely dont want to break deployed platforms BUT also
> > we dont want to legalize this kind of out of spec behaviour in future
> > firmwares...hence (a number ?) of quirks an FW_BUG warns probably to
> > let already broken deployed platforms survive while discouraging such
> > implementation in future fw implementations...
> >
> > These firmware most certainly wont pass the SCMI compliance test suite [1],
> > which indeed we do not mandate, but the reason these bugs happened is
> > exactly because the kernel SCMI stack was buggy and left that door open...
> >
> > More specifically these kind of out-of-spec behaviours are not really just
> > a matter being 'picky', the problem is that any resource set in any
> > SCMI protocol is defined by the spec such as to be described by a
> > contiguos set of IDs and the drivers are designed anyway under that
> > assumption from the allocation point of view, so allowing a clock ID to
> > just fail one of the mandatory commands and skip a domain would jeopardize
> > all of this and, even if clearly is NOT a problem here, seems a fragile
> > assumption.
>
> How can you have all of:
> 1. a contiguous list of IDs,
> 2. implement all mandatory commands,
> 2. restrict the use of some clocks to a subset of the agents in the system?
> Use a different list of IDs for each agent?
Yes, the SCMI server can provide a per-agent view of the world to each
agent, and ideally it should not even expose resources that are not
needed at all to an agent...and by that I mean that the server SHOULD
not enumerate that resources when queried, by dropping them from the
list of resources that return to that agent WHILE maintaining the set of
IDs contigous...
...the SCMI server provides an illusion ideally where the agent is in
control and can access whatever it wants, while the reality is that the
agent can only issue commands that are deemed safe and sensible by the server,
which is the ultimate arbiter on the system, to the extent that can hide
resources or simply silently ignore requests...
> What if a mistake was made, and a clock was exposed to an agent by
> accident?
Quirk !
More detail on all of this in my babbling on the other thread...sorry
for the flood of words :P
Thanks,
Cristian
More information about the linux-arm-kernel
mailing list