[PATCH V4 4/5] soc: qcom: Introduce SCMI based Memlat (Memory Latency) governor

Sibi Sankar quic_sibis at quicinc.com
Mon Dec 23 05:57:32 PST 2024



On 12/5/24 18:09, Cristian Marussi wrote:
> On Thu, Dec 05, 2024 at 04:33:05PM +0530, Sibi Sankar wrote:
>>
>>
>> On 11/29/24 15:27, Shivnandan Kumar wrote:
>>>
>>>
> 
> Hi Sibi,
> 
> some rants down below :P

Hey Cristian,

Thanks for taking time to put out your thoughts!

>   
>>> On 10/7/2024 11:40 AM, Sibi Sankar wrote:
>>>> Introduce a client driver that uses the memlat algorithm string
>>>> hosted on QCOM SCMI Generic Extension Protocol to detect memory
>>>> latency workloads and control frequency/level of the various
>>>> memory buses (DDR/LLCC/DDR_QOS).
>>>>
> 
> [snip]
> 
>>>> +    /* Set the effective cpu frequency calculation method */
>>>> +    ret = ops->set_param(ph, &cpucp_freq_method,
>>>> sizeof(cpucp_freq_method), MEMLAT_ALGO_STR,
>>>> +                 MEMLAT_SET_EFFECTIVE_FREQ_METHOD);
>>>> +    if (ret < 0)
>>>> +        return dev_err_probe(&sdev->dev, ret,
>>>> +                     "failed to set effective frequency calc method\n");
>>>> +
>>>
>>> Hi Sibi,
>>
>> Hey Shiv,
>>
>> Thanks for taking time to review the series!
>>
>>> Since the MEMLAT_SET_EFFECTIVE_FREQ_METHOD command is not supported in
>>> the legacy CPUCP firmware, it should be kept optional. This way, if the
>>> legacy firmware is used, the driver will not return an error when the
>>> CPUCP firmware returns -EOPNOTSUPP.
>>
>> The vendor protocol with the current major/minor version is
>> expected to work as is on x1e platforms. What legacy firmware
>> are you referring to? All future SoCs that plan to adhere to
>> it are expected to maintain this abi and can decide to make
>> use of alternate mechanisms to calculating frequency based
>> on additional dt properties set.
>>
> 
> Normally in the SCMI world you could leverage protocol versioning and
> the standard PROTOCOL_MESSAGE_ATTRIBUTES(0x2) command to let the agent
> investigate if the SCMI server it is speaking to implements or NOT a
> specific command and which specific version of that command is understood
> (with possibly varying size and fields)...
> 
> ...BUT since your vendor protocol is 'Generic' and, as it stands, it
> basically piggybacks any kind of binary payload (i.e. subcommands of
> some kind of subprotocols of yours) into the same 4 get/set/start/stop
> 'Generic' ops, without even specifying the transmitted/received payload
> sizes into the message itself....all of the possible SCMI versioning
> autodiscovery and backward-compatibility capabilities happily go out of
> the window because:
> 
> - your versioning refers to the generic protocol and you cannot possibly
>    describe all the possible future subcommands (opaque payloads) variations
>    and/or subcommands addition/removals purely on the major/minor version, AND
>    even if you did that, NONE of such future variations will be documented
>    anywhere since you are hiding all of this inside a bunch of binary blobs
> 
> - you dont even specify the payload sizes of the tx/rx 'Generic' payload
>    subcommands so it becomes even difficult for both the server and the
>    client to safely handle your 'Generic' subcommand message payloads
> 
> - you cannot issue a PROTOCOL_MESSAGE_ATTRIBUTE() to see if a specific
>    subcommand is supported, because your subcommand is NOT really a protocol
>    command but it is just one of the payloads of one of the 'Generic' protocol:
>    you commmands are only set/get/start/stop (maybe some sort of hack
>    could be doen around these...bit all will be even more flaky...)
> 
> - you dont implement NEGOTIATE_PROTOCOL_VERSION and so you cannot even
>    check if the SCMI server that you are speaking to will agree to
>    downgrade and 'speak' your Kernel SCMI agent (possibly older) 'Generic'
>    protocol version
> 
> All of this basically defeats all of the SCMI general capabilities
> around versioning and auto-discovery when it comes to your 'Generic' vendor
> protocol, with the end result that you will have to be EXTREMELY confident
> to perfectly match and keep in sync at all times your SCMI Server(FW) and
> Client(Kernel) sides, without any possibility of coping with a such a mismatch
> on the field by using some of the fallback/downgrade mechanism that you
> threw out of the window...

Even though we listed some of the background behind the generic
you have to understand it was done in a vacuum where QCOM might
have assumed that the entire vendor ID space was to be shared
across all vendors.

But your suggestions for adding another messages like NEGOTIATE
PROTOCOL_VERSION in a future major version upgrade would help
solve some of the problems you are listing out here.

Since there are devices that are out in the wild already running this
firmware, the first version of the generic vendor protocol is limited in
what it can accommodate. But like we already said we are open to changes
that will help review/maintain this for future SoCs without breakage.

> (...and sorry but looking at the above xchange about 'legacy firmware' I am
>   a bit skeptic about this as of now :D)

Some people within just used the wrong terminology here. There
are no "legacy" firmware since this is the first version of it
landing upstream and future versions have the options to implement
additional messages and ensure they do things in a way that is
easily reviewable/maintainable.

-Sibi

> 
> ...as I said initially when reviewing this series, you can do whatever
> you want within your Vendor protocol, but abusing the SCMI Vendor extensions
> capabilities in such a way could end up bringing to the table more cons
> (the above) than pros (some supposed 'simplification')
> 
> Thanks,
> Cristian



More information about the linux-arm-kernel mailing list