cpufreq: frequency scaling spec in DT node
Mason
slash.tmp at free.fr
Thu Jun 29 04:41:46 PDT 2017
On 29/06/2017 12:04, Viresh Kumar wrote:
> On 29-06-17, 11:48, Mason wrote:
>
>> I have two similar, but slightly different SoCs.
>>
>> Firmware/bootloader sets the "nominal" CPU frequency to
>
> So nominal here is MAX cpu frequency.
>
>> - 1215 MHz on SoC A
>> - 1206 MHz on SoC B
>>
>> On both systems, software can reduce the CPU frequency by
>> writing an 8-bit integer divider to an MMIO register.
>>
>> Originally, I wanted to define a small number of operating points,
>> defined only by the divider value, and compute the actual OPP freq
>> at init.
>>
>> For example, use { 1, 2, 3, 5, 9 } for dividers =>
>> 1215, 607.5, 405, 243, 135 on SoC A
>> 1206, 603, 402, 241.2, 134 on SoC B
>>
>> I'm using the generic cpufreq driver.
>>
>> Binding for the generic cpufreq driver:
>> https://www.kernel.org/doc/Documentation/devicetree/bindings/cpufreq/cpufreq-dt.txt
>>
>> I don't think there's a way to do what I want with the
>> existing driver, right?
>
> No, you should rather use actual target frequency values.
>
>> It's not a big deal, I can write the actual target frequencies
>> in the DT.
>
> Right.
>
>> (BTW, the OPPs are more SW than HW desc, right?)
>
> Hmm, I wouldn't say that exactly :)
>
> What OPP contains is mostly defined by hardware, apart from the
> frequency values we are talking about. And those are decided by the
> boot loaders and they are like hardware to the kernel really. They
> define hardware capabilities IOW.
>
> If you want, you can actually try implementing a ->target() type
> cpufreq driver instead of ->target_index() and you will be able to
> select any frequency you want. But with the above example, what you
> can select is Max divided by integer value and so you can have 9
> different OPPs and reuse cpufreq-dt.
>
>> But my problem is: what happens if firmware/bootloader is
>> changed without me knowing, and they change the nominal
>> frequency?
>
> The kernel doesn't have any authority over what frequencies we are
> allowed to use and we depend on the boot loader for that. If someone
> changes that, screw him :)
>
>> Because of the rounding, if the nominal freq
>> is slightly increased, the SoC will start working at
>
> decreased ?
>
>> *slower* speeds.
>>
>> For example, if nominal is 1215, and I request 603, I will
>> actually get 405.
>
> No, you will normally get a frequency >= requested frequency with the
> cpufreq governors we have.
>
>> This effect can be seen if I define SoC B OPPs on SoC A:
>>
>> $ cat scaling_available_frequencies
>> 134000 241200 402000 603000 1206000
>> /sys/devices/system/cpu/cpu0/cpufreq$ echo 603000 > scaling_max_freq
>
> Wow. This is not how you request a frequency. What you said here is
> that the MAX frequency allowed now is 603000 instead of 1206000. And
> because 603000 isn't a valid frequency, we go down to 405000.
>
> So, you should try using the userspace governor and play with
> scaling_setspeed sysfs file.

I was trying to "emulate" the behavior of the ondemand governor.
Based on your reaction, I got it wrong...

Here is the actual issue:

I'm on SoC B, where nominal/max freq is expected to be 1206 MHz.
So the OPPs in the DT are:

operating-points = <1206000 0 603000 0 402000 0 241200 0 134000 0>;

*But* FW changed the max freq behind my back, to 1215 MHz.

Here is what happens when I execute:

echo ondemand >scaling_governor
sleep 2
cpuburn-a9 & cpuburn-a9 & cpuburn-a9 & cpuburn-a9
### cpuburn-a9 spins in a tight infinite loop,
### hitting all FUs to raise the CPU temperature

# cpufreq_test.sh
[ 69.933874] set_target: index=4
[ 69.944799] set_target: index=2
[ 69.947988] clk_divider_set_rate: rate=303750000 parent_rate=1215000000 div=4
[ 69.955542] set_target: index=4
[ 69.958801] clk_divider_set_rate: rate=607500000 parent_rate=1215000000 div=2
[ 69.984789] set_target: index=0
[ 69.987980] clk_divider_set_rate: rate=121500000 parent_rate=1215000000 div=10
[ 71.947597] set_target: index=4
[ 71.950996] clk_divider_set_rate: rate=607500000 parent_rate=1215000000 div=2
As you can see, the divider remains stuck at 2, so the SoC
is actually running only at 607.5 MHz (instead of 1215 MHz).
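
The numbers in the log above are consistent with a divider that is rounded
up, so that the resulting rate never exceeds the requested one. Below is a
minimal userspace sketch of that arithmetic, not the actual clk driver code.
pick_divider() and divided_rate() are hypothetical helpers, and the round-up
policy is an assumption inferred from the log.

```c
#include <assert.h>

/*
 * Hypothetical helper: smallest integer divider such that
 * parent_hz / div <= wanted_hz (the divider is rounded up).
 * Assumption: this mirrors what the log shows, not the real driver.
 */
static unsigned long pick_divider(unsigned long parent_hz,
				  unsigned long wanted_hz)
{
	assert(wanted_hz > 0);
	/* Integer division, plus one if there is a remainder. */
	return parent_hz / wanted_hz + (parent_hz % wanted_hz != 0);
}

/* Rate actually produced once the divider has been chosen. */
static unsigned long divided_rate(unsigned long parent_hz,
				  unsigned long wanted_hz)
{
	return parent_hz / pick_divider(parent_hz, wanted_hz);
}
```

With parent_hz = 1215000000, asking for 1206000000 gives div=2, because
div=1 would overshoot the request by 9 MHz. Asking for exactly 1215000000
gives div=1.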
If I fix the OPPs in DT to:
operating-points = <1215000 0 607500 0 405000 0 243000 0 135000 0>;
Then I get the expected behavior:
$ cpufreq_test.sh
[ 32.717930] set_target: index=1
[ 32.721131] clk_divider_set_rate: rate=243000000 parent_rate=1215000000 div=5
[ 32.731326] set_target: index=4
[ 32.734521] clk_divider_set_rate: rate=1215000000 parent_rate=1215000000 div=1
[ 32.754556] set_target: index=0
[ 32.757738] clk_divider_set_rate: rate=135000000 parent_rate=1215000000 div=9
[ 32.765864] set_target: index=4
[ 32.769217] clk_divider_set_rate: rate=1215000000 parent_rate=1215000000 div=1
[ 33.438811] set_target: index=0
[ 33.442001] clk_divider_set_rate: rate=135000000 parent_rate=1215000000 div=9
[ 33.450249] set_target: index=4
[ 33.453470] clk_divider_set_rate: rate=1215000000 parent_rate=1215000000 div=1
[ 33.477888] set_target: index=0
[ 33.481067] clk_divider_set_rate: rate=135000000 parent_rate=1215000000 div=9
[ 34.714786] set_target: index=4
[ 34.718237] clk_divider_set_rate: rate=1215000000 parent_rate=1215000000 div=1
Divider settles at 1 (full speed) to provide maximum
performance for the user-space processes.
My concern is that if I don't check somewhere that the
nominal frequency is as expected in the DT, the CPU might
run slower than expected (max freq cut in half).
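
Such a check could be as simple as comparing the rate left by the firmware
with the highest OPP listed in the DT. The sketch below is plain userspace C
under stated assumptions: opp_table_matches_nominal() is a hypothetical
helper, the two tables are the operating-points values quoted in this mail,
and in a real driver the nominal rate would have to come from the clock
framework (e.g. clk_get_rate()) rather than from a constant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* OPP frequencies in kHz, copied from the two DT snippets in this mail. */
static const unsigned long opps_1206_khz[] = {
	1206000, 603000, 402000, 241200, 134000
};
static const unsigned long opps_1215_khz[] = {
	1215000, 607500, 405000, 243000, 135000
};

/*
 * Hypothetical helper: true if the highest OPP in the table equals
 * the nominal rate that firmware programmed.
 */
static bool opp_table_matches_nominal(const unsigned long *opp_khz, size_t n,
				      unsigned long nominal_khz)
{
	unsigned long max_khz = 0;
	size_t i;

	assert(opp_khz != NULL);
	for (i = 0; i < n; i++)
		if (opp_khz[i] > max_khz)
			max_khz = opp_khz[i];

	return n > 0 && max_khz == nominal_khz;
}
```

A driver that found a mismatch could warn, or refuse to register the
table, instead of silently running at half speed.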
>> [ 60.401883] set_target: index=3
>> [ 60.405118] clk_divider_set_rate: rate=405000000 parent_rate=1215000000 div=3
>>
>>
>> What can I do against that?
>>
>> Should I check the nominal frequency in my clk driver?
>> (I'm not sure reading properties of unrelated nodes is acceptable practice.)
>
> We rely on the boot loader to get these details.
>
> There is one thing you can do to avoid adding OPP entries in the DT.
> You can rather add them dynamically with help of: dev_pm_opp_add() and
> cpufreq-dt will continue to work with that too.
In what driver should I call these... the clk driver?
(drivers/clk/tegra/cvb.c seems to be doing that)
A problem might arise when I need to do voltage scaling,
though, since I also need to specify voltages, right?
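
If the OPPs were added dynamically, the frequencies could be derived from
whatever nominal rate the firmware programmed, using the divider set
{ 1, 2, 3, 5, 9 } from earlier in the thread. The following is only a
userspace sketch of that computation; fill_opp_table() is a hypothetical
name. In a driver, each resulting frequency would be handed to
dev_pm_opp_add() together with a voltage, which this sketch does not model.

```c
#include <assert.h>
#include <stddef.h>

/* Divider values proposed earlier in the thread. */
static const unsigned int dividers[] = { 1, 2, 3, 5, 9 };
#define NUM_DIVIDERS (sizeof(dividers) / sizeof(dividers[0]))

/*
 * Hypothetical helper: fill out_khz[] with nominal_khz divided by
 * each entry of dividers[]. Frequencies are in kHz, like the
 * operating-points property.
 */
static void fill_opp_table(unsigned long nominal_khz,
			   unsigned long out_khz[NUM_DIVIDERS])
{
	size_t i;

	assert(nominal_khz > 0);
	for (i = 0; i < NUM_DIVIDERS; i++)
		out_khz[i] = nominal_khz / dividers[i];
}
```

For a nominal rate of 1215000 kHz this yields 1215000, 607500, 405000,
243000 and 135000; for 1206000 kHz it yields 1206000, 603000, 402000,
241200 and 134000, i.e. the two operating-points tables shown above.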
> But you should understand how to use the sysfs interface first and
> make sure you are doing the right thing.
You're talking about this document, right?
https://www.kernel.org/doc/Documentation/cpu-freq/user-guide.txt
Regards.