[EXT] Re: [PATCH v2] cpufreq: armada-37xx: forbid cpufreq for 1.2 GHz variant

Elad Nachman enachman at marvell.com
Tue Aug 2 09:52:12 PDT 2022


Hi,

Unless the logs are misleading, I see this here:

cpu cpu0: _set_opp: switching OPP: Freq 200000000 -> 1200000000 Hz, Level 0 -> 0, Bw 0 -> 0

This violates the errata.
If there is an interim step in between, I think it should be printed in the debug output so we can clearly see what the interim frequency setting between 200 and 1200 MHz is.
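
For illustration, the gradual switching described by the errata, together with the kind of debug print asked for above, could be sketched roughly as follows. This is a stand-alone user-space sketch, not the actual driver or clock code. The names load_level_hz, plan_switch and switch_with_log are hypothetical, and it assumes that load levels L0..L3 of the 1.2 GHz variant correspond, in order, to the dividers {1, 2, 4, 6} from the table in the patch quoted further down.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical model of the 1.2 GHz variant: load levels L0..L3 derived
 * from cpu_freq_max = 1200 MHz and .divider = {1, 2, 4, 6}.
 */
#define NUM_LEVELS 4

const unsigned long cpu_freq_max_hz = 1200UL * 1000 * 1000;
const unsigned int level_divider[NUM_LEVELS] = {1, 2, 4, 6};

/* Frequency of a load level: L0 = 1200, L1 = 600, L2 = 300, L3 = 200 MHz. */
unsigned long load_level_hz(unsigned int level)
{
	assert(level < NUM_LEVELS);
	return cpu_freq_max_hz / level_divider[level];
}

/*
 * Fill path[] with the load levels to program, in order, when going from
 * 'from' to 'to'. A direct L2/L3 -> L0 jump is split into two steps via L1,
 * as the errata text describes. Returns the number of steps (1 or 2).
 */
int plan_switch(unsigned int from, unsigned int to, unsigned int path[2])
{
	int n = 0;

	if (from >= 2 && to == 0)
		path[n++] = 1;	/* interim step through L1 */
	path[n++] = to;
	return n;
}

/* Walk the planned path and log every step, including the interim one. */
int switch_with_log(unsigned int from, unsigned int to)
{
	unsigned int path[2];
	unsigned int cur = from;
	int i, n = plan_switch(from, to, path);

	for (i = 0; i < n; i++) {
		printf("switching L%u (%lu Hz) -> L%u (%lu Hz)\n",
		       cur, load_level_hz(cur),
		       path[i], load_level_hz(path[i]));
		cur = path[i];
	}
	return n;
}
```

With this model, switch_with_log(3, 0) logs two steps (L3 to L1, then L1 to L0), whereas a log showing a single 200 MHz to 1200 MHz step would mean the interim level was skipped or simply not reported.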

Elad.

-----Original Message-----
From: Robert Marko <robert.marko at sartura.hr> 
Sent: Tuesday, August 2, 2022 7:42 PM
To: Elad Nachman <enachman at marvell.com>
Cc: Pali Rohár <pali at kernel.org>; Wojciech Bartczak <wbartczak at marvell.com>; Marek Behún <kabel at kernel.org>; Viresh Kumar <viresh.kumar at linaro.org>; Gregory CLEMENT <gregory.clement at bootlin.com>; Tomasz Maciej Nowak <tmn505 at gmail.com>; Anders Trier Olesen <anders.trier.olesen at gmail.com>; Philip Soares <philips at netisense.com>; linux-pm at vger.kernel.org; Sebastian Hesselbarth <sebastian.hesselbarth at gmail.com>; linux-arm-kernel at lists.infradead.org; nnet <nnet at fastmail.fm>; Gérald Kerma <gandalf at gk2.net>
Subject: Re: [EXT] Re: [PATCH v2] cpufreq: armada-37xx: forbid cpufreq for 1.2 GHz variant

On Mon, Aug 1, 2022 at 8:50 PM Elad Nachman <enachman at marvell.com> wrote:
>
> Hi Pali,
>
> Could you please provide the crash dump / call trace?
>
> Also, could you please annotate with printk the exact voltage/frequency changes made by the driver, up to the point of the crash?
>
> This will help understand the sequence of events leading to the crash.
>
> Thanks,
>
> Elad.


Hi Elad,
Here are two boot logs, but I don't think they are of much use, as the traces are rather random and always different, as with a real voltage issue:
https://gist.github.com/robimarko/113216f566ccf159dfd33933889da042
https://gist.github.com/robimarko/990d757870d44a3c5acdfeb957547705

Here is a boot log with the frequency changes; the OPP points that are set by the CPUFreq driver are also included:
https://gist.github.com/robimarko/1a81b0c6e93735b75ff4461d405c8033

I am still digging to print the voltage changes as _set_opp_voltage is not being used.

Regards,
Robert
>
>
> ________________________________
> From: Pali Rohár <pali at kernel.org>
> Sent: Monday, 01 August 2022 20:56
> To: Elad Nachman <enachman at marvell.com>
> Cc: Wojciech Bartczak <wbartczak at marvell.com>; Marek Behún 
> <kabel at kernel.org>; Viresh Kumar <viresh.kumar at linaro.org>; Gregory 
> CLEMENT <gregory.clement at bootlin.com>; Robert Marko 
> <robert.marko at sartura.hr>; Tomasz Maciej Nowak <tmn505 at gmail.com>; 
> Anders Trier Olesen <anders.trier.olesen at gmail.com>; Philip Soares 
> <philips at netisense.com>; linux-pm at vger.kernel.org 
> <linux-pm at vger.kernel.org>; Sebastian Hesselbarth 
> <sebastian.hesselbarth at gmail.com>; 
> linux-arm-kernel at lists.infradead.org 
> <linux-arm-kernel at lists.infradead.org>; nnet <nnet at fastmail.fm>; 
> Gérald Kerma <gandalf at gk2.net>
> Subject: Re: [EXT] Re: [PATCH v2] cpufreq: armada-37xx: forbid cpufreq 
> for 1.2 GHz variant
>
> Hello Elad!
>
> Robert (in CC) tested this proposed change. But increasing delay to 
> 100ms does not help. CPU still crashes early during boot.
>
> On Monday 01 August 2022 14:15:27 Elad Nachman wrote:
> > Hi,
> >
> > As a first step, please try to increase the delay to 100 ms and see if it helps.
> >
> > Elad.
> >
> > -----Original Message-----
> > From: Pali Rohár <pali at kernel.org>
> > Sent: Monday, August 1, 2022 5:13 PM
> > To: Elad Nachman <enachman at marvell.com>
> > Cc: Wojciech Bartczak <wbartczak at marvell.com>; Marek Behún 
> > <kabel at kernel.org>; Viresh Kumar <viresh.kumar at linaro.org>; Gregory 
> > CLEMENT <gregory.clement at bootlin.com>; Robert Marko 
> > <robert.marko at sartura.hr>; Tomasz Maciej Nowak <tmn505 at gmail.com>; 
> > Anders Trier Olesen <anders.trier.olesen at gmail.com>; Philip Soares 
> > <philips at netisense.com>; linux-pm at vger.kernel.org; Sebastian 
> > Hesselbarth <sebastian.hesselbarth at gmail.com>; 
> > linux-arm-kernel at lists.infradead.org; nnet <nnet at fastmail.fm>
> > Subject: Re: [EXT] Re: [PATCH v2] cpufreq: armada-37xx: forbid 
> > cpufreq for 1.2 GHz variant
> >
> > Hello Elad, and thank you for the response!
> >
> > This errata workaround has been in the kernel for quite some time, implemented by Gregory's commit:
> > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=61c40f35f5cd6f67ccbd7319a1722eb78c815989
> >
> > There is also a 20 ms delay after the L2/L3 to L1 state switch.
> >
> > Any idea what could be wrong here? Or is something more than the above commit needed to correctly implement that errata?
> >
> > On Monday 01 August 2022 14:01:07 Elad Nachman wrote:
> > > Hi Pali,
> > >
> > > There is an errata for that.
> > >
> > > "
> > > Switching from L2/L3 state (200/300 MHz) to L0 state (1200 MHz) 
> > > requires sudden changes of VDD supply, and it requires time to 
> > > stabilize the VDD supply. The solution is to use gradual switching from L2/L3 to L1 and then L1 to L0 state.
> > > "
> > >
> > > I would also add an additional delay for the VDD supply stabilization.
> > >
> > > FYI,
> > >
> > > Elad.
> > >
> > > -----Original Message-----
> > > From: Pali Rohár <pali at kernel.org>
> > > Sent: Monday, August 1, 2022 3:36 PM
> > > To: Elad Nachman <enachman at marvell.com>; Wojciech Bartczak 
> > > <wbartczak at marvell.com>
> > > Cc: Marek Behún <kabel at kernel.org>; Viresh Kumar 
> > > <viresh.kumar at linaro.org>; Gregory CLEMENT 
> > > <gregory.clement at bootlin.com>; Robert Marko 
> > > <robert.marko at sartura.hr>; Tomasz Maciej Nowak <tmn505 at gmail.com>; 
> > > Anders Trier Olesen <anders.trier.olesen at gmail.com>; Philip Soares 
> > > <philips at netisense.com>; linux-pm at vger.kernel.org; Sebastian 
> > > Hesselbarth <sebastian.hesselbarth at gmail.com>;
> > > linux-arm-kernel at lists.infradead.org; nnet <nnet at fastmail.fm>
> > > Subject: [EXT] Re: [PATCH v2] cpufreq: armada-37xx: forbid cpufreq 
> > > for
> > > 1.2 GHz variant
> > >
> > > + Elad and Wojciech from Marvell
> > >
> > > > Could you please look at this issue and/or forward it to the relevant Marvell team?
> > >
> > > > Maintainer Viresh already wrote that we cannot wait forever for Marvell, and the patch which disables support for 1.2 GHz was merged:
> > > > https://lore.kernel.org/linux-pm/20210809040224.j2rvopmmqda3utc5@vireshk-i7/
> > >
> > > On Sunday 08 August 2021 21:30:26 Pali Rohár wrote:
> > > > > Gentle reminder. This is a really serious issue. Could you please look at it?
> > > >
> > > > > Adding more MarvellEmbeddedProcessors people to the loop: Evan, 
> > > > > Benjamin and Igal
> > > >
> > > > On Thursday 15 July 2021 21:33:21 Pali Rohár wrote:
> > > > > Ping! Gentle reminder for Marvell people.
> > > > >
> > > > > On Thursday 08 July 2021 16:34:51 Pali Rohár wrote:
> > > > > > > Konstantin, Nadav, Ken, Victor, Jason: This issue is pretty 
> > > > > > > serious: the CPU on the 1.2 GHz A3720 is crashing. Could you please look at it?
> > > > > >
> > > > > > On Friday 02 July 2021 18:30:35 Pali Rohár wrote:
> > > > > > > +Jason from GlobalScale as this issue affects GlobalScale Espressobin Ultra and V7 1.2 GHz boards.
> > > > > > >
> > > > > > > On Thursday 01 July 2021 00:56:01 Marek Behún wrote:
> > > > > > > > > The 1.2 GHz variant of the Armada 3720 SOC is unstable with DVFS: when
> > > > > > > > > the SOC boots, the WTMI firmware sets clocks and AVS values that work
> > > > > > > > > correctly with 1.2 GHz CPU frequency, but random crashes occur once
> > > > > > > > > cpufreq driver starts scaling.
> > > > > > > >
> > > > > > > > We do not know currently what is the reason:
> > > > > > > > > - it may be that the voltage value for L0 for 1.2 GHz variant provided
> > > > > > > > >   by the vendor in the OTP is simply incorrect when scaling is used,
> > > > > > > > - it may be that some delay is needed somewhere,
> > > > > > > > - it may be something else.
> > > > > > > >
> > > > > > > > The most sane solution now seems to be to simply forbid 
> > > > > > > > the cpufreq driver on 1.2 GHz variant.
> > > > > > > >
> > > > > > > > Signed-off-by: Marek Behún <kabel at kernel.org>
> > > > > > > > > Fixes: 92ce45fb875d ("cpufreq: Add DVFS support for Armada 37xx")
> > > > > > > > ---
> > > > > > > > If someone from Marvell could look into this, it would 
> > > > > > > > be great since basically 1.2 GHz variant cannot scale, 
> > > > > > > > which is a feature that was claimed to be supported by the SOC.
> > > > > > > >
> > > > > > > > > Ken Ma / Victor Gu, you have worked on commit
> > > > > > > > > https://github.com/MarvellEmbeddedProcessors/linux-marvell/commit/d6719fdc2b3cac58064f41b531f86993c919aa9a
> > > > > > > > > in linux-marvell.
> > > > > > > > Your patch takes away the 1202 mV constant for 1.2 GHz 
> > > > > > > > base CPU frequency and instead adds code that computes 
> > > > > > > > the voltages from the voltage found in L0 AVS register (which is filled in by WTMI firmware).
> > > > > > > >
> > > > > > > > > Do you know why the code does not work correctly for some
> > > > > > > > > 1.2 GHz boards? Do we need to force the L0 voltage to
> > > > > > > > > 1202 mV if it is lower, or something?
> > > > > > > > ---
> > > > > > > >  drivers/cpufreq/armada-37xx-cpufreq.c | 6 +++++-
> > > > > > > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > > diff --git a/drivers/cpufreq/armada-37xx-cpufreq.c b/drivers/cpufreq/armada-37xx-cpufreq.c
> > > > > > > > > index 3fc98a3ffd91..c10fc33b29b1 100644
> > > > > > > > > --- a/drivers/cpufreq/armada-37xx-cpufreq.c
> > > > > > > > > +++ b/drivers/cpufreq/armada-37xx-cpufreq.c
> > > > > > > > > @@ -104,7 +104,11 @@ struct armada_37xx_dvfs {
> > > > > > > > >  };
> > > > > > > > >
> > > > > > > > >  static struct armada_37xx_dvfs armada_37xx_dvfs[] = {
> > > > > > > > > -	{.cpu_freq_max = 1200*1000*1000, .divider = {1, 2, 4, 6} },
> > > > > > > > > +	/*
> > > > > > > > > +	 * The cpufreq scaling for 1.2 GHz variant of the SOC is currently
> > > > > > > > > +	 * unstable because we do not know how to configure it properly.
> > > > > > > > > +	 */
> > > > > > > > > +	/* {.cpu_freq_max = 1200*1000*1000, .divider = {1, 2, 4, 6} }, */
> > > > > > > > >  	{.cpu_freq_max = 1000*1000*1000, .divider = {1, 2, 4, 5} },
> > > > > > > > >  	{.cpu_freq_max = 800*1000*1000,  .divider = {1, 2, 3, 4} },
> > > > > > > > >  	{.cpu_freq_max = 600*1000*1000,  .divider = {2, 4, 5, 6} },
> > > > > > > > --
> > > > > > > > 2.31.1
> > > > > > > >



--
Robert Marko
Staff Embedded Linux Engineer
Sartura Ltd.
Lendavska ulica 16a
10000 Zagreb, Croatia
Email: robert.marko at sartura.hr
Web: http://www.sartura.hr

