[PATCH 5/6] arm64: topology: Tell the scheduler about the relative power of cores

Wed Dec 11 14:27:09 EST 2013

On Wed, 11 Dec 2013, Mark Brown wrote:

> On Wed, Dec 11, 2013 at 02:47:55PM +0000, Catalin Marinas wrote:
> > On Wed, Dec 11, 2013 at 01:13:25PM +0000, Mark Brown wrote:
> 
> > > The power numbers are the same as for ARMv7 since it seems that the
> > > expected differential between the big and little cores is very similar on
> > > both ARMv7 and ARMv8.
> 
> > I have no idea ;). We don't have real silicon yet, so that's just a wild
> > guess.
> 
> I was going on some typical DMIPS/MHz numbers that I'd found so
> hopefully it's not a complete guess, though it will vary and that's just
> one benchmark with all the realism problems that entails.  The ratio
> seemed to be about the same as the equivalent for the ARMv7 cores so
> given that it's a finger in the air thing it didn't seem worth drilling
> down much further.
> 
> > > +static const struct cpu_efficiency table_efficiency[] = {
> > > +	{ "arm,cortex-a57", 3891 },
> > > +	{ "arm,cortex-a53", 2048 },
> > > +	{ NULL, },
> > > +};
> 
> > I also don't think we can just have absolute numbers here. I'm pretty
> > sure these were generated on TC2 but other platforms may have different
> > max CPU frequencies, memory subsystem, level and size of caches. The
> > "average" efficiency and difference will be different.
> 
> The CPU frequencies at least are taken care of already, these numbers
> get scaled for each core.  Once we're talking about things like the
> memory I'd also start worrying about application specific effects.
> There's also going to be stuff like thermal management which get fed in
> here and which varies during runtime.
> 
> I don't know where the numbers came from for v7.
> 
> > Can we define this via DT? It's a bit strange since that's a constant
> > used by the Linux scheduler but highly related to hardware.
> 
> I really don't think that's a good idea at this point, it seems better
> for the DT to stick to factual descriptions of what's present rather
> than putting tuning numbers in there.  If the wild guesses are in the
> kernel source it's fairly easy to improve them, if they're baked into
> system DTs that becomes harder.

I really think putting such things into DT is wrong.

If those numbers were derived from benchmark results, then it is most
probably best to try to come up with some kind of equivalent benchmark
in the kernel to qualify CPUs at run time.  After all, this is what
actually matters, i.e. how CPUs perform relative to each other, and that
may vary with many factors that people will forget to update when
copying DT content to enable a new board.

And that wouldn't be the first time some benchmark is used at boot time.  
Different crypto/RAID algorithms are tested to determine the best one to 
use, etc.
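
To illustrate the general pattern (not the actual kernel code), here is
a minimal user-space sketch: time each candidate implementation over
the same buffer and keep the fastest.  All names, buffer sizes and the
use of clock() are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define BUF_WORDS 4096	/* hypothetical test buffer size */
#define ROUNDS    200	/* even, so repeated XOR restores the buffer */

/* Two hypothetical implementations of the same operation: dst ^= src. */
static void xor_simple(uint32_t *dst, const uint32_t *src, size_t n)
{
	for (size_t i = 0; i < n; i++)
		dst[i] ^= src[i];
}

static void xor_unrolled(uint32_t *dst, const uint32_t *src, size_t n)
{
	size_t i = 0;

	for (; i + 4 <= n; i += 4) {
		dst[i]     ^= src[i];
		dst[i + 1] ^= src[i + 1];
		dst[i + 2] ^= src[i + 2];
		dst[i + 3] ^= src[i + 3];
	}
	for (; i < n; i++)	/* leftover words */
		dst[i] ^= src[i];
}

struct xor_candidate {
	const char *name;
	void (*fn)(uint32_t *dst, const uint32_t *src, size_t n);
};

static const struct xor_candidate candidates[] = {
	{ "simple",   xor_simple },
	{ "unrolled", xor_unrolled },
};

#define NR_CANDIDATES (sizeof(candidates) / sizeof(candidates[0]))

/* Run every candidate on the same data; return the index of the fastest. */
size_t pick_fastest(void)
{
	static uint32_t a[BUF_WORDS], b[BUF_WORDS];
	size_t best = 0;
	clock_t best_ticks = 0;

	for (size_t i = 0; i < BUF_WORDS; i++)
		b[i] = (uint32_t)i * 2654435761u;	/* arbitrary pattern */

	for (size_t c = 0; c < NR_CANDIDATES; c++) {
		clock_t start = clock();

		for (int r = 0; r < ROUNDS; r++)
			candidates[c].fn(a, b, BUF_WORDS);

		clock_t ticks = clock() - start;

		/* An even number of XOR rounds must leave the buffer zeroed. */
		for (size_t i = 0; i < BUF_WORDS; i++)
			assert(a[i] == 0);

		if (c == 0 || ticks < best_ticks) {
			best = c;
			best_ticks = ticks;
		}
	}
	return best;
}
```

Which candidate wins depends on the machine and the compiler, which is
the whole point: the answer is measured where the code runs rather than
written down in advance.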

> I'm also worried about putting numbers into the DT now with all the
> scheduler work going on, this time next year we may well have a
> completely different idea of what we want to tell the scheduler.  It may
> be that we end up being able to explicitly tell the scheduler about
> things like the memory architecture, or that the scheduler just gets
> smarter and can estimate all this stuff at runtime.  

Exactly.  Which is why the kernel had better be self-sufficient to
determine such parameters.  DT should be used only for things that may
not be probed at run time.  The relative performance of a CPU certainly
can be probed at run time.
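
As a rough sketch of how probed results could be turned into relative
numbers: assume each CPU has already produced a benchmark score (higher
is faster), then scale every score against the best one.  The function
name, the CAPACITY_SCALE constant and the choice of 1024 as full scale
are assumptions for illustration, and the sample values in the comment
are simply the two entries of the static table quoted above reused as
inputs, not measurements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CAPACITY_SCALE 1024u	/* assumed fixed-point "full capacity" */

/*
 * Convert per-CPU benchmark scores into capacities relative to the
 * fastest CPU, which gets CAPACITY_SCALE.  Scores are assumed small
 * enough that score * CAPACITY_SCALE does not overflow 64 bits.
 */
void scores_to_capacity(const uint64_t *score, unsigned int *capacity,
			size_t nr_cpus)
{
	uint64_t max = 0;

	assert(nr_cpus > 0);

	for (size_t i = 0; i < nr_cpus; i++)
		if (score[i] > max)
			max = score[i];

	for (size_t i = 0; i < nr_cpus; i++) {
		if (max == 0)	/* no usable measurement: treat all as equal */
			capacity[i] = CAPACITY_SCALE;
		else
			capacity[i] = (unsigned int)(score[i] * CAPACITY_SCALE / max);
	}
}

/*
 * Example: scores { 3891, 2048 } give capacities { 1024, 538 },
 * since 2048 * 1024 / 3891 truncates to 538.
 */
```

Only the ratio between CPUs survives the normalization, so the absolute
values produced by whatever benchmark gets picked do not matter.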

Obviously the specifics of the actual benchmark might be debated, but 
the same can be said about static numbers.

Nicolas