[PATCH 2/4] of: DT quirks infrastructure

Pantelis Antoniou pantelis.antoniou at konsulko.com
Fri Feb 20 07:38:19 PST 2015


Hi Peter,

> On Feb 20, 2015, at 17:24 , Peter Hurley <peter at hurleysoftware.com> wrote:
> 
> On 02/20/2015 10:02 AM, Pantelis Antoniou wrote:
>> Hi Peter,
>> 
>>> On Feb 20, 2015, at 17:00 , Peter Hurley <peter at hurleysoftware.com> wrote:
>>> 
>>> On 02/20/2015 09:35 AM, Ludovic Desroches wrote:
>>>> On Fri, Feb 20, 2015 at 09:21:38AM -0500, Peter Hurley wrote:
>>>>> On 02/19/2015 12:38 PM, Pantelis Antoniou wrote:
>>>>>> 
>>>>>>> On Feb 19, 2015, at 19:30 , Frank Rowand <frowand.list at gmail.com> wrote:
>>>>>>> 
>>>>>>> On 2/19/2015 9:00 AM, Pantelis Antoniou wrote:
>>>>>>>> Hi Frank,
>>>>>>>> 
>>>>>>>>> On Feb 19, 2015, at 18:48 , Frank Rowand <frowand.list at gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>> On 2/19/2015 6:29 AM, Pantelis Antoniou wrote:
>>>>>>>>>> Hi Mark,
>>>>>>>>>> 
>>>>>>>>>>> On Feb 18, 2015, at 19:31 , Mark Rutland <mark.rutland at arm.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>>>> +While this may in theory work, in practice it is very cumbersome
>>>>>>>>>>>>>> +for the following reasons:
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +1. The act of selecting a different boot device tree blob requires
>>>>>>>>>>>>>> +a reasonably advanced bootloader with some kind of configuration or
>>>>>>>>>>>>>> +scripting capabilities. Sadly this is not the case many times, the
>>>>>>>>>>>>>> +bootloader is extremely dumb and can only use a single dt blob.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> You can have several bootloader builds, or even a single build with
>>>>>>>>>>>>> something like appended DTB to get an appropriate DTB if the same binary
>>>>>>>>>>>>> will otherwise work across all variants of a board.
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> No, the same DTB will not work across all the variants of a board.
>>>>>>>>>>> 
>>>>>>>>>>> I wasn't on about the DTB. I was on about the loader binary, in the case
>>>>>>>>>>> the FW/bootloader could be common even if the DTB couldn't.
>>>>>>>>>>> 
>>>>>>>>>>> To some extent there must be a DTB that will work across all variants
>>>>>>>>>>> (albeit with limited utility) or the quirk approach wouldn't work…
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> That’s not correct; the only part of the DTB that needs to be common
>>>>>>>>>> is the model property that would allow the quirk detection logic to fire.
>>>>>>>>>> 
>>>>>>>>>> So, there is a base DTB that will work on all variants, but that only means
>>>>>>>>>> that it will work only up to the point that the quirk detector method
>>>>>>>>>> can work. So while in recommended practice there are common subsets
>>>>>>>>>> of the DTB that might work, they might be unsafe.
>>>>>>>>>> 
>>>>>>>>>> For instance on the beaglebone the regulator configuration is different
>>>>>>>>>> between white and black, it is imperative you get them right otherwise
>>>>>>>>>> you risk board damage.
>>>>>>>>>> 
>>>>>>>>>>>>> So it's not necessarily true that you need a complex bootloader.
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>>> +2. On many instances boot time is extremely critical; in some cases
>>>>>>>>>>>>>> +there are hard requirements like having working video feeds in under
>>>>>>>>>>>>>> +2 seconds from power-up. This leaves an extremely small time budget for
>>>>>>>>>>>>>> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
>>>>>>>>>>>>>> +is by removing the standard bootloader from the normal boot sequence
>>>>>>>>>>>>>> +altogether by having a very small boot shim that loads the kernel and
>>>>>>>>>>>>>> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Given my previous comments above I don't see why this is relevant.
>>>>>>>>>>>>> You're already passing _some_ DTB here, so if you can organise for the
>>>>>>>>>>>>> board to statically provide a sane DTB that's fine, or you can resort to
>>>>>>>>>>>>> appended DTB if it's not possible to update the board configuration.
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> You’re missing the point. I can’t use the same DTB for each revision of the
>>>>>>>>>>>> board. Each board is similar but it’s not identical.
>>>>>>>>>>> 
>>>>>>>>>>> I think you've misunderstood my point. If you program the board with the
>>>>>>>>>>> relevant DTB, or use appended DTB, then you will pass the correct DTB to
>>>>>>>>>>> the kernel without need for quirks.
>>>>>>>>>>> 
>>>>>>>>>>> I understand that each variant is somewhat incompatible (and hence needs
>>>>>>>>>>> its own DTB).
>>>>>>>>>> 
>>>>>>>>>> In theory it might work, in practice this does not. Ludovic mentioned that they
>>>>>>>>>> have 27 different DTBs in use at the moment. At a relatively common 60k per DTB
>>>>>>>>>> that’s 27x60k = 1.6MB of DTBs, that need to be installed.
>>>>>>>>> 
>>>>>>>>> < snip >
>>>>>>>>> 
>>>>>>>>> Or you can install the correct DTB on the board.  You trust your manufacturing line
>>>>>>>>> to install the correct resistors.  You trust your manufacturing line to install the
>>>>>>>>> correct kernel version (eg an updated version to resolve a security issue).
>>>>>>>>> 
>>>>>>>>> I thought the DT blob was supposed to follow the same standard that other OS's or
>>>>>>>>> bootloaders understood.  Are you willing to break that?  (This is one of those
>>>>>>>>> ripples I mentioned in my other emails.)
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> Trust no-one.
>>>>>>>> 
>>>>>>>> This is one of those things that the kernel community doesn’t understand which makes people
>>>>>>>> who push product quite mad.
>>>>>>>> 
>>>>>>>> Engineering a product is not only about meeting customer spec, in order to turn a profit
>>>>>>>> the whole endeavor must be engineered as well for manufacturability.
>>>>>>>> 
>>>>>>>> Yes, you can always manually install files in the bootloader. For 1 board no problem.
>>>>>>>> For 10 doable. For 100 I guess you can hire an extra guy. For 1 million? Guess what,
>>>>>>>> instead of turning a profit you’re losing money if you only have a few cents of profit
>>>>>>>> per unit.
>>>>>>> 
>>>>>>> I'm not installing physical components manually.  Why would I be installing software
>>>>>>> manually?  (rhetorical question)
>>>>>>> 
>>>>>> 
>>>>>> Because on high volume product runs the flash comes preprogrammed and is soldered as is.
>>>>>> 
>>>>>> Having a single binary to flash to every revision of the board makes logistics considerably
>>>>>> easier.
>>>>>> 
>>>>>> Having to boot and tweak the bootloader settings to select the correct dtb (even if it’s present
>>>>>> on the flash medium) takes time and is error-prone.
>>>>>> 
>>>>>> Factory time == money, errors == money.
>>>>>> 
>>>>>>>> 
>>>>>>>> No knobs to tweak means no knobs to break. And a broken knob can have pretty bad consequences
>>>>>>>> for a few million units. 
>>>>>>> 
>>>>>>> And you produce a few million units before testing that the first one off the line works?
>>>>>>> 
>>>>>> 
>>>>>> The first one off the line works. The rest will get some burn in and functional testing if you’re
>>>>>> lucky. In many cases where the product is very cheap it might make financial sense to just ship
>>>>>> as is and deal with recalls, if you’re reasonably happy after a little bit of statistical sampling.
>>>>>> 
>>>>>> Hardware is hard :)
>>>>> 
>>>>> I'm failing to see how this series improves your manufacturing process at all.
>>>>> 
>>>>> 1. Won't you have to provide the factory with different eeprom images for the
>>>>>  White and Black?  You _trust_ them to get that right, or more likely, you
>>>>>  have process control procedures in place so that you don't get 1 million Blacks
>>>>>  flashed with the White eeprom image.
>>>>> 
>>>>> 2. The White and Black use different memory technology so it's not as if the
>>>>>  eMMC from the Black will end up on the White SMT line (or vice versa).
>>>>> 
>>>>> 3  For that matter, why wouldn't you worry that all the microSD cards intended
>>>>>  for the White were accidentally assembled with the first 50,000 Blacks; at
>>>>>  that point you're losing a lot more than a few cents of profit. And that has
>>>>>  nothing to do with what image you provided.
>>>>> 
>>>> 
>>>> As you said, we can imagine many reasons to have a failure during the
>>>> production, having several DTB files will increase the risk.
>>> 
>>> It's interesting that you don't see the added complexity of open-coding
>>> the i2c driver or mixing DTS fragments for different designs as increased risk
>>> (for us all).
>>> 
>>> 
>> 
>> You don’t have to use it.
> 
>> diff --git a/arch/arm/mach-omap2/Makefile b/arch/arm/mach-omap2/Makefile
>> index 5d27dfd..02129e7 100644
>> --- a/arch/arm/mach-omap2/Makefile
>> +++ b/arch/arm/mach-omap2/Makefile
>> @@ -259,6 +259,11 @@ obj-$(CONFIG_MACH_CRANEBOARD)		+= board-am3517crane.o
>> 
>> obj-$(CONFIG_MACH_SBC3530)		+= board-omap3stalker.o
>> 
>> +# DT quirks
>> +ifneq ($(CONFIG_OF_DYNAMIC),)
>> +obj-$(CONFIG_SOC_AM33XX)		+= am33xx-dt-quirks.o
>> +endif
> 
> Won't this automatically be included on my Black that supports DT overlays?
> 

Yes it will. It is a grand total of 498 lines of code, and the total size of
the code segment is about 2.2K.

You do realize that you’re probably booting a multi-platform kernel on the 
black right? Including things for all 2xxx/3xxx and 44xx platforms?
For instance:

> ~/ti/kernels/linux-github.git $ wc -l arch/arm/mach-omap2/*44xx*.c
>    443 arch/arm/mach-omap2/clockdomains44xx_data.c
>    526 arch/arm/mach-omap2/cminst44xx.c
>    251 arch/arm/mach-omap2/cpuidle44xx.c
>    250 arch/arm/mach-omap2/dpll44xx.c
>   4864 arch/arm/mach-omap2/omap_hwmod_44xx_data.c
>    295 arch/arm/mach-omap2/pm44xx.c
>    358 arch/arm/mach-omap2/powerdomains44xx_data.c
>     62 arch/arm/mach-omap2/prcm_mpu44xx.c
>    770 arch/arm/mach-omap2/prm44xx.c
>    210 arch/arm/mach-omap2/prminst44xx.c
>    117 arch/arm/mach-omap2/vc44xx_data.c
>    130 arch/arm/mach-omap2/voltagedomains44xx_data.c
>    104 arch/arm/mach-omap2/vp44xx_data.c
>   8380 total

I bet those things are far larger than 2.2K. And I also bet that in the
tradeoff analysis that the board maintainer did things came down to 
increasing complexity so that he can consolidate the kernels for all the
other platforms he has to support besides the black.

> 
>> Some people really do though. As for increased risk
>> I expect to see arguments instead of a statement.
> 
> No one is wasting your time with random arguments. Please keep your tone civil.
> 

A statement like 'increasing risk for all of us' is very open ended. What is
the risk? How much of it exists?

If I offended you I’m really sorry though, I meant no such thing.

> Regards,
> Peter Hurley
> 

Regards

— Pantelis

> 
>>>>> 3. The factory is just as likely to use some other customer's image by accident,
>>>>>  so you're just as likely to have the same failure rate if you have no test
>>>>>  process at the factory.
>>>>> 
>>>>> 4. If you're using offline programming, the image has to be tested after
>>>>>  reflow anyway.
>>>>> 
>>>>> IOW, your QA process will not change at all == same cost.
>>>>> 
>>>>> Regards,
>>>>> Peter Hurley
>>> 
>> 
>> Regards
>> 
>> — Pantelis




More information about the linux-arm-kernel mailing list