[RFC 00/15] Device Tree schemas and validation

Benoit Cousson bcousson at baylibre.com
Thu Oct 3 09:53:35 EDT 2013


Hi David,

On 02/10/2013 16:29, David Gibson wrote:
> On Tue, Oct 01, 2013 at 04:22:24PM -0600, Stephen Warren wrote:
>> On 09/24/2013 10:52 AM, Benoit Cousson wrote:
>>> Hi All,
>>>
>>> Following the discussion that happened during LCE-2013 and the email
>>> thread started by Tomasz few months ago [1], here is a first attempt
>>> to introduce:
>>> - a schema language to define the bindings accurately
>>> - DTS validation during device tree compilation in DTC itself
>>
>> Sorry, this is probably going to sound a bit negative. Hopefully you
>> find it constructive though.
>>
>>> The syntax for a schema is the same as the one for dts. This choice has
>>> been made to simplify its development, to maximize the code reuse and
>>> finally because the format is human-readable.
>>
>> I'm not convinced that's a good decision.
>>
>> DT is a language for representing data.
>>
>> The validation checks described by schemas are rules, or code, and not
>> static data.
>>
>> So, while I'm sure it's possible to shoe-horn at least some reasonable
>> subset of DT validation into DT syntax itself, I feel it's unlikely to
>> yield something that's scalable enough.
>
> I tend to agree.
>
>> For example, it's easy to specify that a property must be 2 cells long.
>> What if it could be any multiple of two? That's a lot of numbers to
>> explicitly enumerate as data. Sure, you can then invent syntax to
>> represent that specific rule (parameterized by 2), but what about the
>> next similar-but-different rule? The only approach I can think of to
>> that is to allow the schema to contain arbitrary expressions, which
>> would likely need to morph into arbitary statements not just
>> expressions. Once you're there, I think the schema would be better
>> represented as a programming language rather than as a data structure
>> that could have code hooked into it.
>>
>>> How to:
>>>   * Associate a schema to one or several nodes
>>>
>>> As said earlier a schema can be used to validate one or several nodes
>>> from a dts. To do this the "compatible" properties from the nodes which
>>> should be validated must be present in the schema.
>>>
>>> 	timer1: timer at 4a318000 {
>>> 		compatible = "ti,omap3430-timer";
>> ...
>>> To write a schema which will validate OMAP Timers like the one above,
>>> one may write the following schema:
>>>
>>> 	/dts-v1/;
>>> 	/ {
>>> 		compatible = "ti,omap[0-9]+-timer";
>>
>> What about DT nodes that don't have a compatible value? We certainly
>> have some of those already like /memory and /chosen. We should be able
>> to validate their schema too. This probably doesn't invalidate being
>> able to look things up by compatible value though; it just means we need
>> some additional mechanisms too.
>
> More to the point, what about the properties of a node whose format is
> defined not by this node's binding but by some other nodes binding.
> e.g. the exact format of reg and ranges is at least partially
> determined by the parent bus's binding, and interrupts is defined
> partially by the interrupt parent's binding.  gpio properties are
> defined by a combination of a global binding and the gpio parent,
> IIRC.

Yeah, that's a general concern that Stephen raised several time as well.
We need to figure out some way to handle that.

>>>   * Define constraints on properties
>>>
>>> To define constraints on a property one has to create a node in a schema
>>> which has as name the name of the property that one want to validate.
>>>
>>> To specify constraints on the property "ti,hwmods" of OMAP Timers one
>>> can write this schema:
>>>
>>> 	/dts-v1/;
>>> 	/ {
>>> 		compatible = "ti,omap[0-9]+-timer";
>>> 		ti,hwmods {
>>> 			...
>>> 		};
>>
>> compatible and ti,hwmods are both properties in the DT file. However, in
>> the schema above, one appears as a property, and one as a node. I don't
>> like that inconsistency. It'd be better if compatible was a node too.
>
> Essentially what's going on here is that to describe the constraint on
> a property, a node with corresponding name is defined to encode the
> parameters of that constraint.  It kind of works, but it's forced.  It
> also hits problems since nodes and properties are technically in
> different namespaces, although they rarely collide in real cases.

OK, so would you suggest keeping mapping between node / attribute in DTS 
and in the schema?

>>> If one want to use a regular as property name one can write this schema:
>>>
>>> 	/dts-v1/;
>>> 	/ {
>>> 		compatible = "abc";
>>> 		def {
>>> 			name = "def[0-9]";
>>
>> Isn't it valid to have a property named "name" within the node itself?
>> How do you differentiate between specifying the node name and the name
>> property?
>
> Or to look at it another way, how do you differentiate between nodes
> representing encoded constraints for a property, and nodes
> representing nodes directly.
>
>> What if the node name needs more validation than just a regex. For
>> example, suppose we want to validate the
>> unit-name-must-match-reg-address rule. We need to write some complex
>> expression using data extracted from reg to calculate the unit address.
>> Equally, the node name perhaps has to exist in some global list of
>> acceptable node names. It would be extremely tricky if not impossible to
>> do that with a regex.
>>
>>> 			...
>>> 		};
>>> 	};
>>>
>>> Above one can see that the "name" property override the node name.
>>
>> Override implies that dtc would change the node name during compilation.
>> I think s/override/validate/ or s/override/overrides the validation
>> rules for/?
>
> Actually, dtc already contains checks that a "name" property (if
> present) matches the unit name.  Name properties vs. node names work a
> bit differently in the flat-tree world versus traditional OF, and this
> checks ensures that flat trees don't do (at least some) things which
> would break the OF traditional approach.
>
>>>   * Require the presence of a property inside a node or inside one of its
>>> parents
>> ...
>>> /dts-v1/;
>>> / {
>>>      compatible = "ti,twl[0-9]+-rtc";
>>>      interrupt-controller {
>>>          is-required;
>>>          can-be-inherited;
>>
>> interrupt-controller isn't a good example here, since it isn't a
>> property that would typically be inherited. Why not use interrupt-parent
>> instead?
>>
>>> One can check if 'node' has the following subnode 'subnode1', 'subnode2',
>>> and 'abc' with the schema below:
>>>
>>> /dts-v1/;
>>> / {
>>>      compatible = "comp";
>>>      children = "abc", "subnode[0-9]";
>>> };
>>
>> How is the schema for each sub-node specified?
>>
>> What if some nodes are optional and some required? The conditions where
>> a sub-node is required might be complex, and I think we'd always want to
>> be able to represent them in whatever schema language we chose.
>>
>> The most obvious way would be to make each sub-node's schema appear as a
>> sub-node within the main node's schema, but then how do you tell if a
>> schema node describes a property or a node?
>>
>> Note that the following DT file is currently accepted by dtc even if it
>> may not be the best choice of property and node names:
>>
>> ==========
>> /dts-v1/;
>>
>> / {
>> 	foo = <1>;
>> 	foo {};
>> };
>> ==========
>
> Note that node / property name collisions are not entirely theoretical
> either.  They are permitted in IEEE1275 and there are real Apple
> device trees in the wild which have them.  It's rare and discouraged,
> obviously.
>
>>>   * Constraints on array size
>>>
>>> One can specify the following constraints on array size:
>>>   - length: specify the exact length that an array must have.
>>>   - min-length: specify the minimum number of elements an array must have.
>>>   - max-length: specify the maximum number of elements an array must have.
>>
>> This seems rather inflexible; it'll cover a lot of the simple cases, but
>> hit a wall pretty soon. For example, how would it validate a property
>> that is supposed to include 3 GPIO specifiers, where the GPIO specifiers
>> are going to have DT-specific lengths, since the length of each
>> specifier is defined by the node that the phandles reference?
>>
>>
>> Overall, I believe perhaps the single most important aspect of any DT
>> schema is schema inheritance or instancing, and this proposal doesn't
>> appear to address that issue at all.
>>
>> Inheritance of schemas:
>>
>> For example, any node that is addressed must contain a reg property. The
>> constraints on that property are identical in all bindings; it must
>> consist of #address-cells + #size-cells integer values (cells). We don't
>> want to have to cut/paste that rule into every single binding
>> definition. Rather, we should simply say something like "this binding
>> uses the reg property", and the schema validation tool will look up the
>> definition of "reg property", and hence know how to validate it.
>>
>> Similarly, any binding that describes a GPIO controller will have some
>> similar requirements; the gpio-controller and #gpio-cells properties
>> must be present. The schema should simply say "I'm a GPIO controller",
>> and the schema tool should add some extra requirements to nodes of that
>> type.
>>
>> Instancing of schemas:
>>
>> Any binding that uses GPIOs should be able to say that a particular
>> property (e.g. "enable-gpios") is-a GPIO-specifier (with parameters
>> "enable" for the property name, min/max/expression length, etc.), and
>> then the schema validation tool would know to apply rules for a
>> specifier list to that property (and be able to check the property name).
>
> Yes, I agree both of those are important.
>
>
> So, here's a counter-proposal of at least a rough outline of how I
> think schemas could work, in a way that's still based generally on dt
> syntax.

That seems to be well aligned with what we tried to achieve, so I'm not 
considering that as a counter-proposal but as a refinement. :-)

> First, define the notion of dt "patterns" or "templates".  A dt
> pattern is to a dt node or subtree as a regex is to a string - it
> provides a reasonably expressive way of defining a family of dt
> nodes.  These would be defined in an extension / superset of dt
> syntax.

OK, make sense. Are you considering a syntax similar to xpath in order 
to match any node in the path, using the full path information instead 
of the individual node?

I'm a little bit rusty on xpath but AFAIR it could be something like that.

match any node containing reg:

"//reg/.."

match only node containing reg for ti,omap4-gpio compatible

"//*[@compatible='ti,omap4-gpio']/reg/.."

match only node containing reg below the ocp parent node

"//ocp/*/reg/.."


> A schema would then be defined as a set of implications:
> 	If node X matches pattern A, => it must also match pattern B
>
> For example:
> 	If a node has a compatible property with string "foodev"
> 	 => it must have various foodev properties.
>
> 	If a node has a "reg" property (at all)
> 	 => it must have the format required by reg
>
> 	If a node has an "interrupts" property
> 	 => it must have either "interrupt-parent" or "interrupt-map"
>

That's part is similar to what we had in mind. So we just need to find 
how to express it properly using the DTS syntax.

Thanks,
Benoit




More information about the linux-arm-kernel mailing list