[PATCH 1/5] common: machine_id: support /chosen/barebox,machine-id-path override

Wed Jun 30 13:13:45 PDT 2021

On Wed, Jun 30, 2021 at 3:27 AM Ahmad Fatoum <a.fatoum at pengutronix.de> wrote:
> On 28.06.21 21:50, Trent Piepho wrote:
> > On Sun, Jun 27, 2021 at 11:41 PM Ahmad Fatoum <a.fatoum at pengutronix.de> wrote:
> >
> > On a board I did before Barebox had machine-id support, we used two
> > sources of serial number to generate the machine id.  One was the imx7
> > unique id and the other was a serial number in a i2c security chip.
> >
> > The imx id is predictable, so even hashed one can predict the
> > machine-id exactly.
>
> We happen to have stm32mp1 twins with consecutive MAC addresses.
> I compared their serial numbers and while they clearly didn't start
> at zero, all bytes were the same, except for two nibbles that were
> one apart. So yes, if the i.MX UID follows a similar scheme, you
> may be able to guess the machine-id of other devices in the same
> batch if you already have the machine-id of one of them.
>
> > We didn't really like that, so we combined two sources.
>
> Is your issue one of privacy or security?

Both.  We didn't like it being guessable for phishing attempts.
Security was more on the cloud side: having guessable machine-ids
was not a good idea.  Yes, the machine id is not to be used for access
control, it is not a password.  But one does not want an attacker to
have a list of every valid username even if the password is still
secret.
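
A minimal host-side Python sketch of the concern, not barebox code.  It
assumes the machine id is a truncated SHA-256 of the serial number; the
hash choice, the 8-byte serial width, and the example serial values are
all assumptions made for illustration.  The point is only that hashing
does not hide a predictable input: anyone who knows one serial can
enumerate the ids of its neighbours, unless a second, unpredictable
source is mixed in.

```python
import hashlib


def machine_id_from_serial(serial: bytes) -> str:
    """Derive a 128-bit ID from a serial number alone.

    Assumption: SHA-256 truncated to 16 bytes stands in for whatever hash
    the bootloader really applies; the argument does not depend on it.
    """
    return hashlib.sha256(serial).digest()[:16].hex()


def candidate_ids(known_serial: int, width: int, span: int) -> set:
    """Machine ids of every serial within `span` of a known one."""
    lo = max(known_serial - span, 0)
    hi = known_serial + span
    return {
        machine_id_from_serial(n.to_bytes(width, "big"))
        for n in range(lo, hi + 1)
    }


def machine_id_from_two_sources(serial: bytes, chip_serial: bytes) -> str:
    """Mix in a second source so neighbours cannot simply be enumerated.

    Both inputs are fixed-width here, so plain concatenation is unambiguous.
    """
    return hashlib.sha256(serial + chip_serial).digest()[:16].hex()


# Hypothetical example values: two boards from the same batch.
BOARD_A = 0x00123456789ABC10
BOARD_B = BOARD_A + 1
```

With these definitions, the id of BOARD_B appears in
candidate_ids(BOARD_A, 8, 16), while an id built with
machine_id_from_two_sources() does not.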

> My understanding is that the machine id is not disclosed as a matter
> of privacy, so it's harder to track a device by way of the unique
> IDs embedded in its outgoing communication.
>
> > It seems like there is no way to do that with this design?
>
> You can write a driver that collects multiple sources and offers
> a single NVMEM cell for consumption.

I suppose this could be done, but it certainly seems complex.  I think
one would need to:
 1. Write code in the board init function to get the imx id and the data
    from the additional chip, and put both in a global buffer.
 2. Add a barebox-specific device tree node with
    compatible = "mycompany,myboard-unique-data".
 3. Write a small nvmem driver that exports the global buffer as an
    nvmem device.

I suppose that last driver could be made common if creating nvmem
devices to hold data becomes a frequent need.

> > I think it could be done if there was a function to add hash input,
> > which board code could call.  Keep the pools separate and have a
> > defined order.  I.e., /chosen/barebox,machine-id is first, then
> > machine_id_add_hashable() is next.
>
> If we restrict the new machine_id_add_hashable() for board use that
> would work, yes. I don't like it from a design perspective though:

Yes, only board code should call that.  The data must be unique and
invariant for the board, and no individual driver knows enough to
guarantee that.
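
The "separate pools, defined order" idea can be modelled in a few lines
of host-side Python.  This is a sketch under assumptions, not the
barebox implementation: the class and method names are hypothetical,
set_dt_source() stands for the data reached through the device tree
property, add_hashable() stands for the proposed board-only
machine_id_add_hashable(), and truncated SHA-256 is a stand-in hash.

```python
import hashlib
from typing import List, Optional


class MachineIdPools:
    """Two separate input pools that are always hashed in a fixed order."""

    def __init__(self) -> None:
        self._dt_pool: Optional[bytes] = None   # device tree source
        self._board_pool: List[bytes] = []      # board code additions

    def set_dt_source(self, data: bytes) -> None:
        self._dt_pool = bytes(data)

    def add_hashable(self, data: bytes) -> None:
        self._board_pool.append(bytes(data))

    def machine_id(self) -> str:
        h = hashlib.sha256()
        # Device tree data first, then board data in the order it was added.
        chunks = [] if self._dt_pool is None else [self._dt_pool]
        chunks += self._board_pool
        for chunk in chunks:
            # Length prefix keeps b"ab" + b"c" distinct from b"a" + b"bc".
            h.update(len(chunk).to_bytes(4, "big"))
            h.update(chunk)
        return h.digest()[:16].hex()
```

Because the pools are kept apart, the result does not depend on whether
the device tree source or the board code registers its data first; only
the order of the board's own add_hashable() calls matters, and the board
controls that.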

> A user would expect a machine-id-path property to point at all info
> used for determining the machine-id, not some of them.

One could have a special name in the path, "internal" or something,
that indicates data that code in Barebox will create.

> > Or /chosen/barebox,machine-id could be a _list_ of paths, to be used
> > in order.  But that requires a nvmem driver for each source.
>
> That's fine by me and could be added in future. I can add a property
> size check, so we leave open this avenue without breaking users.
>
> The root device tree node already has a device in barebox, so board
> code could use that to offer custom info.

If a property were created that simply contained the data directly,
i.e. of_set_property(root, "extra-id-data", data, 8),
would there be a way for barebox,machine-id to point to it?

> > The security chip wouldn't have worked for a nvmem driver.
>
> Why not? Check out nvmem-rmem, which just exports a memory region
> as a read-only nvmem device. You can do likewise. There are also
> helpers to get a nvmem device out of a (i2c) regmap.

To get the id, one needs to construct a command and then parse the
response format to extract the data.  It is not as simple as some
register contents.

> > I'll also point out that just hashing the data is not a good idea to
> > make a UUID.  Anyone who hashes the imx unique id will get the exact
> > same UUID.  So it is not very universally unique!
>
> The i.MX unique ID is unique across i.MX processors. Yes, it would
> collide with an attacker that guesses the ID, but is that really
> a threat? Anyhow, there are boards using it like this already in the

It comes up if some other software on the same board creates a UUID by
hashing the imx id.  Anything that does this will get the same UUID.
Suppose the board is not imx and has no unique built-in id other than
a MAC address.  Anyone who hashes the MAC address gets the same UUID
and they collide.  It is not a security issue exactly, but a failure
of the universally unique property of the UUID.
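
To make the collision concrete, here is a small Python sketch.  The MAC
address and the two namespace names are made-up examples, and
naive_id() is only an assumed model of "just hash the id".  Two
unrelated programs that hash the same MAC necessarily agree, while
name-based UUIDs per RFC 4122 §4.3 differ as soon as each program uses
its own namespace.

```python
import hashlib
import uuid


def naive_id(mac: str) -> str:
    """What any two unrelated programs produce if both just hash the MAC."""
    return hashlib.sha1(mac.encode("ascii")).digest()[:16].hex()


def namespaced_id(namespace: uuid.UUID, mac: str) -> uuid.UUID:
    """RFC 4122 section 4.3 name-based UUID (version 5, SHA-1)."""
    return uuid.uuid5(namespace, mac)


# Hypothetical namespaces, one per piece of software deriving an ID.
NS_BOOTLOADER = uuid.uuid5(uuid.NAMESPACE_DNS, "bootloader.example.com")
NS_APPLICATION = uuid.uuid5(uuid.NAMESPACE_DNS, "application.example.com")

MAC = "00:11:22:33:44:55"  # example address
```

naive_id(MAC) is the same for every caller, whereas
namespaced_id(NS_BOOTLOADER, MAC) and namespaced_id(NS_APPLICATION, MAC)
are distinct yet each still reproducible.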

> field. So this won't change, but for the new binding introduced here,
> I can address issues that are raised.

> > This is already a known issue when generating UUIDs that are not
> > purely random.  See RFC4122 §4.3 for generating UUIDs in a namespace.

> Users can do that, barebox will hash it to get the format the OS expects.
> It's probably a good idea to hint at RFC4122 for users interested in
> generating their own material for use with the machine-id.

I don't know what you mean by the format the OS expects.  The output
of the §4.3 algorithm is a standard UUID.  It works perfectly well to
pass it as a machine id and then systemd will use it as it is.  Since
2011, when systemd generates a machine id it follows RFC4122 §4.4 for
random UUIDs.
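
As a rough illustration in Python: a machine id file holds 32 lowercase
hex digits, which is simply a UUID written without dashes, so a §4.3 or
§4.4 UUID converts directly.  The namespace and name below are
hypothetical, and the helper names are made up for this sketch.

```python
import re
import uuid

MACHINE_ID_RE = re.compile(r"[0-9a-f]{32}")


def to_machine_id(u: uuid.UUID) -> str:
    """Render a UUID as a machine id: 32 lowercase hex digits, no dashes."""
    return u.hex


def from_machine_id(text: str) -> uuid.UUID:
    """Parse a machine id string (trailing newline allowed) into a UUID."""
    text = text.strip()
    if not MACHINE_ID_RE.fullmatch(text):
        raise ValueError("not a 32-digit lowercase hex machine id")
    return uuid.UUID(hex=text)


# Hypothetical namespace and name, for illustration only.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "myboard.example.com")
name_based = uuid.uuid5(NAMESPACE, "serial-0001")  # RFC 4122 section 4.3
random_based = uuid.uuid4()                        # RFC 4122 section 4.4
```

Round-tripping through to_machine_id() and from_machine_id() returns the
same UUID with its version field intact, so no further hashing is needed
to obtain a valid machine id.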