[PATCH v4 00/31] Introduce SCMI Telemetry FS support

Wed Jun 17 05:58:09 PDT 2026

On Fri, Jun 12, 2026 at 11:37:30PM +0100, Cristian Marussi wrote:
> Hi all,
> 
> --------------------------------------------------------------------------------
> [TLDR Summary]
> This series introduces a new SCMI driver which uses a new Telemetry FS to expose
> and configure SCMI Telemetry Data Events retrieved from the platform SCMI FW
> at runtime. The patches carrying the new STLMFS Filesystem support are tagged
> with 'stlmfs'.
> --------------------------------------------------------------------------------
> 
> the upcoming SCMI v4.0 specification [0] introduces a new SCMI protocol
> dedicated to System Telemetry.
> 
> In a nutshell, the SCMI Telemetry protocol allows an agent to discover at
> runtime the set of Telemetry Data Events (DEs) available on a specific
> platform and provides the means to configure the set of DEs that a user is
> interested into, while reading them back using the collection method that
> is deeemed more suitable for the usecase at hand. (...amongst the various
> possible collection methods allowed by SCMI specification)
> 
> Without delving into the gory details of the whole SCMI Telemetry protocol
> let's just say that the SCMI platform/server firmware advertises a number
> of Telemetry Data Events, each one identified by a 32bit unique ID, and an
> SCMI agent/client, like Linux, can discover them and read back at will the
> associated data value in a number of ways.
> Data collection is mainly intended to happen on demand via shared memory
> areas exposed by the platform firmware, discovered dynamically via SCMI
> Telemetry and accessed by Linux on-demand, but some DE can also be reported
> via SCMI Notifications asynchronous messages or via direct dedicated
> FastChannels (another kind of SCMI memory based access): all of this
> underlying mechanism is anyway hidden to the user since it is mediated by
> the kernel driver which will return the proper data value when queried.
> 
> Anyway, the set of well-known architected DE IDs defined by the spec is
> limited to a dozen IDs, which means that the vast majority of DE IDs are
> customizable per-platform: as a consequence, though, the same ID, say
> '0x1234', could represent completely different things on different systems.
> 
> Precise definitions and semantic of such custom Data Event IDs are out of
> the scope of the SCMI Telemetry specification and of this implementation:
> they are supposed to be provided using some kind of JSON-like description
> file that will have to be consumed by a userspace tool which would be
> finally in charge of making sense of the set of available DEs.
> 
> IOW, in turn, this means that even though the DEs enumerated via SCMI come
> with some sort of topological and qualitative description provided by the
> protocol (like unit of measurements, name, topology info etc), kernel-wise
> we CANNOT be completely sure of "what is what" without being fed-back some
> sort of information about the DEs by the afore mentioned userspace tool.
> 
> For these reasons, currently this series does NOT attempt to register any
> of these DEs with any of the usual in-kernel subsystems (like HWMON, IIO,
> PERF etc), simply because we cannot be sure which DE is suitable, or even
> desirable, for a given subsystem. This also means there are NO in-kernel
> users of these Telemetry data events as of now.
> 
> So, while we do not exclude, for the future, to feed/register some of the
> discovered DEs to/with some of the above mentioned Kernel subsystems, as
> of now we have ONLY modeled a custom userspace API to make SCMI Telemetry
> available to userspace tools.
> 
> In deciding which kind of interface to expose SCMI Telemetry data to a
> user, this new SCMI Telemetry driver aims at satisfying 2 main reqs:
> 
>  - exposing an FS-based human-readable interface that can be used to
>    discover, configure and access our Telemetry data directly also from
>    the shell without special tools
> 
>  - exposing alternative machine-friendly, more-performant, binary
>    interfaces that can be used to avoid the overhead of multiple accesses
>    to the VFS and that can be more suitable to access with custom tools
> 
> In the initial RFC posted a few months ago [1], the above was achieved
> with a combination of a SysFS interface, for the human-readable side of
> the story, and a classic chardev/ioctl for the plain binary access.
> 
> Since V1, instead, we moved away from this combined approach, especially
> away from SysFS, for the following reason:
> 
>  1. "Abusing SysFS": SysFS is a handy way to expose device related
>       properties in a common way, using a few common helpers built on
>       kernfs; this means, though, that unfortunately in our scenario I had
>       to generate a dummy simple device for EACH SCMI Telemetry DataEvent
>       that I got to discover at runtime and attach to them, all of the
>       properties I need.
>       This by itself seemed to me abusing the SysFS framework, but, even
>       ignoring this, the impact on the system when we have to deal with
>       hundreds or tens of thousands of DEs is sensible.
>       In some test scenario I ended with 50k DE devices and half-a-millon
>       related property files ... O_o
> 
>  2. "SysFS constraints": SysFS usage itself has its well-known constraints
>       and best practices, like the one-file/one-value rule, and due to the
>       fact that any virtual file with a complex structure or handling logic
>       is frowned upon, you can forget about IOCTLs and mmap'ing to provide
>       a more performant interface within SysFs, which is the reason why,
>       in the previous RFC, there was an additional alternative chardev
>       interface.
>       These latter limitations around the implementation of files with a
>       more complex semantic (i.e. with a broader set of file_operations)
>       derive from the underlying KernFS support, so KernFS is equally not
>       suitable as a building block for our implementation.
> 
>  2. "Chardev limitations": Given the nature of the protocol, the hybrid
>       approach employing character devices was itself problematic: first
>       of all because there is an upper limit on the number of chardev we
>       can create, dictated by the range of available minor numbers, and
>       then because the fact itself to have to maintain 2 completely
>       different interfaces (FS + chardev) is painful.
> 
> As a final remark, please NOTE THAT all of this is supposed to be available
> in production systems across a number of heterogeneous platforms: for these
> reasons the easy choice, debugFS, is NOT an option here.
> 
> Due to the above reasoning, since V1 we opted for a new approach with the
> proposed interfaces now based on a full fledged, unified, virtual pseudo
> filesystem implemented from scratch, so that we can:
> 
>  - expose all the DEs property we like as before with SysFS, but without
>    any of the constraint imposed by the usage of SysFs or kernfs.
> 
>  - easily expose additional alternative views of the same set of DEs
>    using symlinking capabilities (e.g. alternative topological view)
> 
>  - additionally expose a few alternative and more performant interfaces
>    by embedding in that same FS, a few special virtual files:
> 
>    + 'control': to issue IOCTLs for quicker discovery and on-demand access
>    		to data
>    + 'pipe' [TBD]: to provide a stream of events using a virtual
>    		   infinite-style file
>    + 'raw_<N>' [TBD]: to provide direct memory mapped access to the raw
>    		      SCMI Telemetry data from userspace

A filsystem driver for telemetry like this is really misguided. I think
shell access is really not an argument for adding a filesystem into the
kernel like this. That's just not appropriate justification to push
thousand and thousands of lines of code into the kernel.

You're building completely new infrastructure. The format is whatever it
is. If you stream it somehow just add a binary that userspace can use to
consume or translate it. If you need a filesystem interface for
convenience build it via FUSE on top of whatever streams that data and
get it ouf of the kernels way.

You also buy into all kinds of really wonky properties. If you split it
over multiple files you can never get a snapshot of data that is
consistent if it's across multiple files.

Telemetry over a filesystem is just not a great idea. If you did it via
sysfs I really wouldn't care because all because the infrastructure
already exists and I couldn't be bothered if this grew yet another wart
but as a separate massive hand-rolled pseudofs, no I'm not seeing it.