[PATCH v4 4/8] arm64: Add sysreg header generation scripting
Mark Rutland
mark.rutland at arm.com
Thu Apr 21 02:47:42 PDT 2022
Hi Mark,
Thanks for picking this up; it's nice to see this moving forward!
On Tue, Apr 19, 2022 at 11:43:25AM +0100, Mark Brown wrote:
> From: Mark Rutland <mark.rutland at arm.com>
>
> The arm64 kernel requires some metadata for each system register it may
> need to access. Currently we have:
>
> * A SYS_<regname> definition which corresponds to a sys_reg() macro.
> This is used both to look up a sysreg by encoding (e.g. in KVM), and
> also to generate code to access a sysreg where the assembler is
> unaware of the specific sysreg encoding.
>
> Where assemblers support the S3_<op1>_C<crn>_C<crm>_<op2> syntax for
> system registers, we could use this rather than manually assembling
> the instructions. However, we don't have consistent definitions for
> these and we currently still need to handle toolchains that lack this
> feature.
>
> * A set of <regname>_<fieldname>_SHIFT and <regname>_<fieldname>_MASK
> definitions, which can be used to extract fields from the register, or
> to construct a register from a set of fields.
>
> These do not follow the convention used by <linux/bitfield.h>, and the
> masks are not shifted into place, preventing their use in FIELD_PREP()
> and FIELD_GET(). We require the SHIFT definitions for inline assembly
> (and WIDTH definitions would be helpful for UBFX/SBFX), so we cannot
> only define a shifted MASK. Defining a SHIFT, WIDTH, shifted MASK and
> unshifted MASK is tedious and error-prone, and life is much easier when
> they can be relied upon to exist when writing code.
>
> * A set of <regname>_<fieldname>_<valname> definitions for each
> enumerated value a field may hold. These are used when identifying the
> presence of features.
>
> On top of this, other code has to build up metadata at runtime (e.g. the
> sets of RES0/RES1 bits in a register).
>
> This patch adds scripting so that we can have an easier-to-manage
> canonical representation of this metadata, from which we can generate
> all the definitions necessary for various use-cases, e.g.
>
> | #define REG_ID_AA64ISAR0_EL1 S3_0_C0_C6_0
> | #define SYS_ID_AA64ISAR0_EL1 sys_reg(3, 0, 0, 6, 0)
> | #define SYS_ID_AA64ISAR0_EL1_Op0 3
> | #define SYS_ID_AA64ISAR0_EL1_Op1 0
> | #define SYS_ID_AA64ISAR0_EL1_CRn 0
> | #define SYS_ID_AA64ISAR0_EL1_CRm 6
> | #define SYS_ID_AA64ISAR0_EL1_Op2 0
>
> | #define ID_AA64ISAR0_EL1_RNDR ARM64_SYSREG_BITMASK(63, 60)
> | #define ID_AA64ISAR0_EL1_RNDR_MASK ARM64_SYSREG_BITMASK(63, 60)
I think this got missed when s/ARM64_SYSREG_BITMASK()/GENMASK_ULL()/ happened.
> | #define ID_AA64ISAR0_EL1_RNDR_SHIFT 60
> | #define ID_AA64ISAR0_EL1_RNDR_WIDTH 4
> | #define ID_AA64ISAR0_EL1_RNDR_NI ULL(0b0000)
> | #define ID_AA64ISAR0_EL1_RNDR_IMP ULL(0b0001)
Just to check, was there a reason for going for ULL() and GENMASK_ULL() rather
than UL() and GENMASK()?
We generally use UL() today, since we treat `unsigned long` as the native
register size.
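For reference, here is a rough userspace sketch of how definitions shaped like
the generated ones would be consumed with a mask that is already shifted into
place. The SKETCH_GENMASK_ULL() macro and the sketch_field_get() and
sketch_field_prep() helpers are simplified stand-ins made up for this example,
not the kernel's GENMASK_ULL(), FIELD_GET() or FIELD_PREP(), and the enum
values are written in hex only to keep the example portable.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-in for illustration only (an assumption of this sketch): build a
 * 64-bit mask covering bits h..l inclusive.
 */
#define SKETCH_GENMASK_ULL(h, l) \
	((~0ULL >> (63 - (h))) & (~0ULL << (l)))

/* Shaped like the generated output for ID_AA64ISAR0_EL1.RNDR */
#define ID_AA64ISAR0_EL1_RNDR_MASK	SKETCH_GENMASK_ULL(63, 60)
#define ID_AA64ISAR0_EL1_RNDR_SHIFT	60
#define ID_AA64ISAR0_EL1_RNDR_WIDTH	4
#define ID_AA64ISAR0_EL1_RNDR_NI	0x0ULL	/* 0b0000 in the patch */
#define ID_AA64ISAR0_EL1_RNDR_IMP	0x1ULL	/* 0b0001 in the patch */

/* Extract a field; (mask & -mask) isolates the lowest set bit of the mask. */
static inline uint64_t sketch_field_get(uint64_t mask, uint64_t reg)
{
	return (reg & mask) / (mask & -mask);
}

/* Place a field value into position under an already-shifted mask. */
static inline uint64_t sketch_field_prep(uint64_t mask, uint64_t val)
{
	return (val * (mask & -mask)) & mask;
}

/* Round trip: encode RNDR == IMP, then read it back. Returns 1 on success. */
static inline int sketch_rndr_roundtrip_ok(void)
{
	uint64_t reg = sketch_field_prep(ID_AA64ISAR0_EL1_RNDR_MASK,
					 ID_AA64ISAR0_EL1_RNDR_IMP);

	return sketch_field_get(ID_AA64ISAR0_EL1_RNDR_MASK, reg) ==
	       ID_AA64ISAR0_EL1_RNDR_IMP;
}
```

On an LP64 target, `unsigned long` and `unsigned long long` are both 64 bits
wide, so the values above come out the same with either suffix; the question
is only about keeping the types consistent with existing code.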
>
> The script requires that all bits in the register be specified and that
> there be no overlapping fields. This helps the script spot errors in the
> input but means that the few registers which change layout at runtime
> depending on things like virtualisation settings will need some manual
> handling. No actual register conversions are done here but a header for
> the register data with some documentation of the format is provided.
It would be good to see an example of how we'd handle one of those, in case
that means we need to play around with naming or structure of the definitions a
bit.
Regardless, this looks good to me.
Thanks,
Mark.
> At the moment this is only intended to express metadata from the
> architecture, and does not handle policy imposed by the kernel, such as
> values exposed to userspace or VMs. In future this could be extended to
> express such information.
>
> This script was mostly written by Mark Rutland but has been extended by
> Mark Brown to improve validation of input and better integrate with the
> kernel.
>
> Signed-off-by: Mark Rutland <mark.rutland at arm.com>
> Co-Developed-by: Mark Brown <broonie at kernel.org>
> Signed-off-by: Mark Brown <broonie at kernel.org>
> ---
> arch/arm64/tools/gen-sysreg.awk | 213 ++++++++++++++++++++++++++++++++
> arch/arm64/tools/sysreg | 34 +++++
> 2 files changed, 247 insertions(+)
> create mode 100755 arch/arm64/tools/gen-sysreg.awk
> create mode 100644 arch/arm64/tools/sysreg
>
> diff --git a/arch/arm64/tools/gen-sysreg.awk b/arch/arm64/tools/gen-sysreg.awk
> new file mode 100755
> index 000000000000..4be092372b16
> --- /dev/null
> +++ b/arch/arm64/tools/gen-sysreg.awk
> @@ -0,0 +1,213 @@
> +#!/bin/awk -f
> +# SPDX-License-Identifier: GPL-2.0
> +# gen-sysreg.awk: arm64 sysreg header generator
> +#
> +# Usage: awk -f gen-sysreg.awk sysregs.txt
> +
> +# Log an error and terminate
> +function fatal(msg) {
> + print "Error at " NR ": " msg > "/dev/stderr"
> + exit 1
> +}
> +
> +# Sanity check that the start or end of a block makes sense at this point in
> +# the file. If not, produce an error and terminate.
> +#
> +# @this - the $Block or $EndBlock
> +# @prev - the only valid block to already be in (value of @block)
> +# @new - the new value of @block
> +function change_block(this, prev, new) {
> + if (block != prev)
> + fatal("unexpected " this " (inside " block ")")
> +
> + block = new
> +}
> +
> +# Sanity check that the number of fields in the current record makes sense.
> +# If not, produce an error and terminate.
> +function expect_fields(nf) {
> + if (NF != nf)
> + fatal(NF " fields found where " nf " expected")
> +}
> +
> +# Print a CPP macro definition, padded with spaces so that the macro bodies
> +# line up in a column
> +function define(name, val) {
> + printf "%-48s%s\n", "#define " name, val
> +}
> +
> +# Print standard BITMASK/SHIFT/WIDTH CPP definitions for a field
> +function define_field(reg, field, msb, lsb) {
> + define(reg "_" field, "GENMASK_ULL(" msb ", " lsb ")")
> + define(reg "_" field "_MASK", "GENMASK_ULL(" msb ", " lsb ")")
> + define(reg "_" field "_SHIFT", lsb)
> + define(reg "_" field "_WIDTH", msb - lsb + 1)
> +}
> +
> +# Parse a "<msb>[:<lsb>]" string into the global variables @msb and @lsb
> +function parse_bitdef(reg, field, bitdef, _bits)
> +{
> + if (bitdef ~ /^[0-9]+$/) {
> + msb = bitdef
> + lsb = bitdef
> + } else if (split(bitdef, _bits, ":") == 2) {
> + msb = _bits[1]
> + lsb = _bits[2]
> + } else {
> + fatal("invalid bit-range definition '" bitdef "'")
> + }
> +
> +
> + if (msb != next_bit)
> + fatal(reg "." field " starts at " msb " not " next_bit)
> + if (63 < msb || msb < 0)
> + fatal(reg "." field " invalid high bit in '" bitdef "'")
> + if (63 < lsb || lsb < 0)
> + fatal(reg "." field " invalid low bit in '" bitdef "'")
> + if (msb < lsb)
> + fatal(reg "." field " invalid bit-range '" bitdef "'")
> + if (lsb > msb)
> + fatal(reg "." field " has invalid range " msb "-" lsb)
> +
> + next_bit = lsb - 1
> +}
> +
> +BEGIN {
> + print "#ifndef __ASM_SYSREG_GEN_H"
> + print "#define __ASM_SYSREG_GEN_H"
> + print ""
> + print "/* Generated file - do not edit */"
> +
> + block = "None"
> +}
> +
> +END {
> + print "#endif /* __ASM_SYSREG_GEN_H */"
> +}
> +
> +# skip blank lines and comment lines
> +/^$/ { next }
> +/^#/ { next }
> +
> +/^Sysreg/ {
> + change_block("Sysreg", "None", "Sysreg")
> + expect_fields(7)
> +
> + reg = $2
> + op0 = $3
> + op1 = $4
> + crn = $5
> + crm = $6
> + op2 = $7
> +
> + res0 = "UL(0)"
> + res1 = "UL(0)"
> +
> + define("REG_" reg, "S" op0 "_" op1 "_C" crn "_C" crm "_" op2)
> + define("SYS_" reg, "sys_reg(" op0 ", " op1 ", " crn ", " crm ", " op2 ")")
> +
> + define("SYS_" reg "_Op0", op0)
> + define("SYS_" reg "_Op1", op1)
> + define("SYS_" reg "_CRn", crn)
> + define("SYS_" reg "_CRm", crm)
> + define("SYS_" reg "_Op2", op2)
> +
> + print ""
> +
> + next_bit = 63
> +
> + next
> +}
> +
> +/^EndSysreg/ {
> + if (next_bit >= 0)
> + fatal("Unspecified bits in " reg)
> +
> + change_block("EndSysreg", "Sysreg", "None")
> +
> + define(reg "_RES0", "(" res0 ")")
> + define(reg "_RES1", "(" res1 ")")
> + print ""
> +
> + reg = null
> + op0 = null
> + op1 = null
> + crn = null
> + crm = null
> + op2 = null
> + res0 = null
> + res1 = null
> +
> + next
> +}
> +
> +/^Res0/ && block == "Sysreg" {
> + expect_fields(2)
> + parse_bitdef(reg, "RES0", $2)
> + field = "RES0_" msb "_" lsb
> +
> + define_field(reg, field, msb, lsb)
> + print ""
> +
> + res0 = res0 " | " reg "_" field
> +
> + next
> +}
> +
> +/^Res1/ && block == "Sysreg" {
> + expect_fields(2)
> + parse_bitdef(reg, "RES1", $2)
> + field = "RES1_" msb "_" lsb
> +
> + define_field(reg, field, msb, lsb)
> + print ""
> +
> + res1 = res1 " | " reg "_" field
> +
> + next
> +}
> +
> +/^Field/ && block == "Sysreg" {
> + expect_fields(3)
> + field = $3
> + parse_bitdef(reg, field, $2)
> +
> + define_field(reg, field, msb, lsb)
> + print ""
> +
> + next
> +}
> +
> +/^Enum/ {
> + change_block("Enum", "Sysreg", "Enum")
> + expect_fields(3)
> + field = $3
> + parse_bitdef(reg, field, $2)
> +
> + define_field(reg, field, msb, lsb)
> +
> + next
> +}
> +
> +/^EndEnum/ {
> + change_block("EndEnum", "Enum", "Sysreg")
> + field = null
> + msb = null
> + lsb = null
> + print ""
> + next
> +}
> +
> +/0b[01]+/ && block == "Enum" {
> + expect_fields(2)
> + val = $1
> + name = $2
> +
> + define(reg "_" field "_" name, "ULL(" val ")")
> + next
> +}
> +
> +# Any lines not handled by previous rules are unexpected
> +{
> + fatal("unhandled statement")
> +}
> diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
> new file mode 100644
> index 000000000000..3595c68b9a0b
> --- /dev/null
> +++ b/arch/arm64/tools/sysreg
> @@ -0,0 +1,34 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +#
> +# System register metadata
> +
> +# Each System register is described by a Sysreg block:
> +
> +# Sysreg <name> <op0> <op1> <crn> <crm> <op2>
> +# <field>
> +# ...
> +# EndSysreg
> +
> +# Within a Sysreg block, each field can be described as one of:
> +
> +# Res0 <msb>[:<lsb>]
> +
> +# Res1 <msb>[:<lsb>]
> +
> +# Field <msb>[:<lsb>] <name>
> +
> +# Enum <msb>[:<lsb>] <name>
> +# <enumval> <enumname>
> +# ...
> +# EndEnum
> +
> +# For ID registers we adopt a few conventions for translating the
> +# language in the ARM into defines:
> +#
> +# NI - Not implemented
> +# IMP - Implemented
> +#
> +# In general it is recommended that new enumeration items be named for the
> +# feature that introduces them (e.g. FEAT_LS64_ACCDATA introduces enumeration
> +# item ACCDATA), though it may be more tasteful to do something else.
> +
> --
> 2.30.2
>
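As an aside, the "all bits specified, no overlapping fields" rule the script
enforces can be sketched in C as below. This is only an illustration of the
intended check, assuming fields are listed from bit 63 downwards as in the
Sysreg blocks; struct sketch_field and sketch_check_layout() are names made up
for this example and do not exist in the patch.

```c
#include <assert.h>

/* One field of a 64-bit register, as an inclusive bit range. */
struct sketch_field {
	int msb;
	int lsb;
};

/*
 * Return 0 if the fields, listed from bit 63 downwards, cover every bit of
 * the register exactly once; return -1 otherwise. This follows the same
 * idea as the next_bit tracking in parse_bitdef(): each field must start
 * where the previous one ended, and nothing may be left over at the end.
 */
static int sketch_check_layout(const struct sketch_field *f, int n)
{
	int next_bit = 63;
	int i;

	for (i = 0; i < n; i++) {
		if (f[i].msb > 63 || f[i].lsb < 0 || f[i].msb < f[i].lsb)
			return -1;	/* malformed bit range */
		if (f[i].msb != next_bit)
			return -1;	/* gap or overlap with previous field */
		next_bit = f[i].lsb - 1;
	}

	/* Complete only once every bit down to bit 0 has been described. */
	return next_bit < 0 ? 0 : -1;
}
```

For example, the ranges 63:60, 59:4 and 3:0 pass, while 63:60 followed by
58:0 (a gap at bit 59) or a lone 63:1 (bit 0 unspecified) are rejected.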