[PATCH] arm64: Add basic JSON register parser

Marc Zyngier maz at kernel.org
Wed Dec 25 10:57:11 PST 2024


We currently populate the sysreg file by hand from the ARM ARM,
resulting in a bunch of errors being introduced on a regular basis.
While there is an XML dump of the architecture produced on a quarterly
basis, the license that comes attached to it excludes any sort of
open-source usage.

However, ARM has recently made available a JSON dump[1] that contains
a reduced set of information under a BSD license. This has enough
data to extract what is relevant to the sysreg file.

This is achieved using a jq script that I cobbled together over
the holiday. It has a number of limitations, but it already works
well enough to extract useful data.

As an example, here's what the script returns for TCR_EL1:

$ jq -r --arg REG TCR_EL1 -f arch/arm64/tools/dumpreg.jq ~/Work/XML/2024-12/AARCHMRS_BSD_A_profile/Registers.json
TCR_EL1	[3,0,0,2,2]	MRS
TCR_EL1	[3,0,0,2,2]	MSRregister
TCR_EL12	[3,5,0,2,2]	MRS
TCR_EL12	[3,5,0,2,2]	MSRregister
TCRALIAS_EL1	[3,0,7,2,6]	MRS
TCRALIAS_EL1	[3,0,7,2,6]	MSRregister
true
true
Res0	63:62
Field	61	MTX1	(IsFeatureImplemented(FEAT_MTE_NO_ADDRESS_TAGS) || IsFeatureImplemented(FEAT_MTE_CANONICAL_TAGS))
Field	60	MTX0	(IsFeatureImplemented(FEAT_MTE_NO_ADDRESS_TAGS) || IsFeatureImplemented(FEAT_MTE_CANONICAL_TAGS))
Field	59	DS	(IsFeatureImplemented(FEAT_LPA2) && (!IsFeatureImplemented(FEAT_D128) || (AArch64 TCR2_EL1.D128 == '0')))
Field	59	DS	true
Field	58	TCMA1	IsFeatureImplemented(FEAT_MTE2)
Field	57	TCMA0	IsFeatureImplemented(FEAT_MTE2)
Field	56	E0PD1	IsFeatureImplemented(FEAT_E0PD)
Field	55	E0PD0	IsFeatureImplemented(FEAT_E0PD)
Field	54	NFD1	(IsFeatureImplemented(FEAT_SVE) || IsFeatureImplemented(FEAT_TME))
Field	53	NFD0	(IsFeatureImplemented(FEAT_SVE) || IsFeatureImplemented(FEAT_TME))
Field	52	TBID1	IsFeatureImplemented(FEAT_PAuth)
Field	51	TBID0	IsFeatureImplemented(FEAT_PAuth)
Field	50	HWU162	IsFeatureImplemented(FEAT_HPDS2)
Field	49	HWU161	IsFeatureImplemented(FEAT_HPDS2)
Field	48	HWU160	IsFeatureImplemented(FEAT_HPDS2)
Field	47	HWU159	IsFeatureImplemented(FEAT_HPDS2)
Field	46	HWU062	IsFeatureImplemented(FEAT_HPDS2)
Field	45	HWU061	IsFeatureImplemented(FEAT_HPDS2)
Field	44	HWU060	IsFeatureImplemented(FEAT_HPDS2)
Field	43	HWU059	IsFeatureImplemented(FEAT_HPDS2)
Field	42	HPD1	IsFeatureImplemented(FEAT_HPDS)
Field	41	HPD0	IsFeatureImplemented(FEAT_HPDS)
Field	40	HD	IsFeatureImplemented(FEAT_HAFDBS)
Field	39	HA	IsFeatureImplemented(FEAT_HAFDBS)
Field	38	TBI1
Field	37	TBI0
Field	36	AS
Res0	35
Field	34:32	IPS
Field	31:30	TG1
Field	29:28	SH1
Field	27:26	ORGN1
Field	25:24	IRGN1
Field	23	EPD1
Field	22	A1
Field	21:16	T1SZ
Field	15:14	TG0
Field	13:12	SH0
Field	11:10	ORGN0
Field	9:8	IRGN0
Field	7	EPD0
Res0	6
Field	5:0	T0SZ
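
For anyone wanting to paste this layout into the sysreg file, the field
lines above still need the condition column dropped and the duplicated
entries collapsed. A rough sketch of that post-processing, in Python
rather than jq and not part of this patch: the helper name
to_sysreg_fields is made up for illustration, and it assumes the
tab-separated layout shown above, ignoring accessor and enum lines.

```python
def to_sysreg_fields(lines):
    """Keep Field/Res0/Res1 lines, without the trailing condition column."""
    kinds = {"Field", "Res0", "Res1"}
    seen = set()
    out = []
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if cols[0] not in kinds:
            continue  # accessor lines, bare "true" conditions, etc.
        # Reserved ranges only carry a bit range; fields carry range + name.
        keep = cols[:3] if cols[0] == "Field" else cols[:2]
        entry = "\t".join(keep)
        if entry not in seen:  # conditional fields can repeat (DS above)
            seen.add(entry)
            out.append(entry)
    return out


# A few lines copied from the TCR_EL1 output above.
sample = [
    "TCR_EL1\t[3,0,0,2,2]\tMRS",
    "true",
    "Res0\t63:62",
    "Field\t59\tDS\t(IsFeatureImplemented(FEAT_LPA2) && "
    "(!IsFeatureImplemented(FEAT_D128) || (AArch64 TCR2_EL1.D128 == '0')))",
    "Field\t59\tDS\ttrue",
    "Field\t58\tTCMA1\tIsFeatureImplemented(FEAT_MTE2)",
]

print("\n".join(to_sysreg_fields(sample)))
```

Bit ranges that show up with different field names under different
conditions are kept as separate entries and would still need to be
sorted out by hand.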

I completely expect this to be quickly rewritten by people who know
what they are doing (I don't) and improved as we understand more
of the data model.

[1] https://developer.arm.com/-/cdn-downloads/permalink/Exploration-Tools-OS-Machine-Readable-Data/AARCHMRS_BSD/AARCHMRS_BSD_A_profile-2024-12.tar.gz

Signed-off-by: Marc Zyngier <maz at kernel.org>
Cc: Mark Rutland <mark.rutland at arm.com>
Cc: Catalin Marinas <catalin.marinas at arm.com>
Cc: Will Deacon <will at kernel.org>
Cc: Mark Brown <broonie at kernel.org>
---
 arch/arm64/tools/dumpreg.jq | 124 ++++++++++++++++++++++++++++++++++++
 1 file changed, 124 insertions(+)
 create mode 100644 arch/arm64/tools/dumpreg.jq

diff --git a/arch/arm64/tools/dumpreg.jq b/arch/arm64/tools/dumpreg.jq
new file mode 100644
index 000000000000..84527e8d52bf
--- /dev/null
+++ b/arch/arm64/tools/dumpreg.jq
@@ -0,0 +1,124 @@
+# SPDX-License-Identifier: GPL-2.0
+# dumpreg.jq: JSON arm64 system register data extractor
+#
+# Usage: jq -r --arg REG "XZY_ELx" -f ./dumpreg.jq Registers.json
+
+# Dump a set of semi-pertinent information (encodings, fields,
+# conditions, field position and width) about register XZY_ELx as
+# contained in ARM's AARCHMRS_BSD_A_profile JSON tarball.
+
+# This can/should be used to populate the arch/arm64/tools/sysreg
+# file, instead of copying things by hand.
+
+# The tool currently has a bunch of limitations that users need to be
+# aware of, but that shouldn't have a major impact on the usability:
+
+# - All accessors are shown, irrespective of the conditions in which
+#   the accessors are actually available
+
+# - All Fields.ConstantField are displayed as UnsignedEnum,
+#   irrespective of the signedness of the field (as the JSON doesn't
+#   carry this information).
+
+# - Not all the field types are supported (Fields.Array being the most
+#   obvious one).
+
+# - Value ranges are displayed using '[...]'.
+
+# - MSRimmediate accessors have a gibberish encoding displayed.
+
+# - ... and probably more...
+
+def walknode:
+	if   (._type == "AST.Identifier" or ._type == "AST.Integer") then
+		.value
+	elif (._type == "Types.Field") then
+		@text "\(.value.state) \(.value.name).\(.value.field)"
+	elif (._type == "AST.UnaryOp") then
+		@text "\(.op)\(.expr | walknode)"
+	elif (._type == "AST.Function") then
+		@text "\(.name)(\(.arguments | map(walknode | tostring) | join(", ")))"
+	elif (._type == "AST.DotAtom") then
+		.values | map(walknode) | join(".")
+	elif (._type == "AST.BinaryOp") then
+		@text "(\(.left | walknode) \(.op) \(.right | walknode))"
+	elif (._type == "Values.Value" or ._type == "AST.Bool") then
+		@text "\(.value)"
+	else	# debug catch-all
+		.
+	end;
+
+def range:
+	. as { _type: $type, start: $start, width: $width } |
+	if ($width == 1) then
+		@text "\($start)"
+	else
+		@text "\($start + $width - 1):\($start)"
+	end;
+
+def fld:
+	@text "\(.type)\t\(.range | range)\t\(.name)";
+
+def condition:
+	@text "\(.condition | walknode)";
+
+def unquote:
+	"'" as $q | (.value | ltrimstr($q) | rtrimstr($q));
+
+def binvalue:
+	unquote as $v | @text "\t0b\($v)\tVAL_\($v)";
+
+def dumpconstants:
+	if   (._type == "Values.Value") then
+		binvalue
+	elif (._type == "Values.ValueRange") then
+		(.start | binvalue), @text "\t[...]", (.end | binvalue)
+	end;
+
+def walkfields:
+	if   (._type == "Fields.Reserved" and .value == "RES0") then
+		{ type: "Res0", name: "", range: .rangeset[] } | fld
+	elif (._type == "Fields.Reserved" and .value == "RES1") then
+		{ type: "Res1", name: "", range: .rangeset[] } | fld
+	elif (._type == "Fields.ConditionalField") then
+		.fields[] as $f |
+		{ type: "Field", name: $f.field.name, range: .rangeset[],
+		  condition: $f.condition } |
+		@text "\(fld)\t\(condition)"
+
+	elif (._type == "Fields.Dynamic") then
+		.instances[] | @text "\(.values[] | walkfields)\t\(condition)"
+	elif (._type == "Fields.ConstantField") then
+		({ type: "UnsignedEnum", name: .name, range: .rangeset[] } | fld),
+		(.value.constraints.values[]? | dumpconstants),
+		@text "EndEnum"
+	elif (._type == "Fields.Field") then
+		{ type: "Field", name: .name, range: .rangeset[] } | fld
+	else	# Debug catch all
+		.
+	end;
+
+def walkreg:
+	.fieldsets[] | condition,(.values[] | walkfields);
+
+# https://rosettacode.org/wiki/Non-decimal_radices/Convert#jq
+def bin_to_i:
+  explode | reverse| map(. - 48) |
+  reduce .[] as $c
+      # state: [power, ans]
+      ([1,0]; (.[0] * 2) as $b | [$b, .[1] + (.[0] * $c)]) | .[1];
+
+def encodings:
+	"'" as $q | .encodings |
+	@text "\(map(.value) | map(ltrimstr($q) | rtrimstr($q)) | map(bin_to_i) |
+		[ .[2], .[3], .[0], .[1], .[4]])";
+
+def encoding:
+	"A64." as $a64 |
+	@text "\(.encoding[].asmvalue)\t\(.encoding[] | encodings)\t\(.name | ltrimstr($a64))";
+
+def accessors:
+	.accessors[] | encoding;
+
+.[] | select (._type == "Register" and .name == $REG) |
+      accessors,condition,walkreg
-- 
2.43.0



