From atishp at rivosinc.com Mon Mar 3 12:53:27 2025 From: atishp at rivosinc.com (Atish Kumar Patra) Date: Mon, 3 Mar 2025 12:53:27 -0800 Subject: [PATCH 3/4] KVM: riscv: selftests: Change command line option In-Reply-To: <20250227-eb9e3d8de1de2ff609ac8f64@orel> References: <20250226-kvm_pmu_improve-v1-0-74c058c2bf6d@rivosinc.com> <20250226-kvm_pmu_improve-v1-3-74c058c2bf6d@rivosinc.com> <20250227-eb9e3d8de1de2ff609ac8f64@orel> Message-ID: On Thu, Feb 27, 2025 at 12:08 AM Andrew Jones wrote: > > On Wed, Feb 26, 2025 at 12:25:05PM -0800, Atish Patra wrote: > > The PMU test commandline option takes an argument to disable a > > certain test. The initial assumption behind this was a common use case > > is just to run all the test most of the time. However, running a single > > test seems more useful instead. Especially, the overflow test has been > > helpful to validate PMU virtualizaiton interrupt changes. > > > > Switching the command line option to run a single test instead > > of disabling a single test also allows to provide additional > > test specific arguments to the test. The default without any options > > remains unchanged which continues to run all the tests. 
> > > > Signed-off-by: Atish Patra > > --- > > tools/testing/selftests/kvm/riscv/sbi_pmu_test.c | 40 +++++++++++++++--------- > > 1 file changed, 26 insertions(+), 14 deletions(-) > > > > diff --git a/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c b/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c > > index 284bc80193bd..533b76d0de82 100644 > > --- a/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c > > +++ b/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c > > @@ -39,7 +39,11 @@ static bool illegal_handler_invoked; > > #define SBI_PMU_TEST_SNAPSHOT BIT(2) > > #define SBI_PMU_TEST_OVERFLOW BIT(3) > > > > -static int disabled_tests; > > +struct test_args { > > + int disabled_tests; > > +}; > > + > > +static struct test_args targs; > > > > unsigned long pmu_csr_read_num(int csr_num) > > { > > @@ -604,7 +608,11 @@ static void test_vm_events_overflow(void *guest_code) > > vcpu_init_vector_tables(vcpu); > > /* Initialize guest timer frequency. */ > > timer_freq = vcpu_get_reg(vcpu, RISCV_TIMER_REG(frequency)); > > + > > + /* Export the shared variables to the guest */ > > sync_global_to_guest(vm, timer_freq); > > + sync_global_to_guest(vm, vcpu_shared_irq_count); > > + sync_global_to_guest(vm, targs); > > > > run_vcpu(vcpu); > > > > @@ -613,28 +621,30 @@ static void test_vm_events_overflow(void *guest_code) > > > > static void test_print_help(char *name) > > { > > - pr_info("Usage: %s [-h] [-d ]\n", name); > > - pr_info("\t-d: Test to disable. Available tests are 'basic', 'events', 'snapshot', 'overflow'\n"); > > + pr_info("Usage: %s [-h] [-t ]\n", name); > > + pr_info("\t-t: Test to run (default all). Available tests are 'basic', 'events', 'snapshot', 'overflow'\n"); > > It's probably fine to drop '-d', since we don't make any claims about > support, but doing so does risk breaking some CI somewhere. If that > potential breakage is a concern, then we could keep '-d', since nothing > stops us from having both. 
I don't think we have so much legacy usage with this test that we need to maintain both options. Since this was merged only a few cycles ago, I assume that it's not available in many CI to cause breakage. If somebody running CI actually shouts that it breaks their setup, sure. Otherwise, I feel it will be just confusing to the users. > > > pr_info("\t-h: print this help screen\n"); > > } > > > > static bool parse_args(int argc, char *argv[]) > > { > > int opt; > > - > > - while ((opt = getopt(argc, argv, "hd:")) != -1) { > > + int temp_disabled_tests = SBI_PMU_TEST_BASIC | SBI_PMU_TEST_EVENTS | SBI_PMU_TEST_SNAPSHOT | > > + SBI_PMU_TEST_OVERFLOW; > > + while ((opt = getopt(argc, argv, "h:t:n:")) != -1) { > > '-h' doesn't need an argument and '-n' should be introduced with the next > patch. > Yes. Thanks for catching it. I will fix it in v2. > > switch (opt) { > > - case 'd': > > + case 't': > > if (!strncmp("basic", optarg, 5)) > > - disabled_tests |= SBI_PMU_TEST_BASIC; > > + temp_disabled_tests &= ~SBI_PMU_TEST_BASIC; > > else if (!strncmp("events", optarg, 6)) > > - disabled_tests |= SBI_PMU_TEST_EVENTS; > > + temp_disabled_tests &= ~SBI_PMU_TEST_EVENTS; > > else if (!strncmp("snapshot", optarg, 8)) > > - disabled_tests |= SBI_PMU_TEST_SNAPSHOT; > > + temp_disabled_tests &= ~SBI_PMU_TEST_SNAPSHOT; > > else if (!strncmp("overflow", optarg, 8)) > > - disabled_tests |= SBI_PMU_TEST_OVERFLOW; > > + temp_disabled_tests &= ~SBI_PMU_TEST_OVERFLOW; > > else > > goto done; > > + targs.disabled_tests = temp_disabled_tests; > > break; > > case 'h': > > default: > > @@ -650,25 +660,27 @@ static bool parse_args(int argc, char *argv[]) > > > > int main(int argc, char *argv[]) > > { > > + targs.disabled_tests = 0; > > + > > if (!parse_args(argc, argv)) > > exit(KSFT_SKIP); > > > > - if (!(disabled_tests & SBI_PMU_TEST_BASIC)) { > > + if (!(targs.disabled_tests & SBI_PMU_TEST_BASIC)) { > > test_vm_basic_test(test_pmu_basic_sanity); > > pr_info("SBI PMU basic test : PASS\n"); > 
> } > > > > - if (!(disabled_tests & SBI_PMU_TEST_EVENTS)) { > > + if (!(targs.disabled_tests & SBI_PMU_TEST_EVENTS)) { > > test_vm_events_test(test_pmu_events); > > pr_info("SBI PMU event verification test : PASS\n"); > > } > > > > - if (!(disabled_tests & SBI_PMU_TEST_SNAPSHOT)) { > > + if (!(targs.disabled_tests & SBI_PMU_TEST_SNAPSHOT)) { > > test_vm_events_snapshot_test(test_pmu_events_snaphost); > > pr_info("SBI PMU event verification with snapshot test : PASS\n"); > > } > > > > - if (!(disabled_tests & SBI_PMU_TEST_OVERFLOW)) { > > + if (!(targs.disabled_tests & SBI_PMU_TEST_OVERFLOW)) { > > test_vm_events_overflow(test_pmu_events_overflow); > > pr_info("SBI PMU event verification with overflow test : PASS\n"); > > } > > > > -- > > 2.43.0 > > > > Otherwise, > > Reviewed-by: Andrew Jones From atishp at rivosinc.com Mon Mar 3 13:27:47 2025 From: atishp at rivosinc.com (Atish Kumar Patra) Date: Mon, 3 Mar 2025 13:27:47 -0800 Subject: [PATCH 4/4] KVM: riscv: selftests: Allow number of interrupts to be configurable In-Reply-To: <20250227-f7b303813dab128b5060b0c3@orel> References: <20250226-kvm_pmu_improve-v1-0-74c058c2bf6d@rivosinc.com> <20250226-kvm_pmu_improve-v1-4-74c058c2bf6d@rivosinc.com> <20250227-f7b303813dab128b5060b0c3@orel> Message-ID: On Thu, Feb 27, 2025 at 12:16 AM Andrew Jones wrote: > > On Wed, Feb 26, 2025 at 12:25:06PM -0800, Atish Patra wrote: > > It is helpful to vary the number of the LCOFI interrupts generated > > by the overflow test. Allow additional argument for overflow test > > to accommodate that. It can be easily cross-validated with > > /proc/interrupts output in the host. 
> > > > Signed-off-by: Atish Patra > > --- > > tools/testing/selftests/kvm/riscv/sbi_pmu_test.c | 36 ++++++++++++++++++++---- > > 1 file changed, 30 insertions(+), 6 deletions(-) > > > > diff --git a/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c b/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c > > index 533b76d0de82..7c273a1adb17 100644 > > --- a/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c > > +++ b/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c > > @@ -39,8 +39,10 @@ static bool illegal_handler_invoked; > > #define SBI_PMU_TEST_SNAPSHOT BIT(2) > > #define SBI_PMU_TEST_OVERFLOW BIT(3) > > > > +#define SBI_PMU_OVERFLOW_IRQNUM_DEFAULT 5 > > struct test_args { > > int disabled_tests; > > + int overflow_irqnum; > > }; > > > > static struct test_args targs; > > @@ -478,7 +480,7 @@ static void test_pmu_events_snaphost(void) > > > > static void test_pmu_events_overflow(void) > > { > > - int num_counters = 0; > > + int num_counters = 0, i = 0; > > > > /* Verify presence of SBI PMU and minimum requrired SBI version */ > > verify_sbi_requirement_assert(); > > @@ -495,11 +497,15 @@ static void test_pmu_events_overflow(void) > > * Qemu supports overflow for cycle/instruction. > > * This test may fail on any platform that do not support overflow for these two events. 
> > */ > > - test_pmu_event_overflow(SBI_PMU_HW_CPU_CYCLES); > > - GUEST_ASSERT_EQ(vcpu_shared_irq_count, 1); > > + for (i = 0; i < targs.overflow_irqnum; i++) > > + test_pmu_event_overflow(SBI_PMU_HW_CPU_CYCLES); > > + GUEST_ASSERT_EQ(vcpu_shared_irq_count, targs.overflow_irqnum); > > + > > + vcpu_shared_irq_count = 0; > > > > - test_pmu_event_overflow(SBI_PMU_HW_INSTRUCTIONS); > > - GUEST_ASSERT_EQ(vcpu_shared_irq_count, 2); > > + for (i = 0; i < targs.overflow_irqnum; i++) > > + test_pmu_event_overflow(SBI_PMU_HW_INSTRUCTIONS); > > + GUEST_ASSERT_EQ(vcpu_shared_irq_count, targs.overflow_irqnum); > > > > GUEST_DONE(); > > } > > @@ -621,8 +627,11 @@ static void test_vm_events_overflow(void *guest_code) > > > > static void test_print_help(char *name) > > { > > - pr_info("Usage: %s [-h] [-t ]\n", name); > > + pr_info("Usage: %s [-h] [-t ] [-n ]\n", > > + name); > > pr_info("\t-t: Test to run (default all). Available tests are 'basic', 'events', 'snapshot', 'overflow'\n"); > > + pr_info("\t-n: Number of LCOFI interrupt to trigger for each event in overflow test (default: %d)\n", > > + SBI_PMU_OVERFLOW_IRQNUM_DEFAULT); > > pr_info("\t-h: print this help screen\n"); > > } > > > > @@ -631,6 +640,8 @@ static bool parse_args(int argc, char *argv[]) > > int opt; > > int temp_disabled_tests = SBI_PMU_TEST_BASIC | SBI_PMU_TEST_EVENTS | SBI_PMU_TEST_SNAPSHOT | > > SBI_PMU_TEST_OVERFLOW; > > + int overflow_interrupts = -1; > > Initializing to -1 made me think that '-n 0' would be valid and a way to > disable the overflow test, but... > Is there any benefit? I found it much more convenient to select a single test and run it instead of disabling a single test. Once you select a single test or a set of tests, all other tests are disabled anyway. 
> > + > > while ((opt = getopt(argc, argv, "h:t:n:")) != -1) { > > switch (opt) { > > case 't': > > @@ -646,12 +657,24 @@ static bool parse_args(int argc, char *argv[]) > > goto done; > > targs.disabled_tests = temp_disabled_tests; > > break; > > + case 'n': > > + overflow_interrupts = atoi_positive("Number of LCOFI", optarg); > > ...here we use atoi_positive() and... > > > + break; > > case 'h': > > default: > > goto done; > > } > > } > > > > + if (overflow_interrupts > 0) { > > ...here we only change from the default of 5 for nonzero. > > Should we allow '-n 0'? Otherwise overflow_interrupts can be initialized > to zero (not that it matters). > I will change the default value to 0 to avoid ambiguity for now. Please let me know if you strongly think we should support -n 0. We can always support it. I just don't see the point of specifying the test with options to disable it anymore. > > + if (targs.disabled_tests & SBI_PMU_TEST_OVERFLOW) { > > + pr_info("-n option is only available for overflow test\n"); > > + goto done; > > + } else { > > + targs.overflow_irqnum = overflow_interrupts; > > + } > > + } > > + > > return true; > > done: > > test_print_help(argv[0]); > > @@ -661,6 +684,7 @@ static bool parse_args(int argc, char *argv[]) > > int main(int argc, char *argv[]) > > { > > targs.disabled_tests = 0; > > + targs.overflow_irqnum = SBI_PMU_OVERFLOW_IRQNUM_DEFAULT; > > > > if (!parse_args(argc, argv)) > > exit(KSFT_SKIP); > > > > -- > > 2.43.0 > > > > Thanks, > drew From atishp at rivosinc.com Mon Mar 3 14:53:05 2025 From: atishp at rivosinc.com (Atish Patra) Date: Mon, 03 Mar 2025 14:53:05 -0800 Subject: [PATCH v2 0/4] RISC-V KVM PMU fix and selftest improvement Message-ID: <20250303-kvm_pmu_improve-v2-0-41d177e45929@rivosinc.com> This series adds a fix for KVM PMU code and improves the pmu selftest by allowing generating precise number of interrupts. 
It also provides an additional option to the overflow test that allows the user to generate a custom number of LCOFI interrupts. Signed-off-by: Atish Patra --- Changes in v2: - Initialized the local overflow irq variable to 0 to indicate that it's not an allowed value. - Moved the introduction of argument option `n` to the last patch. - Link to v1: https://lore.kernel.org/r/20250226-kvm_pmu_improve-v1-0-74c058c2bf6d@rivosinc.com --- Atish Patra (4): RISC-V: KVM: Disable the kernel perf counter during configure KVM: riscv: selftests: Do not start the counter in the overflow handler KVM: riscv: selftests: Change command line option KVM: riscv: selftests: Allow number of interrupts to be configurable arch/riscv/kvm/vcpu_pmu.c | 1 + tools/testing/selftests/kvm/riscv/sbi_pmu_test.c | 81 ++++++++++++++++-------- 2 files changed, 57 insertions(+), 25 deletions(-) --- base-commit: 0ad2507d5d93f39619fc42372c347d6006b64319 change-id: 20250225-kvm_pmu_improve-fffd038b2404 -- Regards, Atish patra From atishp at rivosinc.com Mon Mar 3 14:53:06 2025 From: atishp at rivosinc.com (Atish Patra) Date: Mon, 03 Mar 2025 14:53:06 -0800 Subject: [PATCH v2 1/4] RISC-V: KVM: Disable the kernel perf counter during configure In-Reply-To: <20250303-kvm_pmu_improve-v2-0-41d177e45929@rivosinc.com> References: <20250303-kvm_pmu_improve-v2-0-41d177e45929@rivosinc.com> Message-ID: <20250303-kvm_pmu_improve-v2-1-41d177e45929@rivosinc.com> The perf event should be marked disabled during the creation as it is not ready to be scheduled until there is SBI PMU start call or config matching is called with auto start. Otherwise, event add/start gets called during perf_event_create_kernel_counter function. It will be enabled and scheduled to run via perf_event_enable during either the above mentioned scenario. 
Fixes: 0cb74b65d2e5 ("RISC-V: KVM: Implement perf support without sampling") Reviewed-by: Andrew Jones Signed-off-by: Atish Patra --- arch/riscv/kvm/vcpu_pmu.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c index 2707a51b082c..78ac3216a54d 100644 --- a/arch/riscv/kvm/vcpu_pmu.c +++ b/arch/riscv/kvm/vcpu_pmu.c @@ -666,6 +666,7 @@ int kvm_riscv_vcpu_pmu_ctr_cfg_match(struct kvm_vcpu *vcpu, unsigned long ctr_ba .type = etype, .size = sizeof(struct perf_event_attr), .pinned = true, + .disabled = true, /* * It should never reach here if the platform doesn't support the sscofpmf * extension as mode filtering won't work without it. -- 2.43.0 From atishp at rivosinc.com Mon Mar 3 14:53:07 2025 From: atishp at rivosinc.com (Atish Patra) Date: Mon, 03 Mar 2025 14:53:07 -0800 Subject: [PATCH v2 2/4] KVM: riscv: selftests: Do not start the counter in the overflow handler In-Reply-To: <20250303-kvm_pmu_improve-v2-0-41d177e45929@rivosinc.com> References: <20250303-kvm_pmu_improve-v2-0-41d177e45929@rivosinc.com> Message-ID: <20250303-kvm_pmu_improve-v2-2-41d177e45929@rivosinc.com> There is no need to start the counter in the overflow handler as we intend to trigger precise number of LCOFI interrupts through these tests. The overflow irq handler has already stopped the counter. As a result, the stop call from the test function may return already stopped error which is fine as well. 
Reviewed-by: Andrew Jones Signed-off-by: Atish Patra --- tools/testing/selftests/kvm/riscv/sbi_pmu_test.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c b/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c index f45c0ecc902d..284bc80193bd 100644 --- a/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c +++ b/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c @@ -118,8 +118,8 @@ static void stop_counter(unsigned long counter, unsigned long stop_flags) ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_STOP, counter, 1, stop_flags, 0, 0, 0); - __GUEST_ASSERT(ret.error == 0, "Unable to stop counter %ld error %ld\n", - counter, ret.error); + __GUEST_ASSERT(ret.error == 0 || ret.error == SBI_ERR_ALREADY_STOPPED, + "Unable to stop counter %ld error %ld\n", counter, ret.error); } static void guest_illegal_exception_handler(struct ex_regs *regs) @@ -137,7 +137,6 @@ static void guest_irq_handler(struct ex_regs *regs) unsigned int irq_num = regs->cause & ~CAUSE_IRQ_FLAG; struct riscv_pmu_snapshot_data *snapshot_data = snapshot_gva; unsigned long overflown_mask; - unsigned long counter_val = 0; /* Validate that we are in the correct irq handler */ GUEST_ASSERT_EQ(irq_num, IRQ_PMU_OVF); @@ -151,10 +150,6 @@ static void guest_irq_handler(struct ex_regs *regs) GUEST_ASSERT(overflown_mask & 0x01); WRITE_ONCE(vcpu_shared_irq_count, vcpu_shared_irq_count+1); - - counter_val = READ_ONCE(snapshot_data->ctr_values[0]); - /* Now start the counter to mimick the real driver behavior */ - start_counter(counter_in_use, SBI_PMU_START_FLAG_SET_INIT_VALUE, counter_val); } static unsigned long get_counter_index(unsigned long cbase, unsigned long cmask, -- 2.43.0 From atishp at rivosinc.com Mon Mar 3 14:53:08 2025 From: atishp at rivosinc.com (Atish Patra) Date: Mon, 03 Mar 2025 14:53:08 -0800 Subject: [PATCH v2 3/4] KVM: riscv: selftests: Change command line option In-Reply-To: 
<20250303-kvm_pmu_improve-v2-0-41d177e45929@rivosinc.com> References: <20250303-kvm_pmu_improve-v2-0-41d177e45929@rivosinc.com> Message-ID: <20250303-kvm_pmu_improve-v2-3-41d177e45929@rivosinc.com> The PMU test commandline option takes an argument to disable a certain test. The initial assumption behind this was a common use case is just to run all the test most of the time. However, running a single test seems more useful instead. Especially, the overflow test has been helpful to validate PMU virtualizaiton interrupt changes. Switching the command line option to run a single test instead of disabling a single test also allows to provide additional test specific arguments to the test. The default without any options remains unchanged which continues to run all the tests. Reviewed-by: Andrew Jones Signed-off-by: Atish Patra --- tools/testing/selftests/kvm/riscv/sbi_pmu_test.c | 40 +++++++++++++++--------- 1 file changed, 26 insertions(+), 14 deletions(-) diff --git a/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c b/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c index 284bc80193bd..de66099235d9 100644 --- a/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c +++ b/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c @@ -39,7 +39,11 @@ static bool illegal_handler_invoked; #define SBI_PMU_TEST_SNAPSHOT BIT(2) #define SBI_PMU_TEST_OVERFLOW BIT(3) -static int disabled_tests; +struct test_args { + int disabled_tests; +}; + +static struct test_args targs; unsigned long pmu_csr_read_num(int csr_num) { @@ -604,7 +608,11 @@ static void test_vm_events_overflow(void *guest_code) vcpu_init_vector_tables(vcpu); /* Initialize guest timer frequency. 
*/ timer_freq = vcpu_get_reg(vcpu, RISCV_TIMER_REG(frequency)); + + /* Export the shared variables to the guest */ sync_global_to_guest(vm, timer_freq); + sync_global_to_guest(vm, vcpu_shared_irq_count); + sync_global_to_guest(vm, targs); run_vcpu(vcpu); @@ -613,28 +621,30 @@ static void test_vm_events_overflow(void *guest_code) static void test_print_help(char *name) { - pr_info("Usage: %s [-h] [-d ]\n", name); - pr_info("\t-d: Test to disable. Available tests are 'basic', 'events', 'snapshot', 'overflow'\n"); + pr_info("Usage: %s [-h] [-t ]\n", name); + pr_info("\t-t: Test to run (default all). Available tests are 'basic', 'events', 'snapshot', 'overflow'\n"); pr_info("\t-h: print this help screen\n"); } static bool parse_args(int argc, char *argv[]) { int opt; - - while ((opt = getopt(argc, argv, "hd:")) != -1) { + int temp_disabled_tests = SBI_PMU_TEST_BASIC | SBI_PMU_TEST_EVENTS | SBI_PMU_TEST_SNAPSHOT | + SBI_PMU_TEST_OVERFLOW; + while ((opt = getopt(argc, argv, "ht:")) != -1) { switch (opt) { - case 'd': + case 't': if (!strncmp("basic", optarg, 5)) - disabled_tests |= SBI_PMU_TEST_BASIC; + temp_disabled_tests &= ~SBI_PMU_TEST_BASIC; else if (!strncmp("events", optarg, 6)) - disabled_tests |= SBI_PMU_TEST_EVENTS; + temp_disabled_tests &= ~SBI_PMU_TEST_EVENTS; else if (!strncmp("snapshot", optarg, 8)) - disabled_tests |= SBI_PMU_TEST_SNAPSHOT; + temp_disabled_tests &= ~SBI_PMU_TEST_SNAPSHOT; else if (!strncmp("overflow", optarg, 8)) - disabled_tests |= SBI_PMU_TEST_OVERFLOW; + temp_disabled_tests &= ~SBI_PMU_TEST_OVERFLOW; else goto done; + targs.disabled_tests = temp_disabled_tests; break; case 'h': default: @@ -650,25 +660,27 @@ static bool parse_args(int argc, char *argv[]) int main(int argc, char *argv[]) { + targs.disabled_tests = 0; + if (!parse_args(argc, argv)) exit(KSFT_SKIP); - if (!(disabled_tests & SBI_PMU_TEST_BASIC)) { + if (!(targs.disabled_tests & SBI_PMU_TEST_BASIC)) { test_vm_basic_test(test_pmu_basic_sanity); pr_info("SBI PMU basic test : 
PASS\n"); } - if (!(disabled_tests & SBI_PMU_TEST_EVENTS)) { + if (!(targs.disabled_tests & SBI_PMU_TEST_EVENTS)) { test_vm_events_test(test_pmu_events); pr_info("SBI PMU event verification test : PASS\n"); } - if (!(disabled_tests & SBI_PMU_TEST_SNAPSHOT)) { + if (!(targs.disabled_tests & SBI_PMU_TEST_SNAPSHOT)) { test_vm_events_snapshot_test(test_pmu_events_snaphost); pr_info("SBI PMU event verification with snapshot test : PASS\n"); } - if (!(disabled_tests & SBI_PMU_TEST_OVERFLOW)) { + if (!(targs.disabled_tests & SBI_PMU_TEST_OVERFLOW)) { test_vm_events_overflow(test_pmu_events_overflow); pr_info("SBI PMU event verification with overflow test : PASS\n"); } -- 2.43.0 From atishp at rivosinc.com Mon Mar 3 14:53:09 2025 From: atishp at rivosinc.com (Atish Patra) Date: Mon, 03 Mar 2025 14:53:09 -0800 Subject: [PATCH v2 4/4] KVM: riscv: selftests: Allow number of interrupts to be configurable In-Reply-To: <20250303-kvm_pmu_improve-v2-0-41d177e45929@rivosinc.com> References: <20250303-kvm_pmu_improve-v2-0-41d177e45929@rivosinc.com> Message-ID: <20250303-kvm_pmu_improve-v2-4-41d177e45929@rivosinc.com> It is helpful to vary the number of the LCOFI interrupts generated by the overflow test. Allow additional argument for overflow test to accommodate that. It can be easily cross-validated with /proc/interrupts output in the host. 
Signed-off-by: Atish Patra --- tools/testing/selftests/kvm/riscv/sbi_pmu_test.c | 38 +++++++++++++++++++----- 1 file changed, 31 insertions(+), 7 deletions(-) diff --git a/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c b/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c index de66099235d9..03406de4989d 100644 --- a/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c +++ b/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c @@ -39,8 +39,10 @@ static bool illegal_handler_invoked; #define SBI_PMU_TEST_SNAPSHOT BIT(2) #define SBI_PMU_TEST_OVERFLOW BIT(3) +#define SBI_PMU_OVERFLOW_IRQNUM_DEFAULT 5 struct test_args { int disabled_tests; + int overflow_irqnum; }; static struct test_args targs; @@ -478,7 +480,7 @@ static void test_pmu_events_snaphost(void) static void test_pmu_events_overflow(void) { - int num_counters = 0; + int num_counters = 0, i = 0; /* Verify presence of SBI PMU and minimum requrired SBI version */ verify_sbi_requirement_assert(); @@ -495,11 +497,15 @@ static void test_pmu_events_overflow(void) * Qemu supports overflow for cycle/instruction. * This test may fail on any platform that do not support overflow for these two events. */ - test_pmu_event_overflow(SBI_PMU_HW_CPU_CYCLES); - GUEST_ASSERT_EQ(vcpu_shared_irq_count, 1); + for (i = 0; i < targs.overflow_irqnum; i++) + test_pmu_event_overflow(SBI_PMU_HW_CPU_CYCLES); + GUEST_ASSERT_EQ(vcpu_shared_irq_count, targs.overflow_irqnum); + + vcpu_shared_irq_count = 0; - test_pmu_event_overflow(SBI_PMU_HW_INSTRUCTIONS); - GUEST_ASSERT_EQ(vcpu_shared_irq_count, 2); + for (i = 0; i < targs.overflow_irqnum; i++) + test_pmu_event_overflow(SBI_PMU_HW_INSTRUCTIONS); + GUEST_ASSERT_EQ(vcpu_shared_irq_count, targs.overflow_irqnum); GUEST_DONE(); } @@ -621,8 +627,11 @@ static void test_vm_events_overflow(void *guest_code) static void test_print_help(char *name) { - pr_info("Usage: %s [-h] [-t ]\n", name); + pr_info("Usage: %s [-h] [-t ] [-n ]\n", + name); pr_info("\t-t: Test to run (default all). 
Available tests are 'basic', 'events', 'snapshot', 'overflow'\n"); + pr_info("\t-n: Number of LCOFI interrupt to trigger for each event in overflow test (default: %d)\n", + SBI_PMU_OVERFLOW_IRQNUM_DEFAULT); pr_info("\t-h: print this help screen\n"); } @@ -631,7 +640,9 @@ static bool parse_args(int argc, char *argv[]) int opt; int temp_disabled_tests = SBI_PMU_TEST_BASIC | SBI_PMU_TEST_EVENTS | SBI_PMU_TEST_SNAPSHOT | SBI_PMU_TEST_OVERFLOW; - while ((opt = getopt(argc, argv, "ht:")) != -1) { + int overflow_interrupts = 0; + + while ((opt = getopt(argc, argv, "ht:n:")) != -1) { switch (opt) { case 't': if (!strncmp("basic", optarg, 5)) @@ -646,12 +657,24 @@ static bool parse_args(int argc, char *argv[]) goto done; targs.disabled_tests = temp_disabled_tests; break; + case 'n': + overflow_interrupts = atoi_positive("Number of LCOFI", optarg); + break; case 'h': default: goto done; } } + if (overflow_interrupts > 0) { + if (targs.disabled_tests & SBI_PMU_TEST_OVERFLOW) { + pr_info("-n option is only available for overflow test\n"); + goto done; + } else { + targs.overflow_irqnum = overflow_interrupts; + } + } + return true; done: test_print_help(argv[0]); @@ -661,6 +684,7 @@ static bool parse_args(int argc, char *argv[]) int main(int argc, char *argv[]) { targs.disabled_tests = 0; + targs.overflow_irqnum = SBI_PMU_OVERFLOW_IRQNUM_DEFAULT; if (!parse_args(argc, argv)) exit(KSFT_SKIP); -- 2.43.0 From ajones at ventanamicro.com Tue Mar 4 00:58:15 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Tue, 4 Mar 2025 09:58:15 +0100 Subject: [PATCH 4/4] KVM: riscv: selftests: Allow number of interrupts to be configurable In-Reply-To: References: <20250226-kvm_pmu_improve-v1-0-74c058c2bf6d@rivosinc.com> <20250226-kvm_pmu_improve-v1-4-74c058c2bf6d@rivosinc.com> <20250227-f7b303813dab128b5060b0c3@orel> Message-ID: <20250304-5a85b3a246f14f60f61a45e0@orel> On Mon, Mar 03, 2025 at 01:27:47PM -0800, Atish Kumar Patra wrote: > On Thu, Feb 27, 2025 at 12:16 AM Andrew Jones wrote: 
> > > > On Wed, Feb 26, 2025 at 12:25:06PM -0800, Atish Patra wrote: ... > I will change the default value to 0 to avoid ambiguity for now. > Please let me know if you strongly think we should support -n 0. > We can always support it. I just don't see the point of specifying the > test with options to disable it anymore. > I don't mind not supporting '-n 0'. Thanks, drew From ajones at ventanamicro.com Tue Mar 4 01:00:41 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Tue, 4 Mar 2025 10:00:41 +0100 Subject: [PATCH v2 4/4] KVM: riscv: selftests: Allow number of interrupts to be configurable In-Reply-To: <20250303-kvm_pmu_improve-v2-4-41d177e45929@rivosinc.com> References: <20250303-kvm_pmu_improve-v2-0-41d177e45929@rivosinc.com> <20250303-kvm_pmu_improve-v2-4-41d177e45929@rivosinc.com> Message-ID: <20250304-bb96798e9a1fd292430df3e8@orel> On Mon, Mar 03, 2025 at 02:53:09PM -0800, Atish Patra wrote: > It is helpful to vary the number of the LCOFI interrupts generated > by the overflow test. Allow additional argument for overflow test > to accommodate that. It can be easily cross-validated with > /proc/interrupts output in the host. 
> > Signed-off-by: Atish Patra > --- > tools/testing/selftests/kvm/riscv/sbi_pmu_test.c | 38 +++++++++++++++++++----- > 1 file changed, 31 insertions(+), 7 deletions(-) > Reviewed-by: Andrew Jones From andrew.jones at linux.dev Tue Mar 4 01:12:30 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Tue, 4 Mar 2025 10:12:30 +0100 Subject: [kvm-unit-tests PATCH 1/2] configure: Allow earlycon for all architectures In-Reply-To: <20250221162753.126290-5-andrew.jones@linux.dev> References: <20250221162753.126290-4-andrew.jones@linux.dev> <20250221162753.126290-5-andrew.jones@linux.dev> Message-ID: <20250304-4e2ee456d30819f9a38239c8@orel> On Fri, Feb 21, 2025 at 05:27:55PM +0100, Andrew Jones wrote: > From: Andrew Jones > > earlycon could be used by any architecture so check it outside the > arm block and apply it to riscv right away. > > Signed-off-by: Andrew Jones > --- > configure | 88 +++++++++++++++++++++++++++---------------------------- > 1 file changed, 43 insertions(+), 45 deletions(-) > > diff --git a/configure b/configure > index 86cf1da36467..7e462aaf1190 100755 > --- a/configure > +++ b/configure > @@ -78,10 +78,9 @@ usage() { > 4k [default], 16k, 64k for arm64. > 4k [default], 64k for ppc64. > --earlycon=EARLYCON > - Specify the UART name, type and address (optional, arm and > - arm64 only). The specified address will overwrite the UART > - address set by the --target option. EARLYCON can be one of > - (case sensitive): > + Specify the UART name, type and address (optional). > + The specified address will overwrite the UART address set by > + the --target option. EARLYCON can be one of (case sensitive): I'll add (arm/arm64 and riscv32/riscv64 only) since no other architecture is paying attention to the parameter at this time. Thanks, drew > uart[8250],mmio,ADDR > Specify an 8250 compatible UART at address ADDR. Supported > register stride is 8 bit only. 
> @@ -283,6 +282,41 @@ else > fi > fi > > +if [ "$earlycon" ]; then > + IFS=, read -r name type_addr addr <<<"$earlycon" > + if [ "$name" != "uart" ] && [ "$name" != "uart8250" ] && [ "$name" != "pl011" ]; then > + echo "unknown earlycon name: $name" > + usage > + fi > + > + if [ "$name" = "pl011" ]; then > + if [ -z "$addr" ]; then > + addr=$type_addr > + else > + if [ "$type_addr" != "mmio32" ]; then > + echo "unknown $name earlycon type: $type_addr" > + usage > + fi > + fi > + else > + if [ "$type_addr" != "mmio" ]; then > + echo "unknown $name earlycon type: $type_addr" > + usage > + fi > + fi > + > + if [ -z "$addr" ]; then > + echo "missing $name earlycon address" > + usage > + fi > + if [[ $addr =~ ^0(x|X)[0-9a-fA-F]+$ ]] || [[ $addr =~ ^[0-9]+$ ]]; then > + uart_early_addr=$addr > + else > + echo "invalid $name earlycon address: $addr" > + usage > + fi > +fi > + > [ -z "$processor" ] && processor="$arch" > > if [ "$processor" = "arm64" ]; then > @@ -296,51 +330,14 @@ if [ "$arch" = "i386" ] || [ "$arch" = "x86_64" ]; then > elif [ "$arch" = "arm" ] || [ "$arch" = "arm64" ]; then > testdir=arm > if [ "$target" = "qemu" ]; then > - arm_uart_early_addr=0x09000000 > + : "${uart_early_addr:=0x9000000}" > elif [ "$target" = "kvmtool" ]; then > - arm_uart_early_addr=0x1000000 > + : "${uart_early_addr:=0x1000000}" > errata_force=1 > else > echo "--target must be one of 'qemu' or 'kvmtool'!" 
> usage > fi > - > - if [ "$earlycon" ]; then > - IFS=, read -r name type_addr addr <<<"$earlycon" > - if [ "$name" != "uart" ] && [ "$name" != "uart8250" ] && > - [ "$name" != "pl011" ]; then > - echo "unknown earlycon name: $name" > - usage > - fi > - > - if [ "$name" = "pl011" ]; then > - if [ -z "$addr" ]; then > - addr=$type_addr > - else > - if [ "$type_addr" != "mmio32" ]; then > - echo "unknown $name earlycon type: $type_addr" > - usage > - fi > - fi > - else > - if [ "$type_addr" != "mmio" ]; then > - echo "unknown $name earlycon type: $type_addr" > - usage > - fi > - fi > - > - if [ -z "$addr" ]; then > - echo "missing $name earlycon address" > - usage > - fi > - if [[ $addr =~ ^0(x|X)[0-9a-fA-F]+$ ]] || > - [[ $addr =~ ^[0-9]+$ ]]; then > - arm_uart_early_addr=$addr > - else > - echo "invalid $name earlycon address: $addr" > - usage > - fi > - fi > elif [ "$arch" = "ppc64" ]; then > testdir=powerpc > firmware="$testdir/boot_rom.bin" > @@ -351,6 +348,7 @@ elif [ "$arch" = "ppc64" ]; then > elif [ "$arch" = "riscv32" ] || [ "$arch" = "riscv64" ]; then > testdir=riscv > arch_libdir=riscv > + : "${uart_early_addr:=0x10000000}" > elif [ "$arch" = "s390x" ]; then > testdir=s390x > else > @@ -491,7 +489,7 @@ EOF > if [ "$arch" = "arm" ] || [ "$arch" = "arm64" ]; then > cat <> lib/config.h > > -#define CONFIG_UART_EARLY_BASE ${arm_uart_early_addr} > +#define CONFIG_UART_EARLY_BASE ${uart_early_addr} > #define CONFIG_ERRATA_FORCE ${errata_force} > > EOF > @@ -506,7 +504,7 @@ EOF > elif [ "$arch" = "riscv32" ] || [ "$arch" = "riscv64" ]; then > cat <> lib/config.h > > -#define CONFIG_UART_EARLY_BASE 0x10000000 > +#define CONFIG_UART_EARLY_BASE ${uart_early_addr} > > EOF > fi > -- > 2.48.1 > From andrew.jones at linux.dev Tue Mar 4 01:31:21 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Tue, 4 Mar 2025 10:31:21 +0100 Subject: [kvm-unit-tests PATCH v2 00/11] riscv: sbi: Test improvements and a couple new In-Reply-To: 
<20250227141946.91604-13-andrew.jones@linux.dev>
References: <20250227141946.91604-13-andrew.jones@linux.dev>
Message-ID: <20250304-d6467defdcdc60f372296de0@orel>

On Thu, Feb 27, 2025 at 03:19:24PM +0100, Andrew Jones wrote:
> Improvements:
>  - Ensure system suspend test won't hang
>  - Ensure HSM suspend tests won't hang
>  - Ensure state is cleaned up at the end of one extension's test before
>    starting another
>  - Probe SUSP and skip cleanly
>  - Minor SUSP test output improvement
>  - Dropped upper bits fwft test for misaligned_exc_deleg
>
> New tests:
>  - Check bad FIDs for all extensions
>  - Check SUSP sleep_type upper bits are ignored on RV64
>  - FWFT pte-hw-ad-updating tests
>
> v2:
>  - Dropped upper bits fwft test for misaligned_exc_deleg rather than
>    change to kfail
>  - Added FWFT pte-hw-ad-updating tests
>  - Added a couple Clément tags
>

Merged.

Thanks,
drew

From andrew.jones at linux.dev Tue Mar 4 01:31:48 2025
From: andrew.jones at linux.dev (Andrew Jones)
Date: Tue, 4 Mar 2025 10:31:48 +0100
Subject: [kvm-unit-tests PATCH 0/2] riscv: Run with other QEMU models
In-Reply-To: <20250221162753.126290-4-andrew.jones@linux.dev>
References: <20250221162753.126290-4-andrew.jones@linux.dev>
Message-ID: <20250304-47806ddca2d868f97d63de6b@orel>

On Fri, Feb 21, 2025 at 05:27:54PM +0100, Andrew Jones wrote:
> Provide a couple patches allowing a QEMU machine model other than 'virt'
> to be used. We just need to be able to override 'virt' in the command
> line and it's also nice to be able to specify a different UART address
> for any early (pre DT parsing) outputs.
>
> Andrew Jones (2):
>   configure: Allow earlycon for all architectures
>   riscv: Introduce MACHINE_OVERRIDE
>
>  configure | 88 +++++++++++++++++++++++++++---------------------------
>  riscv/run |  6 ++--
>  2 files changed, 47 insertions(+), 47 deletions(-)
>
> --
> 2.48.1

Merged.
Thanks,
drew

From andrew.jones at linux.dev Tue Mar 4 01:32:21 2025
From: andrew.jones at linux.dev (Andrew Jones)
Date: Tue, 4 Mar 2025 10:32:21 +0100
Subject: [kvm-unit-tests PATCH v2 0/3] riscv: sbi: Provide sbiret_report/check
In-Reply-To: <20250218185401.41250-5-andrew.jones@linux.dev>
References: <20250218185401.41250-5-andrew.jones@linux.dev>
Message-ID: <20250304-02695193b45ff3d30c19b561@orel>

On Tue, Feb 18, 2025 at 07:54:01PM +0100, Andrew Jones wrote:
> It's common for SBI tests to expect a particular error return (or no
> error, in which case it expects the returned error to be SBI_SUCCESS).
> When we don't get the expected error it'd be nice to output what we
> did get. gen_report() in the SBI tests was doing that for the BASE
> tests, improve it and export it to be used by all SBI tests.
>
> v2:
>  - Output expected error string with sbiret_report_error() [Clément]
>  - Picked up Clément's tags except for patch3 since it changed a
>    decent amount.
>
> Andrew Jones (3):
>   riscv: sbi: Improve gen_report
>   riscv: sbi: Export sbiret_report/check
>   riscv: sbi: Add sbiret_report_error
>
>  riscv/sbi-tests.h | 36 ++++++++++++++++++++++++++++++++++++
>  riscv/sbi.c       | 30 ++++++++----------------------
>  2 files changed, 44 insertions(+), 22 deletions(-)
>
> --
> 2.48.1
>

Merged.

Thanks,
drew

From andrew.jones at linux.dev Tue Mar 4 01:34:18 2025
From: andrew.jones at linux.dev (Andrew Jones)
Date: Tue, 4 Mar 2025 10:34:18 +0100
Subject: [kvm-unit-tests PATCH v3 0/2] Add support for SBI FWFT extension testing
In-Reply-To: <20250129-4e54cccfa2abab6dba9a608b@orel>
References: <20250128141543.1338677-1-cleger@rivosinc.com>
 <20250129-4e54cccfa2abab6dba9a608b@orel>
Message-ID: <20250304-5e0b7b5bf691d3ba97621388@orel>

On Wed, Jan 29, 2025 at 03:18:24PM +0100, Andrew Jones wrote:
> On Tue, Jan 28, 2025 at 03:15:40PM +0100, Clément Léger wrote:
> > This series adds a minimal set of tests for the FWFT extension. Reserved
> > range as well as misaligned exception delegation.
> > A commit coming from
> > the SSE tests series is also included in this series to add -deps
> > makefile notation.
> >
> > ---
> >
> > V3:
> >  - Rebase on top of andrew/riscv/sbi
> >  - Use sbiret_report_error()
> >  - Add helpers for MISALIGNED_EXC_DELEG fwft set/get
> >  - Add a comment on misaligned trap handling
> >
> > V2:
> >  - Added fwft_{get/set}_raw() to test invalid > 32 bits ids
> >  - Added test for invalid flags/value > 32 bits
> >  - Added test for lock feature
> >  - Use an enum for FWFT functions
> >  - Replace hardcoded 1 << with BIT()
> >  - Fix fwft_get/set return value
> >  - Split set/get tests for reserved ranges
> >  - Added push/pop to arch -c option
> >  - Remove leftover of manual probing code
> >
> > Clément Léger (2):
> >   riscv: Add "-deps" handling for tests
> >   riscv: Add tests for SBI FWFT extension
> >
> >  riscv/Makefile      |   8 +-
> >  lib/riscv/asm/sbi.h |  34 ++++++++
> >  riscv/sbi-fwft.c    | 190 ++++++++++++++++++++++++++++++++++++++++++++
> >  riscv/sbi.c         |   3 +
> >  4 files changed, 232 insertions(+), 3 deletions(-)
> >  create mode 100644 riscv/sbi-fwft.c
> >
> > --
> > 2.47.1
> >
>
> Applied to riscv/sbi
>
> https://gitlab.com/jones-drew/kvm-unit-tests/-/commits/riscv/sbi
>

Merged.

Thanks,
drew

From thuth at redhat.com Thu Mar 6 01:00:57 2025
From: thuth at redhat.com (Thomas Huth)
Date: Thu, 6 Mar 2025 10:00:57 +0100
Subject: [RFC kvm-unit-tests PATCH] lib: Use __ASSEMBLER__ instead of __ASSEMBLY__
In-Reply-To: <20250222014526.2302653-1-seanjc@google.com>
References: <20250222014526.2302653-1-seanjc@google.com>
Message-ID:

On 22/02/2025 02.45, Sean Christopherson wrote:
> Convert all non-x86 #ifdefs from __ASSEMBLY__ to __ASSEMBLER__, and remove
> all manual __ASSEMBLY__ #defines. __ASSEMBLY__ was inherited blindly from
> the Linux kernel, and must be manually defined, e.g. through build rules
> or with the aforementioned explicit #defines in assembly code.
>
> __ASSEMBLER__ on the other hand is automatically defined by the compiler
> when preprocessing assembly, i.e. doesn't require manual #defines for
> the code to function correctly.
>
> Ignore x86, as x86 doesn't actually rely on __ASSEMBLY__ at the moment,
> and is undergoing a parallel cleanup.
>
> Signed-off-by: Sean Christopherson
> ---
>
> Completely untested. This is essentially a "rage" patch after spending
> way, way too much time trying to understand why I couldn't include some
> __ASSEMBLY__ protected headers in x86 assembly files.

Thanks, applied (after fixing the spot that Andrew mentioned and another
one that has been merged in between)!

BTW, do you happen to know why the kernel uses __ASSEMBLY__ and not
__ASSEMBLER__? Just grown historically, or is there a real reason?

 Thomas

From cleger at rivosinc.com Thu Mar 6 02:04:48 2025
From: cleger at rivosinc.com (Clément Léger)
Date: Thu, 6 Mar 2025 11:04:48 +0100
Subject: [kvm-unit-tests PATCH v7 5/6] lib: riscv: Add SBI SSE support
In-Reply-To: <20250227-a28f41972a73e8269d26b461@orel>
References: <20250214114423.1071621-1-cleger@rivosinc.com>
 <20250214114423.1071621-6-cleger@rivosinc.com>
 <20250227-a28f41972a73e8269d26b461@orel>
Message-ID:

On 27/02/2025 17:03, Andrew Jones wrote:
> On Fri, Feb 14, 2025 at 12:44:18PM +0100, Clément Léger wrote:
>> Add support for registering and handling SSE events. This will be used
>> by sbi test as well as upcoming double trap tests.
>> >> Signed-off-by: Clément Léger >> --- >> riscv/Makefile | 2 + >> lib/riscv/asm/csr.h | 1 + >> lib/riscv/asm/sbi-sse.h | 48 +++++++++++++++++++ >> lib/riscv/sbi-sse-asm.S | 103 ++++++++++++++++++++++++++++++++++++++++ >> lib/riscv/asm-offsets.c | 9 ++++ >> lib/riscv/sbi-sse.c | 84 ++++++++++++++++++++++++++++++++ >> 6 files changed, 247 insertions(+) >> create mode 100644 lib/riscv/asm/sbi-sse.h >> create mode 100644 lib/riscv/sbi-sse-asm.S >> create mode 100644 lib/riscv/sbi-sse.c >> >> diff --git a/riscv/Makefile b/riscv/Makefile >> index 02d2ac39..ed590ede 100644 >> --- a/riscv/Makefile >> +++ b/riscv/Makefile >> @@ -43,6 +43,8 @@ cflatobjs += lib/riscv/setup.o >> cflatobjs += lib/riscv/smp.o >> cflatobjs += lib/riscv/stack.o >> cflatobjs += lib/riscv/timer.o >> +cflatobjs += lib/riscv/sbi-sse-asm.o >> +cflatobjs += lib/riscv/sbi-sse.o >> ifeq ($(ARCH),riscv32) >> cflatobjs += lib/ldiv32.o >> endif >> diff --git a/lib/riscv/asm/csr.h b/lib/riscv/asm/csr.h >> index 16f5ddd7..389d9850 100644 >> --- a/lib/riscv/asm/csr.h >> +++ b/lib/riscv/asm/csr.h >> @@ -17,6 +17,7 @@ >> #define CSR_TIME 0xc01 >> >> #define SR_SIE _AC(0x00000002, UL) >> +#define SR_SPP _AC(0x00000100, UL) >> >> /* Exception cause high bit - is an interrupt if set */ >> #define CAUSE_IRQ_FLAG (_AC(1, UL) << (__riscv_xlen - 1)) >> diff --git a/lib/riscv/asm/sbi-sse.h b/lib/riscv/asm/sbi-sse.h >> new file mode 100644 >> index 00000000..ba18ce27 >> --- /dev/null >> +++ b/lib/riscv/asm/sbi-sse.h >> @@ -0,0 +1,48 @@ >> +/* SPDX-License-Identifier: GPL-2.0-only */ >> +/* >> + * SBI SSE library interface >> + * >> + * Copyright (C) 2025, Rivos Inc., Clément Léger >> + */ >> +#ifndef _RISCV_SSE_H_ >> +#define _RISCV_SSE_H_ >> + >> +#include >> + >> +typedef void (*sbi_sse_handler_fn)(void *data, struct pt_regs *regs, unsigned int hartid); >> + >> +struct sbi_sse_handler_arg { >> + unsigned long reg_tmp; >> + sbi_sse_handler_fn handler; >> + void *handler_data; >> + void *stack; >> +}; >> + >> +extern
void sbi_sse_entry(void); >> + >> +static inline bool sbi_sse_event_is_global(uint32_t event_id) >> +{ >> + return !!(event_id & SBI_SSE_EVENT_GLOBAL_BIT); >> +} >> + >> +struct sbiret sbi_sse_read_attrs_raw(unsigned long event_id, unsigned long base_attr_id, >> + unsigned long attr_count, unsigned long phys_lo, >> + unsigned long phys_hi); >> +struct sbiret sbi_sse_read_attrs(unsigned long event_id, unsigned long base_attr_id, >> + unsigned long attr_count, unsigned long *values); >> +struct sbiret sbi_sse_write_attrs_raw(unsigned long event_id, unsigned long base_attr_id, >> + unsigned long attr_count, unsigned long phys_lo, >> + unsigned long phys_hi); >> +struct sbiret sbi_sse_write_attrs(unsigned long event_id, unsigned long base_attr_id, >> + unsigned long attr_count, unsigned long *values); >> +struct sbiret sbi_sse_register_raw(unsigned long event_id, unsigned long entry_pc, >> + unsigned long entry_arg); >> +struct sbiret sbi_sse_register(unsigned long event_id, struct sbi_sse_handler_arg *arg); >> +struct sbiret sbi_sse_unregister(unsigned long event_id); >> +struct sbiret sbi_sse_enable(unsigned long event_id); >> +struct sbiret sbi_sse_hart_mask(void); >> +struct sbiret sbi_sse_hart_unmask(void); >> +struct sbiret sbi_sse_inject(unsigned long event_id, unsigned long hart_id); >> +struct sbiret sbi_sse_disable(unsigned long event_id); > > nit: I'd put disable up by enable to keep pairs together. > >> + >> +#endif /* !_RISCV_SSE_H_ */ >> diff --git a/lib/riscv/sbi-sse-asm.S b/lib/riscv/sbi-sse-asm.S >> new file mode 100644 >> index 00000000..5f60b839 >> --- /dev/null >> +++ b/lib/riscv/sbi-sse-asm.S >> @@ -0,0 +1,103 @@ >> +/* SPDX-License-Identifier: GPL-2.0 */ >> +/* >> + * RISC-V SSE events entry point. >> + * >> + * Copyright (C) 2025, Rivos Inc., Clément Léger >> + */ >> +#define __ASSEMBLY__ >> +#include >> +#include >> +#include > > nit: asm-offsets above csr if we want to practice alphabetical ordering.
> >> +#include >> + >> +.section .text >> +.global sbi_sse_entry >> +sbi_sse_entry: >> + /* Save stack temporarily */ >> + REG_S sp, SBI_SSE_REG_TMP(a7) >> + /* Set entry stack */ >> + REG_L sp, SBI_SSE_HANDLER_STACK(a7) >> + >> + addi sp, sp, -(PT_SIZE) >> + REG_S ra, PT_RA(sp) >> + REG_S s0, PT_S0(sp) >> + REG_S s1, PT_S1(sp) >> + REG_S s2, PT_S2(sp) >> + REG_S s3, PT_S3(sp) >> + REG_S s4, PT_S4(sp) >> + REG_S s5, PT_S5(sp) >> + REG_S s6, PT_S6(sp) >> + REG_S s7, PT_S7(sp) >> + REG_S s8, PT_S8(sp) >> + REG_S s9, PT_S9(sp) >> + REG_S s10, PT_S10(sp) >> + REG_S s11, PT_S11(sp) >> + REG_S tp, PT_TP(sp) >> + REG_S t0, PT_T0(sp) >> + REG_S t1, PT_T1(sp) >> + REG_S t2, PT_T2(sp) >> + REG_S t3, PT_T3(sp) >> + REG_S t4, PT_T4(sp) >> + REG_S t5, PT_T5(sp) >> + REG_S t6, PT_T6(sp) >> + REG_S gp, PT_GP(sp) >> + REG_S a0, PT_A0(sp) >> + REG_S a1, PT_A1(sp) >> + REG_S a2, PT_A2(sp) >> + REG_S a3, PT_A3(sp) >> + REG_S a4, PT_A4(sp) >> + REG_S a5, PT_A5(sp) >> + csrr a1, CSR_SEPC >> + REG_S a1, PT_EPC(sp) >> + csrr a2, CSR_SSTATUS >> + REG_S a2, PT_STATUS(sp) >> + >> + REG_L a0, SBI_SSE_REG_TMP(a7) >> + REG_S a0, PT_SP(sp) >> + >> + REG_L t0, SBI_SSE_HANDLER(a7) >> + REG_L a0, SBI_SSE_HANDLER_DATA(a7) >> + mv a1, sp >> + mv a2, a6 >> + jalr t0 >> + >> + REG_L a1, PT_EPC(sp) >> + REG_L a2, PT_STATUS(sp) >> + csrw CSR_SEPC, a1 >> + csrw CSR_SSTATUS, a2 >> + >> + REG_L ra, PT_RA(sp) >> + REG_L s0, PT_S0(sp) >> + REG_L s1, PT_S1(sp) >> + REG_L s2, PT_S2(sp) >> + REG_L s3, PT_S3(sp) >> + REG_L s4, PT_S4(sp) >> + REG_L s5, PT_S5(sp) >> + REG_L s6, PT_S6(sp) >> + REG_L s7, PT_S7(sp) >> + REG_L s8, PT_S8(sp) >> + REG_L s9, PT_S9(sp) >> + REG_L s10, PT_S10(sp) >> + REG_L s11, PT_S11(sp) >> + REG_L tp, PT_TP(sp) >> + REG_L t0, PT_T0(sp) >> + REG_L t1, PT_T1(sp) >> + REG_L t2, PT_T2(sp) >> + REG_L t3, PT_T3(sp) >> + REG_L t4, PT_T4(sp) >> + REG_L t5, PT_T5(sp) >> + REG_L t6, PT_T6(sp) >> + REG_L gp, PT_GP(sp) >> + REG_L a0, PT_A0(sp) >> + REG_L a1, PT_A1(sp) >> + REG_L a2, PT_A2(sp) >> + 
REG_L a3, PT_A3(sp) >> + REG_L a4, PT_A4(sp) >> + REG_L a5, PT_A5(sp) >> + >> + REG_L sp, PT_SP(sp) >> + >> + li a7, ASM_SBI_EXT_SSE >> + li a6, ASM_SBI_EXT_SSE_COMPLETE >> + ecall > > nit: Format asm with a tab after the operator like save/restore_context do. > >> + >> diff --git a/lib/riscv/asm-offsets.c b/lib/riscv/asm-offsets.c >> index 6c511c14..77e5d26d 100644 >> --- a/lib/riscv/asm-offsets.c >> +++ b/lib/riscv/asm-offsets.c >> @@ -4,6 +4,7 @@ >> #include >> #include >> #include >> +#include > > nit: alphabetize > >> >> int main(void) >> { >> @@ -63,5 +64,13 @@ int main(void) >> OFFSET(THREAD_INFO_HARTID, thread_info, hartid); >> DEFINE(THREAD_INFO_SIZE, sizeof(struct thread_info)); >> >> + DEFINE(ASM_SBI_EXT_SSE, SBI_EXT_SSE); >> + DEFINE(ASM_SBI_EXT_SSE_COMPLETE, SBI_EXT_SSE_COMPLETE); >> + >> + OFFSET(SBI_SSE_REG_TMP, sbi_sse_handler_arg, reg_tmp); >> + OFFSET(SBI_SSE_HANDLER, sbi_sse_handler_arg, handler); >> + OFFSET(SBI_SSE_HANDLER_DATA, sbi_sse_handler_arg, handler_data); >> + OFFSET(SBI_SSE_HANDLER_STACK, sbi_sse_handler_arg, stack); >> + >> return 0; >> } >> diff --git a/lib/riscv/sbi-sse.c b/lib/riscv/sbi-sse.c >> new file mode 100644 >> index 00000000..bc4dd10e >> --- /dev/null >> +++ b/lib/riscv/sbi-sse.c > > I think we could just put these wrappers in lib/riscv/sbi.c, but OK. 
Hey Andrew, I'll move everything in sbi.c ;) > >> @@ -0,0 +1,84 @@ >> +// SPDX-License-Identifier: GPL-2.0-only >> +/* >> + * SBI SSE library >> + * >> + * Copyright (C) 2025, Rivos Inc., Clément Léger >> + */ >> +#include >> +#include >> +#include > > nit: alphabetize > >> + >> +struct sbiret sbi_sse_read_attrs_raw(unsigned long event_id, unsigned long base_attr_id, >> + unsigned long attr_count, unsigned long phys_lo, >> + unsigned long phys_hi) >> +{ >> + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_READ_ATTRS, event_id, base_attr_id, attr_count, >> + phys_lo, phys_hi, 0); >> +} >> + >> +struct sbiret sbi_sse_read_attrs(unsigned long event_id, unsigned long base_attr_id, >> + unsigned long attr_count, unsigned long *values) >> +{ >> + phys_addr_t p = virt_to_phys(values); >> + >> + return sbi_sse_read_attrs_raw(event_id, base_attr_id, attr_count, lower_32_bits(p), >> + upper_32_bits(p)); >> +} >> + >> +struct sbiret sbi_sse_write_attrs_raw(unsigned long event_id, unsigned long base_attr_id, >> + unsigned long attr_count, unsigned long phys_lo, >> + unsigned long phys_hi) >> +{ >> + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_WRITE_ATTRS, event_id, base_attr_id, attr_count, >> + phys_lo, phys_hi, 0); >> +} >> + >> +struct sbiret sbi_sse_write_attrs(unsigned long event_id, unsigned long base_attr_id, >> + unsigned long attr_count, unsigned long *values) >> +{ >> + phys_addr_t p = virt_to_phys(values); >> + >> + return sbi_sse_write_attrs_raw(event_id, base_attr_id, attr_count, lower_32_bits(p), >> + upper_32_bits(p)); >> +} >> + >> +struct sbiret sbi_sse_register_raw(unsigned long event_id, unsigned long entry_pc, >> + unsigned long entry_arg) >> +{ >> + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_REGISTER, event_id, entry_pc, entry_arg, 0, 0, 0); >> +} >> + >> +struct sbiret sbi_sse_register(unsigned long event_id, struct sbi_sse_handler_arg *arg) >> +{ > > phys_addr_t entry = virt_to_phys(sbi_sse_entry); > phys_addr_t arg = virt_to_phys(arg); > > assert(__riscv_xlen >
32 || upper_32_bits(entry) == 0); > assert(__riscv_xlen > 32 || upper_32_bits(arg) == 0); Ahem, while looking at it, there is actually no reason to pass phys address there :|. That should be virtual pointer, I'll remove that. Thanks, Clément > > return sbi_sse_register_raw(event_id, entry, arg); > >> + return sbi_sse_register_raw(event_id, virt_to_phys(sbi_sse_entry), virt_to_phys(arg)); >> +} >> + >> +struct sbiret sbi_sse_unregister(unsigned long event_id) >> +{ >> + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_UNREGISTER, event_id, 0, 0, 0, 0, 0); >> +} >> + >> +struct sbiret sbi_sse_enable(unsigned long event_id) >> +{ >> + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_ENABLE, event_id, 0, 0, 0, 0, 0); >> +} >> + >> +struct sbiret sbi_sse_disable(unsigned long event_id) >> +{ >> + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_DISABLE, event_id, 0, 0, 0, 0, 0); >> +} >> + >> +struct sbiret sbi_sse_hart_mask(void) >> +{ >> + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_HART_MASK, 0, 0, 0, 0, 0, 0); >> +} >> + >> +struct sbiret sbi_sse_hart_unmask(void) >> +{ >> + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_HART_UNMASK, 0, 0, 0, 0, 0, 0); >> +} >> + >> +struct sbiret sbi_sse_inject(unsigned long event_id, unsigned long hart_id) >> +{ >> + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_INJECT, event_id, hart_id, 0, 0, 0, 0); >> +} >> -- >> 2.47.2 >> > > Besides the nits and the sanity checking for rv32, > > Reviewed-by: Andrew Jones From cleger at rivosinc.com Thu Mar 6 06:32:39 2025 From: cleger at rivosinc.com (Clément Léger) Date: Thu, 6 Mar 2025 15:32:39 +0100 Subject: [kvm-unit-tests PATCH v7 6/6] riscv: sbi: Add SSE extension tests In-Reply-To: <20250227-93a15f012d9bda941ef44e38@orel> References: <20250214114423.1071621-1-cleger@rivosinc.com> <20250214114423.1071621-7-cleger@rivosinc.com> <20250227-93a15f012d9bda941ef44e38@orel> Message-ID: On 28/02/2025 18:51, Andrew Jones wrote: > > Global style note: For casts we prefer no space between the (type) and > the
variable, i.e. (type)var > > On Fri, Feb 14, 2025 at 12:44:19PM +0100, Clément Léger wrote: >> Add SBI SSE extension tests for the following features: >> - Test attributes errors (invalid values, RO, etc) >> - Registration errors >> - Simple events (register, enable, inject) >> - Events with different priorities >> - Global events dispatch on different harts >> - Local events on all harts >> - Hart mask/unmask events >> >> Signed-off-by: Clément Léger >> --- >> riscv/Makefile | 1 + >> riscv/sbi-sse.c | 1054 +++++++++++++++++++++++++++++++++++++++++++++++ >> riscv/sbi.c | 2 + >> 3 files changed, 1057 insertions(+) >> create mode 100644 riscv/sbi-sse.c >> >> diff --git a/riscv/Makefile b/riscv/Makefile >> index ed590ede..ea62e05f 100644 >> --- a/riscv/Makefile >> +++ b/riscv/Makefile >> @@ -18,6 +18,7 @@ tests += $(TEST_DIR)/sieve.$(exe) >> all: $(tests) >> >> $(TEST_DIR)/sbi-deps = $(TEST_DIR)/sbi-asm.o $(TEST_DIR)/sbi-fwft.o >> +$(TEST_DIR)/sbi-deps += $(TEST_DIR)/sbi-sse.o >> >> # When built for EFI sieve needs extra memory, run with e.g. '-m 256' on QEMU >> $(TEST_DIR)/sieve.$(exe): AUXFLAGS = 0x1 >> diff --git a/riscv/sbi-sse.c b/riscv/sbi-sse.c >> new file mode 100644 >> index 00000000..27f47f73 >> --- /dev/null >> +++ b/riscv/sbi-sse.c >> @@ -0,0 +1,1054 @@ >> +// SPDX-License-Identifier: GPL-2.0-only >> +/* >> + * SBI SSE testsuite >> + * >> + * Copyright (C) 2025, Rivos Inc., Clément Léger >> + */ >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> + >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> + >> +#include "sbi-tests.h" >> + >> +#define SSE_STACK_SIZE PAGE_SIZE >> + >> +void check_sse(void); > > Since we now have an #ifndef __ASSEMBLY__ section in riscv/sbi-tests.h we > can just put this prototype there.
> >> + >> +struct sse_event_info { >> + uint32_t event_id; >> + const char *name; >> + bool can_inject; >> +}; >> + >> +static struct sse_event_info sse_event_infos[] = { >> + { >> + .event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, >> + .name = "local_high_prio_ras", >> + }, >> + { >> + .event_id = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP, >> + .name = "double_trap", >> + }, >> + { >> + .event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, >> + .name = "global_high_prio_ras", >> + }, >> + { >> + .event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, >> + .name = "local_pmu_overflow", >> + }, >> + { >> + .event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, >> + .name = "local_low_prio_ras", >> + }, >> + { >> + .event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, >> + .name = "global_low_prio_ras", >> + }, >> + { >> + .event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, >> + .name = "local_software", >> + }, >> + { >> + .event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, >> + .name = "global_software", >> + }, >> +}; >> + >> +static const char *const attr_names[] = { >> + [SBI_SSE_ATTR_STATUS] = "status", >> + [SBI_SSE_ATTR_PRIORITY] = "priority", >> + [SBI_SSE_ATTR_CONFIG] = "config", >> + [SBI_SSE_ATTR_PREFERRED_HART] = "preferred_hart", >> + [SBI_SSE_ATTR_ENTRY_PC] = "entry_pc", >> + [SBI_SSE_ATTR_ENTRY_ARG] = "entry_arg", >> + [SBI_SSE_ATTR_INTERRUPTED_SEPC] = "interrupted_sepc", >> + [SBI_SSE_ATTR_INTERRUPTED_FLAGS] = "interrupted_flags", >> + [SBI_SSE_ATTR_INTERRUPTED_A6] = "interrupted_a6", >> + [SBI_SSE_ATTR_INTERRUPTED_A7] = "interrupted_a7", >> +}; >> + >> +static const unsigned long ro_attrs[] = { >> + SBI_SSE_ATTR_STATUS, >> + SBI_SSE_ATTR_ENTRY_PC, >> + SBI_SSE_ATTR_ENTRY_ARG, >> +}; >> + >> +static const unsigned long interrupted_attrs[] = { >> + SBI_SSE_ATTR_INTERRUPTED_SEPC, >> + SBI_SSE_ATTR_INTERRUPTED_FLAGS, >> + SBI_SSE_ATTR_INTERRUPTED_A6, >> + SBI_SSE_ATTR_INTERRUPTED_A7, >> +}; >> + >> +static const unsigned long interrupted_flags[] = { >> + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPP, >> + 
SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPIE, >> + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPELP, >> + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT, >> + SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPV, >> + SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPVP, >> +}; >> + >> +static struct sse_event_info *sse_event_get_info(uint32_t event_id) >> +{ >> + int i; >> + >> + for (i = 0; i < ARRAY_SIZE(sse_event_infos); i++) { >> + if (sse_event_infos[i].event_id == event_id) >> + return &sse_event_infos[i]; >> + } >> + >> + assert_msg(false, "Invalid event id: %d", event_id); >> +} >> + >> +static const char *sse_event_name(uint32_t event_id) >> +{ >> + return sse_event_get_info(event_id)->name; >> +} >> + >> +static bool sse_event_can_inject(uint32_t event_id) >> +{ >> + return sse_event_get_info(event_id)->can_inject; >> +} >> + >> +static struct sbiret sse_event_get_state(uint32_t event_id, enum sbi_sse_state *state) >> +{ >> + struct sbiret ret; >> + unsigned long status; >> + >> + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, 1, &status); >> + if (ret.error) { >> + sbiret_report_error(&ret, 0, "Get SSE event status no error"); >> + return ret; >> + } >> + >> + *state = status & SBI_SSE_ATTR_STATUS_STATE_MASK; >> + >> + return ret; >> +} >> + >> +static void sse_global_event_set_current_hart(uint32_t event_id) >> +{ >> + struct sbiret ret; >> + unsigned long current_hart = current_thread_info()->hartid; >> + >> + if (!sbi_sse_event_is_global(event_id)) >> + return; >> + >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &current_hart); >> + if (ret.error) >> + report_abort("set preferred hart failure, error %ld", ret.error); > > Are we sure we want to abort? Or should we try to propagate an error from > here and just bail out of SSE tests?
> >> +} >> + >> +static bool sse_check_state(uint32_t event_id, unsigned long expected_state) >> +{ >> + struct sbiret ret; >> + enum sbi_sse_state state; >> + >> + ret = sse_event_get_state(event_id, &state); >> + if (ret.error) >> + return false; >> + report(state == expected_state, "SSE event status == %ld", expected_state); >> + >> + return state == expected_state; > > Can just write > > return report(state == expected_state, "SSE event status == %ld", expected_state); > >> +} >> + >> +static bool sse_event_pending(uint32_t event_id) >> +{ >> + struct sbiret ret; >> + unsigned long status; >> + >> + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, 1, &status); >> + if (ret.error) { >> + report_fail("Failed to get SSE event status, error %ld", ret.error); > > sbiret_report_error() > >> + return false; >> + } >> + >> + return !!(status & BIT(SBI_SSE_ATTR_STATUS_PENDING_OFFSET)); >> +} >> + >> +static void *sse_alloc_stack(void) >> +{ >> + /* >> + * We assume that SSE_STACK_SIZE always fit in one page. This page will >> + * always be decrement before storing anything on it in sse-entry.S. 
> > decremented > >> + */ >> + assert(SSE_STACK_SIZE <= PAGE_SIZE); >> + >> + return (alloc_page() + SSE_STACK_SIZE); >> +} >> + >> +static void sse_free_stack(void *stack) >> +{ >> + free_page(stack - SSE_STACK_SIZE); >> +} >> + >> +static void sse_read_write_test(uint32_t event_id, unsigned long attr, unsigned long attr_count, >> + unsigned long *value, long expected_error, const char *str) >> +{ >> + struct sbiret ret; >> + >> + ret = sbi_sse_read_attrs(event_id, attr, attr_count, value); >> + sbiret_report_error(&ret, expected_error, "Read %s error", str); >> + >> + ret = sbi_sse_write_attrs(event_id, attr, attr_count, value); >> + sbiret_report_error(&ret, expected_error, "Write %s error", str); >> +} >> + >> +#define ALL_ATTRS_COUNT (SBI_SSE_ATTR_INTERRUPTED_A7 + 1) >> + >> +static void sse_test_attrs(uint32_t event_id) >> +{ >> + unsigned long value = 0; >> + struct sbiret ret; >> + void *ptr; >> + unsigned long values[ALL_ATTRS_COUNT]; >> + unsigned int i; >> + const char *max_hart_str; >> + const char *attr_name; >> + >> + report_prefix_push("attrs"); >> + >> + for (i = 0; i < ARRAY_SIZE(ro_attrs); i++) { >> + ret = sbi_sse_write_attrs(event_id, ro_attrs[i], 1, &value); >> + sbiret_report_error(&ret, SBI_ERR_BAD_RANGE, "RO attribute %s not writable", >> + attr_names[ro_attrs[i]]); >> + } >> + >> + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, ALL_ATTRS_COUNT, values); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Read multiple attributes no error"); >> + >> + for (i = SBI_SSE_ATTR_STATUS; i <= SBI_SSE_ATTR_INTERRUPTED_A7; i++) { >> + ret = sbi_sse_read_attrs(event_id, i, 1, &value); >> + attr_name = attr_names[i]; >> + >> + sbiret_report_error(&ret, SBI_SUCCESS, "Read single attribute %s", attr_name); >> + if (values[i] != value) >> + report_fail("Attribute 0x%x single value read (0x%lx) differs from the one read with multiple attributes (0x%lx)", >> + i, value, values[i]); >> + /* >> + * Preferred hart reset value is defined by SBI vendor and >> 
+ * status injectable bit also depends on the SBI implementation > > The spec says the STATUS attribute reset value is zero. In any case, if > certain bits can be tested then we should test them, so we can mask off > the injectable bit and still check for zero. Ack. > >> + */ >> + if (i != SBI_SSE_ATTR_STATUS && i != SBI_SSE_ATTR_PREFERRED_HART) >> + report(value == 0, "Attribute %s reset value is 0", attr_name); >> + } >> + >> +#if __riscv_xlen > 32 >> + value = BIT(32); >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Write invalid prio > 0xFFFFFFFF error"); >> +#endif >> + >> + value = ~SBI_SSE_ATTR_CONFIG_ONESHOT; >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Write invalid config value error"); >> + >> + if (sbi_sse_event_is_global(event_id)) { >> + max_hart_str = getenv("MAX_HART_ID"); > > This should be named "INVALID_HARTID" > >> + if (!max_hart_str) >> + value = 0xFFFFFFFFUL; >> + else >> + value = strtoul(max_hart_str, NULL, 0); >> + >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid hart id error"); > > The spec doesn't say you can get invalid-param for a bad hartid. Indeed. > >> + } else { >> + /* Set Hart on local event -> RO */ >> + value = current_thread_info()->hartid; >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); >> + sbiret_report_error(&ret, SBI_ERR_BAD_RANGE, "Set hart id on local event error"); > > The spec doesn't say you can get bad-range for a valid hartid when set > locally. Ditto. 
> >> + } >> + >> + /* Set/get flags, sepc, a6, a7 */ >> + for (i = 0; i < ARRAY_SIZE(interrupted_attrs); i++) { >> + attr_name = attr_names[interrupted_attrs[i]]; >> + ret = sbi_sse_read_attrs(event_id, interrupted_attrs[i], 1, &value); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Get interrupted %s no error", attr_name); >> + >> + value = ARRAY_SIZE(interrupted_attrs) - i; > > I don't understand how this creates an invalid value for all interrupted attrs? > >> + ret = sbi_sse_write_attrs(event_id, interrupted_attrs[i], 1, &value); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, > > The spec doesn't state SBI_ERR_INVALID_STATE will ever be returned for > sbi_sse_write_attrs() Yeah, indeed, the attributes are specified as being writable only when in RUNNING state. I inferred that as well, i'll probably send a clarification to the spec for that. > >> + "Set attribute %s invalid state error", attr_name); >> + } >> + >> + sse_read_write_test(event_id, SBI_SSE_ATTR_STATUS, 0, &value, SBI_ERR_INVALID_PARAM, >> + "attribute attr_count == 0"); >> + sse_read_write_test(event_id, SBI_SSE_ATTR_INTERRUPTED_A7 + 1, 1, &value, SBI_ERR_BAD_RANGE, >> + "invalid attribute"); >> + >> +#if __riscv_xlen > 32 >> + sse_read_write_test(event_id, BIT(32), 1, &value, SBI_ERR_INVALID_PARAM, >> + "attribute id > 32 bits"); >> + sse_read_write_test(event_id, SBI_SSE_ATTR_STATUS, BIT(32), &value, SBI_ERR_INVALID_PARAM, >> + "attribute count > 32 bits"); > > I think you plan to change these to expect them to behave like > base_attr_id=1 and attr_count=1. 
> >> +#endif >> + >> + /* Misaligned pointer address */ >> + ptr = (void *) &value; >> + ptr += 1; >> + sse_read_write_test(event_id, SBI_SSE_ATTR_STATUS, 1, ptr, SBI_ERR_INVALID_ADDRESS, >> + "attribute with invalid address"); >> + >> + report_prefix_pop(); >> +} >> + >> +static void sse_test_register_error(uint32_t event_id) >> +{ >> + struct sbiret ret; >> + >> + report_prefix_push("register"); >> + >> + ret = sbi_sse_unregister(event_id); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "SSE unregister non registered event"); > > non-registered > >> + >> + ret = sbi_sse_register_raw(event_id, 0x1, 0); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "SSE register misaligned entry"); >> + >> + ret = sbi_sse_register_raw(event_id, virt_to_phys(sbi_sse_entry), 0); >> + sbiret_report_error(&ret, SBI_SUCCESS, "SSE register ok"); >> + if (ret.error) >> + goto done; >> + >> + ret = sbi_sse_register_raw(event_id, virt_to_phys(sbi_sse_entry), 0); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "SSE register twice failure"); > > "SSE register used event" > >> + >> + ret = sbi_sse_unregister(event_id); >> + sbiret_report_error(&ret, SBI_SUCCESS, "SSE unregister ok"); >> + >> +done: >> + report_prefix_pop(); >> +} >> + >> +struct sse_simple_test_arg { >> + bool done; >> + unsigned long expected_a6; >> + uint32_t event_id; >> +}; >> + >> +static void sse_simple_handler(void *data, struct pt_regs *regs, unsigned int hartid) >> +{ >> + struct sse_simple_test_arg *arg = data; >> + int i; >> + struct sbiret ret; >> + const char *attr_name; >> + uint32_t event_id = READ_ONCE(arg->event_id), attr; >> + unsigned long value, prev_value, flags; >> + unsigned long interrupted_state[ARRAY_SIZE(interrupted_attrs)]; >> + unsigned long modified_state[ARRAY_SIZE(interrupted_attrs)] = {4, 3, 2, 1}; >> + unsigned long tmp_state[ARRAY_SIZE(interrupted_attrs)]; >> + >> + report((regs->status & SR_SPP) == SR_SPP, "Interrupted S-mode"); >> + report(hartid == 
current_thread_info()->hartid, "Hartid correctly passed"); >> + sse_check_state(event_id, SBI_SSE_STATE_RUNNING); >> + report(!sse_event_pending(event_id), "Event not pending"); >> + >> + /* Read full interrupted state */ >> + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, >> + ARRAY_SIZE(interrupted_attrs), interrupted_state); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Save full interrupted state from SSE handler ok"); >> + >> + /* Write full modified state and read it */ >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, >> + ARRAY_SIZE(modified_state), modified_state); >> + sbiret_report_error(&ret, SBI_SUCCESS, >> + "Write full interrupted state from SSE handler ok"); >> + >> + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, >> + ARRAY_SIZE(tmp_state), tmp_state); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Read full modified state from SSE handler ok"); >> + >> + report(memcmp(tmp_state, modified_state, sizeof(modified_state)) == 0, >> + "Full interrupted state successfully written"); > > "Full... 
should line up under the report(, not memcmp( > >> + >> + /* Restore full saved state */ >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, >> + ARRAY_SIZE(interrupted_attrs), interrupted_state); >> + sbiret_report_error(&ret, SBI_SUCCESS, >> + "Full interrupted state restore from SSE handler ok"); >> + >> + /* We test SBI_SSE_ATTR_INTERRUPTED_FLAGS below with specific flag values */ >> + for (i = 0; i < ARRAY_SIZE(interrupted_attrs); i++) { >> + attr = interrupted_attrs[i]; >> + if (attr == SBI_SSE_ATTR_INTERRUPTED_FLAGS) >> + continue; >> + >> + attr_name = attr_names[attr]; >> + >> + ret = sbi_sse_read_attrs(event_id, attr, 1, &prev_value); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Get attr %s no error", attr_name); >> + >> + value = 0xDEADBEEF + i; >> + ret = sbi_sse_write_attrs(event_id, attr, 1, &value); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Set attr %s no error", attr_name); >> + >> + ret = sbi_sse_read_attrs(event_id, attr, 1, &value); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Get attr %s no error", attr_name); >> + report(value == 0xDEADBEEF + i, "Get attr %s no error, value: 0x%lx", attr_name, >> + value); >> + >> + ret = sbi_sse_write_attrs(event_id, attr, 1, &prev_value); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Restore attr %s value no error", attr_name); >> + } >> + >> + /* Test all flags allowed for SBI_SSE_ATTR_INTERRUPTED_FLAGS*/ > ^ > missing > space > >> + attr = SBI_SSE_ATTR_INTERRUPTED_FLAGS; >> + ret = sbi_sse_read_attrs(event_id, attr, 1, &prev_value); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Save interrupted flags no error"); >> + >> + for (i = 0; i < ARRAY_SIZE(interrupted_flags); i++) { >> + flags = interrupted_flags[i]; >> + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); >> + sbiret_report_error(&ret, SBI_SUCCESS, >> + "Set interrupted flags bit 0x%lx value no error", flags); >> + ret = sbi_sse_read_attrs(event_id, attr, 1, &value); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Get 
interrupted flags after set no error"); >> + report(value == flags, "interrupted flags modified value ok: 0x%lx", value); > > Do we also need to test with more than one flag set at a time? That is already done a few lines above (see /* Restore full saved state */). > >> + } >> + >> + /* Write invalid bit in flag register */ >> + flags = SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT << 1; >> + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", >> + flags); >> +#if __riscv_xlen > 32 >> + flags = BIT(32); >> + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", > > This should have a different report string than the test above. The bit value format does differentiate the printf though. > >> + flags); >> +#endif >> + >> + ret = sbi_sse_write_attrs(event_id, attr, 1, &prev_value); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Restore interrupted flags no error"); >> + >> + /* Try to change HARTID/Priority while running */ >> + if (sbi_sse_event_is_global(event_id)) { >> + value = current_thread_info()->hartid; >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "Set hart id while running error"); >> + } > > SBI_ERR_INVALID_STATE is not listed as a possible error for write_attrs. Will update the spec. > >> + >> + value = 0; >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "Set priority while running error"); >> + >> + report(interrupted_state[2] == READ_ONCE(arg->expected_a6), >> + "Interrupted state a6 ok, expected 0x%lx, got 0x%lx", >> + READ_ONCE(arg->expected_a6), interrupted_state[2]); > > Could just READ_ONCE expected_a6 once. Ack. 
> >> + >> + report(interrupted_state[3] == SBI_EXT_SSE, >> + "Interrupted state a7 ok, expected 0x%x, got 0x%lx", SBI_EXT_SSE, >> + interrupted_state[3]); >> + >> + WRITE_ONCE(arg->done, true); >> +} >> + >> +static void sse_test_inject_simple(uint32_t event_id) >> +{ >> + unsigned long value; >> + struct sbiret ret; >> + struct sse_simple_test_arg test_arg = {.event_id = event_id}; >> + struct sbi_sse_handler_arg args = { >> + .handler = sse_simple_handler, >> + .handler_data = (void *) &test_arg, >> + .stack = sse_alloc_stack(), >> + }; >> + >> + report_prefix_push("simple"); >> + >> + if (!sse_check_state(event_id, SBI_SSE_STATE_UNUSED)) >> + goto done; >> + >> + ret = sbi_sse_register(event_id, &args); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "SSE register no error")) >> + goto done; >> + >> + if (!sse_check_state(event_id, SBI_SSE_STATE_REGISTERED)) >> + goto done; >> + >> + /* Be sure global events are targeting the current hart */ >> + sse_global_event_set_current_hart(event_id); >> + >> + ret = sbi_sse_enable(event_id); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "SSE enable no error")) >> + goto done; >> + >> + if (!sse_check_state(event_id, SBI_SSE_STATE_ENABLED)) >> + goto done; >> + >> + ret = sbi_sse_hart_mask(); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "SSE hart mask no error")) >> + goto done; >> + >> + ret = sbi_sse_inject(event_id, current_thread_info()->hartid); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "SSE injection masked no error")) >> + goto done; >> + >> + barrier(); >> + report(test_arg.done == 0, "SSE event masked not handled"); > > Could instead drop the barrier() calls and use READ/WRITE_ONCE with all > test_arg member accesses. 
> >> + >> + /* >> + * When unmasking the SSE events, we expect it to be injected >> + * immediately so a6 should be SBI_EXT_SBI_SSE_HART_UNMASK >> + */ >> + test_arg.expected_a6 = SBI_EXT_SSE_HART_UNMASK; >> + ret = sbi_sse_hart_unmask(); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "SSE hart unmask no error")) >> + goto done; >> + >> + barrier(); >> + report(test_arg.done == 1, "SSE event unmasked handled"); >> + test_arg.done = 0; >> + test_arg.expected_a6 = SBI_EXT_SSE_INJECT; >> + >> + /* Set as oneshot and verify it is disabled */ >> + ret = sbi_sse_disable(event_id); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Disable event ok"); >> + value = SBI_SSE_ATTR_CONFIG_ONESHOT; >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Set event attribute as ONESHOT"); >> + ret = sbi_sse_enable(event_id); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Enable event ok"); >> + >> + ret = sbi_sse_inject(event_id, current_thread_info()->hartid); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "SSE second injection no error")) >> + goto done; >> + >> + barrier(); >> + report(test_arg.done == 1, "SSE event handled ok"); >> + test_arg.done = 0; >> + >> + if (!sse_check_state(event_id, SBI_SSE_STATE_REGISTERED)) >> + goto done; >> + >> + /* Clear ONESHOT FLAG */ >> + value = 0; >> + sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); > > Why clear the oneshot flag before unregistering (and not check if the attr > write was successful)? I previously add some loop over the test but the oneshot messed up the next test. Anyway, I think it's better to restore the event to it's previous state. I'll check for the return value though. 
> >> + >> + ret = sbi_sse_unregister(event_id); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "SSE unregister no error")) >> + goto done; >> + >> + sse_check_state(event_id, SBI_SSE_STATE_UNUSED); >> + >> +done: > > Is it ok to leave this function with an event registered/enabled? If not, > then some of the goto's above should goto other labels which disable and > unregister. No it's not but it's massive pain to keep everything coherent when it fails ;) > >> + sse_free_stack(args.stack); >> + report_prefix_pop(); >> +} >> + >> +struct sse_foreign_cpu_test_arg { >> + bool done; >> + unsigned int expected_cpu; >> + uint32_t event_id; >> +}; >> + >> +static void sse_foreign_cpu_handler(void *data, struct pt_regs *regs, unsigned int hartid) >> +{ >> + struct sse_foreign_cpu_test_arg *arg = data; >> + unsigned int expected_cpu; >> + >> + /* For arg content to be visible */ >> + smp_rmb(); >> + expected_cpu = READ_ONCE(arg->expected_cpu); >> + report(expected_cpu == current_thread_info()->cpu, >> + "Received event on CPU (%d), expected CPU (%d)", current_thread_info()->cpu, >> + expected_cpu); >> + >> + WRITE_ONCE(arg->done, true); >> + /* For arg update to be visible for other CPUs */ >> + smp_wmb(); >> +} >> + >> +struct sse_local_per_cpu { >> + struct sbi_sse_handler_arg args; >> + struct sbiret ret; >> + struct sse_foreign_cpu_test_arg handler_arg; >> +}; >> + >> +static void sse_register_enable_local(void *data) >> +{ >> + struct sbiret ret; >> + struct sse_local_per_cpu *cpu_args = data; >> + struct sse_local_per_cpu *cpu_arg = &cpu_args[current_thread_info()->cpu]; >> + uint32_t event_id = cpu_arg->handler_arg.event_id; >> + >> + ret = sbi_sse_register(event_id, &cpu_arg->args); >> + WRITE_ONCE(cpu_arg->ret, ret); >> + if (ret.error) >> + return; >> + >> + ret = sbi_sse_enable(event_id); >> + WRITE_ONCE(cpu_arg->ret, ret); >> +} >> + >> +static void sbi_sse_disable_unregister_local(void *data) >> +{ >> + struct sbiret ret; >> + struct sse_local_per_cpu 
*cpu_args = data; >> + struct sse_local_per_cpu *cpu_arg = &cpu_args[current_thread_info()->cpu]; >> + uint32_t event_id = cpu_arg->handler_arg.event_id; >> + >> + ret = sbi_sse_disable(event_id); >> + WRITE_ONCE(cpu_arg->ret, ret); >> + if (ret.error) >> + return; >> + >> + ret = sbi_sse_unregister(event_id); >> + WRITE_ONCE(cpu_arg->ret, ret); >> +} >> + >> +static void sse_test_inject_local(uint32_t event_id) >> +{ >> + int cpu; >> + struct sbiret ret; >> + struct sse_local_per_cpu *cpu_args, *cpu_arg; >> + struct sse_foreign_cpu_test_arg *handler_arg; >> + >> + cpu_args = calloc(NR_CPUS, sizeof(struct sbi_sse_handler_arg)); >> + >> + report_prefix_push("local_dispatch"); >> + for_each_online_cpu(cpu) { >> + cpu_arg = &cpu_args[cpu]; >> + cpu_arg->handler_arg.event_id = event_id; >> + cpu_arg->args.stack = sse_alloc_stack(); >> + cpu_arg->args.handler = sse_foreign_cpu_handler; >> + cpu_arg->args.handler_data = (void *)&cpu_arg->handler_arg; >> + } >> + >> + on_cpus(sse_register_enable_local, cpu_args); >> + for_each_online_cpu(cpu) { >> + cpu_arg = &cpu_args[cpu]; >> + ret = cpu_arg->ret; >> + if (ret.error) >> + report_abort("CPU failed to register/enable SSE event: %ld", >> + ret.error); > > Do we need this to be an abort? Probably not but since events used some state machine, we probably messed up the state. Same comment than before applies, it's a pain to restore events to their previous state properly. I figured out that since this is a test, it might not be *that* important to keep it totally sane in case something goes wrong. i'll try some best effort to handle that though. > >> + >> + handler_arg = &cpu_arg->handler_arg; >> + WRITE_ONCE(handler_arg->expected_cpu, cpu); >> + /* For handler_arg content to be visible for other CPUs */ >> + smp_wmb(); >> + ret = sbi_sse_inject(event_id, cpus[cpu].hartid); >> + if (ret.error) >> + report_abort("CPU failed to register/enable SSE event: %ld", >> + ret.error); > > abort? 
> >> + } >> + >> + for_each_online_cpu(cpu) { >> + handler_arg = &cpu_args[cpu].handler_arg; > > Need smp_rmb() here in case we read done=1 on first try in order > to be sure that all other data ordered wrt the done write can > be observed. > >> + while (!READ_ONCE(handler_arg->done)) { > > Maybe a cpu_relax() here. > >> + /* For handler_arg update to be visible */ >> + smp_rmb(); >> + } >> + WRITE_ONCE(handler_arg->done, false); >> + } >> + >> + on_cpus(sbi_sse_disable_unregister_local, cpu_args); >> + for_each_online_cpu(cpu) { >> + cpu_arg = &cpu_args[cpu]; >> + ret = READ_ONCE(cpu_arg->ret); >> + if (ret.error) >> + report_abort("CPU failed to disable/unregister SSE event: %ld", >> + ret.error); > > abort? > >> + } >> + >> + for_each_online_cpu(cpu) { >> + cpu_arg = &cpu_args[cpu]; >> + > > delete unnecessary blank line > >> + sse_free_stack(cpu_arg->args.stack); >> + } >> + >> + report_pass("local event dispatch on all CPUs"); >> + report_prefix_pop(); >> + >> +} >> + >> +static void sse_test_inject_global(uint32_t event_id) >> +{ >> + unsigned long value; >> + struct sbiret ret; >> + unsigned int cpu; >> + struct sse_foreign_cpu_test_arg test_arg = {.event_id = event_id}; >> + struct sbi_sse_handler_arg args = { >> + .handler = sse_foreign_cpu_handler, >> + .handler_data = (void *) &test_arg, >> + .stack = sse_alloc_stack(), >> + }; >> + enum sbi_sse_state state; >> + >> + report_prefix_push("global_dispatch"); >> + >> + ret = sbi_sse_register(event_id, &args); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Register event no error")) >> + goto err_free_stack; >> + >> + for_each_online_cpu(cpu) { >> + WRITE_ONCE(test_arg.expected_cpu, cpu); >> + /* For test_arg content to be visible for other CPUs */ >> + smp_wmb(); >> + value = cpu; >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Set preferred hart no error")) >> + goto err_unregister; >> + >> + ret = 
sbi_sse_enable(event_id); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Enable SSE event no error")) >> + goto err_unregister; >> + >> + ret = sbi_sse_inject(event_id, cpu); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Inject SSE event no error")) >> + goto err_disable; >> + > > smp_rmb() > >> + while (!READ_ONCE(test_arg.done)) { > > cpu_relax() > >> + /* For shared test_arg structure */ >> + smp_rmb(); >> + } >> + >> + WRITE_ONCE(test_arg.done, false); >> + >> + /* Wait for event to be in ENABLED state */ >> + do { >> + ret = sse_event_get_state(event_id, &state); >> + if (sbiret_report_error(&ret, SBI_SUCCESS, "Get SSE event state no error")) >> + goto err_disable; > > cpu_relax() or even use udelay()? > >> + } while (state != SBI_SSE_STATE_ENABLED); >> + >> +err_disable: >> + ret = sbi_sse_disable(event_id); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Disable SSE event no error")) >> + goto err_unregister; >> + >> + report_pass("Global event on CPU %d", cpu); >> + } >> + >> +err_unregister: >> + ret = sbi_sse_unregister(event_id); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Unregister SSE event no error"); >> + >> +err_free_stack: >> + sse_free_stack(args.stack); >> + report_prefix_pop(); >> +} >> + >> +struct priority_test_arg { >> + uint32_t event_id; >> + bool called; >> + u32 prio; >> + struct priority_test_arg *next_event_arg; >> + void (*check_func)(struct priority_test_arg *arg); >> +}; >> + >> +static void sse_hi_priority_test_handler(void *arg, struct pt_regs *regs, >> + unsigned int hartid) >> +{ >> + struct priority_test_arg *targ = arg; >> + struct priority_test_arg *next = targ->next_event_arg; >> + >> + targ->called = true; >> + if (next) { >> + sbi_sse_inject(next->event_id, current_thread_info()->hartid); >> + >> + report(!sse_event_pending(next->event_id), "Higher priority event is pending"); > > "is not pending" > >> + report(next->called, "Higher priority event was not handled"); > > "was handled" > >> + } >> +} >> + >> 
+static void sse_low_priority_test_handler(void *arg, struct pt_regs *regs, >> + unsigned int hartid) >> +{ >> + struct priority_test_arg *targ = arg; >> + struct priority_test_arg *next = targ->next_event_arg; >> + >> + targ->called = true; >> + >> + if (next) { >> + sbi_sse_inject(next->event_id, current_thread_info()->hartid); >> + >> + report(sse_event_pending(next->event_id), "Lower priority event is pending"); >> + report(!next->called, "Lower priority event %s was handle before %s", > > "was not handled" > >> + sse_event_name(next->event_id), sse_event_name(targ->event_id)); >> + } >> +} >> + >> +static void sse_test_injection_priority_arg(struct priority_test_arg *in_args, >> + unsigned int in_args_size, >> + sbi_sse_handler_fn handler, >> + const char *test_name) >> +{ >> + unsigned int i; >> + unsigned long value; >> + struct sbiret ret; >> + uint32_t event_id; >> + struct priority_test_arg *arg; >> + unsigned int args_size = 0; >> + struct sbi_sse_handler_arg event_args[in_args_size]; >> + struct priority_test_arg *args[in_args_size]; >> + void *stack; >> + struct sbi_sse_handler_arg *event_arg; >> + >> + report_prefix_push(test_name); >> + >> + for (i = 0; i < in_args_size; i++) { >> + arg = &in_args[i]; >> + event_id = arg->event_id; >> + if (!sse_event_can_inject(event_id)) >> + continue; >> + >> + args[args_size] = arg; >> + args_size++; >> + } >> + >> + if (!args_size) { >> + report_skip("No event injectable"); >> + report_prefix_pop(); >> + goto skip; >> + } >> + >> + for (i = 0; i < args_size; i++) { >> + arg = args[i]; >> + event_id = arg->event_id; >> + stack = sse_alloc_stack(); >> + >> + event_arg = &event_args[i]; >> + event_arg->handler = handler; >> + event_arg->handler_data = (void *)arg; >> + event_arg->stack = stack; >> + >> + if (i < (args_size - 1)) >> + arg->next_event_arg = args[i + 1]; >> + else >> + arg->next_event_arg = NULL; >> + >> + /* Be sure global events are targeting the current hart */ >> + 
sse_global_event_set_current_hart(event_id); >> + >> + sbi_sse_register(event_id, event_arg); >> + value = arg->prio; >> + sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); >> + sbi_sse_enable(event_id); > > No return code checks for these SSE calls? If we're 99% sure they should > succeed, then I'd still check them with asserts. I was a bit lazy here. Since the goal is *not* to check the event states themselves but rather the ordering, I didn't bother checking them. As said before, handling errors and event state properly in case of error seemed like churn to me *just* for testing. I'll try something better as well though. > >> + } >> + >> + /* Inject first event */ >> + ret = sbi_sse_inject(args[0]->event_id, current_thread_info()->hartid); >> + sbiret_report_error(&ret, SBI_SUCCESS, "SSE injection no error"); >> + >> + for (i = 0; i < args_size; i++) { >> + arg = args[i]; >> + event_id = arg->event_id; >> + >> + report(arg->called, "Event %s handler called", sse_event_name(arg->event_id)); >> + >> + sbi_sse_disable(event_id); >> + sbi_sse_unregister(event_id); > > No return code checks? > >> + >> + event_arg = &event_args[i]; >> + sse_free_stack(event_arg->stack); >> + } >> + >> +skip: >> + report_prefix_pop(); > > Isn't this an extra pop()? We should be able to return directly from the > report_skip() if-block. Indeed.
> >> +} >> + >> +static struct priority_test_arg hi_prio_args[] = { >> + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE}, >> + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW}, >> + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS}, >> +}; >> + >> +static struct priority_test_arg low_prio_args[] = { >> + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP}, >> + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS}, >> + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE}, >> + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE}, >> +}; >> + >> +static struct priority_test_arg prio_args[] = { >> + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, .prio = 5}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, .prio = 10}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, .prio = 12}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, .prio = 15}, >> + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, .prio = 20}, >> + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, .prio = 22}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, .prio = 25}, >> +}; >> + >> +static struct priority_test_arg same_prio_args[] = { >> + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, .prio = 0}, >> + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, .prio = 0}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, .prio = 10}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, .prio = 10}, >> + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, .prio = 10}, >> + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, .prio = 20}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, 
.prio = 20}, >> +}; > > nit: could tab out the .prio's in order to better tabulate the above two > structs. > >> + >> +static void sse_test_injection_priority(void) >> +{ >> + report_prefix_push("prio"); >> + >> + sse_test_injection_priority_arg(hi_prio_args, ARRAY_SIZE(hi_prio_args), >> + sse_hi_priority_test_handler, "high"); >> + >> + sse_test_injection_priority_arg(low_prio_args, ARRAY_SIZE(low_prio_args), >> + sse_low_priority_test_handler, "low"); >> + >> + sse_test_injection_priority_arg(prio_args, ARRAY_SIZE(prio_args), >> + sse_low_priority_test_handler, "changed"); >> + >> + sse_test_injection_priority_arg(same_prio_args, ARRAY_SIZE(same_prio_args), >> + sse_low_priority_test_handler, "same_prio_args"); >> + >> + report_prefix_pop(); >> +} >> + >> +static void test_invalid_event_id(unsigned long event_id) >> +{ >> + struct sbiret ret; >> + unsigned long value = 0; >> + >> + ret = sbi_sse_register_raw(event_id, (unsigned long) sbi_sse_entry, 0); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, >> + "SSE register event_id 0x%lx invalid param", event_id); >> + >> + ret = sbi_sse_unregister(event_id); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, >> + "SSE unregister event_id 0x%lx invalid param", event_id); >> + >> + ret = sbi_sse_enable(event_id); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, >> + "SSE enable event_id 0x%lx invalid param", event_id); >> + >> + ret = sbi_sse_disable(event_id); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, >> + "SSE disable event_id 0x%lx invalid param", event_id); >> + >> + ret = sbi_sse_inject(event_id, 0); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, >> + "SSE inject event_id 0x%lx invalid param", event_id); >> + >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, >> + "SSE write attr event_id 0x%lx invalid param", event_id); >> + >> + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); >> + 
sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, >> + "SSE read attr event_id 0x%lx invalid param", event_id); > > We can s/invalid param/ from all the strings above since we output > SBI_ERR_INVALID_PARAM with the latest sbiret_report_error() Yeah, I'll make some cleanup in other strings as well. > >> +} >> + >> +static void sse_test_invalid_event_id(void) >> +{ >> + >> + report_prefix_push("event_id"); >> + >> +#if __riscv_xlen > 32 >> + test_invalid_event_id(BIT_ULL(32)); >> +#endif > > With your latest opensbi changes we'll truncate eventid, so we need > another invalid eventid for the tests. That's good since we can then test > for rv32 too. Indeed. > >> + >> + report_prefix_pop(); >> +} >> + >> +static bool sse_can_inject(uint32_t event_id) >> +{ >> + struct sbiret ret; >> + unsigned long status; >> + >> + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, 1, &status); >> + if (ret.error) > > Don't we want to assert or complain loudly about an unexpected status read > failure? I can add some warning yeah > >> + return false; >> + >> + return !!(status & BIT(SBI_SSE_ATTR_STATUS_INJECT_OFFSET)); >> +} >> + >> +static void boot_secondary(void *data) > > Maybe name this sse_secondary_boot_and_unmask? 
> >> +{ >> + sbi_sse_hart_unmask(); >> +} >> + >> +static void sse_check_mask(void) >> +{ >> + struct sbiret ret; >> + >> + /* Upon boot, event are masked, check that */ >> + ret = sbi_sse_hart_mask(); >> + sbiret_report_error(&ret, SBI_ERR_ALREADY_STOPPED, "SSE hart mask at boot time ok"); >> + >> + ret = sbi_sse_hart_unmask(); >> + sbiret_report_error(&ret, SBI_SUCCESS, "SSE hart no error ok"); > ^ unmask > >> + ret = sbi_sse_hart_unmask(); >> + sbiret_report_error(&ret, SBI_ERR_ALREADY_STARTED, "SSE hart unmask twice error ok"); >> + >> + ret = sbi_sse_hart_mask(); >> + sbiret_report_error(&ret, SBI_SUCCESS, "SSE hart mask no error"); >> + ret = sbi_sse_hart_mask(); >> + sbiret_report_error(&ret, SBI_ERR_ALREADY_STOPPED, "SSE hart mask twice ok"); >> +} >> + >> +void check_sse(void) >> +{ >> + unsigned long i, event_id; >> + >> + report_prefix_push("sse"); >> + >> + if (!sbi_probe(SBI_EXT_SSE)) { >> + report_skip("SSE extension not available"); >> + report_prefix_pop(); >> + return; >> + } >> + >> + sse_check_mask(); >> + >> + /* >> + * Dummy wakeup of all processors since some of them will be targeted >> + * by global events without going through the wakeup call as well as >> + * unmasking SSE events on all harts >> + */ >> + on_cpus(boot_secondary, NULL); >> + >> + sse_test_invalid_event_id(); >> + >> + for (i = 0; i < ARRAY_SIZE(sse_event_infos); i++) { >> + event_id = sse_event_infos[i].event_id; >> + report_prefix_push(sse_event_infos[i].name); >> + if (!sse_can_inject(event_id)) { >> + report_skip("Event 0x%lx does not support injection", event_id); >> + report_prefix_pop(); >> + continue; >> + } else { >> + sse_event_infos[i].can_inject = true; > > What do we need to cache can_inject=true for if we filter out all events > which have can_inject=false? The cache value is used in priority tests. Should we run at least something with > events which don't support injection? 
Yeah, a few tests can be run, I'll modify that > >> + } >> + sse_test_attrs(event_id); >> + sse_test_register_error(event_id); >> + sse_test_inject_simple(event_id); >> + if (sbi_sse_event_is_global(event_id)) >> + sse_test_inject_global(event_id); >> + else >> + sse_test_inject_local(event_id); >> + >> + report_prefix_pop(); >> + } >> + >> + sse_test_injection_priority(); >> + >> + report_prefix_pop(); >> +} >> diff --git a/riscv/sbi.c b/riscv/sbi.c >> index 7c7a2d2d..49e81bda 100644 >> --- a/riscv/sbi.c >> +++ b/riscv/sbi.c >> @@ -32,6 +32,7 @@ >> >> #define HIGH_ADDR_BOUNDARY ((phys_addr_t)1 << 32) >> >> +void check_sse(void); >> void check_fwft(void); >> >> static long __labs(long a) >> @@ -1439,6 +1440,7 @@ int main(int argc, char **argv) >> check_hsm(); >> check_dbcn(); >> check_susp(); >> + check_sse(); >> check_fwft(); >> >> return report_summary(); >> -- >> 2.47.2 >> > > Thanks, > drew From andrew.jones at linux.dev Thu Mar 6 07:15:06 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Thu, 6 Mar 2025 16:15:06 +0100 Subject: [kvm-unit-tests PATCH v7 6/6] riscv: sbi: Add SSE extension tests In-Reply-To: References: <20250214114423.1071621-1-cleger@rivosinc.com> <20250214114423.1071621-7-cleger@rivosinc.com> <20250227-93a15f012d9bda941ef44e38@orel> Message-ID: <20250306-5f8b0b45873648fa93beccc7@orel> On Thu, Mar 06, 2025 at 03:32:39PM +0100, Clément Léger wrote: > > > On 28/02/2025 18:51, Andrew Jones wrote: ...
> >> + attr = SBI_SSE_ATTR_INTERRUPTED_FLAGS; > >> + ret = sbi_sse_read_attrs(event_id, attr, 1, &prev_value); > >> + sbiret_report_error(&ret, SBI_SUCCESS, "Save interrupted flags no error"); > >> + > >> + for (i = 0; i < ARRAY_SIZE(interrupted_flags); i++) { > >> + flags = interrupted_flags[i]; > >> + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); > >> + sbiret_report_error(&ret, SBI_SUCCESS, > >> + "Set interrupted flags bit 0x%lx value no error", flags); > >> + ret = sbi_sse_read_attrs(event_id, attr, 1, &value); > >> + sbiret_report_error(&ret, SBI_SUCCESS, "Get interrupted flags after set no error"); > >> + report(value == flags, "interrupted flags modified value ok: 0x%lx", value); > > > > Do we also need to test with more than one flag set at a time? > > That is already done a few lines above (see /* Restore full saved state */). OK > > > > >> + } > >> + > >> + /* Write invalid bit in flag register */ > >> + flags = SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT << 1; > >> + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); > >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", > >> + flags); > >> +#if __riscv_xlen > 32 > >> + flags = BIT(32); > >> + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); > >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", > > > > This should have a different report string than the test above. > > The bit value format does differentiate the printf though. OK ... > >> + ret = sbi_sse_unregister(event_id); > >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "SSE unregister no error")) > >> + goto done; > >> + > >> + sse_check_state(event_id, SBI_SSE_STATE_UNUSED); > >> + > >> +done: > > > > Is it ok to leave this function with an event registered/enabled? If not, > > then some of the goto's above should goto other labels which disable and > > unregister. 
> > No it's not but it's massive pain to keep everything coherent when it > fails ;) > asserts/aborts are fine if we can't recover easily, but then we should move the SSE tests out of the main SBI test into its own test so we don't short-circuit all other tests that may follow it. ... > >> + /* Be sure global events are targeting the current hart */ > >> + sse_global_event_set_current_hart(event_id); > >> + > >> + sbi_sse_register(event_id, event_arg); > >> + value = arg->prio; > >> + sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); > >> + sbi_sse_enable(event_id); > > > > No return code checks for these SSE calls? If we're 99% sure they should > > succeed, then I'd still check them with asserts. > > I was a bit lazy here. Since the goal is *not* to check the event states > themselves but rather the ordering, I didn't bother checking them. As > said before, handling errors and event state properly in case of error > seemed like churn to me *just* for testing. I'll try something better > as well though. > We always want at least asserts() in order to catch the train when it first goes off the rails, rather than after it smashed through a village or two. Thanks, drew From cleger at rivosinc.com Thu Mar 6 07:18:29 2025 From: cleger at rivosinc.com (Clément Léger) Date: Thu, 6 Mar 2025 16:18:29 +0100 Subject: [kvm-unit-tests PATCH v7 6/6] riscv: sbi: Add SSE extension tests In-Reply-To: <20250306-5f8b0b45873648fa93beccc7@orel> References: <20250214114423.1071621-1-cleger@rivosinc.com> <20250214114423.1071621-7-cleger@rivosinc.com> <20250227-93a15f012d9bda941ef44e38@orel> <20250306-5f8b0b45873648fa93beccc7@orel> Message-ID: <95a72c57-40ff-417a-a8a2-065554eda5ec@rivosinc.com> On 06/03/2025 16:15, Andrew Jones wrote: > On Thu, Mar 06, 2025 at 03:32:39PM +0100, Clément Léger wrote: >> >> >> On 28/02/2025 18:51, Andrew Jones wrote: > ...
>>>> + attr = SBI_SSE_ATTR_INTERRUPTED_FLAGS; >>>> + ret = sbi_sse_read_attrs(event_id, attr, 1, &prev_value); >>>> + sbiret_report_error(&ret, SBI_SUCCESS, "Save interrupted flags no error"); >>>> + >>>> + for (i = 0; i < ARRAY_SIZE(interrupted_flags); i++) { >>>> + flags = interrupted_flags[i]; >>>> + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); >>>> + sbiret_report_error(&ret, SBI_SUCCESS, >>>> + "Set interrupted flags bit 0x%lx value no error", flags); >>>> + ret = sbi_sse_read_attrs(event_id, attr, 1, &value); >>>> + sbiret_report_error(&ret, SBI_SUCCESS, "Get interrupted flags after set no error"); >>>> + report(value == flags, "interrupted flags modified value ok: 0x%lx", value); >>> >>> Do we also need to test with more than one flag set at a time? >> >> That is already done a few lines above (see /* Restore full saved state */). > > OK > >> >>> >>>> + } >>>> + >>>> + /* Write invalid bit in flag register */ >>>> + flags = SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT << 1; >>>> + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); >>>> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", >>>> + flags); >>>> +#if __riscv_xlen > 32 >>>> + flags = BIT(32); >>>> + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); >>>> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", >>> >>> This should have a different report string than the test above. >> >> The bit value format does differentiate the printf though. > > OK > > ... >>>> + ret = sbi_sse_unregister(event_id); >>>> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "SSE unregister no error")) >>>> + goto done; >>>> + >>>> + sse_check_state(event_id, SBI_SSE_STATE_UNUSED); >>>> + >>>> +done: >>> >>> Is it ok to leave this function with an event registered/enabled? If not, >>> then some of the goto's above should goto other labels which disable and >>> unregister. 
>> >> No it's not but it's massive pain to keep everything coherent when it >> fails ;) >> > > asserts/aborts are fine if we can't recover easily, but then we should > move the SSE tests out of the main SBI test into its own test so we > don't short-circuit all other tests that may follow it. Oh yes I did not though of that. I could as well short circuit the sse event test themselves. Currently test function returns void but that could be handled more gracefully so that we don't break all other tests as well. > > ... >>>> + /* Be sure global events are targeting the current hart */ >>>> + sse_global_event_set_current_hart(event_id); >>>> + >>>> + sbi_sse_register(event_id, event_arg); >>>> + value = arg->prio; >>>> + sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); >>>> + sbi_sse_enable(event_id); >>> >>> No return code checks for these SSE calls? If we're 99% sure they should >>> succeed, then I'd still check them with asserts. >> >> I was a bit lazy here. Since the goal is *not* to check the event state >> themselve but rather the ordering, I didn't bother checking them. As >> said before, habndling error and event state properly in case of error >> seemed like a churn to me *just* for testing. I'll try something better >> as well though. >> > > We always want at least asserts() in order to catch the train when it > first goes off the rails, rather than after it smashed through a village > or two. Yeah sure ;) I'll try to do what I said before: short circuit the remaining SSE test for the event itself rather than aborting or splitting the tests. 
Thanks, Clément > > Thanks, > drew From seanjc at google.com Thu Mar 6 14:17:50 2025 From: seanjc at google.com (Sean Christopherson) Date: Thu, 6 Mar 2025 14:17:50 -0800 Subject: [RFC kvm-unit-tests PATCH] lib: Use __ASSEMBLER__ instead of __ASSEMBLY__ In-Reply-To: References: <20250222014526.2302653-1-seanjc@google.com> Message-ID: On Thu, Mar 06, 2025, Thomas Huth wrote: > On 22/02/2025 02.45, Sean Christopherson wrote: > > Convert all non-x86 #ifdefs from __ASSEMBLY__ to __ASSEMBLER__, and remove > > all manual __ASSEMBLY__ #defines. __ASSEMBLY_ was inherited blindly from > > the Linux kernel, and must be manually defined, e.g. through build rules > > or with the aforementioned explicit #defines in assembly code. > > > > __ASSEMBLER__ on the other hand is automatically defined by the compiler > > when preprocessing assembly, i.e. doesn't require manually #defines for > > the code to function correctly. > > > > Ignore x86, as x86 doesn't actually rely on __ASSEMBLY__ at the moment, > > and is undergoing a parallel cleanup. > > > > Signed-off-by: Sean Christopherson > > --- > > > > Completely untested. This is essentially a "rage" patch after spending > > way, way too much time trying to understand why I couldn't include some > > __ASSEMBLY__ protected headers in x86 assembly files. > > Thanks, applied (after fixing the spot that Andrew mentioned and another one > that has been merged in between)! > > BTW, do you happen to know why the kernel uses __ASSEMBLY__ and not > __ASSEMBLER__? Just grown historically, or is there a real reason? AFAICT, it's purely historical.
From anup at brainfault.org Fri Mar 7 00:28:46 2025 From: anup at brainfault.org (Anup Patel) Date: Fri, 7 Mar 2025 13:58:46 +0530 Subject: [PATCH v2 0/4] RISC-V KVM PMU fix and selftest improvement In-Reply-To: <20250303-kvm_pmu_improve-v2-0-41d177e45929@rivosinc.com> References: <20250303-kvm_pmu_improve-v2-0-41d177e45929@rivosinc.com> Message-ID: On Tue, Mar 4, 2025 at 4:23 AM Atish Patra wrote: > > This series adds a fix for KVM PMU code and improves the pmu selftest > by allowing generating precise number of interrupts. It also provided > another additional option to the overflow test that allows user to > generate custom number of LCOFI interrupts. > > Signed-off-by: Atish Patra > --- > Changes in v2: > - Initialized the local overflow irq variable to 0 indicate that it's not a > allowed value. > - Moved the introduction of argument option `n` to the last patch. > - Link to v1: https://lore.kernel.org/r/20250226-kvm_pmu_improve-v1-0-74c058c2bf6d@rivosinc.com > > --- > Atish Patra (4): > RISC-V: KVM: Disable the kernel perf counter during configure > KVM: riscv: selftests: Do not start the counter in the overflow handler > KVM: riscv: selftests: Change command line option > KVM: riscv: selftests: Allow number of interrupts to be configurable Queued this series for Linux-6.15.
Thanks, Anup > > arch/riscv/kvm/vcpu_pmu.c | 1 + > tools/testing/selftests/kvm/riscv/sbi_pmu_test.c | 81 ++++++++++++++++-------- > 2 files changed, 57 insertions(+), 25 deletions(-) > --- > base-commit: 0ad2507d5d93f39619fc42372c347d6006b64319 > change-id: 20250225-kvm_pmu_improve-fffd038b2404 > -- > Regards, > Atish patra > From andrew.jones at linux.dev Fri Mar 7 00:39:53 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Fri, 7 Mar 2025 09:39:53 +0100 Subject: [kvm-unit-tests PATCH] Makefile: Use CFLAGS in cc-option Message-ID: <20250307083952.40999-2-andrew.jones@linux.dev> When cross compiling with clang we need to specify the target in CFLAGS and cc-option will fail to recognize target-specific options without it. Add CFLAGS to the CC invocation in cc-option. Signed-off-by: Andrew Jones --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 78352fced9d4..9dc5d2234e2a 100644 --- a/Makefile +++ b/Makefile @@ -21,7 +21,7 @@ DESTDIR := $(PREFIX)/share/kvm-unit-tests/ # cc-option # Usage: OP_CFLAGS+=$(call cc-option, -falign-functions=0, -malign-functions=0) -cc-option = $(shell if $(CC) -Werror $(1) -S -o /dev/null -xc /dev/null \ +cc-option = $(shell if $(CC) $(CFLAGS) -Werror $(1) -S -o /dev/null -xc /dev/null \ > /dev/null 2>&1; then echo "$(1)"; else echo "$(2)"; fi ;) libcflat := lib/libcflat.a -- 2.48.1 From thuth at redhat.com Fri Mar 7 00:42:03 2025 From: thuth at redhat.com (Thomas Huth) Date: Fri, 7 Mar 2025 09:42:03 +0100 Subject: [kvm-unit-tests PATCH] Makefile: Use CFLAGS in cc-option In-Reply-To: <20250307083952.40999-2-andrew.jones@linux.dev> References: <20250307083952.40999-2-andrew.jones@linux.dev> Message-ID: <2c0bd772-0132-4053-bd22-aac88a8dcfee@redhat.com> On 07/03/2025 09.39, Andrew Jones wrote: > When cross compiling with clang we need to specify the target in > CFLAGS and cc-option will fail to recognize target-specific options > without it. 
Add CFLAGS to the CC invocation in cc-option. > > Signed-off-by: Andrew Jones > --- > Makefile | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/Makefile b/Makefile > index 78352fced9d4..9dc5d2234e2a 100644 > --- a/Makefile > +++ b/Makefile > @@ -21,7 +21,7 @@ DESTDIR := $(PREFIX)/share/kvm-unit-tests/ > > # cc-option > # Usage: OP_CFLAGS+=$(call cc-option, -falign-functions=0, -malign-functions=0) > -cc-option = $(shell if $(CC) -Werror $(1) -S -o /dev/null -xc /dev/null \ > +cc-option = $(shell if $(CC) $(CFLAGS) -Werror $(1) -S -o /dev/null -xc /dev/null \ > > /dev/null 2>&1; then echo "$(1)"; else echo "$(2)"; fi ;) > > libcflat := lib/libcflat.a Reviewed-by: Thomas Huth From andrew.jones at linux.dev Fri Mar 7 00:45:39 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Fri, 7 Mar 2025 09:45:39 +0100 Subject: [kvm-unit-tests PATCH] Makefile: Use CFLAGS in cc-option In-Reply-To: <2c0bd772-0132-4053-bd22-aac88a8dcfee@redhat.com> References: <20250307083952.40999-2-andrew.jones@linux.dev> <2c0bd772-0132-4053-bd22-aac88a8dcfee@redhat.com> Message-ID: <20250307-7a4556045a53c51e3150da2f@orel> On Fri, Mar 07, 2025 at 09:42:03AM +0100, Thomas Huth wrote: > On 07/03/2025 09.39, Andrew Jones wrote: > > When cross compiling with clang we need to specify the target in > > CFLAGS and cc-option will fail to recognize target-specific options > > without it. Add CFLAGS to the CC invocation in cc-option. 
> > > > Signed-off-by: Andrew Jones > > --- > > Makefile | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/Makefile b/Makefile > > index 78352fced9d4..9dc5d2234e2a 100644 > > --- a/Makefile > > +++ b/Makefile > > @@ -21,7 +21,7 @@ DESTDIR := $(PREFIX)/share/kvm-unit-tests/ > > # cc-option > > # Usage: OP_CFLAGS+=$(call cc-option, -falign-functions=0, -malign-functions=0) > > -cc-option = $(shell if $(CC) -Werror $(1) -S -o /dev/null -xc /dev/null \ > > +cc-option = $(shell if $(CC) $(CFLAGS) -Werror $(1) -S -o /dev/null -xc /dev/null \ > > > /dev/null 2>&1; then echo "$(1)"; else echo "$(2)"; fi ;) > > libcflat := lib/libcflat.a > > Reviewed-by: Thomas Huth Thanks, but I just found out that I was too hasty with this patch. I broke x86, /builds/jones-drew/kvm-unit-tests/x86/Makefile.common:105: *** Recursive variable 'CFLAGS' references itself (eventually). Stop. I'll try to sort that out and send a v2. Thanks, drew From andrew.jones at linux.dev Fri Mar 7 01:18:29 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Fri, 7 Mar 2025 10:18:29 +0100 Subject: [kvm-unit-tests PATCH v2] Makefile: Use CFLAGS in cc-option Message-ID: <20250307091828.57933-2-andrew.jones@linux.dev> When cross compiling with clang we need to specify the target in CFLAGS and cc-option will fail to recognize target-specific options without it. Add CFLAGS to the CC invocation in cc-option. The introduction of the realmode_bits variable is necessary to avoid make failing to build x86 due to CFLAGS referencing itself. 
Signed-off-by: Andrew Jones --- v2: - Fixed x86 builds with the realmode_bits variable Makefile | 2 +- x86/Makefile.common | 3 ++- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/Makefile b/Makefile index 78352fced9d4..9dc5d2234e2a 100644 --- a/Makefile +++ b/Makefile @@ -21,7 +21,7 @@ DESTDIR := $(PREFIX)/share/kvm-unit-tests/ # cc-option # Usage: OP_CFLAGS+=$(call cc-option, -falign-functions=0, -malign-functions=0) -cc-option = $(shell if $(CC) -Werror $(1) -S -o /dev/null -xc /dev/null \ +cc-option = $(shell if $(CC) $(CFLAGS) -Werror $(1) -S -o /dev/null -xc /dev/null \ > /dev/null 2>&1; then echo "$(1)"; else echo "$(2)"; fi ;) libcflat := lib/libcflat.a diff --git a/x86/Makefile.common b/x86/Makefile.common index 0b7f35c8de85..e97464912e28 100644 --- a/x86/Makefile.common +++ b/x86/Makefile.common @@ -98,6 +98,7 @@ tests-common = $(TEST_DIR)/vmexit.$(exe) $(TEST_DIR)/tsc.$(exe) \ ifneq ($(CONFIG_EFI),y) tests-common += $(TEST_DIR)/realmode.$(exe) \ $(TEST_DIR)/la57.$(exe) +realmode_bits := $(if $(call cc-option,-m16,""),16,32) endif test_cases: $(tests-common) $(tests) @@ -108,7 +109,7 @@ $(TEST_DIR)/realmode.elf: $(TEST_DIR)/realmode.o $(LD) -m elf_i386 -nostdlib -o $@ \ -T $(SRCDIR)/$(TEST_DIR)/realmode.lds $^ -$(TEST_DIR)/realmode.o: bits = $(if $(call cc-option,-m16,""),16,32) +$(TEST_DIR)/realmode.o: bits = $(realmode_bits) $(TEST_DIR)/access_test.$(bin): $(TEST_DIR)/access.o -- 2.48.1 From cleger at rivosinc.com Fri Mar 7 08:15:43 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Fri, 7 Mar 2025 17:15:43 +0100 Subject: [kvm-unit-tests PATCH v8 1/6] kbuild: Allow multiple asm-offsets file to be generated In-Reply-To: <20250307161549.1873770-1-cleger@rivosinc.com> References: <20250307161549.1873770-1-cleger@rivosinc.com> Message-ID: <20250307161549.1873770-2-cleger@rivosinc.com> In order to allow multiple asm-offsets files to generated the include guard need to be different between these file. 
Add a asm_offset_name makefile macro to obtain an uppercase name matching the original asm offsets file. Signed-off-by: Cl?ment L?ger Reviewed-by: Andrew Jones --- scripts/asm-offsets.mak | 22 +++++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/scripts/asm-offsets.mak b/scripts/asm-offsets.mak index 7b64162d..a5fdbf5d 100644 --- a/scripts/asm-offsets.mak +++ b/scripts/asm-offsets.mak @@ -15,10 +15,14 @@ define sed-y s:->::; p;}' endef +define asm_offset_name + $(shell echo $(notdir $(1)) | tr [:lower:]- [:upper:]_) +endef + define make_asm_offsets (set -e; \ - echo "#ifndef __ASM_OFFSETS_H__"; \ - echo "#define __ASM_OFFSETS_H__"; \ + echo "#ifndef __$(strip $(asm_offset_name))_H__"; \ + echo "#define __$(strip $(asm_offset_name))_H__"; \ echo "/*"; \ echo " * Generated file. DO NOT MODIFY."; \ echo " *"; \ @@ -29,12 +33,16 @@ define make_asm_offsets echo "#endif" ) > $@ endef -$(asm-offsets:.h=.s): $(asm-offsets:.h=.c) - $(CC) $(CFLAGS) -fverbose-asm -S -o $@ $< +define gen_asm_offsets_rules +$(1).s: $(1).c + $(CC) $(CFLAGS) -fverbose-asm -S -o $$@ $$< + +$(1).h: $(1).s + $$(call make_asm_offsets,$(1)) + cp -f $$@ lib/generated/ +endef -$(asm-offsets): $(asm-offsets:.h=.s) - $(call make_asm_offsets) - cp -f $(asm-offsets) lib/generated/ +$(foreach o,$(asm-offsets),$(eval $(call gen_asm_offsets_rules, $(o:.h=)))) OBJDIRS += lib/generated -- 2.47.2 From cleger at rivosinc.com Fri Mar 7 08:15:42 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Fri, 7 Mar 2025 17:15:42 +0100 Subject: [kvm-unit-tests PATCH v8 0/6] riscv: add SBI SSE extension tests Message-ID: <20250307161549.1873770-1-cleger@rivosinc.com> This series adds tests for SBI SSE extension as well as needed infrastructure for SSE support. It also adds test specific asm-offsets generation to use custom OFFSET and DEFINE from the test directory. 
--- V8: - Short circuit current event tests if failure happens - Remove SSE from all report strings - Indent .prio field - Add cpu_relax()/smp_rmb() where needed - Add timeout for global event ENABLED state check - Added BIT(32) aliases tests for attribute/event_id. V7: - Test ids/attributes/attributes count > 32 bits - Rename all SSE function to sbi_sse_* - Use event_id instead of event/evt - Factorize read/write test - Use virt_to_phys() for attributes read/write. - Extensively use sbiret_report_error() - Change check function return values to bool. - Added assert for stack size to be below or equal to PAGE_SIZE - Use en env variable for the maximum hart ID - Check that individual read from attributes matches the multiple attributes read. - Added multiple attributes write at once - Used READ_ONCE/WRITE_ONCE - Inject all local event at once rather than looping fopr each core. - Split test_arg for local_dispatch test so that all CPUs can run at once. - Move SSE entry and generic code to lib/riscv for other tests - Fix unmask/mask state checking V6: - Add missing $(generated-file) dependencies for "-deps" objects - Split SSE entry from sbi-asm.S to sse-asm.S and all SSE core functions since it will be useful for other tests as well (dbltrp). 
V5: - Update event ranges based on latest spec - Rename asm-offset-test.c to sbi-asm-offset.c V4: - Fix typo sbi_ext_ss_fid -> sbi_ext_sse_fid - Add proper asm-offset generation for tests - Move SSE specific file from lib/riscv to riscv/ V3: - Add -deps variable for test specific dependencies - Fix formatting errors/typo in sbi.h - Add missing double trap event - Alphabetize sbi-sse.c includes - Fix a6 content after unmasking event - Add SSE HART_MASK/UNMASK test - Use mv instead of move - move sbi_check_sse() definition in sbi.c - Remove sbi_sse test from unitests.cfg V2: - Rebased on origin/master and integrate it into sbi.c tests Cl?ment L?ger (6): kbuild: Allow multiple asm-offsets file to be generated riscv: Set .aux.o files as .PRECIOUS riscv: Use asm-offsets to generate SBI_EXT_HSM values riscv: lib: Add SBI SSE extension definitions lib: riscv: Add SBI SSE support riscv: sbi: Add SSE extension tests scripts/asm-offsets.mak | 22 +- riscv/Makefile | 5 +- lib/riscv/asm/csr.h | 1 + lib/riscv/asm/sbi.h | 144 ++++- lib/riscv/sbi-sse-asm.S | 103 ++++ lib/riscv/asm-offsets.c | 9 + lib/riscv/sbi.c | 76 +++ riscv/sbi-tests.h | 1 + riscv/sbi-asm.S | 6 +- riscv/sbi-asm-offsets.c | 11 + riscv/sbi-sse.c | 1215 +++++++++++++++++++++++++++++++++++++++ riscv/sbi.c | 2 + riscv/.gitignore | 1 + 13 files changed, 1584 insertions(+), 12 deletions(-) create mode 100644 lib/riscv/sbi-sse-asm.S create mode 100644 riscv/sbi-asm-offsets.c create mode 100644 riscv/sbi-sse.c create mode 100644 riscv/.gitignore -- 2.47.2 From cleger at rivosinc.com Fri Mar 7 08:15:44 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Fri, 7 Mar 2025 17:15:44 +0100 Subject: [kvm-unit-tests PATCH v8 2/6] riscv: Set .aux.o files as .PRECIOUS In-Reply-To: <20250307161549.1873770-1-cleger@rivosinc.com> References: <20250307161549.1873770-1-cleger@rivosinc.com> Message-ID: <20250307161549.1873770-3-cleger@rivosinc.com> When compiling, we need to keep .aux.o file or they will be 
removed after the compilation which leads to dependent files to be recompiled. Set these files as .PRECIOUS to keep them. Signed-off-by: Clément Léger --- riscv/Makefile | 1 + 1 file changed, 1 insertion(+) diff --git a/riscv/Makefile b/riscv/Makefile index 52718f3f..ae9cf02a 100644 --- a/riscv/Makefile +++ b/riscv/Makefile @@ -90,6 +90,7 @@ CFLAGS += -I $(SRCDIR)/lib -I $(SRCDIR)/lib/libfdt -I lib -I $(SRCDIR)/riscv asm-offsets = lib/riscv/asm-offsets.h include $(SRCDIR)/scripts/asm-offsets.mak +.PRECIOUS: %.aux.o %.aux.o: $(SRCDIR)/lib/auxinfo.c $(CC) $(CFLAGS) -c -o $@ $< \ -DPROGNAME=\"$(notdir $(@:.aux.o=.$(exe)))\" -DAUXFLAGS=$(AUXFLAGS) -- 2.47.2 From cleger at rivosinc.com Fri Mar 7 08:15:45 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Fri, 7 Mar 2025 17:15:45 +0100 Subject: [kvm-unit-tests PATCH v8 3/6] riscv: Use asm-offsets to generate SBI_EXT_HSM values In-Reply-To: <20250307161549.1873770-1-cleger@rivosinc.com> References: <20250307161549.1873770-1-cleger@rivosinc.com> Message-ID: <20250307161549.1873770-4-cleger@rivosinc.com> Replace hardcoded values with generated ones using sbi-asm-offset. This allows to directly use ASM_SBI_EXT_HSM and ASM_SBI_EXT_HSM_STOP in assembly.
Signed-off-by: Cl?ment L?ger Reviewed-by: Andrew Jones --- riscv/Makefile | 2 +- riscv/sbi-asm.S | 6 ++++-- riscv/sbi-asm-offsets.c | 11 +++++++++++ riscv/.gitignore | 1 + 4 files changed, 17 insertions(+), 3 deletions(-) create mode 100644 riscv/sbi-asm-offsets.c create mode 100644 riscv/.gitignore diff --git a/riscv/Makefile b/riscv/Makefile index ae9cf02a..02d2ac39 100644 --- a/riscv/Makefile +++ b/riscv/Makefile @@ -87,7 +87,7 @@ CFLAGS += -ffreestanding CFLAGS += -O2 CFLAGS += -I $(SRCDIR)/lib -I $(SRCDIR)/lib/libfdt -I lib -I $(SRCDIR)/riscv -asm-offsets = lib/riscv/asm-offsets.h +asm-offsets = lib/riscv/asm-offsets.h riscv/sbi-asm-offsets.h include $(SRCDIR)/scripts/asm-offsets.mak .PRECIOUS: %.aux.o diff --git a/riscv/sbi-asm.S b/riscv/sbi-asm.S index f4185496..51f46efd 100644 --- a/riscv/sbi-asm.S +++ b/riscv/sbi-asm.S @@ -6,6 +6,8 @@ */ #include #include +#include +#include #include "sbi-tests.h" @@ -57,8 +59,8 @@ sbi_hsm_check: 7: lb t0, 0(t1) pause beqz t0, 7b - li a7, 0x48534d /* SBI_EXT_HSM */ - li a6, 1 /* SBI_EXT_HSM_HART_STOP */ + li a7, ASM_SBI_EXT_HSM + li a6, ASM_SBI_EXT_HSM_HART_STOP ecall 8: pause j 8b diff --git a/riscv/sbi-asm-offsets.c b/riscv/sbi-asm-offsets.c new file mode 100644 index 00000000..bd37b6a2 --- /dev/null +++ b/riscv/sbi-asm-offsets.c @@ -0,0 +1,11 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include +#include + +int main(void) +{ + DEFINE(ASM_SBI_EXT_HSM, SBI_EXT_HSM); + DEFINE(ASM_SBI_EXT_HSM_HART_STOP, SBI_EXT_HSM_HART_STOP); + + return 0; +} diff --git a/riscv/.gitignore b/riscv/.gitignore new file mode 100644 index 00000000..0a8c5a36 --- /dev/null +++ b/riscv/.gitignore @@ -0,0 +1 @@ +/*-asm-offsets.[hs] -- 2.47.2 From cleger at rivosinc.com Fri Mar 7 08:15:46 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Fri, 7 Mar 2025 17:15:46 +0100 Subject: [kvm-unit-tests PATCH v8 4/6] riscv: lib: Add SBI SSE extension definitions In-Reply-To: <20250307161549.1873770-1-cleger@rivosinc.com> 
References: <20250307161549.1873770-1-cleger@rivosinc.com> Message-ID: <20250307161549.1873770-5-cleger@rivosinc.com> Add SBI SSE extension definitions in sbi.h Signed-off-by: Cl?ment L?ger Reviewed-by: Andrew Jones --- lib/riscv/asm/sbi.h | 106 +++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 105 insertions(+), 1 deletion(-) diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h index 2f4d91ef..780c9edd 100644 --- a/lib/riscv/asm/sbi.h +++ b/lib/riscv/asm/sbi.h @@ -30,6 +30,7 @@ enum sbi_ext_id { SBI_EXT_DBCN = 0x4442434E, SBI_EXT_SUSP = 0x53555350, SBI_EXT_FWFT = 0x46574654, + SBI_EXT_SSE = 0x535345, }; enum sbi_ext_base_fid { @@ -78,7 +79,6 @@ enum sbi_ext_dbcn_fid { SBI_EXT_DBCN_CONSOLE_WRITE_BYTE, }; - enum sbi_ext_fwft_fid { SBI_EXT_FWFT_SET = 0, SBI_EXT_FWFT_GET, @@ -105,6 +105,110 @@ enum sbi_ext_fwft_fid { #define SBI_FWFT_SET_FLAG_LOCK BIT(0) +enum sbi_ext_sse_fid { + SBI_EXT_SSE_READ_ATTRS = 0, + SBI_EXT_SSE_WRITE_ATTRS, + SBI_EXT_SSE_REGISTER, + SBI_EXT_SSE_UNREGISTER, + SBI_EXT_SSE_ENABLE, + SBI_EXT_SSE_DISABLE, + SBI_EXT_SSE_COMPLETE, + SBI_EXT_SSE_INJECT, + SBI_EXT_SSE_HART_UNMASK, + SBI_EXT_SSE_HART_MASK, +}; + +/* SBI SSE Event Attributes. 
*/ +enum sbi_sse_attr_id { + SBI_SSE_ATTR_STATUS = 0x00000000, + SBI_SSE_ATTR_PRIORITY = 0x00000001, + SBI_SSE_ATTR_CONFIG = 0x00000002, + SBI_SSE_ATTR_PREFERRED_HART = 0x00000003, + SBI_SSE_ATTR_ENTRY_PC = 0x00000004, + SBI_SSE_ATTR_ENTRY_ARG = 0x00000005, + SBI_SSE_ATTR_INTERRUPTED_SEPC = 0x00000006, + SBI_SSE_ATTR_INTERRUPTED_FLAGS = 0x00000007, + SBI_SSE_ATTR_INTERRUPTED_A6 = 0x00000008, + SBI_SSE_ATTR_INTERRUPTED_A7 = 0x00000009, +}; + +#define SBI_SSE_ATTR_STATUS_STATE_OFFSET 0 +#define SBI_SSE_ATTR_STATUS_STATE_MASK 0x3 +#define SBI_SSE_ATTR_STATUS_PENDING_OFFSET 2 +#define SBI_SSE_ATTR_STATUS_INJECT_OFFSET 3 + +#define SBI_SSE_ATTR_CONFIG_ONESHOT BIT(0) + +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPP BIT(0) +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPIE BIT(1) +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPV BIT(2) +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPVP BIT(3) +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPELP BIT(4) +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT BIT(5) + +enum sbi_sse_state { + SBI_SSE_STATE_UNUSED = 0, + SBI_SSE_STATE_REGISTERED = 1, + SBI_SSE_STATE_ENABLED = 2, + SBI_SSE_STATE_RUNNING = 3, +}; + +/* SBI SSE Event IDs. 
*/ +/* Range 0x00000000 - 0x0000ffff */ +#define SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS 0x00000000 +#define SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP 0x00000001 +#define SBI_SSE_EVENT_LOCAL_RESERVED_0_START 0x00000002 +#define SBI_SSE_EVENT_LOCAL_RESERVED_0_END 0x00003fff +#define SBI_SSE_EVENT_LOCAL_PLAT_0_START 0x00004000 +#define SBI_SSE_EVENT_LOCAL_PLAT_0_END 0x00007fff + +#define SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS 0x00008000 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_0_START 0x00008001 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_0_END 0x0000bfff +#define SBI_SSE_EVENT_GLOBAL_PLAT_0_START 0x0000c000 +#define SBI_SSE_EVENT_GLOBAL_PLAT_0_END 0x0000ffff + +/* Range 0x00010000 - 0x0001ffff */ +#define SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW 0x00010000 +#define SBI_SSE_EVENT_LOCAL_RESERVED_1_START 0x00010001 +#define SBI_SSE_EVENT_LOCAL_RESERVED_1_END 0x00013fff +#define SBI_SSE_EVENT_LOCAL_PLAT_1_START 0x00014000 +#define SBI_SSE_EVENT_LOCAL_PLAT_1_END 0x00017fff + +#define SBI_SSE_EVENT_GLOBAL_RESERVED_1_START 0x00018000 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_1_END 0x0001bfff +#define SBI_SSE_EVENT_GLOBAL_PLAT_1_START 0x0001c000 +#define SBI_SSE_EVENT_GLOBAL_PLAT_1_END 0x0001ffff + +/* Range 0x00100000 - 0x0010ffff */ +#define SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS 0x00100000 +#define SBI_SSE_EVENT_LOCAL_RESERVED_2_START 0x00100001 +#define SBI_SSE_EVENT_LOCAL_RESERVED_2_END 0x00103fff +#define SBI_SSE_EVENT_LOCAL_PLAT_2_START 0x00104000 +#define SBI_SSE_EVENT_LOCAL_PLAT_2_END 0x00107fff + +#define SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS 0x00108000 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_2_START 0x00108001 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_2_END 0x0010bfff +#define SBI_SSE_EVENT_GLOBAL_PLAT_2_START 0x0010c000 +#define SBI_SSE_EVENT_GLOBAL_PLAT_2_END 0x0010ffff + +/* Range 0xffff0000 - 0xffffffff */ +#define SBI_SSE_EVENT_LOCAL_SOFTWARE 0xffff0000 +#define SBI_SSE_EVENT_LOCAL_RESERVED_3_START 0xffff0001 +#define SBI_SSE_EVENT_LOCAL_RESERVED_3_END 0xffff3fff +#define SBI_SSE_EVENT_LOCAL_PLAT_3_START 
0xffff4000 +#define SBI_SSE_EVENT_LOCAL_PLAT_3_END 0xffff7fff + +#define SBI_SSE_EVENT_GLOBAL_SOFTWARE 0xffff8000 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_3_START 0xffff8001 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_3_END 0xffffbfff +#define SBI_SSE_EVENT_GLOBAL_PLAT_3_START 0xffffc000 +#define SBI_SSE_EVENT_GLOBAL_PLAT_3_END 0xffffffff + +#define SBI_SSE_EVENT_PLATFORM_BIT BIT(14) +#define SBI_SSE_EVENT_GLOBAL_BIT BIT(15) + struct sbiret { long error; long value; -- 2.47.2 From cleger at rivosinc.com Fri Mar 7 08:15:47 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Fri, 7 Mar 2025 17:15:47 +0100 Subject: [kvm-unit-tests PATCH v8 5/6] lib: riscv: Add SBI SSE support In-Reply-To: <20250307161549.1873770-1-cleger@rivosinc.com> References: <20250307161549.1873770-1-cleger@rivosinc.com> Message-ID: <20250307161549.1873770-6-cleger@rivosinc.com> Add support for registering and handling SSE events. This will be used by sbi test as well as upcoming double trap tests. 
Signed-off-by: Cl?ment L?ger Reviewed-by: Andrew Jones --- riscv/Makefile | 1 + lib/riscv/asm/csr.h | 1 + lib/riscv/asm/sbi.h | 38 ++++++++++++++- lib/riscv/sbi-sse-asm.S | 103 ++++++++++++++++++++++++++++++++++++++++ lib/riscv/asm-offsets.c | 9 ++++ lib/riscv/sbi.c | 76 +++++++++++++++++++++++++++++ 6 files changed, 227 insertions(+), 1 deletion(-) create mode 100644 lib/riscv/sbi-sse-asm.S diff --git a/riscv/Makefile b/riscv/Makefile index 02d2ac39..16fc125b 100644 --- a/riscv/Makefile +++ b/riscv/Makefile @@ -43,6 +43,7 @@ cflatobjs += lib/riscv/setup.o cflatobjs += lib/riscv/smp.o cflatobjs += lib/riscv/stack.o cflatobjs += lib/riscv/timer.o +cflatobjs += lib/riscv/sbi-sse-asm.o ifeq ($(ARCH),riscv32) cflatobjs += lib/ldiv32.o endif diff --git a/lib/riscv/asm/csr.h b/lib/riscv/asm/csr.h index c7fc87a9..3e4b5fca 100644 --- a/lib/riscv/asm/csr.h +++ b/lib/riscv/asm/csr.h @@ -17,6 +17,7 @@ #define CSR_TIME 0xc01 #define SR_SIE _AC(0x00000002, UL) +#define SR_SPP _AC(0x00000100, UL) /* Exception cause high bit - is an interrupt if set */ #define CAUSE_IRQ_FLAG (_AC(1, UL) << (__riscv_xlen - 1)) diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h index 780c9edd..acef8a5e 100644 --- a/lib/riscv/asm/sbi.h +++ b/lib/riscv/asm/sbi.h @@ -230,5 +230,41 @@ struct sbiret sbi_send_ipi_broadcast(void); struct sbiret sbi_set_timer(unsigned long stime_value); long sbi_probe(int ext); -#endif /* !__ASSEMBLER__ */ +typedef void (*sbi_sse_handler_fn)(void *data, struct pt_regs *regs, unsigned int hartid); + +struct sbi_sse_handler_arg { + unsigned long reg_tmp; + sbi_sse_handler_fn handler; + void *handler_data; + void *stack; +}; + +extern void sbi_sse_entry(void); + +static inline bool sbi_sse_event_is_global(uint32_t event_id) +{ + return !!(event_id & SBI_SSE_EVENT_GLOBAL_BIT); +} + +struct sbiret sbi_sse_read_attrs_raw(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long phys_lo, + unsigned long phys_hi); +struct sbiret 
sbi_sse_read_attrs(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long *values); +struct sbiret sbi_sse_write_attrs_raw(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long phys_lo, + unsigned long phys_hi); +struct sbiret sbi_sse_write_attrs(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long *values); +struct sbiret sbi_sse_register_raw(unsigned long event_id, unsigned long entry_pc, + unsigned long entry_arg); +struct sbiret sbi_sse_register(unsigned long event_id, struct sbi_sse_handler_arg *arg); +struct sbiret sbi_sse_unregister(unsigned long event_id); +struct sbiret sbi_sse_enable(unsigned long event_id); +struct sbiret sbi_sse_disable(unsigned long event_id); +struct sbiret sbi_sse_hart_mask(void); +struct sbiret sbi_sse_hart_unmask(void); +struct sbiret sbi_sse_inject(unsigned long event_id, unsigned long hart_id); + +#endif /* !__ASSEMBLY__ */ #endif /* _ASMRISCV_SBI_H_ */ diff --git a/lib/riscv/sbi-sse-asm.S b/lib/riscv/sbi-sse-asm.S new file mode 100644 index 00000000..e4efd1ff --- /dev/null +++ b/lib/riscv/sbi-sse-asm.S @@ -0,0 +1,103 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * RISC-V SSE events entry point. 
+ * + * Copyright (C) 2025, Rivos Inc., Cl?ment L?ger + */ +#define __ASSEMBLY__ +#include +#include +#include +#include + +.section .text +.global sbi_sse_entry +sbi_sse_entry: + /* Save stack temporarily */ + REG_S sp, SBI_SSE_REG_TMP(a7) + /* Set entry stack */ + REG_L sp, SBI_SSE_HANDLER_STACK(a7) + + addi sp, sp, -(PT_SIZE) + REG_S ra, PT_RA(sp) + REG_S s0, PT_S0(sp) + REG_S s1, PT_S1(sp) + REG_S s2, PT_S2(sp) + REG_S s3, PT_S3(sp) + REG_S s4, PT_S4(sp) + REG_S s5, PT_S5(sp) + REG_S s6, PT_S6(sp) + REG_S s7, PT_S7(sp) + REG_S s8, PT_S8(sp) + REG_S s9, PT_S9(sp) + REG_S s10, PT_S10(sp) + REG_S s11, PT_S11(sp) + REG_S tp, PT_TP(sp) + REG_S t0, PT_T0(sp) + REG_S t1, PT_T1(sp) + REG_S t2, PT_T2(sp) + REG_S t3, PT_T3(sp) + REG_S t4, PT_T4(sp) + REG_S t5, PT_T5(sp) + REG_S t6, PT_T6(sp) + REG_S gp, PT_GP(sp) + REG_S a0, PT_A0(sp) + REG_S a1, PT_A1(sp) + REG_S a2, PT_A2(sp) + REG_S a3, PT_A3(sp) + REG_S a4, PT_A4(sp) + REG_S a5, PT_A5(sp) + csrr a1, CSR_SEPC + REG_S a1, PT_EPC(sp) + csrr a2, CSR_SSTATUS + REG_S a2, PT_STATUS(sp) + + REG_L a0, SBI_SSE_REG_TMP(a7) + REG_S a0, PT_SP(sp) + + REG_L t0, SBI_SSE_HANDLER(a7) + REG_L a0, SBI_SSE_HANDLER_DATA(a7) + mv a1, sp + mv a2, a6 + jalr t0 + + REG_L a1, PT_EPC(sp) + REG_L a2, PT_STATUS(sp) + csrw CSR_SEPC, a1 + csrw CSR_SSTATUS, a2 + + REG_L ra, PT_RA(sp) + REG_L s0, PT_S0(sp) + REG_L s1, PT_S1(sp) + REG_L s2, PT_S2(sp) + REG_L s3, PT_S3(sp) + REG_L s4, PT_S4(sp) + REG_L s5, PT_S5(sp) + REG_L s6, PT_S6(sp) + REG_L s7, PT_S7(sp) + REG_L s8, PT_S8(sp) + REG_L s9, PT_S9(sp) + REG_L s10, PT_S10(sp) + REG_L s11, PT_S11(sp) + REG_L tp, PT_TP(sp) + REG_L t0, PT_T0(sp) + REG_L t1, PT_T1(sp) + REG_L t2, PT_T2(sp) + REG_L t3, PT_T3(sp) + REG_L t4, PT_T4(sp) + REG_L t5, PT_T5(sp) + REG_L t6, PT_T6(sp) + REG_L gp, PT_GP(sp) + REG_L a0, PT_A0(sp) + REG_L a1, PT_A1(sp) + REG_L a2, PT_A2(sp) + REG_L a3, PT_A3(sp) + REG_L a4, PT_A4(sp) + REG_L a5, PT_A5(sp) + + REG_L sp, PT_SP(sp) + + li a7, ASM_SBI_EXT_SSE + li a6, 
ASM_SBI_EXT_SSE_COMPLETE + ecall + diff --git a/lib/riscv/asm-offsets.c b/lib/riscv/asm-offsets.c index 6c511c14..a96c6e97 100644 --- a/lib/riscv/asm-offsets.c +++ b/lib/riscv/asm-offsets.c @@ -3,6 +3,7 @@ #include #include #include +#include #include int main(void) @@ -63,5 +64,13 @@ int main(void) OFFSET(THREAD_INFO_HARTID, thread_info, hartid); DEFINE(THREAD_INFO_SIZE, sizeof(struct thread_info)); + DEFINE(ASM_SBI_EXT_SSE, SBI_EXT_SSE); + DEFINE(ASM_SBI_EXT_SSE_COMPLETE, SBI_EXT_SSE_COMPLETE); + + OFFSET(SBI_SSE_REG_TMP, sbi_sse_handler_arg, reg_tmp); + OFFSET(SBI_SSE_HANDLER, sbi_sse_handler_arg, handler); + OFFSET(SBI_SSE_HANDLER_DATA, sbi_sse_handler_arg, handler_data); + OFFSET(SBI_SSE_HANDLER_STACK, sbi_sse_handler_arg, stack); + return 0; } diff --git a/lib/riscv/sbi.c b/lib/riscv/sbi.c index 02dd338c..1752c916 100644 --- a/lib/riscv/sbi.c +++ b/lib/riscv/sbi.c @@ -2,6 +2,7 @@ #include #include #include +#include #include #include @@ -31,6 +32,81 @@ struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0, return ret; } +struct sbiret sbi_sse_read_attrs_raw(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long phys_lo, + unsigned long phys_hi) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_READ_ATTRS, event_id, base_attr_id, attr_count, + phys_lo, phys_hi, 0); +} + +struct sbiret sbi_sse_read_attrs(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long *values) +{ + phys_addr_t p = virt_to_phys(values); + + return sbi_sse_read_attrs_raw(event_id, base_attr_id, attr_count, lower_32_bits(p), + upper_32_bits(p)); +} + +struct sbiret sbi_sse_write_attrs_raw(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long phys_lo, + unsigned long phys_hi) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_WRITE_ATTRS, event_id, base_attr_id, attr_count, + phys_lo, phys_hi, 0); +} + +struct sbiret sbi_sse_write_attrs(unsigned long event_id, unsigned long 
base_attr_id, + unsigned long attr_count, unsigned long *values) +{ + phys_addr_t p = virt_to_phys(values); + + return sbi_sse_write_attrs_raw(event_id, base_attr_id, attr_count, lower_32_bits(p), + upper_32_bits(p)); +} + +struct sbiret sbi_sse_register_raw(unsigned long event_id, unsigned long entry_pc, + unsigned long entry_arg) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_REGISTER, event_id, entry_pc, entry_arg, 0, 0, 0); +} + +struct sbiret sbi_sse_register(unsigned long event_id, struct sbi_sse_handler_arg *arg) +{ + return sbi_sse_register_raw(event_id, (unsigned long) sbi_sse_entry, (unsigned long) arg); +} + +struct sbiret sbi_sse_unregister(unsigned long event_id) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_UNREGISTER, event_id, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_sse_enable(unsigned long event_id) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_ENABLE, event_id, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_sse_disable(unsigned long event_id) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_DISABLE, event_id, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_sse_hart_mask(void) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_HART_MASK, 0, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_sse_hart_unmask(void) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_HART_UNMASK, 0, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_sse_inject(unsigned long event_id, unsigned long hart_id) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_INJECT, event_id, hart_id, 0, 0, 0, 0); +} + void sbi_shutdown(void) { sbi_ecall(SBI_EXT_SRST, 0, 0, 0, 0, 0, 0, 0); -- 2.47.2 From cleger at rivosinc.com Fri Mar 7 08:15:48 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Fri, 7 Mar 2025 17:15:48 +0100 Subject: [kvm-unit-tests PATCH v8 6/6] riscv: sbi: Add SSE extension tests In-Reply-To: <20250307161549.1873770-1-cleger@rivosinc.com> References: <20250307161549.1873770-1-cleger@rivosinc.com> Message-ID: <20250307161549.1873770-7-cleger@rivosinc.com> Add SBI SSE extension tests for 
the following features: - Test attributes errors (invalid values, RO, etc) - Registration errors - Simple events (register, enable, inject) - Events with different priorities - Global events dispatch on different harts - Local events on all harts - Hart mask/unmask events Signed-off-by: Clément Léger --- riscv/Makefile | 1 + riscv/sbi-tests.h | 1 + riscv/sbi-sse.c | 1215 +++++++++++++++++++++++++++++++++++++++++++++ riscv/sbi.c | 2 + 4 files changed, 1219 insertions(+) create mode 100644 riscv/sbi-sse.c diff --git a/riscv/Makefile b/riscv/Makefile index 16fc125b..4fe2f1bb 100644 --- a/riscv/Makefile +++ b/riscv/Makefile @@ -18,6 +18,7 @@ tests += $(TEST_DIR)/sieve.$(exe) all: $(tests) $(TEST_DIR)/sbi-deps = $(TEST_DIR)/sbi-asm.o $(TEST_DIR)/sbi-fwft.o +$(TEST_DIR)/sbi-deps += $(TEST_DIR)/sbi-sse.o # When built for EFI sieve needs extra memory, run with e.g. '-m 256' on QEMU $(TEST_DIR)/sieve.$(exe): AUXFLAGS = 0x1 diff --git a/riscv/sbi-tests.h b/riscv/sbi-tests.h index b081464d..a71da809 100644 --- a/riscv/sbi-tests.h +++ b/riscv/sbi-tests.h @@ -71,6 +71,7 @@ sbiret_report(ret, expected_error, expected_value, "check sbi.error and sbi.value") void sbi_bad_fid(int ext); +void check_sse(void); #endif /* __ASSEMBLER__ */ #endif /* _RISCV_SBI_TESTS_H_ */ diff --git a/riscv/sbi-sse.c b/riscv/sbi-sse.c new file mode 100644 index 00000000..7bd58b8b --- /dev/null +++ b/riscv/sbi-sse.c @@ -0,0 +1,1215 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * SBI SSE testsuite + * + * Copyright (C) 2025, Rivos Inc., Clément Léger + */ +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "sbi-tests.h" + +#define SSE_STACK_SIZE PAGE_SIZE + +struct sse_event_info { + uint32_t event_id; + const char *name; + bool can_inject; +}; + +static struct sse_event_info sse_event_infos[] = { + { + .event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, + .name = "local_high_prio_ras", + }, + { 
+ .event_id = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP, + .name = "double_trap", + }, + { + .event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, + .name = "global_high_prio_ras", + }, + { + .event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, + .name = "local_pmu_overflow", + }, + { + .event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, + .name = "local_low_prio_ras", + }, + { + .event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, + .name = "global_low_prio_ras", + }, + { + .event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, + .name = "local_software", + }, + { + .event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, + .name = "global_software", + }, +}; + +static const char *const attr_names[] = { + [SBI_SSE_ATTR_STATUS] = "status", + [SBI_SSE_ATTR_PRIORITY] = "priority", + [SBI_SSE_ATTR_CONFIG] = "config", + [SBI_SSE_ATTR_PREFERRED_HART] = "preferred_hart", + [SBI_SSE_ATTR_ENTRY_PC] = "entry_pc", + [SBI_SSE_ATTR_ENTRY_ARG] = "entry_arg", + [SBI_SSE_ATTR_INTERRUPTED_SEPC] = "interrupted_sepc", + [SBI_SSE_ATTR_INTERRUPTED_FLAGS] = "interrupted_flags", + [SBI_SSE_ATTR_INTERRUPTED_A6] = "interrupted_a6", + [SBI_SSE_ATTR_INTERRUPTED_A7] = "interrupted_a7", +}; + +static const unsigned long ro_attrs[] = { + SBI_SSE_ATTR_STATUS, + SBI_SSE_ATTR_ENTRY_PC, + SBI_SSE_ATTR_ENTRY_ARG, +}; + +static const unsigned long interrupted_attrs[] = { + SBI_SSE_ATTR_INTERRUPTED_SEPC, + SBI_SSE_ATTR_INTERRUPTED_FLAGS, + SBI_SSE_ATTR_INTERRUPTED_A6, + SBI_SSE_ATTR_INTERRUPTED_A7, +}; + +static const unsigned long interrupted_flags[] = { + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPP, + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPIE, + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPELP, + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT, + SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPV, + SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPVP, +}; + +static struct sse_event_info *sse_event_get_info(uint32_t event_id) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(sse_event_infos); i++) { + if (sse_event_infos[i].event_id == event_id) + return &sse_event_infos[i]; + } + + 
assert_msg(false, "Invalid event id: %d", event_id); +} + +static const char *sse_event_name(uint32_t event_id) +{ + return sse_event_get_info(event_id)->name; +} + +static bool sse_event_can_inject(uint32_t event_id) +{ + return sse_event_get_info(event_id)->can_inject; +} + +static struct sbiret sse_get_event_status_field(uint32_t event_id, unsigned long mask, + unsigned long shift, unsigned long *value) +{ + struct sbiret ret; + unsigned long status; + + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, 1, &status); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "Get event status"); + return ret; + } + + *value = (status & mask) >> shift; + + return ret; +} + +static struct sbiret sse_event_get_state(uint32_t event_id, enum sbi_sse_state *state) +{ + unsigned long status = 0; + struct sbiret ret; + + ret = sse_get_event_status_field(event_id, SBI_SSE_ATTR_STATUS_STATE_MASK, + SBI_SSE_ATTR_STATUS_STATE_OFFSET, &status); + *state = status; + + return ret; +} + +static unsigned long sse_global_event_set_current_hart(uint32_t event_id) +{ + struct sbiret ret; + unsigned long current_hart = current_thread_info()->hartid; + + if (!sbi_sse_event_is_global(event_id)) + return SBI_ERR_INVALID_PARAM; + + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &current_hart); + if (sbiret_report_error(&ret, SBI_SUCCESS, "Set preferred hart")) + return ret.error; + + return 0; +} + +static bool sse_check_state(uint32_t event_id, unsigned long expected_state) +{ + struct sbiret ret; + enum sbi_sse_state state; + + ret = sse_event_get_state(event_id, &state); + if (ret.error) + return false; + + return report(state == expected_state, "event status == %ld", expected_state); +} + +static bool sse_event_pending(uint32_t event_id) +{ + bool pending = 0; + + sse_get_event_status_field(event_id, BIT(SBI_SSE_ATTR_STATUS_PENDING_OFFSET), + SBI_SSE_ATTR_STATUS_PENDING_OFFSET, (unsigned long*)&pending); + + return pending; +} + +static void 
*sse_alloc_stack(void) +{ + /* + * We assume that SSE_STACK_SIZE always fit in one page. This page will + * always be decremented before storing anything on it in sse-entry.S. + */ + assert(SSE_STACK_SIZE <= PAGE_SIZE); + + return (alloc_page() + SSE_STACK_SIZE); +} + +static void sse_free_stack(void *stack) +{ + free_page(stack - SSE_STACK_SIZE); +} + +static void sse_read_write_test(uint32_t event_id, unsigned long attr, unsigned long attr_count, + unsigned long *value, long expected_error, const char *str) +{ + struct sbiret ret; + + ret = sbi_sse_read_attrs(event_id, attr, attr_count, value); + sbiret_report_error(&ret, expected_error, "Read %s error", str); + + ret = sbi_sse_write_attrs(event_id, attr, attr_count, value); + sbiret_report_error(&ret, expected_error, "Write %s error", str); +} + +#define ALL_ATTRS_COUNT (SBI_SSE_ATTR_INTERRUPTED_A7 + 1) + +static void sse_test_attrs(uint32_t event_id) +{ + unsigned long value = 0; + struct sbiret ret; + void *ptr; + unsigned long values[ALL_ATTRS_COUNT]; + unsigned int i; + const char *invalid_hart_str; + const char *attr_name; + + report_prefix_push("attrs"); + + for (i = 0; i < ARRAY_SIZE(ro_attrs); i++) { + ret = sbi_sse_write_attrs(event_id, ro_attrs[i], 1, &value); + sbiret_report_error(&ret, SBI_ERR_DENIED, "RO attribute %s not writable", + attr_names[ro_attrs[i]]); + } + + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, ALL_ATTRS_COUNT, values); + sbiret_report_error(&ret, SBI_SUCCESS, "Read multiple attributes"); + + for (i = SBI_SSE_ATTR_STATUS; i <= SBI_SSE_ATTR_INTERRUPTED_A7; i++) { + ret = sbi_sse_read_attrs(event_id, i, 1, &value); + attr_name = attr_names[i]; + + sbiret_report_error(&ret, SBI_SUCCESS, "Read single attribute %s", attr_name); + if (values[i] != value) + report_fail("Attribute 0x%x single value read (0x%lx) differs from the one read with multiple attributes (0x%lx)", + i, value, values[i]); + /* + * Preferred hart reset value is defined by SBI vendor + */ + if(i != 
SBI_SSE_ATTR_PREFERRED_HART) { + /* + * Specification states that injectable bit is implementation dependent + * but other bits are zero-initialized. + */ + if (i == SBI_SSE_ATTR_STATUS) + value &= ~BIT(SBI_SSE_ATTR_STATUS_INJECT_OFFSET); + report(value == 0, "Attribute %s reset value is 0, found %lx", attr_name, + value); + } + } + +#if __riscv_xlen > 32 + value = BIT(32); + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Write invalid prio > 0xFFFFFFFF error"); +#endif + + value = ~SBI_SSE_ATTR_CONFIG_ONESHOT; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Write invalid config value error"); + + if (sbi_sse_event_is_global(event_id)) { + invalid_hart_str = getenv("INVALID_HART_ID"); + if (!invalid_hart_str) + value = 0xFFFFFFFFUL; + else + value = strtoul(invalid_hart_str, NULL, 0); + + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid hart id error"); + } else { + /* Set Hart on local event -> RO */ + value = current_thread_info()->hartid; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); + sbiret_report_error(&ret, SBI_ERR_DENIED, + "Set hart id on local event error"); + } + + /* Set/get flags, sepc, a6, a7 */ + for (i = 0; i < ARRAY_SIZE(interrupted_attrs); i++) { + attr_name = attr_names[interrupted_attrs[i]]; + ret = sbi_sse_read_attrs(event_id, interrupted_attrs[i], 1, &value); + sbiret_report_error(&ret, SBI_SUCCESS, "Get interrupted %s", attr_name); + + value = ARRAY_SIZE(interrupted_attrs) - i; + ret = sbi_sse_write_attrs(event_id, interrupted_attrs[i], 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, + "Set attribute %s invalid state error", attr_name); + } + + sse_read_write_test(event_id, SBI_SSE_ATTR_STATUS, 0, &value, SBI_ERR_INVALID_PARAM, + "attribute attr_count == 
0"); + sse_read_write_test(event_id, SBI_SSE_ATTR_INTERRUPTED_A7 + 1, 1, &value, SBI_ERR_BAD_RANGE, + "invalid attribute"); + + /* Misaligned pointer address */ + ptr = (void *)&value; + ptr += 1; + sse_read_write_test(event_id, SBI_SSE_ATTR_STATUS, 1, ptr, SBI_ERR_INVALID_ADDRESS, + "attribute with invalid address"); + + report_prefix_pop(); +} + +static void sse_test_register_error(uint32_t event_id) +{ + struct sbiret ret; + + report_prefix_push("register"); + + ret = sbi_sse_unregister(event_id); + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "unregister non-registered event"); + + ret = sbi_sse_register_raw(event_id, 0x1, 0); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "register misaligned entry"); + + ret = sbi_sse_register_raw(event_id, (unsigned long)sbi_sse_entry, 0); + sbiret_report_error(&ret, SBI_SUCCESS, "register"); + if (ret.error) + goto done; + + ret = sbi_sse_register_raw(event_id, (unsigned long)sbi_sse_entry, 0); + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "register used event failure"); + + ret = sbi_sse_unregister(event_id); + sbiret_report_error(&ret, SBI_SUCCESS, "unregister"); + +done: + report_prefix_pop(); +} + +struct sse_simple_test_arg { + bool done; + unsigned long expected_a6; + uint32_t event_id; +}; + +#if __riscv_xlen > 32 + +struct alias_test_params { + unsigned long event_id; + unsigned long attr_id; + unsigned long attr_count; + const char *str; +}; + +static void test_alias(uint32_t event_id) +{ + struct alias_test_params *write, *read; + unsigned long write_value, read_value; + struct sbiret ret; + bool err = false; + int r, w; + struct alias_test_params params[] = { + {event_id, SBI_SSE_ATTR_INTERRUPTED_A6, 1, "non aliased"}, + {BIT(32) + event_id, SBI_SSE_ATTR_INTERRUPTED_A6, 1, "aliased event_id"}, + {event_id, BIT(32) + SBI_SSE_ATTR_INTERRUPTED_A6, 1, "aliased attr_id"}, + {event_id, SBI_SSE_ATTR_INTERRUPTED_A6, BIT(32) + 1, "aliased attr_count"}, + }; + + report_prefix_push("alias"); + for (w = 0; w < 
ARRAY_SIZE(params); w++) { + write = &params[w]; + + write_value = 0xDEADBEEF + w; + ret = sbi_sse_write_attrs(write->event_id, write->attr_id, write->attr_count, &write_value); + if (ret.error) + sbiret_report_error(&ret, SBI_SUCCESS, "Write %s, event 0x%lx attr 0x%lx, attr count 0x%lx", write->str, write->event_id, write->attr_id, write->attr_count); + + for (r = 0; r < ARRAY_SIZE(params); r++) { + read = &params[r]; + read_value = 0; + ret = sbi_sse_read_attrs(read->event_id, read->attr_id, read->attr_count, &read_value); + if (ret.error) + sbiret_report_error(&ret, SBI_SUCCESS, + "Read %s, event 0x%lx attr 0x%lx, attr count 0x%lx", read->str, read->event_id, read->attr_id, read->attr_count); + + /* Do not spam output with a lot of reports */ + if (write_value != read_value) { + err = true; + report_fail("Write %s, event 0x%lx attr 0x%lx, attr count 0x%lx value %lx ==" + "Read %s, event 0x%lx attr 0x%lx, attr count 0x%lx value %lx", + write->str, write->event_id, write->attr_id, write->attr_count, + write_value, + read->str, read->event_id, read->attr_id, read->attr_count, + read_value); + } + } + } + + report(!err, "BIT(32) aliasing tests"); + report_prefix_pop(); +} +#endif + +static void sse_simple_handler(void *data, struct pt_regs *regs, unsigned int hartid) +{ + struct sse_simple_test_arg *arg = data; + int i; + struct sbiret ret; + const char *attr_name; + uint32_t event_id = READ_ONCE(arg->event_id), attr; + unsigned long value, prev_value, flags; + unsigned long interrupted_state[ARRAY_SIZE(interrupted_attrs)]; + unsigned long modified_state[ARRAY_SIZE(interrupted_attrs)] = {4, 3, 2, 1}; + unsigned long tmp_state[ARRAY_SIZE(interrupted_attrs)]; + + report((regs->status & SR_SPP) == SR_SPP, "Interrupted S-mode"); + report(hartid == current_thread_info()->hartid, "Hartid correctly passed"); + sse_check_state(event_id, SBI_SSE_STATE_RUNNING); + report(!sse_event_pending(event_id), "Event not pending"); + + /* Read full interrupted state */ + ret = 
sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, + ARRAY_SIZE(interrupted_attrs), interrupted_state); + sbiret_report_error(&ret, SBI_SUCCESS, "Save full interrupted state from handler"); + + /* Write full modified state and read it */ + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, + ARRAY_SIZE(modified_state), modified_state); + sbiret_report_error(&ret, SBI_SUCCESS, + "Write full interrupted state from handler"); + + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, + ARRAY_SIZE(tmp_state), tmp_state); + sbiret_report_error(&ret, SBI_SUCCESS, "Read full modified state from handler"); + + report(memcmp(tmp_state, modified_state, sizeof(modified_state)) == 0, + "Full interrupted state successfully written"); + +#if __riscv_xlen > 32 + test_alias(event_id); +#endif + + /* Restore full saved state */ + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, + ARRAY_SIZE(interrupted_attrs), interrupted_state); + sbiret_report_error(&ret, SBI_SUCCESS, + "Full interrupted state restore from handler"); + + /* We test SBI_SSE_ATTR_INTERRUPTED_FLAGS below with specific flag values */ + for (i = 0; i < ARRAY_SIZE(interrupted_attrs); i++) { + attr = interrupted_attrs[i]; + if (attr == SBI_SSE_ATTR_INTERRUPTED_FLAGS) + continue; + + attr_name = attr_names[attr]; + + ret = sbi_sse_read_attrs(event_id, attr, 1, &prev_value); + sbiret_report_error(&ret, SBI_SUCCESS, "Get attr %s", attr_name); + + value = 0xDEADBEEF + i; + ret = sbi_sse_write_attrs(event_id, attr, 1, &value); + sbiret_report_error(&ret, SBI_SUCCESS, "Set attr %s", attr_name); + + ret = sbi_sse_read_attrs(event_id, attr, 1, &value); + sbiret_report_error(&ret, SBI_SUCCESS, "Get attr %s", attr_name); + report(value == 0xDEADBEEF + i, "Get attr %s, value: 0x%lx", attr_name, + value); + + ret = sbi_sse_write_attrs(event_id, attr, 1, &prev_value); + sbiret_report_error(&ret, SBI_SUCCESS, "Restore attr %s value", attr_name); + } + + /* Test all flags allowed 
for SBI_SSE_ATTR_INTERRUPTED_FLAGS */ + attr = SBI_SSE_ATTR_INTERRUPTED_FLAGS; + ret = sbi_sse_read_attrs(event_id, attr, 1, &prev_value); + sbiret_report_error(&ret, SBI_SUCCESS, "Save interrupted flags"); + + for (i = 0; i < ARRAY_SIZE(interrupted_flags); i++) { + flags = interrupted_flags[i]; + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); + sbiret_report_error(&ret, SBI_SUCCESS, + "Set interrupted flags bit 0x%lx value", flags); + ret = sbi_sse_read_attrs(event_id, attr, 1, &value); + sbiret_report_error(&ret, SBI_SUCCESS, "Get interrupted flags after set"); + report(value == flags, "interrupted flags modified value: 0x%lx", value); + } + + /* Write invalid bit in flag register */ + flags = SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT << 1; + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", + flags); + + flags = BIT(SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT + 1); + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", + flags); + + ret = sbi_sse_write_attrs(event_id, attr, 1, &prev_value); + sbiret_report_error(&ret, SBI_SUCCESS, "Restore interrupted flags"); + + /* Try to change HARTID/Priority while running */ + if (sbi_sse_event_is_global(event_id)) { + value = current_thread_info()->hartid; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "Set hart id while running error"); + } + + value = 0; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "Set priority while running error"); + + value = READ_ONCE(arg->expected_a6); + report(interrupted_state[2] == value, "Interrupted state a6, expected 0x%lx, got 0x%lx", + value, interrupted_state[2]); + + report(interrupted_state[3] == SBI_EXT_SSE, + "Interrupted state 
a7, expected 0x%x, got 0x%lx", SBI_EXT_SSE, + interrupted_state[3]); + + WRITE_ONCE(arg->done, true); +} + +static int sse_test_inject_simple(uint32_t event_id) +{ + unsigned long value, error; + struct sbiret ret; + int err_ret = 1; + struct sse_simple_test_arg test_arg = {.event_id = event_id}; + struct sbi_sse_handler_arg args = { + .handler = sse_simple_handler, + .handler_data = (void *)&test_arg, + .stack = sse_alloc_stack(), + }; + + report_prefix_push("simple"); + + if (!sse_check_state(event_id, SBI_SSE_STATE_UNUSED)) + goto err; + + ret = sbi_sse_register(event_id, &args); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "register")) + goto err; + + if (!sse_check_state(event_id, SBI_SSE_STATE_REGISTERED)) + goto err; + + if (sbi_sse_event_is_global(event_id)) { + /* Be sure global events are targeting the current hart */ + error = sse_global_event_set_current_hart(event_id); + if (error) + goto err; + } + + ret = sbi_sse_enable(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "enable")) + goto err; + + if (!sse_check_state(event_id, SBI_SSE_STATE_ENABLED)) + goto err; + + ret = sbi_sse_hart_mask(); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "hart mask")) + goto err; + + ret = sbi_sse_inject(event_id, current_thread_info()->hartid); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "injection masked")) { + sbi_sse_hart_unmask(); + goto err; + } + + report(READ_ONCE(test_arg.done) == 0, "event masked not handled"); + + /* + * When unmasking the SSE events, we expect it to be injected + * immediately so a6 should be SBI_EXT_SBI_SSE_HART_UNMASK + */ + WRITE_ONCE(test_arg.expected_a6, SBI_EXT_SSE_HART_UNMASK); + ret = sbi_sse_hart_unmask(); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "hart unmask")) { + goto err; + } + + report(READ_ONCE(test_arg.done) == 1, "event unmasked handled"); + WRITE_ONCE(test_arg.done, 0); + WRITE_ONCE(test_arg.expected_a6, SBI_EXT_SSE_INJECT); + + /* Set as oneshot and verify it is disabled */ + ret = 
sbi_sse_disable(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Disable event")) { + /* Nothing we can really do here, event can not be disabled */ + goto err; + } + + value = SBI_SSE_ATTR_CONFIG_ONESHOT; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Set event attribute as ONESHOT")) + goto err; + + ret = sbi_sse_enable(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Enable event")) + goto err; + + ret = sbi_sse_inject(event_id, current_thread_info()->hartid); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "second injection")) + goto err; + + report(READ_ONCE(test_arg.done) == 1, "event handled"); + WRITE_ONCE(test_arg.done, 0); + + if (!sse_check_state(event_id, SBI_SSE_STATE_REGISTERED)) + goto err; + + /* Clear ONESHOT FLAG */ + value = 0; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Clear CONFIG.ONESHOT flag")) + goto err; + + ret = sbi_sse_unregister(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "unregister")) + goto err; + + sse_check_state(event_id, SBI_SSE_STATE_UNUSED); + + err_ret = 0; +err: + sse_free_stack(args.stack); + report_prefix_pop(); + + return err_ret; +} + +struct sse_foreign_cpu_test_arg { + bool done; + unsigned int expected_cpu; + uint32_t event_id; +}; + +static void sse_foreign_cpu_handler(void *data, struct pt_regs *regs, unsigned int hartid) +{ + struct sse_foreign_cpu_test_arg *arg = data; + unsigned int expected_cpu; + + /* For arg content to be visible */ + smp_rmb(); + expected_cpu = READ_ONCE(arg->expected_cpu); + report(expected_cpu == current_thread_info()->cpu, + "Received event on CPU (%d), expected CPU (%d)", current_thread_info()->cpu, + expected_cpu); + + WRITE_ONCE(arg->done, true); + /* For arg update to be visible for other CPUs */ + smp_wmb(); +} + +struct sse_local_per_cpu { + struct sbi_sse_handler_arg args; + struct sbiret ret; + struct 
sse_foreign_cpu_test_arg handler_arg; +}; + +static void sse_register_enable_local(void *data) +{ + struct sbiret ret; + struct sse_local_per_cpu *cpu_args = data; + struct sse_local_per_cpu *cpu_arg = &cpu_args[current_thread_info()->cpu]; + uint32_t event_id = cpu_arg->handler_arg.event_id; + + ret = sbi_sse_register(event_id, &cpu_arg->args); + WRITE_ONCE(cpu_arg->ret, ret); + if (ret.error) + return; + + ret = sbi_sse_enable(event_id); + WRITE_ONCE(cpu_arg->ret, ret); +} + +static void sbi_sse_disable_unregister_local(void *data) +{ + struct sbiret ret; + struct sse_local_per_cpu *cpu_args = data; + struct sse_local_per_cpu *cpu_arg = &cpu_args[current_thread_info()->cpu]; + uint32_t event_id = cpu_arg->handler_arg.event_id; + + ret = sbi_sse_disable(event_id); + WRITE_ONCE(cpu_arg->ret, ret); + if (ret.error) + return; + + ret = sbi_sse_unregister(event_id); + WRITE_ONCE(cpu_arg->ret, ret); +} + +static int sse_test_inject_local(uint32_t event_id) +{ + int cpu; + int err_ret = 1; + struct sbiret ret; + struct sse_local_per_cpu *cpu_args, *cpu_arg; + struct sse_foreign_cpu_test_arg *handler_arg; + + cpu_args = calloc(NR_CPUS, sizeof(struct sbi_sse_handler_arg)); + + report_prefix_push("local_dispatch"); + for_each_online_cpu(cpu) { + cpu_arg = &cpu_args[cpu]; + cpu_arg->handler_arg.event_id = event_id; + cpu_arg->args.stack = sse_alloc_stack(); + cpu_arg->args.handler = sse_foreign_cpu_handler; + cpu_arg->args.handler_data = (void *)&cpu_arg->handler_arg; + } + + on_cpus(sse_register_enable_local, cpu_args); + for_each_online_cpu(cpu) { + cpu_arg = &cpu_args[cpu]; + ret = cpu_arg->ret; + if (ret.error) { + report_fail("CPU failed to register/enable event: %ld", ret.error); + goto err; + } + + handler_arg = &cpu_arg->handler_arg; + WRITE_ONCE(handler_arg->expected_cpu, cpu); + /* For handler_arg content to be visible for other CPUs */ + smp_wmb(); + ret = sbi_sse_inject(event_id, cpus[cpu].hartid); + if (ret.error) { + report_fail("CPU failed to inject event: 
%ld", ret.error); + goto err; + } + } + + for_each_online_cpu(cpu) { + handler_arg = &cpu_args[cpu].handler_arg; + smp_rmb(); + while (!READ_ONCE(handler_arg->done)) { + /* For handler_arg update to be visible */ + smp_rmb(); + cpu_relax(); + } + WRITE_ONCE(handler_arg->done, false); + } + + on_cpus(sbi_sse_disable_unregister_local, cpu_args); + for_each_online_cpu(cpu) { + cpu_arg = &cpu_args[cpu]; + ret = READ_ONCE(cpu_arg->ret); + if (ret.error) { + report_fail("CPU failed to disable/unregister event: %ld", ret.error); + goto err; + } + } + + err_ret = 0; +err: + for_each_online_cpu(cpu) { + cpu_arg = &cpu_args[cpu]; + sse_free_stack(cpu_arg->args.stack); + } + + report_pass("local event dispatch on all CPUs"); + report_prefix_pop(); + + return err_ret; +} + +static int sse_test_inject_global(uint32_t event_id) +{ + unsigned long value; + struct sbiret ret; + unsigned int cpu; + uint64_t timeout; + int err_ret = 1; + struct sse_foreign_cpu_test_arg test_arg = {.event_id = event_id}; + struct sbi_sse_handler_arg args = { + .handler = sse_foreign_cpu_handler, + .handler_data = (void *)&test_arg, + .stack = sse_alloc_stack(), + }; + enum sbi_sse_state state; + + report_prefix_push("global_dispatch"); + + ret = sbi_sse_register(event_id, &args); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Register event")) + goto err; + + for_each_online_cpu(cpu) { + WRITE_ONCE(test_arg.expected_cpu, cpu); + /* For test_arg content to be visible for other CPUs */ + smp_wmb(); + value = cpu; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Set preferred hart")) + goto err; + + ret = sbi_sse_enable(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Enable event")) + goto err; + + ret = sbi_sse_inject(event_id, cpu); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Inject event")) + goto err; + + smp_rmb(); + while (!READ_ONCE(test_arg.done)) { + /* For shared test_arg structure */ + smp_rmb(); + 
cpu_relax(); + } + + WRITE_ONCE(test_arg.done, false); + + timeout = timer_get_cycles() + usec_to_cycles((uint64_t)1000); + /* Wait for event to be back in ENABLED state */ + do { + ret = sse_event_get_state(event_id, &state); + if (ret.error) + goto err; + cpu_relax(); + } while (state != SBI_SSE_STATE_ENABLED || timer_get_cycles() < timeout); + + if (!report(state == SBI_SSE_STATE_ENABLED, + "wait for event to be in enable state")) + goto err; + + ret = sbi_sse_disable(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Disable event")) + goto err; + + report_pass("Global event on CPU %d", cpu); + } + + ret = sbi_sse_unregister(event_id); + sbiret_report_error(&ret, SBI_SUCCESS, "Unregister event"); + + err_ret = 0; + +err: + sse_free_stack(args.stack); + report_prefix_pop(); + + return err_ret; +} + +struct priority_test_arg { + uint32_t event_id; + bool called; + u32 prio; + struct priority_test_arg *next_event_arg; + void (*check_func)(struct priority_test_arg *arg); +}; + +static void sse_hi_priority_test_handler(void *arg, struct pt_regs *regs, + unsigned int hartid) +{ + struct priority_test_arg *targ = arg; + struct priority_test_arg *next = targ->next_event_arg; + + targ->called = true; + if (next) { + sbi_sse_inject(next->event_id, current_thread_info()->hartid); + + report(!sse_event_pending(next->event_id), "Higher priority event is not pending"); + report(next->called, "Higher priority event was handled"); + } +} + +static void sse_low_priority_test_handler(void *arg, struct pt_regs *regs, + unsigned int hartid) +{ + struct priority_test_arg *targ = arg; + struct priority_test_arg *next = targ->next_event_arg; + + targ->called = true; + + if (next) { + sbi_sse_inject(next->event_id, current_thread_info()->hartid); + + report(sse_event_pending(next->event_id), "Lower priority event is pending"); + report(!next->called, "Lower priority event %s was not handle before %s", + sse_event_name(next->event_id), sse_event_name(targ->event_id)); + } +} + 
+static void sse_test_injection_priority_arg(struct priority_test_arg *in_args, + unsigned int in_args_size, + sbi_sse_handler_fn handler, + const char *test_name) +{ + unsigned int i; + unsigned long value, uret; + struct sbiret ret; + uint32_t event_id; + struct priority_test_arg *arg; + unsigned int args_size = 0; + struct sbi_sse_handler_arg event_args[in_args_size]; + struct priority_test_arg *args[in_args_size]; + void *stack; + struct sbi_sse_handler_arg *event_arg; + + report_prefix_push(test_name); + + for (i = 0; i < in_args_size; i++) { + arg = &in_args[i]; + event_id = arg->event_id; + if (!sse_event_can_inject(event_id)) + continue; + + args[args_size] = arg; + args_size++; + event_args->stack = 0; + } + + if (!args_size) { + report_skip("No injectable events"); + goto skip; + } + + for (i = 0; i < args_size; i++) { + arg = args[i]; + event_id = arg->event_id; + stack = sse_alloc_stack(); + + event_arg = &event_args[i]; + event_arg->handler = handler; + event_arg->handler_data = (void *)arg; + event_arg->stack = stack; + + if (i < (args_size - 1)) + arg->next_event_arg = args[i + 1]; + else + arg->next_event_arg = NULL; + + /* Be sure global events are targeting the current hart */ + if (sbi_sse_event_is_global(event_id)) { + uret = sse_global_event_set_current_hart(event_id); + if (uret) + goto err; + } + + ret = sbi_sse_register(event_id, event_arg); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "register event 0x%x", event_id); + goto err; + } + + value = arg->prio; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "set event 0x%x priority", event_id); + goto err; + } + ret = sbi_sse_enable(event_id); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "enable event 0x%x", event_id); + goto err; + } + } + + /* Inject first event */ + ret = sbi_sse_inject(args[0]->event_id, current_thread_info()->hartid); + sbiret_report_error(&ret, SBI_SUCCESS, 
"injection"); + +err: + for (i = 0; i < args_size; i++) { + arg = args[i]; + event_id = arg->event_id; + + report(arg->called, "Event %s handler called", sse_event_name(arg->event_id)); + + ret = sbi_sse_disable(event_id); + if (ret.error) + sbiret_report_error(&ret, SBI_SUCCESS, "disable event 0x%x", event_id); + + sbi_sse_unregister(event_id); + if (ret.error) + sbiret_report_error(&ret, SBI_SUCCESS, "unregister event 0x%x", event_id); + + event_arg = &event_args[i]; + if (event_arg->stack) + sse_free_stack(event_arg->stack); + } + +skip: + report_prefix_pop(); +} + +static struct priority_test_arg hi_prio_args[] = { + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE}, + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE}, + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW}, + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP}, + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS}, +}; + +static struct priority_test_arg low_prio_args[] = { + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP}, + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW}, + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE}, + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE}, +}; + +static struct priority_test_arg prio_args[] = { + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, .prio = 5}, + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, .prio = 10}, + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, .prio = 12}, + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, .prio = 15}, + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, .prio = 20}, + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, .prio = 22}, + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, .prio = 25}, +}; + +static struct 
priority_test_arg same_prio_args[] = { + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, .prio = 0}, + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, .prio = 0}, + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, .prio = 10}, + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, .prio = 10}, + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, .prio = 10}, + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, .prio = 20}, + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, .prio = 20}, +}; + +static void sse_test_injection_priority(void) +{ + report_prefix_push("prio"); + + sse_test_injection_priority_arg(hi_prio_args, ARRAY_SIZE(hi_prio_args), + sse_hi_priority_test_handler, "high"); + + sse_test_injection_priority_arg(low_prio_args, ARRAY_SIZE(low_prio_args), + sse_low_priority_test_handler, "low"); + + sse_test_injection_priority_arg(prio_args, ARRAY_SIZE(prio_args), + sse_low_priority_test_handler, "changed"); + + sse_test_injection_priority_arg(same_prio_args, ARRAY_SIZE(same_prio_args), + sse_low_priority_test_handler, "same_prio_args"); + + report_prefix_pop(); +} + +static void test_invalid_event_id(unsigned long event_id) +{ + struct sbiret ret; + unsigned long value = 0; + + ret = sbi_sse_register_raw(event_id, (unsigned long) sbi_sse_entry, 0); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "register event_id 0x%lx", event_id); + + ret = sbi_sse_unregister(event_id); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "unregister event_id 0x%lx", event_id); + + ret = sbi_sse_enable(event_id); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "enable event_id 0x%lx", event_id); + + ret = sbi_sse_disable(event_id); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "disable event_id 0x%lx", event_id); + + ret = sbi_sse_inject(event_id, 0); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "inject event_id 0x%lx", event_id); + + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "write attr 
event_id 0x%lx", event_id); + + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "read attr event_id 0x%lx", event_id); +} + +static void sse_test_invalid_event_id(void) +{ + + report_prefix_push("event_id"); + + test_invalid_event_id(SBI_SSE_EVENT_LOCAL_RESERVED_0_START); + + report_prefix_pop(); +} + +static void sse_check_event_availability(uint32_t event_id, bool *can_inject, bool *supported) +{ + unsigned long status; + struct sbiret ret; + + *can_inject = false; + *supported = false; + + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, 1, &status); + if (ret.error != SBI_SUCCESS && ret.error != SBI_ERR_NOT_SUPPORTED) { + report_fail("Get event status != SBI_SUCCESS && != SBI_ERR_NOT_SUPPORTED: %ld", ret.error); + return; + } + if (ret.error == SBI_ERR_NOT_SUPPORTED) + return; + + if (!ret.error) + *supported = true; + + *can_inject = (status >> SBI_SSE_ATTR_STATUS_INJECT_OFFSET) & 1; +} + +static void sse_secondary_boot_and_unmask(void *data) +{ + sbi_sse_hart_unmask(); +} + +static void sse_check_mask(void) +{ + struct sbiret ret; + + /* Upon boot, event are masked, check that */ + ret = sbi_sse_hart_mask(); + sbiret_report_error(&ret, SBI_ERR_ALREADY_STOPPED, "hart mask at boot time"); + + ret = sbi_sse_hart_unmask(); + sbiret_report_error(&ret, SBI_SUCCESS, "hart unmask"); + ret = sbi_sse_hart_unmask(); + sbiret_report_error(&ret, SBI_ERR_ALREADY_STARTED, "hart unmask twice error"); + + ret = sbi_sse_hart_mask(); + sbiret_report_error(&ret, SBI_SUCCESS, "hart mask"); + ret = sbi_sse_hart_mask(); + sbiret_report_error(&ret, SBI_ERR_ALREADY_STOPPED, "hart mask twice"); +} + +static int run_inject_test(struct sse_event_info *info) +{ + unsigned long event_id = info->event_id; + + if (!info->can_inject) { + report_skip("Event does not support injection, skipping injection tests"); + return 0; + } + + if (sse_test_inject_simple(event_id)) + return 1; + + if 
(sbi_sse_event_is_global(event_id)) + return sse_test_inject_global(event_id); + else + return sse_test_inject_local(event_id); +} + +void check_sse(void) +{ + struct sse_event_info *info; + unsigned long i, event_id; + bool supported; + + report_prefix_push("sse"); + + if (!sbi_probe(SBI_EXT_SSE)) { + report_skip("extension not available"); + report_prefix_pop(); + return; + } + + sse_check_mask(); + + /* + * Dummy wakeup of all processors since some of them will be targeted + * by global events without going through the wakeup call as well as + * unmasking SSE events on all harts + */ + on_cpus(sse_secondary_boot_and_unmask, NULL); + + sse_test_invalid_event_id(); + + for (i = 0; i < ARRAY_SIZE(sse_event_infos); i++) { + info = &sse_event_infos[i]; + event_id = info->event_id; + report_prefix_push(info->name); + sse_check_event_availability(event_id, &info->can_inject, &supported); + if (!supported) { + report_skip("Event is not supported, skipping tests"); + report_prefix_pop(); + continue; + } + + sse_test_attrs(event_id); + sse_test_register_error(event_id); + + if (run_inject_test(info)) { + report_skip("Event test failed, event state unreliable"); + info->can_inject = false; + } + + report_prefix_pop(); + } + + sse_test_injection_priority(); + + report_prefix_pop(); +} diff --git a/riscv/sbi.c b/riscv/sbi.c index 0404bb81..478cb35d 100644 --- a/riscv/sbi.c +++ b/riscv/sbi.c @@ -32,6 +32,7 @@ #define HIGH_ADDR_BOUNDARY ((phys_addr_t)1 << 32) +void check_sse(void); void check_fwft(void); static long __labs(long a) @@ -1567,6 +1568,7 @@ int main(int argc, char **argv) check_hsm(); check_dbcn(); check_susp(); + check_sse(); check_fwft(); return report_summary(); -- 2.47.2 From s.paone at virtualopensystems.com Mon Mar 10 03:06:08 2025 From: s.paone at virtualopensystems.com (Samuele Paone) Date: Mon, 10 Mar 2025 11:06:08 +0100 Subject: [QUESTION] Understanding in-kernel irqchip (APLIC + IMSIC) behavior with multiple devices and interrupt handling In-Reply-To: 
<5e839bfd-16c0-45cc-b01a-6ae69f6a22cd@virtualopensystems.com> References: <5e839bfd-16c0-45cc-b01a-6ae69f6a22cd@virtualopensystems.com> Message-ID: Hello, I was just wondering if you had the chance to take a look at this? Regards, Samuele On 2/19/25 10:19 AM, Samuele Paone wrote: > Hello, > thanks for the reply > > I am asking that because we suspect that there is > a small error in the way edge triggered interrupts are emulated when a > device performs a write on the irqfd. > > The point is that in `virt/kvm/eventfd.c` in function `irqfd_wakeup`: > ``` > if (kvm_arch_set_irq_inatomic(&irq, kvm, > KVM_USERSPACE_IRQ_SOURCE_ID, 1, > false) == -EWOULDBLOCK) > schedule_work(&irqfd->inject); > ``` > The level is always set to 1. In the case of an edge triggered interrupt, > I would expect the behavior of `&irqfd->inject`: > ``` > if (!irqfd->resampler) { > kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 1, > false); > kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 0, > false); > } else > kvm_set_irq(kvm, KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID, > irqfd->gsi, 1, false); > } > ``` > > The point is, that can never happen since `kvm_arch_set_irq_inatomic` is > implemented > like this: > ``` > int kvm_arch_set_irq_inatomic(struct kvm_kernel_irq_routing_entry *e, > struct kvm *kvm, int irq_source_id, int level, > bool line_status) > { > if (!level) > return -EWOULDBLOCK; > ... > } > ``` > Regards, > Samuele > > On 1/31/25 6:06 PM, Anup Patel wrote: >> On Tue, Jan 7, 2025 at 10:36 PM Samuele Paone >> wrote: >>> Hello everyone, >>> I have a few questions regarding the in-kernel irqchip >>> implementation, specifically in the context of APLIC and IMSIC for >>> RISC-V: >>> >>> 1.
**Behavior with Multiple Devices**: Does the current implementation >>> support the creation of a VM with both PCI devices (that use MSI) and >>> MMIO devices (that use wired interrupts)? >> Yes, it's supported. >> >> MSIs from both PCIe and Platform devices are processed by IMSIC. >> Wired interrupts are handled through APLIC MSI-mode where the >> APLIC converts interrupt state change into MSI writes. >> >>> If such a scenario is not supported, how can I configure an MMIO >>> device to use MSI interrupts? Would it all boil down to some device >>> tree configuration >>> (i.e., the MMIO device tree node "interrupts" property referencing the >>> IMSIC rather than the APLIC directly)? >> Platform (or MMIO) devices with MSI support don't need the "interrupts" >> DT property, instead the "msi-parent" DT property is used to point to >> the IMSICs. >> >>> 2. I find it unclear how the variable `level` is used inside function >>> `kvm_riscv_aia_aplic_inject` in `arch/riscv/kvm/aia_aplic.c` >>> >>> ``` >>> switch (irqd->sourcecfg & APLIC_SOURCECFG_SM_MASK) { >>> case APLIC_SOURCECFG_SM_EDGE_RISE: >>> if (level && !(irqd->state & APLIC_IRQ_STATE_INPUT) && >>> !(irqd->state & APLIC_IRQ_STATE_PENDING)) >>> irqd->state |= APLIC_IRQ_STATE_PENDING; >>> level = 0; >>> break; >>> case APLIC_SOURCECFG_SM_EDGE_FALL: >>> if (!level && (irqd->state & APLIC_IRQ_STATE_INPUT) && >>> 
!(irqd->state & APLIC_IRQ_STATE_PENDING)) >>> irqd->state |= APLIC_IRQ_STATE_PENDING; >>> break; >>> case APLIC_SOURCECFG_SM_LEVEL_HIGH: >>> if (level && !(irqd->state & APLIC_IRQ_STATE_PENDING)) >>> irqd->state |= APLIC_IRQ_STATE_PENDING; >>> break; >>> case APLIC_SOURCECFG_SM_LEVEL_LOW: >>> if (!level && !(irqd->state & APLIC_IRQ_STATE_PENDING)) >>> irqd->state |= APLIC_IRQ_STATE_PENDING; >>> break; >>> } >>> >>> if (level) >>> irqd->state |= APLIC_IRQ_STATE_INPUT; >>> else >>> irqd->state &= ~APLIC_IRQ_STATE_INPUT; >>> ``` >>> >>> Considering `edge_rise` interrupts, as far as I can see, the >>> APLIC_IRQ_STATE_INPUT is set only in the code above >>> but prevents edge_rise interrupts from being set pending. >>> Moreover, in a system emulator implementation that uses in-kernel >>> IRQCHIP and >>> IrqFd, this variable is hard-coded (`virt/kvm/eventfd.c` in function >>> `irqfd_wakeup` when it calls `kvm_arch_set_irq_inatomic`) >>> in the kernel as 1. >> The INPUT bit in the IRQ state represents the raw input >> state whereas the PENDING bit in the IRQ state represents >> that given IRQ needs to be taken by CPU. >> >>> 3. **MSI-only Support in the irqchip**: The in-kernel irqchip >>> documentation states that it only supports MSI. However, the RISC-V >>> hardware documentation mentions that APLIC can take wired interrupts >>> as input and convert them to MSI to forward to IMSIC. Could someone >>> clarify what is meant by "MSI-only" support in this context? Are there >>> specific constraints or implementation details that limit the irqchip >>> to only support MSI for virtualization? >> The KVM in-kernel AIA virtualization only has HW acceleration for >> IMSIC whereas the APLIC MSI-mode is still trap-n-emulated. >> >>> 4. **Device tree node differences**: How would a device tree node >>> backed >>> by MSI interrupt differ from a device backed by a wired interrupt? >> It is differentiated using "msi-parent" and "interrupts" DT property.
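To make question 2 above more concrete, here is a small standalone C model of the rising-edge case from the quoted `kvm_riscv_aia_aplic_inject` snippet. It is only a sketch under assumptions: the bit values of `IRQ_STATE_PENDING` and `IRQ_STATE_INPUT` and the helper names `edge_rise_inject()` and `take_irq()` are made up for illustration, the `level = 0;` line that appears in the quoted snippet is left out, and delivery of the interrupt is reduced to clearing the pending bit.

```c
#include <stdbool.h>

/* Illustrative bit values only, not the kernel's definitions. */
#define IRQ_STATE_PENDING	(1u << 0)
#define IRQ_STATE_INPUT		(1u << 1)

/*
 * Hypothetical model of the rising-edge case: PENDING is latched only
 * when the input goes from 0 to 1 and nothing is pending yet.
 */
static void edge_rise_inject(unsigned int *state, bool level)
{
	if (level && !(*state & IRQ_STATE_INPUT) &&
	    !(*state & IRQ_STATE_PENDING))
		*state |= IRQ_STATE_PENDING;

	/* Track the raw input level, as at the end of the quoted snippet. */
	if (level)
		*state |= IRQ_STATE_INPUT;
	else
		*state &= ~IRQ_STATE_INPUT;
}

/* Hypothetical stand-in for the interrupt being taken: clears PENDING. */
static void take_irq(unsigned int *state)
{
	*state &= ~IRQ_STATE_PENDING;
}
```

In this model, two calls to `edge_rise_inject(&state, true)` with a `take_irq()` in between latch the interrupt only once, because the INPUT bit stays set. A call with `level` false in between re-arms it, which is the behavior one would expect a caller that always passes level 1 to miss.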
>> >> Regards, >> Anup From cleger at rivosinc.com Mon Mar 10 08:12:07 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 10 Mar 2025 16:12:07 +0100 Subject: [PATCH v3 00/17] riscv: add SBI FWFT misaligned exception delegation support Message-ID: <20250310151229.2365992-1-cleger@rivosinc.com> The SBI Firmware Feature extension allows S-mode to request some specific features (either hardware or software) to be enabled. This series uses this extension to request misaligned access exception delegation to S-mode in order to let the kernel handle it. It also adds support for the KVM FWFT SBI extension based on the misaligned access handling infrastructure. FWFT SBI extension is part of the SBI V3.0 specifications [1]. It can be tested using the qemu provided at [2] which contains the series from [3]. kvm-unit-tests [4] can be used inside kvm to test the correct delegation of misaligned exceptions. Upstream OpenSBI can be used. Note: Since SBI V3.0 is not yet ratified, the FWFT extension API is split between interface only and implementation, which makes it possible to pick only the interface, which does not have hard dependencies on SBI. The tests can be run using the included kselftest: $ qemu-system-riscv64 \ -cpu rv64,trap-misaligned-access=true,v=true \ -M virt \ -m 1024M \ -bios fw_dynamic.bin \ -kernel Image ... # ./misaligned TAP version 13 1..23 # Starting 23 tests from 1 test cases. # RUN global.gp_load_lh ... # OK global.gp_load_lh ok 1 global.gp_load_lh # RUN global.gp_load_lhu ... # OK global.gp_load_lhu ok 2 global.gp_load_lhu # RUN global.gp_load_lw ... # OK global.gp_load_lw ok 3 global.gp_load_lw # RUN global.gp_load_lwu ... # OK global.gp_load_lwu ok 4 global.gp_load_lwu # RUN global.gp_load_ld ... # OK global.gp_load_ld ok 5 global.gp_load_ld # RUN global.gp_load_c_lw ... # OK global.gp_load_c_lw ok 6 global.gp_load_c_lw # RUN global.gp_load_c_ld ... # OK global.gp_load_c_ld ok 7 global.gp_load_c_ld # RUN global.gp_load_c_ldsp ...
# OK global.gp_load_c_ldsp ok 8 global.gp_load_c_ldsp # RUN global.gp_load_sh ... # OK global.gp_load_sh ok 9 global.gp_load_sh # RUN global.gp_load_sw ... # OK global.gp_load_sw ok 10 global.gp_load_sw # RUN global.gp_load_sd ... # OK global.gp_load_sd ok 11 global.gp_load_sd # RUN global.gp_load_c_sw ... # OK global.gp_load_c_sw ok 12 global.gp_load_c_sw # RUN global.gp_load_c_sd ... # OK global.gp_load_c_sd ok 13 global.gp_load_c_sd # RUN global.gp_load_c_sdsp ... # OK global.gp_load_c_sdsp ok 14 global.gp_load_c_sdsp # RUN global.fpu_load_flw ... # OK global.fpu_load_flw ok 15 global.fpu_load_flw # RUN global.fpu_load_fld ... # OK global.fpu_load_fld ok 16 global.fpu_load_fld # RUN global.fpu_load_c_fld ... # OK global.fpu_load_c_fld ok 17 global.fpu_load_c_fld # RUN global.fpu_load_c_fldsp ... # OK global.fpu_load_c_fldsp ok 18 global.fpu_load_c_fldsp # RUN global.fpu_store_fsw ... # OK global.fpu_store_fsw ok 19 global.fpu_store_fsw # RUN global.fpu_store_fsd ... # OK global.fpu_store_fsd ok 20 global.fpu_store_fsd # RUN global.fpu_store_c_fsd ... # OK global.fpu_store_c_fsd ok 21 global.fpu_store_c_fsd # RUN global.fpu_store_c_fsdsp ... # OK global.fpu_store_c_fsdsp ok 22 global.fpu_store_c_fsdsp # RUN global.gen_sigbus ... 
[12797.988647] misaligned[618]: unhandled signal 7 code 0x1 at 0x0000000000014dc0 in misaligned[4dc0,10000+76000] [12797.988990] CPU: 0 UID: 0 PID: 618 Comm: misaligned Not tainted 6.13.0-rc6-00008-g4ec4468967c9-dirty #51 [12797.989169] Hardware name: riscv-virtio,qemu (DT) [12797.989264] epc : 0000000000014dc0 ra : 0000000000014d00 sp : 00007fffe165d100 [12797.989407] gp : 000000000008f6e8 tp : 0000000000095760 t0 : 0000000000000008 [12797.989544] t1 : 00000000000965d8 t2 : 000000000008e830 s0 : 00007fffe165d160 [12797.989692] s1 : 000000000000001a a0 : 0000000000000000 a1 : 0000000000000002 [12797.989831] a2 : 0000000000000000 a3 : 0000000000000000 a4 : ffffffffdeadbeef [12797.989964] a5 : 000000000008ef61 a6 : 626769735f6e0000 a7 : fffffffffffff000 [12797.990094] s2 : 0000000000000001 s3 : 00007fffe165d838 s4 : 00007fffe165d848 [12797.990238] s5 : 000000000000001a s6 : 0000000000010442 s7 : 0000000000010200 [12797.990391] s8 : 000000000000003a s9 : 0000000000094508 s10: 0000000000000000 [12797.990526] s11: 0000555567460668 t3 : 00007fffe165d070 t4 : 00000000000965d0 [12797.990656] t5 : fefefefefefefeff t6 : 0000000000000073 [12797.990756] status: 0000000200004020 badaddr: 000000000008ef61 cause: 0000000000000006 [12797.990911] Code: 8793 8791 3423 fcf4 3783 fc84 c737 dead 0713 eef7 (c398) 0001 # OK global.gen_sigbus ok 23 global.gen_sigbus # PASSED: 23 / 23 tests passed. # Totals: pass:23 fail:0 xfail:0 xpass:0 skip:0 error:0 With kvm-tools: # lkvm run -k sbi.flat -m 128 Info: # lkvm run -k sbi.flat -m 128 -c 1 --name guest-97 Info: Removed ghost socket file "/root/.lkvm//guest-97.sock". ########################################################################## # kvm-unit-tests ########################################################################## ... 
[test messages elided] PASS: sbi: fwft: FWFT extension probing no error PASS: sbi: fwft: get/set reserved feature 0x6 error == SBI_ERR_DENIED PASS: sbi: fwft: get/set reserved feature 0x3fffffff error == SBI_ERR_DENIED PASS: sbi: fwft: get/set reserved feature 0x80000000 error == SBI_ERR_DENIED PASS: sbi: fwft: get/set reserved feature 0xbfffffff error == SBI_ERR_DENIED PASS: sbi: fwft: misaligned_deleg: Get misaligned deleg feature no error PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature invalid value error PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature invalid value error PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature value no error PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature value 0 PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature value no error PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature value 1 PASS: sbi: fwft: misaligned_deleg: Verify misaligned load exception trap in supervisor SUMMARY: 50 tests, 2 unexpected failures, 12 skipped This series is available at [6]. 
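The reserved feature IDs probed in the kvm-unit-tests output above (0x6, 0x3fffffff, 0x80000000 and 0xbfffffff) are the boundaries of the reserved ranges defined in patch 01. As a rough illustration only, the standalone C sketch below classifies a feature ID using those ranges. The constant values are taken from the sbi.h additions in patch 01 (with `BIT()` expanded by hand); the helper names `fwft_is_global()`, `fwft_is_platform()` and `fwft_is_reserved()` are hypothetical and are not part of the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the sbi.h additions in patch 01 of this series. */
#define SBI_FWFT_POINTER_MASKING_PMLEN	0x5u
#define SBI_FWFT_PLATFORM_FEATURE_BIT	(1u << 30)
#define SBI_FWFT_GLOBAL_FEATURE_BIT	(1u << 31)

/* Hypothetical helper: bit 31 selects the global feature space. */
static bool fwft_is_global(uint32_t feature)
{
	return (feature & SBI_FWFT_GLOBAL_FEATURE_BIT) != 0;
}

/* Hypothetical helper: bit 30 selects the platform-specific space. */
static bool fwft_is_platform(uint32_t feature)
{
	return (feature & SBI_FWFT_PLATFORM_FEATURE_BIT) != 0;
}

/*
 * Hypothetical helper: an ID is reserved when it is not platform-specific
 * and is not one of the local features 0x0..0x5 listed in patch 01.
 * Patch 01 lists no global, non-platform feature, so that whole range
 * (0x80000000..0xbfffffff) counts as reserved here.
 */
static bool fwft_is_reserved(uint32_t feature)
{
	if (fwft_is_platform(feature))
		return false;
	if (fwft_is_global(feature))
		return true;
	return feature > SBI_FWFT_POINTER_MASKING_PMLEN;
}
```

With these helpers, all four IDs from the test output fall in a reserved range, while 0x0 (misaligned exception delegation) and the platform ranges starting at 0x40000000 and 0xc0000000 do not.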
Link: https://github.com/riscv-non-isa/riscv-sbi-doc/releases/download/vv3.0-rc2/riscv-sbi.pdf [1] Link: https://github.com/rivosinc/qemu/tree/dev/cleger/misaligned [2] Link: https://lore.kernel.org/all/20241211211933.198792-3-fkonrad@amd.com/T/ [3] Link: https://github.com/clementleger/kvm-unit-tests/tree/dev/cleger/fwft_v1 [4] Link: https://github.com/clementleger/unaligned_test [5] Link: https://github.com/rivosinc/linux/tree/dev/cleger/fwft_v1 [6] --- V3: - Added comment about kvm sbi fwft supported/set/get callback requirements - Move struct kvm_sbi_fwft_feature in kvm_sbi_fwft.c - Add a FWFT interface V2: - Added Kselftest for misaligned testing - Added get_user() usage instead of __get_user() - Reenable interrupt when possible in misaligned access handling - Document that riscv supports unaligned-traps - Fix KVM extension state when an init function is present - Rework SBI misaligned accesses trap delegation code - Added support for CPU hotplugging - Added KVM SBI reset callback - Added reset for KVM SBI FWFT lock - Return SBI_ERR_DENIED_LOCKED when LOCK flag is set Clément Léger (17): riscv: add Firmware Feature (FWFT) SBI extensions definitions riscv: sbi: add FWFT extension interface riscv: sbi: add SBI FWFT extension calls riscv: misaligned: request misaligned exception from SBI riscv: misaligned: use on_each_cpu() for scalar misaligned access probing riscv: misaligned: use correct CONFIG_ ifdef for misaligned_access_speed riscv: misaligned: move emulated access uniformity check in a function riscv: misaligned: add a function to check misalign trap delegability riscv: misaligned: factorize trap handling riscv: misaligned: enable IRQs while handling misaligned accesses riscv: misaligned: use get_user() instead of __get_user() Documentation/sysctl: add riscv to unaligned-trap supported archs selftests: riscv: add misaligned access testing RISC-V: KVM: add SBI extension init()/deinit() functions RISC-V: KVM: add SBI extension reset callback RISC-V: KVM:
add support for FWFT SBI extension RISC-V: KVM: add support for SBI_FWFT_MISALIGNED_DELEG Documentation/admin-guide/sysctl/kernel.rst | 4 +- arch/riscv/include/asm/cpufeature.h | 8 +- arch/riscv/include/asm/kvm_host.h | 5 +- arch/riscv/include/asm/kvm_vcpu_sbi.h | 12 + arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h | 31 +++ arch/riscv/include/asm/sbi.h | 38 +++ arch/riscv/include/uapi/asm/kvm.h | 1 + arch/riscv/kernel/sbi.c | 123 +++++++++ arch/riscv/kernel/traps.c | 57 ++-- arch/riscv/kernel/traps_misaligned.c | 119 +++++++- arch/riscv/kernel/unaligned_access_speed.c | 11 +- arch/riscv/kvm/Makefile | 1 + arch/riscv/kvm/vcpu.c | 7 +- arch/riscv/kvm/vcpu_sbi.c | 57 ++++ arch/riscv/kvm/vcpu_sbi_fwft.c | 251 +++++++++++++++++ arch/riscv/kvm/vcpu_sbi_sta.c | 3 +- .../selftests/riscv/misaligned/.gitignore | 1 + .../selftests/riscv/misaligned/Makefile | 12 + .../selftests/riscv/misaligned/common.S | 33 +++ .../testing/selftests/riscv/misaligned/fpu.S | 180 +++++++++++++ tools/testing/selftests/riscv/misaligned/gp.S | 103 +++++++ .../selftests/riscv/misaligned/misaligned.c | 254 ++++++++++++++++++ 22 files changed, 1264 insertions(+), 47 deletions(-) create mode 100644 arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h create mode 100644 arch/riscv/kvm/vcpu_sbi_fwft.c create mode 100644 tools/testing/selftests/riscv/misaligned/.gitignore create mode 100644 tools/testing/selftests/riscv/misaligned/Makefile create mode 100644 tools/testing/selftests/riscv/misaligned/common.S create mode 100644 tools/testing/selftests/riscv/misaligned/fpu.S create mode 100644 tools/testing/selftests/riscv/misaligned/gp.S create mode 100644 tools/testing/selftests/riscv/misaligned/misaligned.c -- 2.47.2 From cleger at rivosinc.com Mon Mar 10 08:12:08 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 10 Mar 2025 16:12:08 +0100 Subject: [PATCH v3 01/17] riscv: add Firmware Feature (FWFT) SBI extensions definitions In-Reply-To: 
<20250310151229.2365992-1-cleger@rivosinc.com> References: <20250310151229.2365992-1-cleger@rivosinc.com> Message-ID: <20250310151229.2365992-2-cleger@rivosinc.com> The Firmware Features extension (FWFT) was added as part of the SBI 3.0 specification. Add SBI definitions to use this extension. Signed-off-by: Clément Léger Reviewed-by: Samuel Holland Tested-by: Samuel Holland Reviewed-by: Deepak Gupta --- arch/riscv/include/asm/sbi.h | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h index 3d250824178b..bb077d0c912f 100644 --- a/arch/riscv/include/asm/sbi.h +++ b/arch/riscv/include/asm/sbi.h @@ -35,6 +35,7 @@ enum sbi_ext_id { SBI_EXT_DBCN = 0x4442434E, SBI_EXT_STA = 0x535441, SBI_EXT_NACL = 0x4E41434C, + SBI_EXT_FWFT = 0x46574654, /* Experimentals extensions must lie within this range */ SBI_EXT_EXPERIMENTAL_START = 0x08000000, @@ -402,6 +403,33 @@ enum sbi_ext_nacl_feature { #define SBI_NACL_SHMEM_SRET_X(__i) ((__riscv_xlen / 8) * (__i)) #define SBI_NACL_SHMEM_SRET_X_LAST 31 +/* SBI function IDs for FW feature extension */ +#define SBI_EXT_FWFT_SET 0x0 +#define SBI_EXT_FWFT_GET 0x1 + +enum sbi_fwft_feature_t { + SBI_FWFT_MISALIGNED_EXC_DELEG = 0x0, + SBI_FWFT_LANDING_PAD = 0x1, + SBI_FWFT_SHADOW_STACK = 0x2, + SBI_FWFT_DOUBLE_TRAP = 0x3, + SBI_FWFT_PTE_AD_HW_UPDATING = 0x4, + SBI_FWFT_POINTER_MASKING_PMLEN = 0x5, + SBI_FWFT_LOCAL_RESERVED_START = 0x6, + SBI_FWFT_LOCAL_RESERVED_END = 0x3fffffff, + SBI_FWFT_LOCAL_PLATFORM_START = 0x40000000, + SBI_FWFT_LOCAL_PLATFORM_END = 0x7fffffff, + + SBI_FWFT_GLOBAL_RESERVED_START = 0x80000000, + SBI_FWFT_GLOBAL_RESERVED_END = 0xbfffffff, + SBI_FWFT_GLOBAL_PLATFORM_START = 0xc0000000, + SBI_FWFT_GLOBAL_PLATFORM_END = 0xffffffff, +}; + +#define SBI_FWFT_PLATFORM_FEATURE_BIT BIT(30) +#define SBI_FWFT_GLOBAL_FEATURE_BIT BIT(31) + +#define SBI_FWFT_SET_FLAG_LOCK BIT(0) + /* SBI spec version fields */ #define SBI_SPEC_VERSION_DEFAULT 0x1
#define SBI_SPEC_VERSION_MAJOR_SHIFT 24 @@ -419,6 +447,11 @@ enum sbi_ext_nacl_feature { #define SBI_ERR_ALREADY_STARTED -7 #define SBI_ERR_ALREADY_STOPPED -8 #define SBI_ERR_NO_SHMEM -9 +#define SBI_ERR_INVALID_STATE -10 +#define SBI_ERR_BAD_RANGE -11 +#define SBI_ERR_TIMEOUT -12 +#define SBI_ERR_IO -13 +#define SBI_ERR_DENIED_LOCKED -14 extern unsigned long sbi_spec_version; struct sbiret { -- 2.47.2 From cleger at rivosinc.com Mon Mar 10 08:12:09 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 10 Mar 2025 16:12:09 +0100 Subject: [PATCH v3 02/17] riscv: sbi: add FWFT extension interface In-Reply-To: <20250310151229.2365992-1-cleger@rivosinc.com> References: <20250310151229.2365992-1-cleger@rivosinc.com> Message-ID: <20250310151229.2365992-3-cleger@rivosinc.com> This SBI extension enables supervisor mode to control features that are under M-mode control (for instance, Svadu menvcfg ADUE bit, Ssdbltrp DTE, etc.). Signed-off-by: Clément Léger --- arch/riscv/include/asm/sbi.h | 5 ++ arch/riscv/kernel/sbi.c | 97 ++++++++++++++++++++++++++++++++++++ 2 files changed, 102 insertions(+) diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h index bb077d0c912f..fc87c609c11a 100644 --- a/arch/riscv/include/asm/sbi.h +++ b/arch/riscv/include/asm/sbi.h @@ -503,6 +503,11 @@ int sbi_remote_hfence_vvma_asid(const struct cpumask *cpu_mask, unsigned long asid); long sbi_probe_extension(int ext); +int sbi_fwft_all_cpus_set(u32 feature, unsigned long value, unsigned long flags, + bool revert_on_failure); +int sbi_fwft_get(u32 feature, unsigned long *value); +int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags); + /* Check if current SBI specification version is 0.1 or not */ static inline int sbi_spec_is_0_1(void) { diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c index 1989b8cade1b..256910db1307 100644 --- a/arch/riscv/kernel/sbi.c +++ b/arch/riscv/kernel/sbi.c @@ -299,6 +299,103 @@ static int
__sbi_rfence_v02(int fid, const struct cpumask *cpu_mask, return 0; } +int sbi_fwft_get(u32 feature, unsigned long *value) +{ + return -EOPNOTSUPP; +} + +/** + * sbi_fwft_set() - Set a feature on all online cpus + * @feature: The feature to be set + * @value: The feature value to be set + * @flags: FWFT feature set flags + * + * Return: 0 on success, appropriate linux error code otherwise. + */ +int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags) +{ + return -EOPNOTSUPP; +} + +struct fwft_set_req { + u32 feature; + unsigned long value; + unsigned long flags; + cpumask_t mask; +}; + +static void cpu_sbi_fwft_set(void *arg) +{ + struct fwft_set_req *req = arg; + + if (sbi_fwft_set(req->feature, req->value, req->flags)) + cpumask_clear_cpu(smp_processor_id(), &req->mask); +} + +static int sbi_fwft_feature_local_set(u32 feature, unsigned long value, + unsigned long flags, + bool revert_on_fail) +{ + int ret; + unsigned long prev_value; + cpumask_t tmp; + struct fwft_set_req req = { + .feature = feature, + .value = value, + .flags = flags, + }; + + cpumask_copy(&req.mask, cpu_online_mask); + + /* We can not revert if features are locked */ + if (revert_on_fail && flags & SBI_FWFT_SET_FLAG_LOCK) + return -EINVAL; + + /* Reset value is the same for all cpus, read it once. 
*/ + ret = sbi_fwft_get(feature, &prev_value); + if (ret) + return ret; + + /* Feature might already be set to the value we want */ + if (prev_value == value) + return 0; + + on_each_cpu_mask(&req.mask, cpu_sbi_fwft_set, &req, 1); + if (cpumask_equal(&req.mask, cpu_online_mask)) + return 0; + + pr_err("Failed to set feature %x for all online cpus, reverting\n", + feature); + + req.value = prev_value; + cpumask_copy(&tmp, &req.mask); + on_each_cpu_mask(&req.mask, cpu_sbi_fwft_set, &req, 1); + if (cpumask_equal(&req.mask, &tmp)) + return 0; + + return -EINVAL; +} + +/** + * sbi_fwft_all_cpus_set() - Set a feature on all online cpus + * @feature: The feature to be set + * @value: The feature value to be set + * @flags: FWFT feature set flags + * @revert_on_fail: true if feature value should be restored to it's orignal + * value on failure. + * + * Return: 0 on success, appropriate linux error code otherwise. + */ +int sbi_fwft_all_cpus_set(u32 feature, unsigned long value, unsigned long flags, + bool revert_on_fail) +{ + if (feature & SBI_FWFT_GLOBAL_FEATURE_BIT) + return sbi_fwft_set(feature, value, flags); + + return sbi_fwft_feature_local_set(feature, value, flags, + revert_on_fail); +} + /** * sbi_set_timer() - Program the timer for next timer event. * @stime_value: The value after which next timer event should fire. -- 2.47.2 From cleger at rivosinc.com Mon Mar 10 08:12:10 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 10 Mar 2025 16:12:10 +0100 Subject: [PATCH v3 03/17] riscv: sbi: add SBI FWFT extension calls In-Reply-To: <20250310151229.2365992-1-cleger@rivosinc.com> References: <20250310151229.2365992-1-cleger@rivosinc.com> Message-ID: <20250310151229.2365992-4-cleger@rivosinc.com> Add FWFT extension calls. This will be ratified in SBI V3.0 hence, it is provided as a separate commit that can be left out if needed. 
Signed-off-by: Clément Léger --- arch/riscv/kernel/sbi.c | 30 ++++++++++++++++++++++++++++-- 1 file changed, 28 insertions(+), 2 deletions(-) diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c index 256910db1307..af8e2199e32d 100644 --- a/arch/riscv/kernel/sbi.c +++ b/arch/riscv/kernel/sbi.c @@ -299,9 +299,19 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask, return 0; } +static bool sbi_fwft_supported; + int sbi_fwft_get(u32 feature, unsigned long *value) { - return -EOPNOTSUPP; + struct sbiret ret; + + if (!sbi_fwft_supported) + return -EOPNOTSUPP; + + ret = sbi_ecall(SBI_EXT_FWFT, SBI_EXT_FWFT_GET, + feature, 0, 0, 0, 0, 0); + + return sbi_err_map_linux_errno(ret.error); } /** @@ -314,7 +324,15 @@ int sbi_fwft_get(u32 feature, unsigned long *value) */ int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags) { - return -EOPNOTSUPP; + struct sbiret ret; + + if (!sbi_fwft_supported) + return -EOPNOTSUPP; + + ret = sbi_ecall(SBI_EXT_FWFT, SBI_EXT_FWFT_SET, + feature, value, flags, 0, 0, 0); + + return sbi_err_map_linux_errno(ret.error); } struct fwft_set_req { @@ -389,6 +407,9 @@ static int sbi_fwft_feature_local_set(u32 feature, unsigned long value, int sbi_fwft_all_cpus_set(u32 feature, unsigned long value, unsigned long flags, bool revert_on_fail) { + if (!sbi_fwft_supported) + return -EOPNOTSUPP; + if (feature & SBI_FWFT_GLOBAL_FEATURE_BIT) return sbi_fwft_set(feature, value, flags); @@ -719,6 +740,11 @@ void __init sbi_init(void) pr_info("SBI DBCN extension detected\n"); sbi_debug_console_available = true; } + if ((sbi_spec_version >= sbi_mk_version(2, 0)) && + (sbi_probe_extension(SBI_EXT_FWFT) > 0)) { + pr_info("SBI FWFT extension detected\n"); + sbi_fwft_supported = true; + } } else { __sbi_set_timer = __sbi_set_timer_v01; __sbi_send_ipi = __sbi_send_ipi_v01; -- 2.47.2 From cleger at rivosinc.com Mon Mar 10 08:12:11 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 10 Mar 2025
16:12:11 +0100 Subject: [PATCH v3 04/17] riscv: misaligned: request misaligned exception from SBI In-Reply-To: <20250310151229.2365992-1-cleger@rivosinc.com> References: <20250310151229.2365992-1-cleger@rivosinc.com> Message-ID: <20250310151229.2365992-5-cleger@rivosinc.com> Now that the kernel can handle misaligned accesses in S-mode, request misaligned access exception delegation from SBI. This uses the FWFT SBI extension defined in SBI version 3.0. Signed-off-by: Cl?ment L?ger --- arch/riscv/include/asm/cpufeature.h | 3 +- arch/riscv/kernel/traps_misaligned.c | 77 +++++++++++++++++++++- arch/riscv/kernel/unaligned_access_speed.c | 11 +++- 3 files changed, 86 insertions(+), 5 deletions(-) diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h index 569140d6e639..ad7d26788e6a 100644 --- a/arch/riscv/include/asm/cpufeature.h +++ b/arch/riscv/include/asm/cpufeature.h @@ -64,8 +64,9 @@ void __init riscv_user_isa_enable(void); _RISCV_ISA_EXT_DATA(_name, _id, _sub_exts, ARRAY_SIZE(_sub_exts), _validate) bool check_unaligned_access_emulated_all_cpus(void); +void unaligned_access_init(void); +int cpu_online_unaligned_access_init(unsigned int cpu); #if defined(CONFIG_RISCV_SCALAR_MISALIGNED) -void check_unaligned_access_emulated(struct work_struct *work __always_unused); void unaligned_emulation_finish(void); bool unaligned_ctl_available(void); DECLARE_PER_CPU(long, misaligned_access_speed); diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c index 7cc108aed74e..90ac74191357 100644 --- a/arch/riscv/kernel/traps_misaligned.c +++ b/arch/riscv/kernel/traps_misaligned.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #define INSN_MATCH_LB 0x3 @@ -635,7 +636,7 @@ bool check_vector_unaligned_access_emulated_all_cpus(void) static bool unaligned_ctl __read_mostly; -void check_unaligned_access_emulated(struct work_struct *work __always_unused) +static void check_unaligned_access_emulated(struct 
work_struct *work __always_unused) { int cpu = smp_processor_id(); long *mas_ptr = per_cpu_ptr(&misaligned_access_speed, cpu); @@ -646,6 +647,13 @@ void check_unaligned_access_emulated(struct work_struct *work __always_unused) __asm__ __volatile__ ( " "REG_L" %[tmp], 1(%[ptr])\n" : [tmp] "=r" (tmp_val) : [ptr] "r" (&tmp_var) : "memory"); +} + +static int cpu_online_check_unaligned_access_emulated(unsigned int cpu) +{ + long *mas_ptr = per_cpu_ptr(&misaligned_access_speed, cpu); + + check_unaligned_access_emulated(NULL); /* * If unaligned_ctl is already set, this means that we detected that all @@ -654,9 +662,10 @@ void check_unaligned_access_emulated(struct work_struct *work __always_unused) */ if (unlikely(unaligned_ctl && (*mas_ptr != RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED))) { pr_crit("CPU misaligned accesses non homogeneous (expected all emulated)\n"); - while (true) - cpu_relax(); + return -EINVAL; } + + return 0; } bool check_unaligned_access_emulated_all_cpus(void) @@ -688,4 +697,66 @@ bool check_unaligned_access_emulated_all_cpus(void) { return false; } +static int cpu_online_check_unaligned_access_emulated(unsigned int cpu) +{ + return 0; +} #endif + +#ifdef CONFIG_RISCV_SBI + +static bool misaligned_traps_delegated; + +static int cpu_online_sbi_unaligned_setup(unsigned int cpu) +{ + if (sbi_fwft_set(SBI_FWFT_MISALIGNED_EXC_DELEG, 1, 0) && + misaligned_traps_delegated) { + pr_crit("Misaligned trap delegation non homogeneous (expected delegated)"); + return -EINVAL; + } + + return 0; +} + +static void unaligned_sbi_request_delegation(void) +{ + int ret; + + ret = sbi_fwft_all_cpus_set(SBI_FWFT_MISALIGNED_EXC_DELEG, 1, 0, 0); + if (ret) + return; + + misaligned_traps_delegated = true; + pr_info("SBI misaligned access exception delegation ok\n"); + /* + * Note that we don't have to take any specific action here, if + * the delegation is successful, then + * check_unaligned_access_emulated() will verify that indeed the + * platform traps on misaligned 
accesses. + */ +} + +void unaligned_access_init(void) +{ + if (sbi_probe_extension(SBI_EXT_FWFT) > 0) + unaligned_sbi_request_delegation(); +} +#else +void unaligned_access_init(void) {} + +static int cpu_online_sbi_unaligned_setup(unsigned int cpu __always_unused) +{ + return 0; +} +#endif + +int cpu_online_unaligned_access_init(unsigned int cpu) +{ + int ret; + + ret = cpu_online_sbi_unaligned_setup(cpu); + if (ret) + return ret; + + return cpu_online_check_unaligned_access_emulated(cpu); +} diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c index 91f189cf1611..2f3aba073297 100644 --- a/arch/riscv/kernel/unaligned_access_speed.c +++ b/arch/riscv/kernel/unaligned_access_speed.c @@ -188,13 +188,20 @@ arch_initcall_sync(lock_and_set_unaligned_access_static_branch); static int riscv_online_cpu(unsigned int cpu) { + int ret; static struct page *buf; /* We are already set since the last check */ if (per_cpu(misaligned_access_speed, cpu) != RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN) goto exit; - check_unaligned_access_emulated(NULL); + ret = cpu_online_unaligned_access_init(cpu); + if (ret) + return ret; + + if (per_cpu(misaligned_access_speed, cpu) == RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED) + goto exit; + buf = alloc_pages(GFP_KERNEL, MISALIGNED_BUFFER_ORDER); if (!buf) { pr_warn("Allocation failure, not measuring misaligned performance\n"); @@ -403,6 +410,8 @@ static int check_unaligned_access_all_cpus(void) { bool all_cpus_emulated, all_cpus_vec_unsupported; + unaligned_access_init(); + all_cpus_emulated = check_unaligned_access_emulated_all_cpus(); all_cpus_vec_unsupported = check_vector_unaligned_access_emulated_all_cpus(); -- 2.47.2 From cleger at rivosinc.com Mon Mar 10 08:12:12 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 10 Mar 2025 16:12:12 +0100 Subject: [PATCH v3 05/17] riscv: misaligned: use on_each_cpu() for scalar misaligned access probing In-Reply-To: 
<20250310151229.2365992-1-cleger@rivosinc.com>
References: <20250310151229.2365992-1-cleger@rivosinc.com>
Message-ID: <20250310151229.2365992-6-cleger@rivosinc.com>

schedule_on_each_cpu() was used without any good reason while documented
as very slow. This call was in the boot path, so better use
on_each_cpu() for scalar misaligned checking. Vector misaligned check
still needs to use schedule_on_each_cpu() since it requires irqs to be
enabled but that's less of a problem since this code is run in a
kthread. Add a comment to make that explicit.

Signed-off-by: Clément Léger
---
 arch/riscv/kernel/traps_misaligned.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c
index 90ac74191357..ffac424faa88 100644
--- a/arch/riscv/kernel/traps_misaligned.c
+++ b/arch/riscv/kernel/traps_misaligned.c
@@ -616,6 +616,11 @@ bool check_vector_unaligned_access_emulated_all_cpus(void)
 		return false;
 	}
 
+	/*
+	 * While being documented as very slow, schedule_on_each_cpu() is used
+	 * since kernel_vector_begin() that is called inside the vector code
+	 * expects irqs to be enabled or it will panic().
+	 */
 	schedule_on_each_cpu(check_vector_unaligned_access_emulated);
 
 	for_each_online_cpu(cpu)
@@ -636,7 +641,7 @@ bool check_vector_unaligned_access_emulated_all_cpus(void)
 
 static bool unaligned_ctl __read_mostly;
 
-static void check_unaligned_access_emulated(struct work_struct *work __always_unused)
+static void check_unaligned_access_emulated(void *arg __always_unused)
 {
 	int cpu = smp_processor_id();
 	long *mas_ptr = per_cpu_ptr(&misaligned_access_speed, cpu);
@@ -677,7 +682,7 @@ bool check_unaligned_access_emulated_all_cpus(void)
 	 * accesses emulated since tasks requesting such control can run on any
 	 * CPU.
 	 */
-	schedule_on_each_cpu(check_unaligned_access_emulated);
+	on_each_cpu(check_unaligned_access_emulated, NULL, 1);
 
 	for_each_online_cpu(cpu)
 		if (per_cpu(misaligned_access_speed, cpu)
-- 
2.47.2

From cleger at rivosinc.com Mon Mar 10 08:12:13 2025
From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=)
Date: Mon, 10 Mar 2025 16:12:13 +0100
Subject: [PATCH v3 06/17] riscv: misaligned: use correct CONFIG_ ifdef for misaligned_access_speed
In-Reply-To: <20250310151229.2365992-1-cleger@rivosinc.com>
References: <20250310151229.2365992-1-cleger@rivosinc.com>
Message-ID: <20250310151229.2365992-7-cleger@rivosinc.com>

misaligned_access_speed is defined under CONFIG_RISCV_SCALAR_MISALIGNED
but was used under CONFIG_RISCV_PROBE_UNALIGNED_ACCESS. Fix that by
using the correct config option.

Signed-off-by: Clément Léger
---
 arch/riscv/kernel/traps_misaligned.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c
index ffac424faa88..7fe25adf2539 100644
--- a/arch/riscv/kernel/traps_misaligned.c
+++ b/arch/riscv/kernel/traps_misaligned.c
@@ -362,7 +362,7 @@ static int handle_scalar_misaligned_load(struct pt_regs *regs)
 
 	perf_sw_event(PERF_COUNT_SW_ALIGNMENT_FAULTS, 1, regs, addr);
 
-#ifdef CONFIG_RISCV_PROBE_UNALIGNED_ACCESS
+#ifdef CONFIG_RISCV_SCALAR_MISALIGNED
 	*this_cpu_ptr(&misaligned_access_speed) = RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED;
 #endif
 
-- 
2.47.2

From cleger at rivosinc.com Mon Mar 10 08:12:14 2025
From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=)
Date: Mon, 10 Mar 2025 16:12:14 +0100
Subject: [PATCH v3 07/17] riscv: misaligned: move emulated access uniformity check in a function
In-Reply-To: <20250310151229.2365992-1-cleger@rivosinc.com>
References: <20250310151229.2365992-1-cleger@rivosinc.com>
Message-ID: <20250310151229.2365992-8-cleger@rivosinc.com>

Split the code that checks for the uniformity of misaligned accesses
performance on all cpus
from check_unaligned_access_emulated_all_cpus() to its own function
which will be used for delegation check. No functional changes intended.

Signed-off-by: Clément Léger
---
 arch/riscv/kernel/traps_misaligned.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c
index 7fe25adf2539..db31966a834e 100644
--- a/arch/riscv/kernel/traps_misaligned.c
+++ b/arch/riscv/kernel/traps_misaligned.c
@@ -673,10 +673,20 @@ static int cpu_online_check_unaligned_access_emulated(unsigned int cpu)
 	return 0;
 }
 
-bool check_unaligned_access_emulated_all_cpus(void)
+static bool all_cpus_unaligned_scalar_access_emulated(void)
 {
 	int cpu;
 
+	for_each_online_cpu(cpu)
+		if (per_cpu(misaligned_access_speed, cpu) !=
+		    RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED)
+			return false;
+
+	return true;
+}
+
+bool check_unaligned_access_emulated_all_cpus(void)
+{
 	/*
 	 * We can only support PR_UNALIGN controls if all CPUs have misaligned
 	 * accesses emulated since tasks requesting such control can run on any
@@ -684,10 +694,8 @@ bool check_unaligned_access_emulated_all_cpus(void)
 	 */
 	on_each_cpu(check_unaligned_access_emulated, NULL, 1);
 
-	for_each_online_cpu(cpu)
-		if (per_cpu(misaligned_access_speed, cpu)
-		    != RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED)
-			return false;
+	if (!all_cpus_unaligned_scalar_access_emulated())
+		return false;
 
 	unaligned_ctl = true;
 	return true;
-- 
2.47.2

From cleger at rivosinc.com Mon Mar 10 08:12:15 2025
From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=)
Date: Mon, 10 Mar 2025 16:12:15 +0100
Subject: [PATCH v3 08/17] riscv: misaligned: add a function to check misalign trap delegability
In-Reply-To: <20250310151229.2365992-1-cleger@rivosinc.com>
References: <20250310151229.2365992-1-cleger@rivosinc.com>
Message-ID: <20250310151229.2365992-9-cleger@rivosinc.com>

Checking for the delegability of the misaligned access trap is needed
for the KVM FWFT extension
implementation. Add a function to get the delegability of the misaligned trap exception. Signed-off-by: Cl?ment L?ger --- arch/riscv/include/asm/cpufeature.h | 5 +++++ arch/riscv/kernel/traps_misaligned.c | 17 +++++++++++++++-- 2 files changed, 20 insertions(+), 2 deletions(-) diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h index ad7d26788e6a..8b97cba99fc3 100644 --- a/arch/riscv/include/asm/cpufeature.h +++ b/arch/riscv/include/asm/cpufeature.h @@ -69,12 +69,17 @@ int cpu_online_unaligned_access_init(unsigned int cpu); #if defined(CONFIG_RISCV_SCALAR_MISALIGNED) void unaligned_emulation_finish(void); bool unaligned_ctl_available(void); +bool misaligned_traps_can_delegate(void); DECLARE_PER_CPU(long, misaligned_access_speed); #else static inline bool unaligned_ctl_available(void) { return false; } +static inline bool misaligned_traps_can_delegate(void) +{ + return false; +} #endif bool check_vector_unaligned_access_emulated_all_cpus(void); diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c index db31966a834e..a67a6e709a06 100644 --- a/arch/riscv/kernel/traps_misaligned.c +++ b/arch/riscv/kernel/traps_misaligned.c @@ -716,10 +716,10 @@ static int cpu_online_check_unaligned_access_emulated(unsigned int cpu) } #endif -#ifdef CONFIG_RISCV_SBI - static bool misaligned_traps_delegated; +#ifdef CONFIG_RISCV_SBI + static int cpu_online_sbi_unaligned_setup(unsigned int cpu) { if (sbi_fwft_set(SBI_FWFT_MISALIGNED_EXC_DELEG, 1, 0) && @@ -761,6 +761,7 @@ static int cpu_online_sbi_unaligned_setup(unsigned int cpu __always_unused) { return 0; } + #endif int cpu_online_unaligned_access_init(unsigned int cpu) @@ -773,3 +774,15 @@ int cpu_online_unaligned_access_init(unsigned int cpu) return cpu_online_check_unaligned_access_emulated(cpu); } + +bool misaligned_traps_can_delegate(void) +{ + /* + * Either we successfully requested misaligned traps delegation for all + * CPUS or the SBI does not implemented 
FWFT extension but delegated the + * exception by default. + */ + return misaligned_traps_delegated || + all_cpus_unaligned_scalar_access_emulated(); +} +EXPORT_SYMBOL_GPL(misaligned_traps_can_delegate); \ No newline at end of file -- 2.47.2 From cleger at rivosinc.com Mon Mar 10 08:12:16 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 10 Mar 2025 16:12:16 +0100 Subject: [PATCH v3 09/17] riscv: misaligned: factorize trap handling In-Reply-To: <20250310151229.2365992-1-cleger@rivosinc.com> References: <20250310151229.2365992-1-cleger@rivosinc.com> Message-ID: <20250310151229.2365992-10-cleger@rivosinc.com> misaligned accesses traps are not nmi and should be treated as normal one using irqentry_enter()/exit(). Since both load/store and user/kernel should use almost the same path and that we are going to add some code around that, factorize it. Signed-off-by: Cl?ment L?ger --- arch/riscv/kernel/traps.c | 49 ++++++++++++++++----------------------- 1 file changed, 20 insertions(+), 29 deletions(-) diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 8ff8e8b36524..55d9f3450398 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -198,47 +198,38 @@ asmlinkage __visible __trap_section void do_trap_insn_illegal(struct pt_regs *re DO_ERROR_INFO(do_trap_load_fault, SIGSEGV, SEGV_ACCERR, "load access fault"); -asmlinkage __visible __trap_section void do_trap_load_misaligned(struct pt_regs *regs) +enum misaligned_access_type { + MISALIGNED_STORE, + MISALIGNED_LOAD, +}; + +static void do_trap_misaligned(struct pt_regs *regs, enum misaligned_access_type type) { - if (user_mode(regs)) { - irqentry_enter_from_user_mode(regs); + irqentry_state_t state = irqentry_enter(regs); + if (type == MISALIGNED_LOAD) { if (handle_misaligned_load(regs)) do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc, - "Oops - load address misaligned"); - - irqentry_exit_to_user_mode(regs); + "Oops - load address misaligned"); } 
else {
-		irqentry_state_t state = irqentry_nmi_enter(regs);
-
-		if (handle_misaligned_load(regs))
+		if (handle_misaligned_store(regs))
 			do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
-			"Oops - load address misaligned");
-
-		irqentry_nmi_exit(regs, state);
+			"Oops - store (or AMO) address misaligned");
 	}
+
+	irqentry_exit(regs, state);
 }
 
-asmlinkage __visible __trap_section void do_trap_store_misaligned(struct pt_regs *regs)
+asmlinkage __visible __trap_section void do_trap_load_misaligned(struct pt_regs *regs)
 {
-	if (user_mode(regs)) {
-		irqentry_enter_from_user_mode(regs);
-
-		if (handle_misaligned_store(regs))
-			do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
-				"Oops - store (or AMO) address misaligned");
-
-		irqentry_exit_to_user_mode(regs);
-	} else {
-		irqentry_state_t state = irqentry_nmi_enter(regs);
-
-		if (handle_misaligned_store(regs))
-			do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc,
-				"Oops - store (or AMO) address misaligned");
+	do_trap_misaligned(regs, MISALIGNED_LOAD);
+}
 
-		irqentry_nmi_exit(regs, state);
-	}
+asmlinkage __visible __trap_section void do_trap_store_misaligned(struct pt_regs *regs)
+{
+	do_trap_misaligned(regs, MISALIGNED_STORE);
 }
+
 DO_ERROR_INFO(do_trap_store_fault,
 	SIGSEGV, SEGV_ACCERR, "store (or AMO) access fault");
 DO_ERROR_INFO(do_trap_ecall_s,
-- 
2.47.2

From cleger at rivosinc.com Mon Mar 10 08:12:17 2025
From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=)
Date: Mon, 10 Mar 2025 16:12:17 +0100
Subject: [PATCH v3 10/17] riscv: misaligned: enable IRQs while handling misaligned accesses
In-Reply-To: <20250310151229.2365992-1-cleger@rivosinc.com>
References: <20250310151229.2365992-1-cleger@rivosinc.com>
Message-ID: <20250310151229.2365992-11-cleger@rivosinc.com>

We can safely re-enable IRQs if they were enabled in the previous
context. This allows access to user memory, which could potentially
trigger a page fault.
Signed-off-by: Clément Léger
---
 arch/riscv/kernel/traps.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index 55d9f3450398..3eecc2addc41 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -206,6 +206,11 @@ enum misaligned_access_type {
 static void do_trap_misaligned(struct pt_regs *regs, enum misaligned_access_type type)
 {
 	irqentry_state_t state = irqentry_enter(regs);
+	bool enable_irqs = !regs_irqs_disabled(regs);
+
+	/* Enable interrupts if they were enabled in the interrupted context. */
+	if (enable_irqs)
+		local_irq_enable();
 
 	if (type == MISALIGNED_LOAD) {
 		if (handle_misaligned_load(regs))
@@ -217,6 +222,9 @@ static void do_trap_misaligned(struct pt_regs *regs, enum misaligned_access_type
 			"Oops - store (or AMO) address misaligned");
 	}
 
+	if (enable_irqs)
+		local_irq_disable();
+
 	irqentry_exit(regs, state);
 }
 
-- 
2.47.2

From cleger at rivosinc.com Mon Mar 10 08:12:18 2025
From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=)
Date: Mon, 10 Mar 2025 16:12:18 +0100
Subject: [PATCH v3 11/17] riscv: misaligned: use get_user() instead of __get_user()
In-Reply-To: <20250310151229.2365992-1-cleger@rivosinc.com>
References: <20250310151229.2365992-1-cleger@rivosinc.com>
Message-ID: <20250310151229.2365992-12-cleger@rivosinc.com>

Now that we can safely handle user memory accesses while in the
misaligned access handlers, use get_user() instead of __get_user() to
have user memory access checks.
Signed-off-by: Clément Léger
---
 arch/riscv/kernel/traps_misaligned.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c
index a67a6e709a06..44b9348c80d4 100644
--- a/arch/riscv/kernel/traps_misaligned.c
+++ b/arch/riscv/kernel/traps_misaligned.c
@@ -269,7 +269,7 @@ static unsigned long get_f32_rs(unsigned long insn, u8 fp_reg_offset,
 	int __ret;	\
 	\
 	if (user_mode(regs)) {	\
-		__ret = __get_user(insn, (type __user *) insn_addr);	\
+		__ret = get_user(insn, (type __user *) insn_addr);	\
 	} else {	\
 		insn = *(type *)insn_addr;	\
 		__ret = 0;	\
-- 
2.47.2

From cleger at rivosinc.com Mon Mar 10 08:12:19 2025
From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=)
Date: Mon, 10 Mar 2025 16:12:19 +0100
Subject: [PATCH v3 12/17] Documentation/sysctl: add riscv to unaligned-trap supported archs
In-Reply-To: <20250310151229.2365992-1-cleger@rivosinc.com>
References: <20250310151229.2365992-1-cleger@rivosinc.com>
Message-ID: <20250310151229.2365992-13-cleger@rivosinc.com>

riscv supports the "unaligned-trap" sysctl variable, add it to the list
of supported architectures.

Signed-off-by: Clément Léger
---
 Documentation/admin-guide/sysctl/kernel.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index a43b78b4b646..ce3f0dd3666e 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -1584,8 +1584,8 @@ unaligned-trap
 
 On architectures where unaligned accesses cause traps, and where this
 feature is supported (``CONFIG_SYSCTL_ARCH_UNALIGN_ALLOW``; currently,
-``arc``, ``parisc`` and ``loongarch``), controls whether unaligned traps
-are caught and emulated (instead of failing).
+``arc``, ``parisc``, ``loongarch`` and ``riscv``), controls whether unaligned
+traps are caught and emulated (instead of failing).
= ======================================================== 0 Do not emulate unaligned accesses. -- 2.47.2 From cleger at rivosinc.com Mon Mar 10 08:12:20 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 10 Mar 2025 16:12:20 +0100 Subject: [PATCH v3 13/17] selftests: riscv: add misaligned access testing In-Reply-To: <20250310151229.2365992-1-cleger@rivosinc.com> References: <20250310151229.2365992-1-cleger@rivosinc.com> Message-ID: <20250310151229.2365992-14-cleger@rivosinc.com> Now that the kernel can emulate misaligned access and control its behavior, add a selftest for that. This selftest tests all the currently emulated instruction (except for the RV32 compressed ones which are left as a future exercise for a RV32 user). For the FPU instructions, all the FPU registers are tested. Signed-off-by: Cl?ment L?ger --- .../selftests/riscv/misaligned/.gitignore | 1 + .../selftests/riscv/misaligned/Makefile | 12 + .../selftests/riscv/misaligned/common.S | 33 +++ .../testing/selftests/riscv/misaligned/fpu.S | 180 +++++++++++++ tools/testing/selftests/riscv/misaligned/gp.S | 103 +++++++ .../selftests/riscv/misaligned/misaligned.c | 254 ++++++++++++++++++ 6 files changed, 583 insertions(+) create mode 100644 tools/testing/selftests/riscv/misaligned/.gitignore create mode 100644 tools/testing/selftests/riscv/misaligned/Makefile create mode 100644 tools/testing/selftests/riscv/misaligned/common.S create mode 100644 tools/testing/selftests/riscv/misaligned/fpu.S create mode 100644 tools/testing/selftests/riscv/misaligned/gp.S create mode 100644 tools/testing/selftests/riscv/misaligned/misaligned.c diff --git a/tools/testing/selftests/riscv/misaligned/.gitignore b/tools/testing/selftests/riscv/misaligned/.gitignore new file mode 100644 index 000000000000..5eff15a1f981 --- /dev/null +++ b/tools/testing/selftests/riscv/misaligned/.gitignore @@ -0,0 +1 @@ +misaligned diff --git a/tools/testing/selftests/riscv/misaligned/Makefile 
b/tools/testing/selftests/riscv/misaligned/Makefile new file mode 100644 index 000000000000..1aa40110c50d --- /dev/null +++ b/tools/testing/selftests/riscv/misaligned/Makefile @@ -0,0 +1,12 @@ +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) 2021 ARM Limited +# Originally tools/testing/arm64/abi/Makefile + +CFLAGS += -I$(top_srcdir)/tools/include + +TEST_GEN_PROGS := misaligned + +include ../../lib.mk + +$(OUTPUT)/misaligned: misaligned.c fpu.S gp.S + $(CC) -g3 -static -o$@ -march=rv64imafdc $(CFLAGS) $(LDFLAGS) $^ diff --git a/tools/testing/selftests/riscv/misaligned/common.S b/tools/testing/selftests/riscv/misaligned/common.S new file mode 100644 index 000000000000..8fa00035bd5d --- /dev/null +++ b/tools/testing/selftests/riscv/misaligned/common.S @@ -0,0 +1,33 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (c) 2025 Rivos Inc. + * + * Authors: + * Cl?ment L?ger + */ + +.macro lb_sb temp, offset, src, dst + lb \temp, \offset(\src) + sb \temp, \offset(\dst) +.endm + +.macro copy_long_to temp, src, dst + lb_sb \temp, 0, \src, \dst, + lb_sb \temp, 1, \src, \dst, + lb_sb \temp, 2, \src, \dst, + lb_sb \temp, 3, \src, \dst, + lb_sb \temp, 4, \src, \dst, + lb_sb \temp, 5, \src, \dst, + lb_sb \temp, 6, \src, \dst, + lb_sb \temp, 7, \src, \dst, +.endm + +.macro sp_stack_prologue offset + addi sp, sp, -8 + sub sp, sp, \offset +.endm + +.macro sp_stack_epilogue offset + add sp, sp, \offset + addi sp, sp, 8 +.endm diff --git a/tools/testing/selftests/riscv/misaligned/fpu.S b/tools/testing/selftests/riscv/misaligned/fpu.S new file mode 100644 index 000000000000..d008bff58310 --- /dev/null +++ b/tools/testing/selftests/riscv/misaligned/fpu.S @@ -0,0 +1,180 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (c) 2025 Rivos Inc. 
+ * + * Authors: + * Cl?ment L?ger + */ + +#include "common.S" + +#define CASE_ALIGN 4 + +.macro fpu_load_inst fpreg, inst, precision, load_reg +.align CASE_ALIGN + \inst \fpreg, 0(\load_reg) + fmv.\precision fa0, \fpreg + j 2f +.endm + +#define flw(__fpreg) fpu_load_inst __fpreg, flw, s, s1 +#define fld(__fpreg) fpu_load_inst __fpreg, fld, d, s1 +#define c_flw(__fpreg) fpu_load_inst __fpreg, c.flw, s, s1 +#define c_fld(__fpreg) fpu_load_inst __fpreg, c.fld, d, s1 +#define c_fldsp(__fpreg) fpu_load_inst __fpreg, c.fldsp, d, sp + +.macro fpu_store_inst fpreg, inst, precision, store_reg +.align CASE_ALIGN + fmv.\precision \fpreg, fa0 + \inst \fpreg, 0(\store_reg) + j 2f +.endm + +#define fsw(__fpreg) fpu_store_inst __fpreg, fsw, s, s1 +#define fsd(__fpreg) fpu_store_inst __fpreg, fsd, d, s1 +#define c_fsw(__fpreg) fpu_store_inst __fpreg, c.fsw, s, s1 +#define c_fsd(__fpreg) fpu_store_inst __fpreg, c.fsd, d, s1 +#define c_fsdsp(__fpreg) fpu_store_inst __fpreg, c.fsdsp, d, sp + +.macro fp_test_prologue + move s1, a1 + /* + * Compute jump offset to store the correct FP register since we don't + * have indirect FP register access (or at least we don't use this + * extension so that works on all archs) + */ + sll t0, a0, CASE_ALIGN + la t2, 1f + add t0, t0, t2 + jr t0 +.align CASE_ALIGN +1: +.endm + +.macro fp_test_prologue_compressed + /* FP registers for compressed instructions starts from 8 to 16 */ + addi a0, a0, -8 + fp_test_prologue +.endm + +#define fp_test_body_compressed(__inst_func) \ + __inst_func(f8); \ + __inst_func(f9); \ + __inst_func(f10); \ + __inst_func(f11); \ + __inst_func(f12); \ + __inst_func(f13); \ + __inst_func(f14); \ + __inst_func(f15); \ +2: + +#define fp_test_body(__inst_func) \ + __inst_func(f0); \ + __inst_func(f1); \ + __inst_func(f2); \ + __inst_func(f3); \ + __inst_func(f4); \ + __inst_func(f5); \ + __inst_func(f6); \ + __inst_func(f7); \ + __inst_func(f8); \ + __inst_func(f9); \ + __inst_func(f10); \ + __inst_func(f11); \ + 
__inst_func(f12); \ + __inst_func(f13); \ + __inst_func(f14); \ + __inst_func(f15); \ + __inst_func(f16); \ + __inst_func(f17); \ + __inst_func(f18); \ + __inst_func(f19); \ + __inst_func(f20); \ + __inst_func(f21); \ + __inst_func(f22); \ + __inst_func(f23); \ + __inst_func(f24); \ + __inst_func(f25); \ + __inst_func(f26); \ + __inst_func(f27); \ + __inst_func(f28); \ + __inst_func(f29); \ + __inst_func(f30); \ + __inst_func(f31); \ +2: +.text + +#define __gen_test_inst(__inst, __suffix) \ +.global test_ ## __inst; \ +test_ ## __inst:; \ + fp_test_prologue ## __suffix; \ + fp_test_body ## __suffix(__inst); \ + ret + +#define gen_test_inst_compressed(__inst) \ + .option arch,+c; \ + __gen_test_inst(c_ ## __inst, _compressed) + +#define gen_test_inst(__inst) \ + .balign 16; \ + .option push; \ + .option arch,-c; \ + __gen_test_inst(__inst, ); \ + .option pop + +.macro fp_test_prologue_load_compressed_sp + copy_long_to t0, a1, sp +.endm + +.macro fp_test_epilogue_load_compressed_sp +.endm + +.macro fp_test_prologue_store_compressed_sp +.endm + +.macro fp_test_epilogue_store_compressed_sp + copy_long_to t0, sp, a1 +.endm + +#define gen_inst_compressed_sp(__inst, __type) \ + .global test_c_ ## __inst ## sp; \ + test_c_ ## __inst ## sp:; \ + sp_stack_prologue a2; \ + fp_test_prologue_## __type ## _compressed_sp; \ + fp_test_prologue_compressed; \ + fp_test_body_compressed(c_ ## __inst ## sp); \ + fp_test_epilogue_## __type ## _compressed_sp; \ + sp_stack_epilogue a2; \ + ret + +#define gen_test_load_compressed_sp(__inst) gen_inst_compressed_sp(__inst, load) +#define gen_test_store_compressed_sp(__inst) gen_inst_compressed_sp(__inst, store) + +/* + * float_fsw_reg - Set a FP register from a register containing the value + * a0 = FP register index to be set + * a1 = addr where to store register value + * a2 = address offset + * a3 = value to be store + */ +gen_test_inst(fsw) + +/* + * float_flw_reg - Get a FP register value and return it + * a0 = FP register index to be 
retrieved + * a1 = addr to load register from + * a2 = address offset + */ +gen_test_inst(flw) + +gen_test_inst(fsd) +#ifdef __riscv_compressed +gen_test_inst_compressed(fsd) +gen_test_store_compressed_sp(fsd) +#endif + +gen_test_inst(fld) +#ifdef __riscv_compressed +gen_test_inst_compressed(fld) +gen_test_load_compressed_sp(fld) +#endif diff --git a/tools/testing/selftests/riscv/misaligned/gp.S b/tools/testing/selftests/riscv/misaligned/gp.S new file mode 100644 index 000000000000..f53f4c6d81dd --- /dev/null +++ b/tools/testing/selftests/riscv/misaligned/gp.S @@ -0,0 +1,103 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (c) 2025 Rivos Inc. + * + * Authors: + * Cl?ment L?ger + */ + +#include "common.S" + +.text + +.macro __gen_test_inst inst, src_reg + \inst a2, 0(\src_reg) + move a0, a2 +.endm + +.macro gen_func_header func_name, rvc + .option arch,\rvc + .global test_\func_name + test_\func_name: +.endm + +.macro gen_test_inst inst + .option push + gen_func_header \inst, -c + __gen_test_inst \inst, a0 + .option pop + ret +.endm + +.macro __gen_test_inst_c name, src_reg + .option push + gen_func_header c_\name, +c + __gen_test_inst c.\name, \src_reg + .option pop + ret +.endm + +.macro gen_test_inst_c name + __gen_test_inst_c \name, a0 +.endm + + +.macro gen_test_inst_load_c_sp name + .option push + gen_func_header c_\name\()sp, +c + sp_stack_prologue a1 + copy_long_to t0, a0, sp + c.ldsp a0, 0(sp) + sp_stack_epilogue a1 + .option pop + ret +.endm + +.macro lb_sp_sb_a0 reg, offset + lb_sb \reg, \offset, sp, a0 +.endm + +.macro gen_test_inst_store_c_sp inst_name + .option push + gen_func_header c_\inst_name\()sp, +c + /* Misalign stack pointer */ + sp_stack_prologue a1 + /* Misalign access */ + c.sdsp a2, 0(sp) + copy_long_to t0, sp, a0 + sp_stack_epilogue a1 + .option pop + ret +.endm + + + /* + * a0 = addr to load from + * a1 = address offset + * a2 = value to be loaded + */ +gen_test_inst lh +gen_test_inst lhu +gen_test_inst lw +gen_test_inst 
lwu +gen_test_inst ld +#ifdef __riscv_compressed +gen_test_inst_c lw +gen_test_inst_c ld +gen_test_inst_load_c_sp ld +#endif + +/* + * a0 = addr where to store value + * a1 = address offset + * a2 = value to be stored + */ +gen_test_inst sh +gen_test_inst sw +gen_test_inst sd +#ifdef __riscv_compressed +gen_test_inst_c sw +gen_test_inst_c sd +gen_test_inst_store_c_sp sd +#endif + diff --git a/tools/testing/selftests/riscv/misaligned/misaligned.c b/tools/testing/selftests/riscv/misaligned/misaligned.c new file mode 100644 index 000000000000..c66aa87ec03e --- /dev/null +++ b/tools/testing/selftests/riscv/misaligned/misaligned.c @@ -0,0 +1,254 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2025 Rivos Inc. + * + * Authors: + * Cl?ment L?ger + */ +#include +#include +#include +#include +#include "../../kselftest_harness.h" + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#define stringify(s) __stringify(s) +#define __stringify(s) #s + +#define VAL16 0x1234 +#define VAL32 0xDEADBEEF +#define VAL64 0x45674321D00DF789 + +#define VAL_float 78951.234375 +#define VAL_double 567890.512396965789589290 + +static bool float_equal(float a, float b) +{ + float scaled_epsilon; + float difference = fabsf(a - b); + + // Scale to the largest value. + a = fabsf(a); + b = fabsf(b); + if (a > b) + scaled_epsilon = FLT_EPSILON * a; + else + scaled_epsilon = FLT_EPSILON * b; + + return difference <= scaled_epsilon; +} + +static bool double_equal(double a, double b) +{ + double scaled_epsilon; + double difference = fabsf(a - b); + + // Scale to the largest value. 
+ a = fabs(a); + b = fabs(b); + if (a > b) + scaled_epsilon = DBL_EPSILON * a; + else + scaled_epsilon = DBL_EPSILON * b; + + return difference <= scaled_epsilon; +} + +#define fpu_load_proto(__inst, __type) \ +extern __type test_ ## __inst(unsigned long fp_reg, void *addr, unsigned long offset, __type value) + +fpu_load_proto(flw, float); +fpu_load_proto(fld, double); +fpu_load_proto(c_flw, float); +fpu_load_proto(c_fld, double); +fpu_load_proto(c_fldsp, double); + +#define fpu_store_proto(__inst, __type) \ +extern void test_ ## __inst(unsigned long fp_reg, void *addr, unsigned long offset, __type value) + +fpu_store_proto(fsw, float); +fpu_store_proto(fsd, double); +fpu_store_proto(c_fsw, float); +fpu_store_proto(c_fsd, double); +fpu_store_proto(c_fsdsp, double); + +#define gp_load_proto(__inst, __type) \ +extern __type test_ ## __inst(void *addr, unsigned long offset, __type value) + +gp_load_proto(lh, uint16_t); +gp_load_proto(lhu, uint16_t); +gp_load_proto(lw, uint32_t); +gp_load_proto(lwu, uint32_t); +gp_load_proto(ld, uint64_t); +gp_load_proto(c_lw, uint32_t); +gp_load_proto(c_ld, uint64_t); +gp_load_proto(c_ldsp, uint64_t); + +#define gp_store_proto(__inst, __type) \ +extern void test_ ## __inst(void *addr, unsigned long offset, __type value) + +gp_store_proto(sh, uint16_t); +gp_store_proto(sw, uint32_t); +gp_store_proto(sd, uint64_t); +gp_store_proto(c_sw, uint32_t); +gp_store_proto(c_sd, uint64_t); +gp_store_proto(c_sdsp, uint64_t); + +#define TEST_GP_LOAD(__inst, __type_size) \ +TEST(gp_load_ ## __inst) \ +{ \ + int offset, ret; \ + uint8_t buf[16] __attribute__((aligned(16))); \ + \ + ret = prctl(PR_SET_UNALIGN, PR_UNALIGN_NOPRINT); \ + ASSERT_EQ(ret, 0); \ + \ + for (offset = 1; offset < __type_size / 8; offset++) { \ + uint ## __type_size ## _t val = VAL ## __type_size; \ + uint ## __type_size ## _t *ptr = (uint ## __type_size ## _t *) (buf + offset); \ + memcpy(ptr, &val, sizeof(val)); \ + val = test_ ## __inst(ptr, offset, val); \ + EXPECT_EQ(VAL ## 
__type_size, val); \ + } \ +} + +TEST_GP_LOAD(lh, 16); +TEST_GP_LOAD(lhu, 16); +TEST_GP_LOAD(lw, 32); +TEST_GP_LOAD(lwu, 32); +TEST_GP_LOAD(ld, 64); +#ifdef __riscv_compressed +TEST_GP_LOAD(c_lw, 32); +TEST_GP_LOAD(c_ld, 64); +TEST_GP_LOAD(c_ldsp, 64); +#endif + +#define TEST_GP_STORE(__inst, __type_size) \ +TEST(gp_store_ ## __inst) \ +{ \ + int offset, ret; \ + uint8_t buf[16] __attribute__((aligned(16))); \ + \ + ret = prctl(PR_SET_UNALIGN, PR_UNALIGN_NOPRINT); \ + ASSERT_EQ(ret, 0); \ + \ + for (offset = 1; offset < __type_size / 8; offset++) { \ + uint ## __type_size ## _t val = VAL ## __type_size; \ + uint ## __type_size ## _t *ptr = (uint ## __type_size ## _t *) (buf + offset); \ + memset(ptr, 0, sizeof(val)); \ + test_ ## __inst(ptr, offset, val); \ + memcpy(&val, ptr, sizeof(val)); \ + EXPECT_EQ(VAL ## __type_size, val); \ + } \ +} +TEST_GP_STORE(sh, 16); +TEST_GP_STORE(sw, 32); +TEST_GP_STORE(sd, 64); +#ifdef __riscv_compressed +TEST_GP_STORE(c_sw, 32); +TEST_GP_STORE(c_sd, 64); +TEST_GP_STORE(c_sdsp, 64); +#endif + +#define __TEST_FPU_LOAD(__type, __inst, __reg_start, __reg_end) \ +TEST(fpu_load_ ## __inst) \ +{ \ + int i, ret, offset, fp_reg; \ + uint8_t buf[16] __attribute__((aligned(16))); \ + \ + ret = prctl(PR_SET_UNALIGN, PR_UNALIGN_NOPRINT); \ + ASSERT_EQ(ret, 0); \ + \ + for (fp_reg = __reg_start; fp_reg < __reg_end; fp_reg++) { \ + for (offset = 1; offset < 4; offset++) { \ + void *load_addr = (buf + offset); \ + __type val = VAL_ ## __type ; \ + \ + memcpy(load_addr, &val, sizeof(val)); \ + val = test_ ## __inst(fp_reg, load_addr, offset, val); \ + EXPECT_TRUE(__type ##_equal(val, VAL_## __type)); \ + } \ + } \ +} +#define TEST_FPU_LOAD(__type, __inst) \ + __TEST_FPU_LOAD(__type, __inst, 0, 32) +#define TEST_FPU_LOAD_COMPRESSED(__type, __inst) \ + __TEST_FPU_LOAD(__type, __inst, 8, 16) + +TEST_FPU_LOAD(float, flw) +TEST_FPU_LOAD(double, fld) +#ifdef __riscv_compressed +TEST_FPU_LOAD_COMPRESSED(double, c_fld) +TEST_FPU_LOAD_COMPRESSED(double, 
c_fldsp) +#endif + +#define __TEST_FPU_STORE(__type, __inst, __reg_start, __reg_end) \ +TEST(fpu_store_ ## __inst) \ +{ \ + int i, ret, offset, fp_reg; \ + uint8_t buf[16] __attribute__((aligned(16))); \ + \ + ret = prctl(PR_SET_UNALIGN, PR_UNALIGN_NOPRINT); \ + ASSERT_EQ(ret, 0); \ + \ + for (fp_reg = __reg_start; fp_reg < __reg_end; fp_reg++) { \ + for (offset = 1; offset < 4; offset++) { \ + \ + void *store_addr = (buf + offset); \ + __type val = VAL_ ## __type ; \ + \ + test_ ## __inst(fp_reg, store_addr, offset, val); \ + memcpy(&val, store_addr, sizeof(val)); \ + EXPECT_TRUE(__type ## _equal(val, VAL_## __type)); \ + } \ + } \ +} +#define TEST_FPU_STORE(__type, __inst) \ + __TEST_FPU_STORE(__type, __inst, 0, 32) +#define TEST_FPU_STORE_COMPRESSED(__type, __inst) \ + __TEST_FPU_STORE(__type, __inst, 8, 16) + +TEST_FPU_STORE(float, fsw) +TEST_FPU_STORE(double, fsd) +#ifdef __riscv_compressed +TEST_FPU_STORE_COMPRESSED(double, c_fsd) +TEST_FPU_STORE_COMPRESSED(double, c_fsdsp) +#endif + +TEST_SIGNAL(gen_sigbus, SIGBUS) +{ + uint32_t *ptr; + uint8_t buf[16] __attribute__((aligned(16))); + int ret; + + ret = prctl(PR_SET_UNALIGN, PR_UNALIGN_SIGBUS); + ASSERT_EQ(ret, 0); + + ptr = (uint32_t *)(buf + 1); + *ptr = 0xDEADBEEFULL; +} + +int main(int argc, char **argv) +{ + int ret, val; + + ret = prctl(PR_GET_UNALIGN, &val); + if (ret == -1 && errno == EINVAL) + ksft_exit_skip("SKIP GET_UNALIGN_CTL not supported\n"); + + exit(test_harness_run(argc, argv)); +} -- 2.47.2 From cleger at rivosinc.com Mon Mar 10 08:12:21 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 10 Mar 2025 16:12:21 +0100 Subject: [PATCH v3 14/17] RISC-V: KVM: add SBI extension init()/deinit() functions In-Reply-To: <20250310151229.2365992-1-cleger@rivosinc.com> References: <20250310151229.2365992-1-cleger@rivosinc.com> Message-ID: <20250310151229.2365992-15-cleger@rivosinc.com> The FWFT SBI extension will need to dynamically allocate memory and do init time 
specific initialization. Add init/deinit callbacks that allow extensions to do so. Signed-off-by: Clément Léger --- arch/riscv/include/asm/kvm_vcpu_sbi.h | 9 +++++++++ arch/riscv/kvm/vcpu.c | 2 ++ arch/riscv/kvm/vcpu_sbi.c | 29 +++++++++++++++++++++++++++ 3 files changed, 40 insertions(+) diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h index 4ed6203cdd30..bcb90757b149 100644 --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h @@ -49,6 +49,14 @@ struct kvm_vcpu_sbi_extension { /* Extension specific probe function */ unsigned long (*probe)(struct kvm_vcpu *vcpu); + + /* + * Init/deinit function called once during VCPU init/destroy. These + * might be used if the SBI extensions need to allocate or do specific + * init time only configuration. + */ + int (*init)(struct kvm_vcpu *vcpu); + void (*deinit)(struct kvm_vcpu *vcpu); }; void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run); @@ -69,6 +77,7 @@ const struct kvm_vcpu_sbi_extension *kvm_vcpu_sbi_find_ext( bool riscv_vcpu_supports_sbi_ext(struct kvm_vcpu *vcpu, int idx); int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run); void kvm_riscv_vcpu_sbi_init(struct kvm_vcpu *vcpu); +void kvm_riscv_vcpu_sbi_deinit(struct kvm_vcpu *vcpu); int kvm_riscv_vcpu_get_reg_sbi_sta(struct kvm_vcpu *vcpu, unsigned long reg_num, unsigned long *reg_val); diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c index 60d684c76c58..877bcc85c067 100644 --- a/arch/riscv/kvm/vcpu.c +++ b/arch/riscv/kvm/vcpu.c @@ -185,6 +185,8 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu) void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu) { + kvm_riscv_vcpu_sbi_deinit(vcpu); + /* Cleanup VCPU AIA context */ kvm_riscv_vcpu_aia_deinit(vcpu); diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c index d1c83a77735e..858ddefd7e7f 100644 --- a/arch/riscv/kvm/vcpu_sbi.c +++ b/arch/riscv/kvm/vcpu_sbi.c @@ -505,8 +505,37 @@ void 
kvm_riscv_vcpu_sbi_init(struct kvm_vcpu *vcpu) continue; } + if (!ext->default_disabled && ext->init && + ext->init(vcpu) != 0) { + scontext->ext_status[idx] = KVM_RISCV_SBI_EXT_STATUS_UNAVAILABLE; + continue; + } + scontext->ext_status[idx] = ext->default_disabled ? KVM_RISCV_SBI_EXT_STATUS_DISABLED : KVM_RISCV_SBI_EXT_STATUS_ENABLED; } } + +void kvm_riscv_vcpu_sbi_deinit(struct kvm_vcpu *vcpu) +{ + struct kvm_vcpu_sbi_context *scontext = &vcpu->arch.sbi_context; + const struct kvm_riscv_sbi_extension_entry *entry; + const struct kvm_vcpu_sbi_extension *ext; + int idx, i; + + for (i = 0; i < ARRAY_SIZE(sbi_ext); i++) { + entry = &sbi_ext[i]; + ext = entry->ext_ptr; + idx = entry->ext_idx; + + if (idx < 0 || idx >= ARRAY_SIZE(scontext->ext_status)) + continue; + + if (scontext->ext_status[idx] == KVM_RISCV_SBI_EXT_STATUS_UNAVAILABLE || + !ext->deinit) + continue; + + ext->deinit(vcpu); + } +} -- 2.47.2 From cleger at rivosinc.com Mon Mar 10 08:12:22 2025 From: cleger at rivosinc.com (Clément Léger) Date: Mon, 10 Mar 2025 16:12:22 +0100 Subject: [PATCH v3 15/17] RISC-V: KVM: add SBI extension reset callback In-Reply-To: <20250310151229.2365992-1-cleger@rivosinc.com> References: <20250310151229.2365992-1-cleger@rivosinc.com> Message-ID: <20250310151229.2365992-16-cleger@rivosinc.com> Currently, only the STA extension needs a reset function, but FWFT is going to need one as well. Add a reset callback that can be implemented by SBI extensions. 
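As a rough illustration of the dispatch described above, here is a minimal userspace sketch in C. The names struct vcpu, struct sbi_ext, exts[] and sbi_reset_all() are hypothetical stand-ins, not the kernel definitions; only the pattern (skip extensions that are not enabled or that provide no reset callback) is taken from this patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real KVM definitions. */
enum ext_status { EXT_UNAVAILABLE, EXT_DISABLED, EXT_ENABLED };

struct vcpu {
	int sta_last_steal;		/* state owned by a "STA"-like extension */
	int fwft_flags;			/* state owned by a "FWFT"-like extension */
	enum ext_status status[3];	/* one status slot per entry in exts[] */
};

struct sbi_ext {
	const char *name;
	void (*reset)(struct vcpu *vcpu);	/* optional, may be NULL */
};

static void sta_reset(struct vcpu *vcpu)
{
	vcpu->sta_last_steal = 0;
}

static void fwft_reset(struct vcpu *vcpu)
{
	vcpu->fwft_flags = 0;
}

static const struct sbi_ext exts[] = {
	{ "sta",  sta_reset  },
	{ "fwft", fwft_reset },
	{ "base", NULL       },	/* extension without per-vCPU state to reset */
};

#define NR_EXTS (sizeof(exts) / sizeof(exts[0]))

/*
 * Call reset() for every extension that is enabled and provides the
 * callback. Returns the number of callbacks invoked.
 */
static int sbi_reset_all(struct vcpu *vcpu)
{
	int called = 0;
	size_t i;

	for (i = 0; i < NR_EXTS; i++) {
		if (vcpu->status[i] != EXT_ENABLED || !exts[i].reset)
			continue;

		exts[i].reset(vcpu);
		called++;
	}

	return called;
}
```

With this shape, the generic vCPU reset path calls a single function and each extension keeps its reset logic private, which appears to be why kvm_riscv_vcpu_sbi_sta_reset() can become static in this patch.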
Signed-off-by: Cl?ment L?ger --- arch/riscv/include/asm/kvm_host.h | 1 - arch/riscv/include/asm/kvm_vcpu_sbi.h | 2 ++ arch/riscv/kvm/vcpu.c | 2 +- arch/riscv/kvm/vcpu_sbi.c | 24 ++++++++++++++++++++++++ arch/riscv/kvm/vcpu_sbi_sta.c | 3 ++- 5 files changed, 29 insertions(+), 3 deletions(-) diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h index cc33e35cd628..bb93d2995ea2 100644 --- a/arch/riscv/include/asm/kvm_host.h +++ b/arch/riscv/include/asm/kvm_host.h @@ -409,7 +409,6 @@ void __kvm_riscv_vcpu_power_on(struct kvm_vcpu *vcpu); void kvm_riscv_vcpu_power_on(struct kvm_vcpu *vcpu); bool kvm_riscv_vcpu_stopped(struct kvm_vcpu *vcpu); -void kvm_riscv_vcpu_sbi_sta_reset(struct kvm_vcpu *vcpu); void kvm_riscv_vcpu_record_steal_time(struct kvm_vcpu *vcpu); #endif /* __RISCV_KVM_HOST_H__ */ diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h index bcb90757b149..cb68b3a57c8f 100644 --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h @@ -57,6 +57,7 @@ struct kvm_vcpu_sbi_extension { */ int (*init)(struct kvm_vcpu *vcpu); void (*deinit)(struct kvm_vcpu *vcpu); + void (*reset)(struct kvm_vcpu *vcpu); }; void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run); @@ -78,6 +79,7 @@ bool riscv_vcpu_supports_sbi_ext(struct kvm_vcpu *vcpu, int idx); int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run); void kvm_riscv_vcpu_sbi_init(struct kvm_vcpu *vcpu); void kvm_riscv_vcpu_sbi_deinit(struct kvm_vcpu *vcpu); +void kvm_riscv_vcpu_sbi_reset(struct kvm_vcpu *vcpu); int kvm_riscv_vcpu_get_reg_sbi_sta(struct kvm_vcpu *vcpu, unsigned long reg_num, unsigned long *reg_val); diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c index 877bcc85c067..542747e2c7f5 100644 --- a/arch/riscv/kvm/vcpu.c +++ b/arch/riscv/kvm/vcpu.c @@ -94,7 +94,7 @@ static void kvm_riscv_reset_vcpu(struct kvm_vcpu *vcpu) vcpu->arch.hfence_tail = 0; 
memset(vcpu->arch.hfence_queue, 0, sizeof(vcpu->arch.hfence_queue)); - kvm_riscv_vcpu_sbi_sta_reset(vcpu); + kvm_riscv_vcpu_sbi_reset(vcpu); /* Reset the guest CSRs for hotplug usecase */ if (loaded) diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c index 858ddefd7e7f..18726096ef44 100644 --- a/arch/riscv/kvm/vcpu_sbi.c +++ b/arch/riscv/kvm/vcpu_sbi.c @@ -539,3 +539,27 @@ void kvm_riscv_vcpu_sbi_deinit(struct kvm_vcpu *vcpu) ext->deinit(vcpu); } } + +void kvm_riscv_vcpu_sbi_reset(struct kvm_vcpu *vcpu) +{ + struct kvm_vcpu_sbi_context *scontext = &vcpu->arch.sbi_context; + const struct kvm_riscv_sbi_extension_entry *entry; + const struct kvm_vcpu_sbi_extension *ext; + int idx, i; + + for (i = 0; i < ARRAY_SIZE(sbi_ext); i++) { + entry = &sbi_ext[i]; + ext = entry->ext_ptr; + idx = entry->ext_idx; + + if (idx < 0 || idx >= ARRAY_SIZE(scontext->ext_status)) + continue; + + if (scontext->ext_status[idx] != KVM_RISCV_SBI_EXT_STATUS_ENABLED || + !ext->reset) + continue; + + ext->reset(vcpu); + } +} + diff --git a/arch/riscv/kvm/vcpu_sbi_sta.c b/arch/riscv/kvm/vcpu_sbi_sta.c index 5f35427114c1..cc6cb7c8f0e4 100644 --- a/arch/riscv/kvm/vcpu_sbi_sta.c +++ b/arch/riscv/kvm/vcpu_sbi_sta.c @@ -16,7 +16,7 @@ #include #include -void kvm_riscv_vcpu_sbi_sta_reset(struct kvm_vcpu *vcpu) +static void kvm_riscv_vcpu_sbi_sta_reset(struct kvm_vcpu *vcpu) { vcpu->arch.sta.shmem = INVALID_GPA; vcpu->arch.sta.last_steal = 0; @@ -156,6 +156,7 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_sta = { .extid_end = SBI_EXT_STA, .handler = kvm_sbi_ext_sta_handler, .probe = kvm_sbi_ext_sta_probe, + .reset = kvm_riscv_vcpu_sbi_sta_reset, }; int kvm_riscv_vcpu_get_reg_sbi_sta(struct kvm_vcpu *vcpu, -- 2.47.2 From cleger at rivosinc.com Mon Mar 10 08:12:23 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 10 Mar 2025 16:12:23 +0100 Subject: [PATCH v3 16/17] RISC-V: KVM: add support for FWFT SBI extension In-Reply-To: 
<20250310151229.2365992-1-cleger@rivosinc.com> References: <20250310151229.2365992-1-cleger@rivosinc.com> Message-ID: <20250310151229.2365992-17-cleger@rivosinc.com> Add basic infrastructure to support the FWFT extension in KVM. Signed-off-by: Cl?ment L?ger --- arch/riscv/include/asm/kvm_host.h | 4 + arch/riscv/include/asm/kvm_vcpu_sbi.h | 1 + arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h | 31 +++ arch/riscv/include/uapi/asm/kvm.h | 1 + arch/riscv/kvm/Makefile | 1 + arch/riscv/kvm/vcpu_sbi.c | 4 + arch/riscv/kvm/vcpu_sbi_fwft.c | 212 +++++++++++++++++++++ 7 files changed, 254 insertions(+) create mode 100644 arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h create mode 100644 arch/riscv/kvm/vcpu_sbi_fwft.c diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h index bb93d2995ea2..c0db61ba691a 100644 --- a/arch/riscv/include/asm/kvm_host.h +++ b/arch/riscv/include/asm/kvm_host.h @@ -19,6 +19,7 @@ #include #include #include +#include #include #include @@ -281,6 +282,9 @@ struct kvm_vcpu_arch { /* Performance monitoring context */ struct kvm_pmu pmu_context; + /* Firmware feature SBI extension context */ + struct kvm_sbi_fwft fwft_context; + /* 'static' configurations which are set only once */ struct kvm_vcpu_config cfg; diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h index cb68b3a57c8f..ffd03fed0c06 100644 --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h @@ -98,6 +98,7 @@ extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_hsm; extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_dbcn; extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_susp; extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_sta; +extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_fwft; extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_experimental; extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_vendor; diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h 
b/arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h new file mode 100644 index 000000000000..ec7568e0dc1a --- /dev/null +++ b/arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h @@ -0,0 +1,31 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (c) 2025 Rivos Inc. + * + * Authors: + * Cl?ment L?ger + */ + +#ifndef __KVM_VCPU_RISCV_FWFT_H +#define __KVM_VCPU_RISCV_FWFT_H + +#include + +struct kvm_sbi_fwft_config; +struct kvm_sbi_fwft_feature; +struct kvm_vcpu; + +struct kvm_sbi_fwft_config { + const struct kvm_sbi_fwft_feature *feature; + bool supported; + unsigned long flags; +}; + +/* FWFT data structure per vcpu */ +struct kvm_sbi_fwft { + struct kvm_sbi_fwft_config *configs; +}; + +#define vcpu_to_fwft(vcpu) (&(vcpu)->arch.fwft_context) + +#endif /* !__KVM_VCPU_RISCV_FWFT_H */ diff --git a/arch/riscv/include/uapi/asm/kvm.h b/arch/riscv/include/uapi/asm/kvm.h index f06bc5efcd79..fa6eee1caf41 100644 --- a/arch/riscv/include/uapi/asm/kvm.h +++ b/arch/riscv/include/uapi/asm/kvm.h @@ -202,6 +202,7 @@ enum KVM_RISCV_SBI_EXT_ID { KVM_RISCV_SBI_EXT_DBCN, KVM_RISCV_SBI_EXT_STA, KVM_RISCV_SBI_EXT_SUSP, + KVM_RISCV_SBI_EXT_FWFT, KVM_RISCV_SBI_EXT_MAX, }; diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile index 4e0bba91d284..06e2d52a9b88 100644 --- a/arch/riscv/kvm/Makefile +++ b/arch/riscv/kvm/Makefile @@ -26,6 +26,7 @@ kvm-y += vcpu_onereg.o kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o kvm-y += vcpu_sbi.o kvm-y += vcpu_sbi_base.o +kvm-y += vcpu_sbi_fwft.o kvm-y += vcpu_sbi_hsm.o kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_sbi_pmu.o kvm-y += vcpu_sbi_replace.o diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c index 18726096ef44..27f22e98c8f8 100644 --- a/arch/riscv/kvm/vcpu_sbi.c +++ b/arch/riscv/kvm/vcpu_sbi.c @@ -78,6 +78,10 @@ static const struct kvm_riscv_sbi_extension_entry sbi_ext[] = { .ext_idx = KVM_RISCV_SBI_EXT_STA, .ext_ptr = &vcpu_sbi_ext_sta, }, + { + .ext_idx = KVM_RISCV_SBI_EXT_FWFT, + .ext_ptr = &vcpu_sbi_ext_fwft, + }, { .ext_idx = 
KVM_RISCV_SBI_EXT_EXPERIMENTAL, .ext_ptr = &vcpu_sbi_ext_experimental, diff --git a/arch/riscv/kvm/vcpu_sbi_fwft.c b/arch/riscv/kvm/vcpu_sbi_fwft.c new file mode 100644 index 000000000000..cce1e41d5490 --- /dev/null +++ b/arch/riscv/kvm/vcpu_sbi_fwft.c @@ -0,0 +1,212 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2025 Rivos Inc. + * + * Authors: + * Cl?ment L?ger + */ + +#include +#include +#include +#include +#include +#include +#include + +struct kvm_sbi_fwft_feature { + /** + * @id: Feature ID + */ + enum sbi_fwft_feature_t id; + + /** + * @supported: Check if the feature is supported on the vcpu + * + * This callback is optional, if not provided the feature is assumed to + * be supported + */ + bool (*supported)(struct kvm_vcpu *vcpu); + + /** + * @set: Set the feature value + * + * This callback is mandatory + */ + int (*set)(struct kvm_vcpu *vcpu, struct kvm_sbi_fwft_config *conf, unsigned long value); + + /** + * @get: Get the feature current value + * + * This callback is mandatory + */ + int (*get)(struct kvm_vcpu *vcpu, struct kvm_sbi_fwft_config *conf, unsigned long *value); +}; + +static const enum sbi_fwft_feature_t kvm_fwft_defined_features[] = { + SBI_FWFT_MISALIGNED_EXC_DELEG, + SBI_FWFT_LANDING_PAD, + SBI_FWFT_SHADOW_STACK, + SBI_FWFT_DOUBLE_TRAP, + SBI_FWFT_PTE_AD_HW_UPDATING, + SBI_FWFT_POINTER_MASKING_PMLEN, +}; + +static bool kvm_fwft_is_defined_feature(enum sbi_fwft_feature_t feature) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(kvm_fwft_defined_features); i++) { + if (kvm_fwft_defined_features[i] == feature) + return true; + } + + return false; +} + +static const struct kvm_sbi_fwft_feature features[] = { +}; + +static struct kvm_sbi_fwft_config * +kvm_sbi_fwft_get_config(struct kvm_vcpu *vcpu, enum sbi_fwft_feature_t feature) +{ + int i = 0; + struct kvm_sbi_fwft *fwft = vcpu_to_fwft(vcpu); + + for (i = 0; i < ARRAY_SIZE(features); i++) { + if (fwft->configs[i].feature->id == feature) + return &fwft->configs[i]; + } + + return 
NULL; +} + +static int kvm_fwft_get_feature(struct kvm_vcpu *vcpu, u32 feature, + struct kvm_sbi_fwft_config **conf) +{ + struct kvm_sbi_fwft_config *tconf; + + tconf = kvm_sbi_fwft_get_config(vcpu, feature); + if (!tconf) { + if (kvm_fwft_is_defined_feature(feature)) + return SBI_ERR_NOT_SUPPORTED; + + return SBI_ERR_DENIED; + } + + if (!tconf->supported) + return SBI_ERR_NOT_SUPPORTED; + + *conf = tconf; + + return SBI_SUCCESS; +} + +static int kvm_sbi_fwft_set(struct kvm_vcpu *vcpu, u32 feature, + unsigned long value, unsigned long flags) +{ + int ret; + struct kvm_sbi_fwft_config *conf; + + ret = kvm_fwft_get_feature(vcpu, feature, &conf); + if (ret) + return ret; + + if ((flags & ~SBI_FWFT_SET_FLAG_LOCK) != 0) + return SBI_ERR_INVALID_PARAM; + + if (conf->flags & SBI_FWFT_SET_FLAG_LOCK) + return SBI_ERR_DENIED_LOCKED; + + conf->flags = flags; + + return conf->feature->set(vcpu, conf, value); +} + +static int kvm_sbi_fwft_get(struct kvm_vcpu *vcpu, unsigned long feature, + unsigned long *value) +{ + int ret; + struct kvm_sbi_fwft_config *conf; + + ret = kvm_fwft_get_feature(vcpu, feature, &conf); + if (ret) + return ret; + + return conf->feature->get(vcpu, conf, value); +} + +static int kvm_sbi_ext_fwft_handler(struct kvm_vcpu *vcpu, struct kvm_run *run, + struct kvm_vcpu_sbi_return *retdata) +{ + int ret = 0; + struct kvm_cpu_context *cp = &vcpu->arch.guest_context; + unsigned long funcid = cp->a6; + + switch (funcid) { + case SBI_EXT_FWFT_SET: + ret = kvm_sbi_fwft_set(vcpu, cp->a0, cp->a1, cp->a2); + break; + case SBI_EXT_FWFT_GET: + ret = kvm_sbi_fwft_get(vcpu, cp->a0, &retdata->out_val); + break; + default: + ret = SBI_ERR_NOT_SUPPORTED; + break; + } + + retdata->err_val = ret; + + return 0; +} + +static int kvm_sbi_ext_fwft_init(struct kvm_vcpu *vcpu) +{ + struct kvm_sbi_fwft *fwft = vcpu_to_fwft(vcpu); + const struct kvm_sbi_fwft_feature *feature; + struct kvm_sbi_fwft_config *conf; + int i; + + fwft->configs = kcalloc(ARRAY_SIZE(features), sizeof(struct 
kvm_sbi_fwft_config), + GFP_KERNEL); + if (!fwft->configs) + return -ENOMEM; + + for (i = 0; i < ARRAY_SIZE(features); i++) { + feature = &features[i]; + conf = &fwft->configs[i]; + if (feature->supported) + conf->supported = feature->supported(vcpu); + else + conf->supported = true; + + conf->feature = feature; + } + + return 0; +} + +static void kvm_sbi_ext_fwft_deinit(struct kvm_vcpu *vcpu) +{ + struct kvm_sbi_fwft *fwft = vcpu_to_fwft(vcpu); + + kfree(fwft->configs); +} + +static void kvm_sbi_ext_fwft_reset(struct kvm_vcpu *vcpu) +{ + int i = 0; + struct kvm_sbi_fwft *fwft = vcpu_to_fwft(vcpu); + + for (i = 0; i < ARRAY_SIZE(features); i++) + fwft->configs[i].flags = 0; +} + +const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_fwft = { + .extid_start = SBI_EXT_FWFT, + .extid_end = SBI_EXT_FWFT, + .handler = kvm_sbi_ext_fwft_handler, + .init = kvm_sbi_ext_fwft_init, + .deinit = kvm_sbi_ext_fwft_deinit, + .reset = kvm_sbi_ext_fwft_reset, +}; -- 2.47.2 From cleger at rivosinc.com Mon Mar 10 08:12:24 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 10 Mar 2025 16:12:24 +0100 Subject: [PATCH v3 17/17] RISC-V: KVM: add support for SBI_FWFT_MISALIGNED_DELEG In-Reply-To: <20250310151229.2365992-1-cleger@rivosinc.com> References: <20250310151229.2365992-1-cleger@rivosinc.com> Message-ID: <20250310151229.2365992-18-cleger@rivosinc.com> SBI_FWFT_MISALIGNED_DELEG needs hedeleg to be modified to delegate misaligned load/store exceptions. Save and restore it during CPU load/put. 
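The following is a small userspace sketch in C of the idea, under stated assumptions: hw_hedeleg simulates the CSR (the patch itself uses csr_set()/csr_clear() on CSR_HEDELEG), and struct vcpu_cfg, vcpu_put() and vcpu_load() are hypothetical stand-ins for the per-vCPU save/restore path. The cause numbers 4 and 6 are the load and store/AMO address misaligned codes from the RISC-V privileged specification.

```c
#include <stdint.h>

/* Exception cause numbers from the RISC-V privileged specification. */
#define EXC_LOAD_MISALIGNED	4
#define EXC_STORE_MISALIGNED	6
#define MIS_DELEG ((1ULL << EXC_LOAD_MISALIGNED) | (1ULL << EXC_STORE_MISALIGNED))

/* Simulated hardware CSR; an assumption made to keep the sketch in userspace. */
static uint64_t hw_hedeleg;

/* Hypothetical per-vCPU copy of hedeleg kept while the vCPU is not loaded. */
struct vcpu_cfg {
	uint64_t hedeleg;
};

/* Returns 0 on success, -1 when value is neither 0 nor 1. */
static int set_misaligned_deleg(unsigned long value)
{
	if (value == 1)
		hw_hedeleg |= MIS_DELEG;
	else if (value == 0)
		hw_hedeleg &= ~MIS_DELEG;
	else
		return -1;

	return 0;
}

/* Reports 1 if any misaligned delegation bit is currently set. */
static unsigned long get_misaligned_deleg(void)
{
	return (hw_hedeleg & MIS_DELEG) != 0;
}

/* Save the guest-modified value when the vCPU is scheduled out. */
static void vcpu_put(struct vcpu_cfg *cfg)
{
	cfg->hedeleg = hw_hedeleg;
}

/* Restore the saved value when the vCPU is scheduled back in. */
static void vcpu_load(const struct vcpu_cfg *cfg)
{
	hw_hedeleg = cfg->hedeleg;
}
```

Without the save on put, a value set by one guest would either leak to the next vCPU loaded on the same hart or be lost when the original vCPU is loaded again, which is the reason for the hedeleg read added to kvm_arch_vcpu_put().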
Signed-off-by: Cl?ment L?ger Reviewed-by: Deepak Gupta --- arch/riscv/kvm/vcpu.c | 3 +++ arch/riscv/kvm/vcpu_sbi_fwft.c | 39 ++++++++++++++++++++++++++++++++++ 2 files changed, 42 insertions(+) diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c index 542747e2c7f5..d98e379945c3 100644 --- a/arch/riscv/kvm/vcpu.c +++ b/arch/riscv/kvm/vcpu.c @@ -646,6 +646,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) { void *nsh; struct kvm_vcpu_csr *csr = &vcpu->arch.guest_csr; + struct kvm_vcpu_config *cfg = &vcpu->arch.cfg; vcpu->cpu = -1; @@ -671,6 +672,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) csr->vstval = nacl_csr_read(nsh, CSR_VSTVAL); csr->hvip = nacl_csr_read(nsh, CSR_HVIP); csr->vsatp = nacl_csr_read(nsh, CSR_VSATP); + cfg->hedeleg = nacl_csr_read(nsh, CSR_HEDELEG); } else { csr->vsstatus = csr_read(CSR_VSSTATUS); csr->vsie = csr_read(CSR_VSIE); @@ -681,6 +683,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) csr->vstval = csr_read(CSR_VSTVAL); csr->hvip = csr_read(CSR_HVIP); csr->vsatp = csr_read(CSR_VSATP); + cfg->hedeleg = csr_read(CSR_HEDELEG); } } diff --git a/arch/riscv/kvm/vcpu_sbi_fwft.c b/arch/riscv/kvm/vcpu_sbi_fwft.c index cce1e41d5490..756fda1cf2e7 100644 --- a/arch/riscv/kvm/vcpu_sbi_fwft.c +++ b/arch/riscv/kvm/vcpu_sbi_fwft.c @@ -14,6 +14,8 @@ #include #include +#define MIS_DELEG (BIT_ULL(EXC_LOAD_MISALIGNED) | BIT_ULL(EXC_STORE_MISALIGNED)) + struct kvm_sbi_fwft_feature { /** * @id: Feature ID @@ -64,7 +66,44 @@ static bool kvm_fwft_is_defined_feature(enum sbi_fwft_feature_t feature) return false; } +static bool kvm_sbi_fwft_misaligned_delegation_supported(struct kvm_vcpu *vcpu) +{ + if (!misaligned_traps_can_delegate()) + return false; + + return true; +} + +static int kvm_sbi_fwft_set_misaligned_delegation(struct kvm_vcpu *vcpu, + struct kvm_sbi_fwft_config *conf, + unsigned long value) +{ + if (value == 1) + csr_set(CSR_HEDELEG, MIS_DELEG); + else if (value == 0) + csr_clear(CSR_HEDELEG, MIS_DELEG); + else + return 
SBI_ERR_INVALID_PARAM; + + return SBI_SUCCESS; +} + +static int kvm_sbi_fwft_get_misaligned_delegation(struct kvm_vcpu *vcpu, + struct kvm_sbi_fwft_config *conf, + unsigned long *value) +{ + *value = (csr_read(CSR_HEDELEG) & MIS_DELEG) != 0; + + return SBI_SUCCESS; +} + static const struct kvm_sbi_fwft_feature features[] = { + { + .id = SBI_FWFT_MISALIGNED_EXC_DELEG, + .supported = kvm_sbi_fwft_misaligned_delegation_supported, + .set = kvm_sbi_fwft_set_misaligned_delegation, + .get = kvm_sbi_fwft_get_misaligned_delegation, + }, }; static struct kvm_sbi_fwft_config * -- 2.47.2 From ajones at ventanamicro.com Wed Mar 12 09:29:10 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Wed, 12 Mar 2025 17:29:10 +0100 Subject: [kvm-unit-tests PATCH v8 5/6] lib: riscv: Add SBI SSE support In-Reply-To: <20250307161549.1873770-6-cleger@rivosinc.com> References: <20250307161549.1873770-1-cleger@rivosinc.com> <20250307161549.1873770-6-cleger@rivosinc.com> Message-ID: <20250312-cf0b7348207165c9dceddd56@orel> On Fri, Mar 07, 2025 at 05:15:47PM +0100, Cl?ment L?ger wrote: > Add support for registering and handling SSE events. This will be used > by sbi test as well as upcoming double trap tests. 
> > Signed-off-by: Cl?ment L?ger > Reviewed-by: Andrew Jones > --- > riscv/Makefile | 1 + > lib/riscv/asm/csr.h | 1 + > lib/riscv/asm/sbi.h | 38 ++++++++++++++- > lib/riscv/sbi-sse-asm.S | 103 ++++++++++++++++++++++++++++++++++++++++ > lib/riscv/asm-offsets.c | 9 ++++ > lib/riscv/sbi.c | 76 +++++++++++++++++++++++++++++ > 6 files changed, 227 insertions(+), 1 deletion(-) > create mode 100644 lib/riscv/sbi-sse-asm.S > > diff --git a/riscv/Makefile b/riscv/Makefile > index 02d2ac39..16fc125b 100644 > --- a/riscv/Makefile > +++ b/riscv/Makefile > @@ -43,6 +43,7 @@ cflatobjs += lib/riscv/setup.o > cflatobjs += lib/riscv/smp.o > cflatobjs += lib/riscv/stack.o > cflatobjs += lib/riscv/timer.o > +cflatobjs += lib/riscv/sbi-sse-asm.o > ifeq ($(ARCH),riscv32) > cflatobjs += lib/ldiv32.o > endif > diff --git a/lib/riscv/asm/csr.h b/lib/riscv/asm/csr.h > index c7fc87a9..3e4b5fca 100644 > --- a/lib/riscv/asm/csr.h > +++ b/lib/riscv/asm/csr.h > @@ -17,6 +17,7 @@ > #define CSR_TIME 0xc01 > > #define SR_SIE _AC(0x00000002, UL) > +#define SR_SPP _AC(0x00000100, UL) > > /* Exception cause high bit - is an interrupt if set */ > #define CAUSE_IRQ_FLAG (_AC(1, UL) << (__riscv_xlen - 1)) > diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h > index 780c9edd..acef8a5e 100644 > --- a/lib/riscv/asm/sbi.h > +++ b/lib/riscv/asm/sbi.h > @@ -230,5 +230,41 @@ struct sbiret sbi_send_ipi_broadcast(void); > struct sbiret sbi_set_timer(unsigned long stime_value); > long sbi_probe(int ext); > > -#endif /* !__ASSEMBLER__ */ > +typedef void (*sbi_sse_handler_fn)(void *data, struct pt_regs *regs, unsigned int hartid); > + > +struct sbi_sse_handler_arg { > + unsigned long reg_tmp; > + sbi_sse_handler_fn handler; > + void *handler_data; > + void *stack; > +}; > + > +extern void sbi_sse_entry(void); > + > +static inline bool sbi_sse_event_is_global(uint32_t event_id) > +{ > + return !!(event_id & SBI_SSE_EVENT_GLOBAL_BIT); > +} > + > +struct sbiret sbi_sse_read_attrs_raw(unsigned long event_id, 
unsigned long base_attr_id, > + unsigned long attr_count, unsigned long phys_lo, > + unsigned long phys_hi); > +struct sbiret sbi_sse_read_attrs(unsigned long event_id, unsigned long base_attr_id, > + unsigned long attr_count, unsigned long *values); > +struct sbiret sbi_sse_write_attrs_raw(unsigned long event_id, unsigned long base_attr_id, > + unsigned long attr_count, unsigned long phys_lo, > + unsigned long phys_hi); > +struct sbiret sbi_sse_write_attrs(unsigned long event_id, unsigned long base_attr_id, > + unsigned long attr_count, unsigned long *values); > +struct sbiret sbi_sse_register_raw(unsigned long event_id, unsigned long entry_pc, > + unsigned long entry_arg); > +struct sbiret sbi_sse_register(unsigned long event_id, struct sbi_sse_handler_arg *arg); > +struct sbiret sbi_sse_unregister(unsigned long event_id); > +struct sbiret sbi_sse_enable(unsigned long event_id); > +struct sbiret sbi_sse_disable(unsigned long event_id); > +struct sbiret sbi_sse_hart_mask(void); > +struct sbiret sbi_sse_hart_unmask(void); > +struct sbiret sbi_sse_inject(unsigned long event_id, unsigned long hart_id); > + > +#endif /* !__ASSEMBLY__ */ When rebasing on latest master this should have been left as __ASSEMBLER__ > #endif /* _ASMRISCV_SBI_H_ */ > diff --git a/lib/riscv/sbi-sse-asm.S b/lib/riscv/sbi-sse-asm.S > new file mode 100644 > index 00000000..e4efd1ff > --- /dev/null > +++ b/lib/riscv/sbi-sse-asm.S > @@ -0,0 +1,103 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* > + * RISC-V SSE events entry point. 
> + * > + * Copyright (C) 2025, Rivos Inc., Cl?ment L?ger > + */ > +#define __ASSEMBLY__ __ASSEMBLER__ > +#include > +#include > +#include > +#include > + > +.section .text > +.global sbi_sse_entry > +sbi_sse_entry: > + /* Save stack temporarily */ > + REG_S sp, SBI_SSE_REG_TMP(a7) > + /* Set entry stack */ > + REG_L sp, SBI_SSE_HANDLER_STACK(a7) > + > + addi sp, sp, -(PT_SIZE) > + REG_S ra, PT_RA(sp) > + REG_S s0, PT_S0(sp) > + REG_S s1, PT_S1(sp) > + REG_S s2, PT_S2(sp) > + REG_S s3, PT_S3(sp) > + REG_S s4, PT_S4(sp) > + REG_S s5, PT_S5(sp) > + REG_S s6, PT_S6(sp) > + REG_S s7, PT_S7(sp) > + REG_S s8, PT_S8(sp) > + REG_S s9, PT_S9(sp) > + REG_S s10, PT_S10(sp) > + REG_S s11, PT_S11(sp) > + REG_S tp, PT_TP(sp) > + REG_S t0, PT_T0(sp) > + REG_S t1, PT_T1(sp) > + REG_S t2, PT_T2(sp) > + REG_S t3, PT_T3(sp) > + REG_S t4, PT_T4(sp) > + REG_S t5, PT_T5(sp) > + REG_S t6, PT_T6(sp) > + REG_S gp, PT_GP(sp) > + REG_S a0, PT_A0(sp) > + REG_S a1, PT_A1(sp) > + REG_S a2, PT_A2(sp) > + REG_S a3, PT_A3(sp) > + REG_S a4, PT_A4(sp) > + REG_S a5, PT_A5(sp) > + csrr a1, CSR_SEPC > + REG_S a1, PT_EPC(sp) > + csrr a2, CSR_SSTATUS > + REG_S a2, PT_STATUS(sp) > + > + REG_L a0, SBI_SSE_REG_TMP(a7) > + REG_S a0, PT_SP(sp) > + > + REG_L t0, SBI_SSE_HANDLER(a7) > + REG_L a0, SBI_SSE_HANDLER_DATA(a7) > + mv a1, sp > + mv a2, a6 > + jalr t0 > + > + REG_L a1, PT_EPC(sp) > + REG_L a2, PT_STATUS(sp) > + csrw CSR_SEPC, a1 > + csrw CSR_SSTATUS, a2 > + > + REG_L ra, PT_RA(sp) > + REG_L s0, PT_S0(sp) > + REG_L s1, PT_S1(sp) > + REG_L s2, PT_S2(sp) > + REG_L s3, PT_S3(sp) > + REG_L s4, PT_S4(sp) > + REG_L s5, PT_S5(sp) > + REG_L s6, PT_S6(sp) > + REG_L s7, PT_S7(sp) > + REG_L s8, PT_S8(sp) > + REG_L s9, PT_S9(sp) > + REG_L s10, PT_S10(sp) > + REG_L s11, PT_S11(sp) > + REG_L tp, PT_TP(sp) > + REG_L t0, PT_T0(sp) > + REG_L t1, PT_T1(sp) > + REG_L t2, PT_T2(sp) > + REG_L t3, PT_T3(sp) > + REG_L t4, PT_T4(sp) > + REG_L t5, PT_T5(sp) > + REG_L t6, PT_T6(sp) > + REG_L gp, PT_GP(sp) > + REG_L a0, PT_A0(sp) 
> + REG_L a1, PT_A1(sp) > + REG_L a2, PT_A2(sp) > + REG_L a3, PT_A3(sp) > + REG_L a4, PT_A4(sp) > + REG_L a5, PT_A5(sp) > + > + REG_L sp, PT_SP(sp) > + > + li a7, ASM_SBI_EXT_SSE > + li a6, ASM_SBI_EXT_SSE_COMPLETE > + ecall > + > diff --git a/lib/riscv/asm-offsets.c b/lib/riscv/asm-offsets.c > index 6c511c14..a96c6e97 100644 > --- a/lib/riscv/asm-offsets.c > +++ b/lib/riscv/asm-offsets.c > @@ -3,6 +3,7 @@ > #include > #include > #include > +#include > #include > > int main(void) > @@ -63,5 +64,13 @@ int main(void) > OFFSET(THREAD_INFO_HARTID, thread_info, hartid); > DEFINE(THREAD_INFO_SIZE, sizeof(struct thread_info)); > > + DEFINE(ASM_SBI_EXT_SSE, SBI_EXT_SSE); > + DEFINE(ASM_SBI_EXT_SSE_COMPLETE, SBI_EXT_SSE_COMPLETE); > + > + OFFSET(SBI_SSE_REG_TMP, sbi_sse_handler_arg, reg_tmp); > + OFFSET(SBI_SSE_HANDLER, sbi_sse_handler_arg, handler); > + OFFSET(SBI_SSE_HANDLER_DATA, sbi_sse_handler_arg, handler_data); > + OFFSET(SBI_SSE_HANDLER_STACK, sbi_sse_handler_arg, stack); > + > return 0; > } > diff --git a/lib/riscv/sbi.c b/lib/riscv/sbi.c > index 02dd338c..1752c916 100644 > --- a/lib/riscv/sbi.c > +++ b/lib/riscv/sbi.c > @@ -2,6 +2,7 @@ > #include > #include > #include > +#include > #include > #include > > @@ -31,6 +32,81 @@ struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0, > return ret; > } > > +struct sbiret sbi_sse_read_attrs_raw(unsigned long event_id, unsigned long base_attr_id, > + unsigned long attr_count, unsigned long phys_lo, > + unsigned long phys_hi) extra space here ^ > +{ > + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_READ_ATTRS, event_id, base_attr_id, attr_count, > + phys_lo, phys_hi, 0); > +} > + > +struct sbiret sbi_sse_read_attrs(unsigned long event_id, unsigned long base_attr_id, > + unsigned long attr_count, unsigned long *values) > +{ > + phys_addr_t p = virt_to_phys(values); > + > + return sbi_sse_read_attrs_raw(event_id, base_attr_id, attr_count, lower_32_bits(p), > + upper_32_bits(p)); > +} > + > +struct sbiret 
sbi_sse_write_attrs_raw(unsigned long event_id, unsigned long base_attr_id, > + unsigned long attr_count, unsigned long phys_lo, > + unsigned long phys_hi) > +{ > + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_WRITE_ATTRS, event_id, base_attr_id, attr_count, > + phys_lo, phys_hi, 0); > +} > + > +struct sbiret sbi_sse_write_attrs(unsigned long event_id, unsigned long base_attr_id, > + unsigned long attr_count, unsigned long *values) > +{ > + phys_addr_t p = virt_to_phys(values); > + > + return sbi_sse_write_attrs_raw(event_id, base_attr_id, attr_count, lower_32_bits(p), > + upper_32_bits(p)); > +} > + > +struct sbiret sbi_sse_register_raw(unsigned long event_id, unsigned long entry_pc, > + unsigned long entry_arg) > +{ > + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_REGISTER, event_id, entry_pc, entry_arg, 0, 0, 0); > +} > + > +struct sbiret sbi_sse_register(unsigned long event_id, struct sbi_sse_handler_arg *arg) > +{ > + return sbi_sse_register_raw(event_id, (unsigned long) sbi_sse_entry, (unsigned long) arg); nit: no spaces after casts > +} > + > +struct sbiret sbi_sse_unregister(unsigned long event_id) > +{ > + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_UNREGISTER, event_id, 0, 0, 0, 0, 0); > +} > + > +struct sbiret sbi_sse_enable(unsigned long event_id) > +{ > + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_ENABLE, event_id, 0, 0, 0, 0, 0); > +} > + > +struct sbiret sbi_sse_disable(unsigned long event_id) > +{ > + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_DISABLE, event_id, 0, 0, 0, 0, 0); > +} > + > +struct sbiret sbi_sse_hart_mask(void) > +{ > + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_HART_MASK, 0, 0, 0, 0, 0, 0); > +} > + > +struct sbiret sbi_sse_hart_unmask(void) > +{ > + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_HART_UNMASK, 0, 0, 0, 0, 0, 0); > +} > + > +struct sbiret sbi_sse_inject(unsigned long event_id, unsigned long hart_id) > +{ > + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_INJECT, event_id, hart_id, 0, 0, 0, 0); > +} > + > void sbi_shutdown(void) > { > 
sbi_ecall(SBI_EXT_SRST, 0, 0, 0, 0, 0, 0, 0); > -- > 2.47.2 > Thanks, drew From ajones at ventanamicro.com Wed Mar 12 09:30:09 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Wed, 12 Mar 2025 17:30:09 +0100 Subject: [kvm-unit-tests PATCH v8 2/6] riscv: Set .aux.o files as .PRECIOUS In-Reply-To: <20250307161549.1873770-3-cleger@rivosinc.com> References: <20250307161549.1873770-1-cleger@rivosinc.com> <20250307161549.1873770-3-cleger@rivosinc.com> Message-ID: <20250312-c9b8ee98ed38271ed7422550@orel> On Fri, Mar 07, 2025 at 05:15:44PM +0100, Clément Léger wrote: > When compiling, we need to keep .aux.o file or they will be removed > after the compilation which leads to dependent files to be recompiled. > Set these files as .PRECIOUS to keep them. > > Signed-off-by: Clément Léger > --- > riscv/Makefile | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/riscv/Makefile b/riscv/Makefile > index 52718f3f..ae9cf02a 100644 > --- a/riscv/Makefile > +++ b/riscv/Makefile > @@ -90,6 +90,7 @@ CFLAGS += -I $(SRCDIR)/lib -I $(SRCDIR)/lib/libfdt -I lib -I $(SRCDIR)/riscv > asm-offsets = lib/riscv/asm-offsets.h > include $(SRCDIR)/scripts/asm-offsets.mak > > +.PRECIOUS: %.aux.o > %.aux.o: $(SRCDIR)/lib/auxinfo.c > $(CC) $(CFLAGS) -c -o $@ $< \ > -DPROGNAME=\"$(notdir $(@:.aux.o=.$(exe)))\" -DAUXFLAGS=$(AUXFLAGS) > -- > 2.47.2 > Reviewed-by: Andrew Jones From ajones at ventanamicro.com Wed Mar 12 10:51:53 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Wed, 12 Mar 2025 18:51:53 +0100 Subject: [kvm-unit-tests PATCH v8 6/6] riscv: sbi: Add SSE extension tests In-Reply-To: <20250307161549.1873770-7-cleger@rivosinc.com> References: <20250307161549.1873770-1-cleger@rivosinc.com> <20250307161549.1873770-7-cleger@rivosinc.com> Message-ID: <20250312-fa9b1889ef5f422be2e20cf4@orel> On Fri, Mar 07, 2025 at 05:15:48PM +0100, Clément Léger wrote: > Add SBI SSE extension tests for the following features: > - Test attributes errors (invalid values, RO, etc) > - Registration
errors > - Simple events (register, enable, inject) > - Events with different priorities > - Global events dispatch on different harts > - Local events on all harts > - Hart mask/unmask events > > Signed-off-by: Cl?ment L?ger > --- > riscv/Makefile | 1 + > riscv/sbi-tests.h | 1 + > riscv/sbi-sse.c | 1215 +++++++++++++++++++++++++++++++++++++++++++++ > riscv/sbi.c | 2 + > 4 files changed, 1219 insertions(+) > create mode 100644 riscv/sbi-sse.c > > diff --git a/riscv/Makefile b/riscv/Makefile > index 16fc125b..4fe2f1bb 100644 > --- a/riscv/Makefile > +++ b/riscv/Makefile > @@ -18,6 +18,7 @@ tests += $(TEST_DIR)/sieve.$(exe) > all: $(tests) > > $(TEST_DIR)/sbi-deps = $(TEST_DIR)/sbi-asm.o $(TEST_DIR)/sbi-fwft.o > +$(TEST_DIR)/sbi-deps += $(TEST_DIR)/sbi-sse.o > > # When built for EFI sieve needs extra memory, run with e.g. '-m 256' on QEMU > $(TEST_DIR)/sieve.$(exe): AUXFLAGS = 0x1 > diff --git a/riscv/sbi-tests.h b/riscv/sbi-tests.h > index b081464d..a71da809 100644 > --- a/riscv/sbi-tests.h > +++ b/riscv/sbi-tests.h > @@ -71,6 +71,7 @@ > sbiret_report(ret, expected_error, expected_value, "check sbi.error and sbi.value") > > void sbi_bad_fid(int ext); > +void check_sse(void); > > #endif /* __ASSEMBLER__ */ > #endif /* _RISCV_SBI_TESTS_H_ */ > diff --git a/riscv/sbi-sse.c b/riscv/sbi-sse.c > new file mode 100644 > index 00000000..7bd58b8b > --- /dev/null > +++ b/riscv/sbi-sse.c > @@ -0,0 +1,1215 @@ > +// SPDX-License-Identifier: GPL-2.0-only > +/* > + * SBI SSE testsuite > + * > + * Copyright (C) 2025, Rivos Inc., Cl?ment L?ger > + */ > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include "sbi-tests.h" > + > +#define SSE_STACK_SIZE PAGE_SIZE > + > +struct sse_event_info { > + uint32_t event_id; > + const char *name; > + bool can_inject; > +}; > + > +static struct sse_event_info sse_event_infos[] = { > + { > + .event_id = 
SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, > + .name = "local_high_prio_ras", > + }, > + { > + .event_id = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP, > + .name = "double_trap", > + }, > + { > + .event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, > + .name = "global_high_prio_ras", > + }, > + { > + .event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, > + .name = "local_pmu_overflow", > + }, > + { > + .event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, > + .name = "local_low_prio_ras", > + }, > + { > + .event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, > + .name = "global_low_prio_ras", > + }, > + { > + .event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, > + .name = "local_software", > + }, > + { > + .event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, > + .name = "global_software", > + }, > +}; > + > +static const char *const attr_names[] = { > + [SBI_SSE_ATTR_STATUS] = "status", > + [SBI_SSE_ATTR_PRIORITY] = "priority", > + [SBI_SSE_ATTR_CONFIG] = "config", > + [SBI_SSE_ATTR_PREFERRED_HART] = "preferred_hart", > + [SBI_SSE_ATTR_ENTRY_PC] = "entry_pc", > + [SBI_SSE_ATTR_ENTRY_ARG] = "entry_arg", > + [SBI_SSE_ATTR_INTERRUPTED_SEPC] = "interrupted_sepc", > + [SBI_SSE_ATTR_INTERRUPTED_FLAGS] = "interrupted_flags", > + [SBI_SSE_ATTR_INTERRUPTED_A6] = "interrupted_a6", > + [SBI_SSE_ATTR_INTERRUPTED_A7] = "interrupted_a7", nit: tabulate > +}; > + > +static const unsigned long ro_attrs[] = { > + SBI_SSE_ATTR_STATUS, > + SBI_SSE_ATTR_ENTRY_PC, > + SBI_SSE_ATTR_ENTRY_ARG, > +}; > + > +static const unsigned long interrupted_attrs[] = { > + SBI_SSE_ATTR_INTERRUPTED_SEPC, > + SBI_SSE_ATTR_INTERRUPTED_FLAGS, > + SBI_SSE_ATTR_INTERRUPTED_A6, > + SBI_SSE_ATTR_INTERRUPTED_A7, > +}; > + > +static const unsigned long interrupted_flags[] = { > + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPP, > + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPIE, > + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPELP, > + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT, > + SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPV, > + SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPVP, > +}; > + > 
+static struct sse_event_info *sse_event_get_info(uint32_t event_id) > +{ > + int i; > + > + for (i = 0; i < ARRAY_SIZE(sse_event_infos); i++) { > + if (sse_event_infos[i].event_id == event_id) > + return &sse_event_infos[i]; > + } > + > + assert_msg(false, "Invalid event id: %d", event_id); > +} > + > +static const char *sse_event_name(uint32_t event_id) > +{ > + return sse_event_get_info(event_id)->name; > +} > + > +static bool sse_event_can_inject(uint32_t event_id) > +{ > + return sse_event_get_info(event_id)->can_inject; > +} > + > +static struct sbiret sse_get_event_status_field(uint32_t event_id, unsigned long mask, > + unsigned long shift, unsigned long *value) > +{ > + struct sbiret ret; > + unsigned long status; > + > + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, 1, &status); > + if (ret.error) { > + sbiret_report_error(&ret, SBI_SUCCESS, "Get event status"); > + return ret; > + } > + > + *value = (status & mask) >> shift; > + > + return ret; > +} > + > +static struct sbiret sse_event_get_state(uint32_t event_id, enum sbi_sse_state *state) > +{ > + unsigned long status = 0; > + struct sbiret ret; > + > + ret = sse_get_event_status_field(event_id, SBI_SSE_ATTR_STATUS_STATE_MASK, > + SBI_SSE_ATTR_STATUS_STATE_OFFSET, &status); > + *state = status; > + > + return ret; > +} > + > +static unsigned long sse_global_event_set_current_hart(uint32_t event_id) > +{ > + struct sbiret ret; > + unsigned long current_hart = current_thread_info()->hartid; > + > + if (!sbi_sse_event_is_global(event_id)) > + return SBI_ERR_INVALID_PARAM; The only two callers of sse_global_event_set_current_hart() check sbi_sse_event_is_global() themselves, so this check can be an assert(). I think it should be anyway, since returning an SBI error, which didn't originate in SBI, could lead to confusion. 
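To illustrate the assert() suggestion above, here is a minimal, untested sketch. The stubs are assumptions made only so the snippet builds on its own: sbi_sse_event_is_global() is reduced to a single bit test (bit 15 is used here purely for illustration), and fake_write_preferred_hart() is a hypothetical stand-in for the sbi_sse_write_attrs() call. The point is only that the helper no longer fabricates an SBI error code for a caller mistake.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stand-in for the real helper; bit 15 as the "global" marker is illustrative. */
static bool sbi_sse_event_is_global(uint32_t event_id)
{
	return !!(event_id & (1u << 15));
}

/* Hypothetical stub replacing the sbi_sse_write_attrs() call; always reports success. */
static long fake_write_preferred_hart(uint32_t event_id, unsigned long hartid)
{
	(void)event_id;
	(void)hartid;
	return 0;
}

/*
 * Callers already check that the event is global, so a local event here is a
 * test bug and trips the assert. Any error returned now comes from SBI only.
 */
static long sse_global_event_set_current_hart(uint32_t event_id, unsigned long hartid)
{
	assert(sbi_sse_event_is_global(event_id));

	return fake_write_preferred_hart(event_id, hartid);
}
```

With this shape, a nonzero return value can always be read as an SBI failure, which should make the report output easier to interpret.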
> + > + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &current_hart); > + if (sbiret_report_error(&ret, SBI_SUCCESS, "Set preferred hart")) > + return ret.error; > + > + return 0; > +} > + > +static bool sse_check_state(uint32_t event_id, unsigned long expected_state) > +{ > + struct sbiret ret; > + enum sbi_sse_state state; > + > + ret = sse_event_get_state(event_id, &state); > + if (ret.error) > + return false; > + > + return report(state == expected_state, "event status == %ld", expected_state); > +} > + > +static bool sse_event_pending(uint32_t event_id) > +{ > + bool pending = 0; > + > + sse_get_event_status_field(event_id, BIT(SBI_SSE_ATTR_STATUS_PENDING_OFFSET), > + SBI_SSE_ATTR_STATUS_PENDING_OFFSET, (unsigned long*)&pending); > + > + return pending; > +} > + > +static void *sse_alloc_stack(void) > +{ > + /* > + * We assume that SSE_STACK_SIZE always fit in one page. This page will > + * always be decremented before storing anything on it in sse-entry.S. > + */ > + assert(SSE_STACK_SIZE <= PAGE_SIZE); > + > + return (alloc_page() + SSE_STACK_SIZE); > +} > + > +static void sse_free_stack(void *stack) > +{ > + free_page(stack - SSE_STACK_SIZE); > +} > + > +static void sse_read_write_test(uint32_t event_id, unsigned long attr, unsigned long attr_count, > + unsigned long *value, long expected_error, const char *str) > +{ > + struct sbiret ret; > + > + ret = sbi_sse_read_attrs(event_id, attr, attr_count, value); > + sbiret_report_error(&ret, expected_error, "Read %s error", str); > + > + ret = sbi_sse_write_attrs(event_id, attr, attr_count, value); > + sbiret_report_error(&ret, expected_error, "Write %s error", str); > +} > + > +#define ALL_ATTRS_COUNT (SBI_SSE_ATTR_INTERRUPTED_A7 + 1) > + > +static void sse_test_attrs(uint32_t event_id) > +{ > + unsigned long value = 0; > + struct sbiret ret; > + void *ptr; > + unsigned long values[ALL_ATTRS_COUNT]; > + unsigned int i; > + const char *invalid_hart_str; > + const char *attr_name; > + > +
report_prefix_push("attrs"); > + > + for (i = 0; i < ARRAY_SIZE(ro_attrs); i++) { > + ret = sbi_sse_write_attrs(event_id, ro_attrs[i], 1, &value); > + sbiret_report_error(&ret, SBI_ERR_DENIED, "RO attribute %s not writable", > + attr_names[ro_attrs[i]]); > + } > + > + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, ALL_ATTRS_COUNT, values); > + sbiret_report_error(&ret, SBI_SUCCESS, "Read multiple attributes"); > + > + for (i = SBI_SSE_ATTR_STATUS; i <= SBI_SSE_ATTR_INTERRUPTED_A7; i++) { > + ret = sbi_sse_read_attrs(event_id, i, 1, &value); > + attr_name = attr_names[i]; > + > + sbiret_report_error(&ret, SBI_SUCCESS, "Read single attribute %s", attr_name); > + if (values[i] != value) > + report_fail("Attribute 0x%x single value read (0x%lx) differs from the one read with multiple attributes (0x%lx)", > + i, value, values[i]); > + /* > + * Preferred hart reset value is defined by SBI vendor > + */ > + if(i != SBI_SSE_ATTR_PREFERRED_HART) { ^ missing space > + /* > + * Specification states that injectable bit is implementation dependent > + * but other bits are zero-initialized. 
> + */ > + if (i == SBI_SSE_ATTR_STATUS) > + value &= ~BIT(SBI_SSE_ATTR_STATUS_INJECT_OFFSET); > + report(value == 0, "Attribute %s reset value is 0, found %lx", attr_name, > + value); nit: no need to wrap the line > + } > + } > + > +#if __riscv_xlen > 32 > + value = BIT(32); > + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); > + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Write invalid prio > 0xFFFFFFFF error"); > +#endif > + > + value = ~SBI_SSE_ATTR_CONFIG_ONESHOT; > + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); > + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Write invalid config value error"); > + > + if (sbi_sse_event_is_global(event_id)) { > + invalid_hart_str = getenv("INVALID_HART_ID"); > + if (!invalid_hart_str) > + value = 0xFFFFFFFFUL; > + else > + value = strtoul(invalid_hart_str, NULL, 0); > + > + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); > + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid hart id error"); > + } else { > + /* Set Hart on local event -> RO */ > + value = current_thread_info()->hartid; > + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); > + sbiret_report_error(&ret, SBI_ERR_DENIED, > + "Set hart id on local event error"); > + } > + > + /* Set/get flags, sepc, a6, a7 */ > + for (i = 0; i < ARRAY_SIZE(interrupted_attrs); i++) { > + attr_name = attr_names[interrupted_attrs[i]]; > + ret = sbi_sse_read_attrs(event_id, interrupted_attrs[i], 1, &value); > + sbiret_report_error(&ret, SBI_SUCCESS, "Get interrupted %s", attr_name); > + > + value = ARRAY_SIZE(interrupted_attrs) - i; > + ret = sbi_sse_write_attrs(event_id, interrupted_attrs[i], 1, &value); > + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, > + "Set attribute %s invalid state error", attr_name); > + } > + > + sse_read_write_test(event_id, SBI_SSE_ATTR_STATUS, 0, &value, SBI_ERR_INVALID_PARAM, > + "attribute attr_count == 0"); > + 
sse_read_write_test(event_id, SBI_SSE_ATTR_INTERRUPTED_A7 + 1, 1, &value, SBI_ERR_BAD_RANGE, > + "invalid attribute"); > + > + /* Misaligned pointer address */ > + ptr = (void *)&value; > + ptr += 1; > + sse_read_write_test(event_id, SBI_SSE_ATTR_STATUS, 1, ptr, SBI_ERR_INVALID_ADDRESS, > + "attribute with invalid address"); > + > + report_prefix_pop(); > +} > + > +static void sse_test_register_error(uint32_t event_id) > +{ > + struct sbiret ret; > + > + report_prefix_push("register"); > + > + ret = sbi_sse_unregister(event_id); > + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "unregister non-registered event"); > + > + ret = sbi_sse_register_raw(event_id, 0x1, 0); > + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "register misaligned entry"); > + > + ret = sbi_sse_register_raw(event_id, (unsigned long)sbi_sse_entry, 0); sbi_sse_register(event_id, sbi_sse_entry, NULL) to avoid a cast. > + sbiret_report_error(&ret, SBI_SUCCESS, "register"); > + if (ret.error) > + goto done; > + > + ret = sbi_sse_register_raw(event_id, (unsigned long)sbi_sse_entry, 0); same > + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "register used event failure"); > + > + ret = sbi_sse_unregister(event_id); > + sbiret_report_error(&ret, SBI_SUCCESS, "unregister"); > + > +done: > + report_prefix_pop(); > +} > + > +struct sse_simple_test_arg { > + bool done; > + unsigned long expected_a6; > + uint32_t event_id; > +}; > + > +#if __riscv_xlen > 32 > + > +struct alias_test_params { > + unsigned long event_id; > + unsigned long attr_id; > + unsigned long attr_count; > + const char *str; > +}; > + > +static void test_alias(uint32_t event_id) > +{ > + struct alias_test_params *write, *read; > + unsigned long write_value, read_value; > + struct sbiret ret; > + bool err = false; > + int r, w; > + struct alias_test_params params[] = { > + {event_id, SBI_SSE_ATTR_INTERRUPTED_A6, 1, "non aliased"}, > + {BIT(32) + event_id, SBI_SSE_ATTR_INTERRUPTED_A6, 1, "aliased event_id"}, > + {event_id, 
BIT(32) + SBI_SSE_ATTR_INTERRUPTED_A6, 1, "aliased attr_id"}, > + {event_id, SBI_SSE_ATTR_INTERRUPTED_A6, BIT(32) + 1, "aliased attr_count"}, > + }; > + > + report_prefix_push("alias"); > + for (w = 0; w < ARRAY_SIZE(params); w++) { > + write = &params[w]; > + > + write_value = 0xDEADBEEF + w; > + ret = sbi_sse_write_attrs(write->event_id, write->attr_id, write->attr_count, &write_value); > + if (ret.error) > + sbiret_report_error(&ret, SBI_SUCCESS, "Write %s, event 0x%lx attr 0x%lx, attr count 0x%lx", write->str, write->event_id, write->attr_id, write->attr_count); I like long lines, but this one is too long even for me. I'd break it after the format string. > + > + for (r = 0; r < ARRAY_SIZE(params); r++) { > + read = &params[r]; > + read_value = 0; > + ret = sbi_sse_read_attrs(read->event_id, read->attr_id, read->attr_count, &read_value); > + if (ret.error) > + sbiret_report_error(&ret, SBI_SUCCESS, > + "Read %s, event 0x%lx attr 0x%lx, attr count 0x%lx", read->str, read->event_id, read->attr_id, read->attr_count); same > + > + /* Do not spam output with a lot of reports */ > + if (write_value != read_value) { > + err = true; > + report_fail("Write %s, event 0x%lx attr 0x%lx, attr count 0x%lx value %lx ==" > + "Read %s, event 0x%lx attr 0x%lx, attr count 0x%lx value %lx", > + write->str, write->event_id, write->attr_id, write->attr_count, > + write_value, > + read->str, read->event_id, read->attr_id, read->attr_count, > + read_value); indentation is off > + } > + } > + } > + > + report(!err, "BIT(32) aliasing tests"); > + report_prefix_pop(); > +} > +#endif > + > +static void sse_simple_handler(void *data, struct pt_regs *regs, unsigned int hartid) > +{ > + struct sse_simple_test_arg *arg = data; > + int i; > + struct sbiret ret; > + const char *attr_name; > + uint32_t event_id = READ_ONCE(arg->event_id), attr; > + unsigned long value, prev_value, flags; > + unsigned long interrupted_state[ARRAY_SIZE(interrupted_attrs)]; > + unsigned long
modified_state[ARRAY_SIZE(interrupted_attrs)] = {4, 3, 2, 1}; > + unsigned long tmp_state[ARRAY_SIZE(interrupted_attrs)]; > + > + report((regs->status & SR_SPP) == SR_SPP, "Interrupted S-mode"); > + report(hartid == current_thread_info()->hartid, "Hartid correctly passed"); > + sse_check_state(event_id, SBI_SSE_STATE_RUNNING); > + report(!sse_event_pending(event_id), "Event not pending"); > + > + /* Read full interrupted state */ > + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, > + ARRAY_SIZE(interrupted_attrs), interrupted_state); > + sbiret_report_error(&ret, SBI_SUCCESS, "Save full interrupted state from handler"); > + > + /* Write full modified state and read it */ > + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, > + ARRAY_SIZE(modified_state), modified_state); > + sbiret_report_error(&ret, SBI_SUCCESS, > + "Write full interrupted state from handler"); > + > + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, > + ARRAY_SIZE(tmp_state), tmp_state); > + sbiret_report_error(&ret, SBI_SUCCESS, "Read full modified state from handler"); > + > + report(memcmp(tmp_state, modified_state, sizeof(modified_state)) == 0, > + "Full interrupted state successfully written"); > + > +#if __riscv_xlen > 32 > + test_alias(event_id); > +#endif > + > + /* Restore full saved state */ > + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, > + ARRAY_SIZE(interrupted_attrs), interrupted_state); indentation is off > + sbiret_report_error(&ret, SBI_SUCCESS, > + "Full interrupted state restore from handler"); no need to wrap > + > + /* We test SBI_SSE_ATTR_INTERRUPTED_FLAGS below with specific flag values */ > + for (i = 0; i < ARRAY_SIZE(interrupted_attrs); i++) { > + attr = interrupted_attrs[i]; > + if (attr == SBI_SSE_ATTR_INTERRUPTED_FLAGS) > + continue; > + > + attr_name = attr_names[attr]; > + > + ret = sbi_sse_read_attrs(event_id, attr, 1, &prev_value); > + sbiret_report_error(&ret, SBI_SUCCESS, "Get attr 
%s", attr_name); > + > + value = 0xDEADBEEF + i; > + ret = sbi_sse_write_attrs(event_id, attr, 1, &value); > + sbiret_report_error(&ret, SBI_SUCCESS, "Set attr %s", attr_name); > + > + ret = sbi_sse_read_attrs(event_id, attr, 1, &value); > + sbiret_report_error(&ret, SBI_SUCCESS, "Get attr %s", attr_name); > + report(value == 0xDEADBEEF + i, "Get attr %s, value: 0x%lx", attr_name, > + value); no need to wrap > + > + ret = sbi_sse_write_attrs(event_id, attr, 1, &prev_value); > + sbiret_report_error(&ret, SBI_SUCCESS, "Restore attr %s value", attr_name); > + } > + > + /* Test all flags allowed for SBI_SSE_ATTR_INTERRUPTED_FLAGS */ > + attr = SBI_SSE_ATTR_INTERRUPTED_FLAGS; > + ret = sbi_sse_read_attrs(event_id, attr, 1, &prev_value); > + sbiret_report_error(&ret, SBI_SUCCESS, "Save interrupted flags"); > + > + for (i = 0; i < ARRAY_SIZE(interrupted_flags); i++) { > + flags = interrupted_flags[i]; > + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); > + sbiret_report_error(&ret, SBI_SUCCESS, > + "Set interrupted flags bit 0x%lx value", flags); > + ret = sbi_sse_read_attrs(event_id, attr, 1, &value); > + sbiret_report_error(&ret, SBI_SUCCESS, "Get interrupted flags after set"); > + report(value == flags, "interrupted flags modified value: 0x%lx", value); > + } > + > + /* Write invalid bit in flag register */ > + flags = SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT << 1; > + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); > + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", > + flags); > + no need to wrap > + flags = BIT(SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT + 1); > + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); > + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", > + flags); > + no need to wrap > + ret = sbi_sse_write_attrs(event_id, attr, 1, &prev_value); > + sbiret_report_error(&ret, SBI_SUCCESS, "Restore interrupted flags"); > + > + /* Try to change 
HARTID/Priority while running */ > + if (sbi_sse_event_is_global(event_id)) { > + value = current_thread_info()->hartid; > + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); > + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "Set hart id while running error"); > + } > + > + value = 0; > + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); > + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "Set priority while running error"); > + > + value = READ_ONCE(arg->expected_a6); > + report(interrupted_state[2] == value, "Interrupted state a6, expected 0x%lx, got 0x%lx", > + value, interrupted_state[2]); > + > + report(interrupted_state[3] == SBI_EXT_SSE, > + "Interrupted state a7, expected 0x%x, got 0x%lx", SBI_EXT_SSE, > + interrupted_state[3]); > + > + WRITE_ONCE(arg->done, true); > +} > + > +static int sse_test_inject_simple(uint32_t event_id) > +{ > + unsigned long value, error; > + struct sbiret ret; > + int err_ret = 1; > + struct sse_simple_test_arg test_arg = {.event_id = event_id}; > + struct sbi_sse_handler_arg args = { > + .handler = sse_simple_handler, > + .handler_data = (void *)&test_arg, > + .stack = sse_alloc_stack(), > + }; > + > + report_prefix_push("simple"); > + > + if (!sse_check_state(event_id, SBI_SSE_STATE_UNUSED)) > + goto err; > + > + ret = sbi_sse_register(event_id, &args); > + if (!sbiret_report_error(&ret, SBI_SUCCESS, "register")) > + goto err; > + > + if (!sse_check_state(event_id, SBI_SSE_STATE_REGISTERED)) > + goto err; > + > + if (sbi_sse_event_is_global(event_id)) { > + /* Be sure global events are targeting the current hart */ > + error = sse_global_event_set_current_hart(event_id); > + if (error) > + goto err; > + } > + > + ret = sbi_sse_enable(event_id); > + if (!sbiret_report_error(&ret, SBI_SUCCESS, "enable")) > + goto err; > + > + if (!sse_check_state(event_id, SBI_SSE_STATE_ENABLED)) > + goto err; > + > + ret = sbi_sse_hart_mask(); > + if (!sbiret_report_error(&ret, SBI_SUCCESS, 
"hart mask")) > + goto err; > + > + ret = sbi_sse_inject(event_id, current_thread_info()->hartid); > + if (!sbiret_report_error(&ret, SBI_SUCCESS, "injection masked")) { > + sbi_sse_hart_unmask(); > + goto err; > + } > + > + report(READ_ONCE(test_arg.done) == 0, "event masked not handled"); > + > + /* > + * When unmasking the SSE events, we expect it to be injected > + * immediately so a6 should be SBI_EXT_SBI_SSE_HART_UNMASK > + */ > + WRITE_ONCE(test_arg.expected_a6, SBI_EXT_SSE_HART_UNMASK); > + ret = sbi_sse_hart_unmask(); > + if (!sbiret_report_error(&ret, SBI_SUCCESS, "hart unmask")) { > + goto err; > + } > + > + report(READ_ONCE(test_arg.done) == 1, "event unmasked handled"); > + WRITE_ONCE(test_arg.done, 0); > + WRITE_ONCE(test_arg.expected_a6, SBI_EXT_SSE_INJECT); > + > + /* Set as oneshot and verify it is disabled */ > + ret = sbi_sse_disable(event_id); > + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Disable event")) { > + /* Nothing we can really do here, event can not be disabled */ > + goto err; > + } > + > + value = SBI_SSE_ATTR_CONFIG_ONESHOT; > + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); > + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Set event attribute as ONESHOT")) > + goto err; > + > + ret = sbi_sse_enable(event_id); > + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Enable event")) > + goto err; > + > + ret = sbi_sse_inject(event_id, current_thread_info()->hartid); > + if (!sbiret_report_error(&ret, SBI_SUCCESS, "second injection")) > + goto err; > + > + report(READ_ONCE(test_arg.done) == 1, "event handled"); > + WRITE_ONCE(test_arg.done, 0); > + > + if (!sse_check_state(event_id, SBI_SSE_STATE_REGISTERED)) > + goto err; > + > + /* Clear ONESHOT FLAG */ > + value = 0; > + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); > + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Clear CONFIG.ONESHOT flag")) > + goto err; > + > + ret = sbi_sse_unregister(event_id); > + if (!sbiret_report_error(&ret, 
SBI_SUCCESS, "unregister")) > + goto err; > + > + sse_check_state(event_id, SBI_SSE_STATE_UNUSED); > + > + err_ret = 0; > +err: > + sse_free_stack(args.stack); > + report_prefix_pop(); > + > + return err_ret; > +} > + > +struct sse_foreign_cpu_test_arg { > + bool done; > + unsigned int expected_cpu; > + uint32_t event_id; > +}; > + > +static void sse_foreign_cpu_handler(void *data, struct pt_regs *regs, unsigned int hartid) > +{ > + struct sse_foreign_cpu_test_arg *arg = data; > + unsigned int expected_cpu; > + > + /* For arg content to be visible */ > + smp_rmb(); > + expected_cpu = READ_ONCE(arg->expected_cpu); > + report(expected_cpu == current_thread_info()->cpu, > + "Received event on CPU (%d), expected CPU (%d)", current_thread_info()->cpu, > + expected_cpu); > + > + WRITE_ONCE(arg->done, true); > + /* For arg update to be visible for other CPUs */ > + smp_wmb(); > +} > + > +struct sse_local_per_cpu { > + struct sbi_sse_handler_arg args; > + struct sbiret ret; > + struct sse_foreign_cpu_test_arg handler_arg; > +}; > + > +static void sse_register_enable_local(void *data) > +{ > + struct sbiret ret; > + struct sse_local_per_cpu *cpu_args = data; > + struct sse_local_per_cpu *cpu_arg = &cpu_args[current_thread_info()->cpu]; > + uint32_t event_id = cpu_arg->handler_arg.event_id; > + > + ret = sbi_sse_register(event_id, &cpu_arg->args); > + WRITE_ONCE(cpu_arg->ret, ret); > + if (ret.error) > + return; > + > + ret = sbi_sse_enable(event_id); > + WRITE_ONCE(cpu_arg->ret, ret); > +} > + > +static void sbi_sse_disable_unregister_local(void *data) > +{ > + struct sbiret ret; > + struct sse_local_per_cpu *cpu_args = data; > + struct sse_local_per_cpu *cpu_arg = &cpu_args[current_thread_info()->cpu]; > + uint32_t event_id = cpu_arg->handler_arg.event_id; > + > + ret = sbi_sse_disable(event_id); > + WRITE_ONCE(cpu_arg->ret, ret); > + if (ret.error) > + return; > + > + ret = sbi_sse_unregister(event_id); > + WRITE_ONCE(cpu_arg->ret, ret); > +} > + > +static int 
sse_test_inject_local(uint32_t event_id) > +{ > + int cpu; > + int err_ret = 1; > + struct sbiret ret; > + struct sse_local_per_cpu *cpu_args, *cpu_arg; > + struct sse_foreign_cpu_test_arg *handler_arg; > + > + cpu_args = calloc(NR_CPUS, sizeof(struct sbi_sse_handler_arg)); > + > + report_prefix_push("local_dispatch"); > + for_each_online_cpu(cpu) { > + cpu_arg = &cpu_args[cpu]; > + cpu_arg->handler_arg.event_id = event_id; > + cpu_arg->args.stack = sse_alloc_stack(); > + cpu_arg->args.handler = sse_foreign_cpu_handler; > + cpu_arg->args.handler_data = (void *)&cpu_arg->handler_arg; > + } > + > + on_cpus(sse_register_enable_local, cpu_args); > + for_each_online_cpu(cpu) { > + cpu_arg = &cpu_args[cpu]; > + ret = cpu_arg->ret; > + if (ret.error) { > + report_fail("CPU failed to register/enable event: %ld", ret.error); > + goto err; > + } > + > + handler_arg = &cpu_arg->handler_arg; > + WRITE_ONCE(handler_arg->expected_cpu, cpu); > + /* For handler_arg content to be visible for other CPUs */ > + smp_wmb(); > + ret = sbi_sse_inject(event_id, cpus[cpu].hartid); > + if (ret.error) { > + report_fail("CPU failed to inject event: %ld", ret.error); > + goto err; > + } > + } > + > + for_each_online_cpu(cpu) { > + handler_arg = &cpu_args[cpu].handler_arg; > + smp_rmb(); > + while (!READ_ONCE(handler_arg->done)) { > + /* For handler_arg update to be visible */ > + smp_rmb(); > + cpu_relax(); > + } > + WRITE_ONCE(handler_arg->done, false); > + } > + > + on_cpus(sbi_sse_disable_unregister_local, cpu_args); > + for_each_online_cpu(cpu) { > + cpu_arg = &cpu_args[cpu]; > + ret = READ_ONCE(cpu_arg->ret); > + if (ret.error) { > + report_fail("CPU failed to disable/unregister event: %ld", ret.error); > + goto err; > + } > + } > + > + err_ret = 0; > +err: > + for_each_online_cpu(cpu) { > + cpu_arg = &cpu_args[cpu]; > + sse_free_stack(cpu_arg->args.stack); > + } > + > + report_pass("local event dispatch on all CPUs"); > + report_prefix_pop(); > + > + return err_ret; > +} > + > +static 
int sse_test_inject_global(uint32_t event_id) > +{ > + unsigned long value; > + struct sbiret ret; > + unsigned int cpu; > + uint64_t timeout; > + int err_ret = 1; > + struct sse_foreign_cpu_test_arg test_arg = {.event_id = event_id}; > + struct sbi_sse_handler_arg args = { > + .handler = sse_foreign_cpu_handler, > + .handler_data = (void *)&test_arg, > + .stack = sse_alloc_stack(), > + }; > + enum sbi_sse_state state; > + > + report_prefix_push("global_dispatch"); > + > + ret = sbi_sse_register(event_id, &args); > + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Register event")) > + goto err; > + > + for_each_online_cpu(cpu) { > + WRITE_ONCE(test_arg.expected_cpu, cpu); > + /* For test_arg content to be visible for other CPUs */ > + smp_wmb(); > + value = cpu; > + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); > + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Set preferred hart")) > + goto err; > + > + ret = sbi_sse_enable(event_id); > + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Enable event")) > + goto err; > + > + ret = sbi_sse_inject(event_id, cpu); > + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Inject event")) > + goto err; > + > + smp_rmb(); > + while (!READ_ONCE(test_arg.done)) { > + /* For shared test_arg structure */ > + smp_rmb(); > + cpu_relax(); > + } > + > + WRITE_ONCE(test_arg.done, false); > + > + timeout = timer_get_cycles() + usec_to_cycles((uint64_t)1000); Should probably add an environment variable for this timeout and then use 1000 as the default when the variable isn't set. 
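As a rough sketch of the environment-variable idea above, following the same getenv()/strtoul() pattern the patch already uses for INVALID_HART_ID. The variable name SSE_EVENT_TIMEOUT_US and both helper names are made up for illustration and are not part of the patch; only the 1000 default comes from the code under review.

```c
#include <stdint.h>
#include <stdlib.h>

/* Default taken from the hard-coded value in the patch (microseconds). */
#define SSE_DEFAULT_TIMEOUT_US 1000

/* Hypothetical helper: parse a timeout string, falling back to the default when unset. */
static uint64_t sse_parse_timeout_us(const char *str)
{
	if (!str)
		return SSE_DEFAULT_TIMEOUT_US;

	/* Base 0 accepts decimal, hex (0x...) and octal, like the INVALID_HART_ID parsing. */
	return (uint64_t)strtoul(str, NULL, 0);
}

/* Assumption: "SSE_EVENT_TIMEOUT_US" is an illustrative name, not an existing variable. */
static uint64_t sse_event_timeout_us(void)
{
	return sse_parse_timeout_us(getenv("SSE_EVENT_TIMEOUT_US"));
}
```

The call site would then read something like timeout = timer_get_cycles() + usec_to_cycles(sse_event_timeout_us()); so slow platforms can raise the limit without touching the test.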
> + /* Wait for event to be back in ENABLED state */ > + do { > + ret = sse_event_get_state(event_id, &state); > + if (ret.error) > + goto err; > + cpu_relax(); > + } while (state != SBI_SSE_STATE_ENABLED || timer_get_cycles() < timeout); ^ && > + > + if (!report(state == SBI_SSE_STATE_ENABLED, > + "wait for event to be in enable state")) nit: no need to wrap this line, but if we do, then "wait should be under state > + goto err; > + > + ret = sbi_sse_disable(event_id); > + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Disable event")) > + goto err; > + > + report_pass("Global event on CPU %d", cpu); > + } > + > + ret = sbi_sse_unregister(event_id); > + sbiret_report_error(&ret, SBI_SUCCESS, "Unregister event"); > + > + err_ret = 0; > + > +err: > + sse_free_stack(args.stack); > + report_prefix_pop(); > + > + return err_ret; > +} > + > +struct priority_test_arg { > + uint32_t event_id; > + bool called; > + u32 prio; > + struct priority_test_arg *next_event_arg; > + void (*check_func)(struct priority_test_arg *arg); > +}; > + > +static void sse_hi_priority_test_handler(void *arg, struct pt_regs *regs, > + unsigned int hartid) > +{ > + struct priority_test_arg *targ = arg; > + struct priority_test_arg *next = targ->next_event_arg; > + > + targ->called = true; > + if (next) { > + sbi_sse_inject(next->event_id, current_thread_info()->hartid); > + > + report(!sse_event_pending(next->event_id), "Higher priority event is not pending"); > + report(next->called, "Higher priority event was handled"); > + } > +} > + > +static void sse_low_priority_test_handler(void *arg, struct pt_regs *regs, > + unsigned int hartid) > +{ > + struct priority_test_arg *targ = arg; > + struct priority_test_arg *next = targ->next_event_arg; > + > + targ->called = true; > + > + if (next) { > + sbi_sse_inject(next->event_id, current_thread_info()->hartid); > + > + report(sse_event_pending(next->event_id), "Lower priority event is pending"); > + report(!next->called, "Lower priority event %s was not 
handle before %s", handled > + sse_event_name(next->event_id), sse_event_name(targ->event_id)); > + } > +} > + > +static void sse_test_injection_priority_arg(struct priority_test_arg *in_args, > + unsigned int in_args_size, > + sbi_sse_handler_fn handler, > + const char *test_name) > +{ > + unsigned int i; > + unsigned long value, uret; > + struct sbiret ret; > + uint32_t event_id; > + struct priority_test_arg *arg; > + unsigned int args_size = 0; > + struct sbi_sse_handler_arg event_args[in_args_size]; > + struct priority_test_arg *args[in_args_size]; > + void *stack; > + struct sbi_sse_handler_arg *event_arg; > + > + report_prefix_push(test_name); > + > + for (i = 0; i < in_args_size; i++) { > + arg = &in_args[i]; > + event_id = arg->event_id; > + if (!sse_event_can_inject(event_id)) > + continue; > + > + args[args_size] = arg; > + args_size++; > + event_args->stack = 0; > + } > + > + if (!args_size) { > + report_skip("No injectable events"); > + goto skip; > + } > + > + for (i = 0; i < args_size; i++) { > + arg = args[i]; > + event_id = arg->event_id; > + stack = sse_alloc_stack(); > + > + event_arg = &event_args[i]; > + event_arg->handler = handler; > + event_arg->handler_data = (void *)arg; > + event_arg->stack = stack; > + > + if (i < (args_size - 1)) > + arg->next_event_arg = args[i + 1]; > + else > + arg->next_event_arg = NULL; > + > + /* Be sure global events are targeting the current hart */ > + if (sbi_sse_event_is_global(event_id)) { > + uret = sse_global_event_set_current_hart(event_id); > + if (uret) > + goto err; If we goto err from here or the next goto, then we'll also get disable and unregister failures reported. 
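
To illustrate the concern above, here is a small stand-alone sketch of staged unwinding, where each failure jumps to a label that undoes only the steps that already succeeded. The fake_* helpers and struct fake_event are hypothetical stand-ins for the SSE calls, not kvm-unit-tests APIs:

```c
#include <stdbool.h>

/* Hypothetical stand-in for one event; counts how often each undo step runs. */
struct fake_event {
	bool registered;
	bool enabled;
	int disable_calls;
	int unregister_calls;
};

static int fake_register(struct fake_event *ev, bool fail)
{
	if (fail)
		return -1;
	ev->registered = true;
	return 0;
}

static int fake_enable(struct fake_event *ev, bool fail)
{
	if (fail)
		return -1;
	ev->enabled = true;
	return 0;
}

static void fake_disable(struct fake_event *ev)
{
	ev->enabled = false;
	ev->disable_calls++;
}

static void fake_unregister(struct fake_event *ev)
{
	ev->registered = false;
	ev->unregister_calls++;
}

/*
 * A failed register skips all cleanup; a failed enable only unregisters.
 * No undo call is issued for a step that never succeeded.
 */
static int setup_run_teardown(struct fake_event *ev, bool fail_register,
			      bool fail_enable)
{
	int ret;

	ret = fake_register(ev, fail_register);
	if (ret)
		goto out;

	ret = fake_enable(ev, fail_enable);
	if (ret)
		goto out_unregister;

	/* ... the event would be injected and checked here ... */

	fake_disable(ev);
out_unregister:
	fake_unregister(ev);
out:
	return ret;
}
```

With this shape a failure before registration reports nothing about disable or unregister, which avoids the spurious failure reports.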
> + } > + > + ret = sbi_sse_register(event_id, event_arg); > + if (ret.error) { > + sbiret_report_error(&ret, SBI_SUCCESS, "register event 0x%x", event_id); > + goto err; > + } > + > + value = arg->prio; > + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); > + if (ret.error) { > + sbiret_report_error(&ret, SBI_SUCCESS, "set event 0x%x priority", event_id); > + goto err; from here we'll get disable failures reported. > + } > + ret = sbi_sse_enable(event_id); > + if (ret.error) { > + sbiret_report_error(&ret, SBI_SUCCESS, "enable event 0x%x", event_id); > + goto err; > + } > + } It looks like we need a complete if (!a) goto a_out; if (!b) goto b_out; /* ret = success */ b_out: /* undo b */ a_out: /* undo a */ return ret; type of model. Or just manage it directly. if (!a) ... if (!b) { /* undo a */ ... } > + > + /* Inject first event */ > + ret = sbi_sse_inject(args[0]->event_id, current_thread_info()->hartid); > + sbiret_report_error(&ret, SBI_SUCCESS, "injection"); > + > +err: > + for (i = 0; i < args_size; i++) { > + arg = args[i]; > + event_id = arg->event_id; > + > + report(arg->called, "Event %s handler called", sse_event_name(arg->event_id)); > + > + ret = sbi_sse_disable(event_id); > + if (ret.error) > + sbiret_report_error(&ret, SBI_SUCCESS, "disable event 0x%x", event_id); > + > + sbi_sse_unregister(event_id); > + if (ret.error) > + sbiret_report_error(&ret, SBI_SUCCESS, "unregister event 0x%x", event_id); > + > + event_arg = &event_args[i]; > + if (event_arg->stack) > + sse_free_stack(event_arg->stack); > + } > + > +skip: > + report_prefix_pop(); > +} > + > +static struct priority_test_arg hi_prio_args[] = { > + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE}, > + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE}, > + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS}, > + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS}, > + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW}, > + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS}, > + {.event_id = 
SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP}, > + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS}, > +}; > + > +static struct priority_test_arg low_prio_args[] = { > + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS}, > + {.event_id = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP}, > + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS}, > + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW}, > + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS}, > + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS}, > + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE}, > + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE}, > +}; > + > +static struct priority_test_arg prio_args[] = { > + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, .prio = 5}, > + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, .prio = 10}, > + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, .prio = 12}, > + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, .prio = 15}, > + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, .prio = 20}, > + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, .prio = 22}, > + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, .prio = 25}, > +}; > + > +static struct priority_test_arg same_prio_args[] = { > + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, .prio = 0}, > + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, .prio = 0}, > + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, .prio = 10}, > + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, .prio = 10}, > + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, .prio = 10}, > + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, .prio = 20}, > + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, .prio = 20}, > +}; > + > +static void sse_test_injection_priority(void) > +{ > + report_prefix_push("prio"); > + > + sse_test_injection_priority_arg(hi_prio_args, ARRAY_SIZE(hi_prio_args), > + sse_hi_priority_test_handler, "high"); > + > + sse_test_injection_priority_arg(low_prio_args, ARRAY_SIZE(low_prio_args), > + sse_low_priority_test_handler, "low"); > + > + sse_test_injection_priority_arg(prio_args, ARRAY_SIZE(prio_args), > + 
sse_low_priority_test_handler, "changed"); > + > + sse_test_injection_priority_arg(same_prio_args, ARRAY_SIZE(same_prio_args), > + sse_low_priority_test_handler, "same_prio_args"); > + > + report_prefix_pop(); > +} > + > +static void test_invalid_event_id(unsigned long event_id) > +{ > + struct sbiret ret; > + unsigned long value = 0; > + > + ret = sbi_sse_register_raw(event_id, (unsigned long) sbi_sse_entry, 0); > + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, > + "register event_id 0x%lx", event_id); > + > + ret = sbi_sse_unregister(event_id); > + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, > + "unregister event_id 0x%lx", event_id); > + > + ret = sbi_sse_enable(event_id); > + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, > + "enable event_id 0x%lx", event_id); > + > + ret = sbi_sse_disable(event_id); > + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, > + "disable event_id 0x%lx", event_id); > + > + ret = sbi_sse_inject(event_id, 0); > + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, > + "inject event_id 0x%lx", event_id); > + > + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); > + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, > + "write attr event_id 0x%lx", event_id); > + > + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); > + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, > + "read attr event_id 0x%lx", event_id); > +} > + > +static void sse_test_invalid_event_id(void) > +{ > + > + report_prefix_push("event_id"); > + > + test_invalid_event_id(SBI_SSE_EVENT_LOCAL_RESERVED_0_START); > + > + report_prefix_pop(); > +} > + > +static void sse_check_event_availability(uint32_t event_id, bool *can_inject, bool *supported) > +{ > + unsigned long status; > + struct sbiret ret; > + > + *can_inject = false; > + *supported = false; > + > + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, 1, &status); > + if (ret.error != SBI_SUCCESS && ret.error != SBI_ERR_NOT_SUPPORTED) { > + report_fail("Get event 
status != SBI_SUCCESS && != SBI_ERR_NOT_SUPPORTED: %ld", ret.error); > + return; > + } > + if (ret.error == SBI_ERR_NOT_SUPPORTED) > + return; > + > + if (!ret.error) > + *supported = true; > + > + *can_inject = (status >> SBI_SSE_ATTR_STATUS_INJECT_OFFSET) & 1; this should also be under the 'if (!ret.error)' > +} > + > +static void sse_secondary_boot_and_unmask(void *data) > +{ > + sbi_sse_hart_unmask(); > +} > + > +static void sse_check_mask(void) > +{ > + struct sbiret ret; > + > + /* Upon boot, event are masked, check that */ > + ret = sbi_sse_hart_mask(); > + sbiret_report_error(&ret, SBI_ERR_ALREADY_STOPPED, "hart mask at boot time"); > + > + ret = sbi_sse_hart_unmask(); > + sbiret_report_error(&ret, SBI_SUCCESS, "hart unmask"); > + ret = sbi_sse_hart_unmask(); > + sbiret_report_error(&ret, SBI_ERR_ALREADY_STARTED, "hart unmask twice error"); > + > + ret = sbi_sse_hart_mask(); > + sbiret_report_error(&ret, SBI_SUCCESS, "hart mask"); > + ret = sbi_sse_hart_mask(); > + sbiret_report_error(&ret, SBI_ERR_ALREADY_STOPPED, "hart mask twice"); > +} > + > +static int run_inject_test(struct sse_event_info *info) > +{ > + unsigned long event_id = info->event_id; > + > + if (!info->can_inject) { > + report_skip("Event does not support injection, skipping injection tests"); > + return 0; > + } > + > + if (sse_test_inject_simple(event_id)) > + return 1; > + > + if (sbi_sse_event_is_global(event_id)) > + return sse_test_inject_global(event_id); > + else > + return sse_test_inject_local(event_id); > +} > + > +void check_sse(void) > +{ > + struct sse_event_info *info; > + unsigned long i, event_id; > + bool supported; > + > + report_prefix_push("sse"); > + > + if (!sbi_probe(SBI_EXT_SSE)) { > + report_skip("extension not available"); > + report_prefix_pop(); > + return; > + } > + > + sse_check_mask(); > + > + /* > + * Dummy wakeup of all processors since some of them will be targeted > + * by global events without going through the wakeup call as well as > + * unmasking SSE 
events on all harts > + */ > + on_cpus(sse_secondary_boot_and_unmask, NULL); > + > + sse_test_invalid_event_id(); > + > + for (i = 0; i < ARRAY_SIZE(sse_event_infos); i++) { > + info = &sse_event_infos[i]; > + event_id = info->event_id; > + report_prefix_push(info->name); > + sse_check_event_availability(event_id, &info->can_inject, &supported); > + if (!supported) { > + report_skip("Event is not supported, skipping tests"); > + report_prefix_pop(); > + continue; > + } > + > + sse_test_attrs(event_id); > + sse_test_register_error(event_id); > + > + if (run_inject_test(info)) { > + report_skip("Event test failed, event state unreliable"); > + info->can_inject = false; > + } > + > + report_prefix_pop(); > + } > + > + sse_test_injection_priority(); > + > + report_prefix_pop(); > +} > diff --git a/riscv/sbi.c b/riscv/sbi.c > index 0404bb81..478cb35d 100644 > --- a/riscv/sbi.c > +++ b/riscv/sbi.c > @@ -32,6 +32,7 @@ > > #define HIGH_ADDR_BOUNDARY ((phys_addr_t)1 << 32) > > +void check_sse(void); > void check_fwft(void); > > static long __labs(long a) > @@ -1567,6 +1568,7 @@ int main(int argc, char **argv) > check_hsm(); > check_dbcn(); > check_susp(); > + check_sse(); > check_fwft(); > > return report_summary(); > -- > 2.47.2 > Thanks, drew From ajones at ventanamicro.com Wed Mar 12 10:53:28 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Wed, 12 Mar 2025 18:53:28 +0100 Subject: [kvm-unit-tests PATCH v8 2/6] riscv: Set .aux.o files as .PRECIOUS In-Reply-To: <20250312-c9b8ee98ed38271ed7422550@orel> References: <20250307161549.1873770-1-cleger@rivosinc.com> <20250307161549.1873770-3-cleger@rivosinc.com> <20250312-c9b8ee98ed38271ed7422550@orel> Message-ID: <20250312-70978d5a25e382c2b36935f4@orel> On Wed, Mar 12, 2025 at 05:30:09PM +0100, Andrew Jones wrote: > On Fri, Mar 07, 2025 at 05:15:44PM +0100, Cl?ment L?ger wrote: > > When compiling, we need to keep .aux.o file or they will be removed > > after the compilation which leads to dependent files to be 
recompiled.
> > Set these files as .PRECIOUS to keep them.
> >
> > Signed-off-by: Clément Léger
> > ---
> >  riscv/Makefile | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/riscv/Makefile b/riscv/Makefile
> > index 52718f3f..ae9cf02a 100644
> > --- a/riscv/Makefile
> > +++ b/riscv/Makefile
> > @@ -90,6 +90,7 @@ CFLAGS += -I $(SRCDIR)/lib -I $(SRCDIR)/lib/libfdt -I lib -I $(SRCDIR)/riscv
> >  asm-offsets = lib/riscv/asm-offsets.h
> >  include $(SRCDIR)/scripts/asm-offsets.mak
> >
> > +.PRECIOUS: %.aux.o
> >  %.aux.o: $(SRCDIR)/lib/auxinfo.c
> >  	$(CC) $(CFLAGS) -c -o $@ $< \
> >  		-DPROGNAME=\"$(notdir $(@:.aux.o=.$(exe)))\" -DAUXFLAGS=$(AUXFLAGS)
> > --
> > 2.47.2
> >
>
> Reviewed-by: Andrew Jones

I meant to give this a

Reviewed-by: Andrew Jones

but I'm using the wrong mutt config right now...

drew

From thuth at redhat.com Thu Mar 13 00:50:14 2025
From: thuth at redhat.com (Thomas Huth)
Date: Thu, 13 Mar 2025 08:50:14 +0100
Subject: [kvm-unit-tests PATCH v2] Makefile: Use CFLAGS in cc-option
In-Reply-To: <20250307091828.57933-2-andrew.jones@linux.dev>
References: <20250307091828.57933-2-andrew.jones@linux.dev>
Message-ID: <0019ac7e-b190-4b1d-9c8d-f5f039cd53ba@redhat.com>

On 07/03/2025 10.18, Andrew Jones wrote:
> When cross compiling with clang we need to specify the target in
> CFLAGS and cc-option will fail to recognize target-specific options
> without it. Add CFLAGS to the CC invocation in cc-option.
>
> The introduction of the realmode_bits variable is necessary to
> avoid make failing to build x86 due to CFLAGS referencing itself.
> > Signed-off-by: Andrew Jones > --- > v2: > - Fixed x86 builds with the realmode_bits variable > > Makefile | 2 +- > x86/Makefile.common | 3 ++- > 2 files changed, 3 insertions(+), 2 deletions(-) > > diff --git a/Makefile b/Makefile > index 78352fced9d4..9dc5d2234e2a 100644 > --- a/Makefile > +++ b/Makefile > @@ -21,7 +21,7 @@ DESTDIR := $(PREFIX)/share/kvm-unit-tests/ > > # cc-option > # Usage: OP_CFLAGS+=$(call cc-option, -falign-functions=0, -malign-functions=0) > -cc-option = $(shell if $(CC) -Werror $(1) -S -o /dev/null -xc /dev/null \ > +cc-option = $(shell if $(CC) $(CFLAGS) -Werror $(1) -S -o /dev/null -xc /dev/null \ > > /dev/null 2>&1; then echo "$(1)"; else echo "$(2)"; fi ;) > > libcflat := lib/libcflat.a > diff --git a/x86/Makefile.common b/x86/Makefile.common > index 0b7f35c8de85..e97464912e28 100644 > --- a/x86/Makefile.common > +++ b/x86/Makefile.common > @@ -98,6 +98,7 @@ tests-common = $(TEST_DIR)/vmexit.$(exe) $(TEST_DIR)/tsc.$(exe) \ > ifneq ($(CONFIG_EFI),y) > tests-common += $(TEST_DIR)/realmode.$(exe) \ > $(TEST_DIR)/la57.$(exe) > +realmode_bits := $(if $(call cc-option,-m16,""),16,32) > endif > > test_cases: $(tests-common) $(tests) > @@ -108,7 +109,7 @@ $(TEST_DIR)/realmode.elf: $(TEST_DIR)/realmode.o > $(LD) -m elf_i386 -nostdlib -o $@ \ > -T $(SRCDIR)/$(TEST_DIR)/realmode.lds $^ > > -$(TEST_DIR)/realmode.o: bits = $(if $(call cc-option,-m16,""),16,32) > +$(TEST_DIR)/realmode.o: bits = $(realmode_bits) > > $(TEST_DIR)/access_test.$(bin): $(TEST_DIR)/access.o > Reviewed-by: Thomas Huth From alexandru.elisei at arm.com Thu Mar 13 03:11:03 2025 From: alexandru.elisei at arm.com (Alexandru Elisei) Date: Thu, 13 Mar 2025 10:11:03 +0000 Subject: [kvm-unit-tests PATCH v2] Makefile: Use CFLAGS in cc-option In-Reply-To: <20250307091828.57933-2-andrew.jones@linux.dev> References: <20250307091828.57933-2-andrew.jones@linux.dev> Message-ID: Hi Drew, Thank you for debugging this. 
I tested the patch by compiling the MTE test from Vladimir with clang and it works now: Tested-by: Alexandru Elisei Thanks, Alex On Fri, Mar 07, 2025 at 10:18:29AM +0100, Andrew Jones wrote: > When cross compiling with clang we need to specify the target in > CFLAGS and cc-option will fail to recognize target-specific options > without it. Add CFLAGS to the CC invocation in cc-option. > > The introduction of the realmode_bits variable is necessary to > avoid make failing to build x86 due to CFLAGS referencing itself. > > Signed-off-by: Andrew Jones > --- > v2: > - Fixed x86 builds with the realmode_bits variable > > Makefile | 2 +- > x86/Makefile.common | 3 ++- > 2 files changed, 3 insertions(+), 2 deletions(-) > > diff --git a/Makefile b/Makefile > index 78352fced9d4..9dc5d2234e2a 100644 > --- a/Makefile > +++ b/Makefile > @@ -21,7 +21,7 @@ DESTDIR := $(PREFIX)/share/kvm-unit-tests/ > > # cc-option > # Usage: OP_CFLAGS+=$(call cc-option, -falign-functions=0, -malign-functions=0) > -cc-option = $(shell if $(CC) -Werror $(1) -S -o /dev/null -xc /dev/null \ > +cc-option = $(shell if $(CC) $(CFLAGS) -Werror $(1) -S -o /dev/null -xc /dev/null \ > > /dev/null 2>&1; then echo "$(1)"; else echo "$(2)"; fi ;) > > libcflat := lib/libcflat.a > diff --git a/x86/Makefile.common b/x86/Makefile.common > index 0b7f35c8de85..e97464912e28 100644 > --- a/x86/Makefile.common > +++ b/x86/Makefile.common > @@ -98,6 +98,7 @@ tests-common = $(TEST_DIR)/vmexit.$(exe) $(TEST_DIR)/tsc.$(exe) \ > ifneq ($(CONFIG_EFI),y) > tests-common += $(TEST_DIR)/realmode.$(exe) \ > $(TEST_DIR)/la57.$(exe) > +realmode_bits := $(if $(call cc-option,-m16,""),16,32) > endif > > test_cases: $(tests-common) $(tests) > @@ -108,7 +109,7 @@ $(TEST_DIR)/realmode.elf: $(TEST_DIR)/realmode.o > $(LD) -m elf_i386 -nostdlib -o $@ \ > -T $(SRCDIR)/$(TEST_DIR)/realmode.lds $^ > > -$(TEST_DIR)/realmode.o: bits = $(if $(call cc-option,-m16,""),16,32) > +$(TEST_DIR)/realmode.o: bits = $(realmode_bits) > > 
$(TEST_DIR)/access_test.$(bin): $(TEST_DIR)/access.o
>
> --
> 2.48.1
>

From ajones at ventanamicro.com Thu Mar 13 05:24:43 2025
From: ajones at ventanamicro.com (Andrew Jones)
Date: Thu, 13 Mar 2025 13:24:43 +0100
Subject: [PATCH v3 01/17] riscv: add Firmware Feature (FWFT) SBI extensions definitions
In-Reply-To: <20250310151229.2365992-2-cleger@rivosinc.com>
References: <20250310151229.2365992-1-cleger@rivosinc.com> <20250310151229.2365992-2-cleger@rivosinc.com>
Message-ID: <20250313-924c6711597160f50c4cf90e@orel>

On Mon, Mar 10, 2025 at 04:12:08PM +0100, Clément Léger wrote:
> The Firmware Features extension (FWFT) was added as part of the SBI 3.0
> specification. Add SBI definitions to use this extension.
>
> Signed-off-by: Clément Léger
> Reviewed-by: Samuel Holland
> Tested-by: Samuel Holland
> Reviewed-by: Deepak Gupta
> ---
>  arch/riscv/include/asm/sbi.h | 33 +++++++++++++++++++++++++++++++++
>  1 file changed, 33 insertions(+)
>

Reviewed-by: Andrew Jones

From ajones at ventanamicro.com Thu Mar 13 05:39:14 2025
From: ajones at ventanamicro.com (Andrew Jones)
Date: Thu, 13 Mar 2025 13:39:14 +0100
Subject: [PATCH v3 02/17] riscv: sbi: add FWFT extension interface
In-Reply-To: <20250310151229.2365992-3-cleger@rivosinc.com>
References: <20250310151229.2365992-1-cleger@rivosinc.com> <20250310151229.2365992-3-cleger@rivosinc.com>
Message-ID: <20250313-5c22df0c08337905367fa125@orel>

On Mon, Mar 10, 2025 at 04:12:09PM +0100, Clément Léger wrote:
> This SBI extensions enables supervisor mode to control feature that are
> under M-mode control (For instance, Svadu menvcfg ADUE bit, Ssdbltrp
> DTE, etc).
>
> Signed-off-by: Clément Léger
> ---
>  arch/riscv/include/asm/sbi.h |  5 ++
>  arch/riscv/kernel/sbi.c      | 97 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 102 insertions(+)
>
> diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
> index bb077d0c912f..fc87c609c11a 100644
> --- a/arch/riscv/include/asm/sbi.h
> +++ b/arch/riscv/include/asm/sbi.h
> @@ -503,6 +503,11 @@ int sbi_remote_hfence_vvma_asid(const struct cpumask *cpu_mask,
>  				unsigned long asid);
>  long sbi_probe_extension(int ext);
>
> +int sbi_fwft_all_cpus_set(u32 feature, unsigned long value, unsigned long flags,
> +			  bool revert_on_failure);
> +int sbi_fwft_get(u32 feature, unsigned long *value);
> +int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags);
> +
>  /* Check if current SBI specification version is 0.1 or not */
>  static inline int sbi_spec_is_0_1(void)
>  {
> diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
> index 1989b8cade1b..256910db1307 100644
> --- a/arch/riscv/kernel/sbi.c
> +++ b/arch/riscv/kernel/sbi.c
> @@ -299,6 +299,103 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask,
>  	return 0;
>  }
>
> +int sbi_fwft_get(u32 feature, unsigned long *value)
> +{
> +	return -EOPNOTSUPP;
> +}
> +
> +/**
> + * sbi_fwft_set() - Set a feature on all online cpus

copy+paste of description from sbi_fwft_all_cpus_set(). This function
only sets the feature on the calling hart.

> + * @feature: The feature to be set
> + * @value: The feature value to be set
> + * @flags: FWFT feature set flags
> + *
> + * Return: 0 on success, appropriate linux error code otherwise.
> + */
> +int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags)
> +{
> +	return -EOPNOTSUPP;
> +}
> +
> +struct fwft_set_req {
> +	u32 feature;
> +	unsigned long value;
> +	unsigned long flags;
> +	cpumask_t mask;
> +};
> +
> +static void cpu_sbi_fwft_set(void *arg)
> +{
> +	struct fwft_set_req *req = arg;
> +
> +	if (sbi_fwft_set(req->feature, req->value, req->flags))
> +		cpumask_clear_cpu(smp_processor_id(), &req->mask);
> +}
> +
> +static int sbi_fwft_feature_local_set(u32 feature, unsigned long value,
> +				      unsigned long flags,
> +				      bool revert_on_fail)
> +{
> +	int ret;
> +	unsigned long prev_value;
> +	cpumask_t tmp;
> +	struct fwft_set_req req = {
> +		.feature = feature,
> +		.value = value,
> +		.flags = flags,
> +	};
> +
> +	cpumask_copy(&req.mask, cpu_online_mask);
> +
> +	/* We can not revert if features are locked */
> +	if (revert_on_fail && flags & SBI_FWFT_SET_FLAG_LOCK)

Should use () around the flags &. I thought checkpatch complained about
that?

> +		return -EINVAL;
> +
> +	/* Reset value is the same for all cpus, read it once. */

How do we know we're reading the reset value? sbi_fwft_all_cpus_set() may
be called multiple times on the same feature. And harts may have had
sbi_fwft_set() called on them independently. I think we should drop the
whole prev_value optimization.

> +	ret = sbi_fwft_get(feature, &prev_value);
> +	if (ret)
> +		return ret;
> +
> +	/* Feature might already be set to the value we want */
> +	if (prev_value == value)
> +		return 0;
> +
> +	on_each_cpu_mask(&req.mask, cpu_sbi_fwft_set, &req, 1);
> +	if (cpumask_equal(&req.mask, cpu_online_mask))
> +		return 0;
> +
> +	pr_err("Failed to set feature %x for all online cpus, reverting\n",
> +	       feature);

nit: I'd let the above line stick out. We have 100 chars.
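
On the parenthesization point raised earlier in this review: in C, bitwise '&' binds tighter than logical '&&', so the unparenthesized condition already evaluates as intended and the request is about readability. A small sketch, using a placeholder flag value rather than the real SBI_FWFT_SET_FLAG_LOCK definition:

```c
#include <stdbool.h>

/* Placeholder value for this sketch; the real flag is defined in asm/sbi.h. */
#define EXAMPLE_FLAG_LOCK (1UL << 0)

/* Form used in the patch: relies on '&' binding tighter than '&&'. */
static bool lock_check_implicit(bool revert_on_fail, unsigned long flags)
{
	return revert_on_fail && flags & EXAMPLE_FLAG_LOCK;
}

/* Form requested in review: same result, intent visible at a glance. */
static bool lock_check_explicit(bool revert_on_fail, unsigned long flags)
{
	return revert_on_fail && (flags & EXAMPLE_FLAG_LOCK);
}
```

Both helpers return the same value for every input; only the second makes the grouping explicit.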
> +
> +	req.value = prev_value;
> +	cpumask_copy(&tmp, &req.mask);
> +	on_each_cpu_mask(&req.mask, cpu_sbi_fwft_set, &req, 1);
> +	if (cpumask_equal(&req.mask, &tmp))
> +		return 0;

I'm not sure we want the revert_on_fail support either. What happens when
the revert fails and we return -EINVAL below? Also returning zero when
revert succeeds means the caller won't know if we successfully set what
we wanted or just successfully reverted.

> +
> +	return -EINVAL;
> +}
> +
> +/**
> + * sbi_fwft_all_cpus_set() - Set a feature on all online cpus
> + * @feature: The feature to be set
> + * @value: The feature value to be set
> + * @flags: FWFT feature set flags
> + * @revert_on_fail: true if feature value should be restored to it's orignal

its original

> + * value on failure.

Line 'value' up under 'true'

> + *
> + * Return: 0 on success, appropriate linux error code otherwise.
> + */
> +int sbi_fwft_all_cpus_set(u32 feature, unsigned long value, unsigned long flags,
> +			  bool revert_on_fail)
> +{
> +	if (feature & SBI_FWFT_GLOBAL_FEATURE_BIT)
> +		return sbi_fwft_set(feature, value, flags);
> +
> +	return sbi_fwft_feature_local_set(feature, value, flags,
> +					  revert_on_fail);
> +}
> +
>  /**
>   * sbi_set_timer() - Program the timer for next timer event.
>   * @stime_value: The value after which next timer event should fire.
> --
> 2.47.2

Thanks,
drew

From ajones at ventanamicro.com Thu Mar 13 05:44:16 2025
From: ajones at ventanamicro.com (Andrew Jones)
Date: Thu, 13 Mar 2025 13:44:16 +0100
Subject: [PATCH v3 03/17] riscv: sbi: add SBI FWFT extension calls
In-Reply-To: <20250310151229.2365992-4-cleger@rivosinc.com>
References: <20250310151229.2365992-1-cleger@rivosinc.com> <20250310151229.2365992-4-cleger@rivosinc.com>
Message-ID: <20250313-ce439653d16b484dba6a8d3e@orel>

On Mon, Mar 10, 2025 at 04:12:10PM +0100, Clément Léger wrote:
> Add FWFT extension calls. This will be ratified in SBI V3.0 hence, it is
> provided as a separate commit that can be left out if needed.
>
> Signed-off-by: Clément Léger
> ---
>  arch/riscv/kernel/sbi.c | 30 ++++++++++++++++++++++++++++--
>  1 file changed, 28 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
> index 256910db1307..af8e2199e32d 100644
> --- a/arch/riscv/kernel/sbi.c
> +++ b/arch/riscv/kernel/sbi.c
> @@ -299,9 +299,19 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask,
>  	return 0;
>  }
>
> +static bool sbi_fwft_supported;
> +
>  int sbi_fwft_get(u32 feature, unsigned long *value)
>  {
> -	return -EOPNOTSUPP;
> +	struct sbiret ret;
> +
> +	if (!sbi_fwft_supported)
> +		return -EOPNOTSUPP;
> +
> +	ret = sbi_ecall(SBI_EXT_FWFT, SBI_EXT_FWFT_GET,
> +			feature, 0, 0, 0, 0, 0);
> +
> +	return sbi_err_map_linux_errno(ret.error);
>  }
>
>  /**
> @@ -314,7 +324,15 @@ int sbi_fwft_get(u32 feature, unsigned long *value)
>   */
>  int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags)
>  {
> -	return -EOPNOTSUPP;
> +	struct sbiret ret;
> +
> +	if (!sbi_fwft_supported)
> +		return -EOPNOTSUPP;
> +
> +	ret = sbi_ecall(SBI_EXT_FWFT, SBI_EXT_FWFT_SET,
> +			feature, value, flags, 0, 0, 0);
> +
> +	return sbi_err_map_linux_errno(ret.error);

sbi_err_map_linux_errno() doesn't know about SBI_ERR_DENIED_LOCKED.
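
For illustration only, a user-space sketch of an error mapping that also handles the locked case. The EX_* values reflect my reading of the SBI specification and should be checked against asm/sbi.h; the choice of -EACCES for the locked case and the use of -EOPNOTSUPP as the fallback are assumptions made for this sketch, not what the kernel helper does:

```c
#include <errno.h>

/* SBI error codes as I understand them from the spec; verify before use. */
#define EX_SBI_SUCCESS			0
#define EX_SBI_ERR_FAILURE		(-1)
#define EX_SBI_ERR_NOT_SUPPORTED	(-2)
#define EX_SBI_ERR_INVALID_PARAM	(-3)
#define EX_SBI_ERR_DENIED		(-4)
#define EX_SBI_ERR_INVALID_ADDRESS	(-5)
#define EX_SBI_ERR_DENIED_LOCKED	(-14)

/* Hypothetical mapping helper; not the kernel's sbi_err_map_linux_errno(). */
static int example_sbi_err_to_errno(long err)
{
	switch (err) {
	case EX_SBI_SUCCESS:
		return 0;
	case EX_SBI_ERR_DENIED:
		return -EPERM;
	case EX_SBI_ERR_DENIED_LOCKED:
		/* Assumption: a distinct errno lets callers detect a locked feature. */
		return -EACCES;
	case EX_SBI_ERR_INVALID_PARAM:
		return -EINVAL;
	case EX_SBI_ERR_INVALID_ADDRESS:
		return -EFAULT;
	case EX_SBI_ERR_NOT_SUPPORTED:
	case EX_SBI_ERR_FAILURE:
	default:
		return -EOPNOTSUPP;
	}
}
```

Without an explicit case, a locked-feature error would fall through to the generic default and be indistinguishable from "not supported".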

>  }
>
>  struct fwft_set_req {
> @@ -389,6 +407,9 @@ static int sbi_fwft_feature_local_set(u32 feature, unsigned long value,
>  int sbi_fwft_all_cpus_set(u32 feature, unsigned long value, unsigned long flags,
>  			  bool revert_on_fail)
>  {
> +	if (!sbi_fwft_supported)
> +		return -EOPNOTSUPP;
> +
>  	if (feature & SBI_FWFT_GLOBAL_FEATURE_BIT)
>  		return sbi_fwft_set(feature, value, flags);
>
> @@ -719,6 +740,11 @@ void __init sbi_init(void)
>  			pr_info("SBI DBCN extension detected\n");
>  			sbi_debug_console_available = true;
>  		}
> +		if ((sbi_spec_version >= sbi_mk_version(2, 0)) &&

Should check sbi_mk_version(3, 0)

> +		    (sbi_probe_extension(SBI_EXT_FWFT) > 0)) {
> +			pr_info("SBI FWFT extension detected\n");
> +			sbi_fwft_supported = true;
> +		}
>  	} else {
>  		__sbi_set_timer = __sbi_set_timer_v01;
>  		__sbi_send_ipi = __sbi_send_ipi_v01;
> --
> 2.47.2
>

Thanks,
drew

From ajones at ventanamicro.com Thu Mar 13 05:52:58 2025
From: ajones at ventanamicro.com (Andrew Jones)
Date: Thu, 13 Mar 2025 13:52:58 +0100
Subject: [PATCH v3 04/17] riscv: misaligned: request misaligned exception from SBI
In-Reply-To: <20250310151229.2365992-5-cleger@rivosinc.com>
References: <20250310151229.2365992-1-cleger@rivosinc.com> <20250310151229.2365992-5-cleger@rivosinc.com>
Message-ID: <20250313-28a56381a2c44ebeff100f91@orel>

On Mon, Mar 10, 2025 at 04:12:11PM +0100, Clément Léger wrote:
> Now that the kernel can handle misaligned accesses in S-mode, request
> misaligned access exception delegation from SBI. This uses the FWFT SBI
> extension defined in SBI version 3.0.
> > Signed-off-by: Cl?ment L?ger > --- > arch/riscv/include/asm/cpufeature.h | 3 +- > arch/riscv/kernel/traps_misaligned.c | 77 +++++++++++++++++++++- > arch/riscv/kernel/unaligned_access_speed.c | 11 +++- > 3 files changed, 86 insertions(+), 5 deletions(-) > > diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h > index 569140d6e639..ad7d26788e6a 100644 > --- a/arch/riscv/include/asm/cpufeature.h > +++ b/arch/riscv/include/asm/cpufeature.h > @@ -64,8 +64,9 @@ void __init riscv_user_isa_enable(void); > _RISCV_ISA_EXT_DATA(_name, _id, _sub_exts, ARRAY_SIZE(_sub_exts), _validate) > > bool check_unaligned_access_emulated_all_cpus(void); > +void unaligned_access_init(void); > +int cpu_online_unaligned_access_init(unsigned int cpu); > #if defined(CONFIG_RISCV_SCALAR_MISALIGNED) > -void check_unaligned_access_emulated(struct work_struct *work __always_unused); > void unaligned_emulation_finish(void); > bool unaligned_ctl_available(void); > DECLARE_PER_CPU(long, misaligned_access_speed); > diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c > index 7cc108aed74e..90ac74191357 100644 > --- a/arch/riscv/kernel/traps_misaligned.c > +++ b/arch/riscv/kernel/traps_misaligned.c > @@ -16,6 +16,7 @@ > #include > #include > #include > +#include > #include > > #define INSN_MATCH_LB 0x3 > @@ -635,7 +636,7 @@ bool check_vector_unaligned_access_emulated_all_cpus(void) > > static bool unaligned_ctl __read_mostly; > > -void check_unaligned_access_emulated(struct work_struct *work __always_unused) > +static void check_unaligned_access_emulated(struct work_struct *work __always_unused) > { > int cpu = smp_processor_id(); > long *mas_ptr = per_cpu_ptr(&misaligned_access_speed, cpu); > @@ -646,6 +647,13 @@ void check_unaligned_access_emulated(struct work_struct *work __always_unused) > __asm__ __volatile__ ( > " "REG_L" %[tmp], 1(%[ptr])\n" > : [tmp] "=r" (tmp_val) : [ptr] "r" (&tmp_var) : "memory"); > +} > + > +static int 
cpu_online_check_unaligned_access_emulated(unsigned int cpu) > +{ > + long *mas_ptr = per_cpu_ptr(&misaligned_access_speed, cpu); > + > + check_unaligned_access_emulated(NULL); > > /* > * If unaligned_ctl is already set, this means that we detected that all > @@ -654,9 +662,10 @@ void check_unaligned_access_emulated(struct work_struct *work __always_unused) > */ > if (unlikely(unaligned_ctl && (*mas_ptr != RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED))) { > pr_crit("CPU misaligned accesses non homogeneous (expected all emulated)\n"); > - while (true) > - cpu_relax(); > + return -EINVAL; > } > + > + return 0; > } > > bool check_unaligned_access_emulated_all_cpus(void) > @@ -688,4 +697,66 @@ bool check_unaligned_access_emulated_all_cpus(void) > { > return false; > } > +static int cpu_online_check_unaligned_access_emulated(unsigned int cpu) > +{ > + return 0; > +} > #endif > + > +#ifdef CONFIG_RISCV_SBI > + > +static bool misaligned_traps_delegated; > + > +static int cpu_online_sbi_unaligned_setup(unsigned int cpu) > +{ > + if (sbi_fwft_set(SBI_FWFT_MISALIGNED_EXC_DELEG, 1, 0) && > + misaligned_traps_delegated) { > + pr_crit("Misaligned trap delegation non homogeneous (expected delegated)"); > + return -EINVAL; > + } > + > + return 0; > +} > + > +static void unaligned_sbi_request_delegation(void) > +{ > + int ret; > + > + ret = sbi_fwft_all_cpus_set(SBI_FWFT_MISALIGNED_EXC_DELEG, 1, 0, 0); > + if (ret) > + return; > + > + misaligned_traps_delegated = true; > + pr_info("SBI misaligned access exception delegation ok\n"); > + /* > + * Note that we don't have to take any specific action here, if > + * the delegation is successful, then > + * check_unaligned_access_emulated() will verify that indeed the > + * platform traps on misaligned accesses. 
> + */ > +} > + > +void unaligned_access_init(void) > +{ > + if (sbi_probe_extension(SBI_EXT_FWFT) > 0) > + unaligned_sbi_request_delegation(); > +} > +#else > +void unaligned_access_init(void) {} > + > +static int cpu_online_sbi_unaligned_setup(unsigned int cpu __always_unused) > +{ > + return 0; > +} > +#endif > + > +int cpu_online_unaligned_access_init(unsigned int cpu) > +{ > + int ret; > + > + ret = cpu_online_sbi_unaligned_setup(cpu); > + if (ret) > + return ret; > + > + return cpu_online_check_unaligned_access_emulated(cpu); > +} > diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c > index 91f189cf1611..2f3aba073297 100644 > --- a/arch/riscv/kernel/unaligned_access_speed.c > +++ b/arch/riscv/kernel/unaligned_access_speed.c > @@ -188,13 +188,20 @@ arch_initcall_sync(lock_and_set_unaligned_access_static_branch); > > static int riscv_online_cpu(unsigned int cpu) > { > + int ret; > static struct page *buf; > > /* We are already set since the last check */ > if (per_cpu(misaligned_access_speed, cpu) != RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN) > goto exit; > > - check_unaligned_access_emulated(NULL); > + ret = cpu_online_unaligned_access_init(cpu); > + if (ret) > + return ret; > + > + if (per_cpu(misaligned_access_speed, cpu) == RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED) > + goto exit; > + > buf = alloc_pages(GFP_KERNEL, MISALIGNED_BUFFER_ORDER); > if (!buf) { > pr_warn("Allocation failure, not measuring misaligned performance\n"); > @@ -403,6 +410,8 @@ static int check_unaligned_access_all_cpus(void) > { > bool all_cpus_emulated, all_cpus_vec_unsupported; > > + unaligned_access_init(); > + > all_cpus_emulated = check_unaligned_access_emulated_all_cpus(); > all_cpus_vec_unsupported = check_vector_unaligned_access_emulated_all_cpus(); > > -- > 2.47.2 > Reviewed-by: Andrew Jones From ajones at ventanamicro.com Thu Mar 13 05:57:32 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Thu, 13 Mar 2025 13:57:32 
+0100 Subject: [PATCH v3 05/17] riscv: misaligned: use on_each_cpu() for scalar misaligned access probing In-Reply-To: <20250310151229.2365992-6-cleger@rivosinc.com> References: <20250310151229.2365992-1-cleger@rivosinc.com> <20250310151229.2365992-6-cleger@rivosinc.com> Message-ID: <20250313-311b94f9bafe73bcd41158a1@orel> On Mon, Mar 10, 2025 at 04:12:12PM +0100, Clément Léger wrote: > schedule_on_each_cpu() was used without any good reason while documented > as very slow. This call was in the boot path, so better use > on_each_cpu() for scalar misaligned checking. Vector misaligned check > still needs to use schedule_on_each_cpu() since it requires irqs to be > enabled but that's less of a problem since this code is run in a kthread. > Add a comment to make that explicit. > > Signed-off-by: Clément Léger > --- > arch/riscv/kernel/traps_misaligned.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c > index 90ac74191357..ffac424faa88 100644 > --- a/arch/riscv/kernel/traps_misaligned.c > +++ b/arch/riscv/kernel/traps_misaligned.c > @@ -616,6 +616,11 @@ bool check_vector_unaligned_access_emulated_all_cpus(void) > return false; > } > > + /* > + * While being documented as very slow, schedule_on_each_cpu() is used > + * since kernel_vector_begin() that is called inside the vector code > + * expects irqs to be enabled or it will panic().
which expects > + */ > schedule_on_each_cpu(check_vector_unaligned_access_emulated); > > for_each_online_cpu(cpu) > @@ -636,7 +641,7 @@ bool check_vector_unaligned_access_emulated_all_cpus(void) > > static bool unaligned_ctl __read_mostly; > > -static void check_unaligned_access_emulated(struct work_struct *work __always_unused) > +static void check_unaligned_access_emulated(void *arg __always_unused) > { > int cpu = smp_processor_id(); > long *mas_ptr = per_cpu_ptr(&misaligned_access_speed, cpu); > @@ -677,7 +682,7 @@ bool check_unaligned_access_emulated_all_cpus(void) > * accesses emulated since tasks requesting such control can run on any > * CPU. > */ > - schedule_on_each_cpu(check_unaligned_access_emulated); > + on_each_cpu(check_unaligned_access_emulated, NULL, 1); > > for_each_online_cpu(cpu) > if (per_cpu(misaligned_access_speed, cpu) > -- > 2.47.2 > Reviewed-by: Andrew Jones From ajones at ventanamicro.com Thu Mar 13 06:06:09 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Thu, 13 Mar 2025 14:06:09 +0100 Subject: [PATCH v3 06/17] riscv: misaligned: use correct CONFIG_ ifdef for misaligned_access_speed In-Reply-To: <20250310151229.2365992-7-cleger@rivosinc.com> References: <20250310151229.2365992-1-cleger@rivosinc.com> <20250310151229.2365992-7-cleger@rivosinc.com> Message-ID: <20250313-a437330d8e1c638a9aa61e0a@orel> On Mon, Mar 10, 2025 at 04:12:13PM +0100, Clément Léger wrote: > misaligned_access_speed is defined under CONFIG_RISCV_SCALAR_MISALIGNED > but was used under CONFIG_RISCV_PROBE_UNALIGNED_ACCESS. Fix that by > using the correct config option.
> > Signed-off-by: Clément Léger > --- > arch/riscv/kernel/traps_misaligned.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c > index ffac424faa88..7fe25adf2539 100644 > --- a/arch/riscv/kernel/traps_misaligned.c > +++ b/arch/riscv/kernel/traps_misaligned.c > @@ -362,7 +362,7 @@ static int handle_scalar_misaligned_load(struct pt_regs *regs) > > perf_sw_event(PERF_COUNT_SW_ALIGNMENT_FAULTS, 1, regs, addr); > > -#ifdef CONFIG_RISCV_PROBE_UNALIGNED_ACCESS > +#ifdef CONFIG_RISCV_SCALAR_MISALIGNED > *this_cpu_ptr(&misaligned_access_speed) = RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED; > #endif Sure, but CONFIG_RISCV_PROBE_UNALIGNED_ACCESS selects CONFIG_RISCV_SCALAR_MISALIGNED, so this isn't fixing anything. Changing it does make sense though since this line in handle_scalar_misaligned_load() "belongs" to check_unaligned_access_emulated() which is also under CONFIG_RISCV_SCALAR_MISALIGNED. Anyway, all these unaligned configs need a major cleanup. Reviewed-by: Andrew Jones Thanks, drew > > -- > 2.47.2 From ajones at ventanamicro.com Thu Mar 13 06:07:18 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Thu, 13 Mar 2025 14:07:18 +0100 Subject: [PATCH v3 07/17] riscv: misaligned: move emulated access uniformity check in a function In-Reply-To: <20250310151229.2365992-8-cleger@rivosinc.com> References: <20250310151229.2365992-1-cleger@rivosinc.com> <20250310151229.2365992-8-cleger@rivosinc.com> Message-ID: <20250313-89b46bd06fbea0072ac4932f@orel> On Mon, Mar 10, 2025 at 04:12:14PM +0100, Clément Léger wrote: > Split the code that checks for the uniformity of misaligned accesses > performance on all cpus from check_unaligned_access_emulated_all_cpus() > to its own function which will be used for delegation check. No > functional changes intended.
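The split described in this commit message can be pictured with a small user-space sketch. Everything here is an assumption made for illustration: `mock_speed` and `mock_all_cpus_emulated()` are hypothetical names, and a plain array stands in for the kernel's per-CPU `misaligned_access_speed` variable. Only the "return false on the first CPU that is not emulated" logic mirrors the helper in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the RISCV_HWPROBE_MISALIGNED_SCALAR_* values. */
enum mock_speed { MOCK_UNKNOWN, MOCK_EMULATED, MOCK_SLOW, MOCK_FAST };

/*
 * Return true only if every CPU slot reports emulated misaligned accesses.
 * The array plays the role of the per-CPU variable; ncpus plays the role of
 * the set of online CPUs.
 */
static bool mock_all_cpus_emulated(const enum mock_speed *speed, size_t ncpus)
{
	size_t cpu;

	for (cpu = 0; cpu < ncpus; cpu++)
		if (speed[cpu] != MOCK_EMULATED)
			return false;

	return true;
}
```

Keeping the loop in its own predicate lets a second caller (the delegation check, in the series) reuse it without re-running the probing step.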
> > Signed-off-by: Cl?ment L?ger > --- > arch/riscv/kernel/traps_misaligned.c | 18 +++++++++++++----- > 1 file changed, 13 insertions(+), 5 deletions(-) > > diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c > index 7fe25adf2539..db31966a834e 100644 > --- a/arch/riscv/kernel/traps_misaligned.c > +++ b/arch/riscv/kernel/traps_misaligned.c > @@ -673,10 +673,20 @@ static int cpu_online_check_unaligned_access_emulated(unsigned int cpu) > return 0; > } > > -bool check_unaligned_access_emulated_all_cpus(void) > +static bool all_cpus_unaligned_scalar_access_emulated(void) > { > int cpu; > > + for_each_online_cpu(cpu) > + if (per_cpu(misaligned_access_speed, cpu) != > + RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED) > + return false; > + > + return true; > +} > + > +bool check_unaligned_access_emulated_all_cpus(void) > +{ > /* > * We can only support PR_UNALIGN controls if all CPUs have misaligned > * accesses emulated since tasks requesting such control can run on any > @@ -684,10 +694,8 @@ bool check_unaligned_access_emulated_all_cpus(void) > */ > on_each_cpu(check_unaligned_access_emulated, NULL, 1); > > - for_each_online_cpu(cpu) > - if (per_cpu(misaligned_access_speed, cpu) > - != RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED) > - return false; > + if (!all_cpus_unaligned_scalar_access_emulated()) > + return false; > > unaligned_ctl = true; > return true; > -- > 2.47.2 > Reviewed-by: Andrew Jones From ajones at ventanamicro.com Thu Mar 13 06:19:32 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Thu, 13 Mar 2025 14:19:32 +0100 Subject: [PATCH v3 08/17] riscv: misaligned: add a function to check misalign trap delegability In-Reply-To: <20250310151229.2365992-9-cleger@rivosinc.com> References: <20250310151229.2365992-1-cleger@rivosinc.com> <20250310151229.2365992-9-cleger@rivosinc.com> Message-ID: <20250313-4bea400c5770a5a5d3d70b38@orel> On Mon, Mar 10, 2025 at 04:12:15PM +0100, Cl?ment L?ger wrote: > Checking for the delegability 
of the misaligned access trap is needed > for the KVM FWFT extension implementation. Add a function to get the > delegability of the misaligned trap exception. > > Signed-off-by: Cl?ment L?ger > --- > arch/riscv/include/asm/cpufeature.h | 5 +++++ > arch/riscv/kernel/traps_misaligned.c | 17 +++++++++++++++-- > 2 files changed, 20 insertions(+), 2 deletions(-) > > diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h > index ad7d26788e6a..8b97cba99fc3 100644 > --- a/arch/riscv/include/asm/cpufeature.h > +++ b/arch/riscv/include/asm/cpufeature.h > @@ -69,12 +69,17 @@ int cpu_online_unaligned_access_init(unsigned int cpu); > #if defined(CONFIG_RISCV_SCALAR_MISALIGNED) > void unaligned_emulation_finish(void); > bool unaligned_ctl_available(void); > +bool misaligned_traps_can_delegate(void); > DECLARE_PER_CPU(long, misaligned_access_speed); > #else > static inline bool unaligned_ctl_available(void) > { > return false; > } > +static inline bool misaligned_traps_can_delegate(void) > +{ > + return false; > +} > #endif > > bool check_vector_unaligned_access_emulated_all_cpus(void); > diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c > index db31966a834e..a67a6e709a06 100644 > --- a/arch/riscv/kernel/traps_misaligned.c > +++ b/arch/riscv/kernel/traps_misaligned.c > @@ -716,10 +716,10 @@ static int cpu_online_check_unaligned_access_emulated(unsigned int cpu) > } > #endif > > -#ifdef CONFIG_RISCV_SBI > - > static bool misaligned_traps_delegated; > > +#ifdef CONFIG_RISCV_SBI > + > static int cpu_online_sbi_unaligned_setup(unsigned int cpu) > { > if (sbi_fwft_set(SBI_FWFT_MISALIGNED_EXC_DELEG, 1, 0) && > @@ -761,6 +761,7 @@ static int cpu_online_sbi_unaligned_setup(unsigned int cpu __always_unused) > { > return 0; > } > + > #endif > > int cpu_online_unaligned_access_init(unsigned int cpu) > @@ -773,3 +774,15 @@ int cpu_online_unaligned_access_init(unsigned int cpu) > > return 
cpu_online_check_unaligned_access_emulated(cpu); > } > + > +bool misaligned_traps_can_delegate(void) > +{ > + /* > + * Either we successfully requested misaligned traps delegation for all > + * CPUS or the SBI does not implemented FWFT extension but delegated the > + * exception by default. > + */ > + return misaligned_traps_delegated || > + all_cpus_unaligned_scalar_access_emulated(); > +} > +EXPORT_SYMBOL_GPL(misaligned_traps_can_delegate); > \ No newline at end of file Check your editor settings. > -- > 2.47.2 Reviewed-by: Andrew Jones From ajones at ventanamicro.com Thu Mar 13 07:27:17 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Thu, 13 Mar 2025 15:27:17 +0100 Subject: [PATCH v3 14/17] RISC-V: KVM: add SBI extension init()/deinit() functions In-Reply-To: <20250310151229.2365992-15-cleger@rivosinc.com> References: <20250310151229.2365992-1-cleger@rivosinc.com> <20250310151229.2365992-15-cleger@rivosinc.com> Message-ID: <20250313-f08cee46c912f729d1829d37@orel> On Mon, Mar 10, 2025 at 04:12:21PM +0100, Cl?ment L?ger wrote: > The FWFT SBI extension will need to dynamically allocate memory and do > init time specific initialization. Add an init/deinit callbacks that > allows to do so. > > Signed-off-by: Cl?ment L?ger > --- > arch/riscv/include/asm/kvm_vcpu_sbi.h | 9 +++++++++ > arch/riscv/kvm/vcpu.c | 2 ++ > arch/riscv/kvm/vcpu_sbi.c | 29 +++++++++++++++++++++++++++ > 3 files changed, 40 insertions(+) > > diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h > index 4ed6203cdd30..bcb90757b149 100644 > --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h > +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h > @@ -49,6 +49,14 @@ struct kvm_vcpu_sbi_extension { > > /* Extension specific probe function */ > unsigned long (*probe)(struct kvm_vcpu *vcpu); > + > + /* > + * Init/deinit function called once during VCPU init/destroy. 
These > + * might be use if the SBI extensions need to allocate or do specific > + * init time only configuration. > + */ > + int (*init)(struct kvm_vcpu *vcpu); > + void (*deinit)(struct kvm_vcpu *vcpu); > }; > > void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run); > @@ -69,6 +77,7 @@ const struct kvm_vcpu_sbi_extension *kvm_vcpu_sbi_find_ext( > bool riscv_vcpu_supports_sbi_ext(struct kvm_vcpu *vcpu, int idx); > int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run); > void kvm_riscv_vcpu_sbi_init(struct kvm_vcpu *vcpu); > +void kvm_riscv_vcpu_sbi_deinit(struct kvm_vcpu *vcpu); > > int kvm_riscv_vcpu_get_reg_sbi_sta(struct kvm_vcpu *vcpu, unsigned long reg_num, > unsigned long *reg_val); > diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c > index 60d684c76c58..877bcc85c067 100644 > --- a/arch/riscv/kvm/vcpu.c > +++ b/arch/riscv/kvm/vcpu.c > @@ -185,6 +185,8 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu) > > void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu) > { > + kvm_riscv_vcpu_sbi_deinit(vcpu); > + > /* Cleanup VCPU AIA context */ > kvm_riscv_vcpu_aia_deinit(vcpu); > > diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c > index d1c83a77735e..858ddefd7e7f 100644 > --- a/arch/riscv/kvm/vcpu_sbi.c > +++ b/arch/riscv/kvm/vcpu_sbi.c > @@ -505,8 +505,37 @@ void kvm_riscv_vcpu_sbi_init(struct kvm_vcpu *vcpu) > continue; > } > > + if (!ext->default_disabled && ext->init && > + ext->init(vcpu) != 0) { > + scontext->ext_status[idx] = KVM_RISCV_SBI_EXT_STATUS_UNAVAILABLE; > + continue; > + } I think this new block should be below the assignment below (and it can drop the continue) and it shouldn't check default_disabled (as I've done below). IOW, we should always run ext->init when there is one to run here. Otherwise, how will it get run later? > + > scontext->ext_status[idx] = ext->default_disabled ?
> KVM_RISCV_SBI_EXT_STATUS_DISABLED : > KVM_RISCV_SBI_EXT_STATUS_ENABLED; if (ext->init && ext->init(vcpu)) scontext->ext_status[idx] = KVM_RISCV_SBI_EXT_STATUS_UNAVAILABLE; > } > } > + > +void kvm_riscv_vcpu_sbi_deinit(struct kvm_vcpu *vcpu) > +{ > + struct kvm_vcpu_sbi_context *scontext = &vcpu->arch.sbi_context; > + const struct kvm_riscv_sbi_extension_entry *entry; > + const struct kvm_vcpu_sbi_extension *ext; > + int idx, i; > + > + for (i = 0; i < ARRAY_SIZE(sbi_ext); i++) { > + entry = &sbi_ext[i]; > + ext = entry->ext_ptr; > + idx = entry->ext_idx; > + > + if (idx < 0 || idx >= ARRAY_SIZE(scontext->ext_status)) > + continue; > + > + if (scontext->ext_status[idx] == KVM_RISCV_SBI_EXT_STATUS_UNAVAILABLE || > + !ext->deinit) > + continue; > + > + ext->deinit(vcpu); > + } > +} > -- > 2.47.2 > Thanks, drew From ajones at ventanamicro.com Thu Mar 13 07:29:02 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Thu, 13 Mar 2025 15:29:02 +0100 Subject: [PATCH v3 15/17] RISC-V: KVM: add SBI extension reset callback In-Reply-To: <20250310151229.2365992-16-cleger@rivosinc.com> References: <20250310151229.2365992-1-cleger@rivosinc.com> <20250310151229.2365992-16-cleger@rivosinc.com> Message-ID: <20250313-d269cf1812f8d080947fd64d@orel> On Mon, Mar 10, 2025 at 04:12:22PM +0100, Clément Léger wrote: > Currently, only the STA extension needed a reset function but that's also > going to be the case for FWFT. Add a reset callback that can be > implemented by SBI extensions.
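As a rough illustration of the optional-callback pattern this commit message describes, here is a minimal user-space sketch. The `mock_*` types, the status enum and `mock_sbi_reset_all()` are hypothetical stand-ins and not the kernel's definitions; only the idea of skipping extensions that are not enabled or that provide no `reset` hook is taken from the patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the KVM types; not the kernel definitions. */
enum mock_ext_status { MOCK_EXT_ENABLED, MOCK_EXT_DISABLED, MOCK_EXT_UNAVAILABLE };

struct mock_vcpu {
	int sta_resets;   /* number of times the mock STA reset hook ran */
	int fwft_resets;  /* number of times the mock FWFT reset hook ran */
};

struct mock_sbi_ext {
	enum mock_ext_status status;
	void (*reset)(struct mock_vcpu *vcpu); /* optional, may be NULL */
};

static void mock_sta_reset(struct mock_vcpu *vcpu)
{
	vcpu->sta_resets++;
}

static void mock_fwft_reset(struct mock_vcpu *vcpu)
{
	vcpu->fwft_resets++;
}

/* Call reset() only for enabled extensions that actually provide the hook. */
static void mock_sbi_reset_all(struct mock_vcpu *vcpu,
			       const struct mock_sbi_ext *exts, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (exts[i].status != MOCK_EXT_ENABLED || !exts[i].reset)
			continue;

		exts[i].reset(vcpu);
	}
}
```

With a table-driven loop like this, the generic vCPU reset path needs a single call and no longer has to name individual extensions.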
> > Signed-off-by: Cl?ment L?ger > --- > arch/riscv/include/asm/kvm_host.h | 1 - > arch/riscv/include/asm/kvm_vcpu_sbi.h | 2 ++ > arch/riscv/kvm/vcpu.c | 2 +- > arch/riscv/kvm/vcpu_sbi.c | 24 ++++++++++++++++++++++++ > arch/riscv/kvm/vcpu_sbi_sta.c | 3 ++- > 5 files changed, 29 insertions(+), 3 deletions(-) > > diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h > index cc33e35cd628..bb93d2995ea2 100644 > --- a/arch/riscv/include/asm/kvm_host.h > +++ b/arch/riscv/include/asm/kvm_host.h > @@ -409,7 +409,6 @@ void __kvm_riscv_vcpu_power_on(struct kvm_vcpu *vcpu); > void kvm_riscv_vcpu_power_on(struct kvm_vcpu *vcpu); > bool kvm_riscv_vcpu_stopped(struct kvm_vcpu *vcpu); > > -void kvm_riscv_vcpu_sbi_sta_reset(struct kvm_vcpu *vcpu); > void kvm_riscv_vcpu_record_steal_time(struct kvm_vcpu *vcpu); > > #endif /* __RISCV_KVM_HOST_H__ */ > diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h > index bcb90757b149..cb68b3a57c8f 100644 > --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h > +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h > @@ -57,6 +57,7 @@ struct kvm_vcpu_sbi_extension { > */ > int (*init)(struct kvm_vcpu *vcpu); > void (*deinit)(struct kvm_vcpu *vcpu); > + void (*reset)(struct kvm_vcpu *vcpu); > }; > > void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run); > @@ -78,6 +79,7 @@ bool riscv_vcpu_supports_sbi_ext(struct kvm_vcpu *vcpu, int idx); > int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run); > void kvm_riscv_vcpu_sbi_init(struct kvm_vcpu *vcpu); > void kvm_riscv_vcpu_sbi_deinit(struct kvm_vcpu *vcpu); > +void kvm_riscv_vcpu_sbi_reset(struct kvm_vcpu *vcpu); > > int kvm_riscv_vcpu_get_reg_sbi_sta(struct kvm_vcpu *vcpu, unsigned long reg_num, > unsigned long *reg_val); > diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c > index 877bcc85c067..542747e2c7f5 100644 > --- a/arch/riscv/kvm/vcpu.c > +++ b/arch/riscv/kvm/vcpu.c > @@ -94,7 +94,7 @@ static 
void kvm_riscv_reset_vcpu(struct kvm_vcpu *vcpu) > vcpu->arch.hfence_tail = 0; > memset(vcpu->arch.hfence_queue, 0, sizeof(vcpu->arch.hfence_queue)); > > - kvm_riscv_vcpu_sbi_sta_reset(vcpu); > + kvm_riscv_vcpu_sbi_reset(vcpu); > > /* Reset the guest CSRs for hotplug usecase */ > if (loaded) > diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c > index 858ddefd7e7f..18726096ef44 100644 > --- a/arch/riscv/kvm/vcpu_sbi.c > +++ b/arch/riscv/kvm/vcpu_sbi.c > @@ -539,3 +539,27 @@ void kvm_riscv_vcpu_sbi_deinit(struct kvm_vcpu *vcpu) > ext->deinit(vcpu); > } > } > + > +void kvm_riscv_vcpu_sbi_reset(struct kvm_vcpu *vcpu) > +{ > + struct kvm_vcpu_sbi_context *scontext = &vcpu->arch.sbi_context; > + const struct kvm_riscv_sbi_extension_entry *entry; > + const struct kvm_vcpu_sbi_extension *ext; > + int idx, i; > + > + for (i = 0; i < ARRAY_SIZE(sbi_ext); i++) { > + entry = &sbi_ext[i]; > + ext = entry->ext_ptr; > + idx = entry->ext_idx; > + > + if (idx < 0 || idx >= ARRAY_SIZE(scontext->ext_status)) > + continue; > + > + if (scontext->ext_status[idx] != KVM_RISCV_SBI_EXT_STATUS_ENABLED || > + !ext->reset) > + continue; > + > + ext->reset(vcpu); > + } > +} > + > diff --git a/arch/riscv/kvm/vcpu_sbi_sta.c b/arch/riscv/kvm/vcpu_sbi_sta.c > index 5f35427114c1..cc6cb7c8f0e4 100644 > --- a/arch/riscv/kvm/vcpu_sbi_sta.c > +++ b/arch/riscv/kvm/vcpu_sbi_sta.c > @@ -16,7 +16,7 @@ > #include > #include > > -void kvm_riscv_vcpu_sbi_sta_reset(struct kvm_vcpu *vcpu) > +static void kvm_riscv_vcpu_sbi_sta_reset(struct kvm_vcpu *vcpu) > { > vcpu->arch.sta.shmem = INVALID_GPA; > vcpu->arch.sta.last_steal = 0; > @@ -156,6 +156,7 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_sta = { > .extid_end = SBI_EXT_STA, > .handler = kvm_sbi_ext_sta_handler, > .probe = kvm_sbi_ext_sta_probe, > + .reset = kvm_riscv_vcpu_sbi_sta_reset, > }; > > int kvm_riscv_vcpu_get_reg_sbi_sta(struct kvm_vcpu *vcpu, > -- > 2.47.2 > Reviewed-by: Andrew Jones From ajones at ventanamicro.com Thu Mar 13 
08:18:18 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Thu, 13 Mar 2025 16:18:18 +0100 Subject: [PATCH v3 16/17] RISC-V: KVM: add support for FWFT SBI extension In-Reply-To: <20250310151229.2365992-17-cleger@rivosinc.com> References: <20250310151229.2365992-1-cleger@rivosinc.com> <20250310151229.2365992-17-cleger@rivosinc.com> Message-ID: <20250313-b15bd3f690fccfc5d1a946fc@orel> On Mon, Mar 10, 2025 at 04:12:23PM +0100, Cl?ment L?ger wrote: > Add basic infrastructure to support the FWFT extension in KVM. > > Signed-off-by: Cl?ment L?ger > --- > arch/riscv/include/asm/kvm_host.h | 4 + > arch/riscv/include/asm/kvm_vcpu_sbi.h | 1 + > arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h | 31 +++ > arch/riscv/include/uapi/asm/kvm.h | 1 + > arch/riscv/kvm/Makefile | 1 + > arch/riscv/kvm/vcpu_sbi.c | 4 + > arch/riscv/kvm/vcpu_sbi_fwft.c | 212 +++++++++++++++++++++ > 7 files changed, 254 insertions(+) > create mode 100644 arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h > create mode 100644 arch/riscv/kvm/vcpu_sbi_fwft.c > > diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h > index bb93d2995ea2..c0db61ba691a 100644 > --- a/arch/riscv/include/asm/kvm_host.h > +++ b/arch/riscv/include/asm/kvm_host.h > @@ -19,6 +19,7 @@ > #include > #include > #include > +#include > #include > #include > > @@ -281,6 +282,9 @@ struct kvm_vcpu_arch { > /* Performance monitoring context */ > struct kvm_pmu pmu_context; > > + /* Firmware feature SBI extension context */ > + struct kvm_sbi_fwft fwft_context; > + > /* 'static' configurations which are set only once */ > struct kvm_vcpu_config cfg; > > diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h > index cb68b3a57c8f..ffd03fed0c06 100644 > --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h > +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h > @@ -98,6 +98,7 @@ extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_hsm; > extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_dbcn; > 
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_susp; > extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_sta; > +extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_fwft; > extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_experimental; > extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_vendor; > > diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h b/arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h > new file mode 100644 > index 000000000000..ec7568e0dc1a > --- /dev/null > +++ b/arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h > @@ -0,0 +1,31 @@ > +/* SPDX-License-Identifier: GPL-2.0-only */ > +/* > + * Copyright (c) 2025 Rivos Inc. > + * > + * Authors: > + * Cl?ment L?ger > + */ > + > +#ifndef __KVM_VCPU_RISCV_FWFT_H > +#define __KVM_VCPU_RISCV_FWFT_H > + > +#include > + > +struct kvm_sbi_fwft_config; > +struct kvm_sbi_fwft_feature; > +struct kvm_vcpu; Should only need the forward declaration for kvm_sbi_fwft_feature. > + > +struct kvm_sbi_fwft_config { > + const struct kvm_sbi_fwft_feature *feature; > + bool supported; > + unsigned long flags; > +}; > + > +/* FWFT data structure per vcpu */ > +struct kvm_sbi_fwft { > + struct kvm_sbi_fwft_config *configs; > +}; > + > +#define vcpu_to_fwft(vcpu) (&(vcpu)->arch.fwft_context) > + > +#endif /* !__KVM_VCPU_RISCV_FWFT_H */ > diff --git a/arch/riscv/include/uapi/asm/kvm.h b/arch/riscv/include/uapi/asm/kvm.h > index f06bc5efcd79..fa6eee1caf41 100644 > --- a/arch/riscv/include/uapi/asm/kvm.h > +++ b/arch/riscv/include/uapi/asm/kvm.h > @@ -202,6 +202,7 @@ enum KVM_RISCV_SBI_EXT_ID { > KVM_RISCV_SBI_EXT_DBCN, > KVM_RISCV_SBI_EXT_STA, > KVM_RISCV_SBI_EXT_SUSP, > + KVM_RISCV_SBI_EXT_FWFT, > KVM_RISCV_SBI_EXT_MAX, > }; > > diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile > index 4e0bba91d284..06e2d52a9b88 100644 > --- a/arch/riscv/kvm/Makefile > +++ b/arch/riscv/kvm/Makefile > @@ -26,6 +26,7 @@ kvm-y += vcpu_onereg.o > kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o > kvm-y += vcpu_sbi.o > 
kvm-y += vcpu_sbi_base.o > +kvm-y += vcpu_sbi_fwft.o > kvm-y += vcpu_sbi_hsm.o > kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_sbi_pmu.o > kvm-y += vcpu_sbi_replace.o > diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c > index 18726096ef44..27f22e98c8f8 100644 > --- a/arch/riscv/kvm/vcpu_sbi.c > +++ b/arch/riscv/kvm/vcpu_sbi.c > @@ -78,6 +78,10 @@ static const struct kvm_riscv_sbi_extension_entry sbi_ext[] = { > .ext_idx = KVM_RISCV_SBI_EXT_STA, > .ext_ptr = &vcpu_sbi_ext_sta, > }, > + { > + .ext_idx = KVM_RISCV_SBI_EXT_FWFT, > + .ext_ptr = &vcpu_sbi_ext_fwft, > + }, > { > .ext_idx = KVM_RISCV_SBI_EXT_EXPERIMENTAL, > .ext_ptr = &vcpu_sbi_ext_experimental, > diff --git a/arch/riscv/kvm/vcpu_sbi_fwft.c b/arch/riscv/kvm/vcpu_sbi_fwft.c > new file mode 100644 > index 000000000000..cce1e41d5490 > --- /dev/null > +++ b/arch/riscv/kvm/vcpu_sbi_fwft.c > @@ -0,0 +1,212 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Copyright (c) 2025 Rivos Inc. > + * > + * Authors: > + * Cl?ment L?ger > + */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +struct kvm_sbi_fwft_feature { > + /** > + * @id: Feature ID > + */ > + enum sbi_fwft_feature_t id; > + > + /** > + * @supported: Check if the feature is supported on the vcpu > + * > + * This callback is optional, if not provided the feature is assumed to > + * be supported > + */ > + bool (*supported)(struct kvm_vcpu *vcpu); > + > + /** > + * @set: Set the feature value > + * > + * This callback is mandatory Since we're doing all this documentation, then let's also state they return SBI errors (which are longs, so we should probably return longs). 
> + */ > + int (*set)(struct kvm_vcpu *vcpu, struct kvm_sbi_fwft_config *conf, unsigned long value); > + > + /** > + * @get: Get the feature current value > + * > + * This callback is mandatory > + */ > + int (*get)(struct kvm_vcpu *vcpu, struct kvm_sbi_fwft_config *conf, unsigned long *value); > +}; > + > +static const enum sbi_fwft_feature_t kvm_fwft_defined_features[] = { > + SBI_FWFT_MISALIGNED_EXC_DELEG, > + SBI_FWFT_LANDING_PAD, > + SBI_FWFT_SHADOW_STACK, > + SBI_FWFT_DOUBLE_TRAP, > + SBI_FWFT_PTE_AD_HW_UPDATING, > + SBI_FWFT_POINTER_MASKING_PMLEN, > +}; > + > +static bool kvm_fwft_is_defined_feature(enum sbi_fwft_feature_t feature) > +{ > + int i; > + > + for (i = 0; i < ARRAY_SIZE(kvm_fwft_defined_features); i++) { > + if (kvm_fwft_defined_features[i] == feature) > + return true; > + } > + > + return false; > +} > + > +static const struct kvm_sbi_fwft_feature features[] = { > +}; > + > +static struct kvm_sbi_fwft_config * > +kvm_sbi_fwft_get_config(struct kvm_vcpu *vcpu, enum sbi_fwft_feature_t feature) > +{ > + int i = 0; no need for '= 0' > + struct kvm_sbi_fwft *fwft = vcpu_to_fwft(vcpu); > + > + for (i = 0; i < ARRAY_SIZE(features); i++) { > + if (fwft->configs[i].feature->id == feature) > + return &fwft->configs[i]; > + } > + > + return NULL; > +} > + > +static int kvm_fwft_get_feature(struct kvm_vcpu *vcpu, u32 feature, > + struct kvm_sbi_fwft_config **conf) > +{ > + struct kvm_sbi_fwft_config *tconf; > + > + tconf = kvm_sbi_fwft_get_config(vcpu, feature); > + if (!tconf) { > + if (kvm_fwft_is_defined_feature(feature)) > + return SBI_ERR_NOT_SUPPORTED; > + > + return SBI_ERR_DENIED; > + } > + > + if (!tconf->supported) > + return SBI_ERR_NOT_SUPPORTED; > + > + *conf = tconf; > + > + return SBI_SUCCESS; > +} > + > +static int kvm_sbi_fwft_set(struct kvm_vcpu *vcpu, u32 feature, > + unsigned long value, unsigned long flags) > +{ > + int ret; > + struct kvm_sbi_fwft_config *conf; > + > + ret = kvm_fwft_get_feature(vcpu, feature, &conf); > + if (ret) > + 
return ret; > + > + if ((flags & ~SBI_FWFT_SET_FLAG_LOCK) != 0) > + return SBI_ERR_INVALID_PARAM; > + > + if (conf->flags & SBI_FWFT_SET_FLAG_LOCK) > + return SBI_ERR_DENIED_LOCKED; > + > + conf->flags = flags; > + > + return conf->feature->set(vcpu, conf, value); > +} > + > +static int kvm_sbi_fwft_get(struct kvm_vcpu *vcpu, unsigned long feature, > + unsigned long *value) > +{ > + int ret; > + struct kvm_sbi_fwft_config *conf; > + > + ret = kvm_fwft_get_feature(vcpu, feature, &conf); > + if (ret) > + return ret; > + > + return conf->feature->get(vcpu, conf, value); > +} > + > +static int kvm_sbi_ext_fwft_handler(struct kvm_vcpu *vcpu, struct kvm_run *run, > + struct kvm_vcpu_sbi_return *retdata) > +{ > + int ret = 0; > + struct kvm_cpu_context *cp = &vcpu->arch.guest_context; > + unsigned long funcid = cp->a6; > + > + switch (funcid) { > + case SBI_EXT_FWFT_SET: > + ret = kvm_sbi_fwft_set(vcpu, cp->a0, cp->a1, cp->a2); > + break; > + case SBI_EXT_FWFT_GET: > + ret = kvm_sbi_fwft_get(vcpu, cp->a0, &retdata->out_val); > + break; > + default: > + ret = SBI_ERR_NOT_SUPPORTED; > + break; > + } > + > + retdata->err_val = ret; > + > + return 0; > +} > + > +static int kvm_sbi_ext_fwft_init(struct kvm_vcpu *vcpu) > +{ > + struct kvm_sbi_fwft *fwft = vcpu_to_fwft(vcpu); > + const struct kvm_sbi_fwft_feature *feature; > + struct kvm_sbi_fwft_config *conf; > + int i; > + > + fwft->configs = kcalloc(ARRAY_SIZE(features), sizeof(struct kvm_sbi_fwft_config), > + GFP_KERNEL); > + if (!fwft->configs) > + return -ENOMEM; > + > + for (i = 0; i < ARRAY_SIZE(features); i++) { > + feature = &features[i]; > + conf = &fwft->configs[i]; > + if (feature->supported) > + conf->supported = feature->supported(vcpu); > + else > + conf->supported = true; > + > + conf->feature = feature; > + } > + > + return 0; > +} > + > +static void kvm_sbi_ext_fwft_deinit(struct kvm_vcpu *vcpu) > +{ > + struct kvm_sbi_fwft *fwft = vcpu_to_fwft(vcpu); > + > + kfree(fwft->configs); > +} > + > +static void 
kvm_sbi_ext_fwft_reset(struct kvm_vcpu *vcpu) > +{ > + int i = 0; no need for '= 0' > + struct kvm_sbi_fwft *fwft = vcpu_to_fwft(vcpu); > + > + for (i = 0; i < ARRAY_SIZE(features); i++) > + fwft->configs[i].flags = 0; > +} > + > +const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_fwft = { > + .extid_start = SBI_EXT_FWFT, > + .extid_end = SBI_EXT_FWFT, > + .handler = kvm_sbi_ext_fwft_handler, > + .init = kvm_sbi_ext_fwft_init, > + .deinit = kvm_sbi_ext_fwft_deinit, > + .reset = kvm_sbi_ext_fwft_reset, > +}; > -- > 2.47.2 > Reviewed-by: Andrew Jones From ajones at ventanamicro.com Thu Mar 13 08:23:13 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Thu, 13 Mar 2025 16:23:13 +0100 Subject: [PATCH v3 17/17] RISC-V: KVM: add support for SBI_FWFT_MISALIGNED_DELEG In-Reply-To: <20250310151229.2365992-18-cleger@rivosinc.com> References: <20250310151229.2365992-1-cleger@rivosinc.com> <20250310151229.2365992-18-cleger@rivosinc.com> Message-ID: <20250313-16176e19c15b63a156cb534c@orel> On Mon, Mar 10, 2025 at 04:12:24PM +0100, Clément Léger wrote: > SBI_FWFT_MISALIGNED_DELEG needs hedeleg to be modified to delegate > misaligned load/store exceptions. Save and restore it during CPU > load/put.
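To make the hedeleg manipulation concrete, the following user-space sketch models the CSR as a plain variable. The cause numbers 4 (load address misaligned) and 6 (store/AMO address misaligned) are the standard RISC-V exception codes; `mock_hedeleg` and the `mock_*` helpers are hypothetical and only mimic the set/get semantics discussed in this patch (0 and 1 are accepted, any other value is rejected). The real code operates on the hardware CSR and returns SBI error codes instead of -1.

```c
#include <stdint.h>

/* Standard RISC-V exception cause numbers. */
#define EXC_LOAD_MISALIGNED	4
#define EXC_STORE_MISALIGNED	6

/* Both misaligned load and store delegation bits. */
#define MIS_DELEG	((UINT64_C(1) << EXC_LOAD_MISALIGNED) | \
			 (UINT64_C(1) << EXC_STORE_MISALIGNED))

/* Hypothetical stand-in for the hedeleg CSR. */
static uint64_t mock_hedeleg;

/* Returns 0 on success, -1 if value is neither 0 nor 1. */
static int mock_set_misaligned_deleg(unsigned long value)
{
	if (value == 1)
		mock_hedeleg |= MIS_DELEG;
	else if (value == 0)
		mock_hedeleg &= ~MIS_DELEG;
	else
		return -1;

	return 0;
}

/* Reports 1 if any of the misaligned delegation bits is set, else 0. */
static unsigned long mock_get_misaligned_deleg(void)
{
	return (mock_hedeleg & MIS_DELEG) != 0;
}
```

Because the guest can flip these bits at run time, the value has to be read back when the vCPU is put and written again when it is loaded, which is what the commit message refers to.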
> > Signed-off-by: Cl?ment L?ger > Reviewed-by: Deepak Gupta > --- > arch/riscv/kvm/vcpu.c | 3 +++ > arch/riscv/kvm/vcpu_sbi_fwft.c | 39 ++++++++++++++++++++++++++++++++++ > 2 files changed, 42 insertions(+) > > diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c > index 542747e2c7f5..d98e379945c3 100644 > --- a/arch/riscv/kvm/vcpu.c > +++ b/arch/riscv/kvm/vcpu.c > @@ -646,6 +646,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) > { > void *nsh; > struct kvm_vcpu_csr *csr = &vcpu->arch.guest_csr; > + struct kvm_vcpu_config *cfg = &vcpu->arch.cfg; > > vcpu->cpu = -1; > > @@ -671,6 +672,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) > csr->vstval = nacl_csr_read(nsh, CSR_VSTVAL); > csr->hvip = nacl_csr_read(nsh, CSR_HVIP); > csr->vsatp = nacl_csr_read(nsh, CSR_VSATP); > + cfg->hedeleg = nacl_csr_read(nsh, CSR_HEDELEG); > } else { > csr->vsstatus = csr_read(CSR_VSSTATUS); > csr->vsie = csr_read(CSR_VSIE); > @@ -681,6 +683,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) > csr->vstval = csr_read(CSR_VSTVAL); > csr->hvip = csr_read(CSR_HVIP); > csr->vsatp = csr_read(CSR_VSATP); > + cfg->hedeleg = csr_read(CSR_HEDELEG); > } > } > > diff --git a/arch/riscv/kvm/vcpu_sbi_fwft.c b/arch/riscv/kvm/vcpu_sbi_fwft.c > index cce1e41d5490..756fda1cf2e7 100644 > --- a/arch/riscv/kvm/vcpu_sbi_fwft.c > +++ b/arch/riscv/kvm/vcpu_sbi_fwft.c > @@ -14,6 +14,8 @@ > #include > #include > > +#define MIS_DELEG (BIT_ULL(EXC_LOAD_MISALIGNED) | BIT_ULL(EXC_STORE_MISALIGNED)) > + > struct kvm_sbi_fwft_feature { > /** > * @id: Feature ID > @@ -64,7 +66,44 @@ static bool kvm_fwft_is_defined_feature(enum sbi_fwft_feature_t feature) > return false; > } > > +static bool kvm_sbi_fwft_misaligned_delegation_supported(struct kvm_vcpu *vcpu) > +{ > + if (!misaligned_traps_can_delegate()) > + return false; > + > + return true; Just return misaligned_traps_can_delegate(); > +} > + > +static int kvm_sbi_fwft_set_misaligned_delegation(struct kvm_vcpu *vcpu, > + struct kvm_sbi_fwft_config *conf, > 
+ unsigned long value) > +{ > + if (value == 1) > + csr_set(CSR_HEDELEG, MIS_DELEG); > + else if (value == 0) > + csr_clear(CSR_HEDELEG, MIS_DELEG); > + else > + return SBI_ERR_INVALID_PARAM; > + > + return SBI_SUCCESS; > +} > + > +static int kvm_sbi_fwft_get_misaligned_delegation(struct kvm_vcpu *vcpu, > + struct kvm_sbi_fwft_config *conf, > + unsigned long *value) > +{ > + *value = (csr_read(CSR_HEDELEG) & MIS_DELEG) != 0; > + > + return SBI_SUCCESS; > +} > + > static const struct kvm_sbi_fwft_feature features[] = { > + { > + .id = SBI_FWFT_MISALIGNED_EXC_DELEG, > + .supported = kvm_sbi_fwft_misaligned_delegation_supported, > + .set = kvm_sbi_fwft_set_misaligned_delegation, > + .get = kvm_sbi_fwft_get_misaligned_delegation, > + }, > }; > > static struct kvm_sbi_fwft_config * > -- > 2.47.2 > Reviewed-by: Andrew Jones From cleger at rivosinc.com Fri Mar 14 02:29:04 2025 From: cleger at rivosinc.com (=?UTF-8?B?Q2zDqW1lbnQgTMOpZ2Vy?=) Date: Fri, 14 Mar 2025 10:29:04 +0100 Subject: [kvm-unit-tests PATCH v8 6/6] riscv: sbi: Add SSE extension tests In-Reply-To: <20250312-fa9b1889ef5f422be2e20cf4@orel> References: <20250307161549.1873770-1-cleger@rivosinc.com> <20250307161549.1873770-7-cleger@rivosinc.com> <20250312-fa9b1889ef5f422be2e20cf4@orel> Message-ID: On 12/03/2025 18:51, Andrew Jones wrote: > On Fri, Mar 07, 2025 at 05:15:48PM +0100, Cl?ment L?ger wrote: >> Add SBI SSE extension tests for the following features: >> - Test attributes errors (invalid values, RO, etc) >> - Registration errors >> - Simple events (register, enable, inject) >> - Events with different priorities >> - Global events dispatch on different harts >> - Local events on all harts >> - Hart mask/unmask events >> >> Signed-off-by: Cl?ment L?ger >> --- >> riscv/Makefile | 1 + >> riscv/sbi-tests.h | 1 + >> riscv/sbi-sse.c | 1215 +++++++++++++++++++++++++++++++++++++++++++++ >> riscv/sbi.c | 2 + >> 4 files changed, 1219 insertions(+) >> create mode 100644 riscv/sbi-sse.c >> >> diff --git 
a/riscv/Makefile b/riscv/Makefile >> index 16fc125b..4fe2f1bb 100644 >> --- a/riscv/Makefile >> +++ b/riscv/Makefile >> @@ -18,6 +18,7 @@ tests += $(TEST_DIR)/sieve.$(exe) >> all: $(tests) >> >> $(TEST_DIR)/sbi-deps = $(TEST_DIR)/sbi-asm.o $(TEST_DIR)/sbi-fwft.o >> +$(TEST_DIR)/sbi-deps += $(TEST_DIR)/sbi-sse.o >> >> # When built for EFI sieve needs extra memory, run with e.g. '-m 256' on QEMU >> $(TEST_DIR)/sieve.$(exe): AUXFLAGS = 0x1 >> diff --git a/riscv/sbi-tests.h b/riscv/sbi-tests.h >> index b081464d..a71da809 100644 >> --- a/riscv/sbi-tests.h >> +++ b/riscv/sbi-tests.h >> @@ -71,6 +71,7 @@ >> sbiret_report(ret, expected_error, expected_value, "check sbi.error and sbi.value") >> >> void sbi_bad_fid(int ext); >> +void check_sse(void); >> >> #endif /* __ASSEMBLER__ */ >> #endif /* _RISCV_SBI_TESTS_H_ */ >> diff --git a/riscv/sbi-sse.c b/riscv/sbi-sse.c >> new file mode 100644 >> index 00000000..7bd58b8b >> --- /dev/null >> +++ b/riscv/sbi-sse.c >> @@ -0,0 +1,1215 @@ >> +// SPDX-License-Identifier: GPL-2.0-only >> +/* >> + * SBI SSE testsuite >> + * >> + * Copyright (C) 2025, Rivos Inc., Cl?ment L?ger >> + */ >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> + >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> + >> +#include "sbi-tests.h" >> + >> +#define SSE_STACK_SIZE PAGE_SIZE >> + >> +struct sse_event_info { >> + uint32_t event_id; >> + const char *name; >> + bool can_inject; >> +}; >> + >> +static struct sse_event_info sse_event_infos[] = { >> + { >> + .event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, >> + .name = "local_high_prio_ras", >> + }, >> + { >> + .event_id = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP, >> + .name = "double_trap", >> + }, >> + { >> + .event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, >> + .name = "global_high_prio_ras", >> + }, >> + { >> + .event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, >> + .name = "local_pmu_overflow", >> + }, >> + { >> + .event_id = 
SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, >> + .name = "local_low_prio_ras", >> + }, >> + { >> + .event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, >> + .name = "global_low_prio_ras", >> + }, >> + { >> + .event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, >> + .name = "local_software", >> + }, >> + { >> + .event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, >> + .name = "global_software", >> + }, >> +}; >> + >> +static const char *const attr_names[] = { >> + [SBI_SSE_ATTR_STATUS] = "status", >> + [SBI_SSE_ATTR_PRIORITY] = "priority", >> + [SBI_SSE_ATTR_CONFIG] = "config", >> + [SBI_SSE_ATTR_PREFERRED_HART] = "preferred_hart", >> + [SBI_SSE_ATTR_ENTRY_PC] = "entry_pc", >> + [SBI_SSE_ATTR_ENTRY_ARG] = "entry_arg", >> + [SBI_SSE_ATTR_INTERRUPTED_SEPC] = "interrupted_sepc", >> + [SBI_SSE_ATTR_INTERRUPTED_FLAGS] = "interrupted_flags", >> + [SBI_SSE_ATTR_INTERRUPTED_A6] = "interrupted_a6", >> + [SBI_SSE_ATTR_INTERRUPTED_A7] = "interrupted_a7", > > nit: tabulate > >> +}; >> + >> +static const unsigned long ro_attrs[] = { >> + SBI_SSE_ATTR_STATUS, >> + SBI_SSE_ATTR_ENTRY_PC, >> + SBI_SSE_ATTR_ENTRY_ARG, >> +}; >> + >> +static const unsigned long interrupted_attrs[] = { >> + SBI_SSE_ATTR_INTERRUPTED_SEPC, >> + SBI_SSE_ATTR_INTERRUPTED_FLAGS, >> + SBI_SSE_ATTR_INTERRUPTED_A6, >> + SBI_SSE_ATTR_INTERRUPTED_A7, >> +}; >> + >> +static const unsigned long interrupted_flags[] = { >> + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPP, >> + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPIE, >> + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPELP, >> + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT, >> + SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPV, >> + SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPVP, >> +}; >> + >> +static struct sse_event_info *sse_event_get_info(uint32_t event_id) >> +{ >> + int i; >> + >> + for (i = 0; i < ARRAY_SIZE(sse_event_infos); i++) { >> + if (sse_event_infos[i].event_id == event_id) >> + return &sse_event_infos[i]; >> + } >> + >> + assert_msg(false, "Invalid event id: %d", event_id); >> +} >> + >> +static const 
char *sse_event_name(uint32_t event_id) >> +{ >> + return sse_event_get_info(event_id)->name; >> +} >> + >> +static bool sse_event_can_inject(uint32_t event_id) >> +{ >> + return sse_event_get_info(event_id)->can_inject; >> +} >> + >> +static struct sbiret sse_get_event_status_field(uint32_t event_id, unsigned long mask, >> + unsigned long shift, unsigned long *value) >> +{ >> + struct sbiret ret; >> + unsigned long status; >> + >> + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, 1, &status); >> + if (ret.error) { >> + sbiret_report_error(&ret, SBI_SUCCESS, "Get event status"); >> + return ret; >> + } >> + >> + *value = (status & mask) >> shift; >> + >> + return ret; >> +} >> + >> +static struct sbiret sse_event_get_state(uint32_t event_id, enum sbi_sse_state *state) >> +{ >> + unsigned long status = 0; >> + struct sbiret ret; >> + >> + ret = sse_get_event_status_field(event_id, SBI_SSE_ATTR_STATUS_STATE_MASK, >> + SBI_SSE_ATTR_STATUS_STATE_OFFSET, &status); >> + *state = status; >> + >> + return ret; >> +} >> + >> +static unsigned long sse_global_event_set_current_hart(uint32_t event_id) >> +{ >> + struct sbiret ret; >> + unsigned long current_hart = current_thread_info()->hartid; >> + >> + if (!sbi_sse_event_is_global(event_id)) >> + return SBI_ERR_INVALID_PARAM; > > The only two callers of sse_global_event_set_current_hart() check > sbi_sse_event_is_global() themselves, so this check can be an assert(). > I think it should be anyway, since returning an SBI error, which didn't > originate in SBI, could lead to confusion. Indeed, that check is clearly meant to be an assert. 
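For illustration, here is a minimal sketch of what the assert-based variant could look like. The stubs below (is_global_event(), write_preferred_hart(), and the bit used as the global flag) are hypothetical stand-ins so the snippet builds on its own; they are not the kvm-unit-tests helpers, and the real function would keep calling sbi_sse_event_is_global() and sbi_sse_write_attrs().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in: assume bit 15 of the event ID marks a global event. */
#define EVENT_GLOBAL_BIT (1u << 15)

static bool is_global_event(uint32_t event_id)
{
	return (event_id & EVENT_GLOBAL_BIT) != 0;
}

/* Hypothetical stand-in for the SBI attribute write; 0 means success. */
static long write_preferred_hart(uint32_t event_id, unsigned long hartid)
{
	(void)event_id;
	(void)hartid;
	return 0;
}

static long global_event_set_current_hart(uint32_t event_id, unsigned long hartid)
{
	/*
	 * Passing a local event here is a caller bug, not an SBI failure,
	 * so trap it instead of returning a made-up SBI error code.
	 */
	assert(is_global_event(event_id));

	return write_preferred_hart(event_id, hartid);
}
```

With this shape, any non-zero return value can only have come from the SBI call itself, which is the point of the review comment.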
> >> + >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &current_hart); >> + if (sbiret_report_error(&ret, SBI_SUCCESS, "Set preferred hart")) >> + return ret.error; >> + >> + return 0; >> +} >> + >> +static bool sse_check_state(uint32_t event_id, unsigned long expected_state) >> +{ >> + struct sbiret ret; >> + enum sbi_sse_state state; >> + >> + ret = sse_event_get_state(event_id, &state); >> + if (ret.error) >> + return false; >> + >> + return report(state == expected_state, "event status == %ld", expected_state); >> +} >> + >> +static bool sse_event_pending(uint32_t event_id) >> +{ >> + bool pending = 0; >> + >> + sse_get_event_status_field(event_id, BIT(SBI_SSE_ATTR_STATUS_PENDING_OFFSET), >> + SBI_SSE_ATTR_STATUS_PENDING_OFFSET, (unsigned long*)&pending); >> + >> + return pending; >> +} >> + >> +static void *sse_alloc_stack(void) >> +{ >> + /* >> + * We assume that SSE_STACK_SIZE always fit in one page. This page will >> + * always be decremented before storing anything on it in sse-entry.S. 
>> + */ >> + assert(SSE_STACK_SIZE <= PAGE_SIZE); >> + >> + return (alloc_page() + SSE_STACK_SIZE); >> +} >> + >> +static void sse_free_stack(void *stack) >> +{ >> + free_page(stack - SSE_STACK_SIZE); >> +} >> + >> +static void sse_read_write_test(uint32_t event_id, unsigned long attr, unsigned long attr_count, >> + unsigned long *value, long expected_error, const char *str) >> +{ >> + struct sbiret ret; >> + >> + ret = sbi_sse_read_attrs(event_id, attr, attr_count, value); >> + sbiret_report_error(&ret, expected_error, "Read %s error", str); >> + >> + ret = sbi_sse_write_attrs(event_id, attr, attr_count, value); >> + sbiret_report_error(&ret, expected_error, "Write %s error", str); >> +} >> + >> +#define ALL_ATTRS_COUNT (SBI_SSE_ATTR_INTERRUPTED_A7 + 1) >> + >> +static void sse_test_attrs(uint32_t event_id) >> +{ >> + unsigned long value = 0; >> + struct sbiret ret; >> + void *ptr; >> + unsigned long values[ALL_ATTRS_COUNT]; >> + unsigned int i; >> + const char *invalid_hart_str; >> + const char *attr_name; >> + >> + report_prefix_push("attrs"); >> + >> + for (i = 0; i < ARRAY_SIZE(ro_attrs); i++) { >> + ret = sbi_sse_write_attrs(event_id, ro_attrs[i], 1, &value); >> + sbiret_report_error(&ret, SBI_ERR_DENIED, "RO attribute %s not writable", >> + attr_names[ro_attrs[i]]); >> + } >> + >> + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, ALL_ATTRS_COUNT, values); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Read multiple attributes"); >> + >> + for (i = SBI_SSE_ATTR_STATUS; i <= SBI_SSE_ATTR_INTERRUPTED_A7; i++) { >> + ret = sbi_sse_read_attrs(event_id, i, 1, &value); >> + attr_name = attr_names[i]; >> + >> + sbiret_report_error(&ret, SBI_SUCCESS, "Read single attribute %s", attr_name); >> + if (values[i] != value) >> + report_fail("Attribute 0x%x single value read (0x%lx) differs from the one read with multiple attributes (0x%lx)", >> + i, value, values[i]); >> + /* >> + * Preferred hart reset value is defined by SBI vendor >> + */ >> + if(i != 
SBI_SSE_ATTR_PREFERRED_HART) { > ^ missing space > >> + /* >> + * Specification states that injectable bit is implementation dependent >> + * but other bits are zero-initialized. >> + */ >> + if (i == SBI_SSE_ATTR_STATUS) >> + value &= ~BIT(SBI_SSE_ATTR_STATUS_INJECT_OFFSET); >> + report(value == 0, "Attribute %s reset value is 0, found %lx", attr_name, >> + value); > > nit: no need to wrap the line > >> + } >> + } >> + >> +#if __riscv_xlen > 32 >> + value = BIT(32); >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Write invalid prio > 0xFFFFFFFF error"); >> +#endif >> + >> + value = ~SBI_SSE_ATTR_CONFIG_ONESHOT; >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Write invalid config value error"); >> + >> + if (sbi_sse_event_is_global(event_id)) { >> + invalid_hart_str = getenv("INVALID_HART_ID"); >> + if (!invalid_hart_str) >> + value = 0xFFFFFFFFUL; >> + else >> + value = strtoul(invalid_hart_str, NULL, 0); >> + >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid hart id error"); >> + } else { >> + /* Set Hart on local event -> RO */ >> + value = current_thread_info()->hartid; >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); >> + sbiret_report_error(&ret, SBI_ERR_DENIED, >> + "Set hart id on local event error"); >> + } >> + >> + /* Set/get flags, sepc, a6, a7 */ >> + for (i = 0; i < ARRAY_SIZE(interrupted_attrs); i++) { >> + attr_name = attr_names[interrupted_attrs[i]]; >> + ret = sbi_sse_read_attrs(event_id, interrupted_attrs[i], 1, &value); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Get interrupted %s", attr_name); >> + >> + value = ARRAY_SIZE(interrupted_attrs) - i; >> + ret = sbi_sse_write_attrs(event_id, interrupted_attrs[i], 1, &value); >> + sbiret_report_error(&ret, 
SBI_ERR_INVALID_STATE, >> + "Set attribute %s invalid state error", attr_name); >> + } >> + >> + sse_read_write_test(event_id, SBI_SSE_ATTR_STATUS, 0, &value, SBI_ERR_INVALID_PARAM, >> + "attribute attr_count == 0"); >> + sse_read_write_test(event_id, SBI_SSE_ATTR_INTERRUPTED_A7 + 1, 1, &value, SBI_ERR_BAD_RANGE, >> + "invalid attribute"); >> + >> + /* Misaligned pointer address */ >> + ptr = (void *)&value; >> + ptr += 1; >> + sse_read_write_test(event_id, SBI_SSE_ATTR_STATUS, 1, ptr, SBI_ERR_INVALID_ADDRESS, >> + "attribute with invalid address"); >> + >> + report_prefix_pop(); >> +} >> + >> +static void sse_test_register_error(uint32_t event_id) >> +{ >> + struct sbiret ret; >> + >> + report_prefix_push("register"); >> + >> + ret = sbi_sse_unregister(event_id); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "unregister non-registered event"); >> + >> + ret = sbi_sse_register_raw(event_id, 0x1, 0); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "register misaligned entry"); >> + >> + ret = sbi_sse_register_raw(event_id, (unsigned long)sbi_sse_entry, 0); > > sbi_sse_register(event_id, sbi_sse_entry, NULL) to avoid a cast. 
> >> + sbiret_report_error(&ret, SBI_SUCCESS, "register"); >> + if (ret.error) >> + goto done; >> + >> + ret = sbi_sse_register_raw(event_id, (unsigned long)sbi_sse_entry, 0); > > same > >> + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "register used event failure"); >> + >> + ret = sbi_sse_unregister(event_id); >> + sbiret_report_error(&ret, SBI_SUCCESS, "unregister"); >> + >> +done: >> + report_prefix_pop(); >> +} >> + >> +struct sse_simple_test_arg { >> + bool done; >> + unsigned long expected_a6; >> + uint32_t event_id; >> +}; >> + >> +#if __riscv_xlen > 32 >> + >> +struct alias_test_params { >> + unsigned long event_id; >> + unsigned long attr_id; >> + unsigned long attr_count; >> + const char *str; >> +}; >> + >> +static void test_alias(uint32_t event_id) >> +{ >> + struct alias_test_params *write, *read; >> + unsigned long write_value, read_value; >> + struct sbiret ret; >> + bool err = false; >> + int r, w; >> + struct alias_test_params params[] = { >> + {event_id, SBI_SSE_ATTR_INTERRUPTED_A6, 1, "non aliased"}, >> + {BIT(32) + event_id, SBI_SSE_ATTR_INTERRUPTED_A6, 1, "aliased event_id"}, >> + {event_id, BIT(32) + SBI_SSE_ATTR_INTERRUPTED_A6, 1, "aliased attr_id"}, >> + {event_id, SBI_SSE_ATTR_INTERRUPTED_A6, BIT(32) + 1, "aliased attr_count"}, >> + }; >> + >> + report_prefix_push("alias"); >> + for (w = 0; w < ARRAY_SIZE(params); w++) { >> + write = &params[w]; >> + >> + write_value = 0xDEADBEEF + w; >> + ret = sbi_sse_write_attrs(write->event_id, write->attr_id, write->attr_count, &write_value); >> + if (ret.error) >> + sbiret_report_error(&ret, SBI_SUCCESS, "Write %s, event 0x%lx attr 0x%lx, attr count 0x%lx", write->str, write->event_id, write->attr_id, write->attr_count); > > I like long lines, but this one is too long even for me. I'd break it > after the format string. 
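As a rough sketch of the suggested wrapping, the call can keep the format string on one line and move the variadic arguments to the next. format_write_report() below is a hypothetical helper built on snprintf() so the snippet is self-contained; it only mirrors the argument shape of the sbiret_report_error() call in the patch and is not part of kvm-unit-tests.

```c
#include <stdio.h>

/*
 * Hypothetical helper: formats the same message as the patch's
 * sbiret_report_error() call, but into a caller-provided buffer.
 */
static int format_write_report(char *buf, size_t len, const char *str,
			       unsigned long event_id, unsigned long attr_id,
			       unsigned long attr_count)
{
	/* Break after the format string; the arguments follow on their own line. */
	return snprintf(buf, len,
			"Write %s, event 0x%lx attr 0x%lx, attr count 0x%lx",
			str, event_id, attr_id, attr_count);
}
```

The format string stays greppable on a single line, and only the argument list wraps.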
> >> + >> + for (r = 0; r < ARRAY_SIZE(params); r++) { >> + read = &params[r]; >> + read_value = 0; >> + ret = sbi_sse_read_attrs(read->event_id, read->attr_id, read->attr_count, &read_value); >> + if (ret.error) >> + sbiret_report_error(&ret, SBI_SUCCESS, >> + "Read %s, event 0x%lx attr 0x%lx, attr count 0x%lx", read->str, read->event_id, read->attr_id, read->attr_count); > > same > >> + >> + /* Do not spam output with a lot of reports */ >> + if (write_value != read_value) { >> + err = true; >> + report_fail("Write %s, event 0x%lx attr 0x%lx, attr count 0x%lx value %lx ==" >> + "Read %s, event 0x%lx attr 0x%lx, attr count 0x%lx value %lx", >> + write->str, write->event_id, write->attr_id, write->attr_count, >> + write_value, >> + read->str, read->event_id, read->attr_id, read->attr_count, >> + read_value); > > indentation is off > >> + } >> + } >> + } >> + >> + report(!err, "BIT(32) aliasing tests"); >> + report_prefix_pop(); >> +} >> +#endif >> + >> +static void sse_simple_handler(void *data, struct pt_regs *regs, unsigned int hartid) >> +{ >> + struct sse_simple_test_arg *arg = data; >> + int i; >> + struct sbiret ret; >> + const char *attr_name; >> + uint32_t event_id = READ_ONCE(arg->event_id), attr; >> + unsigned long value, prev_value, flags; >> + unsigned long interrupted_state[ARRAY_SIZE(interrupted_attrs)]; >> + unsigned long modified_state[ARRAY_SIZE(interrupted_attrs)] = {4, 3, 2, 1}; >> + unsigned long tmp_state[ARRAY_SIZE(interrupted_attrs)]; >> + >> + report((regs->status & SR_SPP) == SR_SPP, "Interrupted S-mode"); >> + report(hartid == current_thread_info()->hartid, "Hartid correctly passed"); >> + sse_check_state(event_id, SBI_SSE_STATE_RUNNING); >> + report(!sse_event_pending(event_id), "Event not pending"); >> + >> + /* Read full interrupted state */ >> + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, >> + ARRAY_SIZE(interrupted_attrs), interrupted_state); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Save full interrupted state 
from handler"); >> + >> + /* Write full modified state and read it */ >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, >> + ARRAY_SIZE(modified_state), modified_state); >> + sbiret_report_error(&ret, SBI_SUCCESS, >> + "Write full interrupted state from handler"); >> + >> + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, >> + ARRAY_SIZE(tmp_state), tmp_state); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Read full modified state from handler"); >> + >> + report(memcmp(tmp_state, modified_state, sizeof(modified_state)) == 0, >> + "Full interrupted state successfully written"); >> + >> +#if __riscv_xlen > 32 >> + test_alias(event_id); >> +#endif >> + >> + /* Restore full saved state */ >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, >> + ARRAY_SIZE(interrupted_attrs), interrupted_state); > > indentation is off > >> + sbiret_report_error(&ret, SBI_SUCCESS, >> + "Full interrupted state restore from handler"); > > no need to wrap > >> + >> + /* We test SBI_SSE_ATTR_INTERRUPTED_FLAGS below with specific flag values */ >> + for (i = 0; i < ARRAY_SIZE(interrupted_attrs); i++) { >> + attr = interrupted_attrs[i]; >> + if (attr == SBI_SSE_ATTR_INTERRUPTED_FLAGS) >> + continue; >> + >> + attr_name = attr_names[attr]; >> + >> + ret = sbi_sse_read_attrs(event_id, attr, 1, &prev_value); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Get attr %s", attr_name); >> + >> + value = 0xDEADBEEF + i; >> + ret = sbi_sse_write_attrs(event_id, attr, 1, &value); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Set attr %s", attr_name); >> + >> + ret = sbi_sse_read_attrs(event_id, attr, 1, &value); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Get attr %s", attr_name); >> + report(value == 0xDEADBEEF + i, "Get attr %s, value: 0x%lx", attr_name, >> + value); > > no need to wrap > >> + >> + ret = sbi_sse_write_attrs(event_id, attr, 1, &prev_value); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Restore attr %s value", attr_name); >> + } >> + 
>> + /* Test all flags allowed for SBI_SSE_ATTR_INTERRUPTED_FLAGS */ >> + attr = SBI_SSE_ATTR_INTERRUPTED_FLAGS; >> + ret = sbi_sse_read_attrs(event_id, attr, 1, &prev_value); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Save interrupted flags"); >> + >> + for (i = 0; i < ARRAY_SIZE(interrupted_flags); i++) { >> + flags = interrupted_flags[i]; >> + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); >> + sbiret_report_error(&ret, SBI_SUCCESS, >> + "Set interrupted flags bit 0x%lx value", flags); >> + ret = sbi_sse_read_attrs(event_id, attr, 1, &value); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Get interrupted flags after set"); >> + report(value == flags, "interrupted flags modified value: 0x%lx", value); >> + } >> + >> + /* Write invalid bit in flag register */ >> + flags = SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT << 1; >> + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", >> + flags); >> + > > no need to wrap That's actually > 100 columns if unwrap. 
> >> + flags = BIT(SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT + 1); >> + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", >> + flags); >> + > > no need to wrap Ditto > >> + ret = sbi_sse_write_attrs(event_id, attr, 1, &prev_value); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Restore interrupted flags"); >> + >> + /* Try to change HARTID/Priority while running */ >> + if (sbi_sse_event_is_global(event_id)) { >> + value = current_thread_info()->hartid; >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "Set hart id while running error"); >> + } >> + >> + value = 0; >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "Set priority while running error"); >> + >> + value = READ_ONCE(arg->expected_a6); >> + report(interrupted_state[2] == value, "Interrupted state a6, expected 0x%lx, got 0x%lx", >> + value, interrupted_state[2]); >> + >> + report(interrupted_state[3] == SBI_EXT_SSE, >> + "Interrupted state a7, expected 0x%x, got 0x%lx", SBI_EXT_SSE, >> + interrupted_state[3]); >> + >> + WRITE_ONCE(arg->done, true); >> +} >> + >> +static int sse_test_inject_simple(uint32_t event_id) >> +{ >> + unsigned long value, error; >> + struct sbiret ret; >> + int err_ret = 1; >> + struct sse_simple_test_arg test_arg = {.event_id = event_id}; >> + struct sbi_sse_handler_arg args = { >> + .handler = sse_simple_handler, >> + .handler_data = (void *)&test_arg, >> + .stack = sse_alloc_stack(), >> + }; >> + >> + report_prefix_push("simple"); >> + >> + if (!sse_check_state(event_id, SBI_SSE_STATE_UNUSED)) >> + goto err; >> + >> + ret = sbi_sse_register(event_id, &args); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "register")) >> + goto err; >> + >> + if (!sse_check_state(event_id, SBI_SSE_STATE_REGISTERED)) >> + goto err; 
>> + >> + if (sbi_sse_event_is_global(event_id)) { >> + /* Be sure global events are targeting the current hart */ >> + error = sse_global_event_set_current_hart(event_id); >> + if (error) >> + goto err; >> + } >> + >> + ret = sbi_sse_enable(event_id); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "enable")) >> + goto err; >> + >> + if (!sse_check_state(event_id, SBI_SSE_STATE_ENABLED)) >> + goto err; >> + >> + ret = sbi_sse_hart_mask(); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "hart mask")) >> + goto err; >> + >> + ret = sbi_sse_inject(event_id, current_thread_info()->hartid); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "injection masked")) { >> + sbi_sse_hart_unmask(); >> + goto err; >> + } >> + >> + report(READ_ONCE(test_arg.done) == 0, "event masked not handled"); >> + >> + /* >> + * When unmasking the SSE events, we expect it to be injected >> + * immediately so a6 should be SBI_EXT_SBI_SSE_HART_UNMASK >> + */ >> + WRITE_ONCE(test_arg.expected_a6, SBI_EXT_SSE_HART_UNMASK); >> + ret = sbi_sse_hart_unmask(); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "hart unmask")) { >> + goto err; >> + } >> + >> + report(READ_ONCE(test_arg.done) == 1, "event unmasked handled"); >> + WRITE_ONCE(test_arg.done, 0); >> + WRITE_ONCE(test_arg.expected_a6, SBI_EXT_SSE_INJECT); >> + >> + /* Set as oneshot and verify it is disabled */ >> + ret = sbi_sse_disable(event_id); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Disable event")) { >> + /* Nothing we can really do here, event can not be disabled */ >> + goto err; >> + } >> + >> + value = SBI_SSE_ATTR_CONFIG_ONESHOT; >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Set event attribute as ONESHOT")) >> + goto err; >> + >> + ret = sbi_sse_enable(event_id); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Enable event")) >> + goto err; >> + >> + ret = sbi_sse_inject(event_id, current_thread_info()->hartid); >> + if 
(!sbiret_report_error(&ret, SBI_SUCCESS, "second injection")) >> + goto err; >> + >> + report(READ_ONCE(test_arg.done) == 1, "event handled"); >> + WRITE_ONCE(test_arg.done, 0); >> + >> + if (!sse_check_state(event_id, SBI_SSE_STATE_REGISTERED)) >> + goto err; >> + >> + /* Clear ONESHOT FLAG */ >> + value = 0; >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Clear CONFIG.ONESHOT flag")) >> + goto err; >> + >> + ret = sbi_sse_unregister(event_id); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "unregister")) >> + goto err; >> + >> + sse_check_state(event_id, SBI_SSE_STATE_UNUSED); >> + >> + err_ret = 0; >> +err: >> + sse_free_stack(args.stack); >> + report_prefix_pop(); >> + >> + return err_ret; >> +} >> + >> +struct sse_foreign_cpu_test_arg { >> + bool done; >> + unsigned int expected_cpu; >> + uint32_t event_id; >> +}; >> + >> +static void sse_foreign_cpu_handler(void *data, struct pt_regs *regs, unsigned int hartid) >> +{ >> + struct sse_foreign_cpu_test_arg *arg = data; >> + unsigned int expected_cpu; >> + >> + /* For arg content to be visible */ >> + smp_rmb(); >> + expected_cpu = READ_ONCE(arg->expected_cpu); >> + report(expected_cpu == current_thread_info()->cpu, >> + "Received event on CPU (%d), expected CPU (%d)", current_thread_info()->cpu, >> + expected_cpu); >> + >> + WRITE_ONCE(arg->done, true); >> + /* For arg update to be visible for other CPUs */ >> + smp_wmb(); >> +} >> + >> +struct sse_local_per_cpu { >> + struct sbi_sse_handler_arg args; >> + struct sbiret ret; >> + struct sse_foreign_cpu_test_arg handler_arg; >> +}; >> + >> +static void sse_register_enable_local(void *data) >> +{ >> + struct sbiret ret; >> + struct sse_local_per_cpu *cpu_args = data; >> + struct sse_local_per_cpu *cpu_arg = &cpu_args[current_thread_info()->cpu]; >> + uint32_t event_id = cpu_arg->handler_arg.event_id; >> + >> + ret = sbi_sse_register(event_id, &cpu_arg->args); >> + 
WRITE_ONCE(cpu_arg->ret, ret); >> + if (ret.error) >> + return; >> + >> + ret = sbi_sse_enable(event_id); >> + WRITE_ONCE(cpu_arg->ret, ret); >> +} >> + >> +static void sbi_sse_disable_unregister_local(void *data) >> +{ >> + struct sbiret ret; >> + struct sse_local_per_cpu *cpu_args = data; >> + struct sse_local_per_cpu *cpu_arg = &cpu_args[current_thread_info()->cpu]; >> + uint32_t event_id = cpu_arg->handler_arg.event_id; >> + >> + ret = sbi_sse_disable(event_id); >> + WRITE_ONCE(cpu_arg->ret, ret); >> + if (ret.error) >> + return; >> + >> + ret = sbi_sse_unregister(event_id); >> + WRITE_ONCE(cpu_arg->ret, ret); >> +} >> + >> +static int sse_test_inject_local(uint32_t event_id) >> +{ >> + int cpu; >> + int err_ret = 1; >> + struct sbiret ret; >> + struct sse_local_per_cpu *cpu_args, *cpu_arg; >> + struct sse_foreign_cpu_test_arg *handler_arg; >> + >> + cpu_args = calloc(NR_CPUS, sizeof(struct sbi_sse_handler_arg)); >> + >> + report_prefix_push("local_dispatch"); >> + for_each_online_cpu(cpu) { >> + cpu_arg = &cpu_args[cpu]; >> + cpu_arg->handler_arg.event_id = event_id; >> + cpu_arg->args.stack = sse_alloc_stack(); >> + cpu_arg->args.handler = sse_foreign_cpu_handler; >> + cpu_arg->args.handler_data = (void *)&cpu_arg->handler_arg; >> + } >> + >> + on_cpus(sse_register_enable_local, cpu_args); >> + for_each_online_cpu(cpu) { >> + cpu_arg = &cpu_args[cpu]; >> + ret = cpu_arg->ret; >> + if (ret.error) { >> + report_fail("CPU failed to register/enable event: %ld", ret.error); >> + goto err; >> + } >> + >> + handler_arg = &cpu_arg->handler_arg; >> + WRITE_ONCE(handler_arg->expected_cpu, cpu); >> + /* For handler_arg content to be visible for other CPUs */ >> + smp_wmb(); >> + ret = sbi_sse_inject(event_id, cpus[cpu].hartid); >> + if (ret.error) { >> + report_fail("CPU failed to inject event: %ld", ret.error); >> + goto err; >> + } >> + } >> + >> + for_each_online_cpu(cpu) { >> + handler_arg = &cpu_args[cpu].handler_arg; >> + smp_rmb(); >> + while 
(!READ_ONCE(handler_arg->done)) { >> + /* For handler_arg update to be visible */ >> + smp_rmb(); >> + cpu_relax(); >> + } >> + WRITE_ONCE(handler_arg->done, false); >> + } >> + >> + on_cpus(sbi_sse_disable_unregister_local, cpu_args); >> + for_each_online_cpu(cpu) { >> + cpu_arg = &cpu_args[cpu]; >> + ret = READ_ONCE(cpu_arg->ret); >> + if (ret.error) { >> + report_fail("CPU failed to disable/unregister event: %ld", ret.error); >> + goto err; >> + } >> + } >> + >> + err_ret = 0; >> +err: >> + for_each_online_cpu(cpu) { >> + cpu_arg = &cpu_args[cpu]; >> + sse_free_stack(cpu_arg->args.stack); >> + } >> + >> + report_pass("local event dispatch on all CPUs"); >> + report_prefix_pop(); >> + >> + return err_ret; >> +} >> + >> +static int sse_test_inject_global(uint32_t event_id) >> +{ >> + unsigned long value; >> + struct sbiret ret; >> + unsigned int cpu; >> + uint64_t timeout; >> + int err_ret = 1; >> + struct sse_foreign_cpu_test_arg test_arg = {.event_id = event_id}; >> + struct sbi_sse_handler_arg args = { >> + .handler = sse_foreign_cpu_handler, >> + .handler_data = (void *)&test_arg, >> + .stack = sse_alloc_stack(), >> + }; >> + enum sbi_sse_state state; >> + >> + report_prefix_push("global_dispatch"); >> + >> + ret = sbi_sse_register(event_id, &args); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Register event")) >> + goto err; >> + >> + for_each_online_cpu(cpu) { >> + WRITE_ONCE(test_arg.expected_cpu, cpu); >> + /* For test_arg content to be visible for other CPUs */ >> + smp_wmb(); >> + value = cpu; >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Set preferred hart")) >> + goto err; >> + >> + ret = sbi_sse_enable(event_id); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Enable event")) >> + goto err; >> + >> + ret = sbi_sse_inject(event_id, cpu); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Inject event")) >> + goto err; >> + >> + smp_rmb(); >> + while 
(!READ_ONCE(test_arg.done)) { >> + /* For shared test_arg structure */ >> + smp_rmb(); >> + cpu_relax(); >> + } >> + >> + WRITE_ONCE(test_arg.done, false); >> + >> + timeout = timer_get_cycles() + usec_to_cycles((uint64_t)1000); > > Should probably add an environment variable for this timeout and then use > 1000 as the default when the variable isn't set. Ack. > >> + /* Wait for event to be back in ENABLED state */ >> + do { >> + ret = sse_event_get_state(event_id, &state); >> + if (ret.error) >> + goto err; >> + cpu_relax(); >> + } while (state != SBI_SSE_STATE_ENABLED || timer_get_cycles() < timeout); > > ^ && >> + >> + if (!report(state == SBI_SSE_STATE_ENABLED, >> + "wait for event to be in enable state")) > > nit: no need to wrap this line, but if we do, then "wait should be under > state > >> + goto err; >> + >> + ret = sbi_sse_disable(event_id); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Disable event")) >> + goto err; >> + >> + report_pass("Global event on CPU %d", cpu); >> + } >> + >> + ret = sbi_sse_unregister(event_id); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Unregister event"); >> + >> + err_ret = 0; >> + >> +err: >> + sse_free_stack(args.stack); >> + report_prefix_pop(); >> + >> + return err_ret; >> +} >> + >> +struct priority_test_arg { >> + uint32_t event_id; >> + bool called; >> + u32 prio; >> + struct priority_test_arg *next_event_arg; >> + void (*check_func)(struct priority_test_arg *arg); >> +}; >> + >> +static void sse_hi_priority_test_handler(void *arg, struct pt_regs *regs, >> + unsigned int hartid) >> +{ >> + struct priority_test_arg *targ = arg; >> + struct priority_test_arg *next = targ->next_event_arg; >> + >> + targ->called = true; >> + if (next) { >> + sbi_sse_inject(next->event_id, current_thread_info()->hartid); >> + >> + report(!sse_event_pending(next->event_id), "Higher priority event is not pending"); >> + report(next->called, "Higher priority event was handled"); >> + } >> +} >> + >> +static void 
sse_low_priority_test_handler(void *arg, struct pt_regs *regs, >> + unsigned int hartid) >> +{ >> + struct priority_test_arg *targ = arg; >> + struct priority_test_arg *next = targ->next_event_arg; >> + >> + targ->called = true; >> + >> + if (next) { >> + sbi_sse_inject(next->event_id, current_thread_info()->hartid); >> + >> + report(sse_event_pending(next->event_id), "Lower priority event is pending"); >> + report(!next->called, "Lower priority event %s was not handle before %s", > > handled > >> + sse_event_name(next->event_id), sse_event_name(targ->event_id)); >> + } >> +} >> + >> +static void sse_test_injection_priority_arg(struct priority_test_arg *in_args, >> + unsigned int in_args_size, >> + sbi_sse_handler_fn handler, >> + const char *test_name) >> +{ >> + unsigned int i; >> + unsigned long value, uret; >> + struct sbiret ret; >> + uint32_t event_id; >> + struct priority_test_arg *arg; >> + unsigned int args_size = 0; >> + struct sbi_sse_handler_arg event_args[in_args_size]; >> + struct priority_test_arg *args[in_args_size]; >> + void *stack; >> + struct sbi_sse_handler_arg *event_arg; >> + >> + report_prefix_push(test_name); >> + >> + for (i = 0; i < in_args_size; i++) { >> + arg = &in_args[i]; >> + event_id = arg->event_id; >> + if (!sse_event_can_inject(event_id)) >> + continue; >> + >> + args[args_size] = arg; >> + args_size++; >> + event_args->stack = 0; >> + } >> + >> + if (!args_size) { >> + report_skip("No injectable events"); >> + goto skip; >> + } >> + >> + for (i = 0; i < args_size; i++) { >> + arg = args[i]; >> + event_id = arg->event_id; >> + stack = sse_alloc_stack(); >> + >> + event_arg = &event_args[i]; >> + event_arg->handler = handler; >> + event_arg->handler_data = (void *)arg; >> + event_arg->stack = stack; >> + >> + if (i < (args_size - 1)) >> + arg->next_event_arg = args[i + 1]; >> + else >> + arg->next_event_arg = NULL; >> + >> + /* Be sure global events are targeting the current hart */ >> + if (sbi_sse_event_is_global(event_id)) { 
>> + uret = sse_global_event_set_current_hart(event_id); >> + if (uret) >> + goto err; > > If we goto err from here or the next goto, then we'll also get disable and > unregister failures reported. > >> + } >> + >> + ret = sbi_sse_register(event_id, event_arg); >> + if (ret.error) { >> + sbiret_report_error(&ret, SBI_SUCCESS, "register event 0x%x", event_id); >> + goto err; >> + } >> + >> + value = arg->prio; >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); >> + if (ret.error) { >> + sbiret_report_error(&ret, SBI_SUCCESS, "set event 0x%x priority", event_id); >> + goto err; > > from here we'll get disable failures reported. > >> + } >> + ret = sbi_sse_enable(event_id); >> + if (ret.error) { >> + sbiret_report_error(&ret, SBI_SUCCESS, "enable event 0x%x", event_id); >> + goto err; >> + } >> + } > > It looks like we need a complete > > if (!a) > goto a_out; > if (!b) > goto b_out; > > /* ret = success */ > > b_out: > /* undo b */ > a_out: > /* undo a */ > > return ret; > > type of model. Or just manage it directly. > > if (!a) > ... > if (!b) { > /* undo a */ > ... 
> } > >> + >> + /* Inject first event */ >> + ret = sbi_sse_inject(args[0]->event_id, current_thread_info()->hartid); >> + sbiret_report_error(&ret, SBI_SUCCESS, "injection"); >> + >> +err: >> + for (i = 0; i < args_size; i++) { >> + arg = args[i]; >> + event_id = arg->event_id; >> + >> + report(arg->called, "Event %s handler called", sse_event_name(arg->event_id)); >> + >> + ret = sbi_sse_disable(event_id); >> + if (ret.error) >> + sbiret_report_error(&ret, SBI_SUCCESS, "disable event 0x%x", event_id); >> + >> + sbi_sse_unregister(event_id); >> + if (ret.error) >> + sbiret_report_error(&ret, SBI_SUCCESS, "unregister event 0x%x", event_id); >> + >> + event_arg = &event_args[i]; >> + if (event_arg->stack) >> + sse_free_stack(event_arg->stack); >> + } >> + >> +skip: >> + report_prefix_pop(); >> +} >> + >> +static struct priority_test_arg hi_prio_args[] = { >> + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE}, >> + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW}, >> + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS}, >> +}; >> + >> +static struct priority_test_arg low_prio_args[] = { >> + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP}, >> + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS}, >> + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE}, >> + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE}, >> +}; >> + >> +static struct priority_test_arg prio_args[] = { >> + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, .prio = 5}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, .prio = 10}, >> + {.event_id = 
SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, .prio = 12}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, .prio = 15}, >> + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, .prio = 20}, >> + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, .prio = 22}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, .prio = 25}, >> +}; >> + >> +static struct priority_test_arg same_prio_args[] = { >> + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, .prio = 0}, >> + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, .prio = 0}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, .prio = 10}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, .prio = 10}, >> + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, .prio = 10}, >> + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, .prio = 20}, >> + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, .prio = 20}, >> +}; >> + >> +static void sse_test_injection_priority(void) >> +{ >> + report_prefix_push("prio"); >> + >> + sse_test_injection_priority_arg(hi_prio_args, ARRAY_SIZE(hi_prio_args), >> + sse_hi_priority_test_handler, "high"); >> + >> + sse_test_injection_priority_arg(low_prio_args, ARRAY_SIZE(low_prio_args), >> + sse_low_priority_test_handler, "low"); >> + >> + sse_test_injection_priority_arg(prio_args, ARRAY_SIZE(prio_args), >> + sse_low_priority_test_handler, "changed"); >> + >> + sse_test_injection_priority_arg(same_prio_args, ARRAY_SIZE(same_prio_args), >> + sse_low_priority_test_handler, "same_prio_args"); >> + >> + report_prefix_pop(); >> +} >> + >> +static void test_invalid_event_id(unsigned long event_id) >> +{ >> + struct sbiret ret; >> + unsigned long value = 0; >> + >> + ret = sbi_sse_register_raw(event_id, (unsigned long) sbi_sse_entry, 0); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, >> + "register event_id 0x%lx", event_id); >> + >> + ret = sbi_sse_unregister(event_id); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, >> + "unregister event_id 0x%lx", event_id); >> + >> + ret = sbi_sse_enable(event_id); >> + 
sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, >> + "enable event_id 0x%lx", event_id); >> + >> + ret = sbi_sse_disable(event_id); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, >> + "disable event_id 0x%lx", event_id); >> + >> + ret = sbi_sse_inject(event_id, 0); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, >> + "inject event_id 0x%lx", event_id); >> + >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, >> + "write attr event_id 0x%lx", event_id); >> + >> + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, >> + "read attr event_id 0x%lx", event_id); >> +} >> + >> +static void sse_test_invalid_event_id(void) >> +{ >> + >> + report_prefix_push("event_id"); >> + >> + test_invalid_event_id(SBI_SSE_EVENT_LOCAL_RESERVED_0_START); >> + >> + report_prefix_pop(); >> +} >> + >> +static void sse_check_event_availability(uint32_t event_id, bool *can_inject, bool *supported) >> +{ >> + unsigned long status; >> + struct sbiret ret; >> + >> + *can_inject = false; >> + *supported = false; >> + >> + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, 1, &status); >> + if (ret.error != SBI_SUCCESS && ret.error != SBI_ERR_NOT_SUPPORTED) { >> + report_fail("Get event status != SBI_SUCCESS && != SBI_ERR_NOT_SUPPORTED: %ld", ret.error); >> + return; >> + } >> + if (ret.error == SBI_ERR_NOT_SUPPORTED) >> + return; >> + >> + if (!ret.error) >> + *supported = true; >> + >> + *can_inject = (status >> SBI_SSE_ATTR_STATUS_INJECT_OFFSET) & 1; > > this should also be under the 'if (!ret.error)' the 'if (!ret.error)' is actually useless since at that point it is == 0 (ie based on the two 'if's above. 
> >> +} >> + >> +static void sse_secondary_boot_and_unmask(void *data) >> +{ >> + sbi_sse_hart_unmask(); >> +} >> + >> +static void sse_check_mask(void) >> +{ >> + struct sbiret ret; >> + >> + /* Upon boot, event are masked, check that */ >> + ret = sbi_sse_hart_mask(); >> + sbiret_report_error(&ret, SBI_ERR_ALREADY_STOPPED, "hart mask at boot time"); >> + >> + ret = sbi_sse_hart_unmask(); >> + sbiret_report_error(&ret, SBI_SUCCESS, "hart unmask"); >> + ret = sbi_sse_hart_unmask(); >> + sbiret_report_error(&ret, SBI_ERR_ALREADY_STARTED, "hart unmask twice error"); >> + >> + ret = sbi_sse_hart_mask(); >> + sbiret_report_error(&ret, SBI_SUCCESS, "hart mask"); >> + ret = sbi_sse_hart_mask(); >> + sbiret_report_error(&ret, SBI_ERR_ALREADY_STOPPED, "hart mask twice"); >> +} >> + >> +static int run_inject_test(struct sse_event_info *info) >> +{ >> + unsigned long event_id = info->event_id; >> + >> + if (!info->can_inject) { >> + report_skip("Event does not support injection, skipping injection tests"); >> + return 0; >> + } >> + >> + if (sse_test_inject_simple(event_id)) >> + return 1; >> + >> + if (sbi_sse_event_is_global(event_id)) >> + return sse_test_inject_global(event_id); >> + else >> + return sse_test_inject_local(event_id); >> +} >> + >> +void check_sse(void) >> +{ >> + struct sse_event_info *info; >> + unsigned long i, event_id; >> + bool supported; >> + >> + report_prefix_push("sse"); >> + >> + if (!sbi_probe(SBI_EXT_SSE)) { >> + report_skip("extension not available"); >> + report_prefix_pop(); >> + return; >> + } >> + >> + sse_check_mask(); >> + >> + /* >> + * Dummy wakeup of all processors since some of them will be targeted >> + * by global events without going through the wakeup call as well as >> + * unmasking SSE events on all harts >> + */ >> + on_cpus(sse_secondary_boot_and_unmask, NULL); >> + >> + sse_test_invalid_event_id(); >> + >> + for (i = 0; i < ARRAY_SIZE(sse_event_infos); i++) { >> + info = &sse_event_infos[i]; >> + event_id = info->event_id; 
>> + report_prefix_push(info->name); >> + sse_check_event_availability(event_id, &info->can_inject, &supported); >> + if (!supported) { >> + report_skip("Event is not supported, skipping tests"); >> + report_prefix_pop(); >> + continue; >> + } >> + >> + sse_test_attrs(event_id); >> + sse_test_register_error(event_id); >> + >> + if (run_inject_test(info)) { >> + report_skip("Event test failed, event state unreliable"); >> + info->can_inject = false; >> + } >> + >> + report_prefix_pop(); >> + } >> + >> + sse_test_injection_priority(); >> + >> + report_prefix_pop(); >> +} >> diff --git a/riscv/sbi.c b/riscv/sbi.c >> index 0404bb81..478cb35d 100644 >> --- a/riscv/sbi.c >> +++ b/riscv/sbi.c >> @@ -32,6 +32,7 @@ >> >> #define HIGH_ADDR_BOUNDARY ((phys_addr_t)1 << 32) >> >> +void check_sse(void); >> void check_fwft(void); >> >> static long __labs(long a) >> @@ -1567,6 +1568,7 @@ int main(int argc, char **argv) >> check_hsm(); >> check_dbcn(); >> check_susp(); >> + check_sse(); >> check_fwft(); >> >> return report_summary(); >> -- >> 2.47.2 >> > > Thanks, > drew From ajones at ventanamicro.com Fri Mar 14 03:09:37 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Fri, 14 Mar 2025 11:09:37 +0100 Subject: [kvm-unit-tests PATCH v8 6/6] riscv: sbi: Add SSE extension tests In-Reply-To: References: <20250307161549.1873770-1-cleger@rivosinc.com> <20250307161549.1873770-7-cleger@rivosinc.com> <20250312-fa9b1889ef5f422be2e20cf4@orel> Message-ID: <20250314-427187c2f57975472e8082e0@orel> On Fri, Mar 14, 2025 at 10:29:04AM +0100, Cl?ment L?ger wrote: > > > On 12/03/2025 18:51, Andrew Jones wrote: ... > >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", > >> + flags); > >> + > > > > no need to wrap > > That's actually > 100 columns if unwrap. True, even more than just 1 or 2 chars, so we can leave it wrapped. 
But, I'd still merge it sticking it out too :-)

Thanks,
drew

From cleger at rivosinc.com Fri Mar 14 04:10:23 2025
From: cleger at rivosinc.com (Clément Léger)
Date: Fri, 14 Mar 2025 12:10:23 +0100
Subject: [kvm-unit-tests PATCH v9 0/6] riscv: add SBI SSE extension tests
Message-ID: <20250314111030.3728671-1-cleger@rivosinc.com>

This series adds tests for the SBI SSE extension as well as the
infrastructure needed for SSE support. It also adds test-specific
asm-offsets generation to use custom OFFSET and DEFINE from the test
directory.

---

V9:
 - Use __ASSEMBLER__ instead of __ASSEMBLY__
 - Remove extra spaces
 - Use assert to check global event in
   sse_global_event_set_current_hart()
 - Tabulate SSE events names table
 - Use sbi_sse_register() instead of sbi_sse_register_raw() in error
   testing
 - Move a report_pass() out of error path
 - Rework all injection tests with better error handling
 - Use an env var for sse event completion timeout
 - Add timeout for some potentially infinite while() loops

V8:
 - Short circuit current event tests if failure happens
 - Remove SSE from all report strings
 - Indent .prio field
 - Add cpu_relax()/smp_rmb() where needed
 - Add timeout for global event ENABLED state check
 - Added BIT(32) aliases tests for attribute/event_id.

V7:
 - Test ids/attributes/attributes count > 32 bits
 - Rename all SSE functions to sbi_sse_*
 - Use event_id instead of event/evt
 - Factorize read/write test
 - Use virt_to_phys() for attributes read/write.
 - Extensively use sbiret_report_error()
 - Change check function return values to bool.
 - Added assert for stack size to be below or equal to PAGE_SIZE
 - Use an env variable for the maximum hart ID
 - Check that individual read from attributes matches the multiple
   attributes read.
 - Added multiple attributes write at once
 - Used READ_ONCE/WRITE_ONCE
 - Inject all local events at once rather than looping for each core.
 - Split test_arg for local_dispatch test so that all CPUs can run at
   once.
 - Move SSE entry and generic code to lib/riscv for other tests
 - Fix unmask/mask state checking

V6:
 - Add missing $(generated-file) dependencies for "-deps" objects
 - Split SSE entry from sbi-asm.S to sse-asm.S and all SSE core functions
   since it will be useful for other tests as well (dbltrp).

V5:
 - Update event ranges based on latest spec
 - Rename asm-offset-test.c to sbi-asm-offset.c

V4:
 - Fix typo sbi_ext_ss_fid -> sbi_ext_sse_fid
 - Add proper asm-offset generation for tests
 - Move SSE specific file from lib/riscv to riscv/

V3:
 - Add -deps variable for test specific dependencies
 - Fix formatting errors/typo in sbi.h
 - Add missing double trap event
 - Alphabetize sbi-sse.c includes
 - Fix a6 content after unmasking event
 - Add SSE HART_MASK/UNMASK test
 - Use mv instead of move
 - move sbi_check_sse() definition in sbi.c
 - Remove sbi_sse test from unittests.cfg

V2:
 - Rebased on origin/master and integrate it into sbi.c tests

Clément Léger (6):
  kbuild: Allow multiple asm-offsets file to be generated
  riscv: Set .aux.o files as .PRECIOUS
  riscv: Use asm-offsets to generate SBI_EXT_HSM values
  riscv: lib: Add SBI SSE extension definitions
  lib: riscv: Add SBI SSE support
  riscv: sbi: Add SSE extension tests

 scripts/asm-offsets.mak |   22 +-
 riscv/Makefile          |    5 +-
 lib/riscv/asm/csr.h     |    1 +
 lib/riscv/asm/sbi.h     |  142 ++++-
 lib/riscv/sbi-sse-asm.S |  102 ++++
 lib/riscv/asm-offsets.c |    9 +
 lib/riscv/sbi.c         |   76 +++
 riscv/sbi-tests.h       |    1 +
 riscv/sbi-asm.S         |    6 +-
 riscv/sbi-asm-offsets.c |   11 +
 riscv/sbi-sse.c         | 1263 +++++++++++++++++++++++++++++++++++++++
 riscv/sbi.c             |    2 +
 riscv/.gitignore        |    1 +
 13 files changed, 1630 insertions(+), 11 deletions(-)
 create mode 100644 lib/riscv/sbi-sse-asm.S
 create mode 100644 riscv/sbi-asm-offsets.c
 create mode 100644 riscv/sbi-sse.c
 create mode 100644 riscv/.gitignore

-- 
2.47.2

From cleger at rivosinc.com Fri Mar 14 04:10:24 2025
From: cleger at rivosinc.com (Clément Léger)
Date: Fri, 14 Mar 2025 12:10:24 +0100
Subject:
 [kvm-unit-tests PATCH v9 1/6] kbuild: Allow multiple asm-offsets file to be generated
In-Reply-To: <20250314111030.3728671-1-cleger@rivosinc.com>
References: <20250314111030.3728671-1-cleger@rivosinc.com>
Message-ID: <20250314111030.3728671-2-cleger@rivosinc.com>

In order to allow multiple asm-offsets files to be generated, the include
guards need to differ between these files. Add an asm_offset_name
makefile macro to obtain an uppercase name matching the original
asm-offsets file.

Signed-off-by: Clément Léger
Reviewed-by: Andrew Jones
---
 scripts/asm-offsets.mak | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/scripts/asm-offsets.mak b/scripts/asm-offsets.mak
index 7b64162d..a5fdbf5d 100644
--- a/scripts/asm-offsets.mak
+++ b/scripts/asm-offsets.mak
@@ -15,10 +15,14 @@ define sed-y
 	s:->::; p;}'
 endef
 
+define asm_offset_name
+	$(shell echo $(notdir $(1)) | tr [:lower:]- [:upper:]_)
+endef
+
 define make_asm_offsets
 	(set -e; \
-	 echo "#ifndef __ASM_OFFSETS_H__"; \
-	 echo "#define __ASM_OFFSETS_H__"; \
+	 echo "#ifndef __$(strip $(asm_offset_name))_H__"; \
+	 echo "#define __$(strip $(asm_offset_name))_H__"; \
 	 echo "/*"; \
 	 echo " * Generated file.
 DO NOT MODIFY."; \
 	 echo " *"; \
@@ -29,12 +33,16 @@ define make_asm_offsets
 	 echo "#endif" ) > $@
 endef
 
-$(asm-offsets:.h=.s): $(asm-offsets:.h=.c)
-	$(CC) $(CFLAGS) -fverbose-asm -S -o $@ $<
+define gen_asm_offsets_rules
+$(1).s: $(1).c
+	$(CC) $(CFLAGS) -fverbose-asm -S -o $$@ $$<
+
+$(1).h: $(1).s
+	$$(call make_asm_offsets,$(1))
+	cp -f $$@ lib/generated/
+endef
 
-$(asm-offsets): $(asm-offsets:.h=.s)
-	$(call make_asm_offsets)
-	cp -f $(asm-offsets) lib/generated/
+$(foreach o,$(asm-offsets),$(eval $(call gen_asm_offsets_rules, $(o:.h=))))
 
 OBJDIRS += lib/generated
 
-- 
2.47.2

From cleger at rivosinc.com Fri Mar 14 04:10:25 2025
From: cleger at rivosinc.com (Clément Léger)
Date: Fri, 14 Mar 2025 12:10:25 +0100
Subject: [kvm-unit-tests PATCH v9 2/6] riscv: Set .aux.o files as .PRECIOUS
In-Reply-To: <20250314111030.3728671-1-cleger@rivosinc.com>
References: <20250314111030.3728671-1-cleger@rivosinc.com>
Message-ID: <20250314111030.3728671-3-cleger@rivosinc.com>

When compiling, we need to keep the .aux.o files; otherwise they are
removed after compilation, which causes dependent files to be
recompiled. Set these files as .PRECIOUS to keep them.

Signed-off-by: Clément Léger
Reviewed-by: Andrew Jones
---
 riscv/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/riscv/Makefile b/riscv/Makefile
index 52718f3f..ae9cf02a 100644
--- a/riscv/Makefile
+++ b/riscv/Makefile
@@ -90,6 +90,7 @@ CFLAGS += -I $(SRCDIR)/lib -I $(SRCDIR)/lib/libfdt -I lib -I $(SRCDIR)/riscv
 asm-offsets = lib/riscv/asm-offsets.h
 include $(SRCDIR)/scripts/asm-offsets.mak
 
+.PRECIOUS: %.aux.o
 %.aux.o: $(SRCDIR)/lib/auxinfo.c
 	$(CC) $(CFLAGS) -c -o $@ $< \
 		-DPROGNAME=\"$(notdir $(@:.aux.o=.$(exe)))\" -DAUXFLAGS=$(AUXFLAGS)
-- 
2.47.2

From cleger at rivosinc.com Fri Mar 14 04:10:26 2025
From: cleger at rivosinc.com (Clément Léger)
Date: Fri, 14 Mar 2025 12:10:26 +0100
Subject: [kvm-unit-tests PATCH v9 3/6] riscv: Use asm-offsets to generate SBI_EXT_HSM values
In-Reply-To: <20250314111030.3728671-1-cleger@rivosinc.com>
References: <20250314111030.3728671-1-cleger@rivosinc.com>
Message-ID: <20250314111030.3728671-4-cleger@rivosinc.com>

Replace hardcoded values with generated ones using sbi-asm-offsets.
This allows ASM_SBI_EXT_HSM and ASM_SBI_EXT_HSM_HART_STOP to be used
directly in assembly.
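
[Editorial illustration, not part of the patch.] The idea behind the
generated constants can be sketched in plain C: take a value known to the
C side and emit a "#define" line that assembly can include. This is only
an analogy. The real mechanism compiles the asm-offsets .c file to
assembly and post-processes it with sed, and the exact text of the
generated header is not shown in this thread. emit_define() is a
hypothetical helper, and the two values are the hardcoded ones that the
patch removes from sbi-asm.S.

```c
#include <stdio.h>
#include <stddef.h>

/* The hardcoded values that the patch replaces in sbi-asm.S. */
enum {
	SBI_EXT_HSM = 0x48534d,
	SBI_EXT_HSM_HART_STOP = 1,
};

/*
 * Hypothetical helper: format one "#define NAME VALUE" line into buf.
 * Returns what snprintf() returns, i.e. the length the full line needs
 * (excluding the terminating NUL).
 */
static int emit_define(char *buf, size_t len, const char *name, long value)
{
	return snprintf(buf, len, "#define %s %ld", name, value);
}
```

Calling emit_define() once per constant and writing the lines to a header
gives assembly sources symbolic names instead of magic numbers, which is
the effect this patch is after.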

Signed-off-by: Clément Léger
Reviewed-by: Andrew Jones
---
 riscv/Makefile          |  2 +-
 riscv/sbi-asm.S         |  6 ++++--
 riscv/sbi-asm-offsets.c | 11 +++++++++++
 riscv/.gitignore        |  1 +
 4 files changed, 17 insertions(+), 3 deletions(-)
 create mode 100644 riscv/sbi-asm-offsets.c
 create mode 100644 riscv/.gitignore

diff --git a/riscv/Makefile b/riscv/Makefile
index ae9cf02a..02d2ac39 100644
--- a/riscv/Makefile
+++ b/riscv/Makefile
@@ -87,7 +87,7 @@ CFLAGS += -ffreestanding
 CFLAGS += -O2
 CFLAGS += -I $(SRCDIR)/lib -I $(SRCDIR)/lib/libfdt -I lib -I $(SRCDIR)/riscv
 
-asm-offsets = lib/riscv/asm-offsets.h
+asm-offsets = lib/riscv/asm-offsets.h riscv/sbi-asm-offsets.h
 include $(SRCDIR)/scripts/asm-offsets.mak
 
 .PRECIOUS: %.aux.o
diff --git a/riscv/sbi-asm.S b/riscv/sbi-asm.S
index f4185496..51f46efd 100644
--- a/riscv/sbi-asm.S
+++ b/riscv/sbi-asm.S
@@ -6,6 +6,8 @@
  */
 #include
 #include
+#include
+#include
 
 #include "sbi-tests.h"
 
@@ -57,8 +59,8 @@ sbi_hsm_check:
 7:	lb	t0, 0(t1)
 	pause
 	beqz	t0, 7b
-	li	a7, 0x48534d	/* SBI_EXT_HSM */
-	li	a6, 1		/* SBI_EXT_HSM_HART_STOP */
+	li	a7, ASM_SBI_EXT_HSM
+	li	a6, ASM_SBI_EXT_HSM_HART_STOP
 	ecall
 8:	pause
 	j	8b
diff --git a/riscv/sbi-asm-offsets.c b/riscv/sbi-asm-offsets.c
new file mode 100644
index 00000000..bd37b6a2
--- /dev/null
+++ b/riscv/sbi-asm-offsets.c
@@ -0,0 +1,11 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include
+#include
+
+int main(void)
+{
+	DEFINE(ASM_SBI_EXT_HSM, SBI_EXT_HSM);
+	DEFINE(ASM_SBI_EXT_HSM_HART_STOP, SBI_EXT_HSM_HART_STOP);
+
+	return 0;
+}
diff --git a/riscv/.gitignore b/riscv/.gitignore
new file mode 100644
index 00000000..0a8c5a36
--- /dev/null
+++ b/riscv/.gitignore
@@ -0,0 +1 @@
+/*-asm-offsets.[hs]
-- 
2.47.2

From cleger at rivosinc.com Fri Mar 14 04:10:27 2025
From: cleger at rivosinc.com (Clément Léger)
Date: Fri, 14 Mar 2025 12:10:27 +0100
Subject: [kvm-unit-tests PATCH v9 4/6] riscv: lib: Add SBI SSE extension definitions
In-Reply-To: <20250314111030.3728671-1-cleger@rivosinc.com>
References: <20250314111030.3728671-1-cleger@rivosinc.com> Message-ID: <20250314111030.3728671-5-cleger@rivosinc.com> Add SBI SSE extension definitions in sbi.h Signed-off-by: Cl?ment L?ger Reviewed-by: Andrew Jones --- lib/riscv/asm/sbi.h | 106 +++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 105 insertions(+), 1 deletion(-) diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h index 2f4d91ef..780c9edd 100644 --- a/lib/riscv/asm/sbi.h +++ b/lib/riscv/asm/sbi.h @@ -30,6 +30,7 @@ enum sbi_ext_id { SBI_EXT_DBCN = 0x4442434E, SBI_EXT_SUSP = 0x53555350, SBI_EXT_FWFT = 0x46574654, + SBI_EXT_SSE = 0x535345, }; enum sbi_ext_base_fid { @@ -78,7 +79,6 @@ enum sbi_ext_dbcn_fid { SBI_EXT_DBCN_CONSOLE_WRITE_BYTE, }; - enum sbi_ext_fwft_fid { SBI_EXT_FWFT_SET = 0, SBI_EXT_FWFT_GET, @@ -105,6 +105,110 @@ enum sbi_ext_fwft_fid { #define SBI_FWFT_SET_FLAG_LOCK BIT(0) +enum sbi_ext_sse_fid { + SBI_EXT_SSE_READ_ATTRS = 0, + SBI_EXT_SSE_WRITE_ATTRS, + SBI_EXT_SSE_REGISTER, + SBI_EXT_SSE_UNREGISTER, + SBI_EXT_SSE_ENABLE, + SBI_EXT_SSE_DISABLE, + SBI_EXT_SSE_COMPLETE, + SBI_EXT_SSE_INJECT, + SBI_EXT_SSE_HART_UNMASK, + SBI_EXT_SSE_HART_MASK, +}; + +/* SBI SSE Event Attributes. 
*/ +enum sbi_sse_attr_id { + SBI_SSE_ATTR_STATUS = 0x00000000, + SBI_SSE_ATTR_PRIORITY = 0x00000001, + SBI_SSE_ATTR_CONFIG = 0x00000002, + SBI_SSE_ATTR_PREFERRED_HART = 0x00000003, + SBI_SSE_ATTR_ENTRY_PC = 0x00000004, + SBI_SSE_ATTR_ENTRY_ARG = 0x00000005, + SBI_SSE_ATTR_INTERRUPTED_SEPC = 0x00000006, + SBI_SSE_ATTR_INTERRUPTED_FLAGS = 0x00000007, + SBI_SSE_ATTR_INTERRUPTED_A6 = 0x00000008, + SBI_SSE_ATTR_INTERRUPTED_A7 = 0x00000009, +}; + +#define SBI_SSE_ATTR_STATUS_STATE_OFFSET 0 +#define SBI_SSE_ATTR_STATUS_STATE_MASK 0x3 +#define SBI_SSE_ATTR_STATUS_PENDING_OFFSET 2 +#define SBI_SSE_ATTR_STATUS_INJECT_OFFSET 3 + +#define SBI_SSE_ATTR_CONFIG_ONESHOT BIT(0) + +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPP BIT(0) +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPIE BIT(1) +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPV BIT(2) +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPVP BIT(3) +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPELP BIT(4) +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT BIT(5) + +enum sbi_sse_state { + SBI_SSE_STATE_UNUSED = 0, + SBI_SSE_STATE_REGISTERED = 1, + SBI_SSE_STATE_ENABLED = 2, + SBI_SSE_STATE_RUNNING = 3, +}; + +/* SBI SSE Event IDs. 
*/ +/* Range 0x00000000 - 0x0000ffff */ +#define SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS 0x00000000 +#define SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP 0x00000001 +#define SBI_SSE_EVENT_LOCAL_RESERVED_0_START 0x00000002 +#define SBI_SSE_EVENT_LOCAL_RESERVED_0_END 0x00003fff +#define SBI_SSE_EVENT_LOCAL_PLAT_0_START 0x00004000 +#define SBI_SSE_EVENT_LOCAL_PLAT_0_END 0x00007fff + +#define SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS 0x00008000 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_0_START 0x00008001 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_0_END 0x0000bfff +#define SBI_SSE_EVENT_GLOBAL_PLAT_0_START 0x0000c000 +#define SBI_SSE_EVENT_GLOBAL_PLAT_0_END 0x0000ffff + +/* Range 0x00010000 - 0x0001ffff */ +#define SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW 0x00010000 +#define SBI_SSE_EVENT_LOCAL_RESERVED_1_START 0x00010001 +#define SBI_SSE_EVENT_LOCAL_RESERVED_1_END 0x00013fff +#define SBI_SSE_EVENT_LOCAL_PLAT_1_START 0x00014000 +#define SBI_SSE_EVENT_LOCAL_PLAT_1_END 0x00017fff + +#define SBI_SSE_EVENT_GLOBAL_RESERVED_1_START 0x00018000 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_1_END 0x0001bfff +#define SBI_SSE_EVENT_GLOBAL_PLAT_1_START 0x0001c000 +#define SBI_SSE_EVENT_GLOBAL_PLAT_1_END 0x0001ffff + +/* Range 0x00100000 - 0x0010ffff */ +#define SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS 0x00100000 +#define SBI_SSE_EVENT_LOCAL_RESERVED_2_START 0x00100001 +#define SBI_SSE_EVENT_LOCAL_RESERVED_2_END 0x00103fff +#define SBI_SSE_EVENT_LOCAL_PLAT_2_START 0x00104000 +#define SBI_SSE_EVENT_LOCAL_PLAT_2_END 0x00107fff + +#define SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS 0x00108000 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_2_START 0x00108001 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_2_END 0x0010bfff +#define SBI_SSE_EVENT_GLOBAL_PLAT_2_START 0x0010c000 +#define SBI_SSE_EVENT_GLOBAL_PLAT_2_END 0x0010ffff + +/* Range 0xffff0000 - 0xffffffff */ +#define SBI_SSE_EVENT_LOCAL_SOFTWARE 0xffff0000 +#define SBI_SSE_EVENT_LOCAL_RESERVED_3_START 0xffff0001 +#define SBI_SSE_EVENT_LOCAL_RESERVED_3_END 0xffff3fff +#define SBI_SSE_EVENT_LOCAL_PLAT_3_START 
0xffff4000 +#define SBI_SSE_EVENT_LOCAL_PLAT_3_END 0xffff7fff + +#define SBI_SSE_EVENT_GLOBAL_SOFTWARE 0xffff8000 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_3_START 0xffff8001 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_3_END 0xffffbfff +#define SBI_SSE_EVENT_GLOBAL_PLAT_3_START 0xffffc000 +#define SBI_SSE_EVENT_GLOBAL_PLAT_3_END 0xffffffff + +#define SBI_SSE_EVENT_PLATFORM_BIT BIT(14) +#define SBI_SSE_EVENT_GLOBAL_BIT BIT(15) + struct sbiret { long error; long value; -- 2.47.2 From cleger at rivosinc.com Fri Mar 14 04:10:28 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Fri, 14 Mar 2025 12:10:28 +0100 Subject: [kvm-unit-tests PATCH v9 5/6] lib: riscv: Add SBI SSE support In-Reply-To: <20250314111030.3728671-1-cleger@rivosinc.com> References: <20250314111030.3728671-1-cleger@rivosinc.com> Message-ID: <20250314111030.3728671-6-cleger@rivosinc.com> Add support for registering and handling SSE events. This will be used by sbi test as well as upcoming double trap tests. 
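
[Editorial illustration, not part of the patch.] To make the
registration model concrete, here is a minimal host-side sketch of the
dispatch step. In the patch, the assembly entry point sbi_sse_entry finds
a struct sbi_sse_handler_arg through a7, loads the handler and its data
pointer from it, and calls the handler with the saved registers and the
value of a6 as the hart ID. The struct fields and the handler typedef
below mirror the ones the patch adds to sbi.h; sse_dispatch(),
struct demo_state and demo_handler() are illustrative names that exist
only in this sketch, and struct pt_regs is left opaque.

```c
#include <stdbool.h>
#include <stddef.h>

struct pt_regs;	/* opaque here; the real layout lives in the library */

typedef void (*sbi_sse_handler_fn)(void *data, struct pt_regs *regs,
				   unsigned int hartid);

/* Same fields, in the same order, as the struct added by the patch. */
struct sbi_sse_handler_arg {
	unsigned long reg_tmp;
	sbi_sse_handler_fn handler;
	void *handler_data;
	void *stack;
};

/* Hypothetical C stand-in for the call made by the assembly entry code. */
static void sse_dispatch(struct sbi_sse_handler_arg *arg, struct pt_regs *regs,
			 unsigned int hartid)
{
	arg->handler(arg->handler_data, regs, hartid);
}

/* Example handler state: records that the handler ran, and on which hart. */
struct demo_state {
	bool called;
	unsigned int hartid;
};

static void demo_handler(void *data, struct pt_regs *regs, unsigned int hartid)
{
	struct demo_state *st = data;

	(void)regs;	/* unused in this sketch */
	st->called = true;
	st->hartid = hartid;
}
```

A test would fill a struct sbi_sse_handler_arg with its handler, data and
stack, pass it to sbi_sse_register(), and later inspect the state the
handler recorded. Here the same flow is exercised by calling
sse_dispatch() directly.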
Signed-off-by: Cl?ment L?ger Reviewed-by: Andrew Jones --- riscv/Makefile | 1 + lib/riscv/asm/csr.h | 1 + lib/riscv/asm/sbi.h | 36 ++++++++++++++ lib/riscv/sbi-sse-asm.S | 102 ++++++++++++++++++++++++++++++++++++++++ lib/riscv/asm-offsets.c | 9 ++++ lib/riscv/sbi.c | 76 ++++++++++++++++++++++++++++++ 6 files changed, 225 insertions(+) create mode 100644 lib/riscv/sbi-sse-asm.S diff --git a/riscv/Makefile b/riscv/Makefile index 02d2ac39..16fc125b 100644 --- a/riscv/Makefile +++ b/riscv/Makefile @@ -43,6 +43,7 @@ cflatobjs += lib/riscv/setup.o cflatobjs += lib/riscv/smp.o cflatobjs += lib/riscv/stack.o cflatobjs += lib/riscv/timer.o +cflatobjs += lib/riscv/sbi-sse-asm.o ifeq ($(ARCH),riscv32) cflatobjs += lib/ldiv32.o endif diff --git a/lib/riscv/asm/csr.h b/lib/riscv/asm/csr.h index c7fc87a9..3e4b5fca 100644 --- a/lib/riscv/asm/csr.h +++ b/lib/riscv/asm/csr.h @@ -17,6 +17,7 @@ #define CSR_TIME 0xc01 #define SR_SIE _AC(0x00000002, UL) +#define SR_SPP _AC(0x00000100, UL) /* Exception cause high bit - is an interrupt if set */ #define CAUSE_IRQ_FLAG (_AC(1, UL) << (__riscv_xlen - 1)) diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h index 780c9edd..e0efcee6 100644 --- a/lib/riscv/asm/sbi.h +++ b/lib/riscv/asm/sbi.h @@ -230,5 +230,41 @@ struct sbiret sbi_send_ipi_broadcast(void); struct sbiret sbi_set_timer(unsigned long stime_value); long sbi_probe(int ext); +typedef void (*sbi_sse_handler_fn)(void *data, struct pt_regs *regs, unsigned int hartid); + +struct sbi_sse_handler_arg { + unsigned long reg_tmp; + sbi_sse_handler_fn handler; + void *handler_data; + void *stack; +}; + +extern void sbi_sse_entry(void); + +static inline bool sbi_sse_event_is_global(uint32_t event_id) +{ + return !!(event_id & SBI_SSE_EVENT_GLOBAL_BIT); +} + +struct sbiret sbi_sse_read_attrs_raw(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long phys_lo, + unsigned long phys_hi); +struct sbiret sbi_sse_read_attrs(unsigned long event_id, unsigned long 
base_attr_id, + unsigned long attr_count, unsigned long *values); +struct sbiret sbi_sse_write_attrs_raw(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long phys_lo, + unsigned long phys_hi); +struct sbiret sbi_sse_write_attrs(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long *values); +struct sbiret sbi_sse_register_raw(unsigned long event_id, unsigned long entry_pc, + unsigned long entry_arg); +struct sbiret sbi_sse_register(unsigned long event_id, struct sbi_sse_handler_arg *arg); +struct sbiret sbi_sse_unregister(unsigned long event_id); +struct sbiret sbi_sse_enable(unsigned long event_id); +struct sbiret sbi_sse_disable(unsigned long event_id); +struct sbiret sbi_sse_hart_mask(void); +struct sbiret sbi_sse_hart_unmask(void); +struct sbiret sbi_sse_inject(unsigned long event_id, unsigned long hart_id); + #endif /* !__ASSEMBLER__ */ #endif /* _ASMRISCV_SBI_H_ */ diff --git a/lib/riscv/sbi-sse-asm.S b/lib/riscv/sbi-sse-asm.S new file mode 100644 index 00000000..b9e951f5 --- /dev/null +++ b/lib/riscv/sbi-sse-asm.S @@ -0,0 +1,102 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * RISC-V SSE events entry point. 
+ * + * Copyright (C) 2025, Rivos Inc., Cl?ment L?ger + */ +#include +#include +#include +#include + +.section .text +.global sbi_sse_entry +sbi_sse_entry: + /* Save stack temporarily */ + REG_S sp, SBI_SSE_REG_TMP(a7) + /* Set entry stack */ + REG_L sp, SBI_SSE_HANDLER_STACK(a7) + + addi sp, sp, -(PT_SIZE) + REG_S ra, PT_RA(sp) + REG_S s0, PT_S0(sp) + REG_S s1, PT_S1(sp) + REG_S s2, PT_S2(sp) + REG_S s3, PT_S3(sp) + REG_S s4, PT_S4(sp) + REG_S s5, PT_S5(sp) + REG_S s6, PT_S6(sp) + REG_S s7, PT_S7(sp) + REG_S s8, PT_S8(sp) + REG_S s9, PT_S9(sp) + REG_S s10, PT_S10(sp) + REG_S s11, PT_S11(sp) + REG_S tp, PT_TP(sp) + REG_S t0, PT_T0(sp) + REG_S t1, PT_T1(sp) + REG_S t2, PT_T2(sp) + REG_S t3, PT_T3(sp) + REG_S t4, PT_T4(sp) + REG_S t5, PT_T5(sp) + REG_S t6, PT_T6(sp) + REG_S gp, PT_GP(sp) + REG_S a0, PT_A0(sp) + REG_S a1, PT_A1(sp) + REG_S a2, PT_A2(sp) + REG_S a3, PT_A3(sp) + REG_S a4, PT_A4(sp) + REG_S a5, PT_A5(sp) + csrr a1, CSR_SEPC + REG_S a1, PT_EPC(sp) + csrr a2, CSR_SSTATUS + REG_S a2, PT_STATUS(sp) + + REG_L a0, SBI_SSE_REG_TMP(a7) + REG_S a0, PT_SP(sp) + + REG_L t0, SBI_SSE_HANDLER(a7) + REG_L a0, SBI_SSE_HANDLER_DATA(a7) + mv a1, sp + mv a2, a6 + jalr t0 + + REG_L a1, PT_EPC(sp) + REG_L a2, PT_STATUS(sp) + csrw CSR_SEPC, a1 + csrw CSR_SSTATUS, a2 + + REG_L ra, PT_RA(sp) + REG_L s0, PT_S0(sp) + REG_L s1, PT_S1(sp) + REG_L s2, PT_S2(sp) + REG_L s3, PT_S3(sp) + REG_L s4, PT_S4(sp) + REG_L s5, PT_S5(sp) + REG_L s6, PT_S6(sp) + REG_L s7, PT_S7(sp) + REG_L s8, PT_S8(sp) + REG_L s9, PT_S9(sp) + REG_L s10, PT_S10(sp) + REG_L s11, PT_S11(sp) + REG_L tp, PT_TP(sp) + REG_L t0, PT_T0(sp) + REG_L t1, PT_T1(sp) + REG_L t2, PT_T2(sp) + REG_L t3, PT_T3(sp) + REG_L t4, PT_T4(sp) + REG_L t5, PT_T5(sp) + REG_L t6, PT_T6(sp) + REG_L gp, PT_GP(sp) + REG_L a0, PT_A0(sp) + REG_L a1, PT_A1(sp) + REG_L a2, PT_A2(sp) + REG_L a3, PT_A3(sp) + REG_L a4, PT_A4(sp) + REG_L a5, PT_A5(sp) + + REG_L sp, PT_SP(sp) + + li a7, ASM_SBI_EXT_SSE + li a6, ASM_SBI_EXT_SSE_COMPLETE + ecall + diff 
--git a/lib/riscv/asm-offsets.c b/lib/riscv/asm-offsets.c index 6c511c14..a96c6e97 100644 --- a/lib/riscv/asm-offsets.c +++ b/lib/riscv/asm-offsets.c @@ -3,6 +3,7 @@ #include #include #include +#include #include int main(void) @@ -63,5 +64,13 @@ int main(void) OFFSET(THREAD_INFO_HARTID, thread_info, hartid); DEFINE(THREAD_INFO_SIZE, sizeof(struct thread_info)); + DEFINE(ASM_SBI_EXT_SSE, SBI_EXT_SSE); + DEFINE(ASM_SBI_EXT_SSE_COMPLETE, SBI_EXT_SSE_COMPLETE); + + OFFSET(SBI_SSE_REG_TMP, sbi_sse_handler_arg, reg_tmp); + OFFSET(SBI_SSE_HANDLER, sbi_sse_handler_arg, handler); + OFFSET(SBI_SSE_HANDLER_DATA, sbi_sse_handler_arg, handler_data); + OFFSET(SBI_SSE_HANDLER_STACK, sbi_sse_handler_arg, stack); + return 0; } diff --git a/lib/riscv/sbi.c b/lib/riscv/sbi.c index 02dd338c..ff33fe7b 100644 --- a/lib/riscv/sbi.c +++ b/lib/riscv/sbi.c @@ -2,6 +2,7 @@ #include #include #include +#include #include #include @@ -31,6 +32,81 @@ struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0, return ret; } +struct sbiret sbi_sse_read_attrs_raw(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long phys_lo, + unsigned long phys_hi) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_READ_ATTRS, event_id, base_attr_id, attr_count, + phys_lo, phys_hi, 0); +} + +struct sbiret sbi_sse_read_attrs(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long *values) +{ + phys_addr_t p = virt_to_phys(values); + + return sbi_sse_read_attrs_raw(event_id, base_attr_id, attr_count, lower_32_bits(p), + upper_32_bits(p)); +} + +struct sbiret sbi_sse_write_attrs_raw(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long phys_lo, + unsigned long phys_hi) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_WRITE_ATTRS, event_id, base_attr_id, attr_count, + phys_lo, phys_hi, 0); +} + +struct sbiret sbi_sse_write_attrs(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, 
unsigned long *values) +{ + phys_addr_t p = virt_to_phys(values); + + return sbi_sse_write_attrs_raw(event_id, base_attr_id, attr_count, lower_32_bits(p), + upper_32_bits(p)); +} + +struct sbiret sbi_sse_register_raw(unsigned long event_id, unsigned long entry_pc, + unsigned long entry_arg) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_REGISTER, event_id, entry_pc, entry_arg, 0, 0, 0); +} + +struct sbiret sbi_sse_register(unsigned long event_id, struct sbi_sse_handler_arg *arg) +{ + return sbi_sse_register_raw(event_id, (unsigned long)sbi_sse_entry, (unsigned long)arg); +} + +struct sbiret sbi_sse_unregister(unsigned long event_id) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_UNREGISTER, event_id, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_sse_enable(unsigned long event_id) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_ENABLE, event_id, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_sse_disable(unsigned long event_id) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_DISABLE, event_id, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_sse_hart_mask(void) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_HART_MASK, 0, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_sse_hart_unmask(void) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_HART_UNMASK, 0, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_sse_inject(unsigned long event_id, unsigned long hart_id) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_INJECT, event_id, hart_id, 0, 0, 0, 0); +} + void sbi_shutdown(void) { sbi_ecall(SBI_EXT_SRST, 0, 0, 0, 0, 0, 0, 0); -- 2.47.2 From cleger at rivosinc.com Fri Mar 14 04:10:29 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Fri, 14 Mar 2025 12:10:29 +0100 Subject: [kvm-unit-tests PATCH v9 6/6] riscv: sbi: Add SSE extension tests In-Reply-To: <20250314111030.3728671-1-cleger@rivosinc.com> References: <20250314111030.3728671-1-cleger@rivosinc.com> Message-ID: <20250314111030.3728671-7-cleger@rivosinc.com> Add SBI SSE extension tests for the following features: - Test attributes 
errors (invalid values, RO, etc) - Registration errors - Simple events (register, enable, inject) - Events with different priorities - Global events dispatch on different harts - Local events on all harts - Hart mask/unmask events Signed-off-by: Clément Léger --- riscv/Makefile | 1 + riscv/sbi-tests.h | 1 + riscv/sbi-sse.c | 1263 +++++++++++++++++++++++++++++++++++++++++++++ riscv/sbi.c | 2 + 4 files changed, 1267 insertions(+) create mode 100644 riscv/sbi-sse.c diff --git a/riscv/Makefile b/riscv/Makefile index 16fc125b..4fe2f1bb 100644 --- a/riscv/Makefile +++ b/riscv/Makefile @@ -18,6 +18,7 @@ tests += $(TEST_DIR)/sieve.$(exe) all: $(tests) $(TEST_DIR)/sbi-deps = $(TEST_DIR)/sbi-asm.o $(TEST_DIR)/sbi-fwft.o +$(TEST_DIR)/sbi-deps += $(TEST_DIR)/sbi-sse.o # When built for EFI sieve needs extra memory, run with e.g. '-m 256' on QEMU $(TEST_DIR)/sieve.$(exe): AUXFLAGS = 0x1 diff --git a/riscv/sbi-tests.h b/riscv/sbi-tests.h index b081464d..a71da809 100644 --- a/riscv/sbi-tests.h +++ b/riscv/sbi-tests.h @@ -71,6 +71,7 @@ sbiret_report(ret, expected_error, expected_value, "check sbi.error and sbi.value") void sbi_bad_fid(int ext); +void check_sse(void); #endif /* __ASSEMBLER__ */ #endif /* _RISCV_SBI_TESTS_H_ */ diff --git a/riscv/sbi-sse.c b/riscv/sbi-sse.c new file mode 100644 index 00000000..2e4fb83d --- /dev/null +++ b/riscv/sbi-sse.c @@ -0,0 +1,1263 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * SBI SSE testsuite + * + * Copyright (C) 2025, Rivos Inc., Clément Léger + */ +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "sbi-tests.h" + +#define SSE_STACK_SIZE PAGE_SIZE + +struct sse_event_info { + uint32_t event_id; + const char *name; + bool can_inject; +}; + +static struct sse_event_info sse_event_infos[] = { + { + .event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, + .name = "local_high_prio_ras", + }, + { + .event_id =
SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP, + .name = "double_trap", + }, + { + .event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, + .name = "global_high_prio_ras", + }, + { + .event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, + .name = "local_pmu_overflow", + }, + { + .event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, + .name = "local_low_prio_ras", + }, + { + .event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, + .name = "global_low_prio_ras", + }, + { + .event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, + .name = "local_software", + }, + { + .event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, + .name = "global_software", + }, +}; + +static const char *const attr_names[] = { + [SBI_SSE_ATTR_STATUS] = "status", + [SBI_SSE_ATTR_PRIORITY] = "priority", + [SBI_SSE_ATTR_CONFIG] = "config", + [SBI_SSE_ATTR_PREFERRED_HART] = "preferred_hart", + [SBI_SSE_ATTR_ENTRY_PC] = "entry_pc", + [SBI_SSE_ATTR_ENTRY_ARG] = "entry_arg", + [SBI_SSE_ATTR_INTERRUPTED_SEPC] = "interrupted_sepc", + [SBI_SSE_ATTR_INTERRUPTED_FLAGS] = "interrupted_flags", + [SBI_SSE_ATTR_INTERRUPTED_A6] = "interrupted_a6", + [SBI_SSE_ATTR_INTERRUPTED_A7] = "interrupted_a7", +}; + +static const unsigned long ro_attrs[] = { + SBI_SSE_ATTR_STATUS, + SBI_SSE_ATTR_ENTRY_PC, + SBI_SSE_ATTR_ENTRY_ARG, +}; + +static const unsigned long interrupted_attrs[] = { + SBI_SSE_ATTR_INTERRUPTED_SEPC, + SBI_SSE_ATTR_INTERRUPTED_FLAGS, + SBI_SSE_ATTR_INTERRUPTED_A6, + SBI_SSE_ATTR_INTERRUPTED_A7, +}; + +static const unsigned long interrupted_flags[] = { + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPP, + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPIE, + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPELP, + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT, + SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPV, + SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPVP, +}; + +static struct sse_event_info *sse_event_get_info(uint32_t event_id) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(sse_event_infos); i++) { + if (sse_event_infos[i].event_id == event_id) + return &sse_event_infos[i]; + } + + assert_msg(false,
"Invalid event id: %d", event_id); +} + +static const char *sse_event_name(uint32_t event_id) +{ + return sse_event_get_info(event_id)->name; +} + +static bool sse_event_can_inject(uint32_t event_id) +{ + return sse_event_get_info(event_id)->can_inject; +} + +static struct sbiret sse_get_event_status_field(uint32_t event_id, unsigned long mask, + unsigned long shift, unsigned long *value) +{ + struct sbiret ret; + unsigned long status; + + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, 1, &status); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "Get event status"); + return ret; + } + + *value = (status & mask) >> shift; + + return ret; +} + +static struct sbiret sse_event_get_state(uint32_t event_id, enum sbi_sse_state *state) +{ + unsigned long status = 0; + struct sbiret ret; + + ret = sse_get_event_status_field(event_id, SBI_SSE_ATTR_STATUS_STATE_MASK, + SBI_SSE_ATTR_STATUS_STATE_OFFSET, &status); + *state = status; + + return ret; +} + +static unsigned long sse_global_event_set_current_hart(uint32_t event_id) +{ + struct sbiret ret; + unsigned long current_hart = current_thread_info()->hartid; + + assert(sbi_sse_event_is_global(event_id)); + + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &current_hart); + if (sbiret_report_error(&ret, SBI_SUCCESS, "Set preferred hart")) + return ret.error; + + return 0; +} + +static bool sse_check_state(uint32_t event_id, unsigned long expected_state) +{ + struct sbiret ret; + enum sbi_sse_state state; + + ret = sse_event_get_state(event_id, &state); + if (ret.error) + return false; + + return report(state == expected_state, "event status == %ld", expected_state); +} + +static bool sse_event_pending(uint32_t event_id) +{ + bool pending = 0; + + sse_get_event_status_field(event_id, BIT(SBI_SSE_ATTR_STATUS_PENDING_OFFSET), + SBI_SSE_ATTR_STATUS_PENDING_OFFSET, (unsigned long *)&pending); + + return pending; +} + +static void *sse_alloc_stack(void) +{ + /* + * We assume that SSE_STACK_SIZE
always fit in one page. This page will + * always be decremented before storing anything on it in sse-entry.S. + */ + assert(SSE_STACK_SIZE <= PAGE_SIZE); + + return (alloc_page() + SSE_STACK_SIZE); +} + +static void sse_free_stack(void *stack) +{ + free_page(stack - SSE_STACK_SIZE); +} + +static void sse_read_write_test(uint32_t event_id, unsigned long attr, unsigned long attr_count, + unsigned long *value, long expected_error, const char *str) +{ + struct sbiret ret; + + ret = sbi_sse_read_attrs(event_id, attr, attr_count, value); + sbiret_report_error(&ret, expected_error, "Read %s error", str); + + ret = sbi_sse_write_attrs(event_id, attr, attr_count, value); + sbiret_report_error(&ret, expected_error, "Write %s error", str); +} + +#define ALL_ATTRS_COUNT (SBI_SSE_ATTR_INTERRUPTED_A7 + 1) + +static void sse_test_attrs(uint32_t event_id) +{ + unsigned long value = 0; + struct sbiret ret; + void *ptr; + unsigned long values[ALL_ATTRS_COUNT]; + unsigned int i; + const char *invalid_hart_str; + const char *attr_name; + + report_prefix_push("attrs"); + + for (i = 0; i < ARRAY_SIZE(ro_attrs); i++) { + ret = sbi_sse_write_attrs(event_id, ro_attrs[i], 1, &value); + sbiret_report_error(&ret, SBI_ERR_DENIED, "RO attribute %s not writable", + attr_names[ro_attrs[i]]); + } + + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, ALL_ATTRS_COUNT, values); + sbiret_report_error(&ret, SBI_SUCCESS, "Read multiple attributes"); + + for (i = SBI_SSE_ATTR_STATUS; i <= SBI_SSE_ATTR_INTERRUPTED_A7; i++) { + ret = sbi_sse_read_attrs(event_id, i, 1, &value); + attr_name = attr_names[i]; + + sbiret_report_error(&ret, SBI_SUCCESS, "Read single attribute %s", attr_name); + if (values[i] != value) + report_fail("Attribute 0x%x single value read (0x%lx) differs from the one read with multiple attributes (0x%lx)", + i, value, values[i]); + /* + * Preferred hart reset value is defined by SBI vendor + */ + if (i != SBI_SSE_ATTR_PREFERRED_HART) { + /* + * Specification states that 
injectable bit is implementation dependent + * but other bits are zero-initialized. + */ + if (i == SBI_SSE_ATTR_STATUS) + value &= ~BIT(SBI_SSE_ATTR_STATUS_INJECT_OFFSET); + report(value == 0, "Attribute %s reset value is 0, found %lx", attr_name, value); + } + } + +#if __riscv_xlen > 32 + value = BIT(32); + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Write invalid prio > 0xFFFFFFFF error"); +#endif + + value = ~SBI_SSE_ATTR_CONFIG_ONESHOT; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Write invalid config value error"); + + if (sbi_sse_event_is_global(event_id)) { + invalid_hart_str = getenv("INVALID_HART_ID"); + if (!invalid_hart_str) + value = 0xFFFFFFFFUL; + else + value = strtoul(invalid_hart_str, NULL, 0); + + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid hart id error"); + } else { + /* Set Hart on local event -> RO */ + value = current_thread_info()->hartid; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); + sbiret_report_error(&ret, SBI_ERR_DENIED, + "Set hart id on local event error"); + } + + /* Set/get flags, sepc, a6, a7 */ + for (i = 0; i < ARRAY_SIZE(interrupted_attrs); i++) { + attr_name = attr_names[interrupted_attrs[i]]; + ret = sbi_sse_read_attrs(event_id, interrupted_attrs[i], 1, &value); + sbiret_report_error(&ret, SBI_SUCCESS, "Get interrupted %s", attr_name); + + value = ARRAY_SIZE(interrupted_attrs) - i; + ret = sbi_sse_write_attrs(event_id, interrupted_attrs[i], 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, + "Set attribute %s invalid state error", attr_name); + } + + sse_read_write_test(event_id, SBI_SSE_ATTR_STATUS, 0, &value, SBI_ERR_INVALID_PARAM, + "attribute attr_count == 0"); + sse_read_write_test(event_id, SBI_SSE_ATTR_INTERRUPTED_A7 + 1, 
1, &value, SBI_ERR_BAD_RANGE, + "invalid attribute"); + + /* Misaligned pointer address */ + ptr = (void *)&value; + ptr += 1; + sse_read_write_test(event_id, SBI_SSE_ATTR_STATUS, 1, ptr, SBI_ERR_INVALID_ADDRESS, + "attribute with invalid address"); + + report_prefix_pop(); +} + +static void sse_test_register_error(uint32_t event_id) +{ + struct sbiret ret; + + report_prefix_push("register"); + + ret = sbi_sse_unregister(event_id); + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "unregister non-registered event"); + + ret = sbi_sse_register_raw(event_id, 0x1, 0); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "register misaligned entry"); + + ret = sbi_sse_register(event_id, NULL); + sbiret_report_error(&ret, SBI_SUCCESS, "register"); + if (ret.error) + goto done; + + ret = sbi_sse_register(event_id, NULL); + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "register used event failure"); + + ret = sbi_sse_unregister(event_id); + sbiret_report_error(&ret, SBI_SUCCESS, "unregister"); + +done: + report_prefix_pop(); +} + +struct sse_simple_test_arg { + bool done; + unsigned long expected_a6; + uint32_t event_id; +}; + +#if __riscv_xlen > 32 + +struct alias_test_params { + unsigned long event_id; + unsigned long attr_id; + unsigned long attr_count; + const char *str; +}; + +static void test_alias(uint32_t event_id) +{ + struct alias_test_params *write, *read; + unsigned long write_value, read_value; + struct sbiret ret; + bool err = false; + int r, w; + struct alias_test_params params[] = { + {event_id, SBI_SSE_ATTR_INTERRUPTED_A6, 1, "non aliased"}, + {BIT(32) + event_id, SBI_SSE_ATTR_INTERRUPTED_A6, 1, "aliased event_id"}, + {event_id, BIT(32) + SBI_SSE_ATTR_INTERRUPTED_A6, 1, "aliased attr_id"}, + {event_id, SBI_SSE_ATTR_INTERRUPTED_A6, BIT(32) + 1, "aliased attr_count"}, + }; + + report_prefix_push("alias"); + for (w = 0; w < ARRAY_SIZE(params); w++) { + write = &params[w]; + + write_value = 0xDEADBEEF + w; + ret = sbi_sse_write_attrs(write->event_id,
write->attr_id, write->attr_count, &write_value); + if (ret.error) + sbiret_report_error(&ret, SBI_SUCCESS, "Write %s, event 0x%lx attr 0x%lx, attr count 0x%lx", + write->str, write->event_id, write->attr_id, write->attr_count); + + for (r = 0; r < ARRAY_SIZE(params); r++) { + read = &params[r]; + read_value = 0; + ret = sbi_sse_read_attrs(read->event_id, read->attr_id, read->attr_count, &read_value); + if (ret.error) + sbiret_report_error(&ret, SBI_SUCCESS, + "Read %s, event 0x%lx attr 0x%lx, attr count 0x%lx", + read->str, read->event_id, read->attr_id, read->attr_count); + + /* Do not spam output with a lot of reports */ + if (write_value != read_value) { + err = true; + report_fail("Write %s, event 0x%lx attr 0x%lx, attr count 0x%lx value %lx ==" + "Read %s, event 0x%lx attr 0x%lx, attr count 0x%lx value %lx", + write->str, write->event_id, write->attr_id, + write->attr_count, write_value, read->str, + read->event_id, read->attr_id, read->attr_count, + read_value); + } + } + } + + report(!err, "BIT(32) aliasing tests"); + report_prefix_pop(); +} +#endif + +static void sse_simple_handler(void *data, struct pt_regs *regs, unsigned int hartid) +{ + struct sse_simple_test_arg *arg = data; + int i; + struct sbiret ret; + const char *attr_name; + uint32_t event_id = READ_ONCE(arg->event_id), attr; + unsigned long value, prev_value, flags; + unsigned long interrupted_state[ARRAY_SIZE(interrupted_attrs)]; + unsigned long modified_state[ARRAY_SIZE(interrupted_attrs)] = {4, 3, 2, 1}; + unsigned long tmp_state[ARRAY_SIZE(interrupted_attrs)]; + + report((regs->status & SR_SPP) == SR_SPP, "Interrupted S-mode"); + report(hartid == current_thread_info()->hartid, "Hartid correctly passed"); + sse_check_state(event_id, SBI_SSE_STATE_RUNNING); + report(!sse_event_pending(event_id), "Event not pending"); + + /* Read full interrupted state */ + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, + ARRAY_SIZE(interrupted_attrs), interrupted_state); +
sbiret_report_error(&ret, SBI_SUCCESS, "Save full interrupted state from handler"); + + /* Write full modified state and read it */ + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, + ARRAY_SIZE(modified_state), modified_state); + sbiret_report_error(&ret, SBI_SUCCESS, + "Write full interrupted state from handler"); + + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, + ARRAY_SIZE(tmp_state), tmp_state); + sbiret_report_error(&ret, SBI_SUCCESS, "Read full modified state from handler"); + + report(memcmp(tmp_state, modified_state, sizeof(modified_state)) == 0, + "Full interrupted state successfully written"); + +#if __riscv_xlen > 32 + test_alias(event_id); +#endif + + /* Restore full saved state */ + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, + ARRAY_SIZE(interrupted_attrs), interrupted_state); + sbiret_report_error(&ret, SBI_SUCCESS, "Full interrupted state restore from handler"); + + /* We test SBI_SSE_ATTR_INTERRUPTED_FLAGS below with specific flag values */ + for (i = 0; i < ARRAY_SIZE(interrupted_attrs); i++) { + attr = interrupted_attrs[i]; + if (attr == SBI_SSE_ATTR_INTERRUPTED_FLAGS) + continue; + + attr_name = attr_names[attr]; + + ret = sbi_sse_read_attrs(event_id, attr, 1, &prev_value); + sbiret_report_error(&ret, SBI_SUCCESS, "Get attr %s", attr_name); + + value = 0xDEADBEEF + i; + ret = sbi_sse_write_attrs(event_id, attr, 1, &value); + sbiret_report_error(&ret, SBI_SUCCESS, "Set attr %s", attr_name); + + ret = sbi_sse_read_attrs(event_id, attr, 1, &value); + sbiret_report_error(&ret, SBI_SUCCESS, "Get attr %s", attr_name); + report(value == 0xDEADBEEF + i, "Get attr %s, value: 0x%lx", attr_name, value); + + ret = sbi_sse_write_attrs(event_id, attr, 1, &prev_value); + sbiret_report_error(&ret, SBI_SUCCESS, "Restore attr %s value", attr_name); + } + + /* Test all flags allowed for SBI_SSE_ATTR_INTERRUPTED_FLAGS */ + attr = SBI_SSE_ATTR_INTERRUPTED_FLAGS; + ret = sbi_sse_read_attrs(event_id, 
attr, 1, &prev_value); + sbiret_report_error(&ret, SBI_SUCCESS, "Save interrupted flags"); + + for (i = 0; i < ARRAY_SIZE(interrupted_flags); i++) { + flags = interrupted_flags[i]; + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); + sbiret_report_error(&ret, SBI_SUCCESS, + "Set interrupted flags bit 0x%lx value", flags); + ret = sbi_sse_read_attrs(event_id, attr, 1, &value); + sbiret_report_error(&ret, SBI_SUCCESS, "Get interrupted flags after set"); + report(value == flags, "interrupted flags modified value: 0x%lx", value); + } + + /* Write invalid bit in flag register */ + flags = SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT << 1; + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", + flags); + + flags = BIT(SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT + 1); + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", + flags); + + ret = sbi_sse_write_attrs(event_id, attr, 1, &prev_value); + sbiret_report_error(&ret, SBI_SUCCESS, "Restore interrupted flags"); + + /* Try to change HARTID/Priority while running */ + if (sbi_sse_event_is_global(event_id)) { + value = current_thread_info()->hartid; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "Set hart id while running error"); + } + + value = 0; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "Set priority while running error"); + + value = READ_ONCE(arg->expected_a6); + report(interrupted_state[2] == value, "Interrupted state a6, expected 0x%lx, got 0x%lx", + value, interrupted_state[2]); + + report(interrupted_state[3] == SBI_EXT_SSE, + "Interrupted state a7, expected 0x%x, got 0x%lx", SBI_EXT_SSE, + interrupted_state[3]); + + WRITE_ONCE(arg->done, true); +} + +static 
void sse_test_inject_simple(uint32_t event_id) +{ + unsigned long value, error; + struct sbiret ret; + enum sbi_sse_state state = SBI_SSE_STATE_UNUSED; + struct sse_simple_test_arg test_arg = {.event_id = event_id}; + struct sbi_sse_handler_arg args = { + .handler = sse_simple_handler, + .handler_data = (void *)&test_arg, + .stack = sse_alloc_stack(), + }; + + report_prefix_push("simple"); + + if (!sse_check_state(event_id, SBI_SSE_STATE_UNUSED)) + goto cleanup; + + ret = sbi_sse_register(event_id, &args); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "register")) + goto cleanup; + + state = SBI_SSE_STATE_REGISTERED; + + if (!sse_check_state(event_id, SBI_SSE_STATE_REGISTERED)) + goto cleanup; + + if (sbi_sse_event_is_global(event_id)) { + /* Be sure global events are targeting the current hart */ + error = sse_global_event_set_current_hart(event_id); + if (error) + goto cleanup; + } + + ret = sbi_sse_enable(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "enable")) + goto cleanup; + + state = SBI_SSE_STATE_ENABLED; + if (!sse_check_state(event_id, SBI_SSE_STATE_ENABLED)) + goto cleanup; + + ret = sbi_sse_hart_mask(); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "hart mask")) + goto cleanup; + + ret = sbi_sse_inject(event_id, current_thread_info()->hartid); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "injection masked")) { + sbi_sse_hart_unmask(); + goto cleanup; + } + + report(READ_ONCE(test_arg.done) == 0, "event masked not handled"); + + /* + * When unmasking the SSE events, we expect it to be injected + * immediately so a6 should be SBI_EXT_SBI_SSE_HART_UNMASK + */ + WRITE_ONCE(test_arg.expected_a6, SBI_EXT_SSE_HART_UNMASK); + ret = sbi_sse_hart_unmask(); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "hart unmask")) + goto cleanup; + + report(READ_ONCE(test_arg.done) == 1, "event unmasked handled"); + WRITE_ONCE(test_arg.done, 0); + WRITE_ONCE(test_arg.expected_a6, SBI_EXT_SSE_INJECT); + + /* Set as oneshot and verify it is disabled */ + ret = 
sbi_sse_disable(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Disable event")) { + /* Nothing we can really do here, event can not be disabled */ + goto cleanup; + } + state = SBI_SSE_STATE_REGISTERED; + + value = SBI_SSE_ATTR_CONFIG_ONESHOT; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Set event attribute as ONESHOT")) + goto cleanup; + + ret = sbi_sse_enable(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Enable event")) + goto cleanup; + state = SBI_SSE_STATE_ENABLED; + + ret = sbi_sse_inject(event_id, current_thread_info()->hartid); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "second injection")) + goto cleanup; + + report(READ_ONCE(test_arg.done) == 1, "event handled"); + WRITE_ONCE(test_arg.done, 0); + + if (!sse_check_state(event_id, SBI_SSE_STATE_REGISTERED)) + goto cleanup; + state = SBI_SSE_STATE_REGISTERED; + + /* Clear ONESHOT FLAG */ + value = 0; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Clear CONFIG.ONESHOT flag")) + goto cleanup; + + ret = sbi_sse_unregister(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "unregister")) + goto cleanup; + state = SBI_SSE_STATE_UNUSED; + + sse_check_state(event_id, SBI_SSE_STATE_UNUSED); + +cleanup: + switch (state) { + case SBI_SSE_STATE_ENABLED: + ret = sbi_sse_disable(event_id); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "disable event 0x%x", event_id); + break; + } + case SBI_SSE_STATE_REGISTERED: + sbi_sse_unregister(event_id); + if (ret.error) + sbiret_report_error(&ret, SBI_SUCCESS, "unregister event 0x%x", event_id); + default: + } + + sse_free_stack(args.stack); + report_prefix_pop(); +} + +struct sse_foreign_cpu_test_arg { + bool done; + unsigned int expected_cpu; + uint32_t event_id; +}; + +static void sse_foreign_cpu_handler(void *data, struct pt_regs *regs, unsigned int hartid) +{ + struct 
sse_foreign_cpu_test_arg *arg = data; + unsigned int expected_cpu; + + /* For arg content to be visible */ + smp_rmb(); + expected_cpu = READ_ONCE(arg->expected_cpu); + report(expected_cpu == current_thread_info()->cpu, + "Received event on CPU (%d), expected CPU (%d)", current_thread_info()->cpu, + expected_cpu); + + WRITE_ONCE(arg->done, true); + /* For arg update to be visible for other CPUs */ + smp_wmb(); +} + +struct sse_local_per_cpu { + struct sbi_sse_handler_arg args; + struct sbiret ret; + struct sse_foreign_cpu_test_arg handler_arg; + enum sbi_sse_state state; +}; + +static void sse_register_enable_local(void *data) +{ + struct sbiret ret; + struct sse_local_per_cpu *cpu_args = data; + struct sse_local_per_cpu *cpu_arg = &cpu_args[current_thread_info()->cpu]; + uint32_t event_id = cpu_arg->handler_arg.event_id; + + ret = sbi_sse_register(event_id, &cpu_arg->args); + WRITE_ONCE(cpu_arg->ret, ret); + if (ret.error) + return; + cpu_arg->state = SBI_SSE_STATE_REGISTERED; + + ret = sbi_sse_enable(event_id); + WRITE_ONCE(cpu_arg->ret, ret); + if (ret.error) + return; + cpu_arg->state = SBI_SSE_STATE_ENABLED; +} + +static void sbi_sse_disable_unregister_local(void *data) +{ + struct sbiret ret; + struct sse_local_per_cpu *cpu_args = data; + struct sse_local_per_cpu *cpu_arg = &cpu_args[current_thread_info()->cpu]; + uint32_t event_id = cpu_arg->handler_arg.event_id; + + switch (cpu_arg->state) { + case SBI_SSE_STATE_ENABLED: + ret = sbi_sse_disable(event_id); + WRITE_ONCE(cpu_arg->ret, ret); + if (ret.error) + return; + case SBI_SSE_STATE_REGISTERED: + ret = sbi_sse_unregister(event_id); + WRITE_ONCE(cpu_arg->ret, ret); + default: + } +} + +static uint64_t sse_event_get_complete_timeout(void) +{ + char *event_complete_timeout_str; + uint64_t timeout; + + event_complete_timeout_str = getenv("SSE_EVENT_COMPLETE_TIMEOUT"); + if (!event_complete_timeout_str) + timeout = 1000; + else + timeout = strtoul(event_complete_timeout_str, NULL, 0); + + return 
timer_get_cycles() + usec_to_cycles(timeout); +} + +static void sse_test_inject_local(uint32_t event_id) +{ + int cpu; + uint64_t timeout; + struct sbiret ret; + struct sse_local_per_cpu *cpu_args, *cpu_arg; + struct sse_foreign_cpu_test_arg *handler_arg; + + cpu_args = calloc(NR_CPUS, sizeof(struct sbi_sse_handler_arg)); + + report_prefix_push("local_dispatch"); + for_each_online_cpu(cpu) { + cpu_arg = &cpu_args[cpu]; + cpu_arg->handler_arg.event_id = event_id; + cpu_arg->args.stack = sse_alloc_stack(); + cpu_arg->args.handler = sse_foreign_cpu_handler; + cpu_arg->args.handler_data = (void *)&cpu_arg->handler_arg; + cpu_arg->state = SBI_SSE_STATE_UNUSED; + } + + on_cpus(sse_register_enable_local, cpu_args); + for_each_online_cpu(cpu) { + cpu_arg = &cpu_args[cpu]; + ret = cpu_arg->ret; + if (ret.error) { + report_fail("CPU failed to register/enable event: %ld", ret.error); + goto cleanup; + } + + handler_arg = &cpu_arg->handler_arg; + WRITE_ONCE(handler_arg->expected_cpu, cpu); + /* For handler_arg content to be visible for other CPUs */ + smp_wmb(); + ret = sbi_sse_inject(event_id, cpus[cpu].hartid); + if (ret.error) { + report_fail("CPU failed to inject event: %ld", ret.error); + goto cleanup; + } + } + + for_each_online_cpu(cpu) { + handler_arg = &cpu_args[cpu].handler_arg; + smp_rmb(); + + timeout = sse_event_get_complete_timeout(); + while (!READ_ONCE(handler_arg->done) || timer_get_cycles() < timeout) { + /* For handler_arg update to be visible */ + smp_rmb(); + cpu_relax(); + } + report(READ_ONCE(handler_arg->done), "Event handled"); + WRITE_ONCE(handler_arg->done, false); + } + +cleanup: + on_cpus(sbi_sse_disable_unregister_local, cpu_args); + for_each_online_cpu(cpu) { + cpu_arg = &cpu_args[cpu]; + ret = READ_ONCE(cpu_arg->ret); + if (ret.error) + report_fail("CPU failed to disable/unregister event: %ld", ret.error); + } + + for_each_online_cpu(cpu) { + cpu_arg = &cpu_args[cpu]; + sse_free_stack(cpu_arg->args.stack); + } + + report_prefix_pop(); +} + 
+static void sse_test_inject_global_cpu(uint32_t event_id, unsigned int cpu, + struct sse_foreign_cpu_test_arg *test_arg) +{ + unsigned long value; + struct sbiret ret; + uint64_t timeout; + enum sbi_sse_state state; + + WRITE_ONCE(test_arg->expected_cpu, cpu); + /* For test_arg content to be visible for other CPUs */ + smp_wmb(); + value = cpu; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Set preferred hart")) + return; + + ret = sbi_sse_enable(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Enable event")) + return; + + ret = sbi_sse_inject(event_id, cpu); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Inject event")) + goto disable; + + smp_rmb(); + timeout = sse_event_get_complete_timeout(); + while (!READ_ONCE(test_arg->done) || timer_get_cycles() < timeout) { + /* For shared test_arg structure */ + smp_rmb(); + cpu_relax(); + } + + report(READ_ONCE(test_arg->done), "event handler called"); + WRITE_ONCE(test_arg->done, false); + + timeout = sse_event_get_complete_timeout(); + /* Wait for event to be back in ENABLED state */ + do { + ret = sse_event_get_state(event_id, &state); + if (ret.error) + goto disable; + cpu_relax(); + } while (state != SBI_SSE_STATE_ENABLED || timer_get_cycles() < timeout); + + report(state == SBI_SSE_STATE_ENABLED, "Event in enabled state"); + +disable: + ret = sbi_sse_disable(event_id); + sbiret_report_error(&ret, SBI_SUCCESS, "Disable event"); +} + +static void sse_test_inject_global(uint32_t event_id) +{ + struct sbiret ret; + unsigned int cpu; + struct sse_foreign_cpu_test_arg test_arg = {.event_id = event_id}; + struct sbi_sse_handler_arg args = { + .handler = sse_foreign_cpu_handler, + .handler_data = (void *)&test_arg, + .stack = sse_alloc_stack(), + }; + + report_prefix_push("global_dispatch"); + + ret = sbi_sse_register(event_id, &args); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Register event")) + goto err; + + 
for_each_online_cpu(cpu) + sse_test_inject_global_cpu(event_id, cpu, &test_arg); + + ret = sbi_sse_unregister(event_id); + sbiret_report_error(&ret, SBI_SUCCESS, "Unregister event"); + +err: + sse_free_stack(args.stack); + report_prefix_pop(); +} + +struct priority_test_arg { + uint32_t event_id; + bool called; + u32 prio; + enum sbi_sse_state state; /* Used for error handling */ + struct priority_test_arg *next_event_arg; + void (*check_func)(struct priority_test_arg *arg); +}; + +static void sse_hi_priority_test_handler(void *arg, struct pt_regs *regs, + unsigned int hartid) +{ + struct priority_test_arg *targ = arg; + struct priority_test_arg *next = targ->next_event_arg; + + targ->called = true; + if (next) { + sbi_sse_inject(next->event_id, current_thread_info()->hartid); + + report(!sse_event_pending(next->event_id), "Higher priority event is not pending"); + report(next->called, "Higher priority event was handled"); + } +} + +static void sse_low_priority_test_handler(void *arg, struct pt_regs *regs, + unsigned int hartid) +{ + struct priority_test_arg *targ = arg; + struct priority_test_arg *next = targ->next_event_arg; + + targ->called = true; + + if (next) { + sbi_sse_inject(next->event_id, current_thread_info()->hartid); + + report(sse_event_pending(next->event_id), "Lower priority event is pending"); + report(!next->called, "Lower priority event %s was not handled before %s", + sse_event_name(next->event_id), sse_event_name(targ->event_id)); + } +} + +static void sse_test_injection_priority_arg(struct priority_test_arg *in_args, + unsigned int in_args_size, + sbi_sse_handler_fn handler, + const char *test_name) +{ + unsigned int i; + unsigned long value, uret; + struct sbiret ret; + uint32_t event_id; + struct priority_test_arg *arg; + unsigned int args_size = 0; + struct sbi_sse_handler_arg event_args[in_args_size]; + struct priority_test_arg *args[in_args_size]; + void *stack; + struct sbi_sse_handler_arg *event_arg; + + report_prefix_push(test_name); 
+ + for (i = 0; i < in_args_size; i++) { + arg = &in_args[i]; + arg->state = SBI_SSE_STATE_UNUSED; + event_id = arg->event_id; + if (!sse_event_can_inject(event_id)) + continue; + + args[args_size] = arg; + args_size++; + event_args->stack = 0; + } + + if (!args_size) { + report_skip("No injectable events"); + goto skip; + } + + for (i = 0; i < args_size; i++) { + arg = args[i]; + event_id = arg->event_id; + stack = sse_alloc_stack(); + + event_arg = &event_args[i]; + event_arg->handler = handler; + event_arg->handler_data = (void *)arg; + event_arg->stack = stack; + + if (i < (args_size - 1)) + arg->next_event_arg = args[i + 1]; + else + arg->next_event_arg = NULL; + + /* Be sure global events are targeting the current hart */ + if (sbi_sse_event_is_global(event_id)) { + uret = sse_global_event_set_current_hart(event_id); + if (uret) + goto err; + } + + ret = sbi_sse_register(event_id, event_arg); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "register event %s", + sse_event_name(event_id)); + goto err; + } + arg->state = SBI_SSE_STATE_REGISTERED; + + value = arg->prio; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "set event %s priority", + sse_event_name(event_id)); + goto err; + } + ret = sbi_sse_enable(event_id); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "enable event %s", + sse_event_name(event_id)); + goto err; + } + arg->state = SBI_SSE_STATE_ENABLED; + } + + /* Inject first event */ + ret = sbi_sse_inject(args[0]->event_id, current_thread_info()->hartid); + sbiret_report_error(&ret, SBI_SUCCESS, "injection"); + + /* Check that all handlers have been called */ + for (i = 0; i < args_size; i++) + report(arg->called, "Event %s handler called", sse_event_name(args[i]->event_id)); + +err: + for (i = 0; i < args_size; i++) { + arg = args[i]; + event_id = arg->event_id; + + switch (arg->state) { + case SBI_SSE_STATE_ENABLED: + ret = 
sbi_sse_disable(event_id); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "disable event 0x%x", + event_id); + break; + } + case SBI_SSE_STATE_REGISTERED: + sbi_sse_unregister(event_id); + if (ret.error) + sbiret_report_error(&ret, SBI_SUCCESS, "unregister event 0x%x", + event_id); + default: + } + + event_arg = &event_args[i]; + if (event_arg->stack) + sse_free_stack(event_arg->stack); + } + +skip: + report_prefix_pop(); +} + +static struct priority_test_arg hi_prio_args[] = { + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE}, + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE}, + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW}, + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP}, + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS}, +}; + +static struct priority_test_arg low_prio_args[] = { + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP}, + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW}, + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE}, + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE}, +}; + +static struct priority_test_arg prio_args[] = { + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, .prio = 5}, + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, .prio = 10}, + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, .prio = 12}, + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, .prio = 15}, + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, .prio = 20}, + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, .prio = 22}, + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, .prio = 25}, +}; + +static struct priority_test_arg same_prio_args[] = { + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, .prio = 0}, + {.event_id = 
SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, .prio = 0}, + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, .prio = 10}, + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, .prio = 10}, + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, .prio = 10}, + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, .prio = 20}, + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, .prio = 20}, +}; + +static void sse_test_injection_priority(void) +{ + report_prefix_push("prio"); + + sse_test_injection_priority_arg(hi_prio_args, ARRAY_SIZE(hi_prio_args), + sse_hi_priority_test_handler, "high"); + + sse_test_injection_priority_arg(low_prio_args, ARRAY_SIZE(low_prio_args), + sse_low_priority_test_handler, "low"); + + sse_test_injection_priority_arg(prio_args, ARRAY_SIZE(prio_args), + sse_low_priority_test_handler, "changed"); + + sse_test_injection_priority_arg(same_prio_args, ARRAY_SIZE(same_prio_args), + sse_low_priority_test_handler, "same_prio_args"); + + report_prefix_pop(); +} + +static void test_invalid_event_id(unsigned long event_id) +{ + struct sbiret ret; + unsigned long value = 0; + + ret = sbi_sse_register_raw(event_id, (unsigned long) sbi_sse_entry, 0); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "register event_id 0x%lx", event_id); + + ret = sbi_sse_unregister(event_id); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "unregister event_id 0x%lx", event_id); + + ret = sbi_sse_enable(event_id); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "enable event_id 0x%lx", event_id); + + ret = sbi_sse_disable(event_id); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "disable event_id 0x%lx", event_id); + + ret = sbi_sse_inject(event_id, 0); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "inject event_id 0x%lx", event_id); + + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "write attr event_id 0x%lx", event_id); + + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); + 
sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "read attr event_id 0x%lx", event_id); +} + +static void sse_test_invalid_event_id(void) +{ + + report_prefix_push("event_id"); + + test_invalid_event_id(SBI_SSE_EVENT_LOCAL_RESERVED_0_START); + + report_prefix_pop(); +} + +static void sse_check_event_availability(uint32_t event_id, bool *can_inject, bool *supported) +{ + unsigned long status; + struct sbiret ret; + + *can_inject = false; + *supported = false; + + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, 1, &status); + if (ret.error != SBI_SUCCESS && ret.error != SBI_ERR_NOT_SUPPORTED) { + report_fail("Get event status != SBI_SUCCESS && != SBI_ERR_NOT_SUPPORTED: %ld", + ret.error); + return; + } + if (ret.error == SBI_ERR_NOT_SUPPORTED) + return; + + *supported = true; + *can_inject = (status >> SBI_SSE_ATTR_STATUS_INJECT_OFFSET) & 1; +} + +static void sse_secondary_boot_and_unmask(void *data) +{ + sbi_sse_hart_unmask(); +} + +static void sse_check_mask(void) +{ + struct sbiret ret; + + /* Upon boot, event are masked, check that */ + ret = sbi_sse_hart_mask(); + sbiret_report_error(&ret, SBI_ERR_ALREADY_STOPPED, "hart mask at boot time"); + + ret = sbi_sse_hart_unmask(); + sbiret_report_error(&ret, SBI_SUCCESS, "hart unmask"); + ret = sbi_sse_hart_unmask(); + sbiret_report_error(&ret, SBI_ERR_ALREADY_STARTED, "hart unmask twice error"); + + ret = sbi_sse_hart_mask(); + sbiret_report_error(&ret, SBI_SUCCESS, "hart mask"); + ret = sbi_sse_hart_mask(); + sbiret_report_error(&ret, SBI_ERR_ALREADY_STOPPED, "hart mask twice"); +} + +static void run_inject_test(struct sse_event_info *info) +{ + unsigned long event_id = info->event_id; + + if (!info->can_inject) { + report_skip("Event does not support injection, skipping injection tests"); + return; + } + + sse_test_inject_simple(event_id); + + if (sbi_sse_event_is_global(event_id)) + sse_test_inject_global(event_id); + else + sse_test_inject_local(event_id); +} + +void check_sse(void) +{ + struct 
sse_event_info *info; + unsigned long i, event_id; + bool supported; + + report_prefix_push("sse"); + + if (!sbi_probe(SBI_EXT_SSE)) { + report_skip("extension not available"); + report_prefix_pop(); + return; + } + + sse_check_mask(); + + /* + * Dummy wakeup of all processors since some of them will be targeted + * by global events without going through the wakeup call as well as + * unmasking SSE events on all harts + */ + on_cpus(sse_secondary_boot_and_unmask, NULL); + + sse_test_invalid_event_id(); + + for (i = 0; i < ARRAY_SIZE(sse_event_infos); i++) { + info = &sse_event_infos[i]; + event_id = info->event_id; + report_prefix_push(info->name); + sse_check_event_availability(event_id, &info->can_inject, &supported); + if (!supported) { + report_skip("Event is not supported, skipping tests"); + report_prefix_pop(); + continue; + } + + sse_test_attrs(event_id); + sse_test_register_error(event_id); + + run_inject_test(info); + + report_prefix_pop(); + } + + sse_test_injection_priority(); + + report_prefix_pop(); +} diff --git a/riscv/sbi.c b/riscv/sbi.c index 0404bb81..478cb35d 100644 --- a/riscv/sbi.c +++ b/riscv/sbi.c @@ -32,6 +32,7 @@ #define HIGH_ADDR_BOUNDARY ((phys_addr_t)1 << 32) +void check_sse(void); void check_fwft(void); static long __labs(long a) @@ -1567,6 +1568,7 @@ int main(int argc, char **argv) check_hsm(); check_dbcn(); check_susp(); + check_sse(); check_fwft(); return report_summary(); -- 2.47.2 From cleger at rivosinc.com Fri Mar 14 04:21:24 2025 From: cleger at rivosinc.com (=?UTF-8?B?Q2zDqW1lbnQgTMOpZ2Vy?=) Date: Fri, 14 Mar 2025 12:21:24 +0100 Subject: [PATCH v3 03/17] riscv: sbi: add SBI FWFT extension calls In-Reply-To: <20250313-ce439653d16b484dba6a8d3e@orel> References: <20250310151229.2365992-1-cleger@rivosinc.com> <20250310151229.2365992-4-cleger@rivosinc.com> <20250313-ce439653d16b484dba6a8d3e@orel> Message-ID: On 13/03/2025 13:44, Andrew Jones wrote: > On Mon, Mar 10, 2025 at 04:12:10PM +0100, Cl?ment L?ger wrote: >> Add FWFT 
extension calls. This will be ratified in SBI V3.0 hence, it is
>> provided as a separate commit that can be left out if needed.
>>
>> Signed-off-by: Clément Léger
>> ---
>>  arch/riscv/kernel/sbi.c | 30 ++++++++++++++++++++++++++++--
>>  1 file changed, 28 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
>> index 256910db1307..af8e2199e32d 100644
>> --- a/arch/riscv/kernel/sbi.c
>> +++ b/arch/riscv/kernel/sbi.c
>> @@ -299,9 +299,19 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask,
>>  	return 0;
>>  }
>>
>> +static bool sbi_fwft_supported;
>> +
>>  int sbi_fwft_get(u32 feature, unsigned long *value)
>>  {
>> -	return -EOPNOTSUPP;
>> +	struct sbiret ret;
>> +
>> +	if (!sbi_fwft_supported)
>> +		return -EOPNOTSUPP;
>> +
>> +	ret = sbi_ecall(SBI_EXT_FWFT, SBI_EXT_FWFT_GET,
>> +			feature, 0, 0, 0, 0, 0);
>> +
>> +	return sbi_err_map_linux_errno(ret.error);
>>  }
>>
>>  /**
>> @@ -314,7 +324,15 @@ int sbi_fwft_get(u32 feature, unsigned long *value)
>>   */
>>  int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags)
>>  {
>> -	return -EOPNOTSUPP;
>> +	struct sbiret ret;
>> +
>> +	if (!sbi_fwft_supported)
>> +		return -EOPNOTSUPP;
>> +
>> +	ret = sbi_ecall(SBI_EXT_FWFT, SBI_EXT_FWFT_SET,
>> +			feature, value, flags, 0, 0, 0);
>> +
>> +	return sbi_err_map_linux_errno(ret.error);
>
> sbi_err_map_linux_errno() doesn't know about SBI_ERR_DENIED_LOCKED.

Not only does it not know about DENIED_LOCKED, it is also missing a bunch
of other errors. I'll add them in a separate commit.
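
For illustration only (this is not the follow-up commit mentioned above): a
small userspace sketch of an SBI-to-errno mapping that also covers
SBI_ERR_DENIED_LOCKED. The name demo_sbi_err_to_errno(), the DEMO_* constants
and the errno chosen for each case are assumptions made for this example. The
numeric values follow my reading of the SBI specification and should be
checked against it. EOPNOTSUPP stands in for the kernel-internal ENOTSUPP,
which is not available in userspace.

```c
#include <errno.h>

/* Assumed SBI error codes; verify against the SBI specification. */
#define DEMO_SBI_SUCCESS		0
#define DEMO_SBI_ERR_FAILURE		(-1)
#define DEMO_SBI_ERR_NOT_SUPPORTED	(-2)
#define DEMO_SBI_ERR_INVALID_PARAM	(-3)
#define DEMO_SBI_ERR_DENIED		(-4)
#define DEMO_SBI_ERR_INVALID_ADDRESS	(-5)
#define DEMO_SBI_ERR_DENIED_LOCKED	(-14)

/*
 * Hypothetical helper: translate an SBI error into a negative errno.
 * DENIED_LOCKED is folded into the same errno as DENIED here; that choice
 * is an assumption, not something the thread decides.
 */
static int demo_sbi_err_to_errno(long err)
{
	switch (err) {
	case DEMO_SBI_SUCCESS:
		return 0;
	case DEMO_SBI_ERR_DENIED:
	case DEMO_SBI_ERR_DENIED_LOCKED:
		return -EPERM;
	case DEMO_SBI_ERR_INVALID_PARAM:
		return -EINVAL;
	case DEMO_SBI_ERR_INVALID_ADDRESS:
		return -EFAULT;
	case DEMO_SBI_ERR_NOT_SUPPORTED:
	case DEMO_SBI_ERR_FAILURE:
	default:
		/* Unknown codes fall back to "not supported". */
		return -EOPNOTSUPP;
	}
}
```

With such a mapping, a caller of a set helper could tell a locked feature
(-EPERM) apart from a missing extension (-EOPNOTSUPP).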
>
>>  }
>>
>>  struct fwft_set_req {
>> @@ -389,6 +407,9 @@ static int sbi_fwft_feature_local_set(u32 feature, unsigned long value,
>>  int sbi_fwft_all_cpus_set(u32 feature, unsigned long value, unsigned long flags,
>>  			  bool revert_on_fail)
>>  {
>> +	if (!sbi_fwft_supported)
>> +		return -EOPNOTSUPP;
>> +
>>  	if (feature & SBI_FWFT_GLOBAL_FEATURE_BIT)
>>  		return sbi_fwft_set(feature, value, flags);
>>
>> @@ -719,6 +740,11 @@ void __init sbi_init(void)
>>  			pr_info("SBI DBCN extension detected\n");
>>  			sbi_debug_console_available = true;
>>  		}
>> +		if ((sbi_spec_version >= sbi_mk_version(2, 0)) &&
>
> Should check sbi_mk_version(3, 0)

Oh yes, that was for testing purposes and I incorrectly squashed it.

>
>> +		    (sbi_probe_extension(SBI_EXT_FWFT) > 0)) {
>> +			pr_info("SBI FWFT extension detected\n");
>> +			sbi_fwft_supported = true;
>> +		}
>>  	} else {
>>  		__sbi_set_timer = __sbi_set_timer_v01;
>>  		__sbi_send_ipi = __sbi_send_ipi_v01;
>> --
>> 2.47.2
>>
>

Thanks,

Clément

> Thanks,
> drew

From cleger at rivosinc.com Fri Mar 14 04:33:55 2025
From: cleger at rivosinc.com (Clément Léger)
Date: Fri, 14 Mar 2025 12:33:55 +0100
Subject: [PATCH v3 02/17] riscv: sbi: add FWFT extension interface
In-Reply-To: <20250313-5c22df0c08337905367fa125@orel>
References: <20250310151229.2365992-1-cleger@rivosinc.com>
 <20250310151229.2365992-3-cleger@rivosinc.com>
 <20250313-5c22df0c08337905367fa125@orel>
Message-ID:

On 13/03/2025 13:39, Andrew Jones wrote:
> On Mon, Mar 10, 2025 at 04:12:09PM +0100, Clément Léger wrote:
>> This SBI extensions enables supervisor mode to control feature that are
>> under M-mode control (For instance, Svadu menvcfg ADUE bit, Ssdbltrp
>> DTE, etc).
>> >> Signed-off-by: Cl?ment L?ger >> --- >> arch/riscv/include/asm/sbi.h | 5 ++ >> arch/riscv/kernel/sbi.c | 97 ++++++++++++++++++++++++++++++++++++ >> 2 files changed, 102 insertions(+) >> >> diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h >> index bb077d0c912f..fc87c609c11a 100644 >> --- a/arch/riscv/include/asm/sbi.h >> +++ b/arch/riscv/include/asm/sbi.h >> @@ -503,6 +503,11 @@ int sbi_remote_hfence_vvma_asid(const struct cpumask *cpu_mask, >> unsigned long asid); >> long sbi_probe_extension(int ext); >> >> +int sbi_fwft_all_cpus_set(u32 feature, unsigned long value, unsigned long flags, >> + bool revert_on_failure); >> +int sbi_fwft_get(u32 feature, unsigned long *value); >> +int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags); >> + >> /* Check if current SBI specification version is 0.1 or not */ >> static inline int sbi_spec_is_0_1(void) >> { >> diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c >> index 1989b8cade1b..256910db1307 100644 >> --- a/arch/riscv/kernel/sbi.c >> +++ b/arch/riscv/kernel/sbi.c >> @@ -299,6 +299,103 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask, >> return 0; >> } >> >> +int sbi_fwft_get(u32 feature, unsigned long *value) >> +{ >> + return -EOPNOTSUPP; >> +} >> + >> +/** >> + * sbi_fwft_set() - Set a feature on all online cpus > > copy+paste of description from sbi_fwft_all_cpus_set(). This function > only sets the feature on the calling hart. > >> + * @feature: The feature to be set >> + * @value: The feature value to be set >> + * @flags: FWFT feature set flags >> + * >> + * Return: 0 on success, appropriate linux error code otherwise. 
>> + */ >> +int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags) >> +{ >> + return -EOPNOTSUPP; >> +} >> + >> +struct fwft_set_req { >> + u32 feature; >> + unsigned long value; >> + unsigned long flags; >> + cpumask_t mask; >> +}; >> + >> +static void cpu_sbi_fwft_set(void *arg) >> +{ >> + struct fwft_set_req *req = arg; >> + >> + if (sbi_fwft_set(req->feature, req->value, req->flags)) >> + cpumask_clear_cpu(smp_processor_id(), &req->mask); >> +} >> + >> +static int sbi_fwft_feature_local_set(u32 feature, unsigned long value, >> + unsigned long flags, >> + bool revert_on_fail) >> +{ >> + int ret; >> + unsigned long prev_value; >> + cpumask_t tmp; >> + struct fwft_set_req req = { >> + .feature = feature, >> + .value = value, >> + .flags = flags, >> + }; >> + >> + cpumask_copy(&req.mask, cpu_online_mask); >> + >> + /* We can not revert if features are locked */ >> + if (revert_on_fail && flags & SBI_FWFT_SET_FLAG_LOCK) > > Should use () around the flags &. I thought checkpatch complained about > that? > >> + return -EINVAL; >> + >> + /* Reset value is the same for all cpus, read it once. */ > > How do we know we're reading the reset value? sbi_fwft_all_cpus_set() may > be called multiple times on the same feature. And harts may have had > sbi_fwft_set() called on them independently. I think we should drop the > whole prev_value optimization. That's actually used for revert_on_failure as well not only the optimization. > >> + ret = sbi_fwft_get(feature, &prev_value); >> + if (ret) >> + return ret; >> + >> + /* Feature might already be set to the value we want */ >> + if (prev_value == value) >> + return 0; >> + >> + on_each_cpu_mask(&req.mask, cpu_sbi_fwft_set, &req, 1); >> + if (cpumask_equal(&req.mask, cpu_online_mask)) >> + return 0; >> + >> + pr_err("Failed to set feature %x for all online cpus, reverting\n", >> + feature); > > nit: I'd let the above line stick out. We have 100 chars. 
>
>> +
>> +	req.value = prev_value;
>> +	cpumask_copy(&tmp, &req.mask);
>> +	on_each_cpu_mask(&req.mask, cpu_sbi_fwft_set, &req, 1);
>> +	if (cpumask_equal(&req.mask, &tmp))
>> +		return 0;
>
> I'm not sure we want the revert_on_fail support either. What happens when
> the revert fails and we return -EINVAL below? Also returning zero when
> revert succeeds means the caller won't know if we successfully set what
> we wanted or just successfully reverted.

So that might actually be needed for features that need to be enabled
on all harts or not enabled at all. If we fail to enable it on all of them,
then the harts will be left in an incoherent state with respect to each
other. The returned error code is wrong though, and I'm not sure we would
have a way to gracefully handle a revert failure (except maybe panicking?).

Thanks,

Clément

>
>> +
>> +	return -EINVAL;
>> +}
>> +
>> +/**
>> + * sbi_fwft_all_cpus_set() - Set a feature on all online cpus
>> + * @feature: The feature to be set
>> + * @value: The feature value to be set
>> + * @flags: FWFT feature set flags
>> + * @revert_on_fail: true if feature value should be restored to it's orignal
>
> its original
>
>> + * value on failure.
>
> Line 'value' up under 'true'
>
>> + *
>> + * Return: 0 on success, appropriate linux error code otherwise.
>> + */
>> +int sbi_fwft_all_cpus_set(u32 feature, unsigned long value, unsigned long flags,
>> +			  bool revert_on_fail)
>> +{
>> +	if (feature & SBI_FWFT_GLOBAL_FEATURE_BIT)
>> +		return sbi_fwft_set(feature, value, flags);
>> +
>> +	return sbi_fwft_feature_local_set(feature, value, flags,
>> +					  revert_on_fail);
>> +}
>> +
>>  /**
>>   * sbi_set_timer() - Program the timer for next timer event.
>>   * @stime_value: The value after which next timer event should fire.
>> --
>> 2.47.2
>
> Thanks,
> drew

From cleger at rivosinc.com Fri Mar 14 04:44:05 2025
From: cleger at rivosinc.com (Clément Léger)
Date: Fri, 14 Mar 2025 12:44:05 +0100
Subject: [PATCH v3 05/17] riscv: misaligned: use on_each_cpu() for scalar misaligned access probing
In-Reply-To: <20250313-311b94f9bafe73bcd41158a1@orel>
References: <20250310151229.2365992-1-cleger@rivosinc.com>
 <20250310151229.2365992-6-cleger@rivosinc.com>
 <20250313-311b94f9bafe73bcd41158a1@orel>
Message-ID: <1942580f-cd67-4ddd-b489-0532f95c1ef2@rivosinc.com>

On 13/03/2025 13:57, Andrew Jones wrote:
> On Mon, Mar 10, 2025 at 04:12:12PM +0100, Clément Léger wrote:
>> schedule_on_each_cpu() was used without any good reason while documented
>> as very slow. This call was in the boot path, so better use
>> on_each_cpu() for scalar misaligned checking. Vector misaligned check
>> still needs to use schedule_on_each_cpu() since it requires irqs to be
>> enabled but that's less of a problem since this code is ran in a kthread.
>> Add a comment to explicit that.
>>
>> Signed-off-by: Clément Léger
>> ---
>>  arch/riscv/kernel/traps_misaligned.c | 9 +++++++--
>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c
>> index 90ac74191357..ffac424faa88 100644
>> --- a/arch/riscv/kernel/traps_misaligned.c
>> +++ b/arch/riscv/kernel/traps_misaligned.c
>> @@ -616,6 +616,11 @@ bool check_vector_unaligned_access_emulated_all_cpus(void)
>>  		return false;
>>  	}
>>
>> +	/*
>> +	 * While being documented as very slow, schedule_on_each_cpu() is used
>> +	 * since kernel_vector_begin() expects irqs to be enabled or it will panic().
>
> which expects

Hum that would yield the following:

"schedule_on_each_cpu() is used since kernel_vector_begin() that is
called inside the vector code 'which' expects irqs to be enabled or it
will panic()."

which seems wrong as well.
I guess something like this would be better:

"While being documented as very slow, schedule_on_each_cpu() is used
since kernel_vector_begin() expects irqs to be enabled or it will panic()"

Thanks,

Clément

>
>> +	 */
>>  	schedule_on_each_cpu(check_vector_unaligned_access_emulated);
>>
>>  	for_each_online_cpu(cpu)
>> @@ -636,7 +641,7 @@ bool check_vector_unaligned_access_emulated_all_cpus(void)
>>
>>  static bool unaligned_ctl __read_mostly;
>>
>> -static void check_unaligned_access_emulated(struct work_struct *work __always_unused)
>> +static void check_unaligned_access_emulated(void *arg __always_unused)
>>  {
>>  	int cpu = smp_processor_id();
>>  	long *mas_ptr = per_cpu_ptr(&misaligned_access_speed, cpu);
>> @@ -677,7 +682,7 @@ bool check_unaligned_access_emulated_all_cpus(void)
>>  	 * accesses emulated since tasks requesting such control can run on any
>>  	 * CPU.
>>  	 */
>> -	schedule_on_each_cpu(check_unaligned_access_emulated);
>> +	on_each_cpu(check_unaligned_access_emulated, NULL, 1);
>>
>>  	for_each_online_cpu(cpu)
>>  		if (per_cpu(misaligned_access_speed, cpu)
>> --
>> 2.47.2
>>
>
> Reviewed-by: Andrew Jones

From cleger at rivosinc.com Fri Mar 14 04:47:22 2025
From: cleger at rivosinc.com (Clément Léger)
Date: Fri, 14 Mar 2025 12:47:22 +0100
Subject: [PATCH v3 06/17] riscv: misaligned: use correct CONFIG_ ifdef for misaligned_access_speed
In-Reply-To: <20250313-a437330d8e1c638a9aa61e0a@orel>
References: <20250310151229.2365992-1-cleger@rivosinc.com>
 <20250310151229.2365992-7-cleger@rivosinc.com>
 <20250313-a437330d8e1c638a9aa61e0a@orel>
Message-ID:

On 13/03/2025 14:06, Andrew Jones wrote:
> On Mon, Mar 10, 2025 at 04:12:13PM +0100, Clément Léger wrote:
>> misaligned_access_speed is defined under CONFIG_RISCV_SCALAR_MISALIGNED
>> but was used under CONFIG_RISCV_PROBE_UNALIGNED_ACCESS. Fix that by
>> using the correct config option.
>>
>> Signed-off-by: Clément Léger
>> ---
>>  arch/riscv/kernel/traps_misaligned.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c
>> index ffac424faa88..7fe25adf2539 100644
>> --- a/arch/riscv/kernel/traps_misaligned.c
>> +++ b/arch/riscv/kernel/traps_misaligned.c
>> @@ -362,7 +362,7 @@ static int handle_scalar_misaligned_load(struct pt_regs *regs)
>>
>>  	perf_sw_event(PERF_COUNT_SW_ALIGNMENT_FAULTS, 1, regs, addr);
>>
>> -#ifdef CONFIG_RISCV_PROBE_UNALIGNED_ACCESS
>> +#ifdef CONFIG_RISCV_SCALAR_MISALIGNED
>>  	*this_cpu_ptr(&misaligned_access_speed) = RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED;
>>  #endif
>
> Sure, but CONFIG_RISCV_PROBE_UNALIGNED_ACCESS selects
> CONFIG_RISCV_SCALAR_MISALIGNED, so this isn't fixing anything.

Indeed, that is not fixing anything (hence no Fixes tag); it compiles as a
side effect of the Kconfig dependencies.

> Changing it
> does make sense though since this line in handle_scalar_misaligned_load()
> "belongs" to check_unaligned_access_emulated() which is also under
> CONFIG_RISCV_SCALAR_MISALIGNED. Anyway, all this unaligned configs need a
> major cleanup.

Yes, as I said, I'd advocate removing all that ifdefery mess.

Thanks,

Clément

>
>
> Reviewed-by: Andrew Jones
>
> Thanks,
> drew
>
>>
>> --
>> 2.47.2
>>
>>
>> --
>> kvm-riscv mailing list
>> kvm-riscv at lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/kvm-riscv

From cleger at rivosinc.com Fri Mar 14 04:49:22 2025
From: cleger at rivosinc.com (Clément Léger)
Date: Fri, 14 Mar 2025 12:49:22 +0100
Subject: [PATCH v3 08/17] riscv: misaligned: add a function to check misalign trap delegability
In-Reply-To: <20250313-4bea400c5770a5a5d3d70b38@orel>
References: <20250310151229.2365992-1-cleger@rivosinc.com>
 <20250310151229.2365992-9-cleger@rivosinc.com>
 <20250313-4bea400c5770a5a5d3d70b38@orel>
Message-ID: <7073c4de-4735-429b-b520-f18c33ecaab7@rivosinc.com>

On 13/03/2025 14:19, Andrew Jones wrote:
> On Mon, Mar 10, 2025 at 04:12:15PM +0100, Clément Léger wrote:
>> Checking for the delegability of the misaligned access trap is needed
>> for the KVM FWFT extension implementation. Add a function to get the
>> delegability of the misaligned trap exception.
>> >> Signed-off-by: Cl?ment L?ger >> --- >> arch/riscv/include/asm/cpufeature.h | 5 +++++ >> arch/riscv/kernel/traps_misaligned.c | 17 +++++++++++++++-- >> 2 files changed, 20 insertions(+), 2 deletions(-) >> >> diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h >> index ad7d26788e6a..8b97cba99fc3 100644 >> --- a/arch/riscv/include/asm/cpufeature.h >> +++ b/arch/riscv/include/asm/cpufeature.h >> @@ -69,12 +69,17 @@ int cpu_online_unaligned_access_init(unsigned int cpu); >> #if defined(CONFIG_RISCV_SCALAR_MISALIGNED) >> void unaligned_emulation_finish(void); >> bool unaligned_ctl_available(void); >> +bool misaligned_traps_can_delegate(void); >> DECLARE_PER_CPU(long, misaligned_access_speed); >> #else >> static inline bool unaligned_ctl_available(void) >> { >> return false; >> } >> +static inline bool misaligned_traps_can_delegate(void) >> +{ >> + return false; >> +} >> #endif >> >> bool check_vector_unaligned_access_emulated_all_cpus(void); >> diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c >> index db31966a834e..a67a6e709a06 100644 >> --- a/arch/riscv/kernel/traps_misaligned.c >> +++ b/arch/riscv/kernel/traps_misaligned.c >> @@ -716,10 +716,10 @@ static int cpu_online_check_unaligned_access_emulated(unsigned int cpu) >> } >> #endif >> >> -#ifdef CONFIG_RISCV_SBI >> - >> static bool misaligned_traps_delegated; >> >> +#ifdef CONFIG_RISCV_SBI >> + >> static int cpu_online_sbi_unaligned_setup(unsigned int cpu) >> { >> if (sbi_fwft_set(SBI_FWFT_MISALIGNED_EXC_DELEG, 1, 0) && >> @@ -761,6 +761,7 @@ static int cpu_online_sbi_unaligned_setup(unsigned int cpu __always_unused) >> { >> return 0; >> } >> + >> #endif >> >> int cpu_online_unaligned_access_init(unsigned int cpu) >> @@ -773,3 +774,15 @@ int cpu_online_unaligned_access_init(unsigned int cpu) >> >> return cpu_online_check_unaligned_access_emulated(cpu); >> } >> + >> +bool misaligned_traps_can_delegate(void) >> +{ >> + /* >> + * Either we 
successfully requested misaligned traps delegation for all >> + * CPUS or the SBI does not implemented FWFT extension but delegated the >> + * exception by default. >> + */ >> + return misaligned_traps_delegated || >> + all_cpus_unaligned_scalar_access_emulated(); >> +} >> +EXPORT_SYMBOL_GPL(misaligned_traps_can_delegate); >> \ No newline at end of file > > Check your editor settings. I just enabled EditorConfig as well as clang-format so hopefully that will be ok in the next series. Thanks, Cl?ment > >> -- >> 2.47.2 > > Reviewed-by: Andrew Jones From ajones at ventanamicro.com Fri Mar 14 05:02:15 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Fri, 14 Mar 2025 13:02:15 +0100 Subject: [PATCH v3 02/17] riscv: sbi: add FWFT extension interface In-Reply-To: References: <20250310151229.2365992-1-cleger@rivosinc.com> <20250310151229.2365992-3-cleger@rivosinc.com> <20250313-5c22df0c08337905367fa125@orel> Message-ID: <20250314-10d8d58329aceac21e727ebe@orel> On Fri, Mar 14, 2025 at 12:33:55PM +0100, Cl?ment L?ger wrote: > > > On 13/03/2025 13:39, Andrew Jones wrote: > > On Mon, Mar 10, 2025 at 04:12:09PM +0100, Cl?ment L?ger wrote: > >> This SBI extensions enables supervisor mode to control feature that are > >> under M-mode control (For instance, Svadu menvcfg ADUE bit, Ssdbltrp > >> DTE, etc). 
> >> > >> Signed-off-by: Cl?ment L?ger > >> --- > >> arch/riscv/include/asm/sbi.h | 5 ++ > >> arch/riscv/kernel/sbi.c | 97 ++++++++++++++++++++++++++++++++++++ > >> 2 files changed, 102 insertions(+) > >> > >> diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h > >> index bb077d0c912f..fc87c609c11a 100644 > >> --- a/arch/riscv/include/asm/sbi.h > >> +++ b/arch/riscv/include/asm/sbi.h > >> @@ -503,6 +503,11 @@ int sbi_remote_hfence_vvma_asid(const struct cpumask *cpu_mask, > >> unsigned long asid); > >> long sbi_probe_extension(int ext); > >> > >> +int sbi_fwft_all_cpus_set(u32 feature, unsigned long value, unsigned long flags, > >> + bool revert_on_failure); > >> +int sbi_fwft_get(u32 feature, unsigned long *value); > >> +int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags); > >> + > >> /* Check if current SBI specification version is 0.1 or not */ > >> static inline int sbi_spec_is_0_1(void) > >> { > >> diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c > >> index 1989b8cade1b..256910db1307 100644 > >> --- a/arch/riscv/kernel/sbi.c > >> +++ b/arch/riscv/kernel/sbi.c > >> @@ -299,6 +299,103 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask, > >> return 0; > >> } > >> > >> +int sbi_fwft_get(u32 feature, unsigned long *value) > >> +{ > >> + return -EOPNOTSUPP; > >> +} > >> + > >> +/** > >> + * sbi_fwft_set() - Set a feature on all online cpus > > > > copy+paste of description from sbi_fwft_all_cpus_set(). This function > > only sets the feature on the calling hart. > > > >> + * @feature: The feature to be set > >> + * @value: The feature value to be set > >> + * @flags: FWFT feature set flags > >> + * > >> + * Return: 0 on success, appropriate linux error code otherwise. 
> >> + */ > >> +int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags) > >> +{ > >> + return -EOPNOTSUPP; > >> +} > >> + > >> +struct fwft_set_req { > >> + u32 feature; > >> + unsigned long value; > >> + unsigned long flags; > >> + cpumask_t mask; > >> +}; > >> + > >> +static void cpu_sbi_fwft_set(void *arg) > >> +{ > >> + struct fwft_set_req *req = arg; > >> + > >> + if (sbi_fwft_set(req->feature, req->value, req->flags)) > >> + cpumask_clear_cpu(smp_processor_id(), &req->mask); > >> +} > >> + > >> +static int sbi_fwft_feature_local_set(u32 feature, unsigned long value, > >> + unsigned long flags, > >> + bool revert_on_fail) > >> +{ > >> + int ret; > >> + unsigned long prev_value; > >> + cpumask_t tmp; > >> + struct fwft_set_req req = { > >> + .feature = feature, > >> + .value = value, > >> + .flags = flags, > >> + }; > >> + > >> + cpumask_copy(&req.mask, cpu_online_mask); > >> + > >> + /* We can not revert if features are locked */ > >> + if (revert_on_fail && flags & SBI_FWFT_SET_FLAG_LOCK) > > > > Should use () around the flags &. I thought checkpatch complained about > > that? > > > >> + return -EINVAL; > >> + > >> + /* Reset value is the same for all cpus, read it once. */ > > > > How do we know we're reading the reset value? sbi_fwft_all_cpus_set() may > > be called multiple times on the same feature. And harts may have had > > sbi_fwft_set() called on them independently. I think we should drop the > > whole prev_value optimization. > > That's actually used for revert_on_failure as well not only the > optimization. At least the comment should drop the word 'Reset' and if there's a chance that not all harts having the same value then we should call get on all of them. (We'll probably want SBI FWFT functions which operate on hartmasks eventually.) 
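
To make the point about per-hart values concrete, here is a userspace sketch
(an assumption-laden model, not kernel code): each hart's current value is
recorded before the update, so a later revert restores every hart to its own
previous state instead of to a single value read on one hart. The array
demo_hart_value[] and the demo_* helpers are hypothetical stand-ins for one
get/set call per hart.

```c
#include <stddef.h>

#define DEMO_NR_HARTS 4

/* Hypothetical per-hart feature state (one slot per simulated hart). */
static unsigned long demo_hart_value[DEMO_NR_HARTS];

/* Record every hart's current value before touching anything. */
static void demo_snapshot(unsigned long prev[DEMO_NR_HARTS])
{
	for (size_t i = 0; i < DEMO_NR_HARTS; i++)
		prev[i] = demo_hart_value[i];
}

/* Write the same value on all simulated harts. */
static void demo_write_all(unsigned long value)
{
	for (size_t i = 0; i < DEMO_NR_HARTS; i++)
		demo_hart_value[i] = value;
}

/* Restore each hart to the value recorded for that hart. */
static void demo_revert(const unsigned long prev[DEMO_NR_HARTS])
{
	for (size_t i = 0; i < DEMO_NR_HARTS; i++)
		demo_hart_value[i] = prev[i];
}
```

If the harts start out with different values, reverting from a single
"previous value" would silently change some of them; the per-hart snapshot
avoids that at the cost of one read per hart.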
> > > > >> + ret = sbi_fwft_get(feature, &prev_value); > >> + if (ret) > >> + return ret; > >> + > >> + /* Feature might already be set to the value we want */ > >> + if (prev_value == value) > >> + return 0; > >> + > >> + on_each_cpu_mask(&req.mask, cpu_sbi_fwft_set, &req, 1); > >> + if (cpumask_equal(&req.mask, cpu_online_mask)) > >> + return 0; > >> + > >> + pr_err("Failed to set feature %x for all online cpus, reverting\n", > >> + feature); > > > > nit: I'd let the above line stick out. We have 100 chars. > > > >> + > >> + req.value = prev_value; > >> + cpumask_copy(&tmp, &req.mask); > >> + on_each_cpu_mask(&req.mask, cpu_sbi_fwft_set, &req, 1); > >> + if (cpumask_equal(&req.mask, &tmp)) > >> + return 0; > > > > I'm not sure we want the revert_on_fail support either. What happens when > > the revert fails and we return -EINVAL below? Also returning zero when > > revert succeeds means the caller won't know if we successfully set what > > we wanted or just successfully reverted. > > So that might actually be needed for features that needs to be enabled > on all hart or not enabled at all. If we fail to enable all of them, > them the hart will be in some non coherent state between the harts. > The returned error code though is wrong and I'm not sure we would have a > way to gracefully handle revertion failure (except maybe panicking ?). How about offlining all harts which don't have the desired state, along with complaining loudly to the boot log. 
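
One way to picture the suggestion above, as a userspace sketch under stated
assumptions rather than a proposed implementation: attempt the set on every
hart and hand back a bitmask of the harts that did not end up with the wanted
value, so the caller can decide what to do with them (for instance take them
offline and log the fact). struct demo_hart, demo_hart_set() and
demo_set_on_all() are hypothetical names, and the failure is simulated with a
flag.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NR_HARTS 4

struct demo_hart {
	unsigned long value;	/* current feature value on this hart */
	bool set_fails;		/* simulate firmware rejecting the set */
};

/* Simulated per-hart set: fails without changing the value if flagged. */
static int demo_hart_set(struct demo_hart *hart, unsigned long value)
{
	if (hart->set_fails)
		return -EIO;
	hart->value = value;
	return 0;
}

/*
 * Try to set @value on @nr harts (nr <= 32) and return a bitmask of the
 * harts that do not hold @value afterwards. 0 means all harts agree.
 */
static uint32_t demo_set_on_all(struct demo_hart *harts, unsigned int nr,
				unsigned long value)
{
	uint32_t mismatched = 0;
	unsigned int i;

	for (i = 0; i < nr; i++) {
		/* A failed set is harmless if the hart already has the value. */
		if (demo_hart_set(&harts[i], value) && harts[i].value != value)
			mismatched |= UINT32_C(1) << i;
	}

	return mismatched;
}
```

Returning the mask instead of a single error code lets the caller tell "set
everywhere" (0) apart from "some harts differ", which is the ambiguity raised
about returning zero after a successful revert.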
Thanks, drew From cleger at rivosinc.com Fri Mar 14 05:23:25 2025 From: cleger at rivosinc.com (=?UTF-8?B?Q2zDqW1lbnQgTMOpZ2Vy?=) Date: Fri, 14 Mar 2025 13:23:25 +0100 Subject: [PATCH v3 02/17] riscv: sbi: add FWFT extension interface In-Reply-To: <20250314-10d8d58329aceac21e727ebe@orel> References: <20250310151229.2365992-1-cleger@rivosinc.com> <20250310151229.2365992-3-cleger@rivosinc.com> <20250313-5c22df0c08337905367fa125@orel> <20250314-10d8d58329aceac21e727ebe@orel> Message-ID: <8990d87e-b46e-4c1e-b337-a954563f6474@rivosinc.com> On 14/03/2025 13:02, Andrew Jones wrote: > On Fri, Mar 14, 2025 at 12:33:55PM +0100, Cl?ment L?ger wrote: >> >> >> On 13/03/2025 13:39, Andrew Jones wrote: >>> On Mon, Mar 10, 2025 at 04:12:09PM +0100, Cl?ment L?ger wrote: >>>> This SBI extensions enables supervisor mode to control feature that are >>>> under M-mode control (For instance, Svadu menvcfg ADUE bit, Ssdbltrp >>>> DTE, etc). >>>> >>>> Signed-off-by: Cl?ment L?ger >>>> --- >>>> arch/riscv/include/asm/sbi.h | 5 ++ >>>> arch/riscv/kernel/sbi.c | 97 ++++++++++++++++++++++++++++++++++++ >>>> 2 files changed, 102 insertions(+) >>>> >>>> diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h >>>> index bb077d0c912f..fc87c609c11a 100644 >>>> --- a/arch/riscv/include/asm/sbi.h >>>> +++ b/arch/riscv/include/asm/sbi.h >>>> @@ -503,6 +503,11 @@ int sbi_remote_hfence_vvma_asid(const struct cpumask *cpu_mask, >>>> unsigned long asid); >>>> long sbi_probe_extension(int ext); >>>> >>>> +int sbi_fwft_all_cpus_set(u32 feature, unsigned long value, unsigned long flags, >>>> + bool revert_on_failure); >>>> +int sbi_fwft_get(u32 feature, unsigned long *value); >>>> +int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags); >>>> + >>>> /* Check if current SBI specification version is 0.1 or not */ >>>> static inline int sbi_spec_is_0_1(void) >>>> { >>>> diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c >>>> index 1989b8cade1b..256910db1307 100644 
>>>> --- a/arch/riscv/kernel/sbi.c >>>> +++ b/arch/riscv/kernel/sbi.c >>>> @@ -299,6 +299,103 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask, >>>> return 0; >>>> } >>>> >>>> +int sbi_fwft_get(u32 feature, unsigned long *value) >>>> +{ >>>> + return -EOPNOTSUPP; >>>> +} >>>> + >>>> +/** >>>> + * sbi_fwft_set() - Set a feature on all online cpus >>> >>> copy+paste of description from sbi_fwft_all_cpus_set(). This function >>> only sets the feature on the calling hart. >>> >>>> + * @feature: The feature to be set >>>> + * @value: The feature value to be set >>>> + * @flags: FWFT feature set flags >>>> + * >>>> + * Return: 0 on success, appropriate linux error code otherwise. >>>> + */ >>>> +int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags) >>>> +{ >>>> + return -EOPNOTSUPP; >>>> +} >>>> + >>>> +struct fwft_set_req { >>>> + u32 feature; >>>> + unsigned long value; >>>> + unsigned long flags; >>>> + cpumask_t mask; >>>> +}; >>>> + >>>> +static void cpu_sbi_fwft_set(void *arg) >>>> +{ >>>> + struct fwft_set_req *req = arg; >>>> + >>>> + if (sbi_fwft_set(req->feature, req->value, req->flags)) >>>> + cpumask_clear_cpu(smp_processor_id(), &req->mask); >>>> +} >>>> + >>>> +static int sbi_fwft_feature_local_set(u32 feature, unsigned long value, >>>> + unsigned long flags, >>>> + bool revert_on_fail) >>>> +{ >>>> + int ret; >>>> + unsigned long prev_value; >>>> + cpumask_t tmp; >>>> + struct fwft_set_req req = { >>>> + .feature = feature, >>>> + .value = value, >>>> + .flags = flags, >>>> + }; >>>> + >>>> + cpumask_copy(&req.mask, cpu_online_mask); >>>> + >>>> + /* We can not revert if features are locked */ >>>> + if (revert_on_fail && flags & SBI_FWFT_SET_FLAG_LOCK) >>> >>> Should use () around the flags &. I thought checkpatch complained about >>> that? >>> >>>> + return -EINVAL; >>>> + >>>> + /* Reset value is the same for all cpus, read it once. */ >>> >>> How do we know we're reading the reset value? 
sbi_fwft_all_cpus_set() may >>> be called multiple times on the same feature. And harts may have had >>> sbi_fwft_set() called on them independently. I think we should drop the >>> whole prev_value optimization. >> >> That's actually used for revert_on_failure as well not only the >> optimization. > > At least the comment should drop the word 'Reset' and if there's a chance > that not all harts having the same value then we should call get on all > of them. (We'll probably want SBI FWFT functions which operate on > hartmasks eventually.) Ok, then I can pass a cpu_mask as well so that caller just have to pass online_cpus() if they want it on all cpus. > >> >>> >>>> + ret = sbi_fwft_get(feature, &prev_value); >>>> + if (ret) >>>> + return ret; >>>> + >>>> + /* Feature might already be set to the value we want */ >>>> + if (prev_value == value) >>>> + return 0; >>>> + >>>> + on_each_cpu_mask(&req.mask, cpu_sbi_fwft_set, &req, 1); >>>> + if (cpumask_equal(&req.mask, cpu_online_mask)) >>>> + return 0; >>>> + >>>> + pr_err("Failed to set feature %x for all online cpus, reverting\n", >>>> + feature); >>> >>> nit: I'd let the above line stick out. We have 100 chars. >>> >>>> + >>>> + req.value = prev_value; >>>> + cpumask_copy(&tmp, &req.mask); >>>> + on_each_cpu_mask(&req.mask, cpu_sbi_fwft_set, &req, 1); >>>> + if (cpumask_equal(&req.mask, &tmp)) >>>> + return 0; >>> >>> I'm not sure we want the revert_on_fail support either. What happens when >>> the revert fails and we return -EINVAL below? Also returning zero when >>> revert succeeds means the caller won't know if we successfully set what >>> we wanted or just successfully reverted. >> >> So that might actually be needed for features that needs to be enabled >> on all hart or not enabled at all. If we fail to enable all of them, >> them the hart will be in some non coherent state between the harts. 
>> The returned error code though is wrong and I'm not sure we would have a >> way to gracefully handle revertion failure (except maybe panicking ?). > > How about offlining all harts which don't have the desired state, along > with complaining loudly to the boot log. > > Thanks, > drew From cleger at rivosinc.com Fri Mar 14 06:53:58 2025 From: cleger at rivosinc.com (=?UTF-8?B?Q2zDqW1lbnQgTMOpZ2Vy?=) Date: Fri, 14 Mar 2025 14:53:58 +0100 Subject: [PATCH v3 14/17] RISC-V: KVM: add SBI extension init()/deinit() functions In-Reply-To: <20250313-f08cee46c912f729d1829d37@orel> References: <20250310151229.2365992-1-cleger@rivosinc.com> <20250310151229.2365992-15-cleger@rivosinc.com> <20250313-f08cee46c912f729d1829d37@orel> Message-ID: On 13/03/2025 15:27, Andrew Jones wrote: > On Mon, Mar 10, 2025 at 04:12:21PM +0100, Cl?ment L?ger wrote: >> The FWFT SBI extension will need to dynamically allocate memory and do >> init time specific initialization. Add an init/deinit callbacks that >> allows to do so. >> >> Signed-off-by: Cl?ment L?ger >> --- >> arch/riscv/include/asm/kvm_vcpu_sbi.h | 9 +++++++++ >> arch/riscv/kvm/vcpu.c | 2 ++ >> arch/riscv/kvm/vcpu_sbi.c | 29 +++++++++++++++++++++++++++ >> 3 files changed, 40 insertions(+) >> >> diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h >> index 4ed6203cdd30..bcb90757b149 100644 >> --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h >> +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h >> @@ -49,6 +49,14 @@ struct kvm_vcpu_sbi_extension { >> >> /* Extension specific probe function */ >> unsigned long (*probe)(struct kvm_vcpu *vcpu); >> + >> + /* >> + * Init/deinit function called once during VCPU init/destroy. These >> + * might be use if the SBI extensions need to allocate or do specific >> + * init time only configuration. 
>> + */ >> + int (*init)(struct kvm_vcpu *vcpu); >> + void (*deinit)(struct kvm_vcpu *vcpu); >> }; >> >> void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run); >> @@ -69,6 +77,7 @@ const struct kvm_vcpu_sbi_extension *kvm_vcpu_sbi_find_ext( >> bool riscv_vcpu_supports_sbi_ext(struct kvm_vcpu *vcpu, int idx); >> int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run); >> void kvm_riscv_vcpu_sbi_init(struct kvm_vcpu *vcpu); >> +void kvm_riscv_vcpu_sbi_deinit(struct kvm_vcpu *vcpu); >> >> int kvm_riscv_vcpu_get_reg_sbi_sta(struct kvm_vcpu *vcpu, unsigned long reg_num, >> unsigned long *reg_val); >> diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c >> index 60d684c76c58..877bcc85c067 100644 >> --- a/arch/riscv/kvm/vcpu.c >> +++ b/arch/riscv/kvm/vcpu.c >> @@ -185,6 +185,8 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu) >> >> void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu) >> { >> + kvm_riscv_vcpu_sbi_deinit(vcpu); >> + >> /* Cleanup VCPU AIA context */ >> kvm_riscv_vcpu_aia_deinit(vcpu); >> >> diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c >> index d1c83a77735e..858ddefd7e7f 100644 >> --- a/arch/riscv/kvm/vcpu_sbi.c >> +++ b/arch/riscv/kvm/vcpu_sbi.c >> @@ -505,8 +505,37 @@ void kvm_riscv_vcpu_sbi_init(struct kvm_vcpu *vcpu) >> continue; >> } >> >> + if (!ext->default_disabled && ext->init && >> + ext->init(vcpu) != 0) { >> + scontext->ext_status[idx] = KVM_RISCV_SBI_EXT_STATUS_UNAVAILABLE; >> + continue; >> + } > > I think this new block should be below the assignment below (and it can > drop the continue) and it shouldn't check default_disabled (as I've done > below). IOW, we should always run ext->init when there is one to run here. > Otherwise, I how will it get run later? Ok, i did not saw that there was a possibility to enable the extension at a later time. I'll fix that. Thanks, Cl?ment > >> + >> scontext->ext_status[idx] = ext->default_disabled ? 
>> KVM_RISCV_SBI_EXT_STATUS_DISABLED : >> KVM_RISCV_SBI_EXT_STATUS_ENABLED; > > if (ext->init && ext->init(vcpu)) > scontext->ext_status[idx] = KVM_RISCV_SBI_EXT_STATUS_UNAVAILABLE; > >> } >> } >> + >> +void kvm_riscv_vcpu_sbi_deinit(struct kvm_vcpu *vcpu) >> +{ >> + struct kvm_vcpu_sbi_context *scontext = &vcpu->arch.sbi_context; >> + const struct kvm_riscv_sbi_extension_entry *entry; >> + const struct kvm_vcpu_sbi_extension *ext; >> + int idx, i; >> + >> + for (i = 0; i < ARRAY_SIZE(sbi_ext); i++) { >> + entry = &sbi_ext[i]; >> + ext = entry->ext_ptr; >> + idx = entry->ext_idx; >> + >> + if (idx < 0 || idx >= ARRAY_SIZE(scontext->ext_status)) >> + continue; >> + >> + if (scontext->ext_status[idx] == KVM_RISCV_SBI_EXT_STATUS_UNAVAILABLE || >> + !ext->deinit) >> + continue; >> + >> + ext->deinit(vcpu); >> + } >> +} >> -- >> 2.47.2 >> > > Thanks, > drew From andrew.jones at linux.dev Fri Mar 14 07:11:14 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Fri, 14 Mar 2025 15:11:14 +0100 Subject: [kvm-unit-tests PATCH v9 6/6] riscv: sbi: Add SSE extension tests In-Reply-To: <20250314111030.3728671-7-cleger@rivosinc.com> References: <20250314111030.3728671-1-cleger@rivosinc.com> <20250314111030.3728671-7-cleger@rivosinc.com> Message-ID: <20250314-0940c2c0dcd92b285f43e4ca@orel> On Fri, Mar 14, 2025 at 12:10:29PM +0100, Cl?ment L?ger wrote: ... 
> +static void sse_test_inject_local(uint32_t event_id) > +{ > + int cpu; > + uint64_t timeout; > + struct sbiret ret; > + struct sse_local_per_cpu *cpu_args, *cpu_arg; > + struct sse_foreign_cpu_test_arg *handler_arg; > + > + cpu_args = calloc(NR_CPUS, sizeof(struct sbi_sse_handler_arg)); > + > + report_prefix_push("local_dispatch"); > + for_each_online_cpu(cpu) { > + cpu_arg = &cpu_args[cpu]; > + cpu_arg->handler_arg.event_id = event_id; > + cpu_arg->args.stack = sse_alloc_stack(); > + cpu_arg->args.handler = sse_foreign_cpu_handler; > + cpu_arg->args.handler_data = (void *)&cpu_arg->handler_arg; > + cpu_arg->state = SBI_SSE_STATE_UNUSED; > + } > + > + on_cpus(sse_register_enable_local, cpu_args); > + for_each_online_cpu(cpu) { > + cpu_arg = &cpu_args[cpu]; > + ret = cpu_arg->ret; > + if (ret.error) { > + report_fail("CPU failed to register/enable event: %ld", ret.error); > + goto cleanup; > + } > + > + handler_arg = &cpu_arg->handler_arg; > + WRITE_ONCE(handler_arg->expected_cpu, cpu); > + /* For handler_arg content to be visible for other CPUs */ > + smp_wmb(); > + ret = sbi_sse_inject(event_id, cpus[cpu].hartid); > + if (ret.error) { > + report_fail("CPU failed to inject event: %ld", ret.error); > + goto cleanup; > + } > + } > + > + for_each_online_cpu(cpu) { > + handler_arg = &cpu_args[cpu].handler_arg; > + smp_rmb(); > + > + timeout = sse_event_get_complete_timeout(); > + while (!READ_ONCE(handler_arg->done) || timer_get_cycles() < timeout) { I pointed this out in the last review, the || should be a &&. We don't want to keep waiting until we reach the timeout if we get the done signal earlier. 
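The corrected condition can be sketched in standalone C as follows. This is only an illustration, not the kvm-unit-tests code: `fake_cycles()`, `fake_now` and `wait_for_done()` are hypothetical stand-ins for `timer_get_cycles()` and the handler's `done` flag. The loop keeps spinning only while the flag is unset and the deadline has not passed, so it leaves as soon as either one changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for timer_get_cycles(): advances by one per call. */
uint64_t fake_now;

uint64_t fake_cycles(void)
{
	return fake_now++;
}

/*
 * Spin until *done becomes true or the deadline passes.
 * Returns the final value of *done so the caller can report it.
 */
bool wait_for_done(const volatile bool *done, uint64_t deadline)
{
	assert(done != NULL);

	/* '&&': keep waiting only while not done AND not yet timed out. */
	while (!*done && fake_cycles() < deadline)
		;	/* the real loop would call smp_rmb() and cpu_relax() here */

	return *done;
}
```

If the flag is already set, the clock is never consulted and the function returns immediately. With `||` in place of `&&`, the same call would keep spinning until the deadline even though the event had been handled, and would never time out while the flag stayed unset.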
> + /* For handler_arg update to be visible */ > + smp_rmb(); > + cpu_relax(); > + } > + report(READ_ONCE(handler_arg->done), "Event handled"); > + WRITE_ONCE(handler_arg->done, false); > + } > + > +cleanup: > + on_cpus(sbi_sse_disable_unregister_local, cpu_args); > + for_each_online_cpu(cpu) { > + cpu_arg = &cpu_args[cpu]; > + ret = READ_ONCE(cpu_arg->ret); > + if (ret.error) > + report_fail("CPU failed to disable/unregister event: %ld", ret.error); > + } > + > + for_each_online_cpu(cpu) { > + cpu_arg = &cpu_args[cpu]; > + sse_free_stack(cpu_arg->args.stack); > + } > + > + report_prefix_pop(); > +} > + > +static void sse_test_inject_global_cpu(uint32_t event_id, unsigned int cpu, > + struct sse_foreign_cpu_test_arg *test_arg) > +{ > + unsigned long value; > + struct sbiret ret; > + uint64_t timeout; > + enum sbi_sse_state state; > + > + WRITE_ONCE(test_arg->expected_cpu, cpu); > + /* For test_arg content to be visible for other CPUs */ > + smp_wmb(); > + value = cpu; > + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); > + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Set preferred hart")) > + return; > + > + ret = sbi_sse_enable(event_id); > + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Enable event")) > + return; > + > + ret = sbi_sse_inject(event_id, cpu); > + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Inject event")) > + goto disable; > + > + smp_rmb(); > + timeout = sse_event_get_complete_timeout(); > + while (!READ_ONCE(test_arg->done) || timer_get_cycles() < timeout) { same comment > + /* For shared test_arg structure */ > + smp_rmb(); > + cpu_relax(); > + } > + > + report(READ_ONCE(test_arg->done), "event handler called"); > + WRITE_ONCE(test_arg->done, false); > + > + timeout = sse_event_get_complete_timeout(); > + /* Wait for event to be back in ENABLED state */ > + do { > + ret = sse_event_get_state(event_id, &state); > + if (ret.error) > + goto disable; > + cpu_relax(); > + } while (state != SBI_SSE_STATE_ENABLED || 
timer_get_cycles() < timeout); same comment Otherwise, Reviewed-by: Andrew Jones Thanks, drew From cleger at rivosinc.com Fri Mar 14 07:13:07 2025 From: cleger at rivosinc.com (=?UTF-8?B?Q2zDqW1lbnQgTMOpZ2Vy?=) Date: Fri, 14 Mar 2025 15:13:07 +0100 Subject: [kvm-unit-tests PATCH v9 6/6] riscv: sbi: Add SSE extension tests In-Reply-To: <20250314-0940c2c0dcd92b285f43e4ca@orel> References: <20250314111030.3728671-1-cleger@rivosinc.com> <20250314111030.3728671-7-cleger@rivosinc.com> <20250314-0940c2c0dcd92b285f43e4ca@orel> Message-ID: <56451a0e-7971-4619-8b2e-84c5129547b4@rivosinc.com> On 14/03/2025 15:11, Andrew Jones wrote: > On Fri, Mar 14, 2025 at 12:10:29PM +0100, Cl?ment L?ger wrote: > ... >> +static void sse_test_inject_local(uint32_t event_id) >> +{ >> + int cpu; >> + uint64_t timeout; >> + struct sbiret ret; >> + struct sse_local_per_cpu *cpu_args, *cpu_arg; >> + struct sse_foreign_cpu_test_arg *handler_arg; >> + >> + cpu_args = calloc(NR_CPUS, sizeof(struct sbi_sse_handler_arg)); >> + >> + report_prefix_push("local_dispatch"); >> + for_each_online_cpu(cpu) { >> + cpu_arg = &cpu_args[cpu]; >> + cpu_arg->handler_arg.event_id = event_id; >> + cpu_arg->args.stack = sse_alloc_stack(); >> + cpu_arg->args.handler = sse_foreign_cpu_handler; >> + cpu_arg->args.handler_data = (void *)&cpu_arg->handler_arg; >> + cpu_arg->state = SBI_SSE_STATE_UNUSED; >> + } >> + >> + on_cpus(sse_register_enable_local, cpu_args); >> + for_each_online_cpu(cpu) { >> + cpu_arg = &cpu_args[cpu]; >> + ret = cpu_arg->ret; >> + if (ret.error) { >> + report_fail("CPU failed to register/enable event: %ld", ret.error); >> + goto cleanup; >> + } >> + >> + handler_arg = &cpu_arg->handler_arg; >> + WRITE_ONCE(handler_arg->expected_cpu, cpu); >> + /* For handler_arg content to be visible for other CPUs */ >> + smp_wmb(); >> + ret = sbi_sse_inject(event_id, cpus[cpu].hartid); >> + if (ret.error) { >> + report_fail("CPU failed to inject event: %ld", ret.error); >> + goto cleanup; >> + } >> + } >> + >> + 
for_each_online_cpu(cpu) { >> + handler_arg = &cpu_args[cpu].handler_arg; >> + smp_rmb(); >> + >> + timeout = sse_event_get_complete_timeout(); >> + while (!READ_ONCE(handler_arg->done) || timer_get_cycles() < timeout) { > > I pointed this out in the last review, the || should be a &&. We don't > want to keep waiting until we reach the timeout if we get the done signal > earlier. Missed it. I'll resend a V10 if you don't have any other comments. Thanks, Cl?ment > >> + /* For handler_arg update to be visible */ >> + smp_rmb(); >> + cpu_relax(); >> + } >> + report(READ_ONCE(handler_arg->done), "Event handled"); >> + WRITE_ONCE(handler_arg->done, false); >> + } >> + >> +cleanup: >> + on_cpus(sbi_sse_disable_unregister_local, cpu_args); >> + for_each_online_cpu(cpu) { >> + cpu_arg = &cpu_args[cpu]; >> + ret = READ_ONCE(cpu_arg->ret); >> + if (ret.error) >> + report_fail("CPU failed to disable/unregister event: %ld", ret.error); >> + } >> + >> + for_each_online_cpu(cpu) { >> + cpu_arg = &cpu_args[cpu]; >> + sse_free_stack(cpu_arg->args.stack); >> + } >> + >> + report_prefix_pop(); >> +} >> + >> +static void sse_test_inject_global_cpu(uint32_t event_id, unsigned int cpu, >> + struct sse_foreign_cpu_test_arg *test_arg) >> +{ >> + unsigned long value; >> + struct sbiret ret; >> + uint64_t timeout; >> + enum sbi_sse_state state; >> + >> + WRITE_ONCE(test_arg->expected_cpu, cpu); >> + /* For test_arg content to be visible for other CPUs */ >> + smp_wmb(); >> + value = cpu; >> + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Set preferred hart")) >> + return; >> + >> + ret = sbi_sse_enable(event_id); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Enable event")) >> + return; >> + >> + ret = sbi_sse_inject(event_id, cpu); >> + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Inject event")) >> + goto disable; >> + >> + smp_rmb(); >> + timeout = sse_event_get_complete_timeout(); >> + while 
(!READ_ONCE(test_arg->done) || timer_get_cycles() < timeout) { > > same comment > >> + /* For shared test_arg structure */ >> + smp_rmb(); >> + cpu_relax(); >> + } >> + >> + report(READ_ONCE(test_arg->done), "event handler called"); >> + WRITE_ONCE(test_arg->done, false); >> + >> + timeout = sse_event_get_complete_timeout(); >> + /* Wait for event to be back in ENABLED state */ >> + do { >> + ret = sse_event_get_state(event_id, &state); >> + if (ret.error) >> + goto disable; >> + cpu_relax(); >> + } while (state != SBI_SSE_STATE_ENABLED || timer_get_cycles() < timeout); > > same comment > > Otherwise, > > Reviewed-by: Andrew Jones > > Thanks, > drew From andrew.jones at linux.dev Fri Mar 14 07:19:14 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Fri, 14 Mar 2025 15:19:14 +0100 Subject: [kvm-unit-tests PATCH v9 0/6] riscv: add SBI SSE extension tests In-Reply-To: <20250314111030.3728671-1-cleger@rivosinc.com> References: <20250314111030.3728671-1-cleger@rivosinc.com> Message-ID: <20250314-eb8cf0b719942c912e254ab2@orel> On Fri, Mar 14, 2025 at 12:10:23PM +0100, Cl?ment L?ger wrote: > This series adds tests for SBI SSE extension as well as needed > infrastructure for SSE support. It also adds test specific asm-offsets > generation to use custom OFFSET and DEFINE from the test directory. Is there an opensbi branch I should be using to test this? There are currently 54 failures reported with opensbi's master branch, and, with opensbi v1.5.1, which is the version provided by qemu's master branch, I get a crash which leads to a recursive stack walk. The crash occurs in what I'm guessing is sbi_sse_inject() by the last successful output. I can't merge this without it skipping/kfailing with qemu's opensbi, otherwise it'll fail CI. We could change CI to be more tolerant, but I'd rather use kfail instead, and of course not crash. 
Thanks,
drew

From andrew.jones at linux.dev Fri Mar 14 07:20:08 2025
From: andrew.jones at linux.dev (Andrew Jones)
Date: Fri, 14 Mar 2025 15:20:08 +0100
Subject: [kvm-unit-tests PATCH v9 6/6] riscv: sbi: Add SSE extension tests
In-Reply-To: <56451a0e-7971-4619-8b2e-84c5129547b4@rivosinc.com>
References: <20250314111030.3728671-1-cleger@rivosinc.com> <20250314111030.3728671-7-cleger@rivosinc.com> <20250314-0940c2c0dcd92b285f43e4ca@orel> <56451a0e-7971-4619-8b2e-84c5129547b4@rivosinc.com>
Message-ID: <20250314-378268547f75d0337f1a7836@orel>

On Fri, Mar 14, 2025 at 03:13:07PM +0100, Clément Léger wrote:
...
> Missed it. I'll resend a V10 if you don't have any other comments.

Other comments are on the cover letter.

Thanks,
drew

From cleger at rivosinc.com Fri Mar 14 07:23:39 2025
From: cleger at rivosinc.com (Clément Léger)
Date: Fri, 14 Mar 2025 15:23:39 +0100
Subject: [kvm-unit-tests PATCH v9 0/6] riscv: add SBI SSE extension tests
In-Reply-To: <20250314-eb8cf0b719942c912e254ab2@orel>
References: <20250314111030.3728671-1-cleger@rivosinc.com> <20250314-eb8cf0b719942c912e254ab2@orel>
Message-ID:

On 14/03/2025 15:19, Andrew Jones wrote:
> On Fri, Mar 14, 2025 at 12:10:23PM +0100, Clément Léger wrote:
>> This series adds tests for SBI SSE extension as well as needed
>> infrastructure for SSE support. It also adds test specific asm-offsets
>> generation to use custom OFFSET and DEFINE from the test directory.
>
> Is there an opensbi branch I should be using to test this? There are
> currently 54 failures reported with opensbi's master branch, and, with
> opensbi v1.5.1, which is the version provided by qemu's master branch,
> I get a crash which leads to a recursive stack walk. The crash occurs
> in what I'm guessing is sbi_sse_inject() by the last successful output.

Yeah that's due to a6/a7 being inverted at injection time.

>
> I can't merge this without it skipping/kfailing with qemu's opensbi,
> otherwise it'll fail CI.
We could change CI to be more tolerant, but I'd
> rather use kfail instead, and of course not crash.

Yes, the branch dev/cleger/sse on github can be used:

https://github.com/rivosinc/opensbi/tree/dev/cleger/sse

I'm waiting for the specification error changes to be merged before
sending this one.

Thanks,
Clément

>
> Thanks,
> drew

From andrew.jones at linux.dev Fri Mar 14 07:34:29 2025
From: andrew.jones at linux.dev (Andrew Jones)
Date: Fri, 14 Mar 2025 15:34:29 +0100
Subject: [kvm-unit-tests PATCH v9 0/6] riscv: add SBI SSE extension tests
In-Reply-To:
References: <20250314111030.3728671-1-cleger@rivosinc.com> <20250314-eb8cf0b719942c912e254ab2@orel>
Message-ID: <20250314-2452599f59b82e53e99100e7@orel>

On Fri, Mar 14, 2025 at 03:23:39PM +0100, Clément Léger wrote:
>
>
> On 14/03/2025 15:19, Andrew Jones wrote:
> > On Fri, Mar 14, 2025 at 12:10:23PM +0100, Clément Léger wrote:
> >> This series adds tests for SBI SSE extension as well as needed
> >> infrastructure for SSE support. It also adds test specific asm-offsets
> >> generation to use custom OFFSET and DEFINE from the test directory.
> >
> > Is there an opensbi branch I should be using to test this? There are
> > currently 54 failures reported with opensbi's master branch, and, with
> > opensbi v1.5.1, which is the version provided by qemu's master branch,
> > I get a crash which leads to a recursive stack walk. The crash occurs
> > in what I'm guessing is sbi_sse_inject() by the last successful output.
>
> Yeah that's due to a6/a7 being inverted at injection time.
>
> >
> > I can't merge this without it skipping/kfailing with qemu's opensbi,
> > otherwise it'll fail CI. We could change CI to be more tolerant, but I'd
> > rather use kfail instead, and of course not crash.
>
> Yes, the branch dev/cleger/sse on github can be used:
>
> https://github.com/rivosinc/opensbi/tree/dev/cleger/sse
>
> I'm waiting for the specification error changes to be merged before
> sending this one.
While ugly, I'm not opposed to doing an SBI implementation ID and
version check at the entry of the SSE tests and then just SKIP
when we detect opensbi and not a late enough version of it. If we
go that route, then please create a separate patch that adds a
couple functions in lib/riscv/sbi allowing all sbi unit tests to
easily check for specific SBI implementations and versions. (We'll
probably want to steal the kernel's sbi_mk_version and add an enum
or defines for the SBI implementation IDs.)

Thanks,
drew

From cleger at rivosinc.com Fri Mar 14 07:40:10 2025
From: cleger at rivosinc.com (Clément Léger)
Date: Fri, 14 Mar 2025 15:40:10 +0100
Subject: [kvm-unit-tests PATCH v9 0/6] riscv: add SBI SSE extension tests
In-Reply-To: <20250314-2452599f59b82e53e99100e7@orel>
References: <20250314111030.3728671-1-cleger@rivosinc.com> <20250314-eb8cf0b719942c912e254ab2@orel> <20250314-2452599f59b82e53e99100e7@orel>
Message-ID:

On 14/03/2025 15:34, Andrew Jones wrote:
> On Fri, Mar 14, 2025 at 03:23:39PM +0100, Clément Léger wrote:
>>
>>
>> On 14/03/2025 15:19, Andrew Jones wrote:
>>> On Fri, Mar 14, 2025 at 12:10:23PM +0100, Clément Léger wrote:
>>>> This series adds tests for SBI SSE extension as well as needed
>>>> infrastructure for SSE support. It also adds test specific asm-offsets
>>>> generation to use custom OFFSET and DEFINE from the test directory.
>>>
>>> Is there an opensbi branch I should be using to test this? There are
>>> currently 54 failures reported with opensbi's master branch, and, with
>>> opensbi v1.5.1, which is the version provided by qemu's master branch,
>>> I get a crash which leads to a recursive stack walk. The crash occurs
>>> in what I'm guessing is sbi_sse_inject() by the last successful output.
>>
>> Yeah that's due to a6/a7 being inverted at injection time.
>>
>>>
>>> I can't merge this without it skipping/kfailing with qemu's opensbi,
>>> otherwise it'll fail CI.
We could change CI to be more tolerant, but I'd
>>> rather use kfail instead, and of course not crash.
>>
>> Yes, the branch dev/cleger/sse on github can be used:
>>
>> https://github.com/rivosinc/opensbi/tree/dev/cleger/sse
>>
>> I'm waiting for the specification error changes to be merged before
>> sending this one.
>
> While ugly, I'm not opposed to doing an SBI implementation ID and
> version check at the entry of the SSE tests and then just SKIP
> when we detect opensbi and not a late enough version of it. If we
> go that route, then please create a separate patch that adds a
> couple functions in lib/riscv/sbi allowing all sbi unit tests to
> easily check for specific SBI implementations and versions. (We'll
> probably want to steal the kernel's sbi_mk_version and add an enum
> or defines for the SBI implementation IDs.)

Based on the fact that it actually crashes on earlier OpenSBI version, I
feel like we are not going to cut it if we want it to be reliable on all
OpenSBI versions. I'll take a look at what you propose.

Thanks,
Clément

>
> Thanks,
> drew

From jean-philippe at linaro.org Fri Mar 14 08:49:00 2025
From: jean-philippe at linaro.org (Jean-Philippe Brucker)
Date: Fri, 14 Mar 2025 15:49:00 +0000
Subject: [kvm-unit-tests PATCH v2 0/5] arm64: Change the default QEMU CPU type to "max"
Message-ID: <20250314154904.3946484-2-jean-philippe@linaro.org>

This is v2 of the series that cleans up the configure flags and sets the
default CPU type to "max" on arm64. Since v1 [1] (sent by Alexandru) we
addressed the review comments and distinguished "--processor" from the
new "--qemu-cpu", to configure build and run flags respectively.

The default TCG CPU on arm64 was the ancient Cortex-A57. With this
change, we can test the latest features of arm64 QEMU, and for example
run Vladimir's MTE test [2].
[1] https://lore.kernel.org/all/20250110135848.35465-1-alexandru.elisei@arm.com/
[2] https://lore.kernel.org/all/20250227152240.118721-1-vladimir.murzin@arm.com/

Alexandru Elisei (3):
  configure: arm64: Don't display 'aarch64' as the default architecture
  configure: arm/arm64: Display the correct default processor
  arm64: Implement the ./configure --processor option

Jean-Philippe Brucker (2):
  configure: Add --qemu-cpu option
  arm64: Use -cpu max as the default for TCG

 scripts/mkstandalone.sh | 2 +-
 arm/run | 17 +++++++++++------
 riscv/run | 8 ++++----
 configure | 42 +++++++++++++++++++++++++++++++----------
 arm/Makefile.arm | 1 -
 arm/Makefile.common | 1 +
 6 files changed, 49 insertions(+), 22 deletions(-)

base-commit: 0cc3a351b925928827baa4b69cf0e46ff5837083
--
2.48.1

From jean-philippe at linaro.org Fri Mar 14 08:49:01 2025
From: jean-philippe at linaro.org (Jean-Philippe Brucker)
Date: Fri, 14 Mar 2025 15:49:01 +0000
Subject: [kvm-unit-tests PATCH v2 1/5] configure: arm64: Don't display 'aarch64' as the default architecture
In-Reply-To: <20250314154904.3946484-2-jean-philippe@linaro.org>
References: <20250314154904.3946484-2-jean-philippe@linaro.org>
Message-ID: <20250314154904.3946484-3-jean-philippe@linaro.org>

From: Alexandru Elisei

--arch=aarch64, intentional or not, has been supported since the initial
arm64 support, commit 39ac3f8494be ("arm64: initial drop"). However,
"aarch64" does not show up in the list of supported architectures, but
it's displayed as the default architecture if doing ./configure --help
on an arm64 machine.

Keep everything consistent and make sure that the default value for
$arch is "arm64", but still allow --arch=aarch64, in case there are
users that use this configuration for kvm-unit-tests.

The help text for --arch changes from:

    --arch=ARCH architecture to compile for (aarch64). ARCH can be one of:
                arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64

to:

    --arch=ARCH architecture to compile for (arm64).
ARCH can be one of: arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 Signed-off-by: Alexandru Elisei --- configure | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/configure b/configure index 06532a89..dc3413fc 100755 --- a/configure +++ b/configure @@ -15,8 +15,9 @@ objdump=objdump readelf=readelf ar=ar addr2line=addr2line -arch=$(uname -m | sed -e 's/i.86/i386/;s/arm64/aarch64/;s/arm.*/arm/;s/ppc64.*/ppc64/') -host=$arch +host=$(uname -m | sed -e 's/i.86/i386/;s/arm64/aarch64/;s/arm.*/arm/;s/ppc64.*/ppc64/') +arch=$host +[ "$arch" = "aarch64" ] && arch="arm64" cross_prefix= endian="" pretty_print_stacks=yes -- 2.48.1 From jean-philippe at linaro.org Fri Mar 14 08:49:02 2025 From: jean-philippe at linaro.org (Jean-Philippe Brucker) Date: Fri, 14 Mar 2025 15:49:02 +0000 Subject: [kvm-unit-tests PATCH v2 2/5] configure: arm/arm64: Display the correct default processor In-Reply-To: <20250314154904.3946484-2-jean-philippe@linaro.org> References: <20250314154904.3946484-2-jean-philippe@linaro.org> Message-ID: <20250314154904.3946484-4-jean-philippe@linaro.org> From: Alexandru Elisei The help text for the --processor option displays the architecture name as the default processor type. But the default for arm is cortex-a15, and for arm64 is cortex-a57. Teach configure to display the correct default processor type for these two architectures. 
Signed-off-by: Alexandru Elisei --- configure | 30 ++++++++++++++++++++++-------- 1 file changed, 22 insertions(+), 8 deletions(-) diff --git a/configure b/configure index dc3413fc..5306bad3 100755 --- a/configure +++ b/configure @@ -5,6 +5,24 @@ if [ -z "${BASH_VERSINFO[0]}" ] || [ "${BASH_VERSINFO[0]}" -lt 4 ] ; then exit 1 fi +function get_default_processor() +{ + local arch="$1" + + case "$arch" in + "arm") + default_processor="cortex-a15" + ;; + "arm64") + default_processor="cortex-a57" + ;; + *) + default_processor=$arch + esac + + echo "$default_processor" +} + srcdir=$(cd "$(dirname "$0")"; pwd) prefix=/usr/local cc=gcc @@ -43,13 +61,14 @@ else fi usage() { + [ -z "$processor" ] && processor=$(get_default_processor $arch) cat <<-EOF Usage: $0 [options] Options include: --arch=ARCH architecture to compile for ($arch). ARCH can be one of: arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 - --processor=PROCESSOR processor to compile for ($arch) + --processor=PROCESSOR processor to compile for ($processor) --target=TARGET target platform that the tests will be running on (qemu or kvmtool, default is qemu) (arm/arm64 only) --cross-prefix=PREFIX cross compiler prefix @@ -319,13 +338,8 @@ if [ "$earlycon" ]; then fi fi -[ -z "$processor" ] && processor="$arch" - -if [ "$processor" = "arm64" ]; then - processor="cortex-a57" -elif [ "$processor" = "arm" ]; then - processor="cortex-a15" -fi +# $arch will have changed when cross-compiling. 
+[ -z "$processor" ] && processor=$(get_default_processor $arch) if [ "$arch" = "i386" ] || [ "$arch" = "x86_64" ]; then testdir=x86 -- 2.48.1 From jean-philippe at linaro.org Fri Mar 14 08:49:03 2025 From: jean-philippe at linaro.org (Jean-Philippe Brucker) Date: Fri, 14 Mar 2025 15:49:03 +0000 Subject: [kvm-unit-tests PATCH v2 3/5] arm64: Implement the ./configure --processor option In-Reply-To: <20250314154904.3946484-2-jean-philippe@linaro.org> References: <20250314154904.3946484-2-jean-philippe@linaro.org> Message-ID: <20250314154904.3946484-5-jean-philippe@linaro.org> From: Alexandru Elisei The help text for the ./configure --processor option says: --processor=PROCESSOR processor to compile for (cortex-a57) but, unlike arm, the build system does not pass a -mcpu argument to the compiler. Fix it, and bring arm64 at parity with arm. Note that this introduces a regression, which is also present on arm: if the --processor argument is something that the compiler doesn't understand, but qemu does (like 'max'), then compilation fails. This will be fixed in a following patch; another fix is to specify a CPU model that gcc implements by using --cflags=-mcpu=. Reviewed-by: Andrew Jones Signed-off-by: Alexandru Elisei --- arm/Makefile.arm | 1 - arm/Makefile.common | 1 + 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/arm/Makefile.arm b/arm/Makefile.arm index 7fd39f3a..d6250b7f 100644 --- a/arm/Makefile.arm +++ b/arm/Makefile.arm @@ -12,7 +12,6 @@ $(error Cannot build arm32 tests as EFI apps) endif CFLAGS += $(machine) -CFLAGS += -mcpu=$(PROCESSOR) CFLAGS += -mno-unaligned-access ifeq ($(TARGET),qemu) diff --git a/arm/Makefile.common b/arm/Makefile.common index f828dbe0..a5d97bcf 100644 --- a/arm/Makefile.common +++ b/arm/Makefile.common @@ -25,6 +25,7 @@ AUXFLAGS ?= 0x0 # stack.o relies on frame pointers. 
KEEP_FRAME_POINTER := y +CFLAGS += -mcpu=$(PROCESSOR) CFLAGS += -std=gnu99 CFLAGS += -ffreestanding CFLAGS += -O2 -- 2.48.1 From jean-philippe at linaro.org Fri Mar 14 08:49:04 2025 From: jean-philippe at linaro.org (Jean-Philippe Brucker) Date: Fri, 14 Mar 2025 15:49:04 +0000 Subject: [kvm-unit-tests PATCH v2 4/5] configure: Add --qemu-cpu option In-Reply-To: <20250314154904.3946484-2-jean-philippe@linaro.org> References: <20250314154904.3946484-2-jean-philippe@linaro.org> Message-ID: <20250314154904.3946484-6-jean-philippe@linaro.org> Add the --qemu-cpu option to let users set the CPU type to run on. At the moment --processor allows to set both GCC -mcpu flag and QEMU -cpu. On Arm we'd like to pass `-cpu max` to QEMU in order to enable all the TCG features by default, and it could also be nice to let users modify the CPU capabilities by setting extra -cpu options. Since GCC -mcpu doesn't accept "max" or "host", separate the compiler and QEMU arguments. `--processor` is now exclusively for compiler options, as indicated by its documentation ("processor to compile for"). So use $QEMU_CPU on RISC-V as well. 
Suggested-by: Andrew Jones Signed-off-by: Jean-Philippe Brucker --- scripts/mkstandalone.sh | 2 +- arm/run | 17 +++++++++++------ riscv/run | 8 ++++---- configure | 7 +++++++ 4 files changed, 23 insertions(+), 11 deletions(-) diff --git a/scripts/mkstandalone.sh b/scripts/mkstandalone.sh index 2318a85f..6b5f725d 100755 --- a/scripts/mkstandalone.sh +++ b/scripts/mkstandalone.sh @@ -42,7 +42,7 @@ generate_test () config_export ARCH config_export ARCH_NAME - config_export PROCESSOR + config_export QEMU_CPU echo "echo BUILD_HEAD=$(cat build-head)" diff --git a/arm/run b/arm/run index efdd44ce..561bafab 100755 --- a/arm/run +++ b/arm/run @@ -8,7 +8,7 @@ if [ -z "$KUT_STANDALONE" ]; then source config.mak source scripts/arch-run.bash fi -processor="$PROCESSOR" +qemu_cpu="$QEMU_CPU" if [ "$QEMU" ] && [ -z "$ACCEL" ] && [ "$HOST" = "aarch64" ] && [ "$ARCH" = "arm" ] && @@ -37,12 +37,17 @@ if [ "$ACCEL" = "kvm" ]; then fi fi -if [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ]; then - if [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ]; then - processor="host" +if [ -z "$qemu_cpu" ]; then + if ( [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ] ) && + ( [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ] ); then + qemu_cpu="host" if [ "$ARCH" = "arm" ] && [ "$HOST" = "aarch64" ]; then - processor+=",aarch64=off" + qemu_cpu+=",aarch64=off" fi + elif [ "$ARCH" = "arm64" ]; then + qemu_cpu="cortex-a57" + else + qemu_cpu="cortex-a15" fi fi @@ -71,7 +76,7 @@ if $qemu $M -device '?' 
| grep -q pci-testdev; then fi A="-accel $ACCEL$ACCEL_PROPS" -command="$qemu -nodefaults $M $A -cpu $processor $chr_testdev $pci_testdev" +command="$qemu -nodefaults $M $A -cpu $qemu_cpu $chr_testdev $pci_testdev" command+=" -display none -serial stdio" command="$(migration_cmd) $(timeout_cmd) $command" diff --git a/riscv/run b/riscv/run index e2f5a922..02fcf0c0 100755 --- a/riscv/run +++ b/riscv/run @@ -11,12 +11,12 @@ fi # Allow user overrides of some config.mak variables mach=$MACHINE_OVERRIDE -processor=$PROCESSOR_OVERRIDE +qemu_cpu=$QEMU_CPU_OVERRIDE firmware=$FIRMWARE_OVERRIDE -[ "$PROCESSOR" = "$ARCH" ] && PROCESSOR="max" +[ -z "$QEMU_CPU" ] && QEMU_CPU="max" : "${mach:=virt}" -: "${processor:=$PROCESSOR}" +: "${qemu_cpu:=$QEMU_CPU}" : "${firmware:=$FIRMWARE}" [ "$firmware" ] && firmware="-bios $firmware" @@ -32,7 +32,7 @@ fi mach="-machine $mach" command="$qemu -nodefaults -nographic -serial mon:stdio" -command+=" $mach $acc $firmware -cpu $processor " +command+=" $mach $acc $firmware -cpu $qemu_cpu " command="$(migration_cmd) $(timeout_cmd) $command" if [ "$UEFI_SHELL_RUN" = "y" ]; then diff --git a/configure b/configure index 5306bad3..d25bd23e 100755 --- a/configure +++ b/configure @@ -52,6 +52,7 @@ page_size= earlycon= efi= efi_direct= +qemu_cpu= # Enable -Werror by default for git repositories only (i.e. developer builds) if [ -e "$srcdir"/.git ]; then @@ -69,6 +70,8 @@ usage() { --arch=ARCH architecture to compile for ($arch). ARCH can be one of: arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 --processor=PROCESSOR processor to compile for ($processor) + --qemu-cpu=CPU the CPU model to run on. The default depends on + the configuration, usually it is "host" or "max". 
--target=TARGET target platform that the tests will be running on (qemu or kvmtool, default is qemu) (arm/arm64 only) --cross-prefix=PREFIX cross compiler prefix @@ -142,6 +145,9 @@ while [[ $optno -le $argc ]]; do --processor) processor="$arg" ;; + --qemu-cpu) + qemu_cpu="$arg" + ;; --target) target="$arg" ;; @@ -464,6 +470,7 @@ ARCH=$arch ARCH_NAME=$arch_name ARCH_LIBDIR=$arch_libdir PROCESSOR=$processor +QEMU_CPU=$qemu_cpu CC=$cc CFLAGS=$cflags LD=$cross_prefix$ld -- 2.48.1 From jean-philippe at linaro.org Fri Mar 14 08:49:05 2025 From: jean-philippe at linaro.org (Jean-Philippe Brucker) Date: Fri, 14 Mar 2025 15:49:05 +0000 Subject: [kvm-unit-tests PATCH v2 5/5] arm64: Use -cpu max as the default for TCG In-Reply-To: <20250314154904.3946484-2-jean-philippe@linaro.org> References: <20250314154904.3946484-2-jean-philippe@linaro.org> Message-ID: <20250314154904.3946484-7-jean-philippe@linaro.org> In order to test all the latest features, default to "max" as the QEMU CPU type on arm64. Signed-off-by: Jean-Philippe Brucker --- arm/run | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arm/run b/arm/run index 561bafab..84232e28 100755 --- a/arm/run +++ b/arm/run @@ -45,7 +45,7 @@ if [ -z "$qemu_cpu" ]; then qemu_cpu+=",aarch64=off" fi elif [ "$ARCH" = "arm64" ]; then - qemu_cpu="cortex-a57" + qemu_cpu="max" else qemu_cpu="cortex-a15" fi -- 2.48.1 From atishp at rivosinc.com Mon Mar 17 00:41:44 2025 From: atishp at rivosinc.com (Atish Patra) Date: Mon, 17 Mar 2025 00:41:44 -0700 Subject: [PATCH] RISC-V: KVM: Teardown riscv specific bits after kvm_exit Message-ID: <20250317-kvm_exit_fix-v1-1-aa5240c5dbd2@rivosinc.com> During a module removal, kvm_exit invokes arch specific disable call which disables AIA. However, we invoke aia_exit before kvm_exit resulting in the following warning. KVM kernel module can't be inserted afterwards due to inconsistent state of IRQ. [25469.031389] percpu IRQ 31 still enabled on CPU0! 
[25469.031732] WARNING: CPU: 3 PID: 943 at kernel/irq/manage.c:2476 __free_percpu_irq+0xa2/0x150 [25469.031804] Modules linked in: kvm(-) [25469.031848] CPU: 3 UID: 0 PID: 943 Comm: rmmod Not tainted 6.14.0-rc5-06947-g91c763118f47-dirty #2 [25469.031905] Hardware name: riscv-virtio,qemu (DT) [25469.031928] epc : __free_percpu_irq+0xa2/0x150 [25469.031976] ra : __free_percpu_irq+0xa2/0x150 [25469.032197] epc : ffffffff8007db1e ra : ffffffff8007db1e sp : ff2000000088bd50 [25469.032241] gp : ffffffff8131cef8 tp : ff60000080b96400 t0 : ff2000000088baf8 [25469.032285] t1 : fffffffffffffffc t2 : 5249207570637265 s0 : ff2000000088bd90 [25469.032329] s1 : ff60000098b21080 a0 : 037d527a15eb4f00 a1 : 037d527a15eb4f00 [25469.032372] a2 : 0000000000000023 a3 : 0000000000000001 a4 : ffffffff8122dbf8 [25469.032410] a5 : 0000000000000fff a6 : 0000000000000000 a7 : ffffffff8122dc10 [25469.032448] s2 : ff60000080c22eb0 s3 : 0000000200000022 s4 : 000000000000001f [25469.032488] s5 : ff60000080c22e00 s6 : ffffffff80c351c0 s7 : 0000000000000000 [25469.032582] s8 : 0000000000000003 s9 : 000055556b7fb490 s10: 00007ffff0e12fa0 [25469.032621] s11: 00007ffff0e13e9a t3 : ffffffff81354ac7 t4 : ffffffff81354ac7 [25469.032664] t5 : ffffffff81354ac8 t6 : ffffffff81354ac7 [25469.032698] status: 0000000200000100 badaddr: ffffffff8007db1e cause: 0000000000000003 [25469.032738] [] __free_percpu_irq+0xa2/0x150 [25469.032797] [] free_percpu_irq+0x30/0x5e [25469.032856] [] kvm_riscv_aia_exit+0x40/0x42 [kvm] [25469.033947] [] cleanup_module+0x10/0x32 [kvm] [25469.035300] [] __riscv_sys_delete_module+0x18e/0x1fc [25469.035374] [] syscall_handler+0x3a/0x46 [25469.035456] [] do_trap_ecall_u+0x72/0x134 [25469.035536] [] handle_exception+0x148/0x156 Invoke aia_exit and other arch specific cleanup functions after kvm_exit so that disable gets a chance to be called first before exit. 
Fixes: 54e43320c2ba ("RISC-V: KVM: Initial skeletal support for AIA") Signed-off-by: Atish Patra --- arch/riscv/kvm/main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c index 1fa8be5ee509..4b24705dc63a 100644 --- a/arch/riscv/kvm/main.c +++ b/arch/riscv/kvm/main.c @@ -172,8 +172,8 @@ module_init(riscv_kvm_init); static void __exit riscv_kvm_exit(void) { - kvm_riscv_teardown(); - kvm_exit(); + + kvm_riscv_teardown(); } module_exit(riscv_kvm_exit); --- base-commit: 4701f33a10702d5fc577c32434eb62adde0a1ae1 change-id: 20250316-kvm_exit_fix-77cd0632d740 -- Regards, Atish patra From eric.auger at redhat.com Mon Mar 17 01:19:06 2025 From: eric.auger at redhat.com (Eric Auger) Date: Mon, 17 Mar 2025 09:19:06 +0100 Subject: [kvm-unit-tests PATCH v2 1/5] configure: arm64: Don't display 'aarch64' as the default architecture In-Reply-To: <20250314154904.3946484-3-jean-philippe@linaro.org> References: <20250314154904.3946484-2-jean-philippe@linaro.org> <20250314154904.3946484-3-jean-philippe@linaro.org> Message-ID: <1b0233dc-b303-4317-a65d-572cc3582b8a@redhat.com> Hi Jean-Philippe, On 3/14/25 4:49 PM, Jean-Philippe Brucker wrote: > From: Alexandru Elisei > > --arch=aarch64, intentional or not, has been supported since the initial > arm64 support, commit 39ac3f8494be ("arm64: initial drop"). However, > "aarch64" does not show up in the list of supported architectures, but > it's displayed as the default architecture if doing ./configure --help > on an arm64 machine. > > Keep everything consistent and make sure that the default value for > $arch is "arm64", but still allow --arch=aarch64, in case they are users there > that use this configuration for kvm-unit-tests. > > The help text for --arch changes from: > > --arch=ARCH architecture to compile for (aarch64). ARCH can be one of: > arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 > > to: > > --arch=ARCH architecture to compile for (arm64). 
ARCH can be one of: > arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 > > Signed-off-by: Alexandru Elisei > --- > configure | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/configure b/configure > index 06532a89..dc3413fc 100755 > --- a/configure > +++ b/configure > @@ -15,8 +15,9 @@ objdump=objdump > readelf=readelf > ar=ar > addr2line=addr2line > -arch=$(uname -m | sed -e 's/i.86/i386/;s/arm64/aarch64/;s/arm.*/arm/;s/ppc64.*/ppc64/') > -host=$arch > +host=$(uname -m | sed -e 's/i.86/i386/;s/arm64/aarch64/;s/arm.*/arm/;s/ppc64.*/ppc64/') > +arch=$host > +[ "$arch" = "aarch64" ] && arch="arm64" Looks the same it done again below arch_name=$arch [ "$arch" = "aarch64" ] && arch="arm64" [ "$arch_name" = "arm64" ] && arch_name="aarch64" <--- arch_libdir=$arch maybe we could move all arch settings at the same location so that it becomes clearer? Thanks Eric > cross_prefix= > endian="" > pretty_print_stacks=yes From eric.auger at redhat.com Mon Mar 17 01:27:32 2025 From: eric.auger at redhat.com (Eric Auger) Date: Mon, 17 Mar 2025 09:27:32 +0100 Subject: [kvm-unit-tests PATCH v2 1/5] configure: arm64: Don't display 'aarch64' as the default architecture In-Reply-To: <1b0233dc-b303-4317-a65d-572cc3582b8a@redhat.com> References: <20250314154904.3946484-2-jean-philippe@linaro.org> <20250314154904.3946484-3-jean-philippe@linaro.org> <1b0233dc-b303-4317-a65d-572cc3582b8a@redhat.com> Message-ID: On 3/17/25 9:19 AM, Eric Auger wrote: > Hi Jean-Philippe, > > > On 3/14/25 4:49 PM, Jean-Philippe Brucker wrote: >> From: Alexandru Elisei >> >> --arch=aarch64, intentional or not, has been supported since the initial >> arm64 support, commit 39ac3f8494be ("arm64: initial drop"). However, >> "aarch64" does not show up in the list of supported architectures, but >> it's displayed as the default architecture if doing ./configure --help >> on an arm64 machine. 
>> >> Keep everything consistent and make sure that the default value for >> $arch is "arm64", but still allow --arch=aarch64, in case they are users > there >> that use this configuration for kvm-unit-tests. >> >> The help text for --arch changes from: >> >> --arch=ARCH architecture to compile for (aarch64). ARCH can be one of: >> arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 >> >> to: >> >> --arch=ARCH architecture to compile for (arm64). ARCH can be one of: >> arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 >> >> Signed-off-by: Alexandru Elisei >> --- >> configure | 5 +++-- >> 1 file changed, 3 insertions(+), 2 deletions(-) >> >> diff --git a/configure b/configure >> index 06532a89..dc3413fc 100755 >> --- a/configure >> +++ b/configure >> @@ -15,8 +15,9 @@ objdump=objdump >> readelf=readelf >> ar=ar >> addr2line=addr2line >> -arch=$(uname -m | sed -e 's/i.86/i386/;s/arm64/aarch64/;s/arm.*/arm/;s/ppc64.*/ppc64/') >> -host=$arch >> +host=$(uname -m | sed -e 's/i.86/i386/;s/arm64/aarch64/;s/arm.*/arm/;s/ppc64.*/ppc64/') >> +arch=$host >> +[ "$arch" = "aarch64" ] && arch="arm64" > Looks the same it done again below Ignore this. This is done again after explicit arch setting :-/ Need another coffee So looks good to me Reviewed-by: Eric Auger Eric > > arch_name=$arch > [ "$arch" = "aarch64" ] && arch="arm64" > [ "$arch_name" = "arm64" ] && arch_name="aarch64" <--- > arch_libdir=$arch > > maybe we could move all arch settings at the same location so that it > becomes clearer? 
> > Thanks > > Eric > >> cross_prefix= >> endian="" >> pretty_print_stacks=yes > From eric.auger at redhat.com Mon Mar 17 01:30:52 2025 From: eric.auger at redhat.com (Eric Auger) Date: Mon, 17 Mar 2025 09:30:52 +0100 Subject: [kvm-unit-tests PATCH v2 2/5] configure: arm/arm64: Display the correct default processor In-Reply-To: <20250314154904.3946484-4-jean-philippe@linaro.org> References: <20250314154904.3946484-2-jean-philippe@linaro.org> <20250314154904.3946484-4-jean-philippe@linaro.org> Message-ID: <312bfffa-8e91-4b64-b88e-c9868a59d7ec@redhat.com> On 3/14/25 4:49 PM, Jean-Philippe Brucker wrote: > From: Alexandru Elisei > > The help text for the --processor option displays the architecture name as > the default processor type. But the default for arm is cortex-a15, and for > arm64 is cortex-a57. Teach configure to display the correct default > processor type for these two architectures. > > Signed-off-by: Alexandru Elisei * Reviewed-by: Eric Auger Eric * > --- > configure | 30 ++++++++++++++++++++++-------- > 1 file changed, 22 insertions(+), 8 deletions(-) > > diff --git a/configure b/configure > index dc3413fc..5306bad3 100755 > --- a/configure > +++ b/configure > @@ -5,6 +5,24 @@ if [ -z "${BASH_VERSINFO[0]}" ] || [ "${BASH_VERSINFO[0]}" -lt 4 ] ; then > exit 1 > fi > > +function get_default_processor() > +{ > + local arch="$1" > + > + case "$arch" in > + "arm") > + default_processor="cortex-a15" > + ;; > + "arm64") > + default_processor="cortex-a57" > + ;; > + *) > + default_processor=$arch > + esac > + > + echo "$default_processor" > +} > + > srcdir=$(cd "$(dirname "$0")"; pwd) > prefix=/usr/local > cc=gcc > @@ -43,13 +61,14 @@ else > fi > > usage() { > + [ -z "$processor" ] && processor=$(get_default_processor $arch) > cat <<-EOF > Usage: $0 [options] > > Options include: > --arch=ARCH architecture to compile for ($arch). 
ARCH can be one of: > arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 > - --processor=PROCESSOR processor to compile for ($arch) > + --processor=PROCESSOR processor to compile for ($processor) > --target=TARGET target platform that the tests will be running on (qemu or > kvmtool, default is qemu) (arm/arm64 only) > --cross-prefix=PREFIX cross compiler prefix > @@ -319,13 +338,8 @@ if [ "$earlycon" ]; then > fi > fi > > -[ -z "$processor" ] && processor="$arch" > - > -if [ "$processor" = "arm64" ]; then > - processor="cortex-a57" > -elif [ "$processor" = "arm" ]; then > - processor="cortex-a15" > -fi > +# $arch will have changed when cross-compiling. > +[ -z "$processor" ] && processor=$(get_default_processor $arch) > > if [ "$arch" = "i386" ] || [ "$arch" = "x86_64" ]; then > testdir=x86 From eric.auger at redhat.com Mon Mar 17 01:34:46 2025 From: eric.auger at redhat.com (Eric Auger) Date: Mon, 17 Mar 2025 09:34:46 +0100 Subject: [kvm-unit-tests PATCH v2 3/5] arm64: Implement the ./configure --processor option In-Reply-To: <20250314154904.3946484-5-jean-philippe@linaro.org> References: <20250314154904.3946484-2-jean-philippe@linaro.org> <20250314154904.3946484-5-jean-philippe@linaro.org> Message-ID: On 3/14/25 4:49 PM, Jean-Philippe Brucker wrote: > From: Alexandru Elisei > > The help text for the ./configure --processor option says: > > --processor=PROCESSOR processor to compile for (cortex-a57) > > but, unlike arm, the build system does not pass a -mcpu argument to the > compiler. Fix it, and bring arm64 at parity with arm. > > Note that this introduces a regression, which is also present on arm: if > the --processor argument is something that the compiler doesn't understand, > but qemu does (like 'max'), then compilation fails. This will be fixed in a > following patch; another fix is to specify a CPU model that gcc implements > by using --cflags=-mcpu=. 
> > Reviewed-by: Andrew Jones > Signed-off-by: Alexandru Elisei Reviewed-by: Eric Auger Eric > --- > arm/Makefile.arm | 1 - > arm/Makefile.common | 1 + > 2 files changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arm/Makefile.arm b/arm/Makefile.arm > index 7fd39f3a..d6250b7f 100644 > --- a/arm/Makefile.arm > +++ b/arm/Makefile.arm > @@ -12,7 +12,6 @@ $(error Cannot build arm32 tests as EFI apps) > endif > > CFLAGS += $(machine) > -CFLAGS += -mcpu=$(PROCESSOR) > CFLAGS += -mno-unaligned-access > > ifeq ($(TARGET),qemu) > diff --git a/arm/Makefile.common b/arm/Makefile.common > index f828dbe0..a5d97bcf 100644 > --- a/arm/Makefile.common > +++ b/arm/Makefile.common > @@ -25,6 +25,7 @@ AUXFLAGS ?= 0x0 > # stack.o relies on frame pointers. > KEEP_FRAME_POINTER := y > > +CFLAGS += -mcpu=$(PROCESSOR) > CFLAGS += -std=gnu99 > CFLAGS += -ffreestanding > CFLAGS += -O2 From eric.auger at redhat.com Mon Mar 17 01:53:12 2025 From: eric.auger at redhat.com (Eric Auger) Date: Mon, 17 Mar 2025 09:53:12 +0100 Subject: [kvm-unit-tests PATCH v2 4/5] configure: Add --qemu-cpu option In-Reply-To: <20250314154904.3946484-6-jean-philippe@linaro.org> References: <20250314154904.3946484-2-jean-philippe@linaro.org> <20250314154904.3946484-6-jean-philippe@linaro.org> Message-ID: <4ab33998-4d16-4ecb-aa76-3919afcbf2d7@redhat.com> Hi Jean-Philippe, On 3/14/25 4:49 PM, Jean-Philippe Brucker wrote: > Add the --qemu-cpu option to let users set the CPU type to run on. > At the moment --processor allows to set both GCC -mcpu flag and QEMU > -cpu. On Arm we'd like to pass `-cpu max` to QEMU in order to enable all > the TCG features by default, and it could also be nice to let users > modify the CPU capabilities by setting extra -cpu options. > Since GCC -mcpu doesn't accept "max" or "host", separate the compiler > and QEMU arguments. > > `--processor` is now exclusively for compiler options, as indicated by > its documentation ("processor to compile for"). So use $QEMU_CPU on > RISC-V as well. 
> > Suggested-by: Andrew Jones > Signed-off-by: Jean-Philippe Brucker > --- > scripts/mkstandalone.sh | 2 +- > arm/run | 17 +++++++++++------ > riscv/run | 8 ++++---- > configure | 7 +++++++ > 4 files changed, 23 insertions(+), 11 deletions(-) > > diff --git a/scripts/mkstandalone.sh b/scripts/mkstandalone.sh > index 2318a85f..6b5f725d 100755 > --- a/scripts/mkstandalone.sh > +++ b/scripts/mkstandalone.sh > @@ -42,7 +42,7 @@ generate_test () > > config_export ARCH > config_export ARCH_NAME > - config_export PROCESSOR > + config_export QEMU_CPU > > echo "echo BUILD_HEAD=$(cat build-head)" > > diff --git a/arm/run b/arm/run > index efdd44ce..561bafab 100755 > --- a/arm/run > +++ b/arm/run > @@ -8,7 +8,7 @@ if [ -z "$KUT_STANDALONE" ]; then > source config.mak > source scripts/arch-run.bash > fi > -processor="$PROCESSOR" > +qemu_cpu="$QEMU_CPU" > > if [ "$QEMU" ] && [ -z "$ACCEL" ] && > [ "$HOST" = "aarch64" ] && [ "$ARCH" = "arm" ] && > @@ -37,12 +37,17 @@ if [ "$ACCEL" = "kvm" ]; then > fi > fi > > -if [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ]; then > - if [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ]; then > - processor="host" > +if [ -z "$qemu_cpu" ]; then > + if ( [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ] ) && > + ( [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ] ); then > + qemu_cpu="host" > if [ "$ARCH" = "arm" ] && [ "$HOST" = "aarch64" ]; then > - processor+=",aarch64=off" > + qemu_cpu+=",aarch64=off" > fi > + elif [ "$ARCH" = "arm64" ]; then > + qemu_cpu="cortex-a57" > + else > + qemu_cpu="cortex-a15" > fi > fi > > @@ -71,7 +76,7 @@ if $qemu $M -device '?' 
| grep -q pci-testdev; then > fi > > A="-accel $ACCEL$ACCEL_PROPS" > -command="$qemu -nodefaults $M $A -cpu $processor $chr_testdev $pci_testdev" > +command="$qemu -nodefaults $M $A -cpu $qemu_cpu $chr_testdev $pci_testdev" > command+=" -display none -serial stdio" > command="$(migration_cmd) $(timeout_cmd) $command" > > diff --git a/riscv/run b/riscv/run > index e2f5a922..02fcf0c0 100755 > --- a/riscv/run > +++ b/riscv/run > @@ -11,12 +11,12 @@ fi > > # Allow user overrides of some config.mak variables > mach=$MACHINE_OVERRIDE > -processor=$PROCESSOR_OVERRIDE > +qemu_cpu=$QEMU_CPU_OVERRIDE > firmware=$FIRMWARE_OVERRIDE > > -[ "$PROCESSOR" = "$ARCH" ] && PROCESSOR="max" > +[ -z "$QEMU_CPU" ] && QEMU_CPU="max" > : "${mach:=virt}" > -: "${processor:=$PROCESSOR}" > +: "${qemu_cpu:=$QEMU_CPU}" > : "${firmware:=$FIRMWARE}" > [ "$firmware" ] && firmware="-bios $firmware" > > @@ -32,7 +32,7 @@ fi > mach="-machine $mach" > > command="$qemu -nodefaults -nographic -serial mon:stdio" > -command+=" $mach $acc $firmware -cpu $processor " > +command+=" $mach $acc $firmware -cpu $qemu_cpu " > command="$(migration_cmd) $(timeout_cmd) $command" > > if [ "$UEFI_SHELL_RUN" = "y" ]; then > diff --git a/configure b/configure > index 5306bad3..d25bd23e 100755 > --- a/configure > +++ b/configure > @@ -52,6 +52,7 @@ page_size= > earlycon= > efi= > efi_direct= > +qemu_cpu= > > # Enable -Werror by default for git repositories only (i.e. developer builds) > if [ -e "$srcdir"/.git ]; then > @@ -69,6 +70,8 @@ usage() { > --arch=ARCH architecture to compile for ($arch). ARCH can be one of: > arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 > --processor=PROCESSOR processor to compile for ($processor) > + --qemu-cpu=CPU the CPU model to run on. The default depends on > + the configuration, usually it is "host" or "max". nit: I would remove 'usually it is "host" or "max"'. I don't see any consistency check between qemu_cpu and processor in case it is explicitly set? Do we care? 
Eric > --target=TARGET target platform that the tests will be running on (qemu or > kvmtool, default is qemu) (arm/arm64 only) > --cross-prefix=PREFIX cross compiler prefix > @@ -142,6 +145,9 @@ while [[ $optno -le $argc ]]; do > --processor) > processor="$arg" > ;; > + --qemu-cpu) > + qemu_cpu="$arg" > + ;; > --target) > target="$arg" > ;; > @@ -464,6 +470,7 @@ ARCH=$arch > ARCH_NAME=$arch_name > ARCH_LIBDIR=$arch_libdir > PROCESSOR=$processor > +QEMU_CPU=$qemu_cpu > CC=$cc > CFLAGS=$cflags > LD=$cross_prefix$ld Reviewed-by: Eric Auger Eric From eric.auger at redhat.com Mon Mar 17 03:13:52 2025 From: eric.auger at redhat.com (Eric Auger) Date: Mon, 17 Mar 2025 11:13:52 +0100 Subject: [kvm-unit-tests PATCH v2 5/5] arm64: Use -cpu max as the default for TCG In-Reply-To: <20250314154904.3946484-7-jean-philippe@linaro.org> References: <20250314154904.3946484-2-jean-philippe@linaro.org> <20250314154904.3946484-7-jean-philippe@linaro.org> Message-ID: <1ad28176-d564-4d62-898d-d1d80a2609b0@redhat.com> Hi Jean-Philippe, On 3/14/25 4:49 PM, Jean-Philippe Brucker wrote: > In order to test all the latest features, default to "max" as the QEMU > CPU type on arm64. > > Signed-off-by: Jean-Philippe Brucker Reviewed-by: Eric Auger Eric > --- > arm/run | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arm/run b/arm/run > index 561bafab..84232e28 100755 > --- a/arm/run > +++ b/arm/run > @@ -45,7 +45,7 @@ if [ -z "$qemu_cpu" ]; then > qemu_cpu+=",aarch64=off" > fi > elif [ "$ARCH" = "arm64" ]; then > - qemu_cpu="cortex-a57" > + qemu_cpu="max" > else > qemu_cpu="cortex-a15" > fi From cleger at rivosinc.com Mon Mar 17 03:19:46 2025 From: cleger at rivosinc.com (Clément Léger) Date: Mon, 17 Mar 2025 11:19:46 +0100 Subject: [kvm-unit-tests PATCH v10 0/8] riscv: add SBI SSE extension tests Message-ID: <20250317101956.526834-1-cleger@rivosinc.com> This series adds tests for SBI SSE extension as well as needed infrastructure for SSE support.
It also adds test specific asm-offsets generation to use custom OFFSET and DEFINE from the test directory. These tests can be run using an OpenSBI version that implements the latest specification modifications [1] Link: https://github.com/rivosinc/opensbi/tree/dev/cleger/sse [1] --- V10: - Use && instead of || for timeout handling - Add SBI patches which introduce functions to get implementer ID and version as well as implementer ID defines. - Skip injection tests in OpenSBI < v1.6 V9: - Use __ASSEMBLER__ instead of __ASSEMBLY__ - Remove extra spaces - Use assert to check global event in sse_global_event_set_current_hart() - Tabulate SSE events names table - Use sbi_sse_register() instead of sbi_sse_register_raw() in error testing - Move a report_pass() out of error path - Rework all injection tests with better error handling - Use an env var for sse event completion timeout - Add timeout for some potentially infinite while() loops V8: - Short circuit current event tests if failure happens - Remove SSE from all report strings - Indent .prio field - Add cpu_relax()/smp_rmb() where needed - Add timeout for global event ENABLED state check - Added BIT(32) aliases tests for attribute/event_id. V7: - Test ids/attributes/attributes count > 32 bits - Rename all SSE functions to sbi_sse_* - Use event_id instead of event/evt - Factorize read/write test - Use virt_to_phys() for attributes read/write. - Extensively use sbiret_report_error() - Change check function return values to bool. - Added assert for stack size to be below or equal to PAGE_SIZE - Use an env variable for the maximum hart ID - Check that individual read from attributes matches the multiple attributes read. - Added multiple attributes write at once - Used READ_ONCE/WRITE_ONCE - Inject all local events at once rather than looping for each core. - Split test_arg for local_dispatch test so that all CPUs can run at once.
- Move SSE entry and generic code to lib/riscv for other tests - Fix unmask/mask state checking V6: - Add missing $(generated-file) dependencies for "-deps" objects - Split SSE entry from sbi-asm.S to sse-asm.S and all SSE core functions since it will be useful for other tests as well (dbltrp). V5: - Update event ranges based on latest spec - Rename asm-offset-test.c to sbi-asm-offset.c V4: - Fix typo sbi_ext_ss_fid -> sbi_ext_sse_fid - Add proper asm-offset generation for tests - Move SSE specific file from lib/riscv to riscv/ V3: - Add -deps variable for test specific dependencies - Fix formatting errors/typo in sbi.h - Add missing double trap event - Alphabetize sbi-sse.c includes - Fix a6 content after unmasking event - Add SSE HART_MASK/UNMASK test - Use mv instead of move - move sbi_check_sse() definition in sbi.c - Remove sbi_sse test from unittests.cfg V2: - Rebased on origin/master and integrate it into sbi.c tests Clément Léger (8): kbuild: Allow multiple asm-offsets file to be generated riscv: Set .aux.o files as .PRECIOUS riscv: Use asm-offsets to generate SBI_EXT_HSM values riscv: sbi: Add functions for version checking lib: riscv: add functions to get implementer ID and version riscv: lib: Add SBI SSE extension definitions lib: riscv: Add SBI SSE support riscv: sbi: Add SSE extension tests scripts/asm-offsets.mak | 22 +- riscv/Makefile | 5 +- lib/riscv/asm/csr.h | 1 + lib/riscv/asm/sbi.h | 187 +++++- lib/riscv/sbi-sse-asm.S | 102 ++++ lib/riscv/asm-offsets.c | 9 + lib/riscv/sbi.c | 95 ++- riscv/sbi-tests.h | 1 + riscv/sbi-asm.S | 6 +- riscv/sbi-asm-offsets.c | 11 + riscv/sbi-sse.c | 1280 +++++++++++++++++++++++++++++++++++++++ riscv/sbi.c | 2 + riscv/.gitignore | 1 + 13 files changed, 1709 insertions(+), 13 deletions(-) create mode 100644 lib/riscv/sbi-sse-asm.S create mode 100644 riscv/sbi-asm-offsets.c create mode 100644 riscv/sbi-sse.c create mode 100644 riscv/.gitignore -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 03:19:47 2025 From: cleger at
rivosinc.com (Clément Léger) Date: Mon, 17 Mar 2025 11:19:47 +0100 Subject: [kvm-unit-tests PATCH v10 1/8] kbuild: Allow multiple asm-offsets file to be generated In-Reply-To: <20250317101956.526834-1-cleger@rivosinc.com> References: <20250317101956.526834-1-cleger@rivosinc.com> Message-ID: <20250317101956.526834-2-cleger@rivosinc.com> In order to allow multiple asm-offsets files to be generated, the include guards need to differ between these files. Add an asm_offset_name makefile macro to obtain an uppercase name matching the original asm-offsets file. Signed-off-by: Clément Léger Reviewed-by: Andrew Jones --- scripts/asm-offsets.mak | 22 +++++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/scripts/asm-offsets.mak b/scripts/asm-offsets.mak index 7b64162d..a5fdbf5d 100644 --- a/scripts/asm-offsets.mak +++ b/scripts/asm-offsets.mak @@ -15,10 +15,14 @@ define sed-y s:->::; p;}' endef +define asm_offset_name + $(shell echo $(notdir $(1)) | tr [:lower:]- [:upper:]_) +endef + define make_asm_offsets (set -e; \ - echo "#ifndef __ASM_OFFSETS_H__"; \ - echo "#define __ASM_OFFSETS_H__"; \ + echo "#ifndef __$(strip $(asm_offset_name))_H__"; \ + echo "#define __$(strip $(asm_offset_name))_H__"; \ echo "/*"; \ echo " * Generated file.
DO NOT MODIFY."; \ echo " *"; \ @@ -29,12 +33,16 @@ define make_asm_offsets echo "#endif" ) > $@ endef -$(asm-offsets:.h=.s): $(asm-offsets:.h=.c) - $(CC) $(CFLAGS) -fverbose-asm -S -o $@ $< +define gen_asm_offsets_rules +$(1).s: $(1).c + $(CC) $(CFLAGS) -fverbose-asm -S -o $$@ $$< + +$(1).h: $(1).s + $$(call make_asm_offsets,$(1)) + cp -f $$@ lib/generated/ +endef -$(asm-offsets): $(asm-offsets:.h=.s) - $(call make_asm_offsets) - cp -f $(asm-offsets) lib/generated/ +$(foreach o,$(asm-offsets),$(eval $(call gen_asm_offsets_rules, $(o:.h=)))) OBJDIRS += lib/generated -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 03:19:48 2025 From: cleger at rivosinc.com (Clément Léger) Date: Mon, 17 Mar 2025 11:19:48 +0100 Subject: [kvm-unit-tests PATCH v10 2/8] riscv: Set .aux.o files as .PRECIOUS In-Reply-To: <20250317101956.526834-1-cleger@rivosinc.com> References: <20250317101956.526834-1-cleger@rivosinc.com> Message-ID: <20250317101956.526834-3-cleger@rivosinc.com> When compiling, we need to keep the .aux.o files; otherwise they are removed after compilation, which causes dependent files to be recompiled. Set these files as .PRECIOUS to keep them.
Signed-off-by: Clément Léger Reviewed-by: Andrew Jones --- riscv/Makefile | 1 + 1 file changed, 1 insertion(+) diff --git a/riscv/Makefile b/riscv/Makefile index 52718f3f..ae9cf02a 100644 --- a/riscv/Makefile +++ b/riscv/Makefile @@ -90,6 +90,7 @@ CFLAGS += -I $(SRCDIR)/lib -I $(SRCDIR)/lib/libfdt -I lib -I $(SRCDIR)/riscv asm-offsets = lib/riscv/asm-offsets.h include $(SRCDIR)/scripts/asm-offsets.mak +.PRECIOUS: %.aux.o %.aux.o: $(SRCDIR)/lib/auxinfo.c $(CC) $(CFLAGS) -c -o $@ $< \ -DPROGNAME=\"$(notdir $(@:.aux.o=.$(exe)))\" -DAUXFLAGS=$(AUXFLAGS) -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 03:19:49 2025 From: cleger at rivosinc.com (Clément Léger) Date: Mon, 17 Mar 2025 11:19:49 +0100 Subject: [kvm-unit-tests PATCH v10 3/8] riscv: Use asm-offsets to generate SBI_EXT_HSM values In-Reply-To: <20250317101956.526834-1-cleger@rivosinc.com> References: <20250317101956.526834-1-cleger@rivosinc.com> Message-ID: <20250317101956.526834-4-cleger@rivosinc.com> Replace hardcoded values with generated ones using sbi-asm-offsets. This allows ASM_SBI_EXT_HSM and ASM_SBI_EXT_HSM_HART_STOP to be used directly in assembly. 
Signed-off-by: Cl?ment L?ger Reviewed-by: Andrew Jones --- riscv/Makefile | 2 +- riscv/sbi-asm.S | 6 ++++-- riscv/sbi-asm-offsets.c | 11 +++++++++++ riscv/.gitignore | 1 + 4 files changed, 17 insertions(+), 3 deletions(-) create mode 100644 riscv/sbi-asm-offsets.c create mode 100644 riscv/.gitignore diff --git a/riscv/Makefile b/riscv/Makefile index ae9cf02a..02d2ac39 100644 --- a/riscv/Makefile +++ b/riscv/Makefile @@ -87,7 +87,7 @@ CFLAGS += -ffreestanding CFLAGS += -O2 CFLAGS += -I $(SRCDIR)/lib -I $(SRCDIR)/lib/libfdt -I lib -I $(SRCDIR)/riscv -asm-offsets = lib/riscv/asm-offsets.h +asm-offsets = lib/riscv/asm-offsets.h riscv/sbi-asm-offsets.h include $(SRCDIR)/scripts/asm-offsets.mak .PRECIOUS: %.aux.o diff --git a/riscv/sbi-asm.S b/riscv/sbi-asm.S index f4185496..51f46efd 100644 --- a/riscv/sbi-asm.S +++ b/riscv/sbi-asm.S @@ -6,6 +6,8 @@ */ #include #include +#include +#include #include "sbi-tests.h" @@ -57,8 +59,8 @@ sbi_hsm_check: 7: lb t0, 0(t1) pause beqz t0, 7b - li a7, 0x48534d /* SBI_EXT_HSM */ - li a6, 1 /* SBI_EXT_HSM_HART_STOP */ + li a7, ASM_SBI_EXT_HSM + li a6, ASM_SBI_EXT_HSM_HART_STOP ecall 8: pause j 8b diff --git a/riscv/sbi-asm-offsets.c b/riscv/sbi-asm-offsets.c new file mode 100644 index 00000000..bd37b6a2 --- /dev/null +++ b/riscv/sbi-asm-offsets.c @@ -0,0 +1,11 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include +#include + +int main(void) +{ + DEFINE(ASM_SBI_EXT_HSM, SBI_EXT_HSM); + DEFINE(ASM_SBI_EXT_HSM_HART_STOP, SBI_EXT_HSM_HART_STOP); + + return 0; +} diff --git a/riscv/.gitignore b/riscv/.gitignore new file mode 100644 index 00000000..0a8c5a36 --- /dev/null +++ b/riscv/.gitignore @@ -0,0 +1 @@ +/*-asm-offsets.[hs] -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 03:19:50 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 11:19:50 +0100 Subject: [kvm-unit-tests PATCH v10 4/8] riscv: sbi: Add functions for version checking In-Reply-To: <20250317101956.526834-1-cleger@rivosinc.com> 
References: <20250317101956.526834-1-cleger@rivosinc.com> Message-ID: <20250317101956.526834-5-cleger@rivosinc.com> Version checking was done using some custom hardcoded values, backport a few SBI function and defines from Linux to do that cleanly. Signed-off-by: Cl?ment L?ger --- lib/riscv/asm/sbi.h | 15 +++++++++++++++ lib/riscv/sbi.c | 9 +++++++-- 2 files changed, 22 insertions(+), 2 deletions(-) diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h index 2f4d91ef..197288c7 100644 --- a/lib/riscv/asm/sbi.h +++ b/lib/riscv/asm/sbi.h @@ -18,6 +18,12 @@ #define SBI_ERR_IO -13 #define SBI_ERR_DENIED_LOCKED -14 +/* SBI spec version fields */ +#define SBI_SPEC_VERSION_DEFAULT 0x1 +#define SBI_SPEC_VERSION_MAJOR_SHIFT 24 +#define SBI_SPEC_VERSION_MAJOR_MASK 0x7f +#define SBI_SPEC_VERSION_MINOR_MASK 0xffffff + #ifndef __ASSEMBLER__ #include @@ -110,6 +116,14 @@ struct sbiret { long value; }; +/* Make SBI version */ +static inline unsigned long sbi_mk_version(unsigned long major, unsigned long minor) +{ + return ((major & SBI_SPEC_VERSION_MAJOR_MASK) << SBI_SPEC_VERSION_MAJOR_SHIFT) + | (minor & SBI_SPEC_VERSION_MINOR_MASK); +} + + struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0, unsigned long arg1, unsigned long arg2, unsigned long arg3, unsigned long arg4, @@ -124,6 +138,7 @@ struct sbiret sbi_send_ipi_cpu(int cpu); struct sbiret sbi_send_ipi_cpumask(const cpumask_t *mask); struct sbiret sbi_send_ipi_broadcast(void); struct sbiret sbi_set_timer(unsigned long stime_value); +struct sbiret sbi_get_spec_version(void); long sbi_probe(int ext); #endif /* !__ASSEMBLER__ */ diff --git a/lib/riscv/sbi.c b/lib/riscv/sbi.c index 02dd338c..3c395cff 100644 --- a/lib/riscv/sbi.c +++ b/lib/riscv/sbi.c @@ -107,12 +107,17 @@ struct sbiret sbi_set_timer(unsigned long stime_value) return sbi_ecall(SBI_EXT_TIME, SBI_EXT_TIME_SET_TIMER, stime_value, 0, 0, 0, 0, 0); } +struct sbiret sbi_get_spec_version(void) +{ + return sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_SPEC_VERSION, 
0, 0, 0, 0, 0, 0); +} + long sbi_probe(int ext) { struct sbiret ret; - ret = sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_SPEC_VERSION, 0, 0, 0, 0, 0, 0); - assert(!ret.error && (ret.value & 0x7ffffffful) >= 2); + ret = sbi_get_spec_version(); + assert(!ret.error && ret.value >= sbi_mk_version(2, 0)); ret = sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_PROBE_EXT, ext, 0, 0, 0, 0, 0); assert(!ret.error); -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 03:19:51 2025 From: cleger at rivosinc.com (Clément Léger) Date: Mon, 17 Mar 2025 11:19:51 +0100 Subject: [kvm-unit-tests PATCH v10 5/8] lib: riscv: add functions to get implementer ID and version In-Reply-To: <20250317101956.526834-1-cleger@rivosinc.com> References: <20250317101956.526834-1-cleger@rivosinc.com> Message-ID: <20250317101956.526834-6-cleger@rivosinc.com> These functions will be used by SSE tests to check for a specific OpenSBI version. sbi_check_impl() is a helper that checks for a specific SBI implementation without the caller needing to check ret.error. 
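As an illustration only (not part of the series), the version helpers from patches 4/8 and 5/8 can be exercised in a standalone sketch. The constants and the two mk_version helpers are copied from the diffs in this thread; impl_at_least() is a hypothetical helper introduced here, and it takes already-fetched sbiret values instead of issuing ecalls so that it can be built and run on any host.

```c
#include <assert.h>
#include <stdbool.h>

/* Field layout copied from the SBI_SPEC_VERSION_* defines in patch 4/8. */
#define SBI_SPEC_VERSION_MAJOR_SHIFT	24
#define SBI_SPEC_VERSION_MAJOR_MASK	0x7f
#define SBI_SPEC_VERSION_MINOR_MASK	0xffffff

/* Implementation ID copied from patch 5/8. */
#define SBI_IMPL_OPENSBI		1

struct sbiret {
	long error;
	long value;
};

/* Pack a spec version: major in bits 30:24, minor in bits 23:0. */
static inline unsigned long sbi_mk_version(unsigned long major, unsigned long minor)
{
	return ((major & SBI_SPEC_VERSION_MAJOR_MASK) << SBI_SPEC_VERSION_MAJOR_SHIFT)
		| (minor & SBI_SPEC_VERSION_MINOR_MASK);
}

/* Pack an OpenSBI implementation version: major above bit 16, minor below. */
static inline unsigned long sbi_impl_opensbi_mk_version(unsigned long major, unsigned long minor)
{
	return (major << 16) | minor;
}

/*
 * Hypothetical helper (an assumption, not in the series): given the results
 * of sbi_get_imp_id() and sbi_get_imp_version(), report whether the firmware
 * is the expected implementation at or above a minimum version.
 */
static bool impl_at_least(struct sbiret id, struct sbiret version,
			  unsigned long impl, unsigned long min_version)
{
	if (id.error || version.error)
		return false;

	return (unsigned long)id.value == impl &&
	       (unsigned long)version.value >= min_version;
}
```

Comparing against sbi_mk_version(2, 0) rather than the bare constant 2 is what the sbi_probe() hunk above switches to; the packed value for 2.0 is 0x2000000, so the old comparison accepted any non-trivial version word.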
Signed-off-by: Clément Léger --- lib/riscv/asm/sbi.h | 30 ++++++++++++++++++++++++++++++ lib/riscv/sbi.c | 10 ++++++++++ 2 files changed, 40 insertions(+) diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h index 197288c7..06bcec16 100644 --- a/lib/riscv/asm/sbi.h +++ b/lib/riscv/asm/sbi.h @@ -18,6 +18,19 @@ #define SBI_ERR_IO -13 #define SBI_ERR_DENIED_LOCKED -14 +#define SBI_IMPL_BBL 0 +#define SBI_IMPL_OPENSBI 1 +#define SBI_IMPL_XVISOR 2 +#define SBI_IMPL_KVM 3 +#define SBI_IMPL_RUSTSBI 4 +#define SBI_IMPL_DIOSIX 5 +#define SBI_IMPL_COFFER 6 +#define SBI_IMPL_XEN 7 +#define SBI_IMPL_POLARFIRE_HSS 8 +#define SBI_IMPL_COREBOOT 9 +#define SBI_IMPL_OREBOOT 10 +#define SBI_IMPL_BHYVE 11 + /* SBI spec version fields */ #define SBI_SPEC_VERSION_DEFAULT 0x1 #define SBI_SPEC_VERSION_MAJOR_SHIFT 24 @@ -123,6 +136,10 @@ static inline unsigned long sbi_mk_version(unsigned long major, unsigned long mi | (minor & SBI_SPEC_VERSION_MINOR_MASK); } +static inline unsigned long sbi_impl_opensbi_mk_version(unsigned long major, unsigned long minor) +{ + return ((major << 16) | (minor)); +} struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0, unsigned long arg1, unsigned long arg2, @@ -139,7 +156,20 @@ struct sbiret sbi_send_ipi_cpumask(const cpumask_t *mask); struct sbiret sbi_send_ipi_broadcast(void); struct sbiret sbi_set_timer(unsigned long stime_value); struct sbiret sbi_get_spec_version(void); +struct sbiret sbi_get_imp_version(void); +struct sbiret sbi_get_imp_id(void); long sbi_probe(int ext); +static inline bool sbi_check_impl(unsigned long impl) +{ + struct sbiret ret; + + ret = sbi_get_imp_id(); + if (ret.error) + return false; + + return ret.value == impl; +} + #endif /* !__ASSEMBLER__ */ #endif /* _ASMRISCV_SBI_H_ */ diff --git a/lib/riscv/sbi.c b/lib/riscv/sbi.c index 3c395cff..9cb5757e 100644 --- a/lib/riscv/sbi.c +++ b/lib/riscv/sbi.c @@ -107,6 +107,16 @@ struct sbiret sbi_set_timer(unsigned long stime_value) return sbi_ecall(SBI_EXT_TIME, 
SBI_EXT_TIME_SET_TIMER, stime_value, 0, 0, 0, 0, 0); } +struct sbiret sbi_get_imp_version(void) +{ + return sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_IMP_VERSION, 0, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_get_imp_id(void) +{ + return sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_IMP_ID, 0, 0, 0, 0, 0, 0); +} + struct sbiret sbi_get_spec_version(void) { return sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_SPEC_VERSION, 0, 0, 0, 0, 0, 0); -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 03:19:52 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 11:19:52 +0100 Subject: [kvm-unit-tests PATCH v10 6/8] riscv: lib: Add SBI SSE extension definitions In-Reply-To: <20250317101956.526834-1-cleger@rivosinc.com> References: <20250317101956.526834-1-cleger@rivosinc.com> Message-ID: <20250317101956.526834-7-cleger@rivosinc.com> Add SBI SSE extension definitions in sbi.h Signed-off-by: Cl?ment L?ger Reviewed-by: Andrew Jones --- lib/riscv/asm/sbi.h | 106 +++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 105 insertions(+), 1 deletion(-) diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h index 06bcec16..b8688e47 100644 --- a/lib/riscv/asm/sbi.h +++ b/lib/riscv/asm/sbi.h @@ -49,6 +49,7 @@ enum sbi_ext_id { SBI_EXT_DBCN = 0x4442434E, SBI_EXT_SUSP = 0x53555350, SBI_EXT_FWFT = 0x46574654, + SBI_EXT_SSE = 0x535345, }; enum sbi_ext_base_fid { @@ -97,7 +98,6 @@ enum sbi_ext_dbcn_fid { SBI_EXT_DBCN_CONSOLE_WRITE_BYTE, }; - enum sbi_ext_fwft_fid { SBI_EXT_FWFT_SET = 0, SBI_EXT_FWFT_GET, @@ -124,6 +124,110 @@ enum sbi_ext_fwft_fid { #define SBI_FWFT_SET_FLAG_LOCK BIT(0) +enum sbi_ext_sse_fid { + SBI_EXT_SSE_READ_ATTRS = 0, + SBI_EXT_SSE_WRITE_ATTRS, + SBI_EXT_SSE_REGISTER, + SBI_EXT_SSE_UNREGISTER, + SBI_EXT_SSE_ENABLE, + SBI_EXT_SSE_DISABLE, + SBI_EXT_SSE_COMPLETE, + SBI_EXT_SSE_INJECT, + SBI_EXT_SSE_HART_UNMASK, + SBI_EXT_SSE_HART_MASK, +}; + +/* SBI SSE Event Attributes. 
*/ +enum sbi_sse_attr_id { + SBI_SSE_ATTR_STATUS = 0x00000000, + SBI_SSE_ATTR_PRIORITY = 0x00000001, + SBI_SSE_ATTR_CONFIG = 0x00000002, + SBI_SSE_ATTR_PREFERRED_HART = 0x00000003, + SBI_SSE_ATTR_ENTRY_PC = 0x00000004, + SBI_SSE_ATTR_ENTRY_ARG = 0x00000005, + SBI_SSE_ATTR_INTERRUPTED_SEPC = 0x00000006, + SBI_SSE_ATTR_INTERRUPTED_FLAGS = 0x00000007, + SBI_SSE_ATTR_INTERRUPTED_A6 = 0x00000008, + SBI_SSE_ATTR_INTERRUPTED_A7 = 0x00000009, +}; + +#define SBI_SSE_ATTR_STATUS_STATE_OFFSET 0 +#define SBI_SSE_ATTR_STATUS_STATE_MASK 0x3 +#define SBI_SSE_ATTR_STATUS_PENDING_OFFSET 2 +#define SBI_SSE_ATTR_STATUS_INJECT_OFFSET 3 + +#define SBI_SSE_ATTR_CONFIG_ONESHOT BIT(0) + +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPP BIT(0) +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPIE BIT(1) +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPV BIT(2) +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPVP BIT(3) +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPELP BIT(4) +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT BIT(5) + +enum sbi_sse_state { + SBI_SSE_STATE_UNUSED = 0, + SBI_SSE_STATE_REGISTERED = 1, + SBI_SSE_STATE_ENABLED = 2, + SBI_SSE_STATE_RUNNING = 3, +}; + +/* SBI SSE Event IDs. 
*/ +/* Range 0x00000000 - 0x0000ffff */ +#define SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS 0x00000000 +#define SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP 0x00000001 +#define SBI_SSE_EVENT_LOCAL_RESERVED_0_START 0x00000002 +#define SBI_SSE_EVENT_LOCAL_RESERVED_0_END 0x00003fff +#define SBI_SSE_EVENT_LOCAL_PLAT_0_START 0x00004000 +#define SBI_SSE_EVENT_LOCAL_PLAT_0_END 0x00007fff + +#define SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS 0x00008000 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_0_START 0x00008001 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_0_END 0x0000bfff +#define SBI_SSE_EVENT_GLOBAL_PLAT_0_START 0x0000c000 +#define SBI_SSE_EVENT_GLOBAL_PLAT_0_END 0x0000ffff + +/* Range 0x00010000 - 0x0001ffff */ +#define SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW 0x00010000 +#define SBI_SSE_EVENT_LOCAL_RESERVED_1_START 0x00010001 +#define SBI_SSE_EVENT_LOCAL_RESERVED_1_END 0x00013fff +#define SBI_SSE_EVENT_LOCAL_PLAT_1_START 0x00014000 +#define SBI_SSE_EVENT_LOCAL_PLAT_1_END 0x00017fff + +#define SBI_SSE_EVENT_GLOBAL_RESERVED_1_START 0x00018000 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_1_END 0x0001bfff +#define SBI_SSE_EVENT_GLOBAL_PLAT_1_START 0x0001c000 +#define SBI_SSE_EVENT_GLOBAL_PLAT_1_END 0x0001ffff + +/* Range 0x00100000 - 0x0010ffff */ +#define SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS 0x00100000 +#define SBI_SSE_EVENT_LOCAL_RESERVED_2_START 0x00100001 +#define SBI_SSE_EVENT_LOCAL_RESERVED_2_END 0x00103fff +#define SBI_SSE_EVENT_LOCAL_PLAT_2_START 0x00104000 +#define SBI_SSE_EVENT_LOCAL_PLAT_2_END 0x00107fff + +#define SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS 0x00108000 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_2_START 0x00108001 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_2_END 0x0010bfff +#define SBI_SSE_EVENT_GLOBAL_PLAT_2_START 0x0010c000 +#define SBI_SSE_EVENT_GLOBAL_PLAT_2_END 0x0010ffff + +/* Range 0xffff0000 - 0xffffffff */ +#define SBI_SSE_EVENT_LOCAL_SOFTWARE 0xffff0000 +#define SBI_SSE_EVENT_LOCAL_RESERVED_3_START 0xffff0001 +#define SBI_SSE_EVENT_LOCAL_RESERVED_3_END 0xffff3fff +#define SBI_SSE_EVENT_LOCAL_PLAT_3_START 
0xffff4000 +#define SBI_SSE_EVENT_LOCAL_PLAT_3_END 0xffff7fff + +#define SBI_SSE_EVENT_GLOBAL_SOFTWARE 0xffff8000 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_3_START 0xffff8001 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_3_END 0xffffbfff +#define SBI_SSE_EVENT_GLOBAL_PLAT_3_START 0xffffc000 +#define SBI_SSE_EVENT_GLOBAL_PLAT_3_END 0xffffffff + +#define SBI_SSE_EVENT_PLATFORM_BIT BIT(14) +#define SBI_SSE_EVENT_GLOBAL_BIT BIT(15) + struct sbiret { long error; long value; -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 03:19:53 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 11:19:53 +0100 Subject: [kvm-unit-tests PATCH v10 7/8] lib: riscv: Add SBI SSE support In-Reply-To: <20250317101956.526834-1-cleger@rivosinc.com> References: <20250317101956.526834-1-cleger@rivosinc.com> Message-ID: <20250317101956.526834-8-cleger@rivosinc.com> Add support for registering and handling SSE events. This will be used by sbi test as well as upcoming double trap tests. 
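A small standalone sketch, for illustration only, of how the event ID encoding from patch 6/8 is decoded by the helper this patch adds. The defines and sbi_sse_event_is_global() mirror the diffs in this series; sse_event_is_platform() and the local BIT() macro are assumptions added here so the snippet builds on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the library's BIT() macro (assumption for host builds). */
#define BIT(n)				(1UL << (n))

/* Copied from the SBI SSE definitions in patch 6/8. */
#define SBI_SSE_EVENT_LOCAL_SOFTWARE	0xffff0000
#define SBI_SSE_EVENT_GLOBAL_SOFTWARE	0xffff8000
#define SBI_SSE_EVENT_LOCAL_PLAT_0_START	0x00004000
#define SBI_SSE_EVENT_GLOBAL_PLAT_0_START	0x0000c000
#define SBI_SSE_EVENT_PLATFORM_BIT	BIT(14)
#define SBI_SSE_EVENT_GLOBAL_BIT	BIT(15)

/* Same test as the helper added to lib/riscv/asm/sbi.h in patch 7/8. */
static inline bool sbi_sse_event_is_global(uint32_t event_id)
{
	return !!(event_id & SBI_SSE_EVENT_GLOBAL_BIT);
}

/* Hypothetical companion helper, not part of the series. */
static inline bool sse_event_is_platform(uint32_t event_id)
{
	return !!(event_id & SBI_SSE_EVENT_PLATFORM_BIT);
}
```

Bit 15 separates local from global events and bit 14 marks the platform-specific ranges, which is why each range in the definitions splits at 0x4000, 0x8000 and 0xc000.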
Signed-off-by: Cl?ment L?ger Reviewed-by: Andrew Jones --- riscv/Makefile | 1 + lib/riscv/asm/csr.h | 1 + lib/riscv/asm/sbi.h | 36 ++++++++++++++ lib/riscv/sbi-sse-asm.S | 102 ++++++++++++++++++++++++++++++++++++++++ lib/riscv/asm-offsets.c | 9 ++++ lib/riscv/sbi.c | 76 ++++++++++++++++++++++++++++++ 6 files changed, 225 insertions(+) create mode 100644 lib/riscv/sbi-sse-asm.S diff --git a/riscv/Makefile b/riscv/Makefile index 02d2ac39..16fc125b 100644 --- a/riscv/Makefile +++ b/riscv/Makefile @@ -43,6 +43,7 @@ cflatobjs += lib/riscv/setup.o cflatobjs += lib/riscv/smp.o cflatobjs += lib/riscv/stack.o cflatobjs += lib/riscv/timer.o +cflatobjs += lib/riscv/sbi-sse-asm.o ifeq ($(ARCH),riscv32) cflatobjs += lib/ldiv32.o endif diff --git a/lib/riscv/asm/csr.h b/lib/riscv/asm/csr.h index c7fc87a9..3e4b5fca 100644 --- a/lib/riscv/asm/csr.h +++ b/lib/riscv/asm/csr.h @@ -17,6 +17,7 @@ #define CSR_TIME 0xc01 #define SR_SIE _AC(0x00000002, UL) +#define SR_SPP _AC(0x00000100, UL) /* Exception cause high bit - is an interrupt if set */ #define CAUSE_IRQ_FLAG (_AC(1, UL) << (__riscv_xlen - 1)) diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h index b8688e47..ebc89c10 100644 --- a/lib/riscv/asm/sbi.h +++ b/lib/riscv/asm/sbi.h @@ -275,5 +275,41 @@ static inline bool sbi_check_impl(unsigned long impl) return ret.value == impl; } +typedef void (*sbi_sse_handler_fn)(void *data, struct pt_regs *regs, unsigned int hartid); + +struct sbi_sse_handler_arg { + unsigned long reg_tmp; + sbi_sse_handler_fn handler; + void *handler_data; + void *stack; +}; + +extern void sbi_sse_entry(void); + +static inline bool sbi_sse_event_is_global(uint32_t event_id) +{ + return !!(event_id & SBI_SSE_EVENT_GLOBAL_BIT); +} + +struct sbiret sbi_sse_read_attrs_raw(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long phys_lo, + unsigned long phys_hi); +struct sbiret sbi_sse_read_attrs(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, 
unsigned long *values); +struct sbiret sbi_sse_write_attrs_raw(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long phys_lo, + unsigned long phys_hi); +struct sbiret sbi_sse_write_attrs(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long *values); +struct sbiret sbi_sse_register_raw(unsigned long event_id, unsigned long entry_pc, + unsigned long entry_arg); +struct sbiret sbi_sse_register(unsigned long event_id, struct sbi_sse_handler_arg *arg); +struct sbiret sbi_sse_unregister(unsigned long event_id); +struct sbiret sbi_sse_enable(unsigned long event_id); +struct sbiret sbi_sse_disable(unsigned long event_id); +struct sbiret sbi_sse_hart_mask(void); +struct sbiret sbi_sse_hart_unmask(void); +struct sbiret sbi_sse_inject(unsigned long event_id, unsigned long hart_id); + #endif /* !__ASSEMBLER__ */ #endif /* _ASMRISCV_SBI_H_ */ diff --git a/lib/riscv/sbi-sse-asm.S b/lib/riscv/sbi-sse-asm.S new file mode 100644 index 00000000..b9e951f5 --- /dev/null +++ b/lib/riscv/sbi-sse-asm.S @@ -0,0 +1,102 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * RISC-V SSE events entry point. 
+ * + * Copyright (C) 2025, Rivos Inc., Cl?ment L?ger + */ +#include +#include +#include +#include + +.section .text +.global sbi_sse_entry +sbi_sse_entry: + /* Save stack temporarily */ + REG_S sp, SBI_SSE_REG_TMP(a7) + /* Set entry stack */ + REG_L sp, SBI_SSE_HANDLER_STACK(a7) + + addi sp, sp, -(PT_SIZE) + REG_S ra, PT_RA(sp) + REG_S s0, PT_S0(sp) + REG_S s1, PT_S1(sp) + REG_S s2, PT_S2(sp) + REG_S s3, PT_S3(sp) + REG_S s4, PT_S4(sp) + REG_S s5, PT_S5(sp) + REG_S s6, PT_S6(sp) + REG_S s7, PT_S7(sp) + REG_S s8, PT_S8(sp) + REG_S s9, PT_S9(sp) + REG_S s10, PT_S10(sp) + REG_S s11, PT_S11(sp) + REG_S tp, PT_TP(sp) + REG_S t0, PT_T0(sp) + REG_S t1, PT_T1(sp) + REG_S t2, PT_T2(sp) + REG_S t3, PT_T3(sp) + REG_S t4, PT_T4(sp) + REG_S t5, PT_T5(sp) + REG_S t6, PT_T6(sp) + REG_S gp, PT_GP(sp) + REG_S a0, PT_A0(sp) + REG_S a1, PT_A1(sp) + REG_S a2, PT_A2(sp) + REG_S a3, PT_A3(sp) + REG_S a4, PT_A4(sp) + REG_S a5, PT_A5(sp) + csrr a1, CSR_SEPC + REG_S a1, PT_EPC(sp) + csrr a2, CSR_SSTATUS + REG_S a2, PT_STATUS(sp) + + REG_L a0, SBI_SSE_REG_TMP(a7) + REG_S a0, PT_SP(sp) + + REG_L t0, SBI_SSE_HANDLER(a7) + REG_L a0, SBI_SSE_HANDLER_DATA(a7) + mv a1, sp + mv a2, a6 + jalr t0 + + REG_L a1, PT_EPC(sp) + REG_L a2, PT_STATUS(sp) + csrw CSR_SEPC, a1 + csrw CSR_SSTATUS, a2 + + REG_L ra, PT_RA(sp) + REG_L s0, PT_S0(sp) + REG_L s1, PT_S1(sp) + REG_L s2, PT_S2(sp) + REG_L s3, PT_S3(sp) + REG_L s4, PT_S4(sp) + REG_L s5, PT_S5(sp) + REG_L s6, PT_S6(sp) + REG_L s7, PT_S7(sp) + REG_L s8, PT_S8(sp) + REG_L s9, PT_S9(sp) + REG_L s10, PT_S10(sp) + REG_L s11, PT_S11(sp) + REG_L tp, PT_TP(sp) + REG_L t0, PT_T0(sp) + REG_L t1, PT_T1(sp) + REG_L t2, PT_T2(sp) + REG_L t3, PT_T3(sp) + REG_L t4, PT_T4(sp) + REG_L t5, PT_T5(sp) + REG_L t6, PT_T6(sp) + REG_L gp, PT_GP(sp) + REG_L a0, PT_A0(sp) + REG_L a1, PT_A1(sp) + REG_L a2, PT_A2(sp) + REG_L a3, PT_A3(sp) + REG_L a4, PT_A4(sp) + REG_L a5, PT_A5(sp) + + REG_L sp, PT_SP(sp) + + li a7, ASM_SBI_EXT_SSE + li a6, ASM_SBI_EXT_SSE_COMPLETE + ecall + diff 
--git a/lib/riscv/asm-offsets.c b/lib/riscv/asm-offsets.c index 6c511c14..a96c6e97 100644 --- a/lib/riscv/asm-offsets.c +++ b/lib/riscv/asm-offsets.c @@ -3,6 +3,7 @@ #include #include #include +#include #include int main(void) @@ -63,5 +64,13 @@ int main(void) OFFSET(THREAD_INFO_HARTID, thread_info, hartid); DEFINE(THREAD_INFO_SIZE, sizeof(struct thread_info)); + DEFINE(ASM_SBI_EXT_SSE, SBI_EXT_SSE); + DEFINE(ASM_SBI_EXT_SSE_COMPLETE, SBI_EXT_SSE_COMPLETE); + + OFFSET(SBI_SSE_REG_TMP, sbi_sse_handler_arg, reg_tmp); + OFFSET(SBI_SSE_HANDLER, sbi_sse_handler_arg, handler); + OFFSET(SBI_SSE_HANDLER_DATA, sbi_sse_handler_arg, handler_data); + OFFSET(SBI_SSE_HANDLER_STACK, sbi_sse_handler_arg, stack); + return 0; } diff --git a/lib/riscv/sbi.c b/lib/riscv/sbi.c index 9cb5757e..8249cb1b 100644 --- a/lib/riscv/sbi.c +++ b/lib/riscv/sbi.c @@ -2,6 +2,7 @@ #include #include #include +#include #include #include @@ -31,6 +32,81 @@ struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0, return ret; } +struct sbiret sbi_sse_read_attrs_raw(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long phys_lo, + unsigned long phys_hi) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_READ_ATTRS, event_id, base_attr_id, attr_count, + phys_lo, phys_hi, 0); +} + +struct sbiret sbi_sse_read_attrs(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long *values) +{ + phys_addr_t p = virt_to_phys(values); + + return sbi_sse_read_attrs_raw(event_id, base_attr_id, attr_count, lower_32_bits(p), + upper_32_bits(p)); +} + +struct sbiret sbi_sse_write_attrs_raw(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long phys_lo, + unsigned long phys_hi) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_WRITE_ATTRS, event_id, base_attr_id, attr_count, + phys_lo, phys_hi, 0); +} + +struct sbiret sbi_sse_write_attrs(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, 
unsigned long *values) +{ + phys_addr_t p = virt_to_phys(values); + + return sbi_sse_write_attrs_raw(event_id, base_attr_id, attr_count, lower_32_bits(p), + upper_32_bits(p)); +} + +struct sbiret sbi_sse_register_raw(unsigned long event_id, unsigned long entry_pc, + unsigned long entry_arg) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_REGISTER, event_id, entry_pc, entry_arg, 0, 0, 0); +} + +struct sbiret sbi_sse_register(unsigned long event_id, struct sbi_sse_handler_arg *arg) +{ + return sbi_sse_register_raw(event_id, (unsigned long)sbi_sse_entry, (unsigned long)arg); +} + +struct sbiret sbi_sse_unregister(unsigned long event_id) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_UNREGISTER, event_id, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_sse_enable(unsigned long event_id) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_ENABLE, event_id, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_sse_disable(unsigned long event_id) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_DISABLE, event_id, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_sse_hart_mask(void) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_HART_MASK, 0, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_sse_hart_unmask(void) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_HART_UNMASK, 0, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_sse_inject(unsigned long event_id, unsigned long hart_id) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_INJECT, event_id, hart_id, 0, 0, 0, 0); +} + void sbi_shutdown(void) { sbi_ecall(SBI_EXT_SRST, 0, 0, 0, 0, 0, 0, 0); -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 03:19:54 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 11:19:54 +0100 Subject: [kvm-unit-tests PATCH v10 8/8] riscv: sbi: Add SSE extension tests In-Reply-To: <20250317101956.526834-1-cleger@rivosinc.com> References: <20250317101956.526834-1-cleger@rivosinc.com> Message-ID: <20250317101956.526834-9-cleger@rivosinc.com> Add SBI SSE extension tests for the following features: - Test attributes 
errors (invalid values, RO, etc) - Registration errors - Simple events (register, enable, inject) - Events with different priorities - Global events dispatch on different harts - Local events on all harts - Hart mask/unmask events Signed-off-by: Cl?ment L?ger --- riscv/Makefile | 1 + riscv/sbi-tests.h | 1 + riscv/sbi-sse.c | 1280 +++++++++++++++++++++++++++++++++++++++++++++ riscv/sbi.c | 2 + 4 files changed, 1284 insertions(+) create mode 100644 riscv/sbi-sse.c diff --git a/riscv/Makefile b/riscv/Makefile index 16fc125b..4fe2f1bb 100644 --- a/riscv/Makefile +++ b/riscv/Makefile @@ -18,6 +18,7 @@ tests += $(TEST_DIR)/sieve.$(exe) all: $(tests) $(TEST_DIR)/sbi-deps = $(TEST_DIR)/sbi-asm.o $(TEST_DIR)/sbi-fwft.o +$(TEST_DIR)/sbi-deps += $(TEST_DIR)/sbi-sse.o # When built for EFI sieve needs extra memory, run with e.g. '-m 256' on QEMU $(TEST_DIR)/sieve.$(exe): AUXFLAGS = 0x1 diff --git a/riscv/sbi-tests.h b/riscv/sbi-tests.h index b081464d..a71da809 100644 --- a/riscv/sbi-tests.h +++ b/riscv/sbi-tests.h @@ -71,6 +71,7 @@ sbiret_report(ret, expected_error, expected_value, "check sbi.error and sbi.value") void sbi_bad_fid(int ext); +void check_sse(void); #endif /* __ASSEMBLER__ */ #endif /* _RISCV_SBI_TESTS_H_ */ diff --git a/riscv/sbi-sse.c b/riscv/sbi-sse.c new file mode 100644 index 00000000..d6d313b2 --- /dev/null +++ b/riscv/sbi-sse.c @@ -0,0 +1,1280 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * SBI SSE testsuite + * + * Copyright (C) 2025, Rivos Inc., Cl?ment L?ger + */ +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "sbi-tests.h" + +#define SSE_STACK_SIZE PAGE_SIZE + +struct sse_event_info { + uint32_t event_id; + const char *name; + bool can_inject; +}; + +static struct sse_event_info sse_event_infos[] = { + { + .event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, + .name = "local_high_prio_ras", + }, + { + .event_id = 
SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP, + .name = "double_trap", + }, + { + .event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, + .name = "global_high_prio_ras", + }, + { + .event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, + .name = "local_pmu_overflow", + }, + { + .event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, + .name = "local_low_prio_ras", + }, + { + .event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, + .name = "global_low_prio_ras", + }, + { + .event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, + .name = "local_software", + }, + { + .event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, + .name = "global_software", + }, +}; + +static const char *const attr_names[] = { + [SBI_SSE_ATTR_STATUS] = "status", + [SBI_SSE_ATTR_PRIORITY] = "priority", + [SBI_SSE_ATTR_CONFIG] = "config", + [SBI_SSE_ATTR_PREFERRED_HART] = "preferred_hart", + [SBI_SSE_ATTR_ENTRY_PC] = "entry_pc", + [SBI_SSE_ATTR_ENTRY_ARG] = "entry_arg", + [SBI_SSE_ATTR_INTERRUPTED_SEPC] = "interrupted_sepc", + [SBI_SSE_ATTR_INTERRUPTED_FLAGS] = "interrupted_flags", + [SBI_SSE_ATTR_INTERRUPTED_A6] = "interrupted_a6", + [SBI_SSE_ATTR_INTERRUPTED_A7] = "interrupted_a7", +}; + +static const unsigned long ro_attrs[] = { + SBI_SSE_ATTR_STATUS, + SBI_SSE_ATTR_ENTRY_PC, + SBI_SSE_ATTR_ENTRY_ARG, +}; + +static const unsigned long interrupted_attrs[] = { + SBI_SSE_ATTR_INTERRUPTED_SEPC, + SBI_SSE_ATTR_INTERRUPTED_FLAGS, + SBI_SSE_ATTR_INTERRUPTED_A6, + SBI_SSE_ATTR_INTERRUPTED_A7, +}; + +static const unsigned long interrupted_flags[] = { + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPP, + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPIE, + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPELP, + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT, + SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPV, + SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPVP, +}; + +static struct sse_event_info *sse_event_get_info(uint32_t event_id) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(sse_event_infos); i++) { + if (sse_event_infos[i].event_id == event_id) + return &sse_event_infos[i]; + } + + assert_msg(false, 
"Invalid event id: %d", event_id); +} + +static const char *sse_event_name(uint32_t event_id) +{ + return sse_event_get_info(event_id)->name; +} + +static bool sse_event_can_inject(uint32_t event_id) +{ + return sse_event_get_info(event_id)->can_inject; +} + +static struct sbiret sse_get_event_status_field(uint32_t event_id, unsigned long mask, + unsigned long shift, unsigned long *value) +{ + struct sbiret ret; + unsigned long status; + + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, 1, &status); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "Get event status"); + return ret; + } + + *value = (status & mask) >> shift; + + return ret; +} + +static struct sbiret sse_event_get_state(uint32_t event_id, enum sbi_sse_state *state) +{ + unsigned long status = 0; + struct sbiret ret; + + ret = sse_get_event_status_field(event_id, SBI_SSE_ATTR_STATUS_STATE_MASK, + SBI_SSE_ATTR_STATUS_STATE_OFFSET, &status); + *state = status; + + return ret; +} + +static unsigned long sse_global_event_set_current_hart(uint32_t event_id) +{ + struct sbiret ret; + unsigned long current_hart = current_thread_info()->hartid; + + assert(sbi_sse_event_is_global(event_id)); + + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &current_hart); + if (sbiret_report_error(&ret, SBI_SUCCESS, "Set preferred hart")) + return ret.error; + + return 0; +} + +static bool sse_check_state(uint32_t event_id, unsigned long expected_state) +{ + struct sbiret ret; + enum sbi_sse_state state; + + ret = sse_event_get_state(event_id, &state); + if (ret.error) + return false; + + return report(state == expected_state, "event status == %ld", expected_state); +} + +static bool sse_event_pending(uint32_t event_id) +{ + bool pending = 0; + + sse_get_event_status_field(event_id, BIT(SBI_SSE_ATTR_STATUS_PENDING_OFFSET), + SBI_SSE_ATTR_STATUS_PENDING_OFFSET, (unsigned long *)&pending); + + return pending; +} + +static void *sse_alloc_stack(void) +{ + /* + * We assume that SSE_STACK_SIZE 
always fit in one page. This page will + * always be decremented before storing anything on it in sse-entry.S. + */ + assert(SSE_STACK_SIZE <= PAGE_SIZE); + + return (alloc_page() + SSE_STACK_SIZE); +} + +static void sse_free_stack(void *stack) +{ + free_page(stack - SSE_STACK_SIZE); +} + +static void sse_read_write_test(uint32_t event_id, unsigned long attr, unsigned long attr_count, + unsigned long *value, long expected_error, const char *str) +{ + struct sbiret ret; + + ret = sbi_sse_read_attrs(event_id, attr, attr_count, value); + sbiret_report_error(&ret, expected_error, "Read %s error", str); + + ret = sbi_sse_write_attrs(event_id, attr, attr_count, value); + sbiret_report_error(&ret, expected_error, "Write %s error", str); +} + +#define ALL_ATTRS_COUNT (SBI_SSE_ATTR_INTERRUPTED_A7 + 1) + +static void sse_test_attrs(uint32_t event_id) +{ + unsigned long value = 0; + struct sbiret ret; + void *ptr; + unsigned long values[ALL_ATTRS_COUNT]; + unsigned int i; + const char *invalid_hart_str; + const char *attr_name; + + report_prefix_push("attrs"); + + for (i = 0; i < ARRAY_SIZE(ro_attrs); i++) { + ret = sbi_sse_write_attrs(event_id, ro_attrs[i], 1, &value); + sbiret_report_error(&ret, SBI_ERR_DENIED, "RO attribute %s not writable", + attr_names[ro_attrs[i]]); + } + + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, ALL_ATTRS_COUNT, values); + sbiret_report_error(&ret, SBI_SUCCESS, "Read multiple attributes"); + + for (i = SBI_SSE_ATTR_STATUS; i <= SBI_SSE_ATTR_INTERRUPTED_A7; i++) { + ret = sbi_sse_read_attrs(event_id, i, 1, &value); + attr_name = attr_names[i]; + + sbiret_report_error(&ret, SBI_SUCCESS, "Read single attribute %s", attr_name); + if (values[i] != value) + report_fail("Attribute 0x%x single value read (0x%lx) differs from the one read with multiple attributes (0x%lx)", + i, value, values[i]); + /* + * Preferred hart reset value is defined by SBI vendor + */ + if (i != SBI_SSE_ATTR_PREFERRED_HART) { + /* + * Specification states that 
injectable bit is implementation dependent + * but other bits are zero-initialized. + */ + if (i == SBI_SSE_ATTR_STATUS) + value &= ~BIT(SBI_SSE_ATTR_STATUS_INJECT_OFFSET); + report(value == 0, "Attribute %s reset value is 0, found %lx", attr_name, value); + } + } + +#if __riscv_xlen > 32 + value = BIT(32); + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Write invalid prio > 0xFFFFFFFF error"); +#endif + + value = ~SBI_SSE_ATTR_CONFIG_ONESHOT; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Write invalid config value error"); + + if (sbi_sse_event_is_global(event_id)) { + invalid_hart_str = getenv("INVALID_HART_ID"); + if (!invalid_hart_str) + value = 0xFFFFFFFFUL; + else + value = strtoul(invalid_hart_str, NULL, 0); + + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid hart id error"); + } else { + /* Set Hart on local event -> RO */ + value = current_thread_info()->hartid; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); + sbiret_report_error(&ret, SBI_ERR_DENIED, + "Set hart id on local event error"); + } + + /* Set/get flags, sepc, a6, a7 */ + for (i = 0; i < ARRAY_SIZE(interrupted_attrs); i++) { + attr_name = attr_names[interrupted_attrs[i]]; + ret = sbi_sse_read_attrs(event_id, interrupted_attrs[i], 1, &value); + sbiret_report_error(&ret, SBI_SUCCESS, "Get interrupted %s", attr_name); + + value = ARRAY_SIZE(interrupted_attrs) - i; + ret = sbi_sse_write_attrs(event_id, interrupted_attrs[i], 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, + "Set attribute %s invalid state error", attr_name); + } + + sse_read_write_test(event_id, SBI_SSE_ATTR_STATUS, 0, &value, SBI_ERR_INVALID_PARAM, + "attribute attr_count == 0"); + sse_read_write_test(event_id, SBI_SSE_ATTR_INTERRUPTED_A7 + 1, 
1, &value, SBI_ERR_BAD_RANGE, + "invalid attribute"); + + /* Misaligned pointer address */ + ptr = (void *)&value; + ptr += 1; + sse_read_write_test(event_id, SBI_SSE_ATTR_STATUS, 1, ptr, SBI_ERR_INVALID_ADDRESS, + "attribute with invalid address"); + + report_prefix_pop(); +} + +static void sse_test_register_error(uint32_t event_id) +{ + struct sbiret ret; + + report_prefix_push("register"); + + ret = sbi_sse_unregister(event_id); + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "unregister non-registered event"); + + ret = sbi_sse_register_raw(event_id, 0x1, 0); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "register misaligned entry"); + + ret = sbi_sse_register(event_id, NULL); + sbiret_report_error(&ret, SBI_SUCCESS, "register"); + if (ret.error) + goto done; + + ret = sbi_sse_register(event_id, NULL); + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "register used event failure"); + + ret = sbi_sse_unregister(event_id); + sbiret_report_error(&ret, SBI_SUCCESS, "unregister"); + +done: + report_prefix_pop(); +} + +struct sse_simple_test_arg { + bool done; + unsigned long expected_a6; + uint32_t event_id; +}; + +#if __riscv_xlen > 32 + +struct alias_test_params { + unsigned long event_id; + unsigned long attr_id; + unsigned long attr_count; + const char *str; +}; + +static void test_alias(uint32_t event_id) +{ + struct alias_test_params *write, *read; + unsigned long write_value, read_value; + struct sbiret ret; + bool err = false; + int r, w; + struct alias_test_params params[] = { + {event_id, SBI_SSE_ATTR_INTERRUPTED_A6, 1, "non aliased"}, + {BIT(32) + event_id, SBI_SSE_ATTR_INTERRUPTED_A6, 1, "aliased event_id"}, + {event_id, BIT(32) + SBI_SSE_ATTR_INTERRUPTED_A6, 1, "aliased attr_id"}, + {event_id, SBI_SSE_ATTR_INTERRUPTED_A6, BIT(32) + 1, "aliased attr_count"}, + }; + + report_prefix_push("alias"); + for (w = 0; w < ARRAY_SIZE(params); w++) { + write = ¶ms[w]; + + write_value = 0xDEADBEEF + w; + ret = sbi_sse_write_attrs(write->event_id, 
write->attr_id, write->attr_count, &write_value); + if (ret.error) + sbiret_report_error(&ret, SBI_SUCCESS, "Write %s, event 0x%lx attr 0x%lx, attr count 0x%lx", + write->str, write->event_id, write->attr_id, write->attr_count); + + for (r = 0; r < ARRAY_SIZE(params); r++) { + read = ¶ms[r]; + read_value = 0; + ret = sbi_sse_read_attrs(read->event_id, read->attr_id, read->attr_count, &read_value); + if (ret.error) + sbiret_report_error(&ret, SBI_SUCCESS, + "Read %s, event 0x%lx attr 0x%lx, attr count 0x%lx", + read->str, read->event_id, read->attr_id, read->attr_count); + + /* Do not spam output with a lot of reports */ + if (write_value != read_value) { + err = true; + report_fail("Write %s, event 0x%lx attr 0x%lx, attr count 0x%lx value %lx ==" + "Read %s, event 0x%lx attr 0x%lx, attr count 0x%lx value %lx", + write->str, write->event_id, write->attr_id, + write->attr_count, write_value, read->str, + read->event_id, read->attr_id, read->attr_count, + read_value); + } + } + } + + report(!err, "BIT(32) aliasing tests"); + report_prefix_pop(); +} +#endif + +static void sse_simple_handler(void *data, struct pt_regs *regs, unsigned int hartid) +{ + struct sse_simple_test_arg *arg = data; + int i; + struct sbiret ret; + const char *attr_name; + uint32_t event_id = READ_ONCE(arg->event_id), attr; + unsigned long value, prev_value, flags; + unsigned long interrupted_state[ARRAY_SIZE(interrupted_attrs)]; + unsigned long modified_state[ARRAY_SIZE(interrupted_attrs)] = {4, 3, 2, 1}; + unsigned long tmp_state[ARRAY_SIZE(interrupted_attrs)]; + + report((regs->status & SR_SPP) == SR_SPP, "Interrupted S-mode"); + report(hartid == current_thread_info()->hartid, "Hartid correctly passed"); + sse_check_state(event_id, SBI_SSE_STATE_RUNNING); + report(!sse_event_pending(event_id), "Event not pending"); + + /* Read full interrupted state */ + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, + ARRAY_SIZE(interrupted_attrs), interrupted_state); + 
sbiret_report_error(&ret, SBI_SUCCESS, "Save full interrupted state from handler"); + + /* Write full modified state and read it */ + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, + ARRAY_SIZE(modified_state), modified_state); + sbiret_report_error(&ret, SBI_SUCCESS, + "Write full interrupted state from handler"); + + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, + ARRAY_SIZE(tmp_state), tmp_state); + sbiret_report_error(&ret, SBI_SUCCESS, "Read full modified state from handler"); + + report(memcmp(tmp_state, modified_state, sizeof(modified_state)) == 0, + "Full interrupted state successfully written"); + +#if __riscv_xlen > 32 + test_alias(event_id); +#endif + + /* Restore full saved state */ + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, + ARRAY_SIZE(interrupted_attrs), interrupted_state); + sbiret_report_error(&ret, SBI_SUCCESS, "Full interrupted state restore from handler"); + + /* We test SBI_SSE_ATTR_INTERRUPTED_FLAGS below with specific flag values */ + for (i = 0; i < ARRAY_SIZE(interrupted_attrs); i++) { + attr = interrupted_attrs[i]; + if (attr == SBI_SSE_ATTR_INTERRUPTED_FLAGS) + continue; + + attr_name = attr_names[attr]; + + ret = sbi_sse_read_attrs(event_id, attr, 1, &prev_value); + sbiret_report_error(&ret, SBI_SUCCESS, "Get attr %s", attr_name); + + value = 0xDEADBEEF + i; + ret = sbi_sse_write_attrs(event_id, attr, 1, &value); + sbiret_report_error(&ret, SBI_SUCCESS, "Set attr %s", attr_name); + + ret = sbi_sse_read_attrs(event_id, attr, 1, &value); + sbiret_report_error(&ret, SBI_SUCCESS, "Get attr %s", attr_name); + report(value == 0xDEADBEEF + i, "Get attr %s, value: 0x%lx", attr_name, value); + + ret = sbi_sse_write_attrs(event_id, attr, 1, &prev_value); + sbiret_report_error(&ret, SBI_SUCCESS, "Restore attr %s value", attr_name); + } + + /* Test all flags allowed for SBI_SSE_ATTR_INTERRUPTED_FLAGS */ + attr = SBI_SSE_ATTR_INTERRUPTED_FLAGS; + ret = sbi_sse_read_attrs(event_id, 
attr, 1, &prev_value); + sbiret_report_error(&ret, SBI_SUCCESS, "Save interrupted flags"); + + for (i = 0; i < ARRAY_SIZE(interrupted_flags); i++) { + flags = interrupted_flags[i]; + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); + sbiret_report_error(&ret, SBI_SUCCESS, + "Set interrupted flags bit 0x%lx value", flags); + ret = sbi_sse_read_attrs(event_id, attr, 1, &value); + sbiret_report_error(&ret, SBI_SUCCESS, "Get interrupted flags after set"); + report(value == flags, "interrupted flags modified value: 0x%lx", value); + } + + /* Write invalid bit in flag register */ + flags = SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT << 1; + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", + flags); + + flags = BIT(SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT + 1); + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", + flags); + + ret = sbi_sse_write_attrs(event_id, attr, 1, &prev_value); + sbiret_report_error(&ret, SBI_SUCCESS, "Restore interrupted flags"); + + /* Try to change HARTID/Priority while running */ + if (sbi_sse_event_is_global(event_id)) { + value = current_thread_info()->hartid; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "Set hart id while running error"); + } + + value = 0; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "Set priority while running error"); + + value = READ_ONCE(arg->expected_a6); + report(interrupted_state[2] == value, "Interrupted state a6, expected 0x%lx, got 0x%lx", + value, interrupted_state[2]); + + report(interrupted_state[3] == SBI_EXT_SSE, + "Interrupted state a7, expected 0x%x, got 0x%lx", SBI_EXT_SSE, + interrupted_state[3]); + + WRITE_ONCE(arg->done, true); +} + +static 
void sse_test_inject_simple(uint32_t event_id) +{ + unsigned long value, error; + struct sbiret ret; + enum sbi_sse_state state = SBI_SSE_STATE_UNUSED; + struct sse_simple_test_arg test_arg = {.event_id = event_id}; + struct sbi_sse_handler_arg args = { + .handler = sse_simple_handler, + .handler_data = (void *)&test_arg, + .stack = sse_alloc_stack(), + }; + + report_prefix_push("simple"); + + if (!sse_check_state(event_id, SBI_SSE_STATE_UNUSED)) + goto cleanup; + + ret = sbi_sse_register(event_id, &args); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "register")) + goto cleanup; + + state = SBI_SSE_STATE_REGISTERED; + + if (!sse_check_state(event_id, SBI_SSE_STATE_REGISTERED)) + goto cleanup; + + if (sbi_sse_event_is_global(event_id)) { + /* Be sure global events are targeting the current hart */ + error = sse_global_event_set_current_hart(event_id); + if (error) + goto cleanup; + } + + ret = sbi_sse_enable(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "enable")) + goto cleanup; + + state = SBI_SSE_STATE_ENABLED; + if (!sse_check_state(event_id, SBI_SSE_STATE_ENABLED)) + goto cleanup; + + ret = sbi_sse_hart_mask(); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "hart mask")) + goto cleanup; + + ret = sbi_sse_inject(event_id, current_thread_info()->hartid); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "injection masked")) { + sbi_sse_hart_unmask(); + goto cleanup; + } + + report(READ_ONCE(test_arg.done) == 0, "event masked not handled"); + + /* + * When unmasking the SSE events, we expect it to be injected + * immediately so a6 should be SBI_EXT_SBI_SSE_HART_UNMASK + */ + WRITE_ONCE(test_arg.expected_a6, SBI_EXT_SSE_HART_UNMASK); + ret = sbi_sse_hart_unmask(); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "hart unmask")) + goto cleanup; + + report(READ_ONCE(test_arg.done) == 1, "event unmasked handled"); + WRITE_ONCE(test_arg.done, 0); + WRITE_ONCE(test_arg.expected_a6, SBI_EXT_SSE_INJECT); + + /* Set as oneshot and verify it is disabled */ + ret = 
sbi_sse_disable(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Disable event")) { + /* Nothing we can really do here, event can not be disabled */ + goto cleanup; + } + state = SBI_SSE_STATE_REGISTERED; + + value = SBI_SSE_ATTR_CONFIG_ONESHOT; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Set event attribute as ONESHOT")) + goto cleanup; + + ret = sbi_sse_enable(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Enable event")) + goto cleanup; + state = SBI_SSE_STATE_ENABLED; + + ret = sbi_sse_inject(event_id, current_thread_info()->hartid); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "second injection")) + goto cleanup; + + report(READ_ONCE(test_arg.done) == 1, "event handled"); + WRITE_ONCE(test_arg.done, 0); + + if (!sse_check_state(event_id, SBI_SSE_STATE_REGISTERED)) + goto cleanup; + state = SBI_SSE_STATE_REGISTERED; + + /* Clear ONESHOT FLAG */ + value = 0; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Clear CONFIG.ONESHOT flag")) + goto cleanup; + + ret = sbi_sse_unregister(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "unregister")) + goto cleanup; + state = SBI_SSE_STATE_UNUSED; + + sse_check_state(event_id, SBI_SSE_STATE_UNUSED); + +cleanup: + switch (state) { + case SBI_SSE_STATE_ENABLED: + ret = sbi_sse_disable(event_id); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "disable event 0x%x", event_id); + break; + } + case SBI_SSE_STATE_REGISTERED: + sbi_sse_unregister(event_id); + if (ret.error) + sbiret_report_error(&ret, SBI_SUCCESS, "unregister event 0x%x", event_id); + default: + } + + sse_free_stack(args.stack); + report_prefix_pop(); +} + +struct sse_foreign_cpu_test_arg { + bool done; + unsigned int expected_cpu; + uint32_t event_id; +}; + +static void sse_foreign_cpu_handler(void *data, struct pt_regs *regs, unsigned int hartid) +{ + struct 
sse_foreign_cpu_test_arg *arg = data; + unsigned int expected_cpu; + + /* For arg content to be visible */ + smp_rmb(); + expected_cpu = READ_ONCE(arg->expected_cpu); + report(expected_cpu == current_thread_info()->cpu, + "Received event on CPU (%d), expected CPU (%d)", current_thread_info()->cpu, + expected_cpu); + + WRITE_ONCE(arg->done, true); + /* For arg update to be visible for other CPUs */ + smp_wmb(); +} + +struct sse_local_per_cpu { + struct sbi_sse_handler_arg args; + struct sbiret ret; + struct sse_foreign_cpu_test_arg handler_arg; + enum sbi_sse_state state; +}; + +static void sse_register_enable_local(void *data) +{ + struct sbiret ret; + struct sse_local_per_cpu *cpu_args = data; + struct sse_local_per_cpu *cpu_arg = &cpu_args[current_thread_info()->cpu]; + uint32_t event_id = cpu_arg->handler_arg.event_id; + + ret = sbi_sse_register(event_id, &cpu_arg->args); + WRITE_ONCE(cpu_arg->ret, ret); + if (ret.error) + return; + cpu_arg->state = SBI_SSE_STATE_REGISTERED; + + ret = sbi_sse_enable(event_id); + WRITE_ONCE(cpu_arg->ret, ret); + if (ret.error) + return; + cpu_arg->state = SBI_SSE_STATE_ENABLED; +} + +static void sbi_sse_disable_unregister_local(void *data) +{ + struct sbiret ret; + struct sse_local_per_cpu *cpu_args = data; + struct sse_local_per_cpu *cpu_arg = &cpu_args[current_thread_info()->cpu]; + uint32_t event_id = cpu_arg->handler_arg.event_id; + + switch (cpu_arg->state) { + case SBI_SSE_STATE_ENABLED: + ret = sbi_sse_disable(event_id); + WRITE_ONCE(cpu_arg->ret, ret); + if (ret.error) + return; + case SBI_SSE_STATE_REGISTERED: + ret = sbi_sse_unregister(event_id); + WRITE_ONCE(cpu_arg->ret, ret); + default: + } +} + +static uint64_t sse_event_get_complete_timeout(void) +{ + char *event_complete_timeout_str; + uint64_t timeout; + + event_complete_timeout_str = getenv("SSE_EVENT_COMPLETE_TIMEOUT"); + if (!event_complete_timeout_str) + timeout = 1000; + else + timeout = strtoul(event_complete_timeout_str, NULL, 0); + + return 
timer_get_cycles() + usec_to_cycles(timeout); +} + +static void sse_test_inject_local(uint32_t event_id) +{ + int cpu; + uint64_t timeout; + struct sbiret ret; + struct sse_local_per_cpu *cpu_args, *cpu_arg; + struct sse_foreign_cpu_test_arg *handler_arg; + + cpu_args = calloc(NR_CPUS, sizeof(struct sbi_sse_handler_arg)); + + report_prefix_push("local_dispatch"); + for_each_online_cpu(cpu) { + cpu_arg = &cpu_args[cpu]; + cpu_arg->handler_arg.event_id = event_id; + cpu_arg->args.stack = sse_alloc_stack(); + cpu_arg->args.handler = sse_foreign_cpu_handler; + cpu_arg->args.handler_data = (void *)&cpu_arg->handler_arg; + cpu_arg->state = SBI_SSE_STATE_UNUSED; + } + + on_cpus(sse_register_enable_local, cpu_args); + for_each_online_cpu(cpu) { + cpu_arg = &cpu_args[cpu]; + ret = cpu_arg->ret; + if (ret.error) { + report_fail("CPU failed to register/enable event: %ld", ret.error); + goto cleanup; + } + + handler_arg = &cpu_arg->handler_arg; + WRITE_ONCE(handler_arg->expected_cpu, cpu); + /* For handler_arg content to be visible for other CPUs */ + smp_wmb(); + ret = sbi_sse_inject(event_id, cpus[cpu].hartid); + if (ret.error) { + report_fail("CPU failed to inject event: %ld", ret.error); + goto cleanup; + } + } + + for_each_online_cpu(cpu) { + handler_arg = &cpu_args[cpu].handler_arg; + smp_rmb(); + + timeout = sse_event_get_complete_timeout(); + while (!READ_ONCE(handler_arg->done) && timer_get_cycles() < timeout) { + /* For handler_arg update to be visible */ + smp_rmb(); + cpu_relax(); + } + report(READ_ONCE(handler_arg->done), "Event handled"); + WRITE_ONCE(handler_arg->done, false); + } + +cleanup: + on_cpus(sbi_sse_disable_unregister_local, cpu_args); + for_each_online_cpu(cpu) { + cpu_arg = &cpu_args[cpu]; + ret = READ_ONCE(cpu_arg->ret); + if (ret.error) + report_fail("CPU failed to disable/unregister event: %ld", ret.error); + } + + for_each_online_cpu(cpu) { + cpu_arg = &cpu_args[cpu]; + sse_free_stack(cpu_arg->args.stack); + } + + report_prefix_pop(); +} + 
+static void sse_test_inject_global_cpu(uint32_t event_id, unsigned int cpu, + struct sse_foreign_cpu_test_arg *test_arg) +{ + unsigned long value; + struct sbiret ret; + uint64_t timeout; + enum sbi_sse_state state; + + WRITE_ONCE(test_arg->expected_cpu, cpu); + /* For test_arg content to be visible for other CPUs */ + smp_wmb(); + value = cpu; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Set preferred hart")) + return; + + ret = sbi_sse_enable(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Enable event")) + return; + + ret = sbi_sse_inject(event_id, cpu); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Inject event")) + goto disable; + + smp_rmb(); + timeout = sse_event_get_complete_timeout(); + while (!READ_ONCE(test_arg->done) && timer_get_cycles() < timeout) { + /* For shared test_arg structure */ + smp_rmb(); + cpu_relax(); + } + + report(READ_ONCE(test_arg->done), "event handler called"); + WRITE_ONCE(test_arg->done, false); + + timeout = sse_event_get_complete_timeout(); + /* Wait for event to be back in ENABLED state */ + do { + ret = sse_event_get_state(event_id, &state); + if (ret.error) + goto disable; + cpu_relax(); + } while (state != SBI_SSE_STATE_ENABLED && timer_get_cycles() < timeout); + + report(state == SBI_SSE_STATE_ENABLED, "Event in enabled state"); + +disable: + ret = sbi_sse_disable(event_id); + sbiret_report_error(&ret, SBI_SUCCESS, "Disable event"); +} + +static void sse_test_inject_global(uint32_t event_id) +{ + struct sbiret ret; + unsigned int cpu; + struct sse_foreign_cpu_test_arg test_arg = {.event_id = event_id}; + struct sbi_sse_handler_arg args = { + .handler = sse_foreign_cpu_handler, + .handler_data = (void *)&test_arg, + .stack = sse_alloc_stack(), + }; + + report_prefix_push("global_dispatch"); + + ret = sbi_sse_register(event_id, &args); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Register event")) + goto err; + + 
for_each_online_cpu(cpu) + sse_test_inject_global_cpu(event_id, cpu, &test_arg); + + ret = sbi_sse_unregister(event_id); + sbiret_report_error(&ret, SBI_SUCCESS, "Unregister event"); + +err: + sse_free_stack(args.stack); + report_prefix_pop(); +} + +struct priority_test_arg { + uint32_t event_id; + bool called; + u32 prio; + enum sbi_sse_state state; /* Used for error handling */ + struct priority_test_arg *next_event_arg; + void (*check_func)(struct priority_test_arg *arg); +}; + +static void sse_hi_priority_test_handler(void *arg, struct pt_regs *regs, + unsigned int hartid) +{ + struct priority_test_arg *targ = arg; + struct priority_test_arg *next = targ->next_event_arg; + + targ->called = true; + if (next) { + sbi_sse_inject(next->event_id, current_thread_info()->hartid); + + report(!sse_event_pending(next->event_id), "Higher priority event is not pending"); + report(next->called, "Higher priority event was handled"); + } +} + +static void sse_low_priority_test_handler(void *arg, struct pt_regs *regs, + unsigned int hartid) +{ + struct priority_test_arg *targ = arg; + struct priority_test_arg *next = targ->next_event_arg; + + targ->called = true; + + if (next) { + sbi_sse_inject(next->event_id, current_thread_info()->hartid); + + report(sse_event_pending(next->event_id), "Lower priority event is pending"); + report(!next->called, "Lower priority event %s was not handled before %s", + sse_event_name(next->event_id), sse_event_name(targ->event_id)); + } +} + +static void sse_test_injection_priority_arg(struct priority_test_arg *in_args, + unsigned int in_args_size, + sbi_sse_handler_fn handler, + const char *test_name) +{ + unsigned int i; + unsigned long value, uret; + struct sbiret ret; + uint32_t event_id; + struct priority_test_arg *arg; + unsigned int args_size = 0; + struct sbi_sse_handler_arg event_args[in_args_size]; + struct priority_test_arg *args[in_args_size]; + void *stack; + struct sbi_sse_handler_arg *event_arg; + + report_prefix_push(test_name); 
+ + for (i = 0; i < in_args_size; i++) { + arg = &in_args[i]; + arg->state = SBI_SSE_STATE_UNUSED; + event_id = arg->event_id; + if (!sse_event_can_inject(event_id)) + continue; + + args[args_size] = arg; + args_size++; + event_args->stack = 0; + } + + if (!args_size) { + report_skip("No injectable events"); + goto skip; + } + + for (i = 0; i < args_size; i++) { + arg = args[i]; + event_id = arg->event_id; + stack = sse_alloc_stack(); + + event_arg = &event_args[i]; + event_arg->handler = handler; + event_arg->handler_data = (void *)arg; + event_arg->stack = stack; + + if (i < (args_size - 1)) + arg->next_event_arg = args[i + 1]; + else + arg->next_event_arg = NULL; + + /* Be sure global events are targeting the current hart */ + if (sbi_sse_event_is_global(event_id)) { + uret = sse_global_event_set_current_hart(event_id); + if (uret) + goto err; + } + + ret = sbi_sse_register(event_id, event_arg); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "register event %s", + sse_event_name(event_id)); + goto err; + } + arg->state = SBI_SSE_STATE_REGISTERED; + + value = arg->prio; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "set event %s priority", + sse_event_name(event_id)); + goto err; + } + ret = sbi_sse_enable(event_id); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "enable event %s", + sse_event_name(event_id)); + goto err; + } + arg->state = SBI_SSE_STATE_ENABLED; + } + + /* Inject first event */ + ret = sbi_sse_inject(args[0]->event_id, current_thread_info()->hartid); + sbiret_report_error(&ret, SBI_SUCCESS, "injection"); + + /* Check that all handlers have been called */ + for (i = 0; i < args_size; i++) + report(arg->called, "Event %s handler called", sse_event_name(args[i]->event_id)); + +err: + for (i = 0; i < args_size; i++) { + arg = args[i]; + event_id = arg->event_id; + + switch (arg->state) { + case SBI_SSE_STATE_ENABLED: + ret = 
sbi_sse_disable(event_id); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "disable event 0x%x", + event_id); + break; + } + case SBI_SSE_STATE_REGISTERED: + sbi_sse_unregister(event_id); + if (ret.error) + sbiret_report_error(&ret, SBI_SUCCESS, "unregister event 0x%x", + event_id); + default: + } + + event_arg = &event_args[i]; + if (event_arg->stack) + sse_free_stack(event_arg->stack); + } + +skip: + report_prefix_pop(); +} + +static struct priority_test_arg hi_prio_args[] = { + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE}, + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE}, + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW}, + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP}, + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS}, +}; + +static struct priority_test_arg low_prio_args[] = { + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP}, + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW}, + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE}, + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE}, +}; + +static struct priority_test_arg prio_args[] = { + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, .prio = 5}, + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, .prio = 10}, + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, .prio = 12}, + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, .prio = 15}, + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, .prio = 20}, + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, .prio = 22}, + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, .prio = 25}, +}; + +static struct priority_test_arg same_prio_args[] = { + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, .prio = 0}, + {.event_id = 
SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, .prio = 0}, + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, .prio = 10}, + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, .prio = 10}, + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, .prio = 10}, + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, .prio = 20}, + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, .prio = 20}, +}; + +static void sse_test_injection_priority(void) +{ + report_prefix_push("prio"); + + sse_test_injection_priority_arg(hi_prio_args, ARRAY_SIZE(hi_prio_args), + sse_hi_priority_test_handler, "high"); + + sse_test_injection_priority_arg(low_prio_args, ARRAY_SIZE(low_prio_args), + sse_low_priority_test_handler, "low"); + + sse_test_injection_priority_arg(prio_args, ARRAY_SIZE(prio_args), + sse_low_priority_test_handler, "changed"); + + sse_test_injection_priority_arg(same_prio_args, ARRAY_SIZE(same_prio_args), + sse_low_priority_test_handler, "same_prio_args"); + + report_prefix_pop(); +} + +static void test_invalid_event_id(unsigned long event_id) +{ + struct sbiret ret; + unsigned long value = 0; + + ret = sbi_sse_register_raw(event_id, (unsigned long) sbi_sse_entry, 0); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "register event_id 0x%lx", event_id); + + ret = sbi_sse_unregister(event_id); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "unregister event_id 0x%lx", event_id); + + ret = sbi_sse_enable(event_id); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "enable event_id 0x%lx", event_id); + + ret = sbi_sse_disable(event_id); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "disable event_id 0x%lx", event_id); + + ret = sbi_sse_inject(event_id, 0); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "inject event_id 0x%lx", event_id); + + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "write attr event_id 0x%lx", event_id); + + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); + 
sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "read attr event_id 0x%lx", event_id); +} + +static void sse_test_invalid_event_id(void) +{ + + report_prefix_push("event_id"); + + test_invalid_event_id(SBI_SSE_EVENT_LOCAL_RESERVED_0_START); + + report_prefix_pop(); +} + +static void sse_check_event_availability(uint32_t event_id, bool *can_inject, bool *supported) +{ + unsigned long status; + struct sbiret ret; + + *can_inject = false; + *supported = false; + + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, 1, &status); + if (ret.error != SBI_SUCCESS && ret.error != SBI_ERR_NOT_SUPPORTED) { + report_fail("Get event status != SBI_SUCCESS && != SBI_ERR_NOT_SUPPORTED: %ld", + ret.error); + return; + } + if (ret.error == SBI_ERR_NOT_SUPPORTED) + return; + + *supported = true; + *can_inject = (status >> SBI_SSE_ATTR_STATUS_INJECT_OFFSET) & 1; +} + +static void sse_secondary_boot_and_unmask(void *data) +{ + sbi_sse_hart_unmask(); +} + +static void sse_check_mask(void) +{ + struct sbiret ret; + + /* Upon boot, event are masked, check that */ + ret = sbi_sse_hart_mask(); + sbiret_report_error(&ret, SBI_ERR_ALREADY_STOPPED, "hart mask at boot time"); + + ret = sbi_sse_hart_unmask(); + sbiret_report_error(&ret, SBI_SUCCESS, "hart unmask"); + ret = sbi_sse_hart_unmask(); + sbiret_report_error(&ret, SBI_ERR_ALREADY_STARTED, "hart unmask twice error"); + + ret = sbi_sse_hart_mask(); + sbiret_report_error(&ret, SBI_SUCCESS, "hart mask"); + ret = sbi_sse_hart_mask(); + sbiret_report_error(&ret, SBI_ERR_ALREADY_STOPPED, "hart mask twice"); +} + +static void run_inject_test(struct sse_event_info *info) +{ + unsigned long event_id = info->event_id; + + if (!info->can_inject) { + report_skip("Event does not support injection, skipping injection tests"); + return; + } + + sse_test_inject_simple(event_id); + + if (sbi_sse_event_is_global(event_id)) + sse_test_inject_global(event_id); + else + sse_test_inject_local(event_id); +} + +void check_sse(void) +{ + struct 
sse_event_info *info; + unsigned long i, event_id; + bool sbi_skip_inject = false; + struct sbiret ret; + bool supported; + + report_prefix_push("sse"); + + if (!sbi_probe(SBI_EXT_SSE)) { + report_skip("extension not available"); + report_prefix_pop(); + return; + } + + sse_check_mask(); + + /* + * Dummy wakeup of all processors since some of them will be targeted + * by global events without going through the wakeup call as well as + * unmasking SSE events on all harts + */ + on_cpus(sse_secondary_boot_and_unmask, NULL); + + /* Check for OpenSBI to support injection */ + if (sbi_check_impl(SBI_IMPL_OPENSBI)) { + ret = sbi_get_imp_version(); + if (!ret.error && ret.value < sbi_impl_opensbi_mk_version(1, 6)) { + /* + * OpenSBI < v1.6 crashes kvm-unit-tests upon injection since injection + * arguments (a6/a7) were reversed. Skip injection tests. + */ + report_skip("OpenSBI < v1.6 detected, skipping injection tests"); + sbi_skip_inject = true; + } + } + + sse_test_invalid_event_id(); + + for (i = 0; i < ARRAY_SIZE(sse_event_infos); i++) { + info = &sse_event_infos[i]; + event_id = info->event_id; + report_prefix_push(info->name); + sse_check_event_availability(event_id, &info->can_inject, &supported); + if (!supported) { + report_skip("Event is not supported, skipping tests"); + report_prefix_pop(); + continue; + } + + sse_test_attrs(event_id); + sse_test_register_error(event_id); + + if (!sbi_skip_inject) + run_inject_test(info); + + report_prefix_pop(); + } + + if (!sbi_skip_inject) + sse_test_injection_priority(); + + report_prefix_pop(); +} diff --git a/riscv/sbi.c b/riscv/sbi.c index 0404bb81..478cb35d 100644 --- a/riscv/sbi.c +++ b/riscv/sbi.c @@ -32,6 +32,7 @@ #define HIGH_ADDR_BOUNDARY ((phys_addr_t)1 << 32) +void check_sse(void); void check_fwft(void); static long __labs(long a) @@ -1567,6 +1568,7 @@ int main(int argc, char **argv) check_hsm(); check_dbcn(); check_susp(); + check_sse(); check_fwft(); return report_summary(); -- 2.47.2 From andrew.jones at 
linux.dev Mon Mar 17 05:08:15 2025
From: andrew.jones at linux.dev (Andrew Jones)
Date: Mon, 17 Mar 2025 13:08:15 +0100
Subject: [kvm-unit-tests PATCH] riscv: sbi: Add no target ipi test
Message-ID: <20250317120814.35241-2-andrew.jones@linux.dev>

While an hmask of zero without an hbase of -1 (which would mean
broadcast) is pointless, since nothing is targeted, the spec doesn't
say it should return an error. Maybe it should? Anyway, for now, just
confirm that it "works".

Signed-off-by: Andrew Jones
---
 riscv/sbi.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/riscv/sbi.c b/riscv/sbi.c
index 219f7187acf3..16d918d00848 100644
--- a/riscv/sbi.c
+++ b/riscv/sbi.c
@@ -507,6 +507,10 @@ end_two:
 			max_hartid = cpus[cpu].hartid;
 	}
 
+	/* Test no targets */
+	ret = sbi_send_ipi(0, 0);
+	sbiret_report_error(&ret, SBI_SUCCESS, "no targets");
+
 	/* Try the next higher hartid than the max */
 	ret = sbi_send_ipi(2, max_hartid);
 	report_kfail(true, ret.error == SBI_ERR_INVALID_PARAM, "hart_mask got expected error (%ld)", ret.error);
-- 
2.48.1

From ajones at ventanamicro.com Mon Mar 17 05:30:56 2025
From: ajones at ventanamicro.com (Andrew Jones)
Date: Mon, 17 Mar 2025 13:30:56 +0100
Subject: [kvm-unit-tests PATCH v10 4/8] riscv: sbi: Add functions for version checking
In-Reply-To: <20250317101956.526834-5-cleger@rivosinc.com>
References: <20250317101956.526834-1-cleger@rivosinc.com>
 <20250317101956.526834-5-cleger@rivosinc.com>
Message-ID: <20250317-4edd893e742f7186b515be76@orel>

On Mon, Mar 17, 2025 at 11:19:50AM +0100, Clément Léger wrote:
> Version checking was done using some custom hardcoded values, backport a
> few SBI function and defines from Linux to do that cleanly.
>
> Signed-off-by: Clément Léger
> ---
>  lib/riscv/asm/sbi.h | 15 +++++++++++++++
>  lib/riscv/sbi.c     |  9 +++++++--
>  2 files changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h
> index 2f4d91ef..197288c7 100644
> --- a/lib/riscv/asm/sbi.h
> +++ b/lib/riscv/asm/sbi.h
> @@ -18,6 +18,12 @@
>  #define SBI_ERR_IO			-13
>  #define SBI_ERR_DENIED_LOCKED		-14
>
> +/* SBI spec version fields */
> +#define SBI_SPEC_VERSION_DEFAULT	0x1

We don't need this define, since we don't support version 0.1. But there
is another mask we should add. See below.

> +#define SBI_SPEC_VERSION_MAJOR_SHIFT	24
> +#define SBI_SPEC_VERSION_MAJOR_MASK	0x7f
> +#define SBI_SPEC_VERSION_MINOR_MASK	0xffffff
> +
>  #ifndef __ASSEMBLER__
>  #include
>
> @@ -110,6 +116,14 @@ struct sbiret {
>  	long value;
>  };
>
> +/* Make SBI version */
> +static inline unsigned long sbi_mk_version(unsigned long major, unsigned long minor)
> +{
> +	return ((major & SBI_SPEC_VERSION_MAJOR_MASK) << SBI_SPEC_VERSION_MAJOR_SHIFT)
> +		| (minor & SBI_SPEC_VERSION_MINOR_MASK);
> +}
> +
> +

Extra blank, but I see it goes away with the next patch.
> struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0, > unsigned long arg1, unsigned long arg2, > unsigned long arg3, unsigned long arg4, > @@ -124,6 +138,7 @@ struct sbiret sbi_send_ipi_cpu(int cpu); > struct sbiret sbi_send_ipi_cpumask(const cpumask_t *mask); > struct sbiret sbi_send_ipi_broadcast(void); > struct sbiret sbi_set_timer(unsigned long stime_value); > +struct sbiret sbi_get_spec_version(void); > long sbi_probe(int ext); > > #endif /* !__ASSEMBLER__ */ > diff --git a/lib/riscv/sbi.c b/lib/riscv/sbi.c > index 02dd338c..3c395cff 100644 > --- a/lib/riscv/sbi.c > +++ b/lib/riscv/sbi.c > @@ -107,12 +107,17 @@ struct sbiret sbi_set_timer(unsigned long stime_value) > return sbi_ecall(SBI_EXT_TIME, SBI_EXT_TIME_SET_TIMER, stime_value, 0, 0, 0, 0, 0); > } > > +struct sbiret sbi_get_spec_version(void) > +{ > + return sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_SPEC_VERSION, 0, 0, 0, 0, 0, 0); > +} > + > long sbi_probe(int ext) > { > struct sbiret ret; > > - ret = sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_SPEC_VERSION, 0, 0, 0, 0, 0, 0); > - assert(!ret.error && (ret.value & 0x7ffffffful) >= 2); > + ret = sbi_get_spec_version(); > + assert(!ret.error && ret.value >= sbi_mk_version(2, 0)); This changes the check from >= 0.2 to >= 2.0. Also, before, we were more tolerant with the return value potentially having upper bits set, since the spec only recently added "When XLEN is greater than 32, bits 32 and above are also reserved and must be 0." and we can leave it to the SBI tests to check those bits. 
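To make the difference concrete, here is a minimal standalone sketch (not part of the patch; the shift and the major/minor masks are copied from the quoted hunk, while spec_version_ok() is a hypothetical helper used only for illustration). It shows that sbi_mk_version(2, 0) encodes spec version 2.0 rather than 0.2, and that masking the returned value before comparing tolerates stray upper bits:

```c
/* Values copied from the quoted hunk. */
#define SBI_SPEC_VERSION_MAJOR_SHIFT	24
#define SBI_SPEC_VERSION_MAJOR_MASK	0x7f
#define SBI_SPEC_VERSION_MINOR_MASK	0xffffff
/* Mask suggested in this review: bits [30:0] carry major and minor. */
#define SBI_SPEC_VERSION_MASK		0x7fffffff

/* Same packing as the quoted sbi_mk_version(): major in [30:24], minor in [23:0]. */
static unsigned long sbi_mk_version(unsigned long major, unsigned long minor)
{
	return ((major & SBI_SPEC_VERSION_MAJOR_MASK) << SBI_SPEC_VERSION_MAJOR_SHIFT)
		| (minor & SBI_SPEC_VERSION_MINOR_MASK);
}

/* Hypothetical helper: nonzero when the reported version is at least 0.2. */
static int spec_version_ok(unsigned long value)
{
	return (value & SBI_SPEC_VERSION_MASK) >= sbi_mk_version(0, 2);
}
```

With this packing, sbi_mk_version(0, 2) is 0x2 while sbi_mk_version(2, 0) is 0x2000000, so comparing against the latter would reject every 0.x and 1.x implementation. A return value such as 0x80000002 (version 0.2 with bit 31 set) still passes spec_version_ok() because the mask drops the upper bit before the comparison.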
IOW, this should be

  #define SBI_SPEC_VERSION_MASK 0x7fffffff

  assert(!ret.error && (ret.value & SBI_SPEC_VERSION_MASK) >= sbi_mk_version(0, 2));

> 
>  ret = sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_PROBE_EXT, ext, 0, 0, 0, 0, 0);
>  assert(!ret.error);
> -- 
> 2.47.2
> 

Thanks,
drew

From ajones at ventanamicro.com Mon Mar 17 05:46:03 2025
From: ajones at ventanamicro.com (Andrew Jones)
Date: Mon, 17 Mar 2025 13:46:03 +0100
Subject: [kvm-unit-tests PATCH v10 5/8] lib: riscv: add functions to get implementer ID and version
In-Reply-To: <20250317101956.526834-6-cleger@rivosinc.com>
References: <20250317101956.526834-1-cleger@rivosinc.com> <20250317101956.526834-6-cleger@rivosinc.com>
Message-ID: <20250317-a53b74017397269b474dd6b9@orel>

On Mon, Mar 17, 2025 at 11:19:51AM +0100, Clément Léger wrote:
> These function will be used by SSE tests to check for a specific opensbi
> version. sbi_impl_check() is an helper allowing to check for a specific
> SBI implementor without needing to check for ret.error.
> > Signed-off-by: Cl?ment L?ger > --- > lib/riscv/asm/sbi.h | 30 ++++++++++++++++++++++++++++++ > lib/riscv/sbi.c | 10 ++++++++++ > 2 files changed, 40 insertions(+) > > diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h > index 197288c7..06bcec16 100644 > --- a/lib/riscv/asm/sbi.h > +++ b/lib/riscv/asm/sbi.h > @@ -18,6 +18,19 @@ > #define SBI_ERR_IO -13 > #define SBI_ERR_DENIED_LOCKED -14 > > +#define SBI_IMPL_BBL 0 > +#define SBI_IMPL_OPENSBI 1 > +#define SBI_IMPL_XVISOR 2 > +#define SBI_IMPL_KVM 3 > +#define SBI_IMPL_RUSTSBI 4 > +#define SBI_IMPL_DIOSIX 5 > +#define SBI_IMPL_COFFER 6 > +#define SBI_IMPL_XEN Project 7 s/Project// > +#define SBI_IMPL_POLARFIRE_HSS 8 > +#define SBI_IMPL_COREBOOT 9 > +#define SBI_IMPL_OREBOOT 10 > +#define SBI_IMPL_BHYVE 11 > + > /* SBI spec version fields */ > #define SBI_SPEC_VERSION_DEFAULT 0x1 > #define SBI_SPEC_VERSION_MAJOR_SHIFT 24 > @@ -123,6 +136,10 @@ static inline unsigned long sbi_mk_version(unsigned long major, unsigned long mi > | (minor & SBI_SPEC_VERSION_MINOR_MASK); > } > > +static inline unsigned long sbi_impl_opensbi_mk_version(unsigned long major, unsigned long minor) > +{ > + return ((major << 16) | (minor)); Should at least mask minor, best to mask both. 
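For instance, a rough sketch of what goes wrong without the mask (illustrative only; the 0xffff field width is an assumption based on the 16-bit shift, and the _unmasked/_masked names are hypothetical):

```c
/* As in the quoted patch: nothing stops minor from spilling into major. */
static unsigned long opensbi_mk_version_unmasked(unsigned long major, unsigned long minor)
{
	return (major << 16) | minor;
}

/* Assumed fix: confine each field to 16 bits before packing. */
static unsigned long opensbi_mk_version_masked(unsigned long major, unsigned long minor)
{
	return ((major & 0xffff) << 16) | (minor & 0xffff);
}
```

With the unmasked helper, a bogus minor of 0x10006 passed with major 0 yields the same value as a genuine 1.6, so a version comparison built on it could pass by accident. The masked variant reduces that minor to 6 and leaves the major field untouched.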
> +} > > struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0, > unsigned long arg1, unsigned long arg2, > @@ -139,7 +156,20 @@ struct sbiret sbi_send_ipi_cpumask(const cpumask_t *mask); > struct sbiret sbi_send_ipi_broadcast(void); > struct sbiret sbi_set_timer(unsigned long stime_value); > struct sbiret sbi_get_spec_version(void); > +struct sbiret sbi_get_imp_version(void); > +struct sbiret sbi_get_imp_id(void); > long sbi_probe(int ext); > > +static inline bool sbi_check_impl(unsigned long impl) > +{ > + struct sbiret ret; > + > + ret = sbi_get_imp_id(); > + if (ret.error) > + return false; > + > + return ret.value == impl; Or, more tersely, struct sbiret ret = sbi_get_imp_id(); return !ret.error && ret.value == impl; but an assert would make more sense, since get-impl-id really shouldn't fail, unless the SBI version is 0.1, which we don't support. > +} > + > #endif /* !__ASSEMBLER__ */ > #endif /* _ASMRISCV_SBI_H_ */ > diff --git a/lib/riscv/sbi.c b/lib/riscv/sbi.c > index 3c395cff..9cb5757e 100644 > --- a/lib/riscv/sbi.c > +++ b/lib/riscv/sbi.c > @@ -107,6 +107,16 @@ struct sbiret sbi_set_timer(unsigned long stime_value) > return sbi_ecall(SBI_EXT_TIME, SBI_EXT_TIME_SET_TIMER, stime_value, 0, 0, 0, 0, 0); > } > > +struct sbiret sbi_get_imp_version(void) > +{ > + return sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_IMP_VERSION, 0, 0, 0, 0, 0, 0); > +} > + > +struct sbiret sbi_get_imp_id(void) > +{ > + return sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_IMP_ID, 0, 0, 0, 0, 0, 0); > +} Going with error asserts, then we can just put the asserts in these functions and return ret.value directly, like sbi_probe() does, which means sbi_check_impl() can be dropped. 
Thanks, drew > + > struct sbiret sbi_get_spec_version(void) > { > return sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_SPEC_VERSION, 0, 0, 0, 0, 0, 0); > -- > 2.47.2 > From cleger at rivosinc.com Mon Mar 17 05:47:17 2025 From: cleger at rivosinc.com (=?UTF-8?B?Q2zDqW1lbnQgTMOpZ2Vy?=) Date: Mon, 17 Mar 2025 13:47:17 +0100 Subject: [kvm-unit-tests PATCH v10 5/8] lib: riscv: add functions to get implementer ID and version In-Reply-To: <20250317-a53b74017397269b474dd6b9@orel> References: <20250317101956.526834-1-cleger@rivosinc.com> <20250317101956.526834-6-cleger@rivosinc.com> <20250317-a53b74017397269b474dd6b9@orel> Message-ID: <2214520b-4ee8-4147-806b-d25f3fc324fe@rivosinc.com> On 17/03/2025 13:46, Andrew Jones wrote: > On Mon, Mar 17, 2025 at 11:19:51AM +0100, Cl?ment L?ger wrote: >> These function will be used by SSE tests to check for a specific opensbi >> version. sbi_impl_check() is an helper allowing to check for a specific >> SBI implementor without needing to check for ret.error. >> >> Signed-off-by: Cl?ment L?ger >> --- >> lib/riscv/asm/sbi.h | 30 ++++++++++++++++++++++++++++++ >> lib/riscv/sbi.c | 10 ++++++++++ >> 2 files changed, 40 insertions(+) >> >> diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h >> index 197288c7..06bcec16 100644 >> --- a/lib/riscv/asm/sbi.h >> +++ b/lib/riscv/asm/sbi.h >> @@ -18,6 +18,19 @@ >> #define SBI_ERR_IO -13 >> #define SBI_ERR_DENIED_LOCKED -14 >> >> +#define SBI_IMPL_BBL 0 >> +#define SBI_IMPL_OPENSBI 1 >> +#define SBI_IMPL_XVISOR 2 >> +#define SBI_IMPL_KVM 3 >> +#define SBI_IMPL_RUSTSBI 4 >> +#define SBI_IMPL_DIOSIX 5 >> +#define SBI_IMPL_COFFER 6 >> +#define SBI_IMPL_XEN Project 7 > > s/Project// > >> +#define SBI_IMPL_POLARFIRE_HSS 8 >> +#define SBI_IMPL_COREBOOT 9 >> +#define SBI_IMPL_OREBOOT 10 >> +#define SBI_IMPL_BHYVE 11 >> + >> /* SBI spec version fields */ >> #define SBI_SPEC_VERSION_DEFAULT 0x1 >> #define SBI_SPEC_VERSION_MAJOR_SHIFT 24 >> @@ -123,6 +136,10 @@ static inline unsigned long sbi_mk_version(unsigned 
long major, unsigned long mi >> | (minor & SBI_SPEC_VERSION_MINOR_MASK); >> } >> >> +static inline unsigned long sbi_impl_opensbi_mk_version(unsigned long major, unsigned long minor) >> +{ >> + return ((major << 16) | (minor)); > > Should at least mask minor, best to mask both. > >> +} >> >> struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0, >> unsigned long arg1, unsigned long arg2, >> @@ -139,7 +156,20 @@ struct sbiret sbi_send_ipi_cpumask(const cpumask_t *mask); >> struct sbiret sbi_send_ipi_broadcast(void); >> struct sbiret sbi_set_timer(unsigned long stime_value); >> struct sbiret sbi_get_spec_version(void); >> +struct sbiret sbi_get_imp_version(void); >> +struct sbiret sbi_get_imp_id(void); >> long sbi_probe(int ext); >> >> +static inline bool sbi_check_impl(unsigned long impl) >> +{ >> + struct sbiret ret; >> + >> + ret = sbi_get_imp_id(); >> + if (ret.error) >> + return false; >> + >> + return ret.value == impl; > > Or, more tersely, > > struct sbiret ret = sbi_get_imp_id(); > return !ret.error && ret.value == impl; > > but an assert would make more sense, since get-impl-id really shouldn't > fail, unless the SBI version is 0.1, which we don't support. 
> >> +} >> + >> #endif /* !__ASSEMBLER__ */ >> #endif /* _ASMRISCV_SBI_H_ */ >> diff --git a/lib/riscv/sbi.c b/lib/riscv/sbi.c >> index 3c395cff..9cb5757e 100644 >> --- a/lib/riscv/sbi.c >> +++ b/lib/riscv/sbi.c >> @@ -107,6 +107,16 @@ struct sbiret sbi_set_timer(unsigned long stime_value) >> return sbi_ecall(SBI_EXT_TIME, SBI_EXT_TIME_SET_TIMER, stime_value, 0, 0, 0, 0, 0); >> } >> >> +struct sbiret sbi_get_imp_version(void) >> +{ >> + return sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_IMP_VERSION, 0, 0, 0, 0, 0, 0); >> +} >> + >> +struct sbiret sbi_get_imp_id(void) >> +{ >> + return sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_IMP_ID, 0, 0, 0, 0, 0, 0); >> +} > > Going with error asserts, then we can just put the asserts in these > functions and return ret.value directly, like sbi_probe() does, which > means sbi_check_impl() can be dropped. Ack, I'll modify that. Thanks, Cl?ment > > Thanks, > drew > >> + >> struct sbiret sbi_get_spec_version(void) >> { >> return sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_SPEC_VERSION, 0, 0, 0, 0, 0, 0); >> -- >> 2.47.2 >> From andrew.jones at linux.dev Mon Mar 17 05:49:33 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Mon, 17 Mar 2025 13:49:33 +0100 Subject: [kvm-unit-tests PATCH v10 8/8] riscv: sbi: Add SSE extension tests In-Reply-To: <20250317101956.526834-9-cleger@rivosinc.com> References: <20250317101956.526834-1-cleger@rivosinc.com> <20250317101956.526834-9-cleger@rivosinc.com> Message-ID: <20250317-a427b5b91080ab28e40a56b7@orel> On Mon, Mar 17, 2025 at 11:19:54AM +0100, Cl?ment L?ger wrote: > Add SBI SSE extension tests for the following features: > - Test attributes errors (invalid values, RO, etc) > - Registration errors > - Simple events (register, enable, inject) > - Events with different priorities > - Global events dispatch on different harts > - Local events on all harts > - Hart mask/unmask events > > Signed-off-by: Cl?ment L?ger > --- > riscv/Makefile | 1 + > riscv/sbi-tests.h | 1 + > riscv/sbi-sse.c | 1280 
+++++++++++++++++++++++++++++++++++++++++++++ > riscv/sbi.c | 2 + > 4 files changed, 1284 insertions(+) > create mode 100644 riscv/sbi-sse.c > Reviewed-by: Andrew Jones From seanjc at google.com Mon Mar 17 09:08:06 2025 From: seanjc at google.com (Sean Christopherson) Date: Mon, 17 Mar 2025 09:08:06 -0700 Subject: [PATCH] RISC-V: KVM: Teardown riscv specific bits after kvm_exit In-Reply-To: <20250317-kvm_exit_fix-v1-1-aa5240c5dbd2@rivosinc.com> References: <20250317-kvm_exit_fix-v1-1-aa5240c5dbd2@rivosinc.com> Message-ID: On Mon, Mar 17, 2025, Atish Patra wrote: > During a module removal, kvm_exit invokes arch specific disable > call which disables AIA. However, we invoke aia_exit before kvm_exit > resulting in the following warning. KVM kernel module can't be inserted > afterwards due to inconsistent state of IRQ. > > [25469.031389] percpu IRQ 31 still enabled on CPU0! > [25469.031732] WARNING: CPU: 3 PID: 943 at kernel/irq/manage.c:2476 __free_percpu_irq+0xa2/0x150 > [25469.031804] Modules linked in: kvm(-) > [25469.031848] CPU: 3 UID: 0 PID: 943 Comm: rmmod Not tainted 6.14.0-rc5-06947-g91c763118f47-dirty #2 > [25469.031905] Hardware name: riscv-virtio,qemu (DT) > [25469.031928] epc : __free_percpu_irq+0xa2/0x150 > [25469.031976] ra : __free_percpu_irq+0xa2/0x150 > [25469.032197] epc : ffffffff8007db1e ra : ffffffff8007db1e sp : ff2000000088bd50 > [25469.032241] gp : ffffffff8131cef8 tp : ff60000080b96400 t0 : ff2000000088baf8 > [25469.032285] t1 : fffffffffffffffc t2 : 5249207570637265 s0 : ff2000000088bd90 > [25469.032329] s1 : ff60000098b21080 a0 : 037d527a15eb4f00 a1 : 037d527a15eb4f00 > [25469.032372] a2 : 0000000000000023 a3 : 0000000000000001 a4 : ffffffff8122dbf8 > [25469.032410] a5 : 0000000000000fff a6 : 0000000000000000 a7 : ffffffff8122dc10 > [25469.032448] s2 : ff60000080c22eb0 s3 : 0000000200000022 s4 : 000000000000001f > [25469.032488] s5 : ff60000080c22e00 s6 : ffffffff80c351c0 s7 : 0000000000000000 > [25469.032582] s8 : 0000000000000003 s9 : 
000055556b7fb490 s10: 00007ffff0e12fa0 > [25469.032621] s11: 00007ffff0e13e9a t3 : ffffffff81354ac7 t4 : ffffffff81354ac7 > [25469.032664] t5 : ffffffff81354ac8 t6 : ffffffff81354ac7 > [25469.032698] status: 0000000200000100 badaddr: ffffffff8007db1e cause: 0000000000000003 > [25469.032738] [] __free_percpu_irq+0xa2/0x150 > [25469.032797] [] free_percpu_irq+0x30/0x5e > [25469.032856] [] kvm_riscv_aia_exit+0x40/0x42 [kvm] > [25469.033947] [] cleanup_module+0x10/0x32 [kvm] > [25469.035300] [] __riscv_sys_delete_module+0x18e/0x1fc > [25469.035374] [] syscall_handler+0x3a/0x46 > [25469.035456] [] do_trap_ecall_u+0x72/0x134 > [25469.035536] [] handle_exception+0x148/0x156 > > Invoke aia_exit and other arch specific cleanup functions after kvm_exit > so that disable gets a chance to be called first before exit. > > Fixes: 54e43320c2ba ("RISC-V: KVM: Initial skeletal support for AIA") > > Signed-off-by: Atish Patra > --- FWIW, Reviewed-by: Sean Christopherson > arch/riscv/kvm/main.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c > index 1fa8be5ee509..4b24705dc63a 100644 > --- a/arch/riscv/kvm/main.c > +++ b/arch/riscv/kvm/main.c > @@ -172,8 +172,8 @@ module_init(riscv_kvm_init); > > static void __exit riscv_kvm_exit(void) > { > - kvm_riscv_teardown(); > - > kvm_exit(); > + > + kvm_riscv_teardown(); I wonder if there's a way we can guard against kvm_init()/kvm_exit() being called too early/late. x86 had similar bugs for a very long time, e.g. see commit e32b120071ea ("KVM: VMX: Do _all_ initialization before exposing /dev/kvm to userspace"). E.g. maybe we do something like create+destroy a VM at the end of kvm_init() and the beginning of kvm_exit()? Not sure if that would work for kvm_exit(), but it should definitely be fine for kvm_init(). It wouldn't prevent bugs, but maybe it would help detect them during development? 
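As a rough user-space illustration of the ordering problem this patch addresses (a toy model, not kernel code; every name below is hypothetical and merely stands in for the kernel functions mentioned in the thread), the sketch counts how many "IRQ still enabled" warnings each exit order would produce:

```c
#include <stdbool.h>

/* Toy model: one per-CPU IRQ that must be disabled before it is freed. */
struct toy_state {
	bool irq_enabled;
	int warnings;
};

/* Stands in for kvm_exit(): the generic exit path runs the arch disable hook. */
static void toy_kvm_exit(struct toy_state *s)
{
	s->irq_enabled = false;
}

/* Stands in for the arch teardown: frees the IRQ and warns if it is still enabled. */
static void toy_arch_teardown(struct toy_state *s)
{
	if (s->irq_enabled)
		s->warnings++;
}

/* Runs the two exit steps in the requested order and returns the warning count. */
static int toy_module_exit(bool teardown_first)
{
	struct toy_state s = { .irq_enabled = true, .warnings = 0 };

	if (teardown_first) {
		toy_arch_teardown(&s);	/* old order: IRQ freed while still enabled */
		toy_kvm_exit(&s);
	} else {
		toy_kvm_exit(&s);	/* patched order: disable first, then free */
		toy_arch_teardown(&s);
	}
	return s.warnings;
}
```

In this model toy_module_exit(true) reports one warning, matching the splat in the commit message, while toy_module_exit(false) reports none.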
From cleger at rivosinc.com Mon Mar 17 09:46:45 2025
From: cleger at rivosinc.com (Clément Léger)
Date: Mon, 17 Mar 2025 17:46:45 +0100
Subject: [kvm-unit-tests PATCH v11 0/8] riscv: add SBI SSE extension tests
Message-ID: <20250317164655.1120015-1-cleger@rivosinc.com>

This series adds tests for the SBI SSE extension as well as the infrastructure needed for SSE support. It also adds test-specific asm-offsets generation to use custom OFFSET and DEFINE from the test directory.

These tests can be run using an OpenSBI version that implements the latest specification modifications [1].

Link: https://github.com/rivosinc/opensbi/tree/dev/cleger/sse [1]

---

V11:
 - Use mask inside sbi_impl_opensbi_mk_version()
 - Mask the SBI version with a new mask
 - Use assert inside sbi_get_impl_id/version()
 - Remove sbi_check_impl()
 - Increase completion timeout as events failed to complete within 1000 microseconds when the system is loaded.

V10:
 - Use && instead of || for timeout handling
 - Add SBI patches which introduce functions to get implementer ID and version as well as implementer ID defines.
 - Skip injection tests in OpenSBI < v1.6

V9:
 - Use __ASSEMBLER__ instead of __ASSEMBLY__
 - Remove extra spaces
 - Use assert to check global event in sse_global_event_set_current_hart()
 - Tabulate SSE events names table
 - Use sbi_sse_register() instead of sbi_sse_register_raw() in error testing
 - Move a report_pass() out of error path
 - Rework all injection tests with better error handling
 - Use an env var for sse event completion timeout
 - Add timeout for some potentially infinite while() loops

V8:
 - Short circuit current event tests if failure happens
 - Remove SSE from all report strings
 - Indent .prio field
 - Add cpu_relax()/smp_rmb() where needed
 - Add timeout for global event ENABLED state check
 - Added BIT(32) aliases tests for attribute/event_id.

V7:
 - Test ids/attributes/attributes count > 32 bits
 - Rename all SSE functions to sbi_sse_*
 - Use event_id instead of event/evt
 - Factorize read/write test
 - Use virt_to_phys() for attributes read/write.
 - Extensively use sbiret_report_error()
 - Change check function return values to bool.
 - Added assert for stack size to be below or equal to PAGE_SIZE
 - Use an env variable for the maximum hart ID
 - Check that individual read from attributes matches the multiple attributes read.
 - Added multiple attributes write at once
 - Used READ_ONCE/WRITE_ONCE
 - Inject all local events at once rather than looping for each core.
 - Split test_arg for local_dispatch test so that all CPUs can run at once.
 - Move SSE entry and generic code to lib/riscv for other tests
 - Fix unmask/mask state checking

V6:
 - Add missing $(generated-file) dependencies for "-deps" objects
 - Split SSE entry from sbi-asm.S to sse-asm.S and all SSE core functions since it will be useful for other tests as well (dbltrp).

V5:
 - Update event ranges based on latest spec
 - Rename asm-offset-test.c to sbi-asm-offset.c

V4:
 - Fix typo sbi_ext_ss_fid -> sbi_ext_sse_fid
 - Add proper asm-offset generation for tests
 - Move SSE specific file from lib/riscv to riscv/

V3:
 - Add -deps variable for test specific dependencies
 - Fix formatting errors/typo in sbi.h
 - Add missing double trap event
 - Alphabetize sbi-sse.c includes
 - Fix a6 content after unmasking event
 - Add SSE HART_MASK/UNMASK test
 - Use mv instead of move
 - move sbi_check_sse() definition in sbi.c
 - Remove sbi_sse test from unittests.cfg

V2:
 - Rebased on origin/master and integrate it into sbi.c tests

Clément Léger (8):
  kbuild: Allow multiple asm-offsets file to be generated
  riscv: Set .aux.o files as .PRECIOUS
  riscv: Use asm-offsets to generate SBI_EXT_HSM values
  lib: riscv: Add functions for version checking
  lib: riscv: Add functions to get implementer ID and version
  riscv: lib: Add SBI SSE extension definitions
  lib: riscv: Add SBI SSE support
  riscv: sbi:
Add SSE extension tests scripts/asm-offsets.mak | 22 +- riscv/Makefile | 5 +- lib/riscv/asm/csr.h | 1 + lib/riscv/asm/sbi.h | 177 +++++- lib/riscv/sbi-sse-asm.S | 102 ++++ lib/riscv/asm-offsets.c | 9 + lib/riscv/sbi.c | 105 +++- riscv/sbi-tests.h | 1 + riscv/sbi-asm.S | 6 +- riscv/sbi-asm-offsets.c | 11 + riscv/sbi-sse.c | 1278 +++++++++++++++++++++++++++++++++++++++ riscv/sbi.c | 2 + riscv/.gitignore | 1 + 13 files changed, 1707 insertions(+), 13 deletions(-) create mode 100644 lib/riscv/sbi-sse-asm.S create mode 100644 riscv/sbi-asm-offsets.c create mode 100644 riscv/sbi-sse.c create mode 100644 riscv/.gitignore -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 09:46:46 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 17:46:46 +0100 Subject: [kvm-unit-tests PATCH v11 1/8] kbuild: Allow multiple asm-offsets file to be generated In-Reply-To: <20250317164655.1120015-1-cleger@rivosinc.com> References: <20250317164655.1120015-1-cleger@rivosinc.com> Message-ID: <20250317164655.1120015-2-cleger@rivosinc.com> In order to allow multiple asm-offsets files to generated the include guard need to be different between these file. Add a asm_offset_name makefile macro to obtain an uppercase name matching the original asm offsets file. 
Signed-off-by: Clément Léger Reviewed-by: Andrew Jones --- scripts/asm-offsets.mak | 22 +++++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/scripts/asm-offsets.mak b/scripts/asm-offsets.mak index 7b64162d..a5fdbf5d 100644 --- a/scripts/asm-offsets.mak +++ b/scripts/asm-offsets.mak @@ -15,10 +15,14 @@ define sed-y s:->::; p;}' endef +define asm_offset_name + $(shell echo $(notdir $(1)) | tr [:lower:]- [:upper:]_) +endef + define make_asm_offsets (set -e; \ - echo "#ifndef __ASM_OFFSETS_H__"; \ - echo "#define __ASM_OFFSETS_H__"; \ + echo "#ifndef __$(strip $(asm_offset_name))_H__"; \ + echo "#define __$(strip $(asm_offset_name))_H__"; \ echo "/*"; \ echo " * Generated file. DO NOT MODIFY."; \ echo " *"; \ @@ -29,12 +33,16 @@ define make_asm_offsets echo "#endif" ) > $@ endef -$(asm-offsets:.h=.s): $(asm-offsets:.h=.c) - $(CC) $(CFLAGS) -fverbose-asm -S -o $@ $< +define gen_asm_offsets_rules +$(1).s: $(1).c + $(CC) $(CFLAGS) -fverbose-asm -S -o $$@ $$< + +$(1).h: $(1).s + $$(call make_asm_offsets,$(1)) + cp -f $$@ lib/generated/ +endef -$(asm-offsets): $(asm-offsets:.h=.s) - $(call make_asm_offsets) - cp -f $(asm-offsets) lib/generated/ +$(foreach o,$(asm-offsets),$(eval $(call gen_asm_offsets_rules, $(o:.h=)))) OBJDIRS += lib/generated -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 09:46:47 2025 From: cleger at rivosinc.com (Clément Léger) Date: Mon, 17 Mar 2025 17:46:47 +0100 Subject: [kvm-unit-tests PATCH v11 2/8] riscv: Set .aux.o files as .PRECIOUS In-Reply-To: <20250317164655.1120015-1-cleger@rivosinc.com> References: <20250317164655.1120015-1-cleger@rivosinc.com> Message-ID: <20250317164655.1120015-3-cleger@rivosinc.com> When compiling, we need to keep the .aux.o files; otherwise they are removed after compilation, which forces dependent files to be recompiled. Mark these files as .PRECIOUS to keep them.
Signed-off-by: Clément Léger Reviewed-by: Andrew Jones --- riscv/Makefile | 1 + 1 file changed, 1 insertion(+) diff --git a/riscv/Makefile b/riscv/Makefile index 52718f3f..ae9cf02a 100644 --- a/riscv/Makefile +++ b/riscv/Makefile @@ -90,6 +90,7 @@ CFLAGS += -I $(SRCDIR)/lib -I $(SRCDIR)/lib/libfdt -I lib -I $(SRCDIR)/riscv asm-offsets = lib/riscv/asm-offsets.h include $(SRCDIR)/scripts/asm-offsets.mak +.PRECIOUS: %.aux.o %.aux.o: $(SRCDIR)/lib/auxinfo.c $(CC) $(CFLAGS) -c -o $@ $< \ -DPROGNAME=\"$(notdir $(@:.aux.o=.$(exe)))\" -DAUXFLAGS=$(AUXFLAGS) -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 09:46:48 2025 From: cleger at rivosinc.com (Clément Léger) Date: Mon, 17 Mar 2025 17:46:48 +0100 Subject: [kvm-unit-tests PATCH v11 3/8] riscv: Use asm-offsets to generate SBI_EXT_HSM values In-Reply-To: <20250317164655.1120015-1-cleger@rivosinc.com> References: <20250317164655.1120015-1-cleger@rivosinc.com> Message-ID: <20250317164655.1120015-4-cleger@rivosinc.com> Replace hardcoded values with generated ones using sbi-asm-offsets. This allows ASM_SBI_EXT_HSM and ASM_SBI_EXT_HSM_HART_STOP to be used directly in assembly.
Signed-off-by: Cl?ment L?ger Reviewed-by: Andrew Jones --- riscv/Makefile | 2 +- riscv/sbi-asm.S | 6 ++++-- riscv/sbi-asm-offsets.c | 11 +++++++++++ riscv/.gitignore | 1 + 4 files changed, 17 insertions(+), 3 deletions(-) create mode 100644 riscv/sbi-asm-offsets.c create mode 100644 riscv/.gitignore diff --git a/riscv/Makefile b/riscv/Makefile index ae9cf02a..02d2ac39 100644 --- a/riscv/Makefile +++ b/riscv/Makefile @@ -87,7 +87,7 @@ CFLAGS += -ffreestanding CFLAGS += -O2 CFLAGS += -I $(SRCDIR)/lib -I $(SRCDIR)/lib/libfdt -I lib -I $(SRCDIR)/riscv -asm-offsets = lib/riscv/asm-offsets.h +asm-offsets = lib/riscv/asm-offsets.h riscv/sbi-asm-offsets.h include $(SRCDIR)/scripts/asm-offsets.mak .PRECIOUS: %.aux.o diff --git a/riscv/sbi-asm.S b/riscv/sbi-asm.S index f4185496..51f46efd 100644 --- a/riscv/sbi-asm.S +++ b/riscv/sbi-asm.S @@ -6,6 +6,8 @@ */ #include #include +#include +#include #include "sbi-tests.h" @@ -57,8 +59,8 @@ sbi_hsm_check: 7: lb t0, 0(t1) pause beqz t0, 7b - li a7, 0x48534d /* SBI_EXT_HSM */ - li a6, 1 /* SBI_EXT_HSM_HART_STOP */ + li a7, ASM_SBI_EXT_HSM + li a6, ASM_SBI_EXT_HSM_HART_STOP ecall 8: pause j 8b diff --git a/riscv/sbi-asm-offsets.c b/riscv/sbi-asm-offsets.c new file mode 100644 index 00000000..bd37b6a2 --- /dev/null +++ b/riscv/sbi-asm-offsets.c @@ -0,0 +1,11 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include +#include + +int main(void) +{ + DEFINE(ASM_SBI_EXT_HSM, SBI_EXT_HSM); + DEFINE(ASM_SBI_EXT_HSM_HART_STOP, SBI_EXT_HSM_HART_STOP); + + return 0; +} diff --git a/riscv/.gitignore b/riscv/.gitignore new file mode 100644 index 00000000..0a8c5a36 --- /dev/null +++ b/riscv/.gitignore @@ -0,0 +1 @@ +/*-asm-offsets.[hs] -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 09:46:49 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 17:46:49 +0100 Subject: [kvm-unit-tests PATCH v11 4/8] lib: riscv: Add functions for version checking In-Reply-To: <20250317164655.1120015-1-cleger@rivosinc.com> 
References: <20250317164655.1120015-1-cleger@rivosinc.com> Message-ID: <20250317164655.1120015-5-cleger@rivosinc.com> Version checking was done using some custom hardcoded values, backport a few SBI function and defines from Linux to do that cleanly. Signed-off-by: Cl?ment L?ger --- lib/riscv/asm/sbi.h | 15 +++++++++++++++ lib/riscv/sbi.c | 9 +++++++-- 2 files changed, 22 insertions(+), 2 deletions(-) diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h index 2f4d91ef..ee9d6e50 100644 --- a/lib/riscv/asm/sbi.h +++ b/lib/riscv/asm/sbi.h @@ -18,6 +18,13 @@ #define SBI_ERR_IO -13 #define SBI_ERR_DENIED_LOCKED -14 +/* SBI spec version fields */ +#define SBI_SPEC_VERSION_MAJOR_SHIFT 24 +#define SBI_SPEC_VERSION_MAJOR_MASK 0x7f +#define SBI_SPEC_VERSION_MINOR_MASK 0xffffff +#define SBI_SPEC_VERSION_MASK ((SBI_SPEC_VERSION_MAJOR_MASK << SBI_SPEC_VERSION_MAJOR_SHIFT) | \ + SBI_SPEC_VERSION_MINOR_MASK) + #ifndef __ASSEMBLER__ #include @@ -110,6 +117,13 @@ struct sbiret { long value; }; +/* Make SBI version */ +static inline unsigned long sbi_mk_version(unsigned long major, unsigned long minor) +{ + return ((major & SBI_SPEC_VERSION_MAJOR_MASK) << SBI_SPEC_VERSION_MAJOR_SHIFT) + | (minor & SBI_SPEC_VERSION_MINOR_MASK); +} + struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0, unsigned long arg1, unsigned long arg2, unsigned long arg3, unsigned long arg4, @@ -124,6 +138,7 @@ struct sbiret sbi_send_ipi_cpu(int cpu); struct sbiret sbi_send_ipi_cpumask(const cpumask_t *mask); struct sbiret sbi_send_ipi_broadcast(void); struct sbiret sbi_set_timer(unsigned long stime_value); +struct sbiret sbi_get_spec_version(void); long sbi_probe(int ext); #endif /* !__ASSEMBLER__ */ diff --git a/lib/riscv/sbi.c b/lib/riscv/sbi.c index 02dd338c..9d4eb541 100644 --- a/lib/riscv/sbi.c +++ b/lib/riscv/sbi.c @@ -107,12 +107,17 @@ struct sbiret sbi_set_timer(unsigned long stime_value) return sbi_ecall(SBI_EXT_TIME, SBI_EXT_TIME_SET_TIMER, stime_value, 0, 0, 0, 0, 0); } +struct sbiret 
sbi_get_spec_version(void) +{ + return sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_SPEC_VERSION, 0, 0, 0, 0, 0, 0); +} + long sbi_probe(int ext) { struct sbiret ret; - ret = sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_SPEC_VERSION, 0, 0, 0, 0, 0, 0); - assert(!ret.error && (ret.value & 0x7ffffffful) >= 2); + ret = sbi_get_spec_version(); + assert(!ret.error && (ret.value & SBI_SPEC_VERSION_MASK) >= sbi_mk_version(0, 2)); ret = sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_PROBE_EXT, ext, 0, 0, 0, 0, 0); assert(!ret.error); -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 09:46:50 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 17:46:50 +0100 Subject: [kvm-unit-tests PATCH v11 5/8] lib: riscv: Add functions to get implementer ID and version In-Reply-To: <20250317164655.1120015-1-cleger@rivosinc.com> References: <20250317164655.1120015-1-cleger@rivosinc.com> Message-ID: <20250317164655.1120015-6-cleger@rivosinc.com> These functions will be used by SSE tests to check for a specific OpenSBI version. 
Signed-off-by: Cl?ment L?ger --- lib/riscv/asm/sbi.h | 20 ++++++++++++++++++++ lib/riscv/sbi.c | 20 ++++++++++++++++++++ 2 files changed, 40 insertions(+) diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h index ee9d6e50..90111628 100644 --- a/lib/riscv/asm/sbi.h +++ b/lib/riscv/asm/sbi.h @@ -18,6 +18,19 @@ #define SBI_ERR_IO -13 #define SBI_ERR_DENIED_LOCKED -14 +#define SBI_IMPL_BBL 0 +#define SBI_IMPL_OPENSBI 1 +#define SBI_IMPL_XVISOR 2 +#define SBI_IMPL_KVM 3 +#define SBI_IMPL_RUSTSBI 4 +#define SBI_IMPL_DIOSIX 5 +#define SBI_IMPL_COFFER 6 +#define SBI_IMPL_XEN 7 +#define SBI_IMPL_POLARFIRE_HSS 8 +#define SBI_IMPL_COREBOOT 9 +#define SBI_IMPL_OREBOOT 10 +#define SBI_IMPL_BHYVE 11 + /* SBI spec version fields */ #define SBI_SPEC_VERSION_MAJOR_SHIFT 24 #define SBI_SPEC_VERSION_MAJOR_MASK 0x7f @@ -124,6 +137,11 @@ static inline unsigned long sbi_mk_version(unsigned long major, unsigned long mi | (minor & SBI_SPEC_VERSION_MINOR_MASK); } +static inline unsigned long sbi_impl_opensbi_mk_version(unsigned long major, unsigned long minor) +{ + return (((major & 0xffff) << 16) | (minor & 0xffff)); +} + struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0, unsigned long arg1, unsigned long arg2, unsigned long arg3, unsigned long arg4, @@ -139,6 +157,8 @@ struct sbiret sbi_send_ipi_cpumask(const cpumask_t *mask); struct sbiret sbi_send_ipi_broadcast(void); struct sbiret sbi_set_timer(unsigned long stime_value); struct sbiret sbi_get_spec_version(void); +unsigned long sbi_get_imp_version(void); +unsigned long sbi_get_imp_id(void); long sbi_probe(int ext); #endif /* !__ASSEMBLER__ */ diff --git a/lib/riscv/sbi.c b/lib/riscv/sbi.c index 9d4eb541..ab032e3e 100644 --- a/lib/riscv/sbi.c +++ b/lib/riscv/sbi.c @@ -107,6 +107,26 @@ struct sbiret sbi_set_timer(unsigned long stime_value) return sbi_ecall(SBI_EXT_TIME, SBI_EXT_TIME_SET_TIMER, stime_value, 0, 0, 0, 0, 0); } +unsigned long sbi_get_imp_version(void) +{ + struct sbiret ret; + + ret = sbi_ecall(SBI_EXT_BASE, 
SBI_EXT_BASE_GET_IMP_VERSION, 0, 0, 0, 0, 0, 0); + assert(!ret.error); + + return ret.value; +} + +unsigned long sbi_get_imp_id(void) +{ + struct sbiret ret; + + ret = sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_IMP_ID, 0, 0, 0, 0, 0, 0); + assert(!ret.error); + + return ret.value; +} + struct sbiret sbi_get_spec_version(void) { return sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_SPEC_VERSION, 0, 0, 0, 0, 0, 0); -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 09:46:51 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 17:46:51 +0100 Subject: [kvm-unit-tests PATCH v11 6/8] riscv: lib: Add SBI SSE extension definitions In-Reply-To: <20250317164655.1120015-1-cleger@rivosinc.com> References: <20250317164655.1120015-1-cleger@rivosinc.com> Message-ID: <20250317164655.1120015-7-cleger@rivosinc.com> Add SBI SSE extension definitions in sbi.h Signed-off-by: Cl?ment L?ger Reviewed-by: Andrew Jones --- lib/riscv/asm/sbi.h | 106 +++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 105 insertions(+), 1 deletion(-) diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h index 90111628..887a4cd4 100644 --- a/lib/riscv/asm/sbi.h +++ b/lib/riscv/asm/sbi.h @@ -50,6 +50,7 @@ enum sbi_ext_id { SBI_EXT_DBCN = 0x4442434E, SBI_EXT_SUSP = 0x53555350, SBI_EXT_FWFT = 0x46574654, + SBI_EXT_SSE = 0x535345, }; enum sbi_ext_base_fid { @@ -98,7 +99,6 @@ enum sbi_ext_dbcn_fid { SBI_EXT_DBCN_CONSOLE_WRITE_BYTE, }; - enum sbi_ext_fwft_fid { SBI_EXT_FWFT_SET = 0, SBI_EXT_FWFT_GET, @@ -125,6 +125,110 @@ enum sbi_ext_fwft_fid { #define SBI_FWFT_SET_FLAG_LOCK BIT(0) +enum sbi_ext_sse_fid { + SBI_EXT_SSE_READ_ATTRS = 0, + SBI_EXT_SSE_WRITE_ATTRS, + SBI_EXT_SSE_REGISTER, + SBI_EXT_SSE_UNREGISTER, + SBI_EXT_SSE_ENABLE, + SBI_EXT_SSE_DISABLE, + SBI_EXT_SSE_COMPLETE, + SBI_EXT_SSE_INJECT, + SBI_EXT_SSE_HART_UNMASK, + SBI_EXT_SSE_HART_MASK, +}; + +/* SBI SSE Event Attributes. 
*/ +enum sbi_sse_attr_id { + SBI_SSE_ATTR_STATUS = 0x00000000, + SBI_SSE_ATTR_PRIORITY = 0x00000001, + SBI_SSE_ATTR_CONFIG = 0x00000002, + SBI_SSE_ATTR_PREFERRED_HART = 0x00000003, + SBI_SSE_ATTR_ENTRY_PC = 0x00000004, + SBI_SSE_ATTR_ENTRY_ARG = 0x00000005, + SBI_SSE_ATTR_INTERRUPTED_SEPC = 0x00000006, + SBI_SSE_ATTR_INTERRUPTED_FLAGS = 0x00000007, + SBI_SSE_ATTR_INTERRUPTED_A6 = 0x00000008, + SBI_SSE_ATTR_INTERRUPTED_A7 = 0x00000009, +}; + +#define SBI_SSE_ATTR_STATUS_STATE_OFFSET 0 +#define SBI_SSE_ATTR_STATUS_STATE_MASK 0x3 +#define SBI_SSE_ATTR_STATUS_PENDING_OFFSET 2 +#define SBI_SSE_ATTR_STATUS_INJECT_OFFSET 3 + +#define SBI_SSE_ATTR_CONFIG_ONESHOT BIT(0) + +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPP BIT(0) +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPIE BIT(1) +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPV BIT(2) +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPVP BIT(3) +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPELP BIT(4) +#define SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT BIT(5) + +enum sbi_sse_state { + SBI_SSE_STATE_UNUSED = 0, + SBI_SSE_STATE_REGISTERED = 1, + SBI_SSE_STATE_ENABLED = 2, + SBI_SSE_STATE_RUNNING = 3, +}; + +/* SBI SSE Event IDs. 
*/ +/* Range 0x00000000 - 0x0000ffff */ +#define SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS 0x00000000 +#define SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP 0x00000001 +#define SBI_SSE_EVENT_LOCAL_RESERVED_0_START 0x00000002 +#define SBI_SSE_EVENT_LOCAL_RESERVED_0_END 0x00003fff +#define SBI_SSE_EVENT_LOCAL_PLAT_0_START 0x00004000 +#define SBI_SSE_EVENT_LOCAL_PLAT_0_END 0x00007fff + +#define SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS 0x00008000 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_0_START 0x00008001 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_0_END 0x0000bfff +#define SBI_SSE_EVENT_GLOBAL_PLAT_0_START 0x0000c000 +#define SBI_SSE_EVENT_GLOBAL_PLAT_0_END 0x0000ffff + +/* Range 0x00010000 - 0x0001ffff */ +#define SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW 0x00010000 +#define SBI_SSE_EVENT_LOCAL_RESERVED_1_START 0x00010001 +#define SBI_SSE_EVENT_LOCAL_RESERVED_1_END 0x00013fff +#define SBI_SSE_EVENT_LOCAL_PLAT_1_START 0x00014000 +#define SBI_SSE_EVENT_LOCAL_PLAT_1_END 0x00017fff + +#define SBI_SSE_EVENT_GLOBAL_RESERVED_1_START 0x00018000 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_1_END 0x0001bfff +#define SBI_SSE_EVENT_GLOBAL_PLAT_1_START 0x0001c000 +#define SBI_SSE_EVENT_GLOBAL_PLAT_1_END 0x0001ffff + +/* Range 0x00100000 - 0x0010ffff */ +#define SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS 0x00100000 +#define SBI_SSE_EVENT_LOCAL_RESERVED_2_START 0x00100001 +#define SBI_SSE_EVENT_LOCAL_RESERVED_2_END 0x00103fff +#define SBI_SSE_EVENT_LOCAL_PLAT_2_START 0x00104000 +#define SBI_SSE_EVENT_LOCAL_PLAT_2_END 0x00107fff + +#define SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS 0x00108000 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_2_START 0x00108001 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_2_END 0x0010bfff +#define SBI_SSE_EVENT_GLOBAL_PLAT_2_START 0x0010c000 +#define SBI_SSE_EVENT_GLOBAL_PLAT_2_END 0x0010ffff + +/* Range 0xffff0000 - 0xffffffff */ +#define SBI_SSE_EVENT_LOCAL_SOFTWARE 0xffff0000 +#define SBI_SSE_EVENT_LOCAL_RESERVED_3_START 0xffff0001 +#define SBI_SSE_EVENT_LOCAL_RESERVED_3_END 0xffff3fff +#define SBI_SSE_EVENT_LOCAL_PLAT_3_START 
0xffff4000 +#define SBI_SSE_EVENT_LOCAL_PLAT_3_END 0xffff7fff + +#define SBI_SSE_EVENT_GLOBAL_SOFTWARE 0xffff8000 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_3_START 0xffff8001 +#define SBI_SSE_EVENT_GLOBAL_RESERVED_3_END 0xffffbfff +#define SBI_SSE_EVENT_GLOBAL_PLAT_3_START 0xffffc000 +#define SBI_SSE_EVENT_GLOBAL_PLAT_3_END 0xffffffff + +#define SBI_SSE_EVENT_PLATFORM_BIT BIT(14) +#define SBI_SSE_EVENT_GLOBAL_BIT BIT(15) + struct sbiret { long error; long value; -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 09:46:52 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 17:46:52 +0100 Subject: [kvm-unit-tests PATCH v11 7/8] lib: riscv: Add SBI SSE support In-Reply-To: <20250317164655.1120015-1-cleger@rivosinc.com> References: <20250317164655.1120015-1-cleger@rivosinc.com> Message-ID: <20250317164655.1120015-8-cleger@rivosinc.com> Add support for registering and handling SSE events. This will be used for sbi tests as well as upcoming double trap tests. 
Signed-off-by: Clément Léger Reviewed-by: Andrew Jones --- riscv/Makefile | 1 + lib/riscv/asm/csr.h | 1 + lib/riscv/asm/sbi.h | 36 ++++++++++++++ lib/riscv/sbi-sse-asm.S | 102 ++++++++++++++++++++++++++++++++++++++++ lib/riscv/asm-offsets.c | 9 ++++ lib/riscv/sbi.c | 76 ++++++++++++++++++++++++++++++ 6 files changed, 225 insertions(+) create mode 100644 lib/riscv/sbi-sse-asm.S diff --git a/riscv/Makefile b/riscv/Makefile index 02d2ac39..16fc125b 100644 --- a/riscv/Makefile +++ b/riscv/Makefile @@ -43,6 +43,7 @@ cflatobjs += lib/riscv/setup.o cflatobjs += lib/riscv/smp.o cflatobjs += lib/riscv/stack.o cflatobjs += lib/riscv/timer.o +cflatobjs += lib/riscv/sbi-sse-asm.o ifeq ($(ARCH),riscv32) cflatobjs += lib/ldiv32.o endif diff --git a/lib/riscv/asm/csr.h b/lib/riscv/asm/csr.h index c7fc87a9..3e4b5fca 100644 --- a/lib/riscv/asm/csr.h +++ b/lib/riscv/asm/csr.h @@ -17,6 +17,7 @@ #define CSR_TIME 0xc01 #define SR_SIE _AC(0x00000002, UL) +#define SR_SPP _AC(0x00000100, UL) /* Exception cause high bit - is an interrupt if set */ #define CAUSE_IRQ_FLAG (_AC(1, UL) << (__riscv_xlen - 1)) diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h index 887a4cd4..ad5f50de 100644 --- a/lib/riscv/asm/sbi.h +++ b/lib/riscv/asm/sbi.h @@ -265,5 +265,41 @@ unsigned long sbi_get_imp_version(void); unsigned long sbi_get_imp_id(void); long sbi_probe(int ext); +typedef void (*sbi_sse_handler_fn)(void *data, struct pt_regs *regs, unsigned int hartid); + +struct sbi_sse_handler_arg { + unsigned long reg_tmp; + sbi_sse_handler_fn handler; + void *handler_data; + void *stack; +}; + +extern void sbi_sse_entry(void); + +static inline bool sbi_sse_event_is_global(uint32_t event_id) +{ + return !!(event_id & SBI_SSE_EVENT_GLOBAL_BIT); +} + +struct sbiret sbi_sse_read_attrs_raw(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long phys_lo, + unsigned long phys_hi); +struct sbiret sbi_sse_read_attrs(unsigned long event_id, unsigned long base_attr_id, +
unsigned long attr_count, unsigned long *values); +struct sbiret sbi_sse_write_attrs_raw(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long phys_lo, + unsigned long phys_hi); +struct sbiret sbi_sse_write_attrs(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long *values); +struct sbiret sbi_sse_register_raw(unsigned long event_id, unsigned long entry_pc, + unsigned long entry_arg); +struct sbiret sbi_sse_register(unsigned long event_id, struct sbi_sse_handler_arg *arg); +struct sbiret sbi_sse_unregister(unsigned long event_id); +struct sbiret sbi_sse_enable(unsigned long event_id); +struct sbiret sbi_sse_disable(unsigned long event_id); +struct sbiret sbi_sse_hart_mask(void); +struct sbiret sbi_sse_hart_unmask(void); +struct sbiret sbi_sse_inject(unsigned long event_id, unsigned long hart_id); + #endif /* !__ASSEMBLER__ */ #endif /* _ASMRISCV_SBI_H_ */ diff --git a/lib/riscv/sbi-sse-asm.S b/lib/riscv/sbi-sse-asm.S new file mode 100644 index 00000000..b9e951f5 --- /dev/null +++ b/lib/riscv/sbi-sse-asm.S @@ -0,0 +1,102 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * RISC-V SSE events entry point. 
+ * + * Copyright (C) 2025, Rivos Inc., Clément Léger + */ +#include +#include +#include +#include + +.section .text +.global sbi_sse_entry +sbi_sse_entry: + /* Save stack temporarily */ + REG_S sp, SBI_SSE_REG_TMP(a7) + /* Set entry stack */ + REG_L sp, SBI_SSE_HANDLER_STACK(a7) + + addi sp, sp, -(PT_SIZE) + REG_S ra, PT_RA(sp) + REG_S s0, PT_S0(sp) + REG_S s1, PT_S1(sp) + REG_S s2, PT_S2(sp) + REG_S s3, PT_S3(sp) + REG_S s4, PT_S4(sp) + REG_S s5, PT_S5(sp) + REG_S s6, PT_S6(sp) + REG_S s7, PT_S7(sp) + REG_S s8, PT_S8(sp) + REG_S s9, PT_S9(sp) + REG_S s10, PT_S10(sp) + REG_S s11, PT_S11(sp) + REG_S tp, PT_TP(sp) + REG_S t0, PT_T0(sp) + REG_S t1, PT_T1(sp) + REG_S t2, PT_T2(sp) + REG_S t3, PT_T3(sp) + REG_S t4, PT_T4(sp) + REG_S t5, PT_T5(sp) + REG_S t6, PT_T6(sp) + REG_S gp, PT_GP(sp) + REG_S a0, PT_A0(sp) + REG_S a1, PT_A1(sp) + REG_S a2, PT_A2(sp) + REG_S a3, PT_A3(sp) + REG_S a4, PT_A4(sp) + REG_S a5, PT_A5(sp) + csrr a1, CSR_SEPC + REG_S a1, PT_EPC(sp) + csrr a2, CSR_SSTATUS + REG_S a2, PT_STATUS(sp) + + REG_L a0, SBI_SSE_REG_TMP(a7) + REG_S a0, PT_SP(sp) + + REG_L t0, SBI_SSE_HANDLER(a7) + REG_L a0, SBI_SSE_HANDLER_DATA(a7) + mv a1, sp + mv a2, a6 + jalr t0 + + REG_L a1, PT_EPC(sp) + REG_L a2, PT_STATUS(sp) + csrw CSR_SEPC, a1 + csrw CSR_SSTATUS, a2 + + REG_L ra, PT_RA(sp) + REG_L s0, PT_S0(sp) + REG_L s1, PT_S1(sp) + REG_L s2, PT_S2(sp) + REG_L s3, PT_S3(sp) + REG_L s4, PT_S4(sp) + REG_L s5, PT_S5(sp) + REG_L s6, PT_S6(sp) + REG_L s7, PT_S7(sp) + REG_L s8, PT_S8(sp) + REG_L s9, PT_S9(sp) + REG_L s10, PT_S10(sp) + REG_L s11, PT_S11(sp) + REG_L tp, PT_TP(sp) + REG_L t0, PT_T0(sp) + REG_L t1, PT_T1(sp) + REG_L t2, PT_T2(sp) + REG_L t3, PT_T3(sp) + REG_L t4, PT_T4(sp) + REG_L t5, PT_T5(sp) + REG_L t6, PT_T6(sp) + REG_L gp, PT_GP(sp) + REG_L a0, PT_A0(sp) + REG_L a1, PT_A1(sp) + REG_L a2, PT_A2(sp) + REG_L a3, PT_A3(sp) + REG_L a4, PT_A4(sp) + REG_L a5, PT_A5(sp) + + REG_L sp, PT_SP(sp) + + li a7, ASM_SBI_EXT_SSE + li a6, ASM_SBI_EXT_SSE_COMPLETE + ecall + + diff
--git a/lib/riscv/asm-offsets.c b/lib/riscv/asm-offsets.c index 6c511c14..a96c6e97 100644 --- a/lib/riscv/asm-offsets.c +++ b/lib/riscv/asm-offsets.c @@ -3,6 +3,7 @@ #include #include #include +#include #include int main(void) @@ -63,5 +64,13 @@ int main(void) OFFSET(THREAD_INFO_HARTID, thread_info, hartid); DEFINE(THREAD_INFO_SIZE, sizeof(struct thread_info)); + DEFINE(ASM_SBI_EXT_SSE, SBI_EXT_SSE); + DEFINE(ASM_SBI_EXT_SSE_COMPLETE, SBI_EXT_SSE_COMPLETE); + + OFFSET(SBI_SSE_REG_TMP, sbi_sse_handler_arg, reg_tmp); + OFFSET(SBI_SSE_HANDLER, sbi_sse_handler_arg, handler); + OFFSET(SBI_SSE_HANDLER_DATA, sbi_sse_handler_arg, handler_data); + OFFSET(SBI_SSE_HANDLER_STACK, sbi_sse_handler_arg, stack); + return 0; } diff --git a/lib/riscv/sbi.c b/lib/riscv/sbi.c index ab032e3e..53d25489 100644 --- a/lib/riscv/sbi.c +++ b/lib/riscv/sbi.c @@ -2,6 +2,7 @@ #include #include #include +#include #include #include @@ -31,6 +32,81 @@ struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0, return ret; } +struct sbiret sbi_sse_read_attrs_raw(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long phys_lo, + unsigned long phys_hi) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_READ_ATTRS, event_id, base_attr_id, attr_count, + phys_lo, phys_hi, 0); +} + +struct sbiret sbi_sse_read_attrs(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long *values) +{ + phys_addr_t p = virt_to_phys(values); + + return sbi_sse_read_attrs_raw(event_id, base_attr_id, attr_count, lower_32_bits(p), + upper_32_bits(p)); +} + +struct sbiret sbi_sse_write_attrs_raw(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, unsigned long phys_lo, + unsigned long phys_hi) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_WRITE_ATTRS, event_id, base_attr_id, attr_count, + phys_lo, phys_hi, 0); +} + +struct sbiret sbi_sse_write_attrs(unsigned long event_id, unsigned long base_attr_id, + unsigned long attr_count, 
unsigned long *values) +{ + phys_addr_t p = virt_to_phys(values); + + return sbi_sse_write_attrs_raw(event_id, base_attr_id, attr_count, lower_32_bits(p), + upper_32_bits(p)); +} + +struct sbiret sbi_sse_register_raw(unsigned long event_id, unsigned long entry_pc, + unsigned long entry_arg) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_REGISTER, event_id, entry_pc, entry_arg, 0, 0, 0); +} + +struct sbiret sbi_sse_register(unsigned long event_id, struct sbi_sse_handler_arg *arg) +{ + return sbi_sse_register_raw(event_id, (unsigned long)sbi_sse_entry, (unsigned long)arg); +} + +struct sbiret sbi_sse_unregister(unsigned long event_id) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_UNREGISTER, event_id, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_sse_enable(unsigned long event_id) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_ENABLE, event_id, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_sse_disable(unsigned long event_id) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_DISABLE, event_id, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_sse_hart_mask(void) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_HART_MASK, 0, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_sse_hart_unmask(void) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_HART_UNMASK, 0, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_sse_inject(unsigned long event_id, unsigned long hart_id) +{ + return sbi_ecall(SBI_EXT_SSE, SBI_EXT_SSE_INJECT, event_id, hart_id, 0, 0, 0, 0); +} + void sbi_shutdown(void) { sbi_ecall(SBI_EXT_SRST, 0, 0, 0, 0, 0, 0, 0); -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 09:46:53 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 17:46:53 +0100 Subject: [kvm-unit-tests PATCH v11 8/8] riscv: sbi: Add SSE extension tests In-Reply-To: <20250317164655.1120015-1-cleger@rivosinc.com> References: <20250317164655.1120015-1-cleger@rivosinc.com> Message-ID: <20250317164655.1120015-9-cleger@rivosinc.com> Add SBI SSE extension tests for the following features: - Test attributes 
errors (invalid values, RO, etc) - Registration errors - Simple events (register, enable, inject) - Events with different priorities - Global events dispatch on different harts - Local events on all harts - Hart mask/unmask events Signed-off-by: Clément Léger Reviewed-by: Andrew Jones --- riscv/Makefile | 1 + riscv/sbi-tests.h | 1 + riscv/sbi-sse.c | 1278 +++++++++++++++++++++++++++++++++++++++++++++ riscv/sbi.c | 2 + 4 files changed, 1282 insertions(+) create mode 100644 riscv/sbi-sse.c diff --git a/riscv/Makefile b/riscv/Makefile index 16fc125b..4fe2f1bb 100644 --- a/riscv/Makefile +++ b/riscv/Makefile @@ -18,6 +18,7 @@ tests += $(TEST_DIR)/sieve.$(exe) all: $(tests) $(TEST_DIR)/sbi-deps = $(TEST_DIR)/sbi-asm.o $(TEST_DIR)/sbi-fwft.o +$(TEST_DIR)/sbi-deps += $(TEST_DIR)/sbi-sse.o # When built for EFI sieve needs extra memory, run with e.g. '-m 256' on QEMU $(TEST_DIR)/sieve.$(exe): AUXFLAGS = 0x1 diff --git a/riscv/sbi-tests.h b/riscv/sbi-tests.h index b081464d..a71da809 100644 --- a/riscv/sbi-tests.h +++ b/riscv/sbi-tests.h @@ -71,6 +71,7 @@ sbiret_report(ret, expected_error, expected_value, "check sbi.error and sbi.value") void sbi_bad_fid(int ext); +void check_sse(void); #endif /* __ASSEMBLER__ */ #endif /* _RISCV_SBI_TESTS_H_ */ diff --git a/riscv/sbi-sse.c b/riscv/sbi-sse.c new file mode 100644 index 00000000..a31c84c3 --- /dev/null +++ b/riscv/sbi-sse.c @@ -0,0 +1,1278 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * SBI SSE testsuite + * + * Copyright (C) 2025, Rivos Inc., Clément Léger + */ +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "sbi-tests.h" + +#define SSE_STACK_SIZE PAGE_SIZE + +struct sse_event_info { + uint32_t event_id; + const char *name; + bool can_inject; +}; + +static struct sse_event_info sse_event_infos[] = { + { + .event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, + .name = "local_high_prio_ras", + }, + { + .event_id =
SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP, + .name = "double_trap", + }, + { + .event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, + .name = "global_high_prio_ras", + }, + { + .event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, + .name = "local_pmu_overflow", + }, + { + .event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, + .name = "local_low_prio_ras", + }, + { + .event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, + .name = "global_low_prio_ras", + }, + { + .event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, + .name = "local_software", + }, + { + .event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, + .name = "global_software", + }, +}; + +static const char *const attr_names[] = { + [SBI_SSE_ATTR_STATUS] = "status", + [SBI_SSE_ATTR_PRIORITY] = "priority", + [SBI_SSE_ATTR_CONFIG] = "config", + [SBI_SSE_ATTR_PREFERRED_HART] = "preferred_hart", + [SBI_SSE_ATTR_ENTRY_PC] = "entry_pc", + [SBI_SSE_ATTR_ENTRY_ARG] = "entry_arg", + [SBI_SSE_ATTR_INTERRUPTED_SEPC] = "interrupted_sepc", + [SBI_SSE_ATTR_INTERRUPTED_FLAGS] = "interrupted_flags", + [SBI_SSE_ATTR_INTERRUPTED_A6] = "interrupted_a6", + [SBI_SSE_ATTR_INTERRUPTED_A7] = "interrupted_a7", +}; + +static const unsigned long ro_attrs[] = { + SBI_SSE_ATTR_STATUS, + SBI_SSE_ATTR_ENTRY_PC, + SBI_SSE_ATTR_ENTRY_ARG, +}; + +static const unsigned long interrupted_attrs[] = { + SBI_SSE_ATTR_INTERRUPTED_SEPC, + SBI_SSE_ATTR_INTERRUPTED_FLAGS, + SBI_SSE_ATTR_INTERRUPTED_A6, + SBI_SSE_ATTR_INTERRUPTED_A7, +}; + +static const unsigned long interrupted_flags[] = { + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPP, + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPIE, + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SPELP, + SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT, + SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPV, + SBI_SSE_ATTR_INTERRUPTED_FLAGS_HSTATUS_SPVP, +}; + +static struct sse_event_info *sse_event_get_info(uint32_t event_id) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(sse_event_infos); i++) { + if (sse_event_infos[i].event_id == event_id) + return &sse_event_infos[i]; + } + + assert_msg(false,
"Invalid event id: %d", event_id); +} + +static const char *sse_event_name(uint32_t event_id) +{ + return sse_event_get_info(event_id)->name; +} + +static bool sse_event_can_inject(uint32_t event_id) +{ + return sse_event_get_info(event_id)->can_inject; +} + +static struct sbiret sse_get_event_status_field(uint32_t event_id, unsigned long mask, + unsigned long shift, unsigned long *value) +{ + struct sbiret ret; + unsigned long status; + + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, 1, &status); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "Get event status"); + return ret; + } + + *value = (status & mask) >> shift; + + return ret; +} + +static struct sbiret sse_event_get_state(uint32_t event_id, enum sbi_sse_state *state) +{ + unsigned long status = 0; + struct sbiret ret; + + ret = sse_get_event_status_field(event_id, SBI_SSE_ATTR_STATUS_STATE_MASK, + SBI_SSE_ATTR_STATUS_STATE_OFFSET, &status); + *state = status; + + return ret; +} + +static unsigned long sse_global_event_set_current_hart(uint32_t event_id) +{ + struct sbiret ret; + unsigned long current_hart = current_thread_info()->hartid; + + assert(sbi_sse_event_is_global(event_id)); + + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &current_hart); + if (sbiret_report_error(&ret, SBI_SUCCESS, "Set preferred hart")) + return ret.error; + + return 0; +} + +static bool sse_check_state(uint32_t event_id, unsigned long expected_state) +{ + struct sbiret ret; + enum sbi_sse_state state; + + ret = sse_event_get_state(event_id, &state); + if (ret.error) + return false; + + return report(state == expected_state, "event status == %ld", expected_state); +} + +static bool sse_event_pending(uint32_t event_id) +{ + bool pending = 0; + + sse_get_event_status_field(event_id, BIT(SBI_SSE_ATTR_STATUS_PENDING_OFFSET), + SBI_SSE_ATTR_STATUS_PENDING_OFFSET, (unsigned long *)&pending); + + return pending; +} + +static void *sse_alloc_stack(void) +{ + /* + * We assume that SSE_STACK_SIZE
always fit in one page. This page will + * always be decremented before storing anything on it in sse-entry.S. + */ + assert(SSE_STACK_SIZE <= PAGE_SIZE); + + return (alloc_page() + SSE_STACK_SIZE); +} + +static void sse_free_stack(void *stack) +{ + free_page(stack - SSE_STACK_SIZE); +} + +static void sse_read_write_test(uint32_t event_id, unsigned long attr, unsigned long attr_count, + unsigned long *value, long expected_error, const char *str) +{ + struct sbiret ret; + + ret = sbi_sse_read_attrs(event_id, attr, attr_count, value); + sbiret_report_error(&ret, expected_error, "Read %s error", str); + + ret = sbi_sse_write_attrs(event_id, attr, attr_count, value); + sbiret_report_error(&ret, expected_error, "Write %s error", str); +} + +#define ALL_ATTRS_COUNT (SBI_SSE_ATTR_INTERRUPTED_A7 + 1) + +static void sse_test_attrs(uint32_t event_id) +{ + unsigned long value = 0; + struct sbiret ret; + void *ptr; + unsigned long values[ALL_ATTRS_COUNT]; + unsigned int i; + const char *invalid_hart_str; + const char *attr_name; + + report_prefix_push("attrs"); + + for (i = 0; i < ARRAY_SIZE(ro_attrs); i++) { + ret = sbi_sse_write_attrs(event_id, ro_attrs[i], 1, &value); + sbiret_report_error(&ret, SBI_ERR_DENIED, "RO attribute %s not writable", + attr_names[ro_attrs[i]]); + } + + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, ALL_ATTRS_COUNT, values); + sbiret_report_error(&ret, SBI_SUCCESS, "Read multiple attributes"); + + for (i = SBI_SSE_ATTR_STATUS; i <= SBI_SSE_ATTR_INTERRUPTED_A7; i++) { + ret = sbi_sse_read_attrs(event_id, i, 1, &value); + attr_name = attr_names[i]; + + sbiret_report_error(&ret, SBI_SUCCESS, "Read single attribute %s", attr_name); + if (values[i] != value) + report_fail("Attribute 0x%x single value read (0x%lx) differs from the one read with multiple attributes (0x%lx)", + i, value, values[i]); + /* + * Preferred hart reset value is defined by SBI vendor + */ + if (i != SBI_SSE_ATTR_PREFERRED_HART) { + /* + * Specification states that 
injectable bit is implementation dependent + * but other bits are zero-initialized. + */ + if (i == SBI_SSE_ATTR_STATUS) + value &= ~BIT(SBI_SSE_ATTR_STATUS_INJECT_OFFSET); + report(value == 0, "Attribute %s reset value is 0, found %lx", attr_name, value); + } + } + +#if __riscv_xlen > 32 + value = BIT(32); + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Write invalid prio > 0xFFFFFFFF error"); +#endif + + value = ~SBI_SSE_ATTR_CONFIG_ONESHOT; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Write invalid config value error"); + + if (sbi_sse_event_is_global(event_id)) { + invalid_hart_str = getenv("INVALID_HART_ID"); + if (!invalid_hart_str) + value = 0xFFFFFFFFUL; + else + value = strtoul(invalid_hart_str, NULL, 0); + + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid hart id error"); + } else { + /* Set Hart on local event -> RO */ + value = current_thread_info()->hartid; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); + sbiret_report_error(&ret, SBI_ERR_DENIED, + "Set hart id on local event error"); + } + + /* Set/get flags, sepc, a6, a7 */ + for (i = 0; i < ARRAY_SIZE(interrupted_attrs); i++) { + attr_name = attr_names[interrupted_attrs[i]]; + ret = sbi_sse_read_attrs(event_id, interrupted_attrs[i], 1, &value); + sbiret_report_error(&ret, SBI_SUCCESS, "Get interrupted %s", attr_name); + + value = ARRAY_SIZE(interrupted_attrs) - i; + ret = sbi_sse_write_attrs(event_id, interrupted_attrs[i], 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, + "Set attribute %s invalid state error", attr_name); + } + + sse_read_write_test(event_id, SBI_SSE_ATTR_STATUS, 0, &value, SBI_ERR_INVALID_PARAM, + "attribute attr_count == 0"); + sse_read_write_test(event_id, SBI_SSE_ATTR_INTERRUPTED_A7 + 1, 
1, &value, SBI_ERR_BAD_RANGE, + "invalid attribute"); + + /* Misaligned pointer address */ + ptr = (void *)&value; + ptr += 1; + sse_read_write_test(event_id, SBI_SSE_ATTR_STATUS, 1, ptr, SBI_ERR_INVALID_ADDRESS, + "attribute with invalid address"); + + report_prefix_pop(); +} + +static void sse_test_register_error(uint32_t event_id) +{ + struct sbiret ret; + + report_prefix_push("register"); + + ret = sbi_sse_unregister(event_id); + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "unregister non-registered event"); + + ret = sbi_sse_register_raw(event_id, 0x1, 0); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "register misaligned entry"); + + ret = sbi_sse_register(event_id, NULL); + sbiret_report_error(&ret, SBI_SUCCESS, "register"); + if (ret.error) + goto done; + + ret = sbi_sse_register(event_id, NULL); + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "register used event failure"); + + ret = sbi_sse_unregister(event_id); + sbiret_report_error(&ret, SBI_SUCCESS, "unregister"); + +done: + report_prefix_pop(); +} + +struct sse_simple_test_arg { + bool done; + unsigned long expected_a6; + uint32_t event_id; +}; + +#if __riscv_xlen > 32 + +struct alias_test_params { + unsigned long event_id; + unsigned long attr_id; + unsigned long attr_count; + const char *str; +}; + +static void test_alias(uint32_t event_id) +{ + struct alias_test_params *write, *read; + unsigned long write_value, read_value; + struct sbiret ret; + bool err = false; + int r, w; + struct alias_test_params params[] = { + {event_id, SBI_SSE_ATTR_INTERRUPTED_A6, 1, "non aliased"}, + {BIT(32) + event_id, SBI_SSE_ATTR_INTERRUPTED_A6, 1, "aliased event_id"}, + {event_id, BIT(32) + SBI_SSE_ATTR_INTERRUPTED_A6, 1, "aliased attr_id"}, + {event_id, SBI_SSE_ATTR_INTERRUPTED_A6, BIT(32) + 1, "aliased attr_count"}, + }; + + report_prefix_push("alias"); + for (w = 0; w < ARRAY_SIZE(params); w++) { + write = &params[w]; + + write_value = 0xDEADBEEF + w; + ret = sbi_sse_write_attrs(write->event_id,
write->attr_id, write->attr_count, &write_value); + if (ret.error) + sbiret_report_error(&ret, SBI_SUCCESS, "Write %s, event 0x%lx attr 0x%lx, attr count 0x%lx", + write->str, write->event_id, write->attr_id, write->attr_count); + + for (r = 0; r < ARRAY_SIZE(params); r++) { + read = &params[r]; + read_value = 0; + ret = sbi_sse_read_attrs(read->event_id, read->attr_id, read->attr_count, &read_value); + if (ret.error) + sbiret_report_error(&ret, SBI_SUCCESS, + "Read %s, event 0x%lx attr 0x%lx, attr count 0x%lx", + read->str, read->event_id, read->attr_id, read->attr_count); + + /* Do not spam output with a lot of reports */ + if (write_value != read_value) { + err = true; + report_fail("Write %s, event 0x%lx attr 0x%lx, attr count 0x%lx value %lx ==" + "Read %s, event 0x%lx attr 0x%lx, attr count 0x%lx value %lx", + write->str, write->event_id, write->attr_id, + write->attr_count, write_value, read->str, + read->event_id, read->attr_id, read->attr_count, + read_value); + } + } + } + + report(!err, "BIT(32) aliasing tests"); + report_prefix_pop(); +} +#endif + +static void sse_simple_handler(void *data, struct pt_regs *regs, unsigned int hartid) +{ + struct sse_simple_test_arg *arg = data; + int i; + struct sbiret ret; + const char *attr_name; + uint32_t event_id = READ_ONCE(arg->event_id), attr; + unsigned long value, prev_value, flags; + unsigned long interrupted_state[ARRAY_SIZE(interrupted_attrs)]; + unsigned long modified_state[ARRAY_SIZE(interrupted_attrs)] = {4, 3, 2, 1}; + unsigned long tmp_state[ARRAY_SIZE(interrupted_attrs)]; + + report((regs->status & SR_SPP) == SR_SPP, "Interrupted S-mode"); + report(hartid == current_thread_info()->hartid, "Hartid correctly passed"); + sse_check_state(event_id, SBI_SSE_STATE_RUNNING); + report(!sse_event_pending(event_id), "Event not pending"); + + /* Read full interrupted state */ + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, + ARRAY_SIZE(interrupted_attrs), interrupted_state); +
sbiret_report_error(&ret, SBI_SUCCESS, "Save full interrupted state from handler"); + + /* Write full modified state and read it */ + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, + ARRAY_SIZE(modified_state), modified_state); + sbiret_report_error(&ret, SBI_SUCCESS, + "Write full interrupted state from handler"); + + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, + ARRAY_SIZE(tmp_state), tmp_state); + sbiret_report_error(&ret, SBI_SUCCESS, "Read full modified state from handler"); + + report(memcmp(tmp_state, modified_state, sizeof(modified_state)) == 0, + "Full interrupted state successfully written"); + +#if __riscv_xlen > 32 + test_alias(event_id); +#endif + + /* Restore full saved state */ + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_INTERRUPTED_SEPC, + ARRAY_SIZE(interrupted_attrs), interrupted_state); + sbiret_report_error(&ret, SBI_SUCCESS, "Full interrupted state restore from handler"); + + /* We test SBI_SSE_ATTR_INTERRUPTED_FLAGS below with specific flag values */ + for (i = 0; i < ARRAY_SIZE(interrupted_attrs); i++) { + attr = interrupted_attrs[i]; + if (attr == SBI_SSE_ATTR_INTERRUPTED_FLAGS) + continue; + + attr_name = attr_names[attr]; + + ret = sbi_sse_read_attrs(event_id, attr, 1, &prev_value); + sbiret_report_error(&ret, SBI_SUCCESS, "Get attr %s", attr_name); + + value = 0xDEADBEEF + i; + ret = sbi_sse_write_attrs(event_id, attr, 1, &value); + sbiret_report_error(&ret, SBI_SUCCESS, "Set attr %s", attr_name); + + ret = sbi_sse_read_attrs(event_id, attr, 1, &value); + sbiret_report_error(&ret, SBI_SUCCESS, "Get attr %s", attr_name); + report(value == 0xDEADBEEF + i, "Get attr %s, value: 0x%lx", attr_name, value); + + ret = sbi_sse_write_attrs(event_id, attr, 1, &prev_value); + sbiret_report_error(&ret, SBI_SUCCESS, "Restore attr %s value", attr_name); + } + + /* Test all flags allowed for SBI_SSE_ATTR_INTERRUPTED_FLAGS */ + attr = SBI_SSE_ATTR_INTERRUPTED_FLAGS; + ret = sbi_sse_read_attrs(event_id, 
attr, 1, &prev_value); + sbiret_report_error(&ret, SBI_SUCCESS, "Save interrupted flags"); + + for (i = 0; i < ARRAY_SIZE(interrupted_flags); i++) { + flags = interrupted_flags[i]; + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); + sbiret_report_error(&ret, SBI_SUCCESS, + "Set interrupted flags bit 0x%lx value", flags); + ret = sbi_sse_read_attrs(event_id, attr, 1, &value); + sbiret_report_error(&ret, SBI_SUCCESS, "Get interrupted flags after set"); + report(value == flags, "interrupted flags modified value: 0x%lx", value); + } + + /* Write invalid bit in flag register */ + flags = SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT << 1; + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", + flags); + + flags = BIT(SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT + 1); + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", + flags); + + ret = sbi_sse_write_attrs(event_id, attr, 1, &prev_value); + sbiret_report_error(&ret, SBI_SUCCESS, "Restore interrupted flags"); + + /* Try to change HARTID/Priority while running */ + if (sbi_sse_event_is_global(event_id)) { + value = current_thread_info()->hartid; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "Set hart id while running error"); + } + + value = 0; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_STATE, "Set priority while running error"); + + value = READ_ONCE(arg->expected_a6); + report(interrupted_state[2] == value, "Interrupted state a6, expected 0x%lx, got 0x%lx", + value, interrupted_state[2]); + + report(interrupted_state[3] == SBI_EXT_SSE, + "Interrupted state a7, expected 0x%x, got 0x%lx", SBI_EXT_SSE, + interrupted_state[3]); + + WRITE_ONCE(arg->done, true); +} + +static 
void sse_test_inject_simple(uint32_t event_id) +{ + unsigned long value, error; + struct sbiret ret; + enum sbi_sse_state state = SBI_SSE_STATE_UNUSED; + struct sse_simple_test_arg test_arg = {.event_id = event_id}; + struct sbi_sse_handler_arg args = { + .handler = sse_simple_handler, + .handler_data = (void *)&test_arg, + .stack = sse_alloc_stack(), + }; + + report_prefix_push("simple"); + + if (!sse_check_state(event_id, SBI_SSE_STATE_UNUSED)) + goto cleanup; + + ret = sbi_sse_register(event_id, &args); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "register")) + goto cleanup; + + state = SBI_SSE_STATE_REGISTERED; + + if (!sse_check_state(event_id, SBI_SSE_STATE_REGISTERED)) + goto cleanup; + + if (sbi_sse_event_is_global(event_id)) { + /* Be sure global events are targeting the current hart */ + error = sse_global_event_set_current_hart(event_id); + if (error) + goto cleanup; + } + + ret = sbi_sse_enable(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "enable")) + goto cleanup; + + state = SBI_SSE_STATE_ENABLED; + if (!sse_check_state(event_id, SBI_SSE_STATE_ENABLED)) + goto cleanup; + + ret = sbi_sse_hart_mask(); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "hart mask")) + goto cleanup; + + ret = sbi_sse_inject(event_id, current_thread_info()->hartid); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "injection masked")) { + sbi_sse_hart_unmask(); + goto cleanup; + } + + report(READ_ONCE(test_arg.done) == 0, "event masked not handled"); + + /* + * When unmasking the SSE events, we expect it to be injected + * immediately so a6 should be SBI_EXT_SBI_SSE_HART_UNMASK + */ + WRITE_ONCE(test_arg.expected_a6, SBI_EXT_SSE_HART_UNMASK); + ret = sbi_sse_hart_unmask(); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "hart unmask")) + goto cleanup; + + report(READ_ONCE(test_arg.done) == 1, "event unmasked handled"); + WRITE_ONCE(test_arg.done, 0); + WRITE_ONCE(test_arg.expected_a6, SBI_EXT_SSE_INJECT); + + /* Set as oneshot and verify it is disabled */ + ret = 
sbi_sse_disable(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Disable event")) { + /* Nothing we can really do here, event can not be disabled */ + goto cleanup; + } + state = SBI_SSE_STATE_REGISTERED; + + value = SBI_SSE_ATTR_CONFIG_ONESHOT; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Set event attribute as ONESHOT")) + goto cleanup; + + ret = sbi_sse_enable(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Enable event")) + goto cleanup; + state = SBI_SSE_STATE_ENABLED; + + ret = sbi_sse_inject(event_id, current_thread_info()->hartid); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "second injection")) + goto cleanup; + + report(READ_ONCE(test_arg.done) == 1, "event handled"); + WRITE_ONCE(test_arg.done, 0); + + if (!sse_check_state(event_id, SBI_SSE_STATE_REGISTERED)) + goto cleanup; + state = SBI_SSE_STATE_REGISTERED; + + /* Clear ONESHOT FLAG */ + value = 0; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_CONFIG, 1, &value); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Clear CONFIG.ONESHOT flag")) + goto cleanup; + + ret = sbi_sse_unregister(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "unregister")) + goto cleanup; + state = SBI_SSE_STATE_UNUSED; + + sse_check_state(event_id, SBI_SSE_STATE_UNUSED); + +cleanup: + switch (state) { + case SBI_SSE_STATE_ENABLED: + ret = sbi_sse_disable(event_id); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "disable event 0x%x", event_id); + break; + } + case SBI_SSE_STATE_REGISTERED: + sbi_sse_unregister(event_id); + if (ret.error) + sbiret_report_error(&ret, SBI_SUCCESS, "unregister event 0x%x", event_id); + default: + } + + sse_free_stack(args.stack); + report_prefix_pop(); +} + +struct sse_foreign_cpu_test_arg { + bool done; + unsigned int expected_cpu; + uint32_t event_id; +}; + +static void sse_foreign_cpu_handler(void *data, struct pt_regs *regs, unsigned int hartid) +{ + struct 
sse_foreign_cpu_test_arg *arg = data; + unsigned int expected_cpu; + + /* For arg content to be visible */ + smp_rmb(); + expected_cpu = READ_ONCE(arg->expected_cpu); + report(expected_cpu == current_thread_info()->cpu, + "Received event on CPU (%d), expected CPU (%d)", current_thread_info()->cpu, + expected_cpu); + + WRITE_ONCE(arg->done, true); + /* For arg update to be visible for other CPUs */ + smp_wmb(); +} + +struct sse_local_per_cpu { + struct sbi_sse_handler_arg args; + struct sbiret ret; + struct sse_foreign_cpu_test_arg handler_arg; + enum sbi_sse_state state; +}; + +static void sse_register_enable_local(void *data) +{ + struct sbiret ret; + struct sse_local_per_cpu *cpu_args = data; + struct sse_local_per_cpu *cpu_arg = &cpu_args[current_thread_info()->cpu]; + uint32_t event_id = cpu_arg->handler_arg.event_id; + + ret = sbi_sse_register(event_id, &cpu_arg->args); + WRITE_ONCE(cpu_arg->ret, ret); + if (ret.error) + return; + cpu_arg->state = SBI_SSE_STATE_REGISTERED; + + ret = sbi_sse_enable(event_id); + WRITE_ONCE(cpu_arg->ret, ret); + if (ret.error) + return; + cpu_arg->state = SBI_SSE_STATE_ENABLED; +} + +static void sbi_sse_disable_unregister_local(void *data) +{ + struct sbiret ret; + struct sse_local_per_cpu *cpu_args = data; + struct sse_local_per_cpu *cpu_arg = &cpu_args[current_thread_info()->cpu]; + uint32_t event_id = cpu_arg->handler_arg.event_id; + + switch (cpu_arg->state) { + case SBI_SSE_STATE_ENABLED: + ret = sbi_sse_disable(event_id); + WRITE_ONCE(cpu_arg->ret, ret); + if (ret.error) + return; + case SBI_SSE_STATE_REGISTERED: + ret = sbi_sse_unregister(event_id); + WRITE_ONCE(cpu_arg->ret, ret); + default: + } +} + +static uint64_t sse_event_get_complete_timeout(void) +{ + char *event_complete_timeout_str; + uint64_t timeout; + + event_complete_timeout_str = getenv("SSE_EVENT_COMPLETE_TIMEOUT"); + if (!event_complete_timeout_str) + timeout = 3000; + else + timeout = strtoul(event_complete_timeout_str, NULL, 0); + + return 
timer_get_cycles() + usec_to_cycles(timeout); +} + +static void sse_test_inject_local(uint32_t event_id) +{ + int cpu; + uint64_t timeout; + struct sbiret ret; + struct sse_local_per_cpu *cpu_args, *cpu_arg; + struct sse_foreign_cpu_test_arg *handler_arg; + + cpu_args = calloc(NR_CPUS, sizeof(struct sbi_sse_handler_arg)); + + report_prefix_push("local_dispatch"); + for_each_online_cpu(cpu) { + cpu_arg = &cpu_args[cpu]; + cpu_arg->handler_arg.event_id = event_id; + cpu_arg->args.stack = sse_alloc_stack(); + cpu_arg->args.handler = sse_foreign_cpu_handler; + cpu_arg->args.handler_data = (void *)&cpu_arg->handler_arg; + cpu_arg->state = SBI_SSE_STATE_UNUSED; + } + + on_cpus(sse_register_enable_local, cpu_args); + for_each_online_cpu(cpu) { + cpu_arg = &cpu_args[cpu]; + ret = cpu_arg->ret; + if (ret.error) { + report_fail("CPU failed to register/enable event: %ld", ret.error); + goto cleanup; + } + + handler_arg = &cpu_arg->handler_arg; + WRITE_ONCE(handler_arg->expected_cpu, cpu); + /* For handler_arg content to be visible for other CPUs */ + smp_wmb(); + ret = sbi_sse_inject(event_id, cpus[cpu].hartid); + if (ret.error) { + report_fail("CPU failed to inject event: %ld", ret.error); + goto cleanup; + } + } + + for_each_online_cpu(cpu) { + handler_arg = &cpu_args[cpu].handler_arg; + smp_rmb(); + + timeout = sse_event_get_complete_timeout(); + while (!READ_ONCE(handler_arg->done) && timer_get_cycles() < timeout) { + /* For handler_arg update to be visible */ + smp_rmb(); + cpu_relax(); + } + report(READ_ONCE(handler_arg->done), "Event handled"); + WRITE_ONCE(handler_arg->done, false); + } + +cleanup: + on_cpus(sbi_sse_disable_unregister_local, cpu_args); + for_each_online_cpu(cpu) { + cpu_arg = &cpu_args[cpu]; + ret = READ_ONCE(cpu_arg->ret); + if (ret.error) + report_fail("CPU failed to disable/unregister event: %ld", ret.error); + } + + for_each_online_cpu(cpu) { + cpu_arg = &cpu_args[cpu]; + sse_free_stack(cpu_arg->args.stack); + } + + report_prefix_pop(); +} + 
+static void sse_test_inject_global_cpu(uint32_t event_id, unsigned int cpu, + struct sse_foreign_cpu_test_arg *test_arg) +{ + unsigned long value; + struct sbiret ret; + uint64_t timeout; + enum sbi_sse_state state; + + WRITE_ONCE(test_arg->expected_cpu, cpu); + /* For test_arg content to be visible for other CPUs */ + smp_wmb(); + value = cpu; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PREFERRED_HART, 1, &value); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Set preferred hart")) + return; + + ret = sbi_sse_enable(event_id); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Enable event")) + return; + + ret = sbi_sse_inject(event_id, cpu); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Inject event")) + goto disable; + + smp_rmb(); + timeout = sse_event_get_complete_timeout(); + while (!READ_ONCE(test_arg->done) && timer_get_cycles() < timeout) { + /* For shared test_arg structure */ + smp_rmb(); + cpu_relax(); + } + + report(READ_ONCE(test_arg->done), "event handler called"); + WRITE_ONCE(test_arg->done, false); + + timeout = sse_event_get_complete_timeout(); + /* Wait for event to be back in ENABLED state */ + do { + ret = sse_event_get_state(event_id, &state); + if (ret.error) + goto disable; + cpu_relax(); + } while (state != SBI_SSE_STATE_ENABLED && timer_get_cycles() < timeout); + + report(state == SBI_SSE_STATE_ENABLED, "Event in enabled state"); + +disable: + ret = sbi_sse_disable(event_id); + sbiret_report_error(&ret, SBI_SUCCESS, "Disable event"); +} + +static void sse_test_inject_global(uint32_t event_id) +{ + struct sbiret ret; + unsigned int cpu; + struct sse_foreign_cpu_test_arg test_arg = {.event_id = event_id}; + struct sbi_sse_handler_arg args = { + .handler = sse_foreign_cpu_handler, + .handler_data = (void *)&test_arg, + .stack = sse_alloc_stack(), + }; + + report_prefix_push("global_dispatch"); + + ret = sbi_sse_register(event_id, &args); + if (!sbiret_report_error(&ret, SBI_SUCCESS, "Register event")) + goto err; + + 
for_each_online_cpu(cpu) + sse_test_inject_global_cpu(event_id, cpu, &test_arg); + + ret = sbi_sse_unregister(event_id); + sbiret_report_error(&ret, SBI_SUCCESS, "Unregister event"); + +err: + sse_free_stack(args.stack); + report_prefix_pop(); +} + +struct priority_test_arg { + uint32_t event_id; + bool called; + u32 prio; + enum sbi_sse_state state; /* Used for error handling */ + struct priority_test_arg *next_event_arg; + void (*check_func)(struct priority_test_arg *arg); +}; + +static void sse_hi_priority_test_handler(void *arg, struct pt_regs *regs, + unsigned int hartid) +{ + struct priority_test_arg *targ = arg; + struct priority_test_arg *next = targ->next_event_arg; + + targ->called = true; + if (next) { + sbi_sse_inject(next->event_id, current_thread_info()->hartid); + + report(!sse_event_pending(next->event_id), "Higher priority event is not pending"); + report(next->called, "Higher priority event was handled"); + } +} + +static void sse_low_priority_test_handler(void *arg, struct pt_regs *regs, + unsigned int hartid) +{ + struct priority_test_arg *targ = arg; + struct priority_test_arg *next = targ->next_event_arg; + + targ->called = true; + + if (next) { + sbi_sse_inject(next->event_id, current_thread_info()->hartid); + + report(sse_event_pending(next->event_id), "Lower priority event is pending"); + report(!next->called, "Lower priority event %s was not handled before %s", + sse_event_name(next->event_id), sse_event_name(targ->event_id)); + } +} + +static void sse_test_injection_priority_arg(struct priority_test_arg *in_args, + unsigned int in_args_size, + sbi_sse_handler_fn handler, + const char *test_name) +{ + unsigned int i; + unsigned long value, uret; + struct sbiret ret; + uint32_t event_id; + struct priority_test_arg *arg; + unsigned int args_size = 0; + struct sbi_sse_handler_arg event_args[in_args_size]; + struct priority_test_arg *args[in_args_size]; + void *stack; + struct sbi_sse_handler_arg *event_arg; + + report_prefix_push(test_name); 
+ + for (i = 0; i < in_args_size; i++) { + arg = &in_args[i]; + arg->state = SBI_SSE_STATE_UNUSED; + event_id = arg->event_id; + if (!sse_event_can_inject(event_id)) + continue; + + args[args_size] = arg; + args_size++; + event_args->stack = 0; + } + + if (!args_size) { + report_skip("No injectable events"); + goto skip; + } + + for (i = 0; i < args_size; i++) { + arg = args[i]; + event_id = arg->event_id; + stack = sse_alloc_stack(); + + event_arg = &event_args[i]; + event_arg->handler = handler; + event_arg->handler_data = (void *)arg; + event_arg->stack = stack; + + if (i < (args_size - 1)) + arg->next_event_arg = args[i + 1]; + else + arg->next_event_arg = NULL; + + /* Be sure global events are targeting the current hart */ + if (sbi_sse_event_is_global(event_id)) { + uret = sse_global_event_set_current_hart(event_id); + if (uret) + goto err; + } + + ret = sbi_sse_register(event_id, event_arg); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "register event %s", + sse_event_name(event_id)); + goto err; + } + arg->state = SBI_SSE_STATE_REGISTERED; + + value = arg->prio; + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "set event %s priority", + sse_event_name(event_id)); + goto err; + } + ret = sbi_sse_enable(event_id); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "enable event %s", + sse_event_name(event_id)); + goto err; + } + arg->state = SBI_SSE_STATE_ENABLED; + } + + /* Inject first event */ + ret = sbi_sse_inject(args[0]->event_id, current_thread_info()->hartid); + sbiret_report_error(&ret, SBI_SUCCESS, "injection"); + + /* Check that all handlers have been called */ + for (i = 0; i < args_size; i++) + report(arg->called, "Event %s handler called", sse_event_name(args[i]->event_id)); + +err: + for (i = 0; i < args_size; i++) { + arg = args[i]; + event_id = arg->event_id; + + switch (arg->state) { + case SBI_SSE_STATE_ENABLED: + ret = 
sbi_sse_disable(event_id); + if (ret.error) { + sbiret_report_error(&ret, SBI_SUCCESS, "disable event 0x%x", + event_id); + break; + } + case SBI_SSE_STATE_REGISTERED: + sbi_sse_unregister(event_id); + if (ret.error) + sbiret_report_error(&ret, SBI_SUCCESS, "unregister event 0x%x", + event_id); + default: + } + + event_arg = &event_args[i]; + if (event_arg->stack) + sse_free_stack(event_arg->stack); + } + +skip: + report_prefix_pop(); +} + +static struct priority_test_arg hi_prio_args[] = { + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE}, + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE}, + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW}, + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP}, + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS}, +}; + +static struct priority_test_arg low_prio_args[] = { + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_DOUBLE_TRAP}, + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW}, + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS}, + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE}, + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE}, +}; + +static struct priority_test_arg prio_args[] = { + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, .prio = 5}, + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, .prio = 10}, + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, .prio = 12}, + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, .prio = 15}, + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, .prio = 20}, + {.event_id = SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, .prio = 22}, + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, .prio = 25}, +}; + +static struct priority_test_arg same_prio_args[] = { + {.event_id = SBI_SSE_EVENT_LOCAL_PMU_OVERFLOW, .prio = 0}, + {.event_id = 
SBI_SSE_EVENT_GLOBAL_LOW_PRIO_RAS, .prio = 0}, + {.event_id = SBI_SSE_EVENT_LOCAL_HIGH_PRIO_RAS, .prio = 10}, + {.event_id = SBI_SSE_EVENT_LOCAL_SOFTWARE, .prio = 10}, + {.event_id = SBI_SSE_EVENT_GLOBAL_SOFTWARE, .prio = 10}, + {.event_id = SBI_SSE_EVENT_GLOBAL_HIGH_PRIO_RAS, .prio = 20}, + {.event_id = SBI_SSE_EVENT_LOCAL_LOW_PRIO_RAS, .prio = 20}, +}; + +static void sse_test_injection_priority(void) +{ + report_prefix_push("prio"); + + sse_test_injection_priority_arg(hi_prio_args, ARRAY_SIZE(hi_prio_args), + sse_hi_priority_test_handler, "high"); + + sse_test_injection_priority_arg(low_prio_args, ARRAY_SIZE(low_prio_args), + sse_low_priority_test_handler, "low"); + + sse_test_injection_priority_arg(prio_args, ARRAY_SIZE(prio_args), + sse_low_priority_test_handler, "changed"); + + sse_test_injection_priority_arg(same_prio_args, ARRAY_SIZE(same_prio_args), + sse_low_priority_test_handler, "same_prio_args"); + + report_prefix_pop(); +} + +static void test_invalid_event_id(unsigned long event_id) +{ + struct sbiret ret; + unsigned long value = 0; + + ret = sbi_sse_register_raw(event_id, (unsigned long) sbi_sse_entry, 0); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "register event_id 0x%lx", event_id); + + ret = sbi_sse_unregister(event_id); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "unregister event_id 0x%lx", event_id); + + ret = sbi_sse_enable(event_id); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "enable event_id 0x%lx", event_id); + + ret = sbi_sse_disable(event_id); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "disable event_id 0x%lx", event_id); + + ret = sbi_sse_inject(event_id, 0); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "inject event_id 0x%lx", event_id); + + ret = sbi_sse_write_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "write attr event_id 0x%lx", event_id); + + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_PRIORITY, 1, &value); + 
sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, + "read attr event_id 0x%lx", event_id); +} + +static void sse_test_invalid_event_id(void) +{ + + report_prefix_push("event_id"); + + test_invalid_event_id(SBI_SSE_EVENT_LOCAL_RESERVED_0_START); + + report_prefix_pop(); +} + +static void sse_check_event_availability(uint32_t event_id, bool *can_inject, bool *supported) +{ + unsigned long status; + struct sbiret ret; + + *can_inject = false; + *supported = false; + + ret = sbi_sse_read_attrs(event_id, SBI_SSE_ATTR_STATUS, 1, &status); + if (ret.error != SBI_SUCCESS && ret.error != SBI_ERR_NOT_SUPPORTED) { + report_fail("Get event status != SBI_SUCCESS && != SBI_ERR_NOT_SUPPORTED: %ld", + ret.error); + return; + } + if (ret.error == SBI_ERR_NOT_SUPPORTED) + return; + + *supported = true; + *can_inject = (status >> SBI_SSE_ATTR_STATUS_INJECT_OFFSET) & 1; +} + +static void sse_secondary_boot_and_unmask(void *data) +{ + sbi_sse_hart_unmask(); +} + +static void sse_check_mask(void) +{ + struct sbiret ret; + + /* Upon boot, event are masked, check that */ + ret = sbi_sse_hart_mask(); + sbiret_report_error(&ret, SBI_ERR_ALREADY_STOPPED, "hart mask at boot time"); + + ret = sbi_sse_hart_unmask(); + sbiret_report_error(&ret, SBI_SUCCESS, "hart unmask"); + ret = sbi_sse_hart_unmask(); + sbiret_report_error(&ret, SBI_ERR_ALREADY_STARTED, "hart unmask twice error"); + + ret = sbi_sse_hart_mask(); + sbiret_report_error(&ret, SBI_SUCCESS, "hart mask"); + ret = sbi_sse_hart_mask(); + sbiret_report_error(&ret, SBI_ERR_ALREADY_STOPPED, "hart mask twice"); +} + +static void run_inject_test(struct sse_event_info *info) +{ + unsigned long event_id = info->event_id; + + if (!info->can_inject) { + report_skip("Event does not support injection, skipping injection tests"); + return; + } + + sse_test_inject_simple(event_id); + + if (sbi_sse_event_is_global(event_id)) + sse_test_inject_global(event_id); + else + sse_test_inject_local(event_id); +} + +void check_sse(void) +{ + struct 
sse_event_info *info; + unsigned long i, event_id; + bool sbi_skip_inject = false; + bool supported; + + report_prefix_push("sse"); + + if (!sbi_probe(SBI_EXT_SSE)) { + report_skip("extension not available"); + report_prefix_pop(); + return; + } + + sse_check_mask(); + + /* + * Dummy wakeup of all processors since some of them will be targeted + * by global events without going through the wakeup call as well as + * unmasking SSE events on all harts + */ + on_cpus(sse_secondary_boot_and_unmask, NULL); + + /* Check for OpenSBI to support injection */ + if (sbi_get_imp_id() == SBI_IMPL_OPENSBI) { + if (sbi_get_imp_version() < sbi_impl_opensbi_mk_version(1, 6)) { + /* + * OpenSBI < v1.6 crashes kvm-unit-tests upon injection since injection + * arguments (a6/a7) were reversed. Skip injection tests. + */ + report_skip("OpenSBI < v1.6 detected, skipping injection tests"); + sbi_skip_inject = true; + } + } + + sse_test_invalid_event_id(); + + for (i = 0; i < ARRAY_SIZE(sse_event_infos); i++) { + info = &sse_event_infos[i]; + event_id = info->event_id; + report_prefix_push(info->name); + sse_check_event_availability(event_id, &info->can_inject, &supported); + if (!supported) { + report_skip("Event is not supported, skipping tests"); + report_prefix_pop(); + continue; + } + + sse_test_attrs(event_id); + sse_test_register_error(event_id); + + if (!sbi_skip_inject) + run_inject_test(info); + + report_prefix_pop(); + } + + if (!sbi_skip_inject) + sse_test_injection_priority(); + + report_prefix_pop(); +} diff --git a/riscv/sbi.c b/riscv/sbi.c index 0404bb81..478cb35d 100644 --- a/riscv/sbi.c +++ b/riscv/sbi.c @@ -32,6 +32,7 @@ #define HIGH_ADDR_BOUNDARY ((phys_addr_t)1 << 32) +void check_sse(void); void check_fwft(void); static long __labs(long a) @@ -1567,6 +1568,7 @@ int main(int argc, char **argv) check_hsm(); check_dbcn(); check_susp(); + check_sse(); check_fwft(); return report_summary(); -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 10:06:06 2025 From: cleger at 
rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 18:06:06 +0100 Subject: [PATCH v4 00/18] riscv: add SBI FWFT misaligned exception delegation support Message-ID: <20250317170625.1142870-1-cleger@rivosinc.com> The SBI Firmware Feature extension allows S-mode to request some specific features (either hardware or software) to be enabled. This series uses this extension to request misaligned access exception delegation to S-mode in order to let the kernel handle it. It also adds support for the KVM FWFT SBI extension based on the misaligned access handling infrastructure. The FWFT SBI extension is part of the SBI V3.0 specification [1]. It can be tested using the qemu provided at [2] which contains the series from [3]. kvm-unit-tests [4] can be used inside kvm to test the correct delegation of misaligned exceptions. Upstream OpenSBI can be used. Note: Since SBI V3.0 is not yet ratified, the FWFT extension API is split between interface only and implementation, which makes it possible to pick only the interface, which does not have hard dependencies on SBI. The tests can be run using the included kselftest: $ qemu-system-riscv64 \ -cpu rv64,trap-misaligned-access=true,v=true \ -M virt \ -m 1024M \ -bios fw_dynamic.bin \ -kernel Image ... # ./misaligned TAP version 13 1..23 # Starting 23 tests from 1 test cases. # RUN global.gp_load_lh ... # OK global.gp_load_lh ok 1 global.gp_load_lh # RUN global.gp_load_lhu ... # OK global.gp_load_lhu ok 2 global.gp_load_lhu # RUN global.gp_load_lw ... # OK global.gp_load_lw ok 3 global.gp_load_lw # RUN global.gp_load_lwu ... # OK global.gp_load_lwu ok 4 global.gp_load_lwu # RUN global.gp_load_ld ... # OK global.gp_load_ld ok 5 global.gp_load_ld # RUN global.gp_load_c_lw ... # OK global.gp_load_c_lw ok 6 global.gp_load_c_lw # RUN global.gp_load_c_ld ... # OK global.gp_load_c_ld ok 7 global.gp_load_c_ld # RUN global.gp_load_c_ldsp ... # OK global.gp_load_c_ldsp ok 8 global.gp_load_c_ldsp # RUN global.gp_load_sh ...
# OK global.gp_load_sh ok 9 global.gp_load_sh # RUN global.gp_load_sw ... # OK global.gp_load_sw ok 10 global.gp_load_sw # RUN global.gp_load_sd ... # OK global.gp_load_sd ok 11 global.gp_load_sd # RUN global.gp_load_c_sw ... # OK global.gp_load_c_sw ok 12 global.gp_load_c_sw # RUN global.gp_load_c_sd ... # OK global.gp_load_c_sd ok 13 global.gp_load_c_sd # RUN global.gp_load_c_sdsp ... # OK global.gp_load_c_sdsp ok 14 global.gp_load_c_sdsp # RUN global.fpu_load_flw ... # OK global.fpu_load_flw ok 15 global.fpu_load_flw # RUN global.fpu_load_fld ... # OK global.fpu_load_fld ok 16 global.fpu_load_fld # RUN global.fpu_load_c_fld ... # OK global.fpu_load_c_fld ok 17 global.fpu_load_c_fld # RUN global.fpu_load_c_fldsp ... # OK global.fpu_load_c_fldsp ok 18 global.fpu_load_c_fldsp # RUN global.fpu_store_fsw ... # OK global.fpu_store_fsw ok 19 global.fpu_store_fsw # RUN global.fpu_store_fsd ... # OK global.fpu_store_fsd ok 20 global.fpu_store_fsd # RUN global.fpu_store_c_fsd ... # OK global.fpu_store_c_fsd ok 21 global.fpu_store_c_fsd # RUN global.fpu_store_c_fsdsp ... # OK global.fpu_store_c_fsdsp ok 22 global.fpu_store_c_fsdsp # RUN global.gen_sigbus ... 
[12797.988647] misaligned[618]: unhandled signal 7 code 0x1 at 0x0000000000014dc0 in misaligned[4dc0,10000+76000] [12797.988990] CPU: 0 UID: 0 PID: 618 Comm: misaligned Not tainted 6.13.0-rc6-00008-g4ec4468967c9-dirty #51 [12797.989169] Hardware name: riscv-virtio,qemu (DT) [12797.989264] epc : 0000000000014dc0 ra : 0000000000014d00 sp : 00007fffe165d100 [12797.989407] gp : 000000000008f6e8 tp : 0000000000095760 t0 : 0000000000000008 [12797.989544] t1 : 00000000000965d8 t2 : 000000000008e830 s0 : 00007fffe165d160 [12797.989692] s1 : 000000000000001a a0 : 0000000000000000 a1 : 0000000000000002 [12797.989831] a2 : 0000000000000000 a3 : 0000000000000000 a4 : ffffffffdeadbeef [12797.989964] a5 : 000000000008ef61 a6 : 626769735f6e0000 a7 : fffffffffffff000 [12797.990094] s2 : 0000000000000001 s3 : 00007fffe165d838 s4 : 00007fffe165d848 [12797.990238] s5 : 000000000000001a s6 : 0000000000010442 s7 : 0000000000010200 [12797.990391] s8 : 000000000000003a s9 : 0000000000094508 s10: 0000000000000000 [12797.990526] s11: 0000555567460668 t3 : 00007fffe165d070 t4 : 00000000000965d0 [12797.990656] t5 : fefefefefefefeff t6 : 0000000000000073 [12797.990756] status: 0000000200004020 badaddr: 000000000008ef61 cause: 0000000000000006 [12797.990911] Code: 8793 8791 3423 fcf4 3783 fc84 c737 dead 0713 eef7 (c398) 0001 # OK global.gen_sigbus ok 23 global.gen_sigbus # PASSED: 23 / 23 tests passed. # Totals: pass:23 fail:0 xfail:0 xpass:0 skip:0 error:0 With kvm-tools: # lkvm run -k sbi.flat -m 128 Info: # lkvm run -k sbi.flat -m 128 -c 1 --name guest-97 Info: Removed ghost socket file "/root/.lkvm//guest-97.sock". ########################################################################## # kvm-unit-tests ########################################################################## ... 
[test messages elided] PASS: sbi: fwft: FWFT extension probing no error PASS: sbi: fwft: get/set reserved feature 0x6 error == SBI_ERR_DENIED PASS: sbi: fwft: get/set reserved feature 0x3fffffff error == SBI_ERR_DENIED PASS: sbi: fwft: get/set reserved feature 0x80000000 error == SBI_ERR_DENIED PASS: sbi: fwft: get/set reserved feature 0xbfffffff error == SBI_ERR_DENIED PASS: sbi: fwft: misaligned_deleg: Get misaligned deleg feature no error PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature invalid value error PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature invalid value error PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature value no error PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature value 0 PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature value no error PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature value 1 PASS: sbi: fwft: misaligned_deleg: Verify misaligned load exception trap in supervisor SUMMARY: 50 tests, 2 unexpected failures, 12 skipped This series is available at [6]. Link: https://github.com/riscv-non-isa/riscv-sbi-doc/releases/download/vv3.0-rc2/riscv-sbi.pdf [1] Link: https://github.com/rivosinc/qemu/tree/dev/cleger/misaligned [2] Link: https://lore.kernel.org/all/20241211211933.198792-3-fkonrad at amd.com/T/ [3] Link: https://github.com/clementleger/kvm-unit-tests/tree/dev/cleger/fwft_v1 [4] Link: https://github.com/clementleger/unaligned_test [5] Link: https://github.com/rivosinc/linux/tree/dev/cleger/fwft_v1 [6] --- V4: - Check SBI version 3.0 instead of 2.0 for FWFT presence - Use long for kvm_sbi_fwft operation return value - Init KVM sbi extension even if default_disabled - Remove revert_on_fail parameter for sbi_fwft_feature_set(). 
- Fix comments for sbi_fwft_set/get() - Only handle local features (there are no globals yet in the spec) - Add new SBI errors to sbi_err_map_linux_errno() V3: - Added comment about kvm sbi fwft supported/set/get callback requirements - Move struct kvm_sbi_fwft_feature in kvm_sbi_fwft.c - Add a FWFT interface V2: - Added Kselftest for misaligned testing - Added get_user() usage instead of __get_user() - Reenable interrupt when possible in misaligned access handling - Document that riscv supports unaligned-traps - Fix KVM extension state when an init function is present - Rework SBI misaligned accesses trap delegation code - Added support for CPU hotplugging - Added KVM SBI reset callback - Added reset for KVM SBI FWFT lock - Return SBI_ERR_DENIED_LOCKED when LOCK flag is set Clément Léger (18): riscv: add Firmware Feature (FWFT) SBI extensions definitions riscv: sbi: add new SBI error mappings riscv: sbi: add FWFT extension interface riscv: sbi: add SBI FWFT extension calls riscv: misaligned: request misaligned exception from SBI riscv: misaligned: use on_each_cpu() for scalar misaligned access probing riscv: misaligned: use correct CONFIG_ ifdef for misaligned_access_speed riscv: misaligned: move emulated access uniformity check in a function riscv: misaligned: add a function to check misalign trap delegability riscv: misaligned: factorize trap handling riscv: misaligned: enable IRQs while handling misaligned accesses riscv: misaligned: use get_user() instead of __get_user() Documentation/sysctl: add riscv to unaligned-trap supported archs selftests: riscv: add misaligned access testing RISC-V: KVM: add SBI extension init()/deinit() functions RISC-V: KVM: add SBI extension reset callback RISC-V: KVM: add support for FWFT SBI extension RISC-V: KVM: add support for SBI_FWFT_MISALIGNED_DELEG Documentation/admin-guide/sysctl/kernel.rst | 4 +- arch/riscv/include/asm/cpufeature.h | 8 +- arch/riscv/include/asm/kvm_host.h | 5 +- arch/riscv/include/asm/kvm_vcpu_sbi.h | 12
+ arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h | 29 ++ arch/riscv/include/asm/sbi.h | 62 +++++ arch/riscv/include/uapi/asm/kvm.h | 1 + arch/riscv/kernel/sbi.c | 95 +++++++ arch/riscv/kernel/traps.c | 57 ++-- arch/riscv/kernel/traps_misaligned.c | 118 +++++++- arch/riscv/kernel/unaligned_access_speed.c | 11 +- arch/riscv/kvm/Makefile | 1 + arch/riscv/kvm/vcpu.c | 7 +- arch/riscv/kvm/vcpu_sbi.c | 54 ++++ arch/riscv/kvm/vcpu_sbi_fwft.c | 252 +++++++++++++++++ arch/riscv/kvm/vcpu_sbi_sta.c | 3 +- .../selftests/riscv/misaligned/.gitignore | 1 + .../selftests/riscv/misaligned/Makefile | 12 + .../selftests/riscv/misaligned/common.S | 33 +++ .../testing/selftests/riscv/misaligned/fpu.S | 180 +++++++++++++ tools/testing/selftests/riscv/misaligned/gp.S | 103 +++++++ .../selftests/riscv/misaligned/misaligned.c | 254 ++++++++++++++++++ 22 files changed, 1255 insertions(+), 47 deletions(-) create mode 100644 arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h create mode 100644 arch/riscv/kvm/vcpu_sbi_fwft.c create mode 100644 tools/testing/selftests/riscv/misaligned/.gitignore create mode 100644 tools/testing/selftests/riscv/misaligned/Makefile create mode 100644 tools/testing/selftests/riscv/misaligned/common.S create mode 100644 tools/testing/selftests/riscv/misaligned/fpu.S create mode 100644 tools/testing/selftests/riscv/misaligned/gp.S create mode 100644 tools/testing/selftests/riscv/misaligned/misaligned.c -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 10:06:07 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 18:06:07 +0100 Subject: [PATCH v4 01/18] riscv: add Firmware Feature (FWFT) SBI extensions definitions In-Reply-To: <20250317170625.1142870-1-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> Message-ID: <20250317170625.1142870-2-cleger@rivosinc.com> The Firmware Features extension (FWFT) was added as part of the SBI 3.0 specification. Add SBI definitions to use this extension. 
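For readers skimming the definitions in this patch, the feature ID space can be pictured with a small standalone sketch. This is only an illustration under stated assumptions: the bit positions (30 for platform, 31 for global) and the first reserved local ID (0x6) are copied from the enum and defines the patch adds, while the FWFT_* names and the fwft_feature_is_*() helpers are hypothetical and are not part of the kernel or of this series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions mirroring SBI_FWFT_PLATFORM_FEATURE_BIT / SBI_FWFT_GLOBAL_FEATURE_BIT. */
#define FWFT_PLATFORM_FEATURE_BIT (UINT32_C(1) << 30)
#define FWFT_GLOBAL_FEATURE_BIT   (UINT32_C(1) << 31)

/* First local ID listed as reserved in the patch (SBI_FWFT_LOCAL_RESERVED_START). */
#define FWFT_LOCAL_RESERVED_START UINT32_C(0x6)

/* Hypothetical helper: true when the ID lies in one of the global ranges. */
static inline bool fwft_feature_is_global(uint32_t feature)
{
	return (feature & FWFT_GLOBAL_FEATURE_BIT) != 0;
}

/* Hypothetical helper: true when the ID lies in a platform-specific range. */
static inline bool fwft_feature_is_platform(uint32_t feature)
{
	return (feature & FWFT_PLATFORM_FEATURE_BIT) != 0;
}

/*
 * Hypothetical helper: true when the ID falls in a range the enum marks as
 * reserved, i.e. 0x6..0x3fffffff (local) or 0x80000000..0xbfffffff (global).
 */
static inline bool fwft_feature_is_reserved(uint32_t feature)
{
	if (fwft_feature_is_platform(feature))
		return false;
	if (fwft_feature_is_global(feature))
		return true; /* the enum lists no defined non-platform global feature */
	return feature >= FWFT_LOCAL_RESERVED_START;
}
```

With these helpers, the boundary IDs exercised in the kvm-unit-tests log of the cover letter (0x6, 0x3fffffff, 0x80000000, 0xbfffffff) all classify as reserved, while 0x0 (misaligned exception delegation) does not.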
Signed-off-by: Clément Léger Reviewed-by: Samuel Holland Tested-by: Samuel Holland Reviewed-by: Deepak Gupta Reviewed-by: Andrew Jones --- arch/riscv/include/asm/sbi.h | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h index 3d250824178b..bb077d0c912f 100644 --- a/arch/riscv/include/asm/sbi.h +++ b/arch/riscv/include/asm/sbi.h @@ -35,6 +35,7 @@ enum sbi_ext_id { SBI_EXT_DBCN = 0x4442434E, SBI_EXT_STA = 0x535441, SBI_EXT_NACL = 0x4E41434C, + SBI_EXT_FWFT = 0x46574654, /* Experimentals extensions must lie within this range */ SBI_EXT_EXPERIMENTAL_START = 0x08000000, @@ -402,6 +403,33 @@ enum sbi_ext_nacl_feature { #define SBI_NACL_SHMEM_SRET_X(__i) ((__riscv_xlen / 8) * (__i)) #define SBI_NACL_SHMEM_SRET_X_LAST 31 +/* SBI function IDs for FW feature extension */ +#define SBI_EXT_FWFT_SET 0x0 +#define SBI_EXT_FWFT_GET 0x1 + +enum sbi_fwft_feature_t { + SBI_FWFT_MISALIGNED_EXC_DELEG = 0x0, + SBI_FWFT_LANDING_PAD = 0x1, + SBI_FWFT_SHADOW_STACK = 0x2, + SBI_FWFT_DOUBLE_TRAP = 0x3, + SBI_FWFT_PTE_AD_HW_UPDATING = 0x4, + SBI_FWFT_POINTER_MASKING_PMLEN = 0x5, + SBI_FWFT_LOCAL_RESERVED_START = 0x6, + SBI_FWFT_LOCAL_RESERVED_END = 0x3fffffff, + SBI_FWFT_LOCAL_PLATFORM_START = 0x40000000, + SBI_FWFT_LOCAL_PLATFORM_END = 0x7fffffff, + + SBI_FWFT_GLOBAL_RESERVED_START = 0x80000000, + SBI_FWFT_GLOBAL_RESERVED_END = 0xbfffffff, + SBI_FWFT_GLOBAL_PLATFORM_START = 0xc0000000, + SBI_FWFT_GLOBAL_PLATFORM_END = 0xffffffff, +}; + +#define SBI_FWFT_PLATFORM_FEATURE_BIT BIT(30) +#define SBI_FWFT_GLOBAL_FEATURE_BIT BIT(31) + +#define SBI_FWFT_SET_FLAG_LOCK BIT(0) + /* SBI spec version fields */ #define SBI_SPEC_VERSION_DEFAULT 0x1 #define SBI_SPEC_VERSION_MAJOR_SHIFT 24 @@ -419,6 +447,11 @@ enum sbi_ext_nacl_feature { #define SBI_ERR_ALREADY_STARTED -7 #define SBI_ERR_ALREADY_STOPPED -8 #define SBI_ERR_NO_SHMEM -9 +#define SBI_ERR_INVALID_STATE -10 +#define SBI_ERR_BAD_RANGE -11 +#define
SBI_ERR_TIMEOUT -12 +#define SBI_ERR_IO -13 +#define SBI_ERR_DENIED_LOCKED -14 extern unsigned long sbi_spec_version; struct sbiret { -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 10:06:08 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 18:06:08 +0100 Subject: [PATCH v4 02/18] riscv: sbi: add new SBI error mappings In-Reply-To: <20250317170625.1142870-1-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> Message-ID: <20250317170625.1142870-3-cleger@rivosinc.com> A few new errors have been added with SBI V3.0; map them as closely as possible to errno values. Signed-off-by: Clément Léger --- arch/riscv/include/asm/sbi.h | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h index bb077d0c912f..d11d22717b49 100644 --- a/arch/riscv/include/asm/sbi.h +++ b/arch/riscv/include/asm/sbi.h @@ -536,11 +536,20 @@ static inline int sbi_err_map_linux_errno(int err) case SBI_SUCCESS: return 0; case SBI_ERR_DENIED: + case SBI_ERR_DENIED_LOCKED: return -EPERM; case SBI_ERR_INVALID_PARAM: + case SBI_ERR_INVALID_STATE: + case SBI_ERR_BAD_RANGE: return -EINVAL; case SBI_ERR_INVALID_ADDRESS: return -EFAULT; + case SBI_ERR_NO_SHMEM: + return -ENOMEM; + case SBI_ERR_TIMEOUT: + return -ETIME; + case SBI_ERR_IO: + return -EIO; case SBI_ERR_NOT_SUPPORTED: case SBI_ERR_FAILURE: default: -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 10:06:09 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 18:06:09 +0100 Subject: [PATCH v4 03/18] riscv: sbi: add FWFT extension interface In-Reply-To: <20250317170625.1142870-1-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> Message-ID: <20250317170625.1142870-4-cleger@rivosinc.com> This SBI extension enables supervisor mode to control features that are under M-mode control (for instance, Svadu menvcfg ADUE bit, Ssdbltrp DTE, etc.).
Add an interface to set local features for a specific cpu mask as well as for the online cpu mask. Signed-off-by: Clément Léger --- arch/riscv/include/asm/sbi.h | 20 +++++++++++ arch/riscv/kernel/sbi.c | 69 ++++++++++++++++++++++++++++++++++++ 2 files changed, 89 insertions(+) diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h index d11d22717b49..1cecfa82c2e5 100644 --- a/arch/riscv/include/asm/sbi.h +++ b/arch/riscv/include/asm/sbi.h @@ -503,6 +503,26 @@ int sbi_remote_hfence_vvma_asid(const struct cpumask *cpu_mask, unsigned long asid); long sbi_probe_extension(int ext); +int sbi_fwft_local_set_cpumask(const cpumask_t *mask, u32 feature, + unsigned long value, unsigned long flags); +/** + * sbi_fwft_local_set() - Set a feature on all online cpus + * @feature: The feature to be set + * @value: The feature value to be set + * @flags: FWFT feature set flags + * + * Return: 0 on success, appropriate linux error code otherwise. + */ + static inline int sbi_fwft_local_set(u32 feature, unsigned long value, + unsigned long flags) + { + return sbi_fwft_local_set_cpumask(cpu_online_mask, feature, value, + flags); + } + +int sbi_fwft_get(u32 feature, unsigned long *value); +int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags); + /* Check if current SBI specification version is 0.1 or not */ static inline int sbi_spec_is_0_1(void) { diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c index 1989b8cade1b..d41a5642be24 100644 --- a/arch/riscv/kernel/sbi.c +++ b/arch/riscv/kernel/sbi.c @@ -299,6 +299,75 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask, return 0; } +/** + * sbi_fwft_get() - Get a feature for the local hart + * @feature: The feature ID to be set + * @value: Will contain the feature value on success + * + * Return: 0 on success, appropriate linux error code otherwise.
+ */ +int sbi_fwft_get(u32 feature, unsigned long *value) +{ + return -EOPNOTSUPP; +} + +/** + * sbi_fwft_set() - Set a feature on the local hart + * @feature: The feature ID to be set + * @value: The feature value to be set + * @flags: FWFT feature set flags + * + * Return: 0 on success, appropriate linux error code otherwise. + */ +int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags) +{ + return -EOPNOTSUPP; +} + +struct fwft_set_req { + u32 feature; + unsigned long value; + unsigned long flags; + atomic_t error; +}; + +static void cpu_sbi_fwft_set(void *arg) +{ + struct fwft_set_req *req = arg; + int ret; + + ret = sbi_fwft_set(req->feature, req->value, req->flags); + if (ret) + atomic_set(&req->error, ret); +} + +/** + * sbi_fwft_local_set() - Set a feature for the specified cpumask + * @mask: CPU mask of cpus that need the feature to be set + * @feature: The feature ID to be set + * @value: The feature value to be set + * @flags: FWFT feature set flags + * + * Return: 0 on success, appropriate linux error code otherwise. + */ +int sbi_fwft_local_set_cpumask(const cpumask_t *mask, u32 feature, + unsigned long value, unsigned long flags) +{ + struct fwft_set_req req = { + .feature = feature, + .value = value, + .flags = flags, + .error = ATOMIC_INIT(0), + }; + + if (feature & SBI_FWFT_GLOBAL_FEATURE_BIT) + return -EINVAL; + + on_each_cpu_mask(mask, cpu_sbi_fwft_set, &req, 1); + + return atomic_read(&req.error); +} + /** * sbi_set_timer() - Program the timer for next timer event. * @stime_value: The value after which next timer event should fire. 
-- 2.47.2 From cleger at rivosinc.com Mon Mar 17 10:06:10 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 18:06:10 +0100 Subject: [PATCH v4 04/18] riscv: sbi: add SBI FWFT extension calls In-Reply-To: <20250317170625.1142870-1-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> Message-ID: <20250317170625.1142870-5-cleger@rivosinc.com> Add FWFT extension calls. This will be ratified in SBI V3.0, hence it is provided as a separate commit that can be left out if needed. Signed-off-by: Clément Léger --- arch/riscv/kernel/sbi.c | 30 ++++++++++++++++++++++++++++-- 1 file changed, 28 insertions(+), 2 deletions(-) diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c index d41a5642be24..54d9ceb7b723 100644 --- a/arch/riscv/kernel/sbi.c +++ b/arch/riscv/kernel/sbi.c @@ -299,6 +299,8 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask, return 0; } +static bool sbi_fwft_supported; + /** * sbi_fwft_get() - Get a feature for the local hart * @feature: The feature ID to be set @@ -308,7 +310,15 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask, */ int sbi_fwft_get(u32 feature, unsigned long *value) { - return -EOPNOTSUPP; + struct sbiret ret; + + if (!sbi_fwft_supported) + return -EOPNOTSUPP; + + ret = sbi_ecall(SBI_EXT_FWFT, SBI_EXT_FWFT_GET, + feature, 0, 0, 0, 0, 0); + + return sbi_err_map_linux_errno(ret.error); } /** @@ -321,7 +331,15 @@ int sbi_fwft_get(u32 feature, unsigned long *value) */ int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags) { - return -EOPNOTSUPP; + struct sbiret ret; + + if (!sbi_fwft_supported) + return -EOPNOTSUPP; + + ret = sbi_ecall(SBI_EXT_FWFT, SBI_EXT_FWFT_SET, + feature, value, flags, 0, 0, 0); + + return sbi_err_map_linux_errno(ret.error); } struct fwft_set_req { @@ -360,6 +378,9 @@ int sbi_fwft_local_set_cpumask(const cpumask_t *mask, u32 feature, .error = ATOMIC_INIT(0), }; + if
(!sbi_fwft_supported) + return -EOPNOTSUPP; + if (feature & SBI_FWFT_GLOBAL_FEATURE_BIT) return -EINVAL; @@ -691,6 +712,11 @@ void __init sbi_init(void) pr_info("SBI DBCN extension detected\n"); sbi_debug_console_available = true; } + if ((sbi_spec_version >= sbi_mk_version(3, 0)) && + (sbi_probe_extension(SBI_EXT_FWFT) > 0)) { + pr_info("SBI FWFT extension detected\n"); + sbi_fwft_supported = true; + } } else { __sbi_set_timer = __sbi_set_timer_v01; __sbi_send_ipi = __sbi_send_ipi_v01; -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 10:06:11 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 18:06:11 +0100 Subject: [PATCH v4 05/18] riscv: misaligned: request misaligned exception from SBI In-Reply-To: <20250317170625.1142870-1-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> Message-ID: <20250317170625.1142870-6-cleger@rivosinc.com> Now that the kernel can handle misaligned accesses in S-mode, request misaligned access exception delegation from SBI. This uses the FWFT SBI extension defined in SBI version 3.0. 
Signed-off-by: Clément Léger Reviewed-by: Andrew Jones --- arch/riscv/include/asm/cpufeature.h | 3 +- arch/riscv/kernel/traps_misaligned.c | 77 +++++++++++++++++++++- arch/riscv/kernel/unaligned_access_speed.c | 11 +++- 3 files changed, 86 insertions(+), 5 deletions(-) diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h index 569140d6e639..ad7d26788e6a 100644 --- a/arch/riscv/include/asm/cpufeature.h +++ b/arch/riscv/include/asm/cpufeature.h @@ -64,8 +64,9 @@ void __init riscv_user_isa_enable(void); _RISCV_ISA_EXT_DATA(_name, _id, _sub_exts, ARRAY_SIZE(_sub_exts), _validate) bool check_unaligned_access_emulated_all_cpus(void); +void unaligned_access_init(void); +int cpu_online_unaligned_access_init(unsigned int cpu); #if defined(CONFIG_RISCV_SCALAR_MISALIGNED) -void check_unaligned_access_emulated(struct work_struct *work __always_unused); void unaligned_emulation_finish(void); bool unaligned_ctl_available(void); DECLARE_PER_CPU(long, misaligned_access_speed); diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c index 7cc108aed74e..fa7f100b95bd 100644 --- a/arch/riscv/kernel/traps_misaligned.c +++ b/arch/riscv/kernel/traps_misaligned.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #define INSN_MATCH_LB 0x3 @@ -635,7 +636,7 @@ bool check_vector_unaligned_access_emulated_all_cpus(void) static bool unaligned_ctl __read_mostly; -void check_unaligned_access_emulated(struct work_struct *work __always_unused) +static void check_unaligned_access_emulated(struct work_struct *work __always_unused) { int cpu = smp_processor_id(); long *mas_ptr = per_cpu_ptr(&misaligned_access_speed, cpu); @@ -646,6 +647,13 @@ void check_unaligned_access_emulated(struct work_struct *work __always_unused) __asm__ __volatile__ ( " "REG_L" %[tmp], 1(%[ptr])\n" : [tmp] "=r" (tmp_val) : [ptr] "r" (&tmp_var) : "memory"); +} + +static int cpu_online_check_unaligned_access_emulated(unsigned int cpu) +{ + long
*mas_ptr = per_cpu_ptr(&misaligned_access_speed, cpu); + + check_unaligned_access_emulated(NULL); /* * If unaligned_ctl is already set, this means that we detected that all @@ -654,9 +662,10 @@ void check_unaligned_access_emulated(struct work_struct *work __always_unused) */ if (unlikely(unaligned_ctl && (*mas_ptr != RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED))) { pr_crit("CPU misaligned accesses non homogeneous (expected all emulated)\n"); - while (true) - cpu_relax(); + return -EINVAL; } + + return 0; } bool check_unaligned_access_emulated_all_cpus(void) @@ -688,4 +697,66 @@ bool check_unaligned_access_emulated_all_cpus(void) { return false; } +static int cpu_online_check_unaligned_access_emulated(unsigned int cpu) +{ + return 0; +} #endif + +#ifdef CONFIG_RISCV_SBI + +static bool misaligned_traps_delegated; + +static int cpu_online_sbi_unaligned_setup(unsigned int cpu) +{ + if (sbi_fwft_set(SBI_FWFT_MISALIGNED_EXC_DELEG, 1, 0) && + misaligned_traps_delegated) { + pr_crit("Misaligned trap delegation non homogeneous (expected delegated)"); + return -EINVAL; + } + + return 0; +} + +static void unaligned_sbi_request_delegation(void) +{ + int ret; + + ret = sbi_fwft_local_set(SBI_FWFT_MISALIGNED_EXC_DELEG, 1, 0); + if (ret) + return; + + misaligned_traps_delegated = true; + pr_info("SBI misaligned access exception delegation ok\n"); + /* + * Note that we don't have to take any specific action here, if + * the delegation is successful, then + * check_unaligned_access_emulated() will verify that indeed the + * platform traps on misaligned accesses. 
+ */ +} + +void unaligned_access_init(void) +{ + if (sbi_probe_extension(SBI_EXT_FWFT) > 0) + unaligned_sbi_request_delegation(); +} +#else +void unaligned_access_init(void) {} + +static int cpu_online_sbi_unaligned_setup(unsigned int cpu __always_unused) +{ + return 0; +} +#endif + +int cpu_online_unaligned_access_init(unsigned int cpu) +{ + int ret; + + ret = cpu_online_sbi_unaligned_setup(cpu); + if (ret) + return ret; + + return cpu_online_check_unaligned_access_emulated(cpu); +} diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c index 91f189cf1611..2f3aba073297 100644 --- a/arch/riscv/kernel/unaligned_access_speed.c +++ b/arch/riscv/kernel/unaligned_access_speed.c @@ -188,13 +188,20 @@ arch_initcall_sync(lock_and_set_unaligned_access_static_branch); static int riscv_online_cpu(unsigned int cpu) { + int ret; static struct page *buf; /* We are already set since the last check */ if (per_cpu(misaligned_access_speed, cpu) != RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN) goto exit; - check_unaligned_access_emulated(NULL); + ret = cpu_online_unaligned_access_init(cpu); + if (ret) + return ret; + + if (per_cpu(misaligned_access_speed, cpu) == RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED) + goto exit; + buf = alloc_pages(GFP_KERNEL, MISALIGNED_BUFFER_ORDER); if (!buf) { pr_warn("Allocation failure, not measuring misaligned performance\n"); @@ -403,6 +410,8 @@ static int check_unaligned_access_all_cpus(void) { bool all_cpus_emulated, all_cpus_vec_unsupported; + unaligned_access_init(); + all_cpus_emulated = check_unaligned_access_emulated_all_cpus(); all_cpus_vec_unsupported = check_vector_unaligned_access_emulated_all_cpus(); -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 10:06:12 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 18:06:12 +0100 Subject: [PATCH v4 06/18] riscv: misaligned: use on_each_cpu() for scalar misaligned access probing In-Reply-To: 
<20250317170625.1142870-1-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> Message-ID: <20250317170625.1142870-7-cleger@rivosinc.com> schedule_on_each_cpu() was used without any good reason despite being documented as very slow. This call was in the boot path, so it is better to use on_each_cpu() for scalar misaligned checking. The vector misaligned check still needs to use schedule_on_each_cpu() since it requires irqs to be enabled, but that's less of a problem since this code is run in a kthread. Add a comment to make that explicit. Signed-off-by: Clément Léger Reviewed-by: Andrew Jones --- arch/riscv/kernel/traps_misaligned.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c index fa7f100b95bd..4584f2e1d39d 100644 --- a/arch/riscv/kernel/traps_misaligned.c +++ b/arch/riscv/kernel/traps_misaligned.c @@ -616,6 +616,10 @@ bool check_vector_unaligned_access_emulated_all_cpus(void) return false; } + /* + * While being documented as very slow, schedule_on_each_cpu() is used since + * kernel_vector_begin() expects irqs to be enabled or it will panic() + */ schedule_on_each_cpu(check_vector_unaligned_access_emulated); for_each_online_cpu(cpu) @@ -636,7 +640,7 @@ bool check_vector_unaligned_access_emulated_all_cpus(void) static bool unaligned_ctl __read_mostly; -static void check_unaligned_access_emulated(struct work_struct *work __always_unused) +static void check_unaligned_access_emulated(void *arg __always_unused) { int cpu = smp_processor_id(); long *mas_ptr = per_cpu_ptr(&misaligned_access_speed, cpu); @@ -677,7 +681,7 @@ bool check_unaligned_access_emulated_all_cpus(void) * accesses emulated since tasks requesting such control can run on any * CPU.
*/ - schedule_on_each_cpu(check_unaligned_access_emulated); + on_each_cpu(check_unaligned_access_emulated, NULL, 1); for_each_online_cpu(cpu) if (per_cpu(misaligned_access_speed, cpu) -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 10:06:13 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 18:06:13 +0100 Subject: [PATCH v4 07/18] riscv: misaligned: use correct CONFIG_ ifdef for misaligned_access_speed In-Reply-To: <20250317170625.1142870-1-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> Message-ID: <20250317170625.1142870-8-cleger@rivosinc.com> misaligned_access_speed is defined under CONFIG_RISCV_SCALAR_MISALIGNED but was used under CONFIG_RISCV_PROBE_UNALIGNED_ACCESS. Fix that by using the correct config option. Signed-off-by: Clément Léger Reviewed-by: Andrew Jones --- arch/riscv/kernel/traps_misaligned.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c index 4584f2e1d39d..8175b3449b73 100644 --- a/arch/riscv/kernel/traps_misaligned.c +++ b/arch/riscv/kernel/traps_misaligned.c @@ -362,7 +362,7 @@ static int handle_scalar_misaligned_load(struct pt_regs *regs) perf_sw_event(PERF_COUNT_SW_ALIGNMENT_FAULTS, 1, regs, addr); -#ifdef CONFIG_RISCV_PROBE_UNALIGNED_ACCESS +#ifdef CONFIG_RISCV_SCALAR_MISALIGNED *this_cpu_ptr(&misaligned_access_speed) = RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED; #endif -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 10:06:14 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 18:06:14 +0100 Subject: [PATCH v4 08/18] riscv: misaligned: move emulated access uniformity check in a function In-Reply-To: <20250317170625.1142870-1-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> Message-ID: <20250317170625.1142870-9-cleger@rivosinc.com> Split the code that checks for the uniformity of misaligned access
performance on all cpus from check_unaligned_access_emulated_all_cpus() to its own function, which will be used for the delegation check. No functional changes intended. Signed-off-by: Clément Léger Reviewed-by: Andrew Jones --- arch/riscv/kernel/traps_misaligned.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c index 8175b3449b73..3c77fc78fe4f 100644 --- a/arch/riscv/kernel/traps_misaligned.c +++ b/arch/riscv/kernel/traps_misaligned.c @@ -672,10 +672,20 @@ static int cpu_online_check_unaligned_access_emulated(unsigned int cpu) return 0; } -bool check_unaligned_access_emulated_all_cpus(void) +static bool all_cpus_unaligned_scalar_access_emulated(void) { int cpu; + for_each_online_cpu(cpu) + if (per_cpu(misaligned_access_speed, cpu) != + RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED) + return false; + + return true; +} + +bool check_unaligned_access_emulated_all_cpus(void) +{ /* * We can only support PR_UNALIGN controls if all CPUs have misaligned * accesses emulated since tasks requesting such control can run on any @@ -683,10 +693,8 @@ bool check_unaligned_access_emulated_all_cpus(void) */ on_each_cpu(check_unaligned_access_emulated, NULL, 1); - for_each_online_cpu(cpu) - if (per_cpu(misaligned_access_speed, cpu) - != RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED) - return false; + if (!all_cpus_unaligned_scalar_access_emulated()) + return false; unaligned_ctl = true; return true; -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 10:06:15 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 18:06:15 +0100 Subject: [PATCH v4 09/18] riscv: misaligned: add a function to check misalign trap delegability In-Reply-To: <20250317170625.1142870-1-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> Message-ID: <20250317170625.1142870-10-cleger@rivosinc.com> Checking for the delegability of the misaligned
access trap is needed for the KVM FWFT extension implementation. Add a function to get the delegability of the misaligned trap exception. Signed-off-by: Clément Léger Reviewed-by: Andrew Jones --- arch/riscv/include/asm/cpufeature.h | 5 +++++ arch/riscv/kernel/traps_misaligned.c | 17 +++++++++++++++-- 2 files changed, 20 insertions(+), 2 deletions(-) diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h index ad7d26788e6a..8b97cba99fc3 100644 --- a/arch/riscv/include/asm/cpufeature.h +++ b/arch/riscv/include/asm/cpufeature.h @@ -69,12 +69,17 @@ int cpu_online_unaligned_access_init(unsigned int cpu); #if defined(CONFIG_RISCV_SCALAR_MISALIGNED) void unaligned_emulation_finish(void); bool unaligned_ctl_available(void); +bool misaligned_traps_can_delegate(void); DECLARE_PER_CPU(long, misaligned_access_speed); #else static inline bool unaligned_ctl_available(void) { return false; } +static inline bool misaligned_traps_can_delegate(void) +{ + return false; +} #endif bool check_vector_unaligned_access_emulated_all_cpus(void); diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c index 3c77fc78fe4f..0fb663ac200f 100644 --- a/arch/riscv/kernel/traps_misaligned.c +++ b/arch/riscv/kernel/traps_misaligned.c @@ -715,10 +715,10 @@ static int cpu_online_check_unaligned_access_emulated(unsigned int cpu) } #endif -#ifdef CONFIG_RISCV_SBI - static bool misaligned_traps_delegated; +#ifdef CONFIG_RISCV_SBI + static int cpu_online_sbi_unaligned_setup(unsigned int cpu) { if (sbi_fwft_set(SBI_FWFT_MISALIGNED_EXC_DELEG, 1, 0) && @@ -760,6 +760,7 @@ static int cpu_online_sbi_unaligned_setup(unsigned int cpu __always_unused) { return 0; } + #endif int cpu_online_unaligned_access_init(unsigned int cpu) @@ -772,3 +773,15 @@ int cpu_online_unaligned_access_init(unsigned int cpu) return cpu_online_check_unaligned_access_emulated(cpu); } + +bool misaligned_traps_can_delegate(void) +{ + /* + * Either we successfully requested
misaligned traps delegation for all + * CPUS or the SBI does not implemented FWFT extension but delegated the + * exception by default. + */ + return misaligned_traps_delegated || + all_cpus_unaligned_scalar_access_emulated(); +} +EXPORT_SYMBOL_GPL(misaligned_traps_can_delegate); -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 10:06:16 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 18:06:16 +0100 Subject: [PATCH v4 10/18] riscv: misaligned: factorize trap handling In-Reply-To: <20250317170625.1142870-1-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> Message-ID: <20250317170625.1142870-11-cleger@rivosinc.com> Misaligned access traps are not NMIs and should be treated as normal ones using irqentry_enter()/exit(). Since both load/store and user/kernel should use almost the same path, and since we are going to add some code around that, factorize it. Signed-off-by: Clément Léger --- arch/riscv/kernel/traps.c | 49 ++++++++++++++++----------------------- 1 file changed, 20 insertions(+), 29 deletions(-) diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 8ff8e8b36524..55d9f3450398 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -198,47 +198,38 @@ asmlinkage __visible __trap_section void do_trap_insn_illegal(struct pt_regs *re DO_ERROR_INFO(do_trap_load_fault, SIGSEGV, SEGV_ACCERR, "load access fault"); -asmlinkage __visible __trap_section void do_trap_load_misaligned(struct pt_regs *regs) +enum misaligned_access_type { + MISALIGNED_STORE, + MISALIGNED_LOAD, +}; + +static void do_trap_misaligned(struct pt_regs *regs, enum misaligned_access_type type) { - if (user_mode(regs)) { - irqentry_enter_from_user_mode(regs); + irqentry_state_t state = irqentry_enter(regs); + if (type == MISALIGNED_LOAD) { if (handle_misaligned_load(regs)) do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc, - "Oops - load address misaligned"); - -
irqentry_exit_to_user_mode(regs); + "Oops - load address misaligned"); } else { - irqentry_state_t state = irqentry_nmi_enter(regs); - - if (handle_misaligned_load(regs)) + if (handle_misaligned_store(regs)) do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc, - "Oops - load address misaligned"); - - irqentry_nmi_exit(regs, state); + "Oops - store (or AMO) address misaligned"); } + + irqentry_exit(regs, state); } -asmlinkage __visible __trap_section void do_trap_store_misaligned(struct pt_regs *regs) +asmlinkage __visible __trap_section void do_trap_load_misaligned(struct pt_regs *regs) { - if (user_mode(regs)) { - irqentry_enter_from_user_mode(regs); - - if (handle_misaligned_store(regs)) - do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc, - "Oops - store (or AMO) address misaligned"); - - irqentry_exit_to_user_mode(regs); - } else { - irqentry_state_t state = irqentry_nmi_enter(regs); - - if (handle_misaligned_store(regs)) - do_trap_error(regs, SIGBUS, BUS_ADRALN, regs->epc, - "Oops - store (or AMO) address misaligned"); + do_trap_misaligned(regs, MISALIGNED_LOAD); +} - irqentry_nmi_exit(regs, state); - } +asmlinkage __visible __trap_section void do_trap_store_misaligned(struct pt_regs *regs) +{ + do_trap_misaligned(regs, MISALIGNED_STORE); } + DO_ERROR_INFO(do_trap_store_fault, SIGSEGV, SEGV_ACCERR, "store (or AMO) access fault"); DO_ERROR_INFO(do_trap_ecall_s, -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 10:06:17 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 18:06:17 +0100 Subject: [PATCH v4 11/18] riscv: misaligned: enable IRQs while handling misaligned accesses In-Reply-To: <20250317170625.1142870-1-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> Message-ID: <20250317170625.1142870-12-cleger@rivosinc.com> We can safely reenable IRQs if they were enabled in the previous context. This allows access to user memory, which could potentially trigger a page fault.
Signed-off-by: Clément Léger --- arch/riscv/kernel/traps.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 55d9f3450398..3eecc2addc41 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -206,6 +206,11 @@ enum misaligned_access_type { static void do_trap_misaligned(struct pt_regs *regs, enum misaligned_access_type type) { irqentry_state_t state = irqentry_enter(regs); + bool enable_irqs = !regs_irqs_disabled(regs); + + /* Enable interrupts if they were enabled in the interrupted context. */ + if (enable_irqs) + local_irq_enable(); if (type == MISALIGNED_LOAD) { if (handle_misaligned_load(regs)) @@ -217,6 +222,9 @@ static void do_trap_misaligned(struct pt_regs *regs, enum misaligned_access_type "Oops - store (or AMO) address misaligned"); } + if (enable_irqs) + local_irq_disable(); + irqentry_exit(regs, state); } -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 10:06:18 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 18:06:18 +0100 Subject: [PATCH v4 12/18] riscv: misaligned: use get_user() instead of __get_user() In-Reply-To: <20250317170625.1142870-1-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> Message-ID: <20250317170625.1142870-13-cleger@rivosinc.com> Now that we can safely handle user memory accesses while in the misaligned access handlers, use get_user() instead of __get_user() to have user memory access checks.
Signed-off-by: Clément Léger --- arch/riscv/kernel/traps_misaligned.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c index 0fb663ac200f..90466a171f58 100644 --- a/arch/riscv/kernel/traps_misaligned.c +++ b/arch/riscv/kernel/traps_misaligned.c @@ -269,7 +269,7 @@ static unsigned long get_f32_rs(unsigned long insn, u8 fp_reg_offset, int __ret; \ \ if (user_mode(regs)) { \ - __ret = __get_user(insn, (type __user *) insn_addr); \ + __ret = get_user(insn, (type __user *) insn_addr); \ } else { \ insn = *(type *)insn_addr; \ __ret = 0; \ -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 10:06:19 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 18:06:19 +0100 Subject: [PATCH v4 13/18] Documentation/sysctl: add riscv to unaligned-trap supported archs In-Reply-To: <20250317170625.1142870-1-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> Message-ID: <20250317170625.1142870-14-cleger@rivosinc.com> riscv supports the "unaligned-trap" sysctl variable; add it to the list of supported architectures. Signed-off-by: Clément Léger --- Documentation/admin-guide/sysctl/kernel.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst index dd49a89a62d3..a38e91c4d92c 100644 --- a/Documentation/admin-guide/sysctl/kernel.rst +++ b/Documentation/admin-guide/sysctl/kernel.rst @@ -1595,8 +1595,8 @@ unaligned-trap On architectures where unaligned accesses cause traps, and where this feature is supported (``CONFIG_SYSCTL_ARCH_UNALIGN_ALLOW``; currently, -``arc``, ``parisc`` and ``loongarch``), controls whether unaligned traps -are caught and emulated (instead of failing). +``arc``, ``parisc``, ``loongarch`` and ``riscv``), controls whether unaligned +traps are caught and emulated (instead of failing).
= ======================================================== 0 Do not emulate unaligned accesses. -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 10:06:20 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 18:06:20 +0100 Subject: [PATCH v4 14/18] selftests: riscv: add misaligned access testing In-Reply-To: <20250317170625.1142870-1-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> Message-ID: <20250317170625.1142870-15-cleger@rivosinc.com> Now that the kernel can emulate misaligned access and control its behavior, add a selftest for that. This selftest tests all the currently emulated instructions (except for the RV32 compressed ones, which are left as a future exercise for an RV32 user). For the FPU instructions, all the FPU registers are tested. Signed-off-by: Clément Léger --- .../selftests/riscv/misaligned/.gitignore | 1 + .../selftests/riscv/misaligned/Makefile | 12 + .../selftests/riscv/misaligned/common.S | 33 +++ .../testing/selftests/riscv/misaligned/fpu.S | 180 +++++++++++++ tools/testing/selftests/riscv/misaligned/gp.S | 103 +++++++ .../selftests/riscv/misaligned/misaligned.c | 254 ++++++++++++++++++ 6 files changed, 583 insertions(+) create mode 100644 tools/testing/selftests/riscv/misaligned/.gitignore create mode 100644 tools/testing/selftests/riscv/misaligned/Makefile create mode 100644 tools/testing/selftests/riscv/misaligned/common.S create mode 100644 tools/testing/selftests/riscv/misaligned/fpu.S create mode 100644 tools/testing/selftests/riscv/misaligned/gp.S create mode 100644 tools/testing/selftests/riscv/misaligned/misaligned.c diff --git a/tools/testing/selftests/riscv/misaligned/.gitignore b/tools/testing/selftests/riscv/misaligned/.gitignore new file mode 100644 index 000000000000..5eff15a1f981 --- /dev/null +++ b/tools/testing/selftests/riscv/misaligned/.gitignore @@ -0,0 +1 @@ +misaligned diff --git a/tools/testing/selftests/riscv/misaligned/Makefile
b/tools/testing/selftests/riscv/misaligned/Makefile new file mode 100644 index 000000000000..1aa40110c50d --- /dev/null +++ b/tools/testing/selftests/riscv/misaligned/Makefile @@ -0,0 +1,12 @@ +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) 2021 ARM Limited +# Originally tools/testing/arm64/abi/Makefile + +CFLAGS += -I$(top_srcdir)/tools/include + +TEST_GEN_PROGS := misaligned + +include ../../lib.mk + +$(OUTPUT)/misaligned: misaligned.c fpu.S gp.S + $(CC) -g3 -static -o$@ -march=rv64imafdc $(CFLAGS) $(LDFLAGS) $^ diff --git a/tools/testing/selftests/riscv/misaligned/common.S b/tools/testing/selftests/riscv/misaligned/common.S new file mode 100644 index 000000000000..8fa00035bd5d --- /dev/null +++ b/tools/testing/selftests/riscv/misaligned/common.S @@ -0,0 +1,33 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (c) 2025 Rivos Inc. + * + * Authors: + * Clément Léger + */ + +.macro lb_sb temp, offset, src, dst + lb \temp, \offset(\src) + sb \temp, \offset(\dst) +.endm + +.macro copy_long_to temp, src, dst + lb_sb \temp, 0, \src, \dst, + lb_sb \temp, 1, \src, \dst, + lb_sb \temp, 2, \src, \dst, + lb_sb \temp, 3, \src, \dst, + lb_sb \temp, 4, \src, \dst, + lb_sb \temp, 5, \src, \dst, + lb_sb \temp, 6, \src, \dst, + lb_sb \temp, 7, \src, \dst, +.endm + +.macro sp_stack_prologue offset + addi sp, sp, -8 + sub sp, sp, \offset +.endm + +.macro sp_stack_epilogue offset + add sp, sp, \offset + addi sp, sp, 8 +.endm diff --git a/tools/testing/selftests/riscv/misaligned/fpu.S b/tools/testing/selftests/riscv/misaligned/fpu.S new file mode 100644 index 000000000000..d008bff58310 --- /dev/null +++ b/tools/testing/selftests/riscv/misaligned/fpu.S @@ -0,0 +1,180 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (c) 2025 Rivos Inc.
+ * + * Authors: + * Clément Léger + */ + +#include "common.S" + +#define CASE_ALIGN 4 + +.macro fpu_load_inst fpreg, inst, precision, load_reg +.align CASE_ALIGN + \inst \fpreg, 0(\load_reg) + fmv.\precision fa0, \fpreg + j 2f +.endm + +#define flw(__fpreg) fpu_load_inst __fpreg, flw, s, s1 +#define fld(__fpreg) fpu_load_inst __fpreg, fld, d, s1 +#define c_flw(__fpreg) fpu_load_inst __fpreg, c.flw, s, s1 +#define c_fld(__fpreg) fpu_load_inst __fpreg, c.fld, d, s1 +#define c_fldsp(__fpreg) fpu_load_inst __fpreg, c.fldsp, d, sp + +.macro fpu_store_inst fpreg, inst, precision, store_reg +.align CASE_ALIGN + fmv.\precision \fpreg, fa0 + \inst \fpreg, 0(\store_reg) + j 2f +.endm + +#define fsw(__fpreg) fpu_store_inst __fpreg, fsw, s, s1 +#define fsd(__fpreg) fpu_store_inst __fpreg, fsd, d, s1 +#define c_fsw(__fpreg) fpu_store_inst __fpreg, c.fsw, s, s1 +#define c_fsd(__fpreg) fpu_store_inst __fpreg, c.fsd, d, s1 +#define c_fsdsp(__fpreg) fpu_store_inst __fpreg, c.fsdsp, d, sp + +.macro fp_test_prologue + move s1, a1 + /* + * Compute jump offset to store the correct FP register since we don't + * have indirect FP register access (or at least we don't use this + * extension so that works on all archs) + */ + sll t0, a0, CASE_ALIGN + la t2, 1f + add t0, t0, t2 + jr t0 +.align CASE_ALIGN +1: +.endm + +.macro fp_test_prologue_compressed + /* FP registers for compressed instructions starts from 8 to 16 */ + addi a0, a0, -8 + fp_test_prologue +.endm + +#define fp_test_body_compressed(__inst_func) \ + __inst_func(f8); \ + __inst_func(f9); \ + __inst_func(f10); \ + __inst_func(f11); \ + __inst_func(f12); \ + __inst_func(f13); \ + __inst_func(f14); \ + __inst_func(f15); \ +2: + +#define fp_test_body(__inst_func) \ + __inst_func(f0); \ + __inst_func(f1); \ + __inst_func(f2); \ + __inst_func(f3); \ + __inst_func(f4); \ + __inst_func(f5); \ + __inst_func(f6); \ + __inst_func(f7); \ + __inst_func(f8); \ + __inst_func(f9); \ + __inst_func(f10); \ + __inst_func(f11); \ +
__inst_func(f12); \ + __inst_func(f13); \ + __inst_func(f14); \ + __inst_func(f15); \ + __inst_func(f16); \ + __inst_func(f17); \ + __inst_func(f18); \ + __inst_func(f19); \ + __inst_func(f20); \ + __inst_func(f21); \ + __inst_func(f22); \ + __inst_func(f23); \ + __inst_func(f24); \ + __inst_func(f25); \ + __inst_func(f26); \ + __inst_func(f27); \ + __inst_func(f28); \ + __inst_func(f29); \ + __inst_func(f30); \ + __inst_func(f31); \ +2: +.text + +#define __gen_test_inst(__inst, __suffix) \ +.global test_ ## __inst; \ +test_ ## __inst:; \ + fp_test_prologue ## __suffix; \ + fp_test_body ## __suffix(__inst); \ + ret + +#define gen_test_inst_compressed(__inst) \ + .option arch,+c; \ + __gen_test_inst(c_ ## __inst, _compressed) + +#define gen_test_inst(__inst) \ + .balign 16; \ + .option push; \ + .option arch,-c; \ + __gen_test_inst(__inst, ); \ + .option pop + +.macro fp_test_prologue_load_compressed_sp + copy_long_to t0, a1, sp +.endm + +.macro fp_test_epilogue_load_compressed_sp +.endm + +.macro fp_test_prologue_store_compressed_sp +.endm + +.macro fp_test_epilogue_store_compressed_sp + copy_long_to t0, sp, a1 +.endm + +#define gen_inst_compressed_sp(__inst, __type) \ + .global test_c_ ## __inst ## sp; \ + test_c_ ## __inst ## sp:; \ + sp_stack_prologue a2; \ + fp_test_prologue_## __type ## _compressed_sp; \ + fp_test_prologue_compressed; \ + fp_test_body_compressed(c_ ## __inst ## sp); \ + fp_test_epilogue_## __type ## _compressed_sp; \ + sp_stack_epilogue a2; \ + ret + +#define gen_test_load_compressed_sp(__inst) gen_inst_compressed_sp(__inst, load) +#define gen_test_store_compressed_sp(__inst) gen_inst_compressed_sp(__inst, store) + +/* + * float_fsw_reg - Set a FP register from a register containing the value + * a0 = FP register index to be set + * a1 = addr where to store register value + * a2 = address offset + * a3 = value to be store + */ +gen_test_inst(fsw) + +/* + * float_flw_reg - Get a FP register value and return it + * a0 = FP register index to be 
retrieved + * a1 = addr to load register from + * a2 = address offset + */ +gen_test_inst(flw) + +gen_test_inst(fsd) +#ifdef __riscv_compressed +gen_test_inst_compressed(fsd) +gen_test_store_compressed_sp(fsd) +#endif + +gen_test_inst(fld) +#ifdef __riscv_compressed +gen_test_inst_compressed(fld) +gen_test_load_compressed_sp(fld) +#endif diff --git a/tools/testing/selftests/riscv/misaligned/gp.S b/tools/testing/selftests/riscv/misaligned/gp.S new file mode 100644 index 000000000000..f53f4c6d81dd --- /dev/null +++ b/tools/testing/selftests/riscv/misaligned/gp.S @@ -0,0 +1,103 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (c) 2025 Rivos Inc. + * + * Authors: + * Clément Léger + */ + +#include "common.S" + +.text + +.macro __gen_test_inst inst, src_reg + \inst a2, 0(\src_reg) + move a0, a2 +.endm + +.macro gen_func_header func_name, rvc + .option arch,\rvc + .global test_\func_name + test_\func_name: +.endm + +.macro gen_test_inst inst + .option push + gen_func_header \inst, -c + __gen_test_inst \inst, a0 + .option pop + ret +.endm + +.macro __gen_test_inst_c name, src_reg + .option push + gen_func_header c_\name, +c + __gen_test_inst c.\name, \src_reg + .option pop + ret +.endm + +.macro gen_test_inst_c name + __gen_test_inst_c \name, a0 +.endm + + +.macro gen_test_inst_load_c_sp name + .option push + gen_func_header c_\name\()sp, +c + sp_stack_prologue a1 + copy_long_to t0, a0, sp + c.ldsp a0, 0(sp) + sp_stack_epilogue a1 + .option pop + ret +.endm + +.macro lb_sp_sb_a0 reg, offset + lb_sb \reg, \offset, sp, a0 +.endm + +.macro gen_test_inst_store_c_sp inst_name + .option push + gen_func_header c_\inst_name\()sp, +c + /* Misalign stack pointer */ + sp_stack_prologue a1 + /* Misalign access */ + c.sdsp a2, 0(sp) + copy_long_to t0, sp, a0 + sp_stack_epilogue a1 + .option pop + ret +.endm + + + /* + * a0 = addr to load from + * a1 = address offset + * a2 = value to be loaded + */ +gen_test_inst lh +gen_test_inst lhu +gen_test_inst lw +gen_test_inst
lwu +gen_test_inst ld +#ifdef __riscv_compressed +gen_test_inst_c lw +gen_test_inst_c ld +gen_test_inst_load_c_sp ld +#endif + +/* + * a0 = addr where to store value + * a1 = address offset + * a2 = value to be stored + */ +gen_test_inst sh +gen_test_inst sw +gen_test_inst sd +#ifdef __riscv_compressed +gen_test_inst_c sw +gen_test_inst_c sd +gen_test_inst_store_c_sp sd +#endif + diff --git a/tools/testing/selftests/riscv/misaligned/misaligned.c b/tools/testing/selftests/riscv/misaligned/misaligned.c new file mode 100644 index 000000000000..c66aa87ec03e --- /dev/null +++ b/tools/testing/selftests/riscv/misaligned/misaligned.c @@ -0,0 +1,254 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2025 Rivos Inc. + * + * Authors: + * Clément Léger + */ +#include +#include +#include +#include +#include "../../kselftest_harness.h" + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#define stringify(s) __stringify(s) +#define __stringify(s) #s + +#define VAL16 0x1234 +#define VAL32 0xDEADBEEF +#define VAL64 0x45674321D00DF789 + +#define VAL_float 78951.234375 +#define VAL_double 567890.512396965789589290 + +static bool float_equal(float a, float b) +{ + float scaled_epsilon; + float difference = fabsf(a - b); + + // Scale to the largest value. + a = fabsf(a); + b = fabsf(b); + if (a > b) + scaled_epsilon = FLT_EPSILON * a; + else + scaled_epsilon = FLT_EPSILON * b; + + return difference <= scaled_epsilon; +} + +static bool double_equal(double a, double b) +{ + double scaled_epsilon; + double difference = fabs(a - b); + + // Scale to the largest value.
+ a = fabs(a); + b = fabs(b); + if (a > b) + scaled_epsilon = DBL_EPSILON * a; + else + scaled_epsilon = DBL_EPSILON * b; + + return difference <= scaled_epsilon; +} + +#define fpu_load_proto(__inst, __type) \ +extern __type test_ ## __inst(unsigned long fp_reg, void *addr, unsigned long offset, __type value) + +fpu_load_proto(flw, float); +fpu_load_proto(fld, double); +fpu_load_proto(c_flw, float); +fpu_load_proto(c_fld, double); +fpu_load_proto(c_fldsp, double); + +#define fpu_store_proto(__inst, __type) \ +extern void test_ ## __inst(unsigned long fp_reg, void *addr, unsigned long offset, __type value) + +fpu_store_proto(fsw, float); +fpu_store_proto(fsd, double); +fpu_store_proto(c_fsw, float); +fpu_store_proto(c_fsd, double); +fpu_store_proto(c_fsdsp, double); + +#define gp_load_proto(__inst, __type) \ +extern __type test_ ## __inst(void *addr, unsigned long offset, __type value) + +gp_load_proto(lh, uint16_t); +gp_load_proto(lhu, uint16_t); +gp_load_proto(lw, uint32_t); +gp_load_proto(lwu, uint32_t); +gp_load_proto(ld, uint64_t); +gp_load_proto(c_lw, uint32_t); +gp_load_proto(c_ld, uint64_t); +gp_load_proto(c_ldsp, uint64_t); + +#define gp_store_proto(__inst, __type) \ +extern void test_ ## __inst(void *addr, unsigned long offset, __type value) + +gp_store_proto(sh, uint16_t); +gp_store_proto(sw, uint32_t); +gp_store_proto(sd, uint64_t); +gp_store_proto(c_sw, uint32_t); +gp_store_proto(c_sd, uint64_t); +gp_store_proto(c_sdsp, uint64_t); + +#define TEST_GP_LOAD(__inst, __type_size) \ +TEST(gp_load_ ## __inst) \ +{ \ + int offset, ret; \ + uint8_t buf[16] __attribute__((aligned(16))); \ + \ + ret = prctl(PR_SET_UNALIGN, PR_UNALIGN_NOPRINT); \ + ASSERT_EQ(ret, 0); \ + \ + for (offset = 1; offset < __type_size / 8; offset++) { \ + uint ## __type_size ## _t val = VAL ## __type_size; \ + uint ## __type_size ## _t *ptr = (uint ## __type_size ## _t *) (buf + offset); \ + memcpy(ptr, &val, sizeof(val)); \ + val = test_ ## __inst(ptr, offset, val); \ + EXPECT_EQ(VAL ## 
__type_size, val); \ + } \ +} + +TEST_GP_LOAD(lh, 16); +TEST_GP_LOAD(lhu, 16); +TEST_GP_LOAD(lw, 32); +TEST_GP_LOAD(lwu, 32); +TEST_GP_LOAD(ld, 64); +#ifdef __riscv_compressed +TEST_GP_LOAD(c_lw, 32); +TEST_GP_LOAD(c_ld, 64); +TEST_GP_LOAD(c_ldsp, 64); +#endif + +#define TEST_GP_STORE(__inst, __type_size) \ +TEST(gp_store_ ## __inst) \ +{ \ + int offset, ret; \ + uint8_t buf[16] __attribute__((aligned(16))); \ + \ + ret = prctl(PR_SET_UNALIGN, PR_UNALIGN_NOPRINT); \ + ASSERT_EQ(ret, 0); \ + \ + for (offset = 1; offset < __type_size / 8; offset++) { \ + uint ## __type_size ## _t val = VAL ## __type_size; \ + uint ## __type_size ## _t *ptr = (uint ## __type_size ## _t *) (buf + offset); \ + memset(ptr, 0, sizeof(val)); \ + test_ ## __inst(ptr, offset, val); \ + memcpy(&val, ptr, sizeof(val)); \ + EXPECT_EQ(VAL ## __type_size, val); \ + } \ +} +TEST_GP_STORE(sh, 16); +TEST_GP_STORE(sw, 32); +TEST_GP_STORE(sd, 64); +#ifdef __riscv_compressed +TEST_GP_STORE(c_sw, 32); +TEST_GP_STORE(c_sd, 64); +TEST_GP_STORE(c_sdsp, 64); +#endif + +#define __TEST_FPU_LOAD(__type, __inst, __reg_start, __reg_end) \ +TEST(fpu_load_ ## __inst) \ +{ \ + int i, ret, offset, fp_reg; \ + uint8_t buf[16] __attribute__((aligned(16))); \ + \ + ret = prctl(PR_SET_UNALIGN, PR_UNALIGN_NOPRINT); \ + ASSERT_EQ(ret, 0); \ + \ + for (fp_reg = __reg_start; fp_reg < __reg_end; fp_reg++) { \ + for (offset = 1; offset < 4; offset++) { \ + void *load_addr = (buf + offset); \ + __type val = VAL_ ## __type ; \ + \ + memcpy(load_addr, &val, sizeof(val)); \ + val = test_ ## __inst(fp_reg, load_addr, offset, val); \ + EXPECT_TRUE(__type ##_equal(val, VAL_## __type)); \ + } \ + } \ +} +#define TEST_FPU_LOAD(__type, __inst) \ + __TEST_FPU_LOAD(__type, __inst, 0, 32) +#define TEST_FPU_LOAD_COMPRESSED(__type, __inst) \ + __TEST_FPU_LOAD(__type, __inst, 8, 16) + +TEST_FPU_LOAD(float, flw) +TEST_FPU_LOAD(double, fld) +#ifdef __riscv_compressed +TEST_FPU_LOAD_COMPRESSED(double, c_fld) +TEST_FPU_LOAD_COMPRESSED(double,
c_fldsp) +#endif + +#define __TEST_FPU_STORE(__type, __inst, __reg_start, __reg_end) \ +TEST(fpu_store_ ## __inst) \ +{ \ + int i, ret, offset, fp_reg; \ + uint8_t buf[16] __attribute__((aligned(16))); \ + \ + ret = prctl(PR_SET_UNALIGN, PR_UNALIGN_NOPRINT); \ + ASSERT_EQ(ret, 0); \ + \ + for (fp_reg = __reg_start; fp_reg < __reg_end; fp_reg++) { \ + for (offset = 1; offset < 4; offset++) { \ + \ + void *store_addr = (buf + offset); \ + __type val = VAL_ ## __type ; \ + \ + test_ ## __inst(fp_reg, store_addr, offset, val); \ + memcpy(&val, store_addr, sizeof(val)); \ + EXPECT_TRUE(__type ## _equal(val, VAL_## __type)); \ + } \ + } \ +} +#define TEST_FPU_STORE(__type, __inst) \ + __TEST_FPU_STORE(__type, __inst, 0, 32) +#define TEST_FPU_STORE_COMPRESSED(__type, __inst) \ + __TEST_FPU_STORE(__type, __inst, 8, 16) + +TEST_FPU_STORE(float, fsw) +TEST_FPU_STORE(double, fsd) +#ifdef __riscv_compressed +TEST_FPU_STORE_COMPRESSED(double, c_fsd) +TEST_FPU_STORE_COMPRESSED(double, c_fsdsp) +#endif + +TEST_SIGNAL(gen_sigbus, SIGBUS) +{ + uint32_t *ptr; + uint8_t buf[16] __attribute__((aligned(16))); + int ret; + + ret = prctl(PR_SET_UNALIGN, PR_UNALIGN_SIGBUS); + ASSERT_EQ(ret, 0); + + ptr = (uint32_t *)(buf + 1); + *ptr = 0xDEADBEEFULL; +} + +int main(int argc, char **argv) +{ + int ret, val; + + ret = prctl(PR_GET_UNALIGN, &val); + if (ret == -1 && errno == EINVAL) + ksft_exit_skip("SKIP GET_UNALIGN_CTL not supported\n"); + + exit(test_harness_run(argc, argv)); +} -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 10:06:21 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 18:06:21 +0100 Subject: [PATCH v4 15/18] RISC-V: KVM: add SBI extension init()/deinit() functions In-Reply-To: <20250317170625.1142870-1-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> Message-ID: <20250317170625.1142870-16-cleger@rivosinc.com> The FWFT SBI extension will need to dynamically allocate memory and do init time 
specific initialization. Add init/deinit callbacks that allow doing so. Signed-off-by: Clément Léger --- arch/riscv/include/asm/kvm_vcpu_sbi.h | 9 +++++++++ arch/riscv/kvm/vcpu.c | 2 ++ arch/riscv/kvm/vcpu_sbi.c | 26 ++++++++++++++++++++++++++ 3 files changed, 37 insertions(+) diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h index 4ed6203cdd30..bcb90757b149 100644 --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h @@ -49,6 +49,14 @@ struct kvm_vcpu_sbi_extension { /* Extension specific probe function */ unsigned long (*probe)(struct kvm_vcpu *vcpu); + + /* + * Init/deinit function called once during VCPU init/destroy. These + * might be used if the SBI extensions need to allocate or do specific + * init time only configuration. + */ + int (*init)(struct kvm_vcpu *vcpu); + void (*deinit)(struct kvm_vcpu *vcpu); }; void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run); @@ -69,6 +77,7 @@ const struct kvm_vcpu_sbi_extension *kvm_vcpu_sbi_find_ext( bool riscv_vcpu_supports_sbi_ext(struct kvm_vcpu *vcpu, int idx); int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run); void kvm_riscv_vcpu_sbi_init(struct kvm_vcpu *vcpu); +void kvm_riscv_vcpu_sbi_deinit(struct kvm_vcpu *vcpu); int kvm_riscv_vcpu_get_reg_sbi_sta(struct kvm_vcpu *vcpu, unsigned long reg_num, unsigned long *reg_val); diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c index 60d684c76c58..877bcc85c067 100644 --- a/arch/riscv/kvm/vcpu.c +++ b/arch/riscv/kvm/vcpu.c @@ -185,6 +185,8 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu) void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu) { + kvm_riscv_vcpu_sbi_deinit(vcpu); + /* Cleanup VCPU AIA context */ kvm_riscv_vcpu_aia_deinit(vcpu); diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c index d1c83a77735e..3139f171c20f 100644 --- a/arch/riscv/kvm/vcpu_sbi.c +++ b/arch/riscv/kvm/vcpu_sbi.c @@ -508,5 +508,31 @@ void
kvm_riscv_vcpu_sbi_init(struct kvm_vcpu *vcpu) scontext->ext_status[idx] = ext->default_disabled ? KVM_RISCV_SBI_EXT_STATUS_DISABLED : KVM_RISCV_SBI_EXT_STATUS_ENABLED; + + if (ext->init && ext->init(vcpu) != 0) + scontext->ext_status[idx] = KVM_RISCV_SBI_EXT_STATUS_UNAVAILABLE; + } +} + +void kvm_riscv_vcpu_sbi_deinit(struct kvm_vcpu *vcpu) +{ + struct kvm_vcpu_sbi_context *scontext = &vcpu->arch.sbi_context; + const struct kvm_riscv_sbi_extension_entry *entry; + const struct kvm_vcpu_sbi_extension *ext; + int idx, i; + + for (i = 0; i < ARRAY_SIZE(sbi_ext); i++) { + entry = &sbi_ext[i]; + ext = entry->ext_ptr; + idx = entry->ext_idx; + + if (idx < 0 || idx >= ARRAY_SIZE(scontext->ext_status)) + continue; + + if (scontext->ext_status[idx] == KVM_RISCV_SBI_EXT_STATUS_UNAVAILABLE || + !ext->deinit) + continue; + + ext->deinit(vcpu); } } -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 10:06:23 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 18:06:23 +0100 Subject: [PATCH v4 17/18] RISC-V: KVM: add support for FWFT SBI extension In-Reply-To: <20250317170625.1142870-1-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> Message-ID: <20250317170625.1142870-18-cleger@rivosinc.com> Add basic infrastructure to support the FWFT extension in KVM. 
Signed-off-by: Clément Léger Reviewed-by: Andrew Jones --- arch/riscv/include/asm/kvm_host.h | 4 + arch/riscv/include/asm/kvm_vcpu_sbi.h | 1 + arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h | 29 +++ arch/riscv/include/uapi/asm/kvm.h | 1 + arch/riscv/kvm/Makefile | 1 + arch/riscv/kvm/vcpu_sbi.c | 4 + arch/riscv/kvm/vcpu_sbi_fwft.c | 216 +++++++++++++++++++++ 7 files changed, 256 insertions(+) create mode 100644 arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h create mode 100644 arch/riscv/kvm/vcpu_sbi_fwft.c diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h index bb93d2995ea2..c0db61ba691a 100644 --- a/arch/riscv/include/asm/kvm_host.h +++ b/arch/riscv/include/asm/kvm_host.h @@ -19,6 +19,7 @@ #include #include #include +#include #include #include @@ -281,6 +282,9 @@ struct kvm_vcpu_arch { /* Performance monitoring context */ struct kvm_pmu pmu_context; + /* Firmware feature SBI extension context */ + struct kvm_sbi_fwft fwft_context; + /* 'static' configurations which are set only once */ struct kvm_vcpu_config cfg; diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h index cb68b3a57c8f..ffd03fed0c06 100644 --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h @@ -98,6 +98,7 @@ extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_hsm; extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_dbcn; extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_susp; extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_sta; +extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_fwft; extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_experimental; extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_vendor; diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h b/arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h new file mode 100644 index 000000000000..9ba841355758 --- /dev/null +++ b/arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h @@ -0,0 +1,29 @@ +/* SPDX-License-Identifier:
GPL-2.0-only */ +/* + * Copyright (c) 2025 Rivos Inc. + * + * Authors: + * Clément Léger + */ + +#ifndef __KVM_VCPU_RISCV_FWFT_H +#define __KVM_VCPU_RISCV_FWFT_H + +#include + +struct kvm_sbi_fwft_feature; + +struct kvm_sbi_fwft_config { + const struct kvm_sbi_fwft_feature *feature; + bool supported; + unsigned long flags; +}; + +/* FWFT data structure per vcpu */ +struct kvm_sbi_fwft { + struct kvm_sbi_fwft_config *configs; +}; + +#define vcpu_to_fwft(vcpu) (&(vcpu)->arch.fwft_context) + +#endif /* !__KVM_VCPU_RISCV_FWFT_H */ diff --git a/arch/riscv/include/uapi/asm/kvm.h b/arch/riscv/include/uapi/asm/kvm.h index f06bc5efcd79..fa6eee1caf41 100644 --- a/arch/riscv/include/uapi/asm/kvm.h +++ b/arch/riscv/include/uapi/asm/kvm.h @@ -202,6 +202,7 @@ enum KVM_RISCV_SBI_EXT_ID { KVM_RISCV_SBI_EXT_DBCN, KVM_RISCV_SBI_EXT_STA, KVM_RISCV_SBI_EXT_SUSP, + KVM_RISCV_SBI_EXT_FWFT, KVM_RISCV_SBI_EXT_MAX, }; diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile index 4e0bba91d284..06e2d52a9b88 100644 --- a/arch/riscv/kvm/Makefile +++ b/arch/riscv/kvm/Makefile @@ -26,6 +26,7 @@ kvm-y += vcpu_onereg.o kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o kvm-y += vcpu_sbi.o kvm-y += vcpu_sbi_base.o +kvm-y += vcpu_sbi_fwft.o kvm-y += vcpu_sbi_hsm.o kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_sbi_pmu.o kvm-y += vcpu_sbi_replace.o diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c index 50be079b5528..0748810c0252 100644 --- a/arch/riscv/kvm/vcpu_sbi.c +++ b/arch/riscv/kvm/vcpu_sbi.c @@ -78,6 +78,10 @@ static const struct kvm_riscv_sbi_extension_entry sbi_ext[] = { .ext_idx = KVM_RISCV_SBI_EXT_STA, .ext_ptr = &vcpu_sbi_ext_sta, }, + { + .ext_idx = KVM_RISCV_SBI_EXT_FWFT, + .ext_ptr = &vcpu_sbi_ext_fwft, + }, { .ext_idx = KVM_RISCV_SBI_EXT_EXPERIMENTAL, .ext_ptr = &vcpu_sbi_ext_experimental, diff --git a/arch/riscv/kvm/vcpu_sbi_fwft.c b/arch/riscv/kvm/vcpu_sbi_fwft.c new file mode 100644 index 000000000000..8a7cfe1fe7a7 --- /dev/null +++ b/arch/riscv/kvm/vcpu_sbi_fwft.c @@ -0,0
+1,216 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2025 Rivos Inc. + * + * Authors: + * Clément Léger + */ + +#include +#include +#include +#include +#include +#include +#include + +struct kvm_sbi_fwft_feature { + /** + * @id: Feature ID + */ + enum sbi_fwft_feature_t id; + + /** + * @supported: Check if the feature is supported on the vcpu + * + * This callback is optional, if not provided the feature is assumed to + * be supported + */ + bool (*supported)(struct kvm_vcpu *vcpu); + + /** + * @set: Set the feature value + * + * Return SBI_SUCCESS on success or an SBI error (SBI_ERR_*) + * + * This callback is mandatory + */ + long (*set)(struct kvm_vcpu *vcpu, struct kvm_sbi_fwft_config *conf, unsigned long value); + + /** + * @get: Get the feature current value + * + * Return SBI_SUCCESS on success or an SBI error (SBI_ERR_*) + * + * This callback is mandatory + */ + long (*get)(struct kvm_vcpu *vcpu, struct kvm_sbi_fwft_config *conf, unsigned long *value); +}; + +static const enum sbi_fwft_feature_t kvm_fwft_defined_features[] = { + SBI_FWFT_MISALIGNED_EXC_DELEG, + SBI_FWFT_LANDING_PAD, + SBI_FWFT_SHADOW_STACK, + SBI_FWFT_DOUBLE_TRAP, + SBI_FWFT_PTE_AD_HW_UPDATING, + SBI_FWFT_POINTER_MASKING_PMLEN, +}; + +static bool kvm_fwft_is_defined_feature(enum sbi_fwft_feature_t feature) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(kvm_fwft_defined_features); i++) { + if (kvm_fwft_defined_features[i] == feature) + return true; + } + + return false; +} + +static const struct kvm_sbi_fwft_feature features[] = { +}; + +static struct kvm_sbi_fwft_config * +kvm_sbi_fwft_get_config(struct kvm_vcpu *vcpu, enum sbi_fwft_feature_t feature) +{ + int i; + struct kvm_sbi_fwft *fwft = vcpu_to_fwft(vcpu); + + for (i = 0; i < ARRAY_SIZE(features); i++) { + if (fwft->configs[i].feature->id == feature) + return &fwft->configs[i]; + } + + return NULL; +} + +static int kvm_fwft_get_feature(struct kvm_vcpu *vcpu, u32 feature, + struct kvm_sbi_fwft_config **conf) +{ + struct
kvm_sbi_fwft_config *tconf; + + tconf = kvm_sbi_fwft_get_config(vcpu, feature); + if (!tconf) { + if (kvm_fwft_is_defined_feature(feature)) + return SBI_ERR_NOT_SUPPORTED; + + return SBI_ERR_DENIED; + } + + if (!tconf->supported) + return SBI_ERR_NOT_SUPPORTED; + + *conf = tconf; + + return SBI_SUCCESS; +} + +static int kvm_sbi_fwft_set(struct kvm_vcpu *vcpu, u32 feature, + unsigned long value, unsigned long flags) +{ + int ret; + struct kvm_sbi_fwft_config *conf; + + ret = kvm_fwft_get_feature(vcpu, feature, &conf); + if (ret) + return ret; + + if ((flags & ~SBI_FWFT_SET_FLAG_LOCK) != 0) + return SBI_ERR_INVALID_PARAM; + + if (conf->flags & SBI_FWFT_SET_FLAG_LOCK) + return SBI_ERR_DENIED_LOCKED; + + conf->flags = flags; + + return conf->feature->set(vcpu, conf, value); +} + +static int kvm_sbi_fwft_get(struct kvm_vcpu *vcpu, unsigned long feature, + unsigned long *value) +{ + int ret; + struct kvm_sbi_fwft_config *conf; + + ret = kvm_fwft_get_feature(vcpu, feature, &conf); + if (ret) + return ret; + + return conf->feature->get(vcpu, conf, value); +} + +static int kvm_sbi_ext_fwft_handler(struct kvm_vcpu *vcpu, struct kvm_run *run, + struct kvm_vcpu_sbi_return *retdata) +{ + int ret; + struct kvm_cpu_context *cp = &vcpu->arch.guest_context; + unsigned long funcid = cp->a6; + + switch (funcid) { + case SBI_EXT_FWFT_SET: + ret = kvm_sbi_fwft_set(vcpu, cp->a0, cp->a1, cp->a2); + break; + case SBI_EXT_FWFT_GET: + ret = kvm_sbi_fwft_get(vcpu, cp->a0, &retdata->out_val); + break; + default: + ret = SBI_ERR_NOT_SUPPORTED; + break; + } + + retdata->err_val = ret; + + return 0; +} + +static int kvm_sbi_ext_fwft_init(struct kvm_vcpu *vcpu) +{ + struct kvm_sbi_fwft *fwft = vcpu_to_fwft(vcpu); + const struct kvm_sbi_fwft_feature *feature; + struct kvm_sbi_fwft_config *conf; + int i; + + fwft->configs = kcalloc(ARRAY_SIZE(features), sizeof(struct kvm_sbi_fwft_config), + GFP_KERNEL); + if (!fwft->configs) + return -ENOMEM; + + for (i = 0; i < ARRAY_SIZE(features); i++) { + 
feature = &features[i]; + conf = &fwft->configs[i]; + if (feature->supported) + conf->supported = feature->supported(vcpu); + else + conf->supported = true; + + conf->feature = feature; + } + + return 0; +} + +static void kvm_sbi_ext_fwft_deinit(struct kvm_vcpu *vcpu) +{ + struct kvm_sbi_fwft *fwft = vcpu_to_fwft(vcpu); + + kfree(fwft->configs); +} + +static void kvm_sbi_ext_fwft_reset(struct kvm_vcpu *vcpu) +{ + int i; + struct kvm_sbi_fwft *fwft = vcpu_to_fwft(vcpu); + + for (i = 0; i < ARRAY_SIZE(features); i++) + fwft->configs[i].flags = 0; +} + +const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_fwft = { + .extid_start = SBI_EXT_FWFT, + .extid_end = SBI_EXT_FWFT, + .handler = kvm_sbi_ext_fwft_handler, + .init = kvm_sbi_ext_fwft_init, + .deinit = kvm_sbi_ext_fwft_deinit, + .reset = kvm_sbi_ext_fwft_reset, +}; -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 10:06:24 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 18:06:24 +0100 Subject: [PATCH v4 18/18] RISC-V: KVM: add support for SBI_FWFT_MISALIGNED_DELEG In-Reply-To: <20250317170625.1142870-1-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> Message-ID: <20250317170625.1142870-19-cleger@rivosinc.com> SBI_FWFT_MISALIGNED_DELEG needs hedeleg to be modified to delegate misaligned load/store exceptions. Save and restore it during CPU load/put. 
Signed-off-by: Clément Léger Reviewed-by: Deepak Gupta Reviewed-by: Andrew Jones --- arch/riscv/kvm/vcpu.c | 3 +++ arch/riscv/kvm/vcpu_sbi_fwft.c | 36 ++++++++++++++++++++++++++++++++++ 2 files changed, 39 insertions(+) diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c index 542747e2c7f5..d98e379945c3 100644 --- a/arch/riscv/kvm/vcpu.c +++ b/arch/riscv/kvm/vcpu.c @@ -646,6 +646,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) { void *nsh; struct kvm_vcpu_csr *csr = &vcpu->arch.guest_csr; + struct kvm_vcpu_config *cfg = &vcpu->arch.cfg; vcpu->cpu = -1; @@ -671,6 +672,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) csr->vstval = nacl_csr_read(nsh, CSR_VSTVAL); csr->hvip = nacl_csr_read(nsh, CSR_HVIP); csr->vsatp = nacl_csr_read(nsh, CSR_VSATP); + cfg->hedeleg = nacl_csr_read(nsh, CSR_HEDELEG); } else { csr->vsstatus = csr_read(CSR_VSSTATUS); csr->vsie = csr_read(CSR_VSIE); @@ -681,6 +683,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) csr->vstval = csr_read(CSR_VSTVAL); csr->hvip = csr_read(CSR_HVIP); csr->vsatp = csr_read(CSR_VSATP); + cfg->hedeleg = csr_read(CSR_HEDELEG); } } diff --git a/arch/riscv/kvm/vcpu_sbi_fwft.c b/arch/riscv/kvm/vcpu_sbi_fwft.c index 8a7cfe1fe7a7..b0556d66e775 100644 --- a/arch/riscv/kvm/vcpu_sbi_fwft.c +++ b/arch/riscv/kvm/vcpu_sbi_fwft.c @@ -14,6 +14,8 @@ #include #include +#define MIS_DELEG (BIT_ULL(EXC_LOAD_MISALIGNED) | BIT_ULL(EXC_STORE_MISALIGNED)) + struct kvm_sbi_fwft_feature { /** * @id: Feature ID @@ -68,7 +70,41 @@ static bool kvm_fwft_is_defined_feature(enum sbi_fwft_feature_t feature) return false; } +static bool kvm_sbi_fwft_misaligned_delegation_supported(struct kvm_vcpu *vcpu) +{ + return misaligned_traps_can_delegate(); +} + +static long kvm_sbi_fwft_set_misaligned_delegation(struct kvm_vcpu *vcpu, + struct kvm_sbi_fwft_config *conf, + unsigned long value) +{ + if (value == 1) + csr_set(CSR_HEDELEG, MIS_DELEG); + else if (value == 0) + csr_clear(CSR_HEDELEG, MIS_DELEG); + else + return
SBI_ERR_INVALID_PARAM; + + return SBI_SUCCESS; +} + +static long kvm_sbi_fwft_get_misaligned_delegation(struct kvm_vcpu *vcpu, + struct kvm_sbi_fwft_config *conf, + unsigned long *value) +{ + *value = (csr_read(CSR_HEDELEG) & MIS_DELEG) != 0; + + return SBI_SUCCESS; +} + static const struct kvm_sbi_fwft_feature features[] = { + { + .id = SBI_FWFT_MISALIGNED_EXC_DELEG, + .supported = kvm_sbi_fwft_misaligned_delegation_supported, + .set = kvm_sbi_fwft_set_misaligned_delegation, + .get = kvm_sbi_fwft_get_misaligned_delegation, + }, }; static struct kvm_sbi_fwft_config * -- 2.47.2 From cleger at rivosinc.com Mon Mar 17 10:06:22 2025 From: cleger at rivosinc.com (=?UTF-8?q?Cl=C3=A9ment=20L=C3=A9ger?=) Date: Mon, 17 Mar 2025 18:06:22 +0100 Subject: [PATCH v4 16/18] RISC-V: KVM: add SBI extension reset callback In-Reply-To: <20250317170625.1142870-1-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> Message-ID: <20250317170625.1142870-17-cleger@rivosinc.com> Currently, only the STA extension needed a reset function but that's going to be the case for FWFT as well. Add a reset callback that can be implemented by SBI extensions. 
Signed-off-by: Clément Léger Reviewed-by: Andrew Jones --- arch/riscv/include/asm/kvm_host.h | 1 - arch/riscv/include/asm/kvm_vcpu_sbi.h | 2 ++ arch/riscv/kvm/vcpu.c | 2 +- arch/riscv/kvm/vcpu_sbi.c | 24 ++++++++++++++++++++++++ arch/riscv/kvm/vcpu_sbi_sta.c | 3 ++- 5 files changed, 29 insertions(+), 3 deletions(-) diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h index cc33e35cd628..bb93d2995ea2 100644 --- a/arch/riscv/include/asm/kvm_host.h +++ b/arch/riscv/include/asm/kvm_host.h @@ -409,7 +409,6 @@ void __kvm_riscv_vcpu_power_on(struct kvm_vcpu *vcpu); void kvm_riscv_vcpu_power_on(struct kvm_vcpu *vcpu); bool kvm_riscv_vcpu_stopped(struct kvm_vcpu *vcpu); -void kvm_riscv_vcpu_sbi_sta_reset(struct kvm_vcpu *vcpu); void kvm_riscv_vcpu_record_steal_time(struct kvm_vcpu *vcpu); #endif /* __RISCV_KVM_HOST_H__ */ diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h index bcb90757b149..cb68b3a57c8f 100644 --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h @@ -57,6 +57,7 @@ struct kvm_vcpu_sbi_extension { */ int (*init)(struct kvm_vcpu *vcpu); void (*deinit)(struct kvm_vcpu *vcpu); + void (*reset)(struct kvm_vcpu *vcpu); }; void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run); @@ -78,6 +79,7 @@ bool riscv_vcpu_supports_sbi_ext(struct kvm_vcpu *vcpu, int idx); int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run); void kvm_riscv_vcpu_sbi_init(struct kvm_vcpu *vcpu); void kvm_riscv_vcpu_sbi_deinit(struct kvm_vcpu *vcpu); +void kvm_riscv_vcpu_sbi_reset(struct kvm_vcpu *vcpu); int kvm_riscv_vcpu_get_reg_sbi_sta(struct kvm_vcpu *vcpu, unsigned long reg_num, unsigned long *reg_val); diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c index 877bcc85c067..542747e2c7f5 100644 --- a/arch/riscv/kvm/vcpu.c +++ b/arch/riscv/kvm/vcpu.c @@ -94,7 +94,7 @@ static void kvm_riscv_reset_vcpu(struct kvm_vcpu *vcpu)
vcpu->arch.hfence_tail = 0; memset(vcpu->arch.hfence_queue, 0, sizeof(vcpu->arch.hfence_queue)); - kvm_riscv_vcpu_sbi_sta_reset(vcpu); + kvm_riscv_vcpu_sbi_reset(vcpu); /* Reset the guest CSRs for hotplug usecase */ if (loaded) diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c index 3139f171c20f..50be079b5528 100644 --- a/arch/riscv/kvm/vcpu_sbi.c +++ b/arch/riscv/kvm/vcpu_sbi.c @@ -536,3 +536,27 @@ void kvm_riscv_vcpu_sbi_deinit(struct kvm_vcpu *vcpu) ext->deinit(vcpu); } } + +void kvm_riscv_vcpu_sbi_reset(struct kvm_vcpu *vcpu) +{ + struct kvm_vcpu_sbi_context *scontext = &vcpu->arch.sbi_context; + const struct kvm_riscv_sbi_extension_entry *entry; + const struct kvm_vcpu_sbi_extension *ext; + int idx, i; + + for (i = 0; i < ARRAY_SIZE(sbi_ext); i++) { + entry = &sbi_ext[i]; + ext = entry->ext_ptr; + idx = entry->ext_idx; + + if (idx < 0 || idx >= ARRAY_SIZE(scontext->ext_status)) + continue; + + if (scontext->ext_status[idx] != KVM_RISCV_SBI_EXT_STATUS_ENABLED || + !ext->reset) + continue; + + ext->reset(vcpu); + } +} + diff --git a/arch/riscv/kvm/vcpu_sbi_sta.c b/arch/riscv/kvm/vcpu_sbi_sta.c index 5f35427114c1..cc6cb7c8f0e4 100644 --- a/arch/riscv/kvm/vcpu_sbi_sta.c +++ b/arch/riscv/kvm/vcpu_sbi_sta.c @@ -16,7 +16,7 @@ #include #include -void kvm_riscv_vcpu_sbi_sta_reset(struct kvm_vcpu *vcpu) +static void kvm_riscv_vcpu_sbi_sta_reset(struct kvm_vcpu *vcpu) { vcpu->arch.sta.shmem = INVALID_GPA; vcpu->arch.sta.last_steal = 0; @@ -156,6 +156,7 @@ const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_sta = { .extid_end = SBI_EXT_STA, .handler = kvm_sbi_ext_sta_handler, .probe = kvm_sbi_ext_sta_probe, + .reset = kvm_riscv_vcpu_sbi_sta_reset, }; int kvm_riscv_vcpu_get_reg_sbi_sta(struct kvm_vcpu *vcpu, -- 2.47.2 From apatel at ventanamicro.com Tue Mar 18 23:34:55 2025 From: apatel at ventanamicro.com (Anup Patel) Date: Wed, 19 Mar 2025 12:04:55 +0530 Subject: [PATCH] RISC-V: KVM: Teardown riscv specific bits after kvm_exit In-Reply-To: 
<20250317-kvm_exit_fix-v1-1-aa5240c5dbd2@rivosinc.com>
References: <20250317-kvm_exit_fix-v1-1-aa5240c5dbd2@rivosinc.com>
Message-ID:

On Mon, Mar 17, 2025 at 1:11 PM Atish Patra wrote:
>
> During a module removal, kvm_exit invokes arch specific disable
> call which disables AIA. However, we invoke aia_exit before kvm_exit
> resulting in the following warning. KVM kernel module can't be inserted
> afterwards due to inconsistent state of IRQ.
>
> [25469.031389] percpu IRQ 31 still enabled on CPU0!
> [25469.031732] WARNING: CPU: 3 PID: 943 at kernel/irq/manage.c:2476 __free_percpu_irq+0xa2/0x150
> [25469.031804] Modules linked in: kvm(-)
> [25469.031848] CPU: 3 UID: 0 PID: 943 Comm: rmmod Not tainted 6.14.0-rc5-06947-g91c763118f47-dirty #2
> [25469.031905] Hardware name: riscv-virtio,qemu (DT)
> [25469.031928] epc : __free_percpu_irq+0xa2/0x150
> [25469.031976] ra : __free_percpu_irq+0xa2/0x150
> [25469.032197] epc : ffffffff8007db1e ra : ffffffff8007db1e sp : ff2000000088bd50
> [25469.032241] gp : ffffffff8131cef8 tp : ff60000080b96400 t0 : ff2000000088baf8
> [25469.032285] t1 : fffffffffffffffc t2 : 5249207570637265 s0 : ff2000000088bd90
> [25469.032329] s1 : ff60000098b21080 a0 : 037d527a15eb4f00 a1 : 037d527a15eb4f00
> [25469.032372] a2 : 0000000000000023 a3 : 0000000000000001 a4 : ffffffff8122dbf8
> [25469.032410] a5 : 0000000000000fff a6 : 0000000000000000 a7 : ffffffff8122dc10
> [25469.032448] s2 : ff60000080c22eb0 s3 : 0000000200000022 s4 : 000000000000001f
> [25469.032488] s5 : ff60000080c22e00 s6 : ffffffff80c351c0 s7 : 0000000000000000
> [25469.032582] s8 : 0000000000000003 s9 : 000055556b7fb490 s10: 00007ffff0e12fa0
> [25469.032621] s11: 00007ffff0e13e9a t3 : ffffffff81354ac7 t4 : ffffffff81354ac7
> [25469.032664] t5 : ffffffff81354ac8 t6 : ffffffff81354ac7
> [25469.032698] status: 0000000200000100 badaddr: ffffffff8007db1e cause: 0000000000000003
> [25469.032738] [] __free_percpu_irq+0xa2/0x150
> [25469.032797] [] free_percpu_irq+0x30/0x5e
>
[25469.032856] [] kvm_riscv_aia_exit+0x40/0x42 [kvm]
> [25469.033947] [] cleanup_module+0x10/0x32 [kvm]
> [25469.035300] [] __riscv_sys_delete_module+0x18e/0x1fc
> [25469.035374] [] syscall_handler+0x3a/0x46
> [25469.035456] [] do_trap_ecall_u+0x72/0x134
> [25469.035536] [] handle_exception+0x148/0x156
>
> Invoke aia_exit and other arch specific cleanup functions after kvm_exit
> so that disable gets a chance to be called first before exit.
>
> Fixes: 54e43320c2ba ("RISC-V: KVM: Initial skeletal support for AIA")

kvm_riscv_teardown() was introduced by a different commit, so we
should include a second Fixes tag:

Fixes: eded6754f398 ("riscv: KVM: add basic support for host vs guest profiling")

>
> Signed-off-by: Atish Patra

Otherwise, this looks good to me.

Reviewed-by: Anup Patel

Regards,
Anup

> ---
>  arch/riscv/kvm/main.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c
> index 1fa8be5ee509..4b24705dc63a 100644
> --- a/arch/riscv/kvm/main.c
> +++ b/arch/riscv/kvm/main.c
> @@ -172,8 +172,8 @@ module_init(riscv_kvm_init);
>
>  static void __exit riscv_kvm_exit(void)
>  {
> -	kvm_riscv_teardown();
> -
>  	kvm_exit();
> +
> +	kvm_riscv_teardown();
>  }
>  module_exit(riscv_kvm_exit);
>
> ---
> base-commit: 4701f33a10702d5fc577c32434eb62adde0a1ae1
> change-id: 20250316-kvm_exit_fix-77cd0632d740
> --
> Regards,
> Atish patra
>

From apatel at ventanamicro.com Wed Mar 19 08:37:01 2025
From: apatel at ventanamicro.com (Anup Patel)
Date: Wed, 19 Mar 2025 21:07:01 +0530
Subject: [PATCH] RISC-V: KVM: Teardown riscv specific bits after kvm_exit
In-Reply-To: <20250317-kvm_exit_fix-v1-1-aa5240c5dbd2@rivosinc.com>
References: <20250317-kvm_exit_fix-v1-1-aa5240c5dbd2@rivosinc.com>
Message-ID:

On Mon, Mar 17, 2025 at 1:11 PM Atish Patra wrote:
>
> During a module removal, kvm_exit invokes arch specific disable
> call which disables AIA.
However, we invoke aia_exit before kvm_exit
> resulting in the following warning. KVM kernel module can't be inserted
> afterwards due to inconsistent state of IRQ.
>
> [25469.031389] percpu IRQ 31 still enabled on CPU0!
> [25469.031732] WARNING: CPU: 3 PID: 943 at kernel/irq/manage.c:2476 __free_percpu_irq+0xa2/0x150
> [25469.031804] Modules linked in: kvm(-)
> [25469.031848] CPU: 3 UID: 0 PID: 943 Comm: rmmod Not tainted 6.14.0-rc5-06947-g91c763118f47-dirty #2
> [25469.031905] Hardware name: riscv-virtio,qemu (DT)
> [25469.031928] epc : __free_percpu_irq+0xa2/0x150
> [25469.031976] ra : __free_percpu_irq+0xa2/0x150
> [25469.032197] epc : ffffffff8007db1e ra : ffffffff8007db1e sp : ff2000000088bd50
> [25469.032241] gp : ffffffff8131cef8 tp : ff60000080b96400 t0 : ff2000000088baf8
> [25469.032285] t1 : fffffffffffffffc t2 : 5249207570637265 s0 : ff2000000088bd90
> [25469.032329] s1 : ff60000098b21080 a0 : 037d527a15eb4f00 a1 : 037d527a15eb4f00
> [25469.032372] a2 : 0000000000000023 a3 : 0000000000000001 a4 : ffffffff8122dbf8
> [25469.032410] a5 : 0000000000000fff a6 : 0000000000000000 a7 : ffffffff8122dc10
> [25469.032448] s2 : ff60000080c22eb0 s3 : 0000000200000022 s4 : 000000000000001f
> [25469.032488] s5 : ff60000080c22e00 s6 : ffffffff80c351c0 s7 : 0000000000000000
> [25469.032582] s8 : 0000000000000003 s9 : 000055556b7fb490 s10: 00007ffff0e12fa0
> [25469.032621] s11: 00007ffff0e13e9a t3 : ffffffff81354ac7 t4 : ffffffff81354ac7
> [25469.032664] t5 : ffffffff81354ac8 t6 : ffffffff81354ac7
> [25469.032698] status: 0000000200000100 badaddr: ffffffff8007db1e cause: 0000000000000003
> [25469.032738] [] __free_percpu_irq+0xa2/0x150
> [25469.032797] [] free_percpu_irq+0x30/0x5e
> [25469.032856] [] kvm_riscv_aia_exit+0x40/0x42 [kvm]
> [25469.033947] [] cleanup_module+0x10/0x32 [kvm]
> [25469.035300] [] __riscv_sys_delete_module+0x18e/0x1fc
> [25469.035374] [] syscall_handler+0x3a/0x46
> [25469.035456] [] do_trap_ecall_u+0x72/0x134
> [25469.035536] []
handle_exception+0x148/0x156
>
> Invoke aia_exit and other arch specific cleanup functions after kvm_exit
> so that disable gets a chance to be called first before exit.
>
> Fixes: 54e43320c2ba ("RISC-V: KVM: Initial skeletal support for AIA")
>
> Signed-off-by: Atish Patra

Queued this patch for Linux-6.15 since the merge window is near.

Thanks,
Anup

> ---
>  arch/riscv/kvm/main.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c
> index 1fa8be5ee509..4b24705dc63a 100644
> --- a/arch/riscv/kvm/main.c
> +++ b/arch/riscv/kvm/main.c
> @@ -172,8 +172,8 @@ module_init(riscv_kvm_init);
>
>  static void __exit riscv_kvm_exit(void)
>  {
> -	kvm_riscv_teardown();
> -
>  	kvm_exit();
> +
> +	kvm_riscv_teardown();
>  }
>  module_exit(riscv_kvm_exit);
>
> ---
> base-commit: 4701f33a10702d5fc577c32434eb62adde0a1ae1
> change-id: 20250316-kvm_exit_fix-77cd0632d740
> --
> Regards,
> Atish patra
>

From andrew.jones at linux.dev Wed Mar 19 10:31:56 2025
From: andrew.jones at linux.dev (Andrew Jones)
Date: Wed, 19 Mar 2025 18:31:56 +0100
Subject: [kvm-unit-tests PATCH v11 4/8] lib: riscv: Add functions for version checking
In-Reply-To: <20250317164655.1120015-5-cleger@rivosinc.com>
References: <20250317164655.1120015-1-cleger@rivosinc.com>
 <20250317164655.1120015-5-cleger@rivosinc.com>
Message-ID: <20250319-7ce04ed29661af987303b215@orel>

On Mon, Mar 17, 2025 at 05:46:49PM +0100, Clément Léger wrote:
> Version checking was done using some custom hardcoded values, backport a
> few SBI function and defines from Linux to do that cleanly.
>
> Signed-off-by: Clément Léger
> ---
>  lib/riscv/asm/sbi.h | 15 +++++++++++++++
>  lib/riscv/sbi.c     |  9 +++++++--
>  2 files changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h
> index 2f4d91ef..ee9d6e50 100644
> --- a/lib/riscv/asm/sbi.h
> +++ b/lib/riscv/asm/sbi.h
> @@ -18,6 +18,13 @@
>  #define SBI_ERR_IO -13
>  #define SBI_ERR_DENIED_LOCKED -14
>
> +/* SBI spec version fields */
> +#define SBI_SPEC_VERSION_MAJOR_SHIFT 24
> +#define SBI_SPEC_VERSION_MAJOR_MASK 0x7f
> +#define SBI_SPEC_VERSION_MINOR_MASK 0xffffff
> +#define SBI_SPEC_VERSION_MASK ((SBI_SPEC_VERSION_MAJOR_MASK << SBI_SPEC_VERSION_MAJOR_SHIFT) | \
> + SBI_SPEC_VERSION_MINOR_MASK)
    ^ needs one more space
> +
>  #ifndef __ASSEMBLER__
>  #include
>
> @@ -110,6 +117,13 @@ struct sbiret {
>  	long value;
>  };
>
> +/* Make SBI version */

Unnecessary comment, it's the same as the function name.

> +static inline unsigned long sbi_mk_version(unsigned long major, unsigned long minor)
> +{
> +	return ((major & SBI_SPEC_VERSION_MAJOR_MASK) << SBI_SPEC_VERSION_MAJOR_SHIFT)
> +		| (minor & SBI_SPEC_VERSION_MINOR_MASK);
> +}
> +
>  struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0,
>  			unsigned long arg1, unsigned long arg2,
>  			unsigned long arg3, unsigned long arg4,
> @@ -124,6 +138,7 @@ struct sbiret sbi_send_ipi_cpu(int cpu);
>  struct sbiret sbi_send_ipi_cpumask(const cpumask_t *mask);
>  struct sbiret sbi_send_ipi_broadcast(void);
>  struct sbiret sbi_set_timer(unsigned long stime_value);
> +struct sbiret sbi_get_spec_version(void);
>  long sbi_probe(int ext);
>
>  #endif /* !__ASSEMBLER__ */
> diff --git a/lib/riscv/sbi.c b/lib/riscv/sbi.c
> index 02dd338c..9d4eb541 100644
> --- a/lib/riscv/sbi.c
> +++ b/lib/riscv/sbi.c
> @@ -107,12 +107,17 @@ struct sbiret sbi_set_timer(unsigned long stime_value)
>  	return sbi_ecall(SBI_EXT_TIME, SBI_EXT_TIME_SET_TIMER, stime_value, 0, 0, 0, 0, 0);
>  }
>
> +struct sbiret sbi_get_spec_version(void)
> +{
> +	return
sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_SPEC_VERSION, 0, 0, 0, 0, 0, 0);
> +}
> +
>  long sbi_probe(int ext)
>  {
>  	struct sbiret ret;
>
> -	ret = sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_SPEC_VERSION, 0, 0, 0, 0, 0, 0);
> -	assert(!ret.error && (ret.value & 0x7ffffffful) >= 2);
> +	ret = sbi_get_spec_version();
> +	assert(!ret.error && (ret.value & SBI_SPEC_VERSION_MASK) >= sbi_mk_version(0, 2));
>
>  	ret = sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_PROBE_EXT, ext, 0, 0, 0, 0, 0);
>  	assert(!ret.error);
> --
> 2.47.2
>

I fixed those two things up while applying.

Thanks,
drew

From andrew.jones at linux.dev Wed Mar 19 10:36:06 2025
From: andrew.jones at linux.dev (Andrew Jones)
Date: Wed, 19 Mar 2025 18:36:06 +0100
Subject: [kvm-unit-tests PATCH v11 5/8] lib: riscv: Add functions to get implementer ID and version
In-Reply-To: <20250317164655.1120015-6-cleger@rivosinc.com>
References: <20250317164655.1120015-1-cleger@rivosinc.com>
 <20250317164655.1120015-6-cleger@rivosinc.com>
Message-ID: <20250319-093901c8531b82e99d02ea8e@orel>

On Mon, Mar 17, 2025 at 05:46:50PM +0100, Clément Léger wrote:
> These functions will be used by SSE tests to check for a specific OpenSBI
> version.
>
> Signed-off-by: Clément Léger
> ---
>  lib/riscv/asm/sbi.h | 20 ++++++++++++++++++++
>  lib/riscv/sbi.c     | 20 ++++++++++++++++++++
>  2 files changed, 40 insertions(+)
>
> diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h
> index ee9d6e50..90111628 100644
> --- a/lib/riscv/asm/sbi.h
> +++ b/lib/riscv/asm/sbi.h
> @@ -18,6 +18,19 @@
>  #define SBI_ERR_IO -13
>  #define SBI_ERR_DENIED_LOCKED -14
>
> +#define SBI_IMPL_BBL 0
> +#define SBI_IMPL_OPENSBI 1
> +#define SBI_IMPL_XVISOR 2
> +#define SBI_IMPL_KVM 3
> +#define SBI_IMPL_RUSTSBI 4
> +#define SBI_IMPL_DIOSIX 5
> +#define SBI_IMPL_COFFER 6
> +#define SBI_IMPL_XEN 7
> +#define SBI_IMPL_POLARFIRE_HSS 8
> +#define SBI_IMPL_COREBOOT 9
> +#define SBI_IMPL_OREBOOT 10
> +#define SBI_IMPL_BHYVE 11
> +
>  /* SBI spec version fields */
>  #define SBI_SPEC_VERSION_MAJOR_SHIFT 24
>  #define SBI_SPEC_VERSION_MAJOR_MASK 0x7f
> @@ -124,6 +137,11 @@ static inline unsigned long sbi_mk_version(unsigned long major, unsigned long mi
>  		| (minor & SBI_SPEC_VERSION_MINOR_MASK);
>  }
>
> +static inline unsigned long sbi_impl_opensbi_mk_version(unsigned long major, unsigned long minor)
> +{
> +	return (((major & 0xffff) << 16) | (minor & 0xffff));
> +}
> +
>  struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0,
>  			unsigned long arg1, unsigned long arg2,
>  			unsigned long arg3, unsigned long arg4,
> @@ -139,6 +157,8 @@ struct sbiret sbi_send_ipi_cpumask(const cpumask_t *mask);
>  struct sbiret sbi_send_ipi_broadcast(void);
>  struct sbiret sbi_set_timer(unsigned long stime_value);
>  struct sbiret sbi_get_spec_version(void);
> +unsigned long sbi_get_imp_version(void);
> +unsigned long sbi_get_imp_id(void);
>  long sbi_probe(int ext);
>
>  #endif /* !__ASSEMBLER__ */
> diff --git a/lib/riscv/sbi.c b/lib/riscv/sbi.c
> index 9d4eb541..ab032e3e 100644
> --- a/lib/riscv/sbi.c
> +++ b/lib/riscv/sbi.c
> @@ -107,6 +107,26 @@ struct sbiret sbi_set_timer(unsigned long stime_value)
>  	return sbi_ecall(SBI_EXT_TIME, SBI_EXT_TIME_SET_TIMER, stime_value, 0,
0, 0, 0, 0);
>  }
>
> +unsigned long sbi_get_imp_version(void)
> +{
> +	struct sbiret ret;
> +
> +	ret = sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_IMP_VERSION, 0, 0, 0, 0, 0, 0);
> +	assert(!ret.error);
> +
> +	return ret.value;
> +}
> +
> +unsigned long sbi_get_imp_id(void)
> +{
> +	struct sbiret ret;
> +
> +	ret = sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_IMP_ID, 0, 0, 0, 0, 0, 0);
> +	assert(!ret.error);
> +
> +	return ret.value;
> +}
> +
>  struct sbiret sbi_get_spec_version(void)
>  {
>  	return sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_SPEC_VERSION, 0, 0, 0, 0, 0, 0);
> --
> 2.47.2
>

LGTM

Thanks,
drew

From andrew.jones at linux.dev Wed Mar 19 11:01:24 2025
From: andrew.jones at linux.dev (Andrew Jones)
Date: Wed, 19 Mar 2025 19:01:24 +0100
Subject: [kvm-unit-tests PATCH v11 0/8] riscv: add SBI SSE extension tests
In-Reply-To: <20250317164655.1120015-1-cleger@rivosinc.com>
References: <20250317164655.1120015-1-cleger@rivosinc.com>
Message-ID: <20250319-ff9d4b4904195050638f77f1@orel>

Hi Clément,

I'd like to merge this, but we still have 50 failures with the latest
opensbi and over 30 with the opensbi QEMU provides. Testing with [1]
and bumping the timeout to 6000 allowed me to avoid failures, however
we can't count on that for CI.

[1] https://lists.infradead.org/pipermail/opensbi/2025-March/008190.html

I'm thinking about just doing the following. What do you think?

Thanks,
drew

diff --git a/riscv/sbi-sse.c b/riscv/sbi-sse.c
index a31c84c32303..fb4ee7dd44b2 100644
--- a/riscv/sbi-sse.c
+++ b/riscv/sbi-sse.c
@@ -1217,7 +1217,6 @@ void check_sse(void)
 {
 	struct sse_event_info *info;
 	unsigned long i, event_id;
-	bool sbi_skip_inject = false;
 	bool supported;
 
 	report_prefix_push("sse");
@@ -1228,6 +1227,13 @@ void check_sse(void)
 		return;
 	}
 
+	if (sbi_get_imp_id() == SBI_IMPL_OPENSBI &&
+	    sbi_get_imp_version() < sbi_impl_opensbi_mk_version(1, 7)) {
+		report_skip("OpenSBI < v1.7 detected, skipping tests");
+		report_prefix_pop();
+		return;
+	}
+
 	sse_check_mask();
 
 	/*
@@ -1237,18 +1243,6 @@ void check_sse(void)
 	 */
 	on_cpus(sse_secondary_boot_and_unmask, NULL);
 
-	/* Check for OpenSBI to support injection */
-	if (sbi_get_imp_id() == SBI_IMPL_OPENSBI) {
-		if (sbi_get_imp_version() < sbi_impl_opensbi_mk_version(1, 6)) {
-			/*
-			 * OpenSBI < v1.6 crashes kvm-unit-tests upon injection since injection
-			 * arguments (a6/a7) were reversed. Skip injection tests.
-			 */
-			report_skip("OpenSBI < v1.6 detected, skipping injection tests");
-			sbi_skip_inject = true;
-		}
-	}
-
 	sse_test_invalid_event_id();
 
 	for (i = 0; i < ARRAY_SIZE(sse_event_infos); i++) {
@@ -1265,14 +1259,12 @@ void check_sse(void)
 		sse_test_attrs(event_id);
 		sse_test_register_error(event_id);
 
-		if (!sbi_skip_inject)
-			run_inject_test(info);
+		run_inject_test(info);
 
 		report_prefix_pop();
 	}
 
-	if (!sbi_skip_inject)
-		sse_test_injection_priority();
+	sse_test_injection_priority();
 
 	report_prefix_pop();
 }

On Mon, Mar 17, 2025 at 05:46:45PM +0100, Clément Léger wrote:
> This series adds tests for SBI SSE extension as well as needed
> infrastructure for SSE support. It also adds test specific asm-offsets
> generation to use custom OFFSET and DEFINE from the test directory.
>
> These tests can be run using an OpenSBI version that implements latest
> specifications modification [1]
>
> Link: https://github.com/rivosinc/opensbi/tree/dev/cleger/sse [1]
>
> ---
>
> V11:
>  - Use mask inside sbi_impl_opensbi_mk_version()
>  - Mask the SBI version with a new mask
>  - Use assert inside sbi_get_impl_id/version()
>  - Remove sbi_check_impl()
>  - Increase completion timeout as events failed completing under 1000
>    micros when system is loaded.
>
> V10:
>  - Use && instead of || for timeout handling
>  - Add SBI patches which introduce function to get implementer ID and
>    version as well as implementer ID defines.
>  - Skip injection tests in OpenSBI < v1.6
>
> V9:
>  - Use __ASSEMBLER__ instead of __ASSEMBLY__
>  - Remove extra spaces
>  - Use assert to check global event in
>    sse_global_event_set_current_hart()
>  - Tabulate SSE events names table
>  - Use sbi_sse_register() instead of sbi_sse_register_raw() in error
>    testing
>  - Move a report_pass() out of error path
>  - Rework all injection tests with better error handling
>  - Use an env var for sse event completion timeout
>  - Add timeout for some potentially infinite while() loops
>
> V8:
>  - Short circuit current event tests if failure happens
>  - Remove SSE from all report strings
>  - Indent .prio field
>  - Add cpu_relax()/smp_rmb() where needed
>  - Add timeout for global event ENABLED state check
>  - Added BIT(32) aliases tests for attribute/event_id.
>
> V7:
>  - Test ids/attributes/attributes count > 32 bits
>  - Rename all SSE function to sbi_sse_*
>  - Use event_id instead of event/evt
>  - Factorize read/write test
>  - Use virt_to_phys() for attributes read/write.
>  - Extensively use sbiret_report_error()
>  - Change check function return values to bool.
>  - Added assert for stack size to be below or equal to PAGE_SIZE
>  - Use an env variable for the maximum hart ID
>  - Check that individual read from attributes matches the multiple
>    attributes read.
>  - Added multiple attributes write at once
>  - Used READ_ONCE/WRITE_ONCE
>  - Inject all local event at once rather than looping for each core.
>  - Split test_arg for local_dispatch test so that all CPUs can run at
>    once.
>  - Move SSE entry and generic code to lib/riscv for other tests
>  - Fix unmask/mask state checking
>
> V6:
>  - Add missing $(generated-file) dependencies for "-deps" objects
>  - Split SSE entry from sbi-asm.S to sse-asm.S and all SSE core functions
>    since it will be useful for other tests as well (dbltrp).
>
> V5:
>  - Update event ranges based on latest spec
>  - Rename asm-offset-test.c to sbi-asm-offset.c
>
> V4:
>  - Fix typo sbi_ext_ss_fid -> sbi_ext_sse_fid
>  - Add proper asm-offset generation for tests
>  - Move SSE specific file from lib/riscv to riscv/
>
> V3:
>  - Add -deps variable for test specific dependencies
>  - Fix formatting errors/typo in sbi.h
>  - Add missing double trap event
>  - Alphabetize sbi-sse.c includes
>  - Fix a6 content after unmasking event
>  - Add SSE HART_MASK/UNMASK test
>  - Use mv instead of move
>  - move sbi_check_sse() definition in sbi.c
>  - Remove sbi_sse test from unitests.cfg
>
> V2:
>  - Rebased on origin/master and integrate it into sbi.c tests
>
> Clément Léger (8):
>   kbuild: Allow multiple asm-offsets file to be generated
>   riscv: Set .aux.o files as .PRECIOUS
>   riscv: Use asm-offsets to generate SBI_EXT_HSM values
>   lib: riscv: Add functions for version checking
>   lib: riscv: Add functions to get implementer ID and version
>   riscv: lib: Add SBI SSE extension definitions
>   lib: riscv: Add SBI SSE support
>   riscv: sbi: Add SSE extension tests
>
>  scripts/asm-offsets.mak |   22 +-
>  riscv/Makefile          |    5 +-
>  lib/riscv/asm/csr.h     |    1 +
>  lib/riscv/asm/sbi.h     |  177 +++++-
>  lib/riscv/sbi-sse-asm.S |  102 ++++
>  lib/riscv/asm-offsets.c |    9 +
>  lib/riscv/sbi.c         |  105 +++-
>  riscv/sbi-tests.h       |    1 +
>  riscv/sbi-asm.S         |    6 +-
>  riscv/sbi-asm-offsets.c |   11 +
>  riscv/sbi-sse.c         | 1278 +++++++++++++++++++++++++++++++++++++++
>
riscv/sbi.c | 2 +
>  riscv/.gitignore | 1 +
>  13 files changed, 1707 insertions(+), 13 deletions(-)
>  create mode 100644 lib/riscv/sbi-sse-asm.S
>  create mode 100644 riscv/sbi-asm-offsets.c
>  create mode 100644 riscv/sbi-sse.c
>  create mode 100644 riscv/.gitignore
>
> --
> 2.47.2
>

From anup at brainfault.org Wed Mar 19 23:16:45 2025
From: anup at brainfault.org (Anup Patel)
Date: Thu, 20 Mar 2025 11:46:45 +0530
Subject: [PATCH v2] RISC-V: KVM: Optimize comments in kvm_riscv_vcpu_isa_disable_allowed
In-Reply-To: <20250221025929.31678-1-duchao@eswincomputing.com>
References: <20250221025929.31678-1-duchao@eswincomputing.com>
Message-ID:

On Fri, Feb 21, 2025 at 8:29 AM Chao Du wrote:
>
> The comments for EXT_SVADE are a bit confusing. Optimize it to make it
> more clear.
>
> Signed-off-by: Chao Du

Queued this patch for Linux-6.15.

I have taken care of Drew's comment.

Regards,
Anup

> ---
>  arch/riscv/kvm/vcpu_onereg.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/riscv/kvm/vcpu_onereg.c b/arch/riscv/kvm/vcpu_onereg.c
> index f6d27b59c641..43ee8e33ba23 100644
> --- a/arch/riscv/kvm/vcpu_onereg.c
> +++ b/arch/riscv/kvm/vcpu_onereg.c
> @@ -203,7 +203,7 @@ static bool kvm_riscv_vcpu_isa_disable_allowed(unsigned long ext)
>  	case KVM_RISCV_ISA_EXT_SVADE:
>  		/*
>  		 * The henvcfg.ADUE is read-only zero if menvcfg.ADUE is zero.
> -		 * Svade is not allowed to disable when the platform use Svade.
> +		 * Svade can't be disabled unless we support Svadu.
>  		 */
>  		return arch_has_hw_pte_young();
>  	default:
> --
> 2.34.1
>

From cleger at rivosinc.com Thu Mar 20 01:08:45 2025
From: cleger at rivosinc.com (Clément Léger)
Date: Thu, 20 Mar 2025 09:08:45 +0100
Subject: [kvm-unit-tests PATCH v11 4/8] lib: riscv: Add functions for version checking
In-Reply-To: <20250319-7ce04ed29661af987303b215@orel>
References: <20250317164655.1120015-1-cleger@rivosinc.com>
 <20250317164655.1120015-5-cleger@rivosinc.com>
 <20250319-7ce04ed29661af987303b215@orel>
Message-ID: <56d35d26-05f7-4c5a-a987-a5cafb39a8ef@rivosinc.com>

On 19/03/2025 18:31, Andrew Jones wrote:
> On Mon, Mar 17, 2025 at 05:46:49PM +0100, Clément Léger wrote:
>> Version checking was done using some custom hardcoded values, backport a
>> few SBI function and defines from Linux to do that cleanly.
>>
>> Signed-off-by: Clément Léger
>> ---
>>  lib/riscv/asm/sbi.h | 15 +++++++++++++++
>>  lib/riscv/sbi.c     |  9 +++++++--
>>  2 files changed, 22 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h
>> index 2f4d91ef..ee9d6e50 100644
>> --- a/lib/riscv/asm/sbi.h
>> +++ b/lib/riscv/asm/sbi.h
>> @@ -18,6 +18,13 @@
>>  #define SBI_ERR_IO -13
>>  #define SBI_ERR_DENIED_LOCKED -14
>>
>> +/* SBI spec version fields */
>> +#define SBI_SPEC_VERSION_MAJOR_SHIFT 24
>> +#define SBI_SPEC_VERSION_MAJOR_MASK 0x7f
>> +#define SBI_SPEC_VERSION_MINOR_MASK 0xffffff
>> +#define SBI_SPEC_VERSION_MASK ((SBI_SPEC_VERSION_MAJOR_MASK << SBI_SPEC_VERSION_MAJOR_SHIFT) | \
>> + SBI_SPEC_VERSION_MINOR_MASK)
> ^ needs one more space
>> +
>>  #ifndef __ASSEMBLER__
>>  #include
>>
>> @@ -110,6 +117,13 @@ struct sbiret {
>>  	long value;
>>  };
>>
>> +/* Make SBI version */
>
> Unnecessary comment, it's the same as the function name.

Yeah that's a copy/paste from Linux, didn't cleanup.

>
>> +static inline unsigned long sbi_mk_version(unsigned long major, unsigned long minor)
>> +{
>> +	return ((major & SBI_SPEC_VERSION_MAJOR_MASK) << SBI_SPEC_VERSION_MAJOR_SHIFT)
>> +		| (minor & SBI_SPEC_VERSION_MINOR_MASK);
>> +}
>> +
>>  struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0,
>>  			unsigned long arg1, unsigned long arg2,
>>  			unsigned long arg3, unsigned long arg4,
>> @@ -124,6 +138,7 @@ struct sbiret sbi_send_ipi_cpu(int cpu);
>>  struct sbiret sbi_send_ipi_cpumask(const cpumask_t *mask);
>>  struct sbiret sbi_send_ipi_broadcast(void);
>>  struct sbiret sbi_set_timer(unsigned long stime_value);
>> +struct sbiret sbi_get_spec_version(void);
>>  long sbi_probe(int ext);
>>
>>  #endif /* !__ASSEMBLER__ */
>> diff --git a/lib/riscv/sbi.c b/lib/riscv/sbi.c
>> index 02dd338c..9d4eb541 100644
>> --- a/lib/riscv/sbi.c
>> +++ b/lib/riscv/sbi.c
>> @@ -107,12 +107,17 @@ struct sbiret sbi_set_timer(unsigned long stime_value)
>>  	return sbi_ecall(SBI_EXT_TIME, SBI_EXT_TIME_SET_TIMER, stime_value, 0, 0, 0, 0, 0);
>>  }
>>
>> +struct sbiret sbi_get_spec_version(void)
>> +{
>> +	return sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_SPEC_VERSION, 0, 0, 0, 0, 0, 0);
>> +}
>> +
>>  long sbi_probe(int ext)
>>  {
>>  	struct sbiret ret;
>>
>> -	ret = sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_SPEC_VERSION, 0, 0, 0, 0, 0, 0);
>> -	assert(!ret.error && (ret.value & 0x7ffffffful) >= 2);
>> +	ret = sbi_get_spec_version();
>> +	assert(!ret.error && (ret.value & SBI_SPEC_VERSION_MASK) >= sbi_mk_version(0, 2));
>>
>>  	ret = sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_PROBE_EXT, ext, 0, 0, 0, 0, 0);
>>  	assert(!ret.error);
>> --
>> 2.47.2
>>
>
> I fixed those two things up while applying.

Thanks Andrew,

Clément

>
> Thanks,
> drew

From cleger at rivosinc.com Thu Mar 20 01:13:25 2025
From: cleger at rivosinc.com (Clément Léger)
Date: Thu, 20 Mar 2025 09:13:25 +0100
Subject: [kvm-unit-tests PATCH v11 0/8] riscv: add SBI SSE extension tests
In-Reply-To: <20250319-ff9d4b4904195050638f77f1@orel>
References: <20250317164655.1120015-1-cleger@rivosinc.com>
 <20250319-ff9d4b4904195050638f77f1@orel>
Message-ID:

On 19/03/2025 19:01, Andrew Jones wrote:
> Hi Clément,
>
> I'd like to merge this, but we still have 50 failures with the latest
> opensbi and over 30 with the opensbi QEMU provides. Testing with [1]
> and bumping the timeout to 6000 allowed me to avoid failures, however
> we can't count on that for CI.
>
> [1] https://lists.infradead.org/pipermail/opensbi/2025-March/008190.html
>
> I'm thinking about just doing the following. What do you think?

Hi Andrew,

Since the spec isn't even ratified, does it even make sense to test a
version that does not implement the ratified SBI V3.0 spec? IOW, should
we simply test for the SBI version to be >= 3.0? This would be correct
for all SBI implementations.

However, if you want this series to be integrated ASAP, I'm OK with your
patch; it seems unlikely that the SSE specification will change in the
upcoming weeks.

Reviewed-by: Clément Léger

Thanks,

Clément

>
> Thanks,
> drew
>
> diff --git a/riscv/sbi-sse.c b/riscv/sbi-sse.c
> index a31c84c32303..fb4ee7dd44b2 100644
> --- a/riscv/sbi-sse.c
> +++ b/riscv/sbi-sse.c
> @@ -1217,7 +1217,6 @@ void check_sse(void)
>  {
>  	struct sse_event_info *info;
>  	unsigned long i, event_id;
> -	bool sbi_skip_inject = false;
>  	bool supported;
>
>  	report_prefix_push("sse");
> @@ -1228,6 +1227,13 @@ void check_sse(void)
>  		return;
>  	}
>
> +	if (sbi_get_imp_id() == SBI_IMPL_OPENSBI &&
> +	    sbi_get_imp_version() < sbi_impl_opensbi_mk_version(1, 7)) {
> +		report_skip("OpenSBI < v1.7 detected, skipping tests");
> +		report_prefix_pop();
> +		return;
> +	}
> +
>  	sse_check_mask();
>
>  	/*
> @@ -1237,18 +1243,6 @@ void check_sse(void)
>  	 */
>  	on_cpus(sse_secondary_boot_and_unmask, NULL);
>
> -	/* Check for OpenSBI to support injection */
> -	if (sbi_get_imp_id() == SBI_IMPL_OPENSBI) {
> -		if (sbi_get_imp_version() < sbi_impl_opensbi_mk_version(1, 6)) {
> -			/*
> -			 * OpenSBI < v1.6 crashes kvm-unit-tests upon injection since injection
> -			 * arguments (a6/a7) were reversed. Skip injection tests.
> -			 */
> -			report_skip("OpenSBI < v1.6 detected, skipping injection tests");
> -			sbi_skip_inject = true;
> -		}
> -	}
> -
>  	sse_test_invalid_event_id();
>
>  	for (i = 0; i < ARRAY_SIZE(sse_event_infos); i++) {
> @@ -1265,14 +1259,12 @@ void check_sse(void)
>  		sse_test_attrs(event_id);
>  		sse_test_register_error(event_id);
>
> -		if (!sbi_skip_inject)
> -			run_inject_test(info);
> +		run_inject_test(info);
>
>  		report_prefix_pop();
>  	}
>
> -	if (!sbi_skip_inject)
> -		sse_test_injection_priority();
> +	sse_test_injection_priority();
>
>  	report_prefix_pop();
>  }
>
> On Mon, Mar 17, 2025 at 05:46:45PM +0100, Clément Léger wrote:
>> This series adds tests for SBI SSE extension as well as needed
>> infrastructure for SSE support. It also adds test specific asm-offsets
>> generation to use custom OFFSET and DEFINE from the test directory.
>>
>> These tests can be run using an OpenSBI version that implements latest
>> specifications modification [1]
>>
>> Link: https://github.com/rivosinc/opensbi/tree/dev/cleger/sse [1]
>>
>> ---
>>
>> V11:
>>  - Use mask inside sbi_impl_opensbi_mk_version()
>>  - Mask the SBI version with a new mask
>>  - Use assert inside sbi_get_impl_id/version()
>>  - Remove sbi_check_impl()
>>  - Increase completion timeout as events failed completing under 1000
>>    micros when system is loaded.
>>
>> V10:
>>  - Use && instead of || for timeout handling
>>  - Add SBI patches which introduce function to get implementer ID and
>>    version as well as implementer ID defines.
>>  - Skip injection tests in OpenSBI < v1.6
>>
>> V9:
>>  - Use __ASSEMBLER__ instead of __ASSEMBLY__
>>  - Remove extra spaces
>>  - Use assert to check global event in
>>    sse_global_event_set_current_hart()
>>  - Tabulate SSE events names table
>>  - Use sbi_sse_register() instead of sbi_sse_register_raw() in error
>>    testing
>>  - Move a report_pass() out of error path
>>  - Rework all injection tests with better error handling
>>  - Use an env var for sse event completion timeout
>>  - Add timeout for some potentially infinite while() loops
>>
>> V8:
>>  - Short circuit current event tests if failure happens
>>  - Remove SSE from all report strings
>>  - Indent .prio field
>>  - Add cpu_relax()/smp_rmb() where needed
>>  - Add timeout for global event ENABLED state check
>>  - Added BIT(32) aliases tests for attribute/event_id.
>>
>> V7:
>>  - Test ids/attributes/attributes count > 32 bits
>>  - Rename all SSE function to sbi_sse_*
>>  - Use event_id instead of event/evt
>>  - Factorize read/write test
>>  - Use virt_to_phys() for attributes read/write.
>>  - Extensively use sbiret_report_error()
>>  - Change check function return values to bool.
>>  - Added assert for stack size to be below or equal to PAGE_SIZE
>>  - Use an env variable for the maximum hart ID
>>  - Check that individual read from attributes matches the multiple
>>    attributes read.
>>  - Added multiple attributes write at once
>>  - Used READ_ONCE/WRITE_ONCE
>>  - Inject all local event at once rather than looping for each core.
>>  - Split test_arg for local_dispatch test so that all CPUs can run at
>>    once.
>>  - Move SSE entry and generic code to lib/riscv for other tests
>>  - Fix unmask/mask state checking
>>
>> V6:
>>  - Add missing $(generated-file) dependencies for "-deps" objects
>>  - Split SSE entry from sbi-asm.S to sse-asm.S and all SSE core functions
>>    since it will be useful for other tests as well (dbltrp).
>>
>> V5:
>>  - Update event ranges based on latest spec
>>  - Rename asm-offset-test.c to sbi-asm-offset.c
>>
>> V4:
>>  - Fix typo sbi_ext_ss_fid -> sbi_ext_sse_fid
>>  - Add proper asm-offset generation for tests
>>  - Move SSE specific file from lib/riscv to riscv/
>>
>> V3:
>>  - Add -deps variable for test specific dependencies
>>  - Fix formatting errors/typo in sbi.h
>>  - Add missing double trap event
>>  - Alphabetize sbi-sse.c includes
>>  - Fix a6 content after unmasking event
>>  - Add SSE HART_MASK/UNMASK test
>>  - Use mv instead of move
>>  - move sbi_check_sse() definition in sbi.c
>>  - Remove sbi_sse test from unitests.cfg
>>
>> V2:
>>  - Rebased on origin/master and integrate it into sbi.c tests
>>
>> Clément Léger (8):
>>   kbuild: Allow multiple asm-offsets file to be generated
>>   riscv: Set .aux.o files as .PRECIOUS
>>   riscv: Use asm-offsets to generate SBI_EXT_HSM values
>>   lib: riscv: Add functions for version checking
>>   lib: riscv: Add functions to get implementer ID and version
>>   riscv: lib: Add SBI SSE extension definitions
>>   lib: riscv: Add SBI SSE support
>>   riscv: sbi: Add SSE extension tests
>>
>>  scripts/asm-offsets.mak |   22 +-
>>  riscv/Makefile          |    5 +-
>>  lib/riscv/asm/csr.h     |    1 +
>>  lib/riscv/asm/sbi.h     |  177
+++++-
>>  lib/riscv/sbi-sse-asm.S |  102 ++++
>>  lib/riscv/asm-offsets.c |    9 +
>>  lib/riscv/sbi.c         |  105 +++-
>>  riscv/sbi-tests.h       |    1 +
>>  riscv/sbi-asm.S         |    6 +-
>>  riscv/sbi-asm-offsets.c |   11 +
>>  riscv/sbi-sse.c         | 1278 +++++++++++++++++++++++++++++++++++++++
>>  riscv/sbi.c             |    2 +
>>  riscv/.gitignore        |    1 +
>>  13 files changed, 1707 insertions(+), 13 deletions(-)
>>  create mode 100644 lib/riscv/sbi-sse-asm.S
>>  create mode 100644 riscv/sbi-asm-offsets.c
>>  create mode 100644 riscv/sbi-sse.c
>>  create mode 100644 riscv/.gitignore
>>
>> --
>> 2.47.2
>>

From andrew.jones at linux.dev Thu Mar 20 06:36:26 2025
From: andrew.jones at linux.dev (Andrew Jones)
Date: Thu, 20 Mar 2025 14:36:26 +0100
Subject: [kvm-unit-tests PATCH v11 3/8] riscv: Use asm-offsets to generate SBI_EXT_HSM values
In-Reply-To: <20250317164655.1120015-4-cleger@rivosinc.com>
References: <20250317164655.1120015-1-cleger@rivosinc.com>
 <20250317164655.1120015-4-cleger@rivosinc.com>
Message-ID: <20250320-3db2b32593ec175998ef03b8@orel>

On Mon, Mar 17, 2025 at 05:46:48PM +0100, Clément Léger wrote:
> Replace hardcoded values with generated ones using sbi-asm-offset. This
> allows to directly use ASM_SBI_EXT_HSM and ASM_SBI_EXT_HSM_STOP in
> assembly.
> > Signed-off-by: Cl?ment L?ger > Reviewed-by: Andrew Jones > --- > riscv/Makefile | 2 +- > riscv/sbi-asm.S | 6 ++++-- > riscv/sbi-asm-offsets.c | 11 +++++++++++ > riscv/.gitignore | 1 + > 4 files changed, 17 insertions(+), 3 deletions(-) > create mode 100644 riscv/sbi-asm-offsets.c > create mode 100644 riscv/.gitignore > > diff --git a/riscv/Makefile b/riscv/Makefile > index ae9cf02a..02d2ac39 100644 > --- a/riscv/Makefile > +++ b/riscv/Makefile > @@ -87,7 +87,7 @@ CFLAGS += -ffreestanding > CFLAGS += -O2 > CFLAGS += -I $(SRCDIR)/lib -I $(SRCDIR)/lib/libfdt -I lib -I $(SRCDIR)/riscv > > -asm-offsets = lib/riscv/asm-offsets.h > +asm-offsets = lib/riscv/asm-offsets.h riscv/sbi-asm-offsets.h > include $(SRCDIR)/scripts/asm-offsets.mak > > .PRECIOUS: %.aux.o > diff --git a/riscv/sbi-asm.S b/riscv/sbi-asm.S > index f4185496..51f46efd 100644 > --- a/riscv/sbi-asm.S > +++ b/riscv/sbi-asm.S > @@ -6,6 +6,8 @@ > */ > #include > #include > +#include > +#include Doing this causes sbi-asm.o to depend on generated files, so I had to add the change below to the Makefile in order for 'make -j' to work. Thanks, drew diff --git a/riscv/Makefile b/riscv/Makefile index 02d2ac392f11..e4dba6772377 100644 --- a/riscv/Makefile +++ b/riscv/Makefile @@ -19,6 +19,8 @@ all: $(tests) $(TEST_DIR)/sbi-deps = $(TEST_DIR)/sbi-asm.o $(TEST_DIR)/sbi-fwft.o +all_deps += $($(TEST_DIR)/sbi-deps) + # When built for EFI sieve needs extra memory, run with e.g. 
'-m 256' on QEMU $(TEST_DIR)/sieve.$(exe): AUXFLAGS = 0x1 @@ -134,7 +136,7 @@ else endif generated-files = $(asm-offsets) -$(tests:.$(exe)=.o) $(cstart.o) $(cflatobjs): $(generated-files) +$(tests:.$(exe)=.o) $(cstart.o) $(cflatobjs) $(all_deps): $(generated-files) arch_clean: asm_offsets_clean $(RM) $(TEST_DIR)/*.{o,flat,elf,so,efi,debug} \ > > #include "sbi-tests.h" > > @@ -57,8 +59,8 @@ sbi_hsm_check: > 7: lb t0, 0(t1) > pause > beqz t0, 7b > - li a7, 0x48534d /* SBI_EXT_HSM */ > - li a6, 1 /* SBI_EXT_HSM_HART_STOP */ > + li a7, ASM_SBI_EXT_HSM > + li a6, ASM_SBI_EXT_HSM_HART_STOP > ecall > 8: pause > j 8b > diff --git a/riscv/sbi-asm-offsets.c b/riscv/sbi-asm-offsets.c > new file mode 100644 > index 00000000..bd37b6a2 > --- /dev/null > +++ b/riscv/sbi-asm-offsets.c > @@ -0,0 +1,11 @@ > +// SPDX-License-Identifier: GPL-2.0-only > +#include > +#include > + > +int main(void) > +{ > + DEFINE(ASM_SBI_EXT_HSM, SBI_EXT_HSM); > + DEFINE(ASM_SBI_EXT_HSM_HART_STOP, SBI_EXT_HSM_HART_STOP); > + > + return 0; > +} > diff --git a/riscv/.gitignore b/riscv/.gitignore > new file mode 100644 > index 00000000..0a8c5a36 > --- /dev/null > +++ b/riscv/.gitignore > @@ -0,0 +1 @@ > +/*-asm-offsets.[hs] > -- > 2.47.2 > From andrew.jones at linux.dev Thu Mar 20 06:40:16 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Thu, 20 Mar 2025 14:40:16 +0100 Subject: [kvm-unit-tests PATCH v11 8/8] riscv: sbi: Add SSE extension tests In-Reply-To: <20250317164655.1120015-9-cleger@rivosinc.com> References: <20250317164655.1120015-1-cleger@rivosinc.com> <20250317164655.1120015-9-cleger@rivosinc.com> Message-ID: <20250320-ab264d08531d4ff10b874485@orel> On Mon, Mar 17, 2025 at 05:46:53PM +0100, Cl?ment L?ger wrote: ... 
> + /* Test all flags allowed for SBI_SSE_ATTR_INTERRUPTED_FLAGS */ > + attr = SBI_SSE_ATTR_INTERRUPTED_FLAGS; > + ret = sbi_sse_read_attrs(event_id, attr, 1, &prev_value); > + sbiret_report_error(&ret, SBI_SUCCESS, "Save interrupted flags"); > + > + for (i = 0; i < ARRAY_SIZE(interrupted_flags); i++) { > + flags = interrupted_flags[i]; > + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); > + sbiret_report_error(&ret, SBI_SUCCESS, > + "Set interrupted flags bit 0x%lx value", flags); > + ret = sbi_sse_read_attrs(event_id, attr, 1, &value); > + sbiret_report_error(&ret, SBI_SUCCESS, "Get interrupted flags after set"); > + report(value == flags, "interrupted flags modified value: 0x%lx", value); > + } > + > + /* Write invalid bit in flag register */ > + flags = SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT << 1; > + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); > + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", > + flags); > + > + flags = BIT(SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT + 1); This broke compiling for rv32, but just changing it to BIT_ULL wouldn't be right either since SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT is a BIT(). I think the test just above this one is what this test was aiming to do, making it redundant. Instead of removing it, I changed it as below to get more coverage, at least on rv64. 
Thanks, drew diff --git a/riscv/sbi-sse.c b/riscv/sbi-sse.c index fb4ee7dd44b2..f9e389728616 100644 --- a/riscv/sbi-sse.c +++ b/riscv/sbi-sse.c @@ -495,10 +495,12 @@ static void sse_simple_handler(void *data, struct pt_regs *regs, unsigned int ha sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", flags); - flags = BIT(SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT + 1); +#if __riscv_xlen > 32 + flags = BIT(32); ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", flags); +#endif ret = sbi_sse_write_attrs(event_id, attr, 1, &prev_value); sbiret_report_error(&ret, SBI_SUCCESS, "Restore interrupted flags"); From alexandru.elisei at arm.com Thu Mar 20 06:42:01 2025 From: alexandru.elisei at arm.com (Alexandru Elisei) Date: Thu, 20 Mar 2025 13:42:01 +0000 Subject: [kvm-unit-tests PATCH v2 4/5] configure: Add --qemu-cpu option In-Reply-To: <20250314154904.3946484-6-jean-philippe@linaro.org> References: <20250314154904.3946484-2-jean-philippe@linaro.org> <20250314154904.3946484-6-jean-philippe@linaro.org> Message-ID: Hi Jean-Philippe, Thank you for picking up this work, much appreciated. I like this approach a lot: the value for -cpu, if the user hasn't specified one with --qemu-cpu, depends on the qemu accelerator, which is either in the test specification, or an environment variable, making it unknown at configure time. So it makes more sense to have arm/run choose the correct -cpu value at runtime, than having configure put a default value for QEMU_CPU in config.mak. On Fri, Mar 14, 2025 at 03:49:04PM +0000, Jean-Philippe Brucker wrote: > Add the --qemu-cpu option to let users set the CPU type to run on. > At the moment --processor allows to set both GCC -mcpu flag and QEMU > -cpu. 
On Arm we'd like to pass `-cpu max` to QEMU in order to enable all > the TCG features by default, and it could also be nice to let users > modify the CPU capabilities by setting extra -cpu options. > Since GCC -mcpu doesn't accept "max" or "host", separate the compiler > and QEMU arguments. > > `--processor` is now exclusively for compiler options, as indicated by > its documentation ("processor to compile for"). So use $QEMU_CPU on > RISC-V as well. > > Suggested-by: Andrew Jones > Signed-off-by: Jean-Philippe Brucker > --- > scripts/mkstandalone.sh | 2 +- > arm/run | 17 +++++++++++------ > riscv/run | 8 ++++---- > configure | 7 +++++++ > 4 files changed, 23 insertions(+), 11 deletions(-) > > diff --git a/scripts/mkstandalone.sh b/scripts/mkstandalone.sh > index 2318a85f..6b5f725d 100755 > --- a/scripts/mkstandalone.sh > +++ b/scripts/mkstandalone.sh > @@ -42,7 +42,7 @@ generate_test () > > config_export ARCH > config_export ARCH_NAME > - config_export PROCESSOR > + config_export QEMU_CPU > > echo "echo BUILD_HEAD=$(cat build-head)" > > diff --git a/arm/run b/arm/run > index efdd44ce..561bafab 100755 > --- a/arm/run > +++ b/arm/run > @@ -8,7 +8,7 @@ if [ -z "$KUT_STANDALONE" ]; then > source config.mak > source scripts/arch-run.bash > fi > -processor="$PROCESSOR" > +qemu_cpu="$QEMU_CPU" > > if [ "$QEMU" ] && [ -z "$ACCEL" ] && > [ "$HOST" = "aarch64" ] && [ "$ARCH" = "arm" ] && > @@ -37,12 +37,17 @@ if [ "$ACCEL" = "kvm" ]; then > fi > fi > > -if [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ]; then > - if [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ]; then > - processor="host" > +if [ -z "$qemu_cpu" ]; then > + if ( [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ] ) && > + ( [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ] ); then > + qemu_cpu="host" > if [ "$ARCH" = "arm" ] && [ "$HOST" = "aarch64" ]; then > - processor+=",aarch64=off" > + qemu_cpu+=",aarch64=off" > fi > + elif [ "$ARCH" = "arm64" ]; then > + qemu_cpu="cortex-a57" > + else > + qemu_cpu="cortex-a15" > fi 
> fi > > @@ -71,7 +76,7 @@ if $qemu $M -device '?' | grep -q pci-testdev; then > fi > > A="-accel $ACCEL$ACCEL_PROPS" > -command="$qemu -nodefaults $M $A -cpu $processor $chr_testdev $pci_testdev" > +command="$qemu -nodefaults $M $A -cpu $qemu_cpu $chr_testdev $pci_testdev" > command+=" -display none -serial stdio" > command="$(migration_cmd) $(timeout_cmd) $command" > > diff --git a/riscv/run b/riscv/run > index e2f5a922..02fcf0c0 100755 > --- a/riscv/run > +++ b/riscv/run > @@ -11,12 +11,12 @@ fi > > # Allow user overrides of some config.mak variables > mach=$MACHINE_OVERRIDE > -processor=$PROCESSOR_OVERRIDE > +qemu_cpu=$QEMU_CPU_OVERRIDE The name QEMU_CPU_OVERRIDE makes more sense, but I think this will break existing setups where people use PROCESSOR_OVERRIDE to pass a particular -cpu value to qemu. If I were to guess, the environment variable was added to pass a value to -cpu different than what it can be specified with ./configure --processor, which is exactly what --qemu-cpu does. But I'll let Drew comment on that. > firmware=$FIRMWARE_OVERRIDE > > -[ "$PROCESSOR" = "$ARCH" ] && PROCESSOR="max" > +[ -z "$QEMU_CPU" ] && QEMU_CPU="max" > : "${mach:=virt}" > -: "${processor:=$PROCESSOR}" > +: "${qemu_cpu:=$QEMU_CPU}" > : "${firmware:=$FIRMWARE}" > [ "$firmware" ] && firmware="-bios $firmware" > > @@ -32,7 +32,7 @@ fi > mach="-machine $mach" > > command="$qemu -nodefaults -nographic -serial mon:stdio" > -command+=" $mach $acc $firmware -cpu $processor " > +command+=" $mach $acc $firmware -cpu $qemu_cpu " > command="$(migration_cmd) $(timeout_cmd) $command" > > if [ "$UEFI_SHELL_RUN" = "y" ]; then > diff --git a/configure b/configure > index 5306bad3..d25bd23e 100755 > --- a/configure > +++ b/configure > @@ -52,6 +52,7 @@ page_size= > earlycon= > efi= > efi_direct= > +qemu_cpu= > > # Enable -Werror by default for git repositories only (i.e. 
developer builds) > if [ -e "$srcdir"/.git ]; then > @@ -69,6 +70,8 @@ usage() { > --arch=ARCH architecture to compile for ($arch). ARCH can be one of: > arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 > --processor=PROCESSOR processor to compile for ($processor) > + --qemu-cpu=CPU the CPU model to run on. The default depends on > + the configuration, usually it is "host" or "max". Nitpick here, would you mind changing this to "The default value depends on the host system and the test configuration [..]", to make it clear that it also depends on the machine the tests are being run on? Thanks, Alex > --target=TARGET target platform that the tests will be running on (qemu or > kvmtool, default is qemu) (arm/arm64 only) > --cross-prefix=PREFIX cross compiler prefix > @@ -142,6 +145,9 @@ while [[ $optno -le $argc ]]; do > --processor) > processor="$arg" > ;; > + --qemu-cpu) > + qemu_cpu="$arg" > + ;; > --target) > target="$arg" > ;; > @@ -464,6 +470,7 @@ ARCH=$arch > ARCH_NAME=$arch_name > ARCH_LIBDIR=$arch_libdir > PROCESSOR=$processor > +QEMU_CPU=$qemu_cpu > CC=$cc > CFLAGS=$cflags > LD=$cross_prefix$ld > -- > 2.48.1 > From cleger at rivosinc.com Thu Mar 20 06:44:30 2025 From: cleger at rivosinc.com (=?UTF-8?B?Q2zDqW1lbnQgTMOpZ2Vy?=) Date: Thu, 20 Mar 2025 14:44:30 +0100 Subject: [kvm-unit-tests PATCH v11 8/8] riscv: sbi: Add SSE extension tests In-Reply-To: <20250320-ab264d08531d4ff10b874485@orel> References: <20250317164655.1120015-1-cleger@rivosinc.com> <20250317164655.1120015-9-cleger@rivosinc.com> <20250320-ab264d08531d4ff10b874485@orel> Message-ID: <0e6de092-4dde-4a50-9373-a8e5b37a8c1e@rivosinc.com> On 20/03/2025 14:40, Andrew Jones wrote: > On Mon, Mar 17, 2025 at 05:46:53PM +0100, Cl?ment L?ger wrote: > ... 
>> + /* Test all flags allowed for SBI_SSE_ATTR_INTERRUPTED_FLAGS */ >> + attr = SBI_SSE_ATTR_INTERRUPTED_FLAGS; >> + ret = sbi_sse_read_attrs(event_id, attr, 1, &prev_value); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Save interrupted flags"); >> + >> + for (i = 0; i < ARRAY_SIZE(interrupted_flags); i++) { >> + flags = interrupted_flags[i]; >> + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); >> + sbiret_report_error(&ret, SBI_SUCCESS, >> + "Set interrupted flags bit 0x%lx value", flags); >> + ret = sbi_sse_read_attrs(event_id, attr, 1, &value); >> + sbiret_report_error(&ret, SBI_SUCCESS, "Get interrupted flags after set"); >> + report(value == flags, "interrupted flags modified value: 0x%lx", value); >> + } >> + >> + /* Write invalid bit in flag register */ >> + flags = SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT << 1; >> + ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); >> + sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", >> + flags); >> + >> + flags = BIT(SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT + 1); > > This broke compiling for rv32, but just changing it to BIT_ULL wouldn't be > right either since SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT is a BIT(). > I think the test just above this one is what this test was aiming to do, > making it redundant. Instead of removing it, I changed it as below to get > more coverage, at least on rv64. Hum yeah indeed, that wasn't what was intended to be done. You fix is correct. 
Thanks, Clément > > Thanks, > drew > > diff --git a/riscv/sbi-sse.c b/riscv/sbi-sse.c > index fb4ee7dd44b2..f9e389728616 100644 > --- a/riscv/sbi-sse.c > +++ b/riscv/sbi-sse.c > @@ -495,10 +495,12 @@ static void sse_simple_handler(void *data, struct pt_regs *regs, unsigned int ha > sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", > flags); > > - flags = BIT(SBI_SSE_ATTR_INTERRUPTED_FLAGS_SSTATUS_SDT + 1); > +#if __riscv_xlen > 32 > + flags = BIT(32); > ret = sbi_sse_write_attrs(event_id, attr, 1, &flags); > sbiret_report_error(&ret, SBI_ERR_INVALID_PARAM, "Set invalid flags bit 0x%lx value error", > flags); > +#endif > > ret = sbi_sse_write_attrs(event_id, attr, 1, &prev_value); > sbiret_report_error(&ret, SBI_SUCCESS, "Restore interrupted flags"); From andrew.jones at linux.dev Thu Mar 20 06:46:55 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Thu, 20 Mar 2025 14:46:55 +0100 Subject: [kvm-unit-tests PATCH v11 8/8] riscv: sbi: Add SSE extension tests In-Reply-To: <20250317164655.1120015-9-cleger@rivosinc.com> References: <20250317164655.1120015-1-cleger@rivosinc.com> <20250317164655.1120015-9-cleger@rivosinc.com> Message-ID: <20250320-48e2478cb96a804bfb1008e3@orel> ... > +cleanup: > + switch (state) { > + case SBI_SSE_STATE_ENABLED: > + ret = sbi_sse_disable(event_id); > + if (ret.error) { > + sbiret_report_error(&ret, SBI_SUCCESS, "disable event 0x%x", event_id); > + break; > + } > + case SBI_SSE_STATE_REGISTERED: > + sbi_sse_unregister(event_id); > + if (ret.error) > + sbiret_report_error(&ret, SBI_SUCCESS, "unregister event 0x%x", event_id); > + default: I had to add break here to make clang happy (and same for the other two defaults without statements).
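For readers following along, here is a minimal standalone sketch of the cleanup pattern being discussed. The enum and function names are hypothetical stand-ins, not the sbi-sse.c code. It shows the deliberate fall-through from the "enabled" case into the "registered" case, and a trailing default that carries a break, since before C23 a label must be followed by a statement and compilers may warn about or reject a bare "default:" at the end of the switch.

```c
/* Hypothetical states, loosely mirroring an enabled -> registered -> unused lifecycle. */
enum ev_state { EV_UNUSED, EV_REGISTERED, EV_ENABLED };

/* Returns which teardown steps ran as a bitmask: bit 0 = disable, bit 1 = unregister. */
static unsigned int cleanup(enum ev_state state)
{
	unsigned int steps = 0;

	switch (state) {
	case EV_ENABLED:
		steps |= 1u << 0;	/* stand-in for disabling the event */
		/* fall through */
	case EV_REGISTERED:
		steps |= 1u << 1;	/* stand-in for unregistering the event */
		/* fall through */
	default:
		break;	/* a label needs a statement after it before C23 */
	}

	return steps;
}
```

Calling cleanup(EV_ENABLED) runs both steps, cleanup(EV_REGISTERED) runs only the unregister step, and any other state runs none.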
Thanks, drew From cleger at rivosinc.com Thu Mar 20 06:50:27 2025 From: cleger at rivosinc.com (Clément Léger) Date: Thu, 20 Mar 2025 14:50:27 +0100 Subject: [kvm-unit-tests PATCH v11 8/8] riscv: sbi: Add SSE extension tests In-Reply-To: <20250320-48e2478cb96a804bfb1008e3@orel> References: <20250317164655.1120015-1-cleger@rivosinc.com> <20250317164655.1120015-9-cleger@rivosinc.com> <20250320-48e2478cb96a804bfb1008e3@orel> Message-ID: <78fc8b1a-6dfb-44a4-bbec-fe25b1cc4af8@rivosinc.com> On 20/03/2025 14:46, Andrew Jones wrote: > ... >> +cleanup: >> + switch (state) { >> + case SBI_SSE_STATE_ENABLED: >> + ret = sbi_sse_disable(event_id); >> + if (ret.error) { >> + sbiret_report_error(&ret, SBI_SUCCESS, "disable event 0x%x", event_id); >> + break; >> + } >> + case SBI_SSE_STATE_REGISTERED: >> + sbi_sse_unregister(event_id); >> + if (ret.error) >> + sbiret_report_error(&ret, SBI_SUCCESS, "unregister event 0x%x", event_id); >> + default: > > I had to add break here to make clang happy (and same for the other two > defaults without statements). Ok. Not directly related but I could/should have added "/* passthrough */" comments on the two cases above though. Thanks, Clément > > Thanks, > drew From anup at brainfault.org Thu Mar 20 07:15:09 2025 From: anup at brainfault.org (Anup Patel) Date: Thu, 20 Mar 2025 19:45:09 +0530 Subject: [GIT PULL] KVM/riscv changes for 6.15 Message-ID: Hi Paolo, We mainly have two fixes and few PMU selftests improvements for 6.15. There are quite a few in-flight patches dependent on SBI v3.0 specification which will freeze soon. Please pull.
Regards, Anup The following changes since commit 7eb172143d5508b4da468ed59ee857c6e5e01da6: Linux 6.14-rc5 (2025-03-02 11:48:20 -0800) are available in the Git repository at: https://github.com/kvm-riscv/linux.git tags/kvm-riscv-6.15-1 for you to fetch changes up to b3f263a98d30fe2e33eefea297598c590ee3560e: RISC-V: KVM: Optimize comments in kvm_riscv_vcpu_isa_disable_allowed (2025-03-20 11:33:56 +0530) ---------------------------------------------------------------- KVM/riscv changes for 6.15 - Disable the kernel perf counter during configure - KVM selftests improvements for PMU - Fix warning at the time of KVM module removal ---------------------------------------------------------------- Atish Patra (5): RISC-V: KVM: Disable the kernel perf counter during configure KVM: riscv: selftests: Do not start the counter in the overflow handler KVM: riscv: selftests: Change command line option KVM: riscv: selftests: Allow number of interrupts to be configurable RISC-V: KVM: Teardown riscv specific bits after kvm_exit Chao Du (1): RISC-V: KVM: Optimize comments in kvm_riscv_vcpu_isa_disable_allowed arch/riscv/kvm/main.c | 4 +- arch/riscv/kvm/vcpu_onereg.c | 2 +- arch/riscv/kvm/vcpu_pmu.c | 1 + tools/testing/selftests/kvm/riscv/sbi_pmu_test.c | 81 ++++++++++++++++-------- 4 files changed, 60 insertions(+), 28 deletions(-) From andrew.jones at linux.dev Thu Mar 20 07:26:01 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Thu, 20 Mar 2025 15:26:01 +0100 Subject: [kvm-unit-tests PATCH] riscv: sbi: Add no target ipi test In-Reply-To: <20250317120814.35241-2-andrew.jones@linux.dev> References: <20250317120814.35241-2-andrew.jones@linux.dev> Message-ID: <20250320-9bc7b1199845e3679cac3792@orel> On Mon, Mar 17, 2025 at 01:08:15PM +0100, Andrew Jones wrote: > While an hmask of zero without an hbase of -1 (which would mean > broadcast) is pointless, since nothing is targeted, the spec > doesn't say it should return an error. Maybe it should? 
Anyway, > for now, just confirm that it "works". > > Signed-off-by: Andrew Jones > --- > riscv/sbi.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/riscv/sbi.c b/riscv/sbi.c > index 219f7187acf3..16d918d00848 100644 > --- a/riscv/sbi.c > +++ b/riscv/sbi.c > @@ -507,6 +507,10 @@ end_two: > max_hartid = cpus[cpu].hartid; > } > > + /* Test no targets */ > + ret = sbi_send_ipi(0, 0); > + sbiret_report_error(&ret, SBI_SUCCESS, "no targets"); > + > /* Try the next higher hartid than the max */ > ret = sbi_send_ipi(2, max_hartid); > report_kfail(true, ret.error == SBI_ERR_INVALID_PARAM, "hart_mask got expected error (%ld)", ret.error); > -- > 2.48.1 > Applied to riscv/sbi[1] with the test below added too just to make sure that a nonzero hbase with a zero hmask doesn't break things. Thanks, drew ret = sbi_send_ipi(0, 1); sbiret_report_error(&ret, SBI_SUCCESS, "no targets, hart_mask_base is 1"); [1] https://gitlab.com/jones-drew/kvm-unit-tests/-/commits/riscv/sbi From andrew.jones at linux.dev Thu Mar 20 07:26:44 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Thu, 20 Mar 2025 15:26:44 +0100 Subject: [kvm-unit-tests PATCH v11 0/8] riscv: add SBI SSE extension tests In-Reply-To: <20250317164655.1120015-1-cleger@rivosinc.com> References: <20250317164655.1120015-1-cleger@rivosinc.com> Message-ID: <20250320-55d10bb6d1d91dbe7b15501a@orel> On Mon, Mar 17, 2025 at 05:46:45PM +0100, Cl?ment L?ger wrote: > This series adds tests for SBI SSE extension as well as needed > infrastructure for SSE support. It also adds test specific asm-offsets > generation to use custom OFFSET and DEFINE from the test directory. > > These tests can be run using an OpenSBI version that implements latest > specifications modification [1] > > Link: https://github.com/rivosinc/opensbi/tree/dev/cleger/sse [1] Applied to riscv/sbi with the fixups pointed out. 
https://gitlab.com/jones-drew/kvm-unit-tests/-/commits/riscv/sbi Thanks, drew > > --- > > V11: > - Use mask inside sbi_impl_opensbi_mk_version() > - Mask the SBI version with a new mask > - Use assert inside sbi_get_impl_id/version() > - Remove sbi_check_impl() > - Increase completion timeout as events failed completing under 1000 > micros when system is loaded. > > V10: > - Use && instead of || for timeout handling > - Add SBI patches which introduce function to get implementer ID and > version as well as implementer ID defines. > - Skip injection tests in OpenSBI < v1.6 > > V9: > - Use __ASSEMBLER__ instead of __ASSEMBLY__ > - Remove extra spaces > - Use assert to check global event in > sse_global_event_set_current_hart() > - Tabulate SSE events names table > - Use sbi_sse_register() instead of sbi_sse_register_raw() in error > testing > - Move a report_pass() out of error path > - Rework all injection tests with better error handling > - Use an env var for sse event completion timeout > - Add timeout for some potentially infinite while() loops > > V8: > - Short circuit current event tests if failure happens > - Remove SSE from all report strings > - Indent .prio field > - Add cpu_relax()/smp_rmb() where needed > - Add timeout for global event ENABLED state check > - Added BIT(32) aliases tests for attribute/event_id. > > V7: > - Test ids/attributes/attributes count > 32 bits > - Rename all SSE function to sbi_sse_* > - Use event_id instead of event/evt > - Factorize read/write test > - Use virt_to_phys() for attributes read/write. > - Extensively use sbiret_report_error() > - Change check function return values to bool. > - Added assert for stack size to be below or equal to PAGE_SIZE > - Use an env variable for the maximum hart ID > - Check that individual read from attributes matches the multiple > attributes read. > - Added multiple attributes write at once > - Used READ_ONCE/WRITE_ONCE > - Inject all local events at once rather than looping for each core.
> - Split test_arg for local_dispatch test so that all CPUs can run at > once. > - Move SSE entry and generic code to lib/riscv for other tests > - Fix unmask/mask state checking > > V6: > - Add missing $(generated-file) dependencies for "-deps" objects > - Split SSE entry from sbi-asm.S to sse-asm.S and all SSE core functions > since it will be useful for other tests as well (dbltrp). > > V5: > - Update event ranges based on latest spec > - Rename asm-offset-test.c to sbi-asm-offset.c > > V4: > - Fix typo sbi_ext_ss_fid -> sbi_ext_sse_fid > - Add proper asm-offset generation for tests > - Move SSE specific file from lib/riscv to riscv/ > > V3: > - Add -deps variable for test specific dependencies > - Fix formatting errors/typo in sbi.h > - Add missing double trap event > - Alphabetize sbi-sse.c includes > - Fix a6 content after unmasking event > - Add SSE HART_MASK/UNMASK test > - Use mv instead of move > - move sbi_check_sse() definition in sbi.c > - Remove sbi_sse test from unitests.cfg > > V2: > - Rebased on origin/master and integrate it into sbi.c tests > > Cl?ment L?ger (8): > kbuild: Allow multiple asm-offsets file to be generated > riscv: Set .aux.o files as .PRECIOUS > riscv: Use asm-offsets to generate SBI_EXT_HSM values > lib: riscv: Add functions for version checking > lib: riscv: Add functions to get implementer ID and version > riscv: lib: Add SBI SSE extension definitions > lib: riscv: Add SBI SSE support > riscv: sbi: Add SSE extension tests > > scripts/asm-offsets.mak | 22 +- > riscv/Makefile | 5 +- > lib/riscv/asm/csr.h | 1 + > lib/riscv/asm/sbi.h | 177 +++++- > lib/riscv/sbi-sse-asm.S | 102 ++++ > lib/riscv/asm-offsets.c | 9 + > lib/riscv/sbi.c | 105 +++- > riscv/sbi-tests.h | 1 + > riscv/sbi-asm.S | 6 +- > riscv/sbi-asm-offsets.c | 11 + > riscv/sbi-sse.c | 1278 +++++++++++++++++++++++++++++++++++++++ > riscv/sbi.c | 2 + > riscv/.gitignore | 1 + > 13 files changed, 1707 insertions(+), 13 deletions(-) > create mode 100644 lib/riscv/sbi-sse-asm.S 
> create mode 100644 riscv/sbi-asm-offsets.c > create mode 100644 riscv/sbi-sse.c > create mode 100644 riscv/.gitignore > > -- > 2.47.2 > From pbonzini at redhat.com Thu Mar 20 09:54:01 2025 From: pbonzini at redhat.com (Paolo Bonzini) Date: Thu, 20 Mar 2025 17:54:01 +0100 Subject: [GIT PULL] KVM/riscv changes for 6.15 In-Reply-To: References: Message-ID: On Thu, Mar 20, 2025 at 3:15?PM Anup Patel wrote: > > Hi Paolo, > > We mainly have two fixes and few PMU selftests improvements > for 6.15. There are quite a few in-flight patches dependent on > SBI v3.0 specification which will freeze soon. > > Please pull. Pulled, thanks. Paolo > Regards, > Anup From atishp at rivosinc.com Thu Mar 20 16:44:05 2025 From: atishp at rivosinc.com (Atish Kumar Patra) Date: Thu, 20 Mar 2025 16:44:05 -0700 Subject: [PATCH] RISC-V: KVM: Teardown riscv specific bits after kvm_exit In-Reply-To: References: <20250317-kvm_exit_fix-v1-1-aa5240c5dbd2@rivosinc.com> Message-ID: On Mon, Mar 17, 2025 at 9:08?AM Sean Christopherson wrote: > > On Mon, Mar 17, 2025, Atish Patra wrote: > > During a module removal, kvm_exit invokes arch specific disable > > call which disables AIA. However, we invoke aia_exit before kvm_exit > > resulting in the following warning. KVM kernel module can't be inserted > > afterwards due to inconsistent state of IRQ. > > > > [25469.031389] percpu IRQ 31 still enabled on CPU0! 
> > [25469.031732] WARNING: CPU: 3 PID: 943 at kernel/irq/manage.c:2476 __free_percpu_irq+0xa2/0x150 > > [25469.031804] Modules linked in: kvm(-) > > [25469.031848] CPU: 3 UID: 0 PID: 943 Comm: rmmod Not tainted 6.14.0-rc5-06947-g91c763118f47-dirty #2 > > [25469.031905] Hardware name: riscv-virtio,qemu (DT) > > [25469.031928] epc : __free_percpu_irq+0xa2/0x150 > > [25469.031976] ra : __free_percpu_irq+0xa2/0x150 > > [25469.032197] epc : ffffffff8007db1e ra : ffffffff8007db1e sp : ff2000000088bd50 > > [25469.032241] gp : ffffffff8131cef8 tp : ff60000080b96400 t0 : ff2000000088baf8 > > [25469.032285] t1 : fffffffffffffffc t2 : 5249207570637265 s0 : ff2000000088bd90 > > [25469.032329] s1 : ff60000098b21080 a0 : 037d527a15eb4f00 a1 : 037d527a15eb4f00 > > [25469.032372] a2 : 0000000000000023 a3 : 0000000000000001 a4 : ffffffff8122dbf8 > > [25469.032410] a5 : 0000000000000fff a6 : 0000000000000000 a7 : ffffffff8122dc10 > > [25469.032448] s2 : ff60000080c22eb0 s3 : 0000000200000022 s4 : 000000000000001f > > [25469.032488] s5 : ff60000080c22e00 s6 : ffffffff80c351c0 s7 : 0000000000000000 > > [25469.032582] s8 : 0000000000000003 s9 : 000055556b7fb490 s10: 00007ffff0e12fa0 > > [25469.032621] s11: 00007ffff0e13e9a t3 : ffffffff81354ac7 t4 : ffffffff81354ac7 > > [25469.032664] t5 : ffffffff81354ac8 t6 : ffffffff81354ac7 > > [25469.032698] status: 0000000200000100 badaddr: ffffffff8007db1e cause: 0000000000000003 > > [25469.032738] [] __free_percpu_irq+0xa2/0x150 > > [25469.032797] [] free_percpu_irq+0x30/0x5e > > [25469.032856] [] kvm_riscv_aia_exit+0x40/0x42 [kvm] > > [25469.033947] [] cleanup_module+0x10/0x32 [kvm] > > [25469.035300] [] __riscv_sys_delete_module+0x18e/0x1fc > > [25469.035374] [] syscall_handler+0x3a/0x46 > > [25469.035456] [] do_trap_ecall_u+0x72/0x134 > > [25469.035536] [] handle_exception+0x148/0x156 > > > > Invoke aia_exit and other arch specific cleanup functions after kvm_exit > > so that disable gets a chance to be called first before exit. 
> > > > Fixes: 54e43320c2ba ("RISC-V: KVM: Initial skeletal support for AIA") > > > > Signed-off-by: Atish Patra > > --- > > FWIW, > > Reviewed-by: Sean Christopherson > > > arch/riscv/kvm/main.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c > > index 1fa8be5ee509..4b24705dc63a 100644 > > --- a/arch/riscv/kvm/main.c > > +++ b/arch/riscv/kvm/main.c > > @@ -172,8 +172,8 @@ module_init(riscv_kvm_init); > > > > static void __exit riscv_kvm_exit(void) > > { > > - kvm_riscv_teardown(); > > - > > kvm_exit(); > > + > > + kvm_riscv_teardown(); > > I wonder if there's a way we can guard against kvm_init()/kvm_exit() being called > too early/late. x86 had similar bugs for a very long time, e.g. see commit > e32b120071ea ("KVM: VMX: Do _all_ initialization before exposing /dev/kvm to userspace"). > > E.g. maybe we do something like create+destroy a VM at the end of kvm_init() and > the beginning of kvm_exit()? Not sure if that would work for kvm_exit(), but it > should definitely be fine for kvm_init(). > Yes. That would be super useful. I am not sure about the exact mechanism to achieve that though. Do you just test code guarded within a new config that just creates/destroys a dummy VM ? May be kunit test for KVM fits here in some way ? > It wouldn't prevent bugs, but maybe it would help detect them during development? From andrew.jones at linux.dev Fri Mar 21 09:54:04 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Fri, 21 Mar 2025 17:54:04 +0100 Subject: [kvm-unit-tests PATCH 0/3] riscv: sbi: Ensure we can pass with any opensbi Message-ID: <20250321165403.57859-5-andrew.jones@linux.dev> At some point CI will update the version of QEMU it uses. If the version selected includes an opensbi that doesn't have all the fixes for current tests, then CI will start failing for riscv. Guard against that by using kfail for known opensbi failures. 
We only kfail for opensbi and for its versions less than 1.7, though, as we expect everything to be fixed then. Andrew Jones (3): lib/riscv: Also provide sbiret impl functions riscv: sbi: Add kfail versions of sbiret_report functions riscv: sbi: Use kfail for known opensbi failures lib/riscv/asm/sbi.h | 6 ++++-- lib/riscv/sbi.c | 18 ++++++++++++++---- riscv/sbi-fwft.c | 20 +++++++++++++------- riscv/sbi-sse.c | 4 ++-- riscv/sbi-tests.h | 20 +++++++++++++++----- riscv/sbi.c | 6 ++++-- 6 files changed, 52 insertions(+), 22 deletions(-) -- 2.48.1 From andrew.jones at linux.dev Fri Mar 21 09:54:07 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Fri, 21 Mar 2025 17:54:07 +0100 Subject: [kvm-unit-tests PATCH 3/3] riscv: sbi: Use kfail for known opensbi failures In-Reply-To: <20250321165403.57859-5-andrew.jones@linux.dev> References: <20250321165403.57859-5-andrew.jones@linux.dev> Message-ID: <20250321165403.57859-8-andrew.jones@linux.dev> Use kfail for the opensbi s/SBI_ERR_DENIED/SBI_ERR_DENIED_LOCKED/ change. We expect it to be fixed in 1.7, so only kfail for opensbi which has a version less than that. Also change the other uses of kfail to only kfail for opensbi versions less than 1.7. 
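As a rough illustration of the version gate described above, the sketch below computes the "known failure" condition from an implementation ID and an implementation version. The macro and function names are hypothetical. The encoding (major in the upper 16 bits, minor in the lower 16 bits) and the value 1 for the OpenSBI implementation ID are assumptions made for this sketch, so check the library headers for the real definitions.

```c
#include <stdbool.h>

/* Assumed encoding: major version in bits 31:16, minor version in bits 15:0. */
#define MK_VERSION(major, minor) \
	((((unsigned long)(major) & 0xffff) << 16) | ((unsigned long)(minor) & 0xffff))

/* Assumed implementation ID for OpenSBI. */
#define IMPL_OPENSBI 1UL

/*
 * True when a failing check should be counted as a known failure:
 * only for OpenSBI, and only for versions older than 1.7.
 */
static bool known_failure(unsigned long impl_id, unsigned long impl_version)
{
	return impl_id == IMPL_OPENSBI && impl_version < MK_VERSION(1, 7);
}
```

Computing the condition once and passing it to each check keeps every known-failure site tied to the same cutoff, so a later bump only touches one expression.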
Signed-off-by: Andrew Jones --- riscv/sbi-fwft.c | 20 +++++++++++++------- riscv/sbi.c | 6 ++++-- 2 files changed, 17 insertions(+), 9 deletions(-) diff --git a/riscv/sbi-fwft.c b/riscv/sbi-fwft.c index 3d225997c0ec..c52fbd6e77a6 100644 --- a/riscv/sbi-fwft.c +++ b/riscv/sbi-fwft.c @@ -83,19 +83,21 @@ static void fwft_feature_lock_test_values(uint32_t feature, size_t nr_values, report_prefix_push("locked"); + bool kfail = __sbi_get_imp_id() == SBI_IMPL_OPENSBI && + __sbi_get_imp_version() < sbi_impl_opensbi_mk_version(1, 7); + for (int i = 0; i < nr_values; ++i) { ret = fwft_set(feature, test_values[i], 0); - sbiret_report_error(&ret, SBI_ERR_DENIED_LOCKED, - "Set to %lu without lock flag", test_values[i]); + sbiret_kfail_error(kfail, &ret, SBI_ERR_DENIED_LOCKED, + "Set to %lu without lock flag", test_values[i]); ret = fwft_set(feature, test_values[i], SBI_FWFT_SET_FLAG_LOCK); - sbiret_report_error(&ret, SBI_ERR_DENIED_LOCKED, - "Set to %lu with lock flag", test_values[i]); + sbiret_kfail_error(kfail, &ret, SBI_ERR_DENIED_LOCKED, + "Set to %lu with lock flag", test_values[i]); } ret = fwft_get(feature); - sbiret_report(&ret, SBI_SUCCESS, locked_value, - "Get value %lu", locked_value); + sbiret_report(&ret, SBI_SUCCESS, locked_value, "Get value %lu", locked_value); report_prefix_pop(); } @@ -103,6 +105,7 @@ static void fwft_feature_lock_test_values(uint32_t feature, size_t nr_values, static void fwft_feature_lock_test(uint32_t feature, unsigned long locked_value) { unsigned long values[] = {0, 1}; + fwft_feature_lock_test_values(feature, 2, values, locked_value); } @@ -317,7 +320,10 @@ static void fwft_check_pte_ad_hw_updating(void) report(ret.value == 0 || ret.value == 1, "first get value is 0/1"); enabled = ret.value; - report_kfail(true, !enabled, "resets to 0"); + + bool kfail = __sbi_get_imp_id() == SBI_IMPL_OPENSBI && + __sbi_get_imp_version() < sbi_impl_opensbi_mk_version(1, 7); + report_kfail(kfail, !enabled, "resets to 0"); 
install_exception_handler(EXC_LOAD_PAGE_FAULT, adue_read_handler); install_exception_handler(EXC_STORE_PAGE_FAULT, adue_write_handler); diff --git a/riscv/sbi.c b/riscv/sbi.c index 83bc55125d46..edb1a6bef1ac 100644 --- a/riscv/sbi.c +++ b/riscv/sbi.c @@ -515,10 +515,12 @@ end_two: sbiret_report_error(&ret, SBI_SUCCESS, "no targets, hart_mask_base is 1"); /* Try the next higher hartid than the max */ + bool kfail = __sbi_get_imp_id() == SBI_IMPL_OPENSBI && + __sbi_get_imp_version() < sbi_impl_opensbi_mk_version(1, 7); ret = sbi_send_ipi(2, max_hartid); - report_kfail(true, ret.error == SBI_ERR_INVALID_PARAM, "hart_mask got expected error (%ld)", ret.error); + sbiret_kfail_error(kfail, &ret, SBI_ERR_INVALID_PARAM, "hart_mask"); ret = sbi_send_ipi(1, max_hartid + 1); - report_kfail(true, ret.error == SBI_ERR_INVALID_PARAM, "hart_mask_base got expected error (%ld)", ret.error); + sbiret_kfail_error(kfail, &ret, SBI_ERR_INVALID_PARAM, "hart_mask_base"); report_prefix_pop(); -- 2.48.1 From andrew.jones at linux.dev Fri Mar 21 09:54:05 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Fri, 21 Mar 2025 17:54:05 +0100 Subject: [kvm-unit-tests PATCH 1/3] lib/riscv: Also provide sbiret impl functions In-Reply-To: <20250321165403.57859-5-andrew.jones@linux.dev> References: <20250321165403.57859-5-andrew.jones@linux.dev> Message-ID: <20250321165403.57859-6-andrew.jones@linux.dev> We almost always return sbiret from sbi wrapper functions so do that for sbi_get_imp_version() and sbi_get_imp_id(), but asserting no error and returning the value is also useful, so continue to provide those functions too, just with a slightly different name. 
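A minimal sketch of the naming pattern described above, assuming a stubbed ecall: the plain wrapper returns the whole struct sbiret, and the double-underscore variant asserts success and returns only the value. `fake_sbi_ecall`, the `demo_` names, and the returned number are placeholders for illustration, not the library code.

```c
#include <assert.h>

/* Same shape as an SBI return pair: an error code and a value. */
struct sbiret {
	long error;
	long value;
};

/* Hypothetical stand-in for the real ecall; it always succeeds here. */
static struct sbiret fake_sbi_ecall(int ext, int fid)
{
	struct sbiret ret = { .error = 0, .value = (ext << 8) | fid };

	return ret;
}

/* Wrapper that hands the caller both the error and the value. */
static struct sbiret demo_get_imp_id(void)
{
	return fake_sbi_ecall(0x10, 1);
}

/* Convenience variant: assert that there was no error, return just the value. */
static unsigned long __demo_get_imp_id(void)
{
	struct sbiret ret = demo_get_imp_id();

	assert(!ret.error);
	return ret.value;
}
```

Callers that want to test the error path use the first form; callers that only need the value for a decision, such as a version check, use the second.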
Signed-off-by: Andrew Jones --- lib/riscv/asm/sbi.h | 6 ++++-- lib/riscv/sbi.c | 18 ++++++++++++++---- riscv/sbi-sse.c | 4 ++-- 3 files changed, 20 insertions(+), 8 deletions(-) diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h index edaee462c3fa..a5738a5ce209 100644 --- a/lib/riscv/asm/sbi.h +++ b/lib/riscv/asm/sbi.h @@ -260,9 +260,11 @@ struct sbiret sbi_send_ipi_cpumask(const cpumask_t *mask); struct sbiret sbi_send_ipi_broadcast(void); struct sbiret sbi_set_timer(unsigned long stime_value); struct sbiret sbi_get_spec_version(void); -unsigned long sbi_get_imp_version(void); -unsigned long sbi_get_imp_id(void); +struct sbiret sbi_get_imp_version(void); +struct sbiret sbi_get_imp_id(void); long sbi_probe(int ext); +unsigned long __sbi_get_imp_version(void); +unsigned long __sbi_get_imp_id(void); typedef void (*sbi_sse_handler_fn)(void *data, struct pt_regs *regs, unsigned int hartid); diff --git a/lib/riscv/sbi.c b/lib/riscv/sbi.c index 53d25489f905..2959378f64bb 100644 --- a/lib/riscv/sbi.c +++ b/lib/riscv/sbi.c @@ -183,21 +183,31 @@ struct sbiret sbi_set_timer(unsigned long stime_value) return sbi_ecall(SBI_EXT_TIME, SBI_EXT_TIME_SET_TIMER, stime_value, 0, 0, 0, 0, 0); } -unsigned long sbi_get_imp_version(void) +struct sbiret sbi_get_imp_version(void) +{ + return sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_IMP_VERSION, 0, 0, 0, 0, 0, 0); +} + +struct sbiret sbi_get_imp_id(void) +{ + return sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_IMP_ID, 0, 0, 0, 0, 0, 0); +} + +unsigned long __sbi_get_imp_version(void) { struct sbiret ret; - ret = sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_IMP_VERSION, 0, 0, 0, 0, 0, 0); + ret = sbi_get_imp_version(); assert(!ret.error); return ret.value; } -unsigned long sbi_get_imp_id(void) +unsigned long __sbi_get_imp_id(void) { struct sbiret ret; - ret = sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_IMP_ID, 0, 0, 0, 0, 0, 0); + ret = sbi_get_imp_id(); assert(!ret.error); return ret.value; diff --git a/riscv/sbi-sse.c b/riscv/sbi-sse.c index 
97e07725c359..bc6afaf5481e 100644 --- a/riscv/sbi-sse.c +++ b/riscv/sbi-sse.c @@ -1232,8 +1232,8 @@ void check_sse(void) return; } - if (sbi_get_imp_id() == SBI_IMPL_OPENSBI && - sbi_get_imp_version() < sbi_impl_opensbi_mk_version(1, 7)) { + if (__sbi_get_imp_id() == SBI_IMPL_OPENSBI && + __sbi_get_imp_version() < sbi_impl_opensbi_mk_version(1, 7)) { report_skip("OpenSBI < v1.7 detected, skipping tests"); report_prefix_pop(); return; -- 2.48.1 From andrew.jones at linux.dev Fri Mar 21 09:54:06 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Fri, 21 Mar 2025 17:54:06 +0100 Subject: [kvm-unit-tests PATCH 2/3] riscv: sbi: Add kfail versions of sbiret_report functions In-Reply-To: <20250321165403.57859-5-andrew.jones@linux.dev> References: <20250321165403.57859-5-andrew.jones@linux.dev> Message-ID: <20250321165403.57859-7-andrew.jones@linux.dev> report_kfail is useful for SBI testing to allowing CI to PASS even when SBI implementations have known failures. Since sbiret_report functions are frequently used by SBI tests, make kfail versions of them too. Signed-off-by: Andrew Jones --- riscv/sbi-tests.h | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/riscv/sbi-tests.h b/riscv/sbi-tests.h index ddfad7fef293..d5c4ae709632 100644 --- a/riscv/sbi-tests.h +++ b/riscv/sbi-tests.h @@ -39,7 +39,8 @@ #include #include -#define __sbiret_report(ret, expected_error, expected_value, has_value, expected_error_name, fmt, ...) ({ \ +#define __sbiret_report(kfail, ret, expected_error, expected_value, \ + has_value, expected_error_name, fmt, ...) 
({ \ long ex_err = expected_error; \ long ex_val = expected_value; \ bool has_val = !!(has_value); \ @@ -48,9 +49,9 @@ bool pass; \ \ if (has_val) \ - pass = report(ch_err && ch_val, fmt, ##__VA_ARGS__); \ + pass = report_kfail(kfail, ch_err && ch_val, fmt, ##__VA_ARGS__); \ else \ - pass = report(ch_err, fmt ": %s", ##__VA_ARGS__, expected_error_name); \ + pass = report_kfail(kfail, ch_err, fmt ": %s", ##__VA_ARGS__, expected_error_name); \ \ if (!pass && has_val) \ report_info(fmt ": expected (error: %ld, value: %ld), received: (error: %ld, value %ld)", \ @@ -63,14 +64,23 @@ }) #define sbiret_report(ret, expected_error, expected_value, ...) \ - __sbiret_report(ret, expected_error, expected_value, true, #expected_error, __VA_ARGS__) + __sbiret_report(false, ret, expected_error, expected_value, true, #expected_error, __VA_ARGS__) #define sbiret_report_error(ret, expected_error, ...) \ - __sbiret_report(ret, expected_error, 0, false, #expected_error, __VA_ARGS__) + __sbiret_report(false, ret, expected_error, 0, false, #expected_error, __VA_ARGS__) #define sbiret_check(ret, expected_error, expected_value) \ sbiret_report(ret, expected_error, expected_value, "check sbi.error and sbi.value") +#define sbiret_kfail(kfail, ret, expected_error, expected_value, ...) \ + __sbiret_report(kfail, ret, expected_error, expected_value, true, #expected_error, __VA_ARGS__) + +#define sbiret_kfail_error(kfail, ret, expected_error, ...) 
\ + __sbiret_report(kfail, ret, expected_error, 0, false, #expected_error, __VA_ARGS__) + +#define sbiret_check_kfail(kfail, ret, expected_error, expected_value) \ + sbiret_kfail(kfail, ret, expected_error, expected_value, "check sbi.error and sbi.value") + static inline bool env_or_skip(const char *env) { if (!getenv(env)) { -- 2.48.1 From cleger at rivosinc.com Fri Mar 21 13:15:31 2025 From: cleger at rivosinc.com (=?UTF-8?B?Q2zDqW1lbnQgTMOpZ2Vy?=) Date: Fri, 21 Mar 2025 21:15:31 +0100 Subject: [kvm-unit-tests PATCH 1/3] lib/riscv: Also provide sbiret impl functions In-Reply-To: <20250321165403.57859-6-andrew.jones@linux.dev> References: <20250321165403.57859-5-andrew.jones@linux.dev> <20250321165403.57859-6-andrew.jones@linux.dev> Message-ID: <6f30d2a2-0565-42cf-bc9a-e0c35e27627f@rivosinc.com> On 21/03/2025 17:54, Andrew Jones wrote: > We almost always return sbiret from sbi wrapper functions so > do that for sbi_get_imp_version() and sbi_get_imp_id(), but > asserting no error and returning the value is also useful, > so continue to provide those functions too, just with a slightly > different name. 
> > Signed-off-by: Andrew Jones > --- > lib/riscv/asm/sbi.h | 6 ++++-- > lib/riscv/sbi.c | 18 ++++++++++++++---- > riscv/sbi-sse.c | 4 ++-- > 3 files changed, 20 insertions(+), 8 deletions(-) > > diff --git a/lib/riscv/asm/sbi.h b/lib/riscv/asm/sbi.h > index edaee462c3fa..a5738a5ce209 100644 > --- a/lib/riscv/asm/sbi.h > +++ b/lib/riscv/asm/sbi.h > @@ -260,9 +260,11 @@ struct sbiret sbi_send_ipi_cpumask(const cpumask_t *mask); > struct sbiret sbi_send_ipi_broadcast(void); > struct sbiret sbi_set_timer(unsigned long stime_value); > struct sbiret sbi_get_spec_version(void); > -unsigned long sbi_get_imp_version(void); > -unsigned long sbi_get_imp_id(void); > +struct sbiret sbi_get_imp_version(void); > +struct sbiret sbi_get_imp_id(void); > long sbi_probe(int ext); > +unsigned long __sbi_get_imp_version(void); > +unsigned long __sbi_get_imp_id(void); > > typedef void (*sbi_sse_handler_fn)(void *data, struct pt_regs *regs, unsigned int hartid); > > diff --git a/lib/riscv/sbi.c b/lib/riscv/sbi.c > index 53d25489f905..2959378f64bb 100644 > --- a/lib/riscv/sbi.c > +++ b/lib/riscv/sbi.c > @@ -183,21 +183,31 @@ struct sbiret sbi_set_timer(unsigned long stime_value) > return sbi_ecall(SBI_EXT_TIME, SBI_EXT_TIME_SET_TIMER, stime_value, 0, 0, 0, 0, 0); > } > > -unsigned long sbi_get_imp_version(void) > +struct sbiret sbi_get_imp_version(void) > +{ > + return sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_IMP_VERSION, 0, 0, 0, 0, 0, 0); > +} > + > +struct sbiret sbi_get_imp_id(void) > +{ > + return sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_IMP_ID, 0, 0, 0, 0, 0, 0); > +} > + > +unsigned long __sbi_get_imp_version(void) > { > struct sbiret ret; > > - ret = sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_IMP_VERSION, 0, 0, 0, 0, 0, 0); > + ret = sbi_get_imp_version(); > assert(!ret.error); > > return ret.value; > } > > -unsigned long sbi_get_imp_id(void) > +unsigned long __sbi_get_imp_id(void) > { > struct sbiret ret; > > - ret = sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_GET_IMP_ID, 0, 0, 0, 0, 0, 
0); > + ret = sbi_get_imp_id(); > assert(!ret.error); > > return ret.value; > diff --git a/riscv/sbi-sse.c b/riscv/sbi-sse.c > index 97e07725c359..bc6afaf5481e 100644 > --- a/riscv/sbi-sse.c > +++ b/riscv/sbi-sse.c > @@ -1232,8 +1232,8 @@ void check_sse(void) > return; > } > > - if (sbi_get_imp_id() == SBI_IMPL_OPENSBI && > - sbi_get_imp_version() < sbi_impl_opensbi_mk_version(1, 7)) { > + if (__sbi_get_imp_id() == SBI_IMPL_OPENSBI && > + __sbi_get_imp_version() < sbi_impl_opensbi_mk_version(1, 7)) { > report_skip("OpenSBI < v1.7 detected, skipping tests"); > report_prefix_pop(); > return; Hi Andrew, Looks good to me, Reviewed-by: Clément Léger Thanks, Clément From cleger at rivosinc.com Fri Mar 21 13:17:16 2025 From: cleger at rivosinc.com (=?UTF-8?B?Q2zDqW1lbnQgTMOpZ2Vy?=) Date: Fri, 21 Mar 2025 21:17:16 +0100 Subject: [kvm-unit-tests PATCH 2/3] riscv: sbi: Add kfail versions of sbiret_report functions In-Reply-To: <20250321165403.57859-7-andrew.jones@linux.dev> References: <20250321165403.57859-5-andrew.jones@linux.dev> <20250321165403.57859-7-andrew.jones@linux.dev> Message-ID: <1f900b99-e260-42cc-9e5d-ea4e7a4365ec@rivosinc.com> On 21/03/2025 17:54, Andrew Jones wrote: > report_kfail is useful for SBI testing to allowing CI to PASS even > when SBI implementations have known failures. Since sbiret_report > functions are frequently used by SBI tests, make kfail versions of > them too. > > Signed-off-by: Andrew Jones > --- > riscv/sbi-tests.h | 20 +++++++++++++++----- > 1 file changed, 15 insertions(+), 5 deletions(-) > > diff --git a/riscv/sbi-tests.h b/riscv/sbi-tests.h > index ddfad7fef293..d5c4ae709632 100644 > --- a/riscv/sbi-tests.h > +++ b/riscv/sbi-tests.h > @@ -39,7 +39,8 @@ > #include > #include > > -#define __sbiret_report(ret, expected_error, expected_value, has_value, expected_error_name, fmt, ...) ({ \ > +#define __sbiret_report(kfail, ret, expected_error, expected_value, \ > + has_value, expected_error_name, fmt, ...)
({ \ > long ex_err = expected_error; \ > long ex_val = expected_value; \ > bool has_val = !!(has_value); \ > @@ -48,9 +49,9 @@ > bool pass; \ > \ > if (has_val) \ > - pass = report(ch_err && ch_val, fmt, ##__VA_ARGS__); \ > + pass = report_kfail(kfail, ch_err && ch_val, fmt, ##__VA_ARGS__); \ > else \ > - pass = report(ch_err, fmt ": %s", ##__VA_ARGS__, expected_error_name); \ > + pass = report_kfail(kfail, ch_err, fmt ": %s", ##__VA_ARGS__, expected_error_name); \ > \ > if (!pass && has_val) \ > report_info(fmt ": expected (error: %ld, value: %ld), received: (error: %ld, value %ld)", \ > @@ -63,14 +64,23 @@ > }) > > #define sbiret_report(ret, expected_error, expected_value, ...) \ > - __sbiret_report(ret, expected_error, expected_value, true, #expected_error, __VA_ARGS__) > + __sbiret_report(false, ret, expected_error, expected_value, true, #expected_error, __VA_ARGS__) > > #define sbiret_report_error(ret, expected_error, ...) \ > - __sbiret_report(ret, expected_error, 0, false, #expected_error, __VA_ARGS__) > + __sbiret_report(false, ret, expected_error, 0, false, #expected_error, __VA_ARGS__) > > #define sbiret_check(ret, expected_error, expected_value) \ > sbiret_report(ret, expected_error, expected_value, "check sbi.error and sbi.value") > > +#define sbiret_kfail(kfail, ret, expected_error, expected_value, ...) \ > + __sbiret_report(kfail, ret, expected_error, expected_value, true, #expected_error, __VA_ARGS__) > + > +#define sbiret_kfail_error(kfail, ret, expected_error, ...) 
\ > + __sbiret_report(kfail, ret, expected_error, 0, false, #expected_error, __VA_ARGS__) > + > +#define sbiret_check_kfail(kfail, ret, expected_error, expected_value) \ > + sbiret_kfail(kfail, ret, expected_error, expected_value, "check sbi.error and sbi.value") > + > static inline bool env_or_skip(const char *env) > { > if (!getenv(env)) { Hi Andrew, I needed that as well in another test so: Reviewed-by: Clément Léger Thanks, Clément From cleger at rivosinc.com Fri Mar 21 13:22:19 2025 From: cleger at rivosinc.com (=?UTF-8?B?Q2zDqW1lbnQgTMOpZ2Vy?=) Date: Fri, 21 Mar 2025 21:22:19 +0100 Subject: [kvm-unit-tests PATCH 3/3] riscv: sbi: Use kfail for known opensbi failures In-Reply-To: <20250321165403.57859-8-andrew.jones@linux.dev> References: <20250321165403.57859-5-andrew.jones@linux.dev> <20250321165403.57859-8-andrew.jones@linux.dev> Message-ID: On 21/03/2025 17:54, Andrew Jones wrote: > Use kfail for the opensbi s/SBI_ERR_DENIED/SBI_ERR_DENIED_LOCKED/ > change. We expect it to be fixed in 1.7, so only kfail for opensbi > which has a version less than that. Also change the other uses of > kfail to only kfail for opensbi versions less than 1.7.
> > Signed-off-by: Andrew Jones > --- > riscv/sbi-fwft.c | 20 +++++++++++++------- > riscv/sbi.c | 6 ++++-- > 2 files changed, 17 insertions(+), 9 deletions(-) > > diff --git a/riscv/sbi-fwft.c b/riscv/sbi-fwft.c > index 3d225997c0ec..c52fbd6e77a6 100644 > --- a/riscv/sbi-fwft.c > +++ b/riscv/sbi-fwft.c > @@ -83,19 +83,21 @@ static void fwft_feature_lock_test_values(uint32_t feature, size_t nr_values, > > report_prefix_push("locked"); > > + bool kfail = __sbi_get_imp_id() == SBI_IMPL_OPENSBI && > + __sbi_get_imp_version() < sbi_impl_opensbi_mk_version(1, 7); > + > for (int i = 0; i < nr_values; ++i) { > ret = fwft_set(feature, test_values[i], 0); > - sbiret_report_error(&ret, SBI_ERR_DENIED_LOCKED, > - "Set to %lu without lock flag", test_values[i]); > + sbiret_kfail_error(kfail, &ret, SBI_ERR_DENIED_LOCKED, > + "Set to %lu without lock flag", test_values[i]); > > ret = fwft_set(feature, test_values[i], SBI_FWFT_SET_FLAG_LOCK); > - sbiret_report_error(&ret, SBI_ERR_DENIED_LOCKED, > - "Set to %lu with lock flag", test_values[i]); > + sbiret_kfail_error(kfail, &ret, SBI_ERR_DENIED_LOCKED, > + "Set to %lu with lock flag", test_values[i]); > } > > ret = fwft_get(feature); > - sbiret_report(&ret, SBI_SUCCESS, locked_value, > - "Get value %lu", locked_value); > + sbiret_report(&ret, SBI_SUCCESS, locked_value, "Get value %lu", locked_value); Reformatting ? > > report_prefix_pop(); > } > @@ -103,6 +105,7 @@ static void fwft_feature_lock_test_values(uint32_t feature, size_t nr_values, > static void fwft_feature_lock_test(uint32_t feature, unsigned long locked_value) > { > unsigned long values[] = {0, 1}; > + That's some spurious newline here. 
> fwft_feature_lock_test_values(feature, 2, values, locked_value); > } > > @@ -317,7 +320,10 @@ static void fwft_check_pte_ad_hw_updating(void) > report(ret.value == 0 || ret.value == 1, "first get value is 0/1"); > > enabled = ret.value; > - report_kfail(true, !enabled, "resets to 0"); > + > + bool kfail = __sbi_get_imp_id() == SBI_IMPL_OPENSBI && > + __sbi_get_imp_version() < sbi_impl_opensbi_mk_version(1, 7); > + report_kfail(kfail, !enabled, "resets to 0"); > > install_exception_handler(EXC_LOAD_PAGE_FAULT, adue_read_handler); > install_exception_handler(EXC_STORE_PAGE_FAULT, adue_write_handler); > diff --git a/riscv/sbi.c b/riscv/sbi.c > index 83bc55125d46..edb1a6bef1ac 100644 > --- a/riscv/sbi.c > +++ b/riscv/sbi.c > @@ -515,10 +515,12 @@ end_two: > sbiret_report_error(&ret, SBI_SUCCESS, "no targets, hart_mask_base is 1"); > > /* Try the next higher hartid than the max */ > + bool kfail = __sbi_get_imp_id() == SBI_IMPL_OPENSBI && > + __sbi_get_imp_version() < sbi_impl_opensbi_mk_version(1, 7); > ret = sbi_send_ipi(2, max_hartid); > - report_kfail(true, ret.error == SBI_ERR_INVALID_PARAM, "hart_mask got expected error (%ld)", ret.error); > + sbiret_kfail_error(kfail, &ret, SBI_ERR_INVALID_PARAM, "hart_mask"); > ret = sbi_send_ipi(1, max_hartid + 1); > - report_kfail(true, ret.error == SBI_ERR_INVALID_PARAM, "hart_mask_base got expected error (%ld)", ret.error); > + sbiret_kfail_error(kfail, &ret, SBI_ERR_INVALID_PARAM, "hart_mask_base"); > > report_prefix_pop(); > Hi Andrew, I tried thinking of some way to factorize the version check but can't really find something elegant.
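For reference, one possible shape for such a factored check is sketched below. It is an illustration only and was not proposed in the thread: `demo_opensbi_older_than` and the `demo_`/`DEMO_` names are hypothetical, and the version encoding (major in the upper 16 bits, minor in the lower 16) is an assumption about what sbi_impl_opensbi_mk_version() produces.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_IMPL_OPENSBI 1UL	/* SBI implementation ID assigned to OpenSBI */

/* Assumed encoding: major in the upper 16 bits, minor in the lower 16. */
#define demo_opensbi_mk_version(major, minor) \
	((((unsigned long)(major) & 0xffff) << 16) | ((unsigned long)(minor) & 0xffff))

/* Compile-time sanity check of the assumed encoding. */
static_assert(demo_opensbi_mk_version(1, 7) == 0x10007, "unexpected version encoding");

/*
 * Hypothetical helper: true when the firmware identifies itself as OpenSBI
 * and reports a version older than major.minor.
 */
static bool demo_opensbi_older_than(unsigned long imp_id, unsigned long imp_version,
				    unsigned int major, unsigned int minor)
{
	return imp_id == DEMO_IMPL_OPENSBI &&
	       imp_version < demo_opensbi_mk_version(major, minor);
}
```

A caller would pass in the values returned by the implementation ID and version queries, so each test site reduces to a single call with the version it expects the fix in.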
Without the spurious newline: Reviewed-by: Clément Léger Thanks, Clément From andrew.jones at linux.dev Sat Mar 22 00:38:51 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Sat, 22 Mar 2025 08:38:51 +0100 Subject: [kvm-unit-tests PATCH 3/3] riscv: sbi: Use kfail for known opensbi failures In-Reply-To: References: <20250321165403.57859-5-andrew.jones@linux.dev> <20250321165403.57859-8-andrew.jones@linux.dev> Message-ID: <20250322-3d41f407f8d352d262718c20@orel> On Fri, Mar 21, 2025 at 09:22:19PM +0100, Clément Léger wrote: > > > On 21/03/2025 17:54, Andrew Jones wrote: > > Use kfail for the opensbi s/SBI_ERR_DENIED/SBI_ERR_DENIED_LOCKED/ > > change. We expect it to be fixed in 1.7, so only kfail for opensbi > > which has a version less than that. Also change the other uses of > > kfail to only kfail for opensbi versions less than 1.7. > > > > Signed-off-by: Andrew Jones > > --- > > riscv/sbi-fwft.c | 20 +++++++++++++------- > > riscv/sbi.c | 6 ++++-- > > 2 files changed, 17 insertions(+), 9 deletions(-) > > > > diff --git a/riscv/sbi-fwft.c b/riscv/sbi-fwft.c > > index 3d225997c0ec..c52fbd6e77a6 100644 > > --- a/riscv/sbi-fwft.c > > +++ b/riscv/sbi-fwft.c > > @@ -83,19 +83,21 @@ static void fwft_feature_lock_test_values(uint32_t feature, size_t nr_values, > > > > report_prefix_push("locked"); > > > > + bool kfail = __sbi_get_imp_id() == SBI_IMPL_OPENSBI && > > + __sbi_get_imp_version() < sbi_impl_opensbi_mk_version(1, 7); > > + > > for (int i = 0; i < nr_values; ++i) { > > ret = fwft_set(feature, test_values[i], 0); > > - sbiret_report_error(&ret, SBI_ERR_DENIED_LOCKED, > > - "Set to %lu without lock flag", test_values[i]); > > + sbiret_kfail_error(kfail, &ret, SBI_ERR_DENIED_LOCKED, > > + "Set to %lu without lock flag", test_values[i]); > > > > ret = fwft_set(feature, test_values[i], SBI_FWFT_SET_FLAG_LOCK); > > - sbiret_report_error(&ret, SBI_ERR_DENIED_LOCKED, > > - "Set to %lu with lock flag", test_values[i]); > > + sbiret_kfail_error(kfail, &ret,
SBI_ERR_DENIED_LOCKED, > > + "Set to %lu with lock flag", test_values[i]); > > } > > > > ret = fwft_get(feature); > > - sbiret_report(&ret, SBI_SUCCESS, locked_value, > > - "Get value %lu", locked_value); > > + sbiret_report(&ret, SBI_SUCCESS, locked_value, "Get value %lu", locked_value); > > Reformatting ? Yup, and the "Set..." strings above. I missed that the format was wrong when I applied the fwft_feature_lock_test_values patch and just lazily fixed it up with this patch. I still haven't merged to the master branch yet, so I can still squash a formatting fix into the fwft_feature_lock_test_values patch in order to make this patch cleaner. > > > > > report_prefix_pop(); > > } > > @@ -103,6 +105,7 @@ static void fwft_feature_lock_test_values(uint32_t feature, size_t nr_values, > > static void fwft_feature_lock_test(uint32_t feature, unsigned long locked_value) > > { > > unsigned long values[] = {0, 1}; > > + > > That's some spurious newline here. It's also reformatting. > > > > fwft_feature_lock_test_values(feature, 2, values, locked_value); > > } > > > > @@ -317,7 +320,10 @@ static void fwft_check_pte_ad_hw_updating(void) > > report(ret.value == 0 || ret.value == 1, "first get value is 0/1"); > > > > enabled = ret.value; > > - report_kfail(true, !enabled, "resets to 0"); > > + > > + bool kfail = __sbi_get_imp_id() == SBI_IMPL_OPENSBI && > > + __sbi_get_imp_version() < sbi_impl_opensbi_mk_version(1, 7); > > + report_kfail(kfail, !enabled, "resets to 0"); > > > > install_exception_handler(EXC_LOAD_PAGE_FAULT, adue_read_handler); > > install_exception_handler(EXC_STORE_PAGE_FAULT, adue_write_handler); > > diff --git a/riscv/sbi.c b/riscv/sbi.c > > index 83bc55125d46..edb1a6bef1ac 100644 > > --- a/riscv/sbi.c > > +++ b/riscv/sbi.c > > @@ -515,10 +515,12 @@ end_two: > > sbiret_report_error(&ret, SBI_SUCCESS, "no targets, hart_mask_base is 1"); > > > > /* Try the next higher hartid than the max */ > > + bool kfail = __sbi_get_imp_id() == SBI_IMPL_OPENSBI && > > + 
__sbi_get_imp_version() < sbi_impl_opensbi_mk_version(1, 7); > > ret = sbi_send_ipi(2, max_hartid); > > - report_kfail(true, ret.error == SBI_ERR_INVALID_PARAM, "hart_mask got expected error (%ld)", ret.error); > > + sbiret_kfail_error(kfail, &ret, SBI_ERR_INVALID_PARAM, "hart_mask"); > > ret = sbi_send_ipi(1, max_hartid + 1); > > - report_kfail(true, ret.error == SBI_ERR_INVALID_PARAM, "hart_mask_base got expected error (%ld)", ret.error); > > + sbiret_kfail_error(kfail, &ret, SBI_ERR_INVALID_PARAM, "hart_mask_base"); > > > > report_prefix_pop(); > > > > Hi Andrew, > > I tried thinking of some way to factorize the version check but can't > really find something elegant. Without the spurious newline: I'll move the reformatting to the fwft_feature_lock_test_values patch, but I'm generally not overly opposed to sneaking a couple reformatting fixes into patches when the reformatting is obvious enough. > > Reviewed-by: Clément Léger Thanks, drew > > Thanks, > > Clément > > -- > kvm-riscv mailing list > kvm-riscv at lists.infradead.org > http://lists.infradead.org/mailman/listinfo/kvm-riscv From andrew.jones at linux.dev Sat Mar 22 03:46:18 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Sat, 22 Mar 2025 11:46:18 +0100 Subject: [kvm-unit-tests PATCH v11 0/8] riscv: add SBI SSE extension tests In-Reply-To: <20250317164655.1120015-1-cleger@rivosinc.com> References: <20250317164655.1120015-1-cleger@rivosinc.com> Message-ID: <20250322-2b07672652d6ce2c0972a62d@orel> On Mon, Mar 17, 2025 at 05:46:45PM +0100, Clément Léger wrote: > This series adds tests for SBI SSE extension as well as needed > infrastructure for SSE support. It also adds test specific asm-offsets > generation to use custom OFFSET and DEFINE from the test directory. > > These tests can be run using an OpenSBI version that implements latest > specifications modification [1] > > Link: https://github.com/rivosinc/opensbi/tree/dev/cleger/sse [1] Merged.
Thanks, drew From andrew.jones at linux.dev Sat Mar 22 03:47:14 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Sat, 22 Mar 2025 11:47:14 +0100 Subject: [kvm-unit-tests PATCH 0/3] riscv: sbi: Ensure we can pass with any opensbi In-Reply-To: <20250321165403.57859-5-andrew.jones@linux.dev> References: <20250321165403.57859-5-andrew.jones@linux.dev> Message-ID: <20250322-58f9c16024c28e345fd7ed02@orel> On Fri, Mar 21, 2025 at 05:54:04PM +0100, Andrew Jones wrote: > At some point CI will update the version of QEMU it uses. If the version > selected includes an opensbi that doesn't have all the fixes for current > tests, then CI will start failing for riscv. Guard against that by using > kfail for known opensbi failures. We only kfail for opensbi and for its > versions less than 1.7, though, as we expect everything to be fixed then. > > Andrew Jones (3): > lib/riscv: Also provide sbiret impl functions > riscv: sbi: Add kfail versions of sbiret_report functions > riscv: sbi: Use kfail for known opensbi failures > > lib/riscv/asm/sbi.h | 6 ++++-- > lib/riscv/sbi.c | 18 ++++++++++++++---- > riscv/sbi-fwft.c | 20 +++++++++++++------- > riscv/sbi-sse.c | 4 ++-- > riscv/sbi-tests.h | 20 +++++++++++++++----- > riscv/sbi.c | 6 ++++-- > 6 files changed, 52 insertions(+), 22 deletions(-) > > -- > 2.48.1 > Merged. Thanks, drew From andrew.jones at linux.dev Sat Mar 22 03:49:01 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Sat, 22 Mar 2025 11:49:01 +0100 Subject: [kvm-unit-tests PATCH] riscv: sbi: Add no target ipi test In-Reply-To: <20250317120814.35241-2-andrew.jones@linux.dev> References: <20250317120814.35241-2-andrew.jones@linux.dev> Message-ID: <20250322-954bef7c21610bcc571560d2@orel> On Mon, Mar 17, 2025 at 01:08:15PM +0100, Andrew Jones wrote: > While an hmask of zero without an hbase of -1 (which would mean > broadcast) is pointless, since nothing is targeted, the spec > doesn't say it should return an error. Maybe it should? 
Anyway, > for now, just confirm that it "works". > > Signed-off-by: Andrew Jones > --- > riscv/sbi.c | 4 ++++ > 1 file changed, 4 insertions(+) > Merged. Thanks, drew From andrew.jones at linux.dev Sat Mar 22 04:04:00 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Sat, 22 Mar 2025 12:04:00 +0100 Subject: [kvm-unit-tests PATCH v2 1/5] configure: arm64: Don't display 'aarch64' as the default architecture In-Reply-To: <20250314154904.3946484-3-jean-philippe@linaro.org> References: <20250314154904.3946484-2-jean-philippe@linaro.org> <20250314154904.3946484-3-jean-philippe@linaro.org> Message-ID: <20250322-4802e87f0e18f0b787adb5fb@orel> On Fri, Mar 14, 2025 at 03:49:01PM +0000, Jean-Philippe Brucker wrote: > From: Alexandru Elisei > > --arch=aarch64, intentional or not, has been supported since the initial > arm64 support, commit 39ac3f8494be ("arm64: initial drop"). However, > "aarch64" does not show up in the list of supported architectures, but > it's displayed as the default architecture if doing ./configure --help > on an arm64 machine. > > Keep everything consistent and make sure that the default value for > $arch is "arm64", but still allow --arch=aarch64, in case they are users > that use this configuration for kvm-unit-tests. > > The help text for --arch changes from: > > --arch=ARCH architecture to compile for (aarch64). ARCH can be one of: > arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 > > to: > > --arch=ARCH architecture to compile for (arm64). 
ARCH can be one of: > arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 > > Signed-off-by: Alexandru Elisei > --- > configure | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/configure b/configure > index 06532a89..dc3413fc 100755 > --- a/configure > +++ b/configure > @@ -15,8 +15,9 @@ objdump=objdump > readelf=readelf > ar=ar > addr2line=addr2line > -arch=$(uname -m | sed -e 's/i.86/i386/;s/arm64/aarch64/;s/arm.*/arm/;s/ppc64.*/ppc64/') > -host=$arch > +host=$(uname -m | sed -e 's/i.86/i386/;s/arm64/aarch64/;s/arm.*/arm/;s/ppc64.*/ppc64/') > +arch=$host > +[ "$arch" = "aarch64" ] && arch="arm64" I'd prefer we keep our block of assignments a block of assignments. We can put this at the bottom of the block, or, since the whole point of this is to make sure help text looks right, then just put it in usage() at the top. Thanks, drew > cross_prefix= > endian="" > pretty_print_stacks=yes > -- > 2.48.1 > > > -- > kvm-riscv mailing list > kvm-riscv at lists.infradead.org > http://lists.infradead.org/mailman/listinfo/kvm-riscv From andrew.jones at linux.dev Sat Mar 22 04:07:55 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Sat, 22 Mar 2025 12:07:55 +0100 Subject: [kvm-unit-tests PATCH v2 2/5] configure: arm/arm64: Display the correct default processor In-Reply-To: <20250314154904.3946484-4-jean-philippe@linaro.org> References: <20250314154904.3946484-2-jean-philippe@linaro.org> <20250314154904.3946484-4-jean-philippe@linaro.org> Message-ID: <20250322-0b1d267fc4717085047e1739@orel> On Fri, Mar 14, 2025 at 03:49:02PM +0000, Jean-Philippe Brucker wrote: > From: Alexandru Elisei > > The help text for the --processor option displays the architecture name as > the default processor type. But the default for arm is cortex-a15, and for > arm64 is cortex-a57. Teach configure to display the correct default > processor type for these two architectures. 
> > Signed-off-by: Alexandru Elisei > --- > configure | 30 ++++++++++++++++++++++-------- > 1 file changed, 22 insertions(+), 8 deletions(-) > > diff --git a/configure b/configure > index dc3413fc..5306bad3 100755 > --- a/configure > +++ b/configure > @@ -5,6 +5,24 @@ if [ -z "${BASH_VERSINFO[0]}" ] || [ "${BASH_VERSINFO[0]}" -lt 4 ] ; then > exit 1 > fi > > +function get_default_processor() > +{ > + local arch="$1" > + > + case "$arch" in > + "arm") > + default_processor="cortex-a15" > + ;; > + "arm64") > + default_processor="cortex-a57" > + ;; > + *) > + default_processor=$arch Missing ';;' > + esac > + > + echo "$default_processor" > +} > + > srcdir=$(cd "$(dirname "$0")"; pwd) > prefix=/usr/local > cc=gcc > @@ -43,13 +61,14 @@ else > fi > > usage() { > + [ -z "$processor" ] && processor=$(get_default_processor $arch) Seeing this convinces me that the additional arch=arm64 from the last patch should be done here above this. > cat <<-EOF > Usage: $0 [options] > > Options include: > --arch=ARCH architecture to compile for ($arch). ARCH can be one of: > arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 > - --processor=PROCESSOR processor to compile for ($arch) > + --processor=PROCESSOR processor to compile for ($processor) > --target=TARGET target platform that the tests will be running on (qemu or > kvmtool, default is qemu) (arm/arm64 only) > --cross-prefix=PREFIX cross compiler prefix > @@ -319,13 +338,8 @@ if [ "$earlycon" ]; then > fi > fi > > -[ -z "$processor" ] && processor="$arch" > - > -if [ "$processor" = "arm64" ]; then > - processor="cortex-a57" > -elif [ "$processor" = "arm" ]; then > - processor="cortex-a15" > -fi > +# $arch will have changed when cross-compiling. 
> +[ -z "$processor" ] && processor=$(get_default_processor $arch) > > if [ "$arch" = "i386" ] || [ "$arch" = "x86_64" ]; then > testdir=x86 > -- > 2.48.1 > Otherwise, Reviewed-by: Andrew Jones From andrew.jones at linux.dev Sat Mar 22 04:25:02 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Sat, 22 Mar 2025 12:25:02 +0100 Subject: [kvm-unit-tests PATCH v2 4/5] configure: Add --qemu-cpu option In-Reply-To: References: <20250314154904.3946484-2-jean-philippe@linaro.org> <20250314154904.3946484-6-jean-philippe@linaro.org> Message-ID: <20250322-f2d94472d7ebc003a54d7a9d@orel> On Thu, Mar 20, 2025 at 01:42:01PM +0000, Alexandru Elisei wrote: ... > > mach=$MACHINE_OVERRIDE > > -processor=$PROCESSOR_OVERRIDE > > +qemu_cpu=$QEMU_CPU_OVERRIDE > > The name QEMU_CPU_OVERRIDE makes more sense, but I think this will break > existing setups where people use PROCESSOR_OVERRIDE to pass a particular > -cpu value to qemu. > > If I were to guess, the environment variable was added to pass a value to > -cpu different than what it can be specified with ./configure --processor, > which is exactly what --qemu-cpu does. But I'll let Drew comment on that. Yeah, ideally we wouldn't rename environment variables, but I'll cross my fingers that nobody will notice. Indeed, it was created to avoid needing to reconfigure for experimenting with different QEMU cpu models, so, even with --qemu-cpu we'll want something and the renaming here makes sense. 
> > > firmware=$FIRMWARE_OVERRIDE > > > > -[ "$PROCESSOR" = "$ARCH" ] && PROCESSOR="max" > > +[ -z "$QEMU_CPU" ] && QEMU_CPU="max" > > : "${mach:=virt}" > > -: "${processor:=$PROCESSOR}" > > +: "${qemu_cpu:=$QEMU_CPU}" > > : "${firmware:=$FIRMWARE}" > > [ "$firmware" ] && firmware="-bios $firmware" > > > > @@ -32,7 +32,7 @@ fi > > mach="-machine $mach" > > > > command="$qemu -nodefaults -nographic -serial mon:stdio" > > -command+=" $mach $acc $firmware -cpu $processor " > > +command+=" $mach $acc $firmware -cpu $qemu_cpu " > > command="$(migration_cmd) $(timeout_cmd) $command" > > > > if [ "$UEFI_SHELL_RUN" = "y" ]; then > > diff --git a/configure b/configure > > index 5306bad3..d25bd23e 100755 > > --- a/configure > > +++ b/configure > > @@ -52,6 +52,7 @@ page_size= > > earlycon= > > efi= > > efi_direct= > > +qemu_cpu= > > > > # Enable -Werror by default for git repositories only (i.e. developer builds) > > if [ -e "$srcdir"/.git ]; then > > @@ -69,6 +70,8 @@ usage() { > > --arch=ARCH architecture to compile for ($arch). ARCH can be one of: > > arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 > > --processor=PROCESSOR processor to compile for ($processor) > > + --qemu-cpu=CPU the CPU model to run on. The default depends on > > + the configuration, usually it is "host" or "max". > > Nitpick here, would you mind changing this to "The default value depends on > the host system and the test configuration [..]", to make it clear that it also > depends on the machine the tests are being run on? I agree. 
Thanks, drew From andrew.jones at linux.dev Sat Mar 22 04:26:59 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Sat, 22 Mar 2025 12:26:59 +0100 Subject: [kvm-unit-tests PATCH v2 4/5] configure: Add --qemu-cpu option In-Reply-To: <20250314154904.3946484-6-jean-philippe@linaro.org> References: <20250314154904.3946484-2-jean-philippe@linaro.org> <20250314154904.3946484-6-jean-philippe@linaro.org> Message-ID: <20250322-91a8125ad8651b24246e5799@orel> On Fri, Mar 14, 2025 at 03:49:04PM +0000, Jean-Philippe Brucker wrote: > Add the --qemu-cpu option to let users set the CPU type to run on. > At the moment --processor allows to set both GCC -mcpu flag and QEMU > -cpu. On Arm we'd like to pass `-cpu max` to QEMU in order to enable all > the TCG features by default, and it could also be nice to let users > modify the CPU capabilities by setting extra -cpu options. > Since GCC -mcpu doesn't accept "max" or "host", separate the compiler > and QEMU arguments. > > `--processor` is now exclusively for compiler options, as indicated by > its documentation ("processor to compile for"). So use $QEMU_CPU on > RISC-V as well. 
> > Suggested-by: Andrew Jones > Signed-off-by: Jean-Philippe Brucker > --- > scripts/mkstandalone.sh | 2 +- > arm/run | 17 +++++++++++------ > riscv/run | 8 ++++---- > configure | 7 +++++++ > 4 files changed, 23 insertions(+), 11 deletions(-) > > diff --git a/scripts/mkstandalone.sh b/scripts/mkstandalone.sh > index 2318a85f..6b5f725d 100755 > --- a/scripts/mkstandalone.sh > +++ b/scripts/mkstandalone.sh > @@ -42,7 +42,7 @@ generate_test () > > config_export ARCH > config_export ARCH_NAME > - config_export PROCESSOR > + config_export QEMU_CPU > > echo "echo BUILD_HEAD=$(cat build-head)" > > diff --git a/arm/run b/arm/run > index efdd44ce..561bafab 100755 > --- a/arm/run > +++ b/arm/run > @@ -8,7 +8,7 @@ if [ -z "$KUT_STANDALONE" ]; then > source config.mak > source scripts/arch-run.bash > fi > -processor="$PROCESSOR" > +qemu_cpu="$QEMU_CPU" > > if [ "$QEMU" ] && [ -z "$ACCEL" ] && > [ "$HOST" = "aarch64" ] && [ "$ARCH" = "arm" ] && > @@ -37,12 +37,17 @@ if [ "$ACCEL" = "kvm" ]; then > fi > fi > > -if [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ]; then > - if [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ]; then > - processor="host" > +if [ -z "$qemu_cpu" ]; then > + if ( [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ] ) && > + ( [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ] ); then > + qemu_cpu="host" > if [ "$ARCH" = "arm" ] && [ "$HOST" = "aarch64" ]; then > - processor+=",aarch64=off" > + qemu_cpu+=",aarch64=off" > fi > + elif [ "$ARCH" = "arm64" ]; then > + qemu_cpu="cortex-a57" > + else > + qemu_cpu="cortex-a15" configure could set this in config.mak as DEFAULT_PROCESSOR, avoiding the need to duplicate it here. > fi > fi > > @@ -71,7 +76,7 @@ if $qemu $M -device '?' 
| grep -q pci-testdev; then > fi > > A="-accel $ACCEL$ACCEL_PROPS" > -command="$qemu -nodefaults $M $A -cpu $processor $chr_testdev $pci_testdev" > +command="$qemu -nodefaults $M $A -cpu $qemu_cpu $chr_testdev $pci_testdev" > command+=" -display none -serial stdio" > command="$(migration_cmd) $(timeout_cmd) $command" > > diff --git a/riscv/run b/riscv/run > index e2f5a922..02fcf0c0 100755 > --- a/riscv/run > +++ b/riscv/run > @@ -11,12 +11,12 @@ fi > > # Allow user overrides of some config.mak variables > mach=$MACHINE_OVERRIDE > -processor=$PROCESSOR_OVERRIDE > +qemu_cpu=$QEMU_CPU_OVERRIDE > firmware=$FIRMWARE_OVERRIDE > > -[ "$PROCESSOR" = "$ARCH" ] && PROCESSOR="max" > +[ -z "$QEMU_CPU" ] && QEMU_CPU="max" > : "${mach:=virt}" > -: "${processor:=$PROCESSOR}" > +: "${qemu_cpu:=$QEMU_CPU}" > : "${firmware:=$FIRMWARE}" > [ "$firmware" ] && firmware="-bios $firmware" > > @@ -32,7 +32,7 @@ fi > mach="-machine $mach" > > command="$qemu -nodefaults -nographic -serial mon:stdio" > -command+=" $mach $acc $firmware -cpu $processor " > +command+=" $mach $acc $firmware -cpu $qemu_cpu " > command="$(migration_cmd) $(timeout_cmd) $command" > > if [ "$UEFI_SHELL_RUN" = "y" ]; then > diff --git a/configure b/configure > index 5306bad3..d25bd23e 100755 > --- a/configure > +++ b/configure > @@ -52,6 +52,7 @@ page_size= > earlycon= > efi= > efi_direct= > +qemu_cpu= > > # Enable -Werror by default for git repositories only (i.e. developer builds) > if [ -e "$srcdir"/.git ]; then > @@ -69,6 +70,8 @@ usage() { > --arch=ARCH architecture to compile for ($arch). ARCH can be one of: > arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 > --processor=PROCESSOR processor to compile for ($processor) > + --qemu-cpu=CPU the CPU model to run on. The default depends on > + the configuration, usually it is "host" or "max". 
> --target=TARGET target platform that the tests will be running on (qemu or > kvmtool, default is qemu) (arm/arm64 only) > --cross-prefix=PREFIX cross compiler prefix > @@ -142,6 +145,9 @@ while [[ $optno -le $argc ]]; do > --processor) > processor="$arg" > ;; > + --qemu-cpu) > + qemu_cpu="$arg" > + ;; > --target) > target="$arg" > ;; > @@ -464,6 +470,7 @@ ARCH=$arch > ARCH_NAME=$arch_name > ARCH_LIBDIR=$arch_libdir > PROCESSOR=$processor > +QEMU_CPU=$qemu_cpu > CC=$cc > CFLAGS=$cflags > LD=$cross_prefix$ld > -- > 2.48.1 > With the Alex's and Eric's requested changes to the help text, Reviewed-by: Andrew Jones From andrew.jones at linux.dev Sat Mar 22 04:27:56 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Sat, 22 Mar 2025 12:27:56 +0100 Subject: [kvm-unit-tests PATCH v2 5/5] arm64: Use -cpu max as the default for TCG In-Reply-To: <20250314154904.3946484-7-jean-philippe@linaro.org> References: <20250314154904.3946484-2-jean-philippe@linaro.org> <20250314154904.3946484-7-jean-philippe@linaro.org> Message-ID: <20250322-c669034d2100a75ab6e53882@orel> On Fri, Mar 14, 2025 at 03:49:05PM +0000, Jean-Philippe Brucker wrote: > In order to test all the latest features, default to "max" as the QEMU > CPU type on arm64. > > Signed-off-by: Jean-Philippe Brucker > --- > arm/run | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arm/run b/arm/run > index 561bafab..84232e28 100755 > --- a/arm/run > +++ b/arm/run > @@ -45,7 +45,7 @@ if [ -z "$qemu_cpu" ]; then > qemu_cpu+=",aarch64=off" > fi > elif [ "$ARCH" = "arm64" ]; then > - qemu_cpu="cortex-a57" > + qemu_cpu="max" > else > qemu_cpu="cortex-a15" arm should also be able to default to 'max', right? 
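To make the resulting defaults easier to see, here is the selection logic from arm/run (as it stands with this patch applied) pulled into a function. This is only a sketch: the function name default_qemu_cpu and passing ACCEL, HOST and ARCH as arguments are assumptions for illustration, the script itself does this inline and only when $qemu_cpu is empty.

```shell
# Hypothetical wrapper around the arm/run heuristic, for illustration only.
# Usage: default_qemu_cpu ACCEL HOST ARCH
default_qemu_cpu() {
	accel="$1"; host="$2"; arch="$3"

	# Hardware acceleration on an Arm host: use the host CPU model.
	# (Brace groups are used here instead of the subshells in the patch.)
	if { [ "$accel" = "kvm" ] || [ "$accel" = "hvf" ]; } &&
	   { [ "$host" = "aarch64" ] || [ "$host" = "arm" ]; }; then
		cpu="host"
		# 32-bit guest on a 64-bit host needs AArch64 turned off.
		if [ "$arch" = "arm" ] && [ "$host" = "aarch64" ]; then
			cpu="$cpu,aarch64=off"
		fi
	elif [ "$arch" = "arm64" ]; then
		cpu="max"
	else
		# The 32-bit TCG default this question is about.
		cpu="cortex-a15"
	fi

	echo "$cpu"
}
```

For example, default_qemu_cpu tcg x86_64 arm64 gives "max" while default_qemu_cpu tcg x86_64 arm still gives "cortex-a15".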
Thanks, drew > fi > -- > 2.48.1 > > > -- > kvm-riscv mailing list > kvm-riscv at lists.infradead.org > http://lists.infradead.org/mailman/listinfo/kvm-riscv From ajones at ventanamicro.com Sat Mar 22 05:06:04 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Sat, 22 Mar 2025 13:06:04 +0100 Subject: [PATCH v4 02/18] riscv: sbi: add new SBI error mappings In-Reply-To: <20250317170625.1142870-3-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> <20250317170625.1142870-3-cleger@rivosinc.com> Message-ID: <20250322-cce038c88db88dd119a49846@orel> On Mon, Mar 17, 2025 at 06:06:08PM +0100, Clément Léger wrote: > A few new errors have been added with SBI V3.0, maps them as close as > possible to errno values. > > Signed-off-by: Clément Léger > --- > arch/riscv/include/asm/sbi.h | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h > index bb077d0c912f..d11d22717b49 100644 > --- a/arch/riscv/include/asm/sbi.h > +++ b/arch/riscv/include/asm/sbi.h > @@ -536,11 +536,20 @@ static inline int sbi_err_map_linux_errno(int err) > case SBI_SUCCESS: > return 0; > case SBI_ERR_DENIED: > + case SBI_ERR_DENIED_LOCKED: > return -EPERM; > case SBI_ERR_INVALID_PARAM: > + case SBI_ERR_INVALID_STATE: > + case SBI_ERR_BAD_RANGE: > return -EINVAL; > case SBI_ERR_INVALID_ADDRESS: > return -EFAULT; > + case SBI_ERR_NO_SHMEM: > + return -ENOMEM; > + case SBI_ERR_TIMEOUT: > + return -ETIME; > + case SBI_ERR_IO: > + return -EIO; > case SBI_ERR_NOT_SUPPORTED: > case SBI_ERR_FAILURE: > default: > -- > 2.47.2 > I'm not a huge fan of sbi_err_map_linux_errno() since the mappings seem a bit arbitrary, but if we're going to do it, then these look pretty good to me. Only other thought I had was E2BIG for bad-range, but nah...
Reviewed-by: Andrew Jones Thanks, drew From ajones at ventanamicro.com Sat Mar 22 05:11:28 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Sat, 22 Mar 2025 13:11:28 +0100 Subject: [PATCH v4 03/18] riscv: sbi: add FWFT extension interface In-Reply-To: <20250317170625.1142870-4-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> <20250317170625.1142870-4-cleger@rivosinc.com> Message-ID: <20250322-a87faa18fe5b54b7cb61b353@orel> On Mon, Mar 17, 2025 at 06:06:09PM +0100, Clément Léger wrote: > This SBI extensions enables supervisor mode to control feature that are > under M-mode control (For instance, Svadu menvcfg ADUE bit, Ssdbltrp > DTE, etc). Add an interface to set local features for a specific cpu > mask as well as for the online cpu mask. > > Signed-off-by: Clément Léger > --- > arch/riscv/include/asm/sbi.h | 20 +++++++++++ > arch/riscv/kernel/sbi.c | 69 ++++++++++++++++++++++++++++++++++++ > 2 files changed, 89 insertions(+) > > diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h > index d11d22717b49..1cecfa82c2e5 100644 > --- a/arch/riscv/include/asm/sbi.h > +++ b/arch/riscv/include/asm/sbi.h > @@ -503,6 +503,26 @@ int sbi_remote_hfence_vvma_asid(const struct cpumask *cpu_mask, > unsigned long asid); > long sbi_probe_extension(int ext); > > +int sbi_fwft_local_set_cpumask(const cpumask_t *mask, u32 feature, > + unsigned long value, unsigned long flags); > +/** > + * sbi_fwft_local_set() - Set a feature on all online cpus > + * @feature: The feature to be set > + * @value: The feature value to be set > + * @flags: FWFT feature set flags > + * > + * Return: 0 on success, appropriate linux error code otherwise. > + */ > + static inline int sbi_fwft_local_set(u32 feature, unsigned long value, > + unsigned long flags) > + { > + return sbi_fwft_local_set_cpumask(cpu_online_mask, feature, value, > + flags); Let flags stick out. We have 100 chars.
> + } > + > +int sbi_fwft_get(u32 feature, unsigned long *value); > +int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags); > + > /* Check if current SBI specification version is 0.1 or not */ > static inline int sbi_spec_is_0_1(void) > { > diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c > index 1989b8cade1b..d41a5642be24 100644 > --- a/arch/riscv/kernel/sbi.c > +++ b/arch/riscv/kernel/sbi.c > @@ -299,6 +299,75 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask, > return 0; > } > > +/** > + * sbi_fwft_get() - Get a feature for the local hart > + * @feature: The feature ID to be set > + * @value: Will contain the feature value on success > + * > + * Return: 0 on success, appropriate linux error code otherwise. > + */ > +int sbi_fwft_get(u32 feature, unsigned long *value) > +{ > + return -EOPNOTSUPP; > +} > + > +/** > + * sbi_fwft_set() - Set a feature on the local hart > + * @feature: The feature ID to be set > + * @value: The feature value to be set > + * @flags: FWFT feature set flags > + * > + * Return: 0 on success, appropriate linux error code otherwise. > + */ > +int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags) > +{ > + return -EOPNOTSUPP; > +} > + > +struct fwft_set_req { > + u32 feature; > + unsigned long value; > + unsigned long flags; > + atomic_t error; > +}; > + > +static void cpu_sbi_fwft_set(void *arg) > +{ > + struct fwft_set_req *req = arg; > + int ret; > + > + ret = sbi_fwft_set(req->feature, req->value, req->flags); > + if (ret) > + atomic_set(&req->error, ret); > +} > + > +/** > + * sbi_fwft_local_set() - Set a feature for the specified cpumask sbi_fwft_local_set_cpumask > + * @mask: CPU mask of cpus that need the feature to be set > + * @feature: The feature ID to be set > + * @value: The feature value to be set > + * @flags: FWFT feature set flags > + * > + * Return: 0 on success, appropriate linux error code otherwise. 
> + */ > +int sbi_fwft_local_set_cpumask(const cpumask_t *mask, u32 feature, > + unsigned long value, unsigned long flags) > +{ > + struct fwft_set_req req = { > + .feature = feature, > + .value = value, > + .flags = flags, > + .error = ATOMIC_INIT(0), > + }; > + > + if (feature & SBI_FWFT_GLOBAL_FEATURE_BIT) > + return -EINVAL; > + > + on_each_cpu_mask(mask, cpu_sbi_fwft_set, &req, 1); > + > + return atomic_read(&req.error); > +} > + > /** > * sbi_set_timer() - Program the timer for next timer event. > * @stime_value: The value after which next timer event should fire. > -- > 2.47.2 > Otherwise, Reviewed-by: Andrew Jones From ajones at ventanamicro.com Sat Mar 22 05:14:03 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Sat, 22 Mar 2025 13:14:03 +0100 Subject: [PATCH v4 04/18] riscv: sbi: add SBI FWFT extension calls In-Reply-To: <20250317170625.1142870-5-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> <20250317170625.1142870-5-cleger@rivosinc.com> Message-ID: <20250322-62ad16eaf41abd98170fa6bb@orel> On Mon, Mar 17, 2025 at 06:06:10PM +0100, Cl?ment L?ger wrote: > Add FWFT extension calls. This will be ratified in SBI V3.0 hence, it is > provided as a separate commit that can be left out if needed. 
> > Signed-off-by: Cl?ment L?ger > --- > arch/riscv/kernel/sbi.c | 30 ++++++++++++++++++++++++++++-- > 1 file changed, 28 insertions(+), 2 deletions(-) > > diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c > index d41a5642be24..54d9ceb7b723 100644 > --- a/arch/riscv/kernel/sbi.c > +++ b/arch/riscv/kernel/sbi.c > @@ -299,6 +299,8 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask, > return 0; > } > > +static bool sbi_fwft_supported; > + > /** > * sbi_fwft_get() - Get a feature for the local hart > * @feature: The feature ID to be set > @@ -308,7 +310,15 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask, > */ > int sbi_fwft_get(u32 feature, unsigned long *value) > { > - return -EOPNOTSUPP; > + struct sbiret ret; > + > + if (!sbi_fwft_supported) > + return -EOPNOTSUPP; > + > + ret = sbi_ecall(SBI_EXT_FWFT, SBI_EXT_FWFT_GET, > + feature, 0, 0, 0, 0, 0); We're missing the if (!ret) *value = ret.value; part. > + > + return sbi_err_map_linux_errno(ret.error); > } > > /** > @@ -321,7 +331,15 @@ int sbi_fwft_get(u32 feature, unsigned long *value) > */ > int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags) > { > - return -EOPNOTSUPP; > + struct sbiret ret; > + > + if (!sbi_fwft_supported) > + return -EOPNOTSUPP; > + > + ret = sbi_ecall(SBI_EXT_FWFT, SBI_EXT_FWFT_SET, > + feature, value, flags, 0, 0, 0); > + > + return sbi_err_map_linux_errno(ret.error); > } > > struct fwft_set_req { > @@ -360,6 +378,9 @@ int sbi_fwft_local_set_cpumask(const cpumask_t *mask, u32 feature, > .error = ATOMIC_INIT(0), > }; > > + if (!sbi_fwft_supported) > + return -EOPNOTSUPP; > + > if (feature & SBI_FWFT_GLOBAL_FEATURE_BIT) > return -EINVAL; > > @@ -691,6 +712,11 @@ void __init sbi_init(void) > pr_info("SBI DBCN extension detected\n"); > sbi_debug_console_available = true; > } > + if ((sbi_spec_version >= sbi_mk_version(3, 0)) && > + (sbi_probe_extension(SBI_EXT_FWFT) > 0)) { > + pr_info("SBI FWFT extension detected\n"); > 
+ sbi_fwft_supported = true; > + } > } else { > __sbi_set_timer = __sbi_set_timer_v01; > __sbi_send_ipi = __sbi_send_ipi_v01; > -- > 2.47.2 > Thanks, drew From ajones at ventanamicro.com Sat Mar 22 05:23:35 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Sat, 22 Mar 2025 13:23:35 +0100 Subject: [PATCH v4 15/18] RISC-V: KVM: add SBI extension init()/deinit() functions In-Reply-To: <20250317170625.1142870-16-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> <20250317170625.1142870-16-cleger@rivosinc.com> Message-ID: <20250322-79de477d27a2a75eb89616d1@orel> On Mon, Mar 17, 2025 at 06:06:21PM +0100, Cl?ment L?ger wrote: > The FWFT SBI extension will need to dynamically allocate memory and do > init time specific initialization. Add an init/deinit callbacks that > allows to do so. > > Signed-off-by: Cl?ment L?ger > --- > arch/riscv/include/asm/kvm_vcpu_sbi.h | 9 +++++++++ > arch/riscv/kvm/vcpu.c | 2 ++ > arch/riscv/kvm/vcpu_sbi.c | 26 ++++++++++++++++++++++++++ > 3 files changed, 37 insertions(+) > > diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h > index 4ed6203cdd30..bcb90757b149 100644 > --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h > +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h > @@ -49,6 +49,14 @@ struct kvm_vcpu_sbi_extension { > > /* Extension specific probe function */ > unsigned long (*probe)(struct kvm_vcpu *vcpu); > + > + /* > + * Init/deinit function called once during VCPU init/destroy. These > + * might be use if the SBI extensions need to allocate or do specific > + * init time only configuration. 
> + */ > + int (*init)(struct kvm_vcpu *vcpu); > + void (*deinit)(struct kvm_vcpu *vcpu); > }; > > void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run); > @@ -69,6 +77,7 @@ const struct kvm_vcpu_sbi_extension *kvm_vcpu_sbi_find_ext( > bool riscv_vcpu_supports_sbi_ext(struct kvm_vcpu *vcpu, int idx); > int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run); > void kvm_riscv_vcpu_sbi_init(struct kvm_vcpu *vcpu); > +void kvm_riscv_vcpu_sbi_deinit(struct kvm_vcpu *vcpu); > > int kvm_riscv_vcpu_get_reg_sbi_sta(struct kvm_vcpu *vcpu, unsigned long reg_num, > unsigned long *reg_val); > diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c > index 60d684c76c58..877bcc85c067 100644 > --- a/arch/riscv/kvm/vcpu.c > +++ b/arch/riscv/kvm/vcpu.c > @@ -185,6 +185,8 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu) > > void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu) > { > + kvm_riscv_vcpu_sbi_deinit(vcpu); > + > /* Cleanup VCPU AIA context */ > kvm_riscv_vcpu_aia_deinit(vcpu); > > diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c > index d1c83a77735e..3139f171c20f 100644 > --- a/arch/riscv/kvm/vcpu_sbi.c > +++ b/arch/riscv/kvm/vcpu_sbi.c > @@ -508,5 +508,31 @@ void kvm_riscv_vcpu_sbi_init(struct kvm_vcpu *vcpu) > scontext->ext_status[idx] = ext->default_disabled ? 
> KVM_RISCV_SBI_EXT_STATUS_DISABLED : > KVM_RISCV_SBI_EXT_STATUS_ENABLED; > + > + if (ext->init && ext->init(vcpu) != 0) > + scontext->ext_status[idx] = KVM_RISCV_SBI_EXT_STATUS_UNAVAILABLE; > + } > +} > + > +void kvm_riscv_vcpu_sbi_deinit(struct kvm_vcpu *vcpu) > +{ > + struct kvm_vcpu_sbi_context *scontext = &vcpu->arch.sbi_context; > + const struct kvm_riscv_sbi_extension_entry *entry; > + const struct kvm_vcpu_sbi_extension *ext; > + int idx, i; > + > + for (i = 0; i < ARRAY_SIZE(sbi_ext); i++) { > + entry = &sbi_ext[i]; > + ext = entry->ext_ptr; > + idx = entry->ext_idx; > + > + if (idx < 0 || idx >= ARRAY_SIZE(scontext->ext_status)) > + continue; > + > + if (scontext->ext_status[idx] == KVM_RISCV_SBI_EXT_STATUS_UNAVAILABLE || > + !ext->deinit) > + continue; > + > + ext->deinit(vcpu); > } > } > -- > 2.47.2 > Reviewed-by: Andrew Jones From ajones at ventanamicro.com Sat Mar 22 05:30:10 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Sat, 22 Mar 2025 13:30:10 +0100 Subject: [PATCH v4 17/18] RISC-V: KVM: add support for FWFT SBI extension In-Reply-To: <20250317170625.1142870-18-cleger@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> <20250317170625.1142870-18-cleger@rivosinc.com> Message-ID: <20250322-c4eaee71aa9c1f0b13ca8fef@orel> On Mon, Mar 17, 2025 at 06:06:23PM +0100, Cl?ment L?ger wrote: > Add basic infrastructure to support the FWFT extension in KVM. 
> > Signed-off-by: Cl?ment L?ger > Reviewed-by: Andrew Jones > --- > arch/riscv/include/asm/kvm_host.h | 4 + > arch/riscv/include/asm/kvm_vcpu_sbi.h | 1 + > arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h | 29 +++ > arch/riscv/include/uapi/asm/kvm.h | 1 + > arch/riscv/kvm/Makefile | 1 + > arch/riscv/kvm/vcpu_sbi.c | 4 + > arch/riscv/kvm/vcpu_sbi_fwft.c | 216 +++++++++++++++++++++ > 7 files changed, 256 insertions(+) > create mode 100644 arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h > create mode 100644 arch/riscv/kvm/vcpu_sbi_fwft.c > > diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h > index bb93d2995ea2..c0db61ba691a 100644 > --- a/arch/riscv/include/asm/kvm_host.h > +++ b/arch/riscv/include/asm/kvm_host.h > @@ -19,6 +19,7 @@ > #include > #include > #include > +#include > #include > #include > > @@ -281,6 +282,9 @@ struct kvm_vcpu_arch { > /* Performance monitoring context */ > struct kvm_pmu pmu_context; > > + /* Firmware feature SBI extension context */ > + struct kvm_sbi_fwft fwft_context; > + > /* 'static' configurations which are set only once */ > struct kvm_vcpu_config cfg; > > diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h > index cb68b3a57c8f..ffd03fed0c06 100644 > --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h > +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h > @@ -98,6 +98,7 @@ extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_hsm; > extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_dbcn; > extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_susp; > extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_sta; > +extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_fwft; > extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_experimental; > extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_vendor; > > diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h b/arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h > new file mode 100644 > index 000000000000..9ba841355758 > --- 
/dev/null > +++ b/arch/riscv/include/asm/kvm_vcpu_sbi_fwft.h > @@ -0,0 +1,29 @@ > +/* SPDX-License-Identifier: GPL-2.0-only */ > +/* > + * Copyright (c) 2025 Rivos Inc. > + * > + * Authors: > + * Cl?ment L?ger > + */ > + > +#ifndef __KVM_VCPU_RISCV_FWFT_H > +#define __KVM_VCPU_RISCV_FWFT_H > + > +#include > + > +struct kvm_sbi_fwft_feature; > + > +struct kvm_sbi_fwft_config { > + const struct kvm_sbi_fwft_feature *feature; > + bool supported; > + unsigned long flags; > +}; > + > +/* FWFT data structure per vcpu */ > +struct kvm_sbi_fwft { > + struct kvm_sbi_fwft_config *configs; > +}; > + > +#define vcpu_to_fwft(vcpu) (&(vcpu)->arch.fwft_context) > + > +#endif /* !__KVM_VCPU_RISCV_FWFT_H */ > diff --git a/arch/riscv/include/uapi/asm/kvm.h b/arch/riscv/include/uapi/asm/kvm.h > index f06bc5efcd79..fa6eee1caf41 100644 > --- a/arch/riscv/include/uapi/asm/kvm.h > +++ b/arch/riscv/include/uapi/asm/kvm.h > @@ -202,6 +202,7 @@ enum KVM_RISCV_SBI_EXT_ID { > KVM_RISCV_SBI_EXT_DBCN, > KVM_RISCV_SBI_EXT_STA, > KVM_RISCV_SBI_EXT_SUSP, > + KVM_RISCV_SBI_EXT_FWFT, > KVM_RISCV_SBI_EXT_MAX, > }; > > diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile > index 4e0bba91d284..06e2d52a9b88 100644 > --- a/arch/riscv/kvm/Makefile > +++ b/arch/riscv/kvm/Makefile > @@ -26,6 +26,7 @@ kvm-y += vcpu_onereg.o > kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o > kvm-y += vcpu_sbi.o > kvm-y += vcpu_sbi_base.o > +kvm-y += vcpu_sbi_fwft.o > kvm-y += vcpu_sbi_hsm.o > kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_sbi_pmu.o > kvm-y += vcpu_sbi_replace.o > diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c > index 50be079b5528..0748810c0252 100644 > --- a/arch/riscv/kvm/vcpu_sbi.c > +++ b/arch/riscv/kvm/vcpu_sbi.c > @@ -78,6 +78,10 @@ static const struct kvm_riscv_sbi_extension_entry sbi_ext[] = { > .ext_idx = KVM_RISCV_SBI_EXT_STA, > .ext_ptr = &vcpu_sbi_ext_sta, > }, > + { > + .ext_idx = KVM_RISCV_SBI_EXT_FWFT, > + .ext_ptr = &vcpu_sbi_ext_fwft, > + }, > { > .ext_idx = 
KVM_RISCV_SBI_EXT_EXPERIMENTAL, > .ext_ptr = &vcpu_sbi_ext_experimental, > diff --git a/arch/riscv/kvm/vcpu_sbi_fwft.c b/arch/riscv/kvm/vcpu_sbi_fwft.c > new file mode 100644 > index 000000000000..8a7cfe1fe7a7 > --- /dev/null > +++ b/arch/riscv/kvm/vcpu_sbi_fwft.c > @@ -0,0 +1,216 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Copyright (c) 2025 Rivos Inc. > + * > + * Authors: > + * Cl?ment L?ger > + */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +struct kvm_sbi_fwft_feature { > + /** > + * @id: Feature ID > + */ > + enum sbi_fwft_feature_t id; > + > + /** > + * @supported: Check if the feature is supported on the vcpu > + * > + * This callback is optional, if not provided the feature is assumed to > + * be supported > + */ > + bool (*supported)(struct kvm_vcpu *vcpu); > + > + /** > + * @set: Set the feature value > + * > + * Return SBI_SUCCESS on success or an SBI error (SBI_ERR_*) > + * > + * This callback is mandatory > + */ > + long (*set)(struct kvm_vcpu *vcpu, struct kvm_sbi_fwft_config *conf, unsigned long value); > + > + /** > + * @get: Get the feature current value > + * > + * Return SBI_SUCCESS on success or an SBI error (SBI_ERR_*) > + * > + * This callback is mandatory > + */ > + long (*get)(struct kvm_vcpu *vcpu, struct kvm_sbi_fwft_config *conf, unsigned long *value); ^ extra space here Thanks, drew From alexandru.elisei at arm.com Sun Mar 23 04:16:19 2025 From: alexandru.elisei at arm.com (Alexandru Elisei) Date: Sun, 23 Mar 2025 11:16:19 +0000 Subject: [kvm-unit-tests PATCH v2 4/5] configure: Add --qemu-cpu option In-Reply-To: <20250322-91a8125ad8651b24246e5799@orel> References: <20250314154904.3946484-2-jean-philippe@linaro.org> <20250314154904.3946484-6-jean-philippe@linaro.org> <20250322-91a8125ad8651b24246e5799@orel> Message-ID: Hi Drew, On Sat, Mar 22, 2025 at 12:26:59PM +0100, Andrew Jones wrote: > On Fri, Mar 14, 2025 at 03:49:04PM +0000, Jean-Philippe Brucker wrote: > > Add the 
--qemu-cpu option to let users set the CPU type to run on. > > At the moment --processor allows to set both GCC -mcpu flag and QEMU > > -cpu. On Arm we'd like to pass `-cpu max` to QEMU in order to enable all > > the TCG features by default, and it could also be nice to let users > > modify the CPU capabilities by setting extra -cpu options. > > Since GCC -mcpu doesn't accept "max" or "host", separate the compiler > > and QEMU arguments. > > > > `--processor` is now exclusively for compiler options, as indicated by > > its documentation ("processor to compile for"). So use $QEMU_CPU on > > RISC-V as well. > > > > Suggested-by: Andrew Jones > > Signed-off-by: Jean-Philippe Brucker > > --- > > scripts/mkstandalone.sh | 2 +- > > arm/run | 17 +++++++++++------ > > riscv/run | 8 ++++---- > > configure | 7 +++++++ > > 4 files changed, 23 insertions(+), 11 deletions(-) > > > > diff --git a/scripts/mkstandalone.sh b/scripts/mkstandalone.sh > > index 2318a85f..6b5f725d 100755 > > --- a/scripts/mkstandalone.sh > > +++ b/scripts/mkstandalone.sh > > @@ -42,7 +42,7 @@ generate_test () > > > > config_export ARCH > > config_export ARCH_NAME > > - config_export PROCESSOR > > + config_export QEMU_CPU > > > > echo "echo BUILD_HEAD=$(cat build-head)" > > > > diff --git a/arm/run b/arm/run > > index efdd44ce..561bafab 100755 > > --- a/arm/run > > +++ b/arm/run > > @@ -8,7 +8,7 @@ if [ -z "$KUT_STANDALONE" ]; then > > source config.mak > > source scripts/arch-run.bash > > fi > > -processor="$PROCESSOR" > > +qemu_cpu="$QEMU_CPU" > > > > if [ "$QEMU" ] && [ -z "$ACCEL" ] && > > [ "$HOST" = "aarch64" ] && [ "$ARCH" = "arm" ] && > > @@ -37,12 +37,17 @@ if [ "$ACCEL" = "kvm" ]; then > > fi > > fi > > > > -if [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ]; then > > - if [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ]; then > > - processor="host" > > +if [ -z "$qemu_cpu" ]; then > > + if ( [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ] ) && > > + ( [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ] ); 
then > > + qemu_cpu="host" > > if [ "$ARCH" = "arm" ] && [ "$HOST" = "aarch64" ]; then > > - processor+=",aarch64=off" > > + qemu_cpu+=",aarch64=off" > > fi > > + elif [ "$ARCH" = "arm64" ]; then > > + qemu_cpu="cortex-a57" > > + else > > + qemu_cpu="cortex-a15" > > configure could set this in config.mak as DEFAULT_PROCESSOR, avoiding the > need to duplicate it here. That was my first instinct too, having the default value in config.mak seemed like the correct solution. But the problem with this is that the default -cpu type depends on -accel (set via unittests.cfg or as an environment variable), host and test architecture combination. All of these variables are known only at runtime. Let's say we have DEFAULT_QEMU_CPU=cortex-a57 in config.mak. If we keep the above heuristic, arm/run will override it with host,aarch64=off. IMO, having it in config.mak, but arm/run using it only under certain conditions is worse than not having it at all. arm/run choosing the default value **all the time** is at least consistent. We could modify the help text for --qemu-cpu to say something like "If left unset, the $ARCH/run script will choose a best value based on the host system and test configuration." Thanks, Alex > > > fi > > fi > > > > @@ -71,7 +76,7 @@ if $qemu $M -device '?' 
| grep -q pci-testdev; then > > fi > > > > A="-accel $ACCEL$ACCEL_PROPS" > > -command="$qemu -nodefaults $M $A -cpu $processor $chr_testdev $pci_testdev" > > +command="$qemu -nodefaults $M $A -cpu $qemu_cpu $chr_testdev $pci_testdev" > > command+=" -display none -serial stdio" > > command="$(migration_cmd) $(timeout_cmd) $command" > > > > diff --git a/riscv/run b/riscv/run > > index e2f5a922..02fcf0c0 100755 > > --- a/riscv/run > > +++ b/riscv/run > > @@ -11,12 +11,12 @@ fi > > > > # Allow user overrides of some config.mak variables > > mach=$MACHINE_OVERRIDE > > -processor=$PROCESSOR_OVERRIDE > > +qemu_cpu=$QEMU_CPU_OVERRIDE > > firmware=$FIRMWARE_OVERRIDE > > > > -[ "$PROCESSOR" = "$ARCH" ] && PROCESSOR="max" > > +[ -z "$QEMU_CPU" ] && QEMU_CPU="max" > > : "${mach:=virt}" > > -: "${processor:=$PROCESSOR}" > > +: "${qemu_cpu:=$QEMU_CPU}" > > : "${firmware:=$FIRMWARE}" > > [ "$firmware" ] && firmware="-bios $firmware" > > > > @@ -32,7 +32,7 @@ fi > > mach="-machine $mach" > > > > command="$qemu -nodefaults -nographic -serial mon:stdio" > > -command+=" $mach $acc $firmware -cpu $processor " > > +command+=" $mach $acc $firmware -cpu $qemu_cpu " > > command="$(migration_cmd) $(timeout_cmd) $command" > > > > if [ "$UEFI_SHELL_RUN" = "y" ]; then > > diff --git a/configure b/configure > > index 5306bad3..d25bd23e 100755 > > --- a/configure > > +++ b/configure > > @@ -52,6 +52,7 @@ page_size= > > earlycon= > > efi= > > efi_direct= > > +qemu_cpu= > > > > # Enable -Werror by default for git repositories only (i.e. developer builds) > > if [ -e "$srcdir"/.git ]; then > > @@ -69,6 +70,8 @@ usage() { > > --arch=ARCH architecture to compile for ($arch). ARCH can be one of: > > arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 > > --processor=PROCESSOR processor to compile for ($processor) > > + --qemu-cpu=CPU the CPU model to run on. The default depends on > > + the configuration, usually it is "host" or "max". 
> > --target=TARGET target platform that the tests will be running on (qemu or > > kvmtool, default is qemu) (arm/arm64 only) > > --cross-prefix=PREFIX cross compiler prefix > > @@ -142,6 +145,9 @@ while [[ $optno -le $argc ]]; do > > --processor) > > processor="$arg" > > ;; > > + --qemu-cpu) > > + qemu_cpu="$arg" > > + ;; > > --target) > > target="$arg" > > ;; > > @@ -464,6 +470,7 @@ ARCH=$arch > > ARCH_NAME=$arch_name > > ARCH_LIBDIR=$arch_libdir > > PROCESSOR=$processor > > +QEMU_CPU=$qemu_cpu > > CC=$cc > > CFLAGS=$cflags > > LD=$cross_prefix$ld > > -- > > 2.48.1 > > > > With the Alex's and Eric's requested changes to the help text, > > Reviewed-by: Andrew Jones From andrew.jones at linux.dev Mon Mar 24 01:19:27 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Mon, 24 Mar 2025 09:19:27 +0100 Subject: [kvm-unit-tests PATCH v2 4/5] configure: Add --qemu-cpu option In-Reply-To: References: <20250314154904.3946484-2-jean-philippe@linaro.org> <20250314154904.3946484-6-jean-philippe@linaro.org> <20250322-91a8125ad8651b24246e5799@orel> Message-ID: <20250324-5d22d8ad79a9db37b1cf6961@orel> On Sun, Mar 23, 2025 at 11:16:19AM +0000, Alexandru Elisei wrote: ... > > > +if [ -z "$qemu_cpu" ]; then > > > + if ( [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ] ) && > > > + ( [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ] ); then > > > + qemu_cpu="host" > > > if [ "$ARCH" = "arm" ] && [ "$HOST" = "aarch64" ]; then > > > - processor+=",aarch64=off" > > > + qemu_cpu+=",aarch64=off" > > > fi > > > + elif [ "$ARCH" = "arm64" ]; then > > > + qemu_cpu="cortex-a57" > > > + else > > > + qemu_cpu="cortex-a15" > > > > configure could set this in config.mak as DEFAULT_PROCESSOR, avoiding the > > need to duplicate it here. > > That was my first instinct too, having the default value in config.mak seemed > like the correct solution. 
> > But the problem with this is that the default -cpu type depends on -accel (set > via unittests.cfg or as an environment variable), host and test architecture > combination. All of these variables are known only at runtime. > > Let's say we have DEFAULT_QEMU_CPU=cortex-a57 in config.mak. If we keep the > above heuristic, arm/run will override it with host,aarch64=off. IMO, having it > in config.mak, but arm/run using it only under certain conditions is worse than > not having it at all. arm/run choosing the default value **all the time** is at > least consistent. I think having 'DEFAULT' in the name implies that it will only be used if there's nothing better, and we don't require everything in config.mak to be used (there's even some s390x-specific stuff in there for all architectures...) > > We could modify the help text for --qemu-cpu to say something like "If left > unset, the $ARCH/run script will choose a best value based on the host system > and test configuration." This is helpful, so we should add it regardless. Thanks, drew From cleger at rivosinc.com Mon Mar 24 01:29:33 2025 From: cleger at rivosinc.com (Clément Léger) Date: Mon, 24 Mar 2025 09:29:33 +0100 Subject: [PATCH v4 02/18] riscv: sbi: add new SBI error mappings In-Reply-To: <20250322-cce038c88db88dd119a49846@orel> References: <20250317170625.1142870-1-cleger@rivosinc.com> <20250317170625.1142870-3-cleger@rivosinc.com> <20250322-cce038c88db88dd119a49846@orel> Message-ID: <779c137d-5030-4212-b957-3d2620448ea9@rivosinc.com> On 22/03/2025 13:06, Andrew Jones wrote: > On Mon, Mar 17, 2025 at 06:06:08PM +0100, Clément Léger wrote: >> A few new errors have been added with SBI V3.0, maps them as close as >> possible to errno values. 
>> >> Signed-off-by: Clément Léger >> --- >> arch/riscv/include/asm/sbi.h | 9 +++++++++ >> 1 file changed, 9 insertions(+) >> >> diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h >> index bb077d0c912f..d11d22717b49 100644 >> --- a/arch/riscv/include/asm/sbi.h >> +++ b/arch/riscv/include/asm/sbi.h >> @@ -536,11 +536,20 @@ static inline int sbi_err_map_linux_errno(int err) >> case SBI_SUCCESS: >> return 0; >> case SBI_ERR_DENIED: >> + case SBI_ERR_DENIED_LOCKED: >> return -EPERM; >> case SBI_ERR_INVALID_PARAM: >> + case SBI_ERR_INVALID_STATE: >> + case SBI_ERR_BAD_RANGE: >> return -EINVAL; >> case SBI_ERR_INVALID_ADDRESS: >> return -EFAULT; >> + case SBI_ERR_NO_SHMEM: >> + return -ENOMEM; >> + case SBI_ERR_TIMEOUT: >> + return -ETIME; >> + case SBI_ERR_IO: >> + return -EIO; >> case SBI_ERR_NOT_SUPPORTED: >> case SBI_ERR_FAILURE: >> default: >> -- >> 2.47.2 >> > > I'm not a huge fan sbi_err_map_linux_errno() since the mappings seem a bit > arbitrary, but if we're going to do it, then these look pretty good to me. > Only other thought I had was E2BIG for bad-range, but nah... Yeah I also think some mappings are a bit odd even though I skimmed through the whole errno list to find the best possible mappings. I'd be happy to find something better though. Thanks, Clément > > Reviewed-by: Andrew Jones > > Thanks, > drew From cleger at rivosinc.com Mon Mar 24 01:37:17 2025 From: cleger at rivosinc.com (Clément Léger) Date: Mon, 24 Mar 2025 09:37:17 +0100 Subject: [PATCH v4 04/18] riscv: sbi: add SBI FWFT extension calls In-Reply-To: <20250322-62ad16eaf41abd98170fa6bb@orel> References: <20250317170625.1142870-1-cleger@rivosinc.com> <20250317170625.1142870-5-cleger@rivosinc.com> <20250322-62ad16eaf41abd98170fa6bb@orel> Message-ID: On 22/03/2025 13:14, Andrew Jones wrote: > On Mon, Mar 17, 2025 at 06:06:10PM +0100, Clément Léger wrote: >> Add FWFT extension calls. 
This will be ratified in SBI V3.0 hence, it is >> provided as a separate commit that can be left out if needed. >> >> Signed-off-by: Clément Léger >> --- >> arch/riscv/kernel/sbi.c | 30 ++++++++++++++++++++++++++++-- >> 1 file changed, 28 insertions(+), 2 deletions(-) >> >> diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c >> index d41a5642be24..54d9ceb7b723 100644 >> --- a/arch/riscv/kernel/sbi.c >> +++ b/arch/riscv/kernel/sbi.c >> @@ -299,6 +299,8 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask, >> return 0; >> } >> >> +static bool sbi_fwft_supported; >> + >> /** >> * sbi_fwft_get() - Get a feature for the local hart >> * @feature: The feature ID to be set >> @@ -308,7 +310,15 @@ static int __sbi_rfence_v02(int fid, const struct cpumask *cpu_mask, >> */ >> int sbi_fwft_get(u32 feature, unsigned long *value) >> { >> - return -EOPNOTSUPP; >> + struct sbiret ret; >> + >> + if (!sbi_fwft_supported) >> + return -EOPNOTSUPP; >> + >> + ret = sbi_ecall(SBI_EXT_FWFT, SBI_EXT_FWFT_GET, >> + feature, 0, 0, 0, 0, 0); > > We're missing the > > if (!ret) > *value = ret.value; > > part. Damn, and even worse it isn't used at all. I'll probably remove it entirely and keep the strict minimum for this series. 
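For anyone following along, the write-back being asked for could look roughly like the sketch below. It is a userspace approximation, not the kernel code: struct sbiret is a simplified stand-in, fwft_get_stub() is a hypothetical replacement for the sbi_ecall(SBI_EXT_FWFT, SBI_EXT_FWFT_GET, ...) call, and the -EIO on failure is only a placeholder for sbi_err_map_linux_errno(ret.error). Since ret is a struct, the check has to be on ret.error.

```c
#include <errno.h>
#include <stdbool.h>

/* Simplified stand-in for the kernel's struct sbiret. */
struct sbiret {
	long error;
	long value;
};

/* Assumed for this sketch; in the patch this flag is set by sbi_init(). */
static bool sbi_fwft_supported = true;

/*
 * Hypothetical stub standing in for the FWFT_GET ecall:
 * feature 0 reads back 42, every other feature fails.
 */
static struct sbiret fwft_get_stub(unsigned int feature)
{
	struct sbiret ret = { .error = -1, .value = 0 };

	if (feature == 0) {
		ret.error = 0;
		ret.value = 42;
	}
	return ret;
}

static int sbi_fwft_get(unsigned int feature, unsigned long *value)
{
	struct sbiret ret;

	if (!sbi_fwft_supported)
		return -EOPNOTSUPP;

	ret = fwft_get_stub(feature);

	/* Hand the result back only when the call succeeded. */
	if (!ret.error)
		*value = ret.value;

	/* Placeholder for sbi_err_map_linux_errno(ret.error). */
	return ret.error ? -EIO : 0;
}
```

With the stub above, sbi_fwft_get(0, &v) returns 0 and stores 42 in v, while a failing feature leaves v untouched.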
Thanks, Clément > >> + >> + return sbi_err_map_linux_errno(ret.error); >> } >> >> /** >> @@ -321,7 +331,15 @@ int sbi_fwft_get(u32 feature, unsigned long *value) >> */ >> int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags) >> { >> - return -EOPNOTSUPP; >> + struct sbiret ret; >> + >> + if (!sbi_fwft_supported) >> + return -EOPNOTSUPP; >> + >> + ret = sbi_ecall(SBI_EXT_FWFT, SBI_EXT_FWFT_SET, >> + feature, value, flags, 0, 0, 0); >> + >> + return sbi_err_map_linux_errno(ret.error); >> } >> >> struct fwft_set_req { >> @@ -360,6 +378,9 @@ int sbi_fwft_local_set_cpumask(const cpumask_t *mask, u32 feature, >> .error = ATOMIC_INIT(0), >> }; >> >> + if (!sbi_fwft_supported) >> + return -EOPNOTSUPP; >> + >> if (feature & SBI_FWFT_GLOBAL_FEATURE_BIT) >> return -EINVAL; >> >> @@ -691,6 +712,11 @@ void __init sbi_init(void) >> pr_info("SBI DBCN extension detected\n"); >> sbi_debug_console_available = true; >> } >> + if ((sbi_spec_version >= sbi_mk_version(3, 0)) && >> + (sbi_probe_extension(SBI_EXT_FWFT) > 0)) { >> + pr_info("SBI FWFT extension detected\n"); >> + sbi_fwft_supported = true; >> + } >> } else { >> __sbi_set_timer = __sbi_set_timer_v01; >> __sbi_send_ipi = __sbi_send_ipi_v01; >> -- >> 2.47.2 >> > > Thanks, > drew From ajones at ventanamicro.com Mon Mar 24 01:38:29 2025 From: ajones at ventanamicro.com (Andrew Jones) Date: Mon, 24 Mar 2025 09:38:29 +0100 Subject: [PATCH v4 02/18] riscv: sbi: add new SBI error mappings In-Reply-To: <779c137d-5030-4212-b957-3d2620448ea9@rivosinc.com> References: <20250317170625.1142870-1-cleger@rivosinc.com> <20250317170625.1142870-3-cleger@rivosinc.com> <20250322-cce038c88db88dd119a49846@orel> <779c137d-5030-4212-b957-3d2620448ea9@rivosinc.com> Message-ID: <20250324-5d1d09fc9e50d2276ba56b6f@orel> On Mon, Mar 24, 2025 at 09:29:33AM +0100, Clément Léger wrote: > > > On 22/03/2025 13:06, Andrew Jones wrote: > > On Mon, Mar 17, 2025 at 06:06:08PM +0100, Clément Léger wrote: > >> A few new errors have been added with 
SBI V3.0, maps them as close as > >> possible to errno values. > >> > >> Signed-off-by: Clément Léger > >> --- > >> arch/riscv/include/asm/sbi.h | 9 +++++++++ > >> 1 file changed, 9 insertions(+) > >> > >> diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h > >> index bb077d0c912f..d11d22717b49 100644 > >> --- a/arch/riscv/include/asm/sbi.h > >> +++ b/arch/riscv/include/asm/sbi.h > >> @@ -536,11 +536,20 @@ static inline int sbi_err_map_linux_errno(int err) > >> case SBI_SUCCESS: > >> return 0; > >> case SBI_ERR_DENIED: > >> + case SBI_ERR_DENIED_LOCKED: > >> return -EPERM; > >> case SBI_ERR_INVALID_PARAM: > >> + case SBI_ERR_INVALID_STATE: > >> + case SBI_ERR_BAD_RANGE: > >> return -EINVAL; > >> case SBI_ERR_INVALID_ADDRESS: > >> return -EFAULT; > >> + case SBI_ERR_NO_SHMEM: > >> + return -ENOMEM; > >> + case SBI_ERR_TIMEOUT: > >> + return -ETIME; > >> + case SBI_ERR_IO: > >> + return -EIO; > >> case SBI_ERR_NOT_SUPPORTED: > >> case SBI_ERR_FAILURE: > >> default: > >> -- > >> 2.47.2 > >> > > > > I'm not a huge fan sbi_err_map_linux_errno() since the mappings seem a bit > > arbitrary, but if we're going to do it, then these look pretty good to me. > > Only other thought I had was E2BIG for bad-range, but nah... Actually, I just recalled that there is an ERANGE, which would probably be a better match for bad-range than EINVAL, but I'm not sure it matters much anyway since this function doesn't promise 1-to-1 mappings. Thanks, drew > > Yeah I also think some mappings are a bit odd even though I skimmed > through the whole errno list to find the best possible mappings. I'd be > happy to find something better though. 
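To make the ERANGE suggestion concrete, here is a rough userspace sketch of the mapping with bad-range split out of the EINVAL group. Several things in it are assumptions and not taken from the patch: the numeric SBI error values are what I believe the SBI v3.0 spec assigns, and the fallback return is shown as -EOPNOTSUPP only because the quoted hunk does not show what the default case returns.

```c
#include <errno.h>

/* SBI error codes; numeric values assumed from the SBI v3.0 spec. */
#define SBI_SUCCESS		0
#define SBI_ERR_FAILURE		(-1)
#define SBI_ERR_NOT_SUPPORTED	(-2)
#define SBI_ERR_INVALID_PARAM	(-3)
#define SBI_ERR_DENIED		(-4)
#define SBI_ERR_INVALID_ADDRESS	(-5)
#define SBI_ERR_NO_SHMEM	(-9)
#define SBI_ERR_INVALID_STATE	(-10)
#define SBI_ERR_BAD_RANGE	(-11)
#define SBI_ERR_TIMEOUT		(-12)
#define SBI_ERR_IO		(-13)
#define SBI_ERR_DENIED_LOCKED	(-14)

static int sbi_err_map_linux_errno(int err)
{
	switch (err) {
	case SBI_SUCCESS:
		return 0;
	case SBI_ERR_DENIED:
	case SBI_ERR_DENIED_LOCKED:
		return -EPERM;
	case SBI_ERR_INVALID_PARAM:
	case SBI_ERR_INVALID_STATE:
		return -EINVAL;
	case SBI_ERR_BAD_RANGE:
		/* Split out from the EINVAL group, as suggested above. */
		return -ERANGE;
	case SBI_ERR_INVALID_ADDRESS:
		return -EFAULT;
	case SBI_ERR_NO_SHMEM:
		return -ENOMEM;
	case SBI_ERR_TIMEOUT:
		return -ETIME;
	case SBI_ERR_IO:
		return -EIO;
	case SBI_ERR_NOT_SUPPORTED:
	case SBI_ERR_FAILURE:
	default:
		/* Assumed fallback; not shown in the quoted hunk. */
		return -EOPNOTSUPP;
	}
}
```

The mapping stays many-to-one either way, so callers still cannot recover the original SBI error from the errno.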
> > Thanks, > > Clément > > > > > Reviewed-by: Andrew Jones > > > > Thanks, > > drew > From cleger at rivosinc.com Mon Mar 24 01:40:52 2025 From: cleger at rivosinc.com (Clément Léger) Date: Mon, 24 Mar 2025 09:40:52 +0100 Subject: [PATCH v4 02/18] riscv: sbi: add new SBI error mappings In-Reply-To: <20250324-5d1d09fc9e50d2276ba56b6f@orel> References: <20250317170625.1142870-1-cleger@rivosinc.com> <20250317170625.1142870-3-cleger@rivosinc.com> <20250322-cce038c88db88dd119a49846@orel> <779c137d-5030-4212-b957-3d2620448ea9@rivosinc.com> <20250324-5d1d09fc9e50d2276ba56b6f@orel> Message-ID: <0597da6f-cc28-497f-a49e-3f1c99a4e6e1@rivosinc.com> On 24/03/2025 09:38, Andrew Jones wrote: > On Mon, Mar 24, 2025 at 09:29:33AM +0100, Clément Léger wrote: >> >> >> On 22/03/2025 13:06, Andrew Jones wrote: >>> On Mon, Mar 17, 2025 at 06:06:08PM +0100, Clément Léger wrote: >>>> A few new errors have been added with SBI V3.0, maps them as close as >>>> possible to errno values. >>>> >>>> Signed-off-by: Clément Léger >>>> --- >>>> arch/riscv/include/asm/sbi.h | 9 +++++++++ >>>> 1 file changed, 9 insertions(+) >>>> >>>> diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h >>>> index bb077d0c912f..d11d22717b49 100644 >>>> --- a/arch/riscv/include/asm/sbi.h >>>> +++ b/arch/riscv/include/asm/sbi.h >>>> @@ -536,11 +536,20 @@ static inline int sbi_err_map_linux_errno(int err) >>>> case SBI_SUCCESS: >>>> return 0; >>>> case SBI_ERR_DENIED: >>>> + case SBI_ERR_DENIED_LOCKED: >>>> return -EPERM; >>>> case SBI_ERR_INVALID_PARAM: >>>> + case SBI_ERR_INVALID_STATE: >>>> + case SBI_ERR_BAD_RANGE: >>>> return -EINVAL; >>>> case SBI_ERR_INVALID_ADDRESS: >>>> return -EFAULT; >>>> + case SBI_ERR_NO_SHMEM: >>>> + return -ENOMEM; >>>> + case SBI_ERR_TIMEOUT: >>>> + return -ETIME; >>>> + case SBI_ERR_IO: >>>> + return -EIO; >>>> case SBI_ERR_NOT_SUPPORTED: >>>> case SBI_ERR_FAILURE: >>>> default: >>>> -- >>>> 2.47.2 >>>> >>> >>> I'm not a huge fan 
sbi_err_map_linux_errno() since the mappings seem a bit >>> arbitrary, but if we're going to do it, then these look pretty good to me. >>> Only other thought I had was E2BIG for bad-range, but nah... > > Actually, I just recalled that there is an ERANGE, which would probably be > a better match for bad-range than EINVAL, but I'm not sure it matters much > anyway since this function doesn't promise 1-to-1 mappings. Yes, but ERANGE description is actually "results are too large", but at least its name is more descriptive. Let's go with it. > > Thanks, > drew > >> >> Yeah I also think some mappings are a bit odd even though I skimmed >> through the whole errno list to find the best possible mappings. I'd be >> happy to find something better though. >> >> Thanks, >> >> Clément >> >>> >>> Reviewed-by: Andrew Jones >>> >>> Thanks, >>> drew >> From alexandru.elisei at arm.com Mon Mar 24 03:41:03 2025 From: alexandru.elisei at arm.com (Alexandru Elisei) Date: Mon, 24 Mar 2025 10:41:03 +0000 Subject: [kvm-unit-tests PATCH v2 4/5] configure: Add --qemu-cpu option In-Reply-To: <20250324-5d22d8ad79a9db37b1cf6961@orel> References: <20250314154904.3946484-2-jean-philippe@linaro.org> <20250314154904.3946484-6-jean-philippe@linaro.org> <20250322-91a8125ad8651b24246e5799@orel> <20250324-5d22d8ad79a9db37b1cf6961@orel> Message-ID: Hi Drew, On Mon, Mar 24, 2025 at 09:19:27AM +0100, Andrew Jones wrote: > On Sun, Mar 23, 2025 at 11:16:19AM +0000, Alexandru Elisei wrote: > ... 
> > > > +if [ -z "$qemu_cpu" ]; then > > > > + if ( [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ] ) && > > > > + ( [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ] ); then > > > > + qemu_cpu="host" > > > > if [ "$ARCH" = "arm" ] && [ "$HOST" = "aarch64" ]; then > > > > - processor+=",aarch64=off" > > > > + qemu_cpu+=",aarch64=off" > > > > fi > > > > + elif [ "$ARCH" = "arm64" ]; then > > > > + qemu_cpu="cortex-a57" > > > > + else > > > > + qemu_cpu="cortex-a15" > > > > > > configure could set this in config.mak as DEFAULT_PROCESSOR, avoiding the > > > need to duplicate it here. > > > > That was my first instinct too, having the default value in config.mak seemed > > like the correct solution. > > > > But the problem with this is that the default -cpu type depends on -accel (set > > via unittests.cfg or as an environment variable), host and test architecture > > combination. All of these variables are known only at runtime. > > > > Let's say we have DEFAULT_QEMU_CPU=cortex-a57 in config.mak. If we keep the > > above heuristic, arm/run will override it with host,aarch64=off. IMO, having it > > in config.mak, but arm/run using it only under certain conditions is worse than > > not having it at all. arm/run choosing the default value **all the time** is at > > least consistent. > > I think having 'DEFAULT' in the name implies that it will only be used if > there's nothing better, and we don't require everything in config.mak to > be used (there's even some s390x-specific stuff in there for all > architectures...) I'm still leaning towards having the default value and the heuristics for when to pick it in one place ($ARCH/run) as being more convenient, but I can certainly see your point of view. So yeah, up to you :) > > > > > We could modify the help text for --qemu-cpu to say something like "If left > > unset, the $ARCH/run script will choose a best value based on the host system > > and test configuration." > > This is helpful, so we should add it regardless. 
Thanks, Alex From andrew.jones at linux.dev Mon Mar 24 06:13:13 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Mon, 24 Mar 2025 14:13:13 +0100 Subject: [kvm-unit-tests PATCH v2 4/5] configure: Add --qemu-cpu option In-Reply-To: References: <20250314154904.3946484-2-jean-philippe@linaro.org> <20250314154904.3946484-6-jean-philippe@linaro.org> <20250322-91a8125ad8651b24246e5799@orel> <20250324-5d22d8ad79a9db37b1cf6961@orel> Message-ID: <20250324-37628351a72ca339819b528d@orel> On Mon, Mar 24, 2025 at 10:41:03AM +0000, Alexandru Elisei wrote: > Hi Drew, > > On Mon, Mar 24, 2025 at 09:19:27AM +0100, Andrew Jones wrote: > > On Sun, Mar 23, 2025 at 11:16:19AM +0000, Alexandru Elisei wrote: > > ... > > > > > +if [ -z "$qemu_cpu" ]; then > > > > > + if ( [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ] ) && > > > > > + ( [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ] ); then > > > > > + qemu_cpu="host" > > > > > if [ "$ARCH" = "arm" ] && [ "$HOST" = "aarch64" ]; then > > > > > - processor+=",aarch64=off" > > > > > + qemu_cpu+=",aarch64=off" > > > > > fi > > > > > + elif [ "$ARCH" = "arm64" ]; then > > > > > + qemu_cpu="cortex-a57" > > > > > + else > > > > > + qemu_cpu="cortex-a15" > > > > > > > > configure could set this in config.mak as DEFAULT_PROCESSOR, avoiding the > > > > need to duplicate it here. > > > > > > That was my first instinct too, having the default value in config.mak seemed > > > like the correct solution. > > > > > > But the problem with this is that the default -cpu type depends on -accel (set > > > via unittests.cfg or as an environment variable), host and test architecture > > > combination. All of these variables are known only at runtime. > > > > > > Let's say we have DEFAULT_QEMU_CPU=cortex-a57 in config.mak. If we keep the > > > above heuristic, arm/run will override it with host,aarch64=off. IMO, having it > > > in config.mak, but arm/run using it only under certain conditions is worse than > > > not having it at all. 
arm/run choosing the default value **all the time** is at > > > least consistent. > > > > I think having 'DEFAULT' in the name implies that it will only be used if > > there's nothing better, and we don't require everything in config.mak to > > be used (there's even some s390x-specific stuff in there for all > > architectures...) > > I'm still leaning towards having the default value and the heuristics for > when to pick it in one place ($ARCH/run) as being more convenient, but I > can certainly see your point of view. I wouldn't mind it only being in $ARCH/run, but this series adds the same logic to $ARCH/run and to ./configure. I think an additional, potentially unused, variable in config.mak is better than code duplication. Thanks, drew From jean-philippe at linaro.org Mon Mar 24 08:48:19 2025 From: jean-philippe at linaro.org (Jean-Philippe Brucker) Date: Mon, 24 Mar 2025 15:48:19 +0000 Subject: [kvm-unit-tests PATCH v2 1/5] configure: arm64: Don't display 'aarch64' as the default architecture In-Reply-To: References: <20250314154904.3946484-2-jean-philippe@linaro.org> <20250314154904.3946484-3-jean-philippe@linaro.org> <1b0233dc-b303-4317-a65d-572cc3582b8a@redhat.com> Message-ID: <20250324154819.GA1844993@myrica> On Mon, Mar 17, 2025 at 09:27:32AM +0100, Eric Auger wrote: > > > On 3/17/25 9:19 AM, Eric Auger wrote: > > Hi Jean-Philippe, > > > > > > On 3/14/25 4:49 PM, Jean-Philippe Brucker wrote: > >> From: Alexandru Elisei > >> > >> --arch=aarch64, intentional or not, has been supported since the initial > >> arm64 support, commit 39ac3f8494be ("arm64: initial drop"). However, > >> "aarch64" does not show up in the list of supported architectures, but > >> it's displayed as the default architecture if doing ./configure --help > >> on an arm64 machine. 
> >> > >> Keep everything consistent and make sure that the default value for > >> $arch is "arm64", but still allow --arch=aarch64, in case they are users > > there > >> that use this configuration for kvm-unit-tests. > >> > >> The help text for --arch changes from: > >> > >> --arch=ARCH architecture to compile for (aarch64). ARCH can be one of: > >> arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 > >> > >> to: > >> > >> --arch=ARCH architecture to compile for (arm64). ARCH can be one of: > >> arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 > >> > >> Signed-off-by: Alexandru Elisei > >> --- > >> configure | 5 +++-- > >> 1 file changed, 3 insertions(+), 2 deletions(-) > >> > >> diff --git a/configure b/configure > >> index 06532a89..dc3413fc 100755 > >> --- a/configure > >> +++ b/configure > >> @@ -15,8 +15,9 @@ objdump=objdump > >> readelf=readelf > >> ar=ar > >> addr2line=addr2line > >> -arch=$(uname -m | sed -e 's/i.86/i386/;s/arm64/aarch64/;s/arm.*/arm/;s/ppc64.*/ppc64/') > >> -host=$arch > >> +host=$(uname -m | sed -e 's/i.86/i386/;s/arm64/aarch64/;s/arm.*/arm/;s/ppc64.*/ppc64/') > >> +arch=$host > >> +[ "$arch" = "aarch64" ] && arch="arm64" > > Looks the same it done again below > > Ignore this. This is done again after explicit arch setting :-/ Need > another coffee > > So looks good to me > Reviewed-by: Eric Auger Thanks for the review! 
Jean From jean-philippe at linaro.org Mon Mar 24 08:50:45 2025 From: jean-philippe at linaro.org (Jean-Philippe Brucker) Date: Mon, 24 Mar 2025 15:50:45 +0000 Subject: [kvm-unit-tests PATCH v2 5/5] arm64: Use -cpu max as the default for TCG In-Reply-To: <20250322-c669034d2100a75ab6e53882@orel> References: <20250314154904.3946484-2-jean-philippe@linaro.org> <20250314154904.3946484-7-jean-philippe@linaro.org> <20250322-c669034d2100a75ab6e53882@orel> Message-ID: <20250324155045.GB1844993@myrica> On Sat, Mar 22, 2025 at 12:27:56PM +0100, Andrew Jones wrote: > On Fri, Mar 14, 2025 at 03:49:05PM +0000, Jean-Philippe Brucker wrote: > > In order to test all the latest features, default to "max" as the QEMU > > CPU type on arm64. > > > > Signed-off-by: Jean-Philippe Brucker > > --- > > arm/run | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/arm/run b/arm/run > > index 561bafab..84232e28 100755 > > --- a/arm/run > > +++ b/arm/run > > @@ -45,7 +45,7 @@ if [ -z "$qemu_cpu" ]; then > > qemu_cpu+=",aarch64=off" > > fi > > elif [ "$ARCH" = "arm64" ]; then > > - qemu_cpu="cortex-a57" > > + qemu_cpu="max" > > else > > qemu_cpu="cortex-a15" > > arm should also be able to default to 'max', right? Yes I'll change this. I didn't earlier because it failed when I tried it, but it looks like I had QEMU=.../qemu-system-aarch64 in my environment variables, overriding the default qemu-system-arm (32-bit only). "qemu-system-aarch64 -cpu max" doesn't boot 32-bit code, but "qemu-system-aarch64 -cpu cortex-a15" does. Anyway, without explicitly setting the wrong QEMU, "-cpu max" works for 32-bit. 
Thanks, Jean From jean-philippe at linaro.org Mon Mar 24 08:52:05 2025 From: jean-philippe at linaro.org (Jean-Philippe Brucker) Date: Mon, 24 Mar 2025 15:52:05 +0000 Subject: [kvm-unit-tests PATCH v2 4/5] configure: Add --qemu-cpu option In-Reply-To: <20250324-37628351a72ca339819b528d@orel> References: <20250314154904.3946484-2-jean-philippe@linaro.org> <20250314154904.3946484-6-jean-philippe@linaro.org> <20250322-91a8125ad8651b24246e5799@orel> <20250324-5d22d8ad79a9db37b1cf6961@orel> <20250324-37628351a72ca339819b528d@orel> Message-ID: <20250324155205.GC1844993@myrica> On Mon, Mar 24, 2025 at 02:13:13PM +0100, Andrew Jones wrote: > On Mon, Mar 24, 2025 at 10:41:03AM +0000, Alexandru Elisei wrote: > > Hi Drew, > > > > On Mon, Mar 24, 2025 at 09:19:27AM +0100, Andrew Jones wrote: > > > On Sun, Mar 23, 2025 at 11:16:19AM +0000, Alexandru Elisei wrote: > > > ... > > > > > > +if [ -z "$qemu_cpu" ]; then > > > > > > + if ( [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ] ) && > > > > > > + ( [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ] ); then > > > > > > + qemu_cpu="host" > > > > > > if [ "$ARCH" = "arm" ] && [ "$HOST" = "aarch64" ]; then > > > > > > - processor+=",aarch64=off" > > > > > > + qemu_cpu+=",aarch64=off" > > > > > > fi > > > > > > + elif [ "$ARCH" = "arm64" ]; then > > > > > > + qemu_cpu="cortex-a57" > > > > > > + else > > > > > > + qemu_cpu="cortex-a15" > > > > > > > > > > configure could set this in config.mak as DEFAULT_PROCESSOR, avoiding the > > > > > need to duplicate it here. > > > > > > > > That was my first instinct too, having the default value in config.mak seemed > > > > like the correct solution. > > > > > > > > But the problem with this is that the default -cpu type depends on -accel (set > > > > via unittests.cfg or as an environment variable), host and test architecture > > > > combination. All of these variables are known only at runtime. > > > > > > > > Let's say we have DEFAULT_QEMU_CPU=cortex-a57 in config.mak. 
If we keep the > > > > above heuristic, arm/run will override it with host,aarch64=off. IMO, having it > > > > in config.mak, but arm/run using it only under certain conditions is worse than > > > > not having it at all. arm/run choosing the default value **all the time** is at > > > > least consistent. > > > > > > I think having 'DEFAULT' in the name implies that it will only be used if > > > there's nothing better, and we don't require everything in config.mak to > > > be used (there's even some s390x-specific stuff in there for all > > > architectures...) > > > > I'm still leaning towards having the default value and the heuristics for > > when to pick it in one place ($ARCH/run) as being more convenient, but I > > can certainly see your point of view. > > I wouldn't mind it only being in $ARCH/run, but this series adds the same > logic to $ARCH/run and to ./configure. I think an additional, potentially > unused, variable in config.mak is better than code duplication. I agree with this. However the next version ends up replacing both cortex-* types here with "max", so we won't need to pass these values in the end. The "processor" selection in configure will only be used by the Makefile for the compiler flag. 
Thanks, Jean From andrew.jones at linux.dev Mon Mar 24 09:33:21 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Mon, 24 Mar 2025 17:33:21 +0100 Subject: [kvm-unit-tests PATCH v2 5/5] arm64: Use -cpu max as the default for TCG In-Reply-To: <20250324155045.GB1844993@myrica> References: <20250314154904.3946484-2-jean-philippe@linaro.org> <20250314154904.3946484-7-jean-philippe@linaro.org> <20250322-c669034d2100a75ab6e53882@orel> <20250324155045.GB1844993@myrica> Message-ID: <20250324-9428d144e09d0876ebfa6f3c@orel> On Mon, Mar 24, 2025 at 03:50:45PM +0000, Jean-Philippe Brucker wrote: > On Sat, Mar 22, 2025 at 12:27:56PM +0100, Andrew Jones wrote: > > On Fri, Mar 14, 2025 at 03:49:05PM +0000, Jean-Philippe Brucker wrote: > > > In order to test all the latest features, default to "max" as the QEMU > > > CPU type on arm64. > > > > > > Signed-off-by: Jean-Philippe Brucker > > > --- > > > arm/run | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/arm/run b/arm/run > > > index 561bafab..84232e28 100755 > > > --- a/arm/run > > > +++ b/arm/run > > > @@ -45,7 +45,7 @@ if [ -z "$qemu_cpu" ]; then > > > qemu_cpu+=",aarch64=off" > > > fi > > > elif [ "$ARCH" = "arm64" ]; then > > > - qemu_cpu="cortex-a57" > > > + qemu_cpu="max" > > > else > > > qemu_cpu="cortex-a15" > > > > arm should also be able to default to 'max', right? > > Yes I'll change this. > > I didn't earlier because it failed when I tried it, but it looks like I > had QEMU=.../qemu-system-aarch64 in my environment variables, overriding > the default qemu-system-arm (32-bit only). "qemu-system-aarch64 -cpu max" > doesn't boot 32-bit code, but "qemu-system-aarch64 -cpu cortex-a15" does. > Anyway, without explicitly setting the wrong QEMU, "-cpu max" works for > 32-bit. Hmm, so now I'm not sure we want to change arm to max. Maybe? People have certainly gotten used to only needing qemu-system-aarch64 to run both arm64 and arm, so this change would break that. 
It seems like qemu-system-aarch64 should also have a 'max32' cpu model, but it doesn't. So, let's hold off on changing arm to max for now. Users who want it will have to both change their QEMU binary and specify -qemu-cpu. Thanks, drew From atishp at rivosinc.com Mon Mar 24 17:40:28 2025 From: atishp at rivosinc.com (Atish Patra) Date: Mon, 24 Mar 2025 17:40:28 -0700 Subject: [PATCH 0/3] RISC-V KVM selftests improvements Message-ID: <20250324-kvm_selftest_improve-v1-0-583620219d4f@rivosinc.com> This series improves the following tests. 1. Get-reg-list : Adds vector support 2. SBI PMU test : Distinguish between different types of illegal exception The first patch is just helper patch that adds stval support during exception handling. Signed-off-by: Atish Patra --- Atish Patra (3): KVM: riscv: selftests: Add stval to exception handling KVM: riscv: selftests: Decode stval to identify exact exception type KVM: riscv: selftests: Add vector extension tests .../selftests/kvm/include/riscv/processor.h | 1 + tools/testing/selftests/kvm/lib/riscv/handlers.S | 2 + tools/testing/selftests/kvm/riscv/get-reg-list.c | 111 ++++++++++++++++++++- tools/testing/selftests/kvm/riscv/sbi_pmu_test.c | 32 ++++++ 4 files changed, 145 insertions(+), 1 deletion(-) --- base-commit: b3f263a98d30fe2e33eefea297598c590ee3560e change-id: 20250324-kvm_selftest_improve-9bedb9f0a6d3 -- Regards, Atish patra From atishp at rivosinc.com Mon Mar 24 17:40:29 2025 From: atishp at rivosinc.com (Atish Patra) Date: Mon, 24 Mar 2025 17:40:29 -0700 Subject: [PATCH 1/3] KVM: riscv: selftests: Add stval to exception handling In-Reply-To: <20250324-kvm_selftest_improve-v1-0-583620219d4f@rivosinc.com> References: <20250324-kvm_selftest_improve-v1-0-583620219d4f@rivosinc.com> Message-ID: <20250324-kvm_selftest_improve-v1-1-583620219d4f@rivosinc.com> Save stval during exception handling so that it can be decoded to figure out the details of exception type. 
Signed-off-by: Atish Patra --- tools/testing/selftests/kvm/include/riscv/processor.h | 1 + tools/testing/selftests/kvm/lib/riscv/handlers.S | 2 ++ 2 files changed, 3 insertions(+) diff --git a/tools/testing/selftests/kvm/include/riscv/processor.h b/tools/testing/selftests/kvm/include/riscv/processor.h index 5f389166338c..f4a7d64fbe9a 100644 --- a/tools/testing/selftests/kvm/include/riscv/processor.h +++ b/tools/testing/selftests/kvm/include/riscv/processor.h @@ -95,6 +95,7 @@ struct ex_regs { unsigned long epc; unsigned long status; unsigned long cause; + unsigned long stval; }; #define NR_VECTORS 2 diff --git a/tools/testing/selftests/kvm/lib/riscv/handlers.S b/tools/testing/selftests/kvm/lib/riscv/handlers.S index aa0abd3f35bb..2884c1e8939b 100644 --- a/tools/testing/selftests/kvm/lib/riscv/handlers.S +++ b/tools/testing/selftests/kvm/lib/riscv/handlers.S @@ -45,9 +45,11 @@ csrr s0, CSR_SEPC csrr s1, CSR_SSTATUS csrr s2, CSR_SCAUSE + csrr s3, CSR_STVAL sd s0, 248(sp) sd s1, 256(sp) sd s2, 264(sp) + sd s3, 272(sp) .endm .macro restore_context -- 2.43.0 From atishp at rivosinc.com Mon Mar 24 17:40:30 2025 From: atishp at rivosinc.com (Atish Patra) Date: Mon, 24 Mar 2025 17:40:30 -0700 Subject: [PATCH 2/3] KVM: riscv: selftests: Decode stval to identify exact exception type In-Reply-To: <20250324-kvm_selftest_improve-v1-0-583620219d4f@rivosinc.com> References: <20250324-kvm_selftest_improve-v1-0-583620219d4f@rivosinc.com> Message-ID: <20250324-kvm_selftest_improve-v1-2-583620219d4f@rivosinc.com> Currently, the sbi_pmu_test continues if the exception type is illegal instruction because access to hpmcounter will generate that. However, we may get illegal for other reasons as well which should result in test assertion. Use the stval to decode the exact type of instructions and which csrs are being accessed if it is csr access instructions. Assert in all cases except if it is a csr access instructions that access valid PMU related registers. 
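As a standalone illustration of the decode described here, the sketch below pulls the opcode, funct3 and CSR number out of a 32-bit instruction word and accepts only CSR accesses to the cycle..hpmcounter31 range. The helper name is_hpmcounter_csr_access() is hypothetical, and the CSR_CYCLE / CSR_HPMCOUNTER31 values (0xc00 / 0xc1f) are filled in from the privileged spec rather than from this patch. The checks are also a stricter variant than the ones in the diff: as posted, "funct3 != 0 || funct3 != 4" holds for every funct3, and "csr_num == CSR_CYCLE || csr_num <= CSR_HPMCOUNTER31" accepts any CSR number up to 0xc1f.

```c
#include <stdbool.h>

#define INSN_OPCODE_MASK	0x007c
#define INSN_OPCODE_SHIFT	2
#define INSN_OPCODE_SYSTEM	28

#define INSN_MASK_FUNCT3	0x7000
#define INSN_SHIFT_FUNCT3	12

#define INSN_CSR_MASK		0xfff00000
#define INSN_CSR_SHIFT		20

/* Assumed values from the RISC-V privileged spec, not from the patch. */
#define CSR_CYCLE		0xc00
#define CSR_HPMCOUNTER31	0xc1f

/* Hypothetical helper: true only for CSR accesses to cycle..hpmcounter31. */
static bool is_hpmcounter_csr_access(unsigned long insn)
{
	unsigned int opcode = (insn & INSN_OPCODE_MASK) >> INSN_OPCODE_SHIFT;
	unsigned int funct3 = (insn & INSN_MASK_FUNCT3) >> INSN_SHIFT_FUNCT3;
	unsigned int csr_num = (insn & INSN_CSR_MASK) >> INSN_CSR_SHIFT;

	if (opcode != INSN_OPCODE_SYSTEM)
		return false;

	/* funct3 values 0 and 4 are not CSR read/write encodings. */
	if (funct3 == 0 || funct3 == 4)
		return false;

	/* Both bounds are needed to stay inside the counter CSR range. */
	return csr_num >= CSR_CYCLE && csr_num <= CSR_HPMCOUNTER31;
}
```

For example, 0xc0002573 (csrr a0, cycle) passes, while 0x00000073 (ecall) and 0x10002573 (csrr a0, sstatus) are rejected.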
Signed-off-by: Atish Patra --- tools/testing/selftests/kvm/riscv/sbi_pmu_test.c | 32 ++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c b/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c index 03406de4989d..11bde69b5238 100644 --- a/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c +++ b/tools/testing/selftests/kvm/riscv/sbi_pmu_test.c @@ -128,11 +128,43 @@ static void stop_counter(unsigned long counter, unsigned long stop_flags) "Unable to stop counter %ld error %ld\n", counter, ret.error); } +#define INSN_OPCODE_MASK 0x007c +#define INSN_OPCODE_SHIFT 2 +#define INSN_OPCODE_SYSTEM 28 + +#define INSN_MASK_FUNCT3 0x7000 +#define INSN_SHIFT_FUNCT3 12 + +#define INSN_CSR_MASK 0xfff00000 +#define INSN_CSR_SHIFT 20 + +#define GET_RM(insn) (((insn) & INSN_MASK_FUNCT3) >> INSN_SHIFT_FUNCT3) +#define GET_CSR_NUM(insn) (((insn) & INSN_CSR_MASK) >> INSN_CSR_SHIFT) + static void guest_illegal_exception_handler(struct ex_regs *regs) { + unsigned long insn; + int opcode, csr_num, funct3; + __GUEST_ASSERT(regs->cause == EXC_INST_ILLEGAL, "Unexpected exception handler %lx\n", regs->cause); + insn = regs->stval; + opcode = (insn & INSN_OPCODE_MASK) >> INSN_OPCODE_SHIFT; + __GUEST_ASSERT(opcode == INSN_OPCODE_SYSTEM, + "Unexpected instruction with opcode 0x%x insn 0x%lx\n", opcode, insn); + + csr_num = GET_CSR_NUM(insn); + funct3 = GET_RM(insn); + /* Validate if it is a CSR read/write operation */ + __GUEST_ASSERT(funct3 <= 7 && (funct3 != 0 || funct3 != 4), + "Unexpected system opcode with funct3 0x%x csr_num 0x%x\n", + funct3, csr_num); + + /* Validate if it is a HPMCOUNTER CSR operation */ + __GUEST_ASSERT(csr_num == CSR_CYCLE || csr_num <= CSR_HPMCOUNTER31, + "Unexpected csr_num 0x%x\n", csr_num); + illegal_handler_invoked = true; /* skip the trapping instruction */ regs->epc += 4; -- 2.43.0 From atishp at rivosinc.com Mon Mar 24 17:40:31 2025 From: atishp at rivosinc.com (Atish Patra) Date: Mon, 24 Mar 2025 
17:40:31 -0700 Subject: [PATCH 3/3] KVM: riscv: selftests: Add vector extension tests In-Reply-To: <20250324-kvm_selftest_improve-v1-0-583620219d4f@rivosinc.com> References: <20250324-kvm_selftest_improve-v1-0-583620219d4f@rivosinc.com> Message-ID: <20250324-kvm_selftest_improve-v1-3-583620219d4f@rivosinc.com> Add vector related tests with the ISA extension standard template. However, the vector registers are a bit tricky, as the register length varies with the vlenb value of the system. That's why the macros are defined with a default and overridden with the actual value at runtime. Signed-off-by: Atish Patra --- tools/testing/selftests/kvm/riscv/get-reg-list.c | 111 ++++++++++++++++++++++- 1 file changed, 110 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/kvm/riscv/get-reg-list.c b/tools/testing/selftests/kvm/riscv/get-reg-list.c index 8515921dfdbf..576ab8eb7368 100644 --- a/tools/testing/selftests/kvm/riscv/get-reg-list.c +++ b/tools/testing/selftests/kvm/riscv/get-reg-list.c @@ -145,7 +145,9 @@ void finalize_vcpu(struct kvm_vcpu *vcpu, struct vcpu_reg_list *c) { unsigned long isa_ext_state[KVM_RISCV_ISA_EXT_MAX] = { 0 }; struct vcpu_reg_sublist *s; - uint64_t feature; + uint64_t feature = 0; + u64 reg, size; + unsigned long vlenb_reg; int rc; for (int i = 0; i < KVM_RISCV_ISA_EXT_MAX; i++) @@ -173,6 +175,23 @@ void finalize_vcpu(struct kvm_vcpu *vcpu, struct vcpu_reg_list *c) switch (s->feature_type) { case VCPU_FEATURE_ISA_EXT: feature = RISCV_ISA_EXT_REG(s->feature); + if (s->feature == KVM_RISCV_ISA_EXT_V) { + /* Enable V extension so that we can get the vlenb register */ + __vcpu_set_reg(vcpu, feature, 1); + /* Compute the correct vector register size */ + rc = __vcpu_get_reg(vcpu, s->regs[4], &vlenb_reg); + if (rc < 0) + /* The vector test may fail if the default reg size doesn't match */ + break; + size = __builtin_ctzl(vlenb_reg); + size <<= KVM_REG_SIZE_SHIFT; + for (int i = 0; i < 32; i++) { + reg = KVM_REG_RISCV | KVM_REG_RISCV_VECTOR
| size | + KVM_REG_RISCV_VECTOR_REG(i); + s->regs[5 + i] = reg; + } + __vcpu_set_reg(vcpu, feature, 0); + } break; case VCPU_FEATURE_SBI_EXT: feature = RISCV_SBI_EXT_REG(s->feature); @@ -408,6 +427,35 @@ static const char *fp_d_id_to_str(const char *prefix, __u64 id) return strdup_printf("%lld /* UNKNOWN */", reg_off); } +static const char *vector_id_to_str(const char *prefix, __u64 id) +{ + /* reg_off is the offset into struct __riscv_v_ext_state */ + __u64 reg_off = id & ~(REG_MASK | KVM_REG_RISCV_VECTOR); + int reg_index = 0; + + assert((id & KVM_REG_RISCV_TYPE_MASK) == KVM_REG_RISCV_VECTOR); + + if (reg_off >= KVM_REG_RISCV_VECTOR_REG(0)) + reg_index = reg_off - KVM_REG_RISCV_VECTOR_REG(0); + switch (reg_off) { + case KVM_REG_RISCV_VECTOR_REG(0) ... + KVM_REG_RISCV_VECTOR_REG(31): + return strdup_printf("KVM_REG_RISCV_VECTOR_REG(%d)", reg_index); + case KVM_REG_RISCV_VECTOR_CSR_REG(vstart): + return "KVM_REG_RISCV_VECTOR_CSR_REG(vstart)"; + case KVM_REG_RISCV_VECTOR_CSR_REG(vl): + return "KVM_REG_RISCV_VECTOR_CSR_REG(vl)"; + case KVM_REG_RISCV_VECTOR_CSR_REG(vtype): + return "KVM_REG_RISCV_VECTOR_CSR_REG(vtype)"; + case KVM_REG_RISCV_VECTOR_CSR_REG(vcsr): + return "KVM_RISCV_VCPU_VECTOR_CSR_REG(vcsr)"; + case KVM_REG_RISCV_VECTOR_CSR_REG(vlenb): + return "KVM_REG_RISCV_VECTOR_CSR_REG(vlenb)"; + } + + return strdup_printf("%lld /* UNKNOWN */", reg_off); +} + #define KVM_ISA_EXT_ARR(ext) \ [KVM_RISCV_ISA_EXT_##ext] = "KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_" #ext @@ -635,6 +683,9 @@ void print_reg(const char *prefix, __u64 id) case KVM_REG_SIZE_U128: reg_size = "KVM_REG_SIZE_U128"; break; + case KVM_REG_SIZE_U256: + reg_size = "KVM_REG_SIZE_U256"; + break; default: printf("\tKVM_REG_RISCV | (%lld << KVM_REG_SIZE_SHIFT) | 0x%llx /* UNKNOWN */,\n", (id & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT, id & ~REG_MASK); @@ -666,6 +717,10 @@ void print_reg(const char *prefix, __u64 id) printf("\tKVM_REG_RISCV | %s | KVM_REG_RISCV_FP_D | %s,\n", reg_size, 
fp_d_id_to_str(prefix, id)); break; + case KVM_REG_RISCV_VECTOR: + printf("\tKVM_REG_RISCV | %s | KVM_REG_RISCV_VECTOR | %s,\n", + reg_size, vector_id_to_str(prefix, id)); + break; case KVM_REG_RISCV_ISA_EXT: printf("\tKVM_REG_RISCV | %s | KVM_REG_RISCV_ISA_EXT | %s,\n", reg_size, isa_ext_id_to_str(prefix, id)); @@ -870,6 +925,54 @@ static __u64 fp_d_regs[] = { KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_D, }; +/* Define a default vector registers with length. This will be overwritten at runtime */ +static __u64 vector_regs[] = { + KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_VECTOR | + KVM_REG_RISCV_VECTOR_CSR_REG(vstart), + KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_VECTOR | + KVM_REG_RISCV_VECTOR_CSR_REG(vl), + KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_VECTOR | + KVM_REG_RISCV_VECTOR_CSR_REG(vtype), + KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_VECTOR | + KVM_REG_RISCV_VECTOR_CSR_REG(vcsr), + KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_VECTOR | + KVM_REG_RISCV_VECTOR_CSR_REG(vlenb), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(0), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(1), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(2), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(3), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(4), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(5), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(6), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(7), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(8), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(9), 
+ KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(10), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(11), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(12), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(13), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(14), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(15), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(16), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(17), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(18), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(19), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(20), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(21), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(22), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(23), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(24), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(25), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(26), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(27), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(28), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(29), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | KVM_REG_RISCV_VECTOR_REG(30), + KVM_REG_RISCV | KVM_REG_SIZE_U128 | KVM_REG_RISCV_VECTOR | 
KVM_REG_RISCV_VECTOR_REG(31), + KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | + KVM_RISCV_ISA_EXT_V, +}; + #define SUBLIST_BASE \ {"base", .regs = base_regs, .regs_n = ARRAY_SIZE(base_regs), \ .skips_set = base_skips_set, .skips_set_n = ARRAY_SIZE(base_skips_set),} @@ -894,6 +997,10 @@ static __u64 fp_d_regs[] = { {"fp_d", .feature = KVM_RISCV_ISA_EXT_D, .regs = fp_d_regs, \ .regs_n = ARRAY_SIZE(fp_d_regs),} +#define SUBLIST_V \ + {"v", .feature = KVM_RISCV_ISA_EXT_V, .regs = vector_regs, \ + .regs_n = ARRAY_SIZE(vector_regs),} + #define KVM_ISA_EXT_SIMPLE_CONFIG(ext, extu) \ static __u64 regs_##ext[] = { \ KVM_REG_RISCV | KVM_REG_SIZE_ULONG | \ @@ -962,6 +1069,7 @@ KVM_SBI_EXT_SIMPLE_CONFIG(susp, SUSP); KVM_ISA_EXT_SUBLIST_CONFIG(aia, AIA); KVM_ISA_EXT_SUBLIST_CONFIG(fp_f, FP_F); KVM_ISA_EXT_SUBLIST_CONFIG(fp_d, FP_D); +KVM_ISA_EXT_SUBLIST_CONFIG(v, V); KVM_ISA_EXT_SIMPLE_CONFIG(h, H); KVM_ISA_EXT_SIMPLE_CONFIG(smnpm, SMNPM); KVM_ISA_EXT_SUBLIST_CONFIG(smstateen, SMSTATEEN); @@ -1034,6 +1142,7 @@ struct vcpu_reg_list *vcpu_configs[] = { &config_fp_f, &config_fp_d, &config_h, + &config_v, &config_smnpm, &config_smstateen, &config_sscofpmf, -- 2.43.0 From guoren at kernel.org Tue Mar 25 05:15:41 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:15:41 -0400 Subject: [RFC PATCH V3 00/43] rv64ilp32_abi: Build CONFIG_64BIT kernel-self with ILP32 ABI Message-ID: <20250325121624.523258-1-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" Since 2001, the CONFIG_64BIT kernel has been built with the LP64 ABI, but this patchset allows the CONFIG_64BIT kernel to use an ILP32 ABI for construction to reduce cache & memory footprint (Compared to kernel-lp64-abi, kernel-rv64ilp32-abi decreased the used memory by about 20%, as shown in "free -h" in the following demo.) Caution: this patchset doesn't introduce any new userspace ABI; it's only for the kernel-self. 
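The footprint argument can be illustrated with a small sketch: pointer- and long-heavy structures shrink when those types are 32 bits wide, so more of them fit in a cache line. The structure below is hypothetical (it is not taken from the kernel), fixed-width fields stand in for pointers and long so the sizes are the same on any build host, and the 64-byte cache line is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical list node as an LP64 build would lay it out. */
struct sketch_node_lp64 {
	uint64_t next;	/* stands in for a 64-bit pointer */
	uint64_t prev;	/* stands in for a 64-bit pointer */
	int64_t count;	/* stands in for a 64-bit long */
};

/* The same node as an ILP32 build would lay it out. */
struct sketch_node_ilp32 {
	uint32_t next;	/* stands in for a 32-bit pointer */
	uint32_t prev;	/* stands in for a 32-bit pointer */
	int32_t count;	/* stands in for a 32-bit long */
};

/* Number of whole nodes that fit in one cache line. */
static size_t sketch_nodes_per_line(size_t line_bytes, size_t node_bytes)
{
	return line_bytes / node_bytes;
}
```

Under these assumptions the node shrinks from 24 to 12 bytes, and a 64-byte line holds five nodes instead of two. Real kernel structures mix in fixed-width fields, so the overall saving is smaller than this best case.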
The patchset targets RISC-V and is built on the RV64ILP32 ABI, which was introduced into RISC-V's psABI in January 2025 [1]. This patchset equips an rv64ilp32-abi kernel with all the functionalities of a traditional lp64-abi kernel, yet restricts the address space to 2GiB. Hence, the rv64ilp32-abi kernel simultaneously supports lp64-abi userspace and ilp32-abi (compat) userspace, the same as the traditional lp64-abi kernel. +--------------------------------+ | +-------------+--------------+ | | | | (compat) | | | | lp64-abi | ilp32-abi | | User | +-------------+--------------+ | +--------------------------------+------- | +----------------------------+ | | | rv64ilp32-abi / lp64-abi | | Kernel | | ^^^^^^^^^^^^^ | | | +----------------------------+ | +--------------lp64-sbi----------+------- | +----------------------------+ | | | lp64-abi | | OpenSBI | +----------------------------+ | +--------------------------------+------- | +----------------------------+ | | | rv64gcbvh (RISC-V 64-bit) | | ISA | +----------------------------+ | +--------------------------------+ Caution: The rv64ilp32-abi and lp64-abi kernels are equivalent and can be used interchangeably. The only difference is that the rv64ilp32-abi kernel restricts kernel and user space to separate 2GiB address spaces. Motivation ========== Because all RISC-V RVA(B) Profiles are based on the 64-bit ISA, the market has experienced a significant rise in RISC-V 64-bit ISA SoCs and CPU cores for resource-constrained scenarios, such as: - allwinner/sun20i-d1-lichee - allwinner/sun20i-d1s-mangopi - bouffalo/bl808 - canaan/k230d - microchip/mpfs-beaglev-fire - renesas/rzfive-smarc - sophgo/cv1800b - sophgo/cv1812h - sophgo/sg2002 The listed RV64 ISA-based SoCs with limited memory (less than 1GiB) can benefit from this patchset. The patchset's benefit is not only decreasing the memory footprint but also improving performance due to increased cache density. 
Hence, all RVA(B) Profile hardware can benefit from this patchset. Patchset Organization ===================== This patchset is now in its third version. The major update is the shift to CONFIG_64BIT with user lp64-abi & ilp32-abi support. The prior versions (v1, v2) were all based on CONFIG_32BIT and only supported the user ilp32-abi. The innovation of v3 lies in supporting user lp64-abi by inheriting CONFIG_64BIT. This patchset comprises 43 patches affecting more than 20 subsystems. Most modifications are about ensuring the correct usage of BITS_PER_LONG and CONFIG_64BIT. Much existing Linux code treats the two as interchangeable, because CONFIG_64BIT always implied BITS_PER_LONG == 64 before. - PATCH[1] : The rv64ilp32-abi kernel reuses lp64-abi uapi. - PATCH[2~17] : The riscv subsystem-related modifications. - PATCH[18~43]: Other subsystem-related modifications. The first patch needs discussion and is titled "uapi: Reuse lp64 ABI interface." How do we define a unified set of lp64-abi uapi header files that could be utilized for the lp64-abi kernel and the rv64ilp32-abi kernel? To get started with the patch set quickly, check out the following demo. Demo Introduction ================= To test the patchset, use a riscv64 toolchain with rv64ilp32-abi support. The rv64ilp32-abi is integrated as a -mabi=ilp32 feature within the standard rv64 toolchain. We've built a multi-lib riscv64-elf-toolchain [2] containing rv64ilp32-abi. We also provide pre-compiled demo materials for a quick start, such as qemu, kernel, and rootfs binaries.
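With such a toolchain the register width and the width of long no longer coincide, which is why code keyed on one of them may need to be keyed on the other. A minimal sketch of that distinction follows. It assumes the compiler predefines __riscv_xlen on RISC-V targets; the fallback to the pointer width on other hosts is only an approximation, and the sketch_ names are hypothetical.

```c
#include <limits.h>

/* Width of long in bits, the user-space analogue of BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG	((int)(sizeof(long) * CHAR_BIT))

/* Register width: __riscv_xlen on RISC-V targets, pointer width elsewhere. */
#ifdef __riscv_xlen
#define SKETCH_XLEN		((int)__riscv_xlen)
#else
#define SKETCH_XLEN		((int)(sizeof(void *) * CHAR_BIT))
#endif

/* Classify a (register width, long width) pair into the ABI names used here. */
static const char *sketch_abi_kind(int xlen, int bits_per_long)
{
	if (xlen == 64 && bits_per_long == 64)
		return "lp64";
	if (xlen == 64 && bits_per_long == 32)
		return "rv64ilp32";
	if (xlen == 32 && bits_per_long == 32)
		return "ilp32";
	return "unknown";
}
```

Calling sketch_abi_kind(SKETCH_XLEN, SKETCH_BITS_PER_LONG) in a program built for rv64 with -mabi=ilp32 would be expected to report "rv64ilp32", the one case where the two widths differ.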
After download from [2]: $ tar zxvf riscv64-elf-ubuntu-20(2).04-gcc-nightly-2025.03.24-nightly.tar.gz $ cd riscv/qemu-linux - Image_rv64ilp32 rv64ilp32-abi kernel - Image_rv64lp64 lp64-abi kernel - u64lp64_rootfs.ext2 lp64-abi userspace rootfs - start-qemu-rv64.sh qemu running wrapper script Compile Image_rv64ilp32: $ make ARCH=riscv CROSS_COMPILE=/riscv/bin/riscv64-unknown-elf- rv64ilp32_defconfig all Compile Image_rv64lp64: $ make ARCH=riscv CROSS_COMPILE=/riscv/bin/riscv64-unknown-elf- defconfig all Quick Start: $ ./start-qemu-rv64.sh Image_rv64ilp32 u64lp64_rootfs.ext2 v.s. $ ./start-qemu-rv64.sh Image_rv64lp64 u64lp64_rootfs.ext2 Used Memory Comparison ====================== Under the same configuration, the used memory decreased by 20% (10.8 -> 8.2) with the Image_rv64ilp32 replacement (Qemu, firmware, and lp64-abi user rootfs are the same). $ ./start-qemu-rv64.sh Image_rv64ilp32 u64lp64_rootfs.ext2 $ free -h total used free shared buff/cache available Mem: 105.4M 8.2M 93.9M 44.0K 3.3M 93.6M ^^^^ $ ./start-qemu-rv64.sh Image_rv64lp64 u64lp64_rootfs.ext2 $ free -h total used free shared buff/cache available Mem: 89.3M 10.8M 74.8M 44.0K 3.7M 74.9M ^^^^^ $ cat start-qemu-rv64.sh exec qemu-system-riscv64 -cpu rv64 -M virt -m 128m -nographic -kernel $1 -drive file=$2,format=raw,id=hd0 -device virtio-blk-device,drive=hd0 -append "rootwait root=/dev/vda ro console=ttyS0 earlycon=sbi norandmaps no5lvl no4lvl" User Virtual Memory Layout ========================== Here is the comparison running lp64-abi userspace rootfs on rv64ilp32-abi kernel and lp64-abi kernel: (rv64ilp32-abi kernel + lp64-abi user rootfs) $ cat /proc/1/maps 55555000-5560c000 r-xp 00000000 fe:00 17 /bin/busybox 5560c000-5560f000 r--p 000b7000 fe:00 17 /bin/busybox 5560f000-55610000 rw-p 000ba000 fe:00 17 /bin/busybox 55610000-55631000 rw-p 00000000 00:00 0 [heap] 77e69000-77e6b000 rw-p 00000000 00:00 0 77e6b000-77fba000 r-xp 00000000 fe:00 140 /lib/libc.so.6 77fba000-77fbd000 r--p 0014f000 fe:00 140 
/lib/libc.so.6 77fbd000-77fbf000 rw-p 00152000 fe:00 140 /lib/libc.so.6 77fbf000-77fcb000 rw-p 00000000 00:00 0 77fcb000-77fd5000 r-xp 00000000 fe:00 148 /lib/libresolv.so.2 77fd5000-77fd6000 r--p 0000a000 fe:00 148 /lib/libresolv.so.2 77fd6000-77fd7000 rw-p 0000b000 fe:00 148 /lib/libresolv.so.2 77fd7000-77fd9000 rw-p 00000000 00:00 0 77fd9000-77fdb000 r--p 00000000 00:00 0 [vvar] 77fdb000-77fdc000 r-xp 00000000 00:00 0 [vdso] 77fdc000-77ffc000 r-xp 00000000 fe:00 135 /lib/ld-linux-riscv64-lp64d.so.1 77ffc000-77ffe000 r--p 0001f000 fe:00 135 /lib/ld-linux-riscv64-lp64d.so.1 77ffe000-78000000 rw-p 00021000 fe:00 135 /lib/ld-linux-riscv64-lp64d.so.1 7ffdf000-80000000 rw-p 00000000 00:00 0 [stack] (lp64-abi kernel + lp64-abi user rootfs) $ cat /proc/1/maps 2aaaaaa000-2aaab61000 r-xp 00000000 fe:00 17 /bin/busybox 2aaab61000-2aaab64000 r--p 000b7000 fe:00 17 /bin/busybox 2aaab64000-2aaab65000 rw-p 000ba000 fe:00 17 /bin/busybox 2aaab65000-2aaab86000 rw-p 00000000 00:00 0 [heap] 3ff7e69000-3ff7e6b000 rw-p 00000000 00:00 0 3ff7e6b000-3ff7fba000 r-xp 00000000 fe:00 140 /lib/libc.so.6 3ff7fba000-3ff7fbd000 r--p 0014f000 fe:00 140 /lib/libc.so.6 3ff7fbd000-3ff7fbf000 rw-p 00152000 fe:00 140 /lib/libc.so.6 3ff7fbf000-3ff7fcb000 rw-p 00000000 00:00 0 3ff7fcb000-3ff7fd5000 r-xp 00000000 fe:00 148 /lib/libresolv.so.2 3ff7fd5000-3ff7fd6000 r--p 0000a000 fe:00 148 /lib/libresolv.so.2 3ff7fd6000-3ff7fd7000 rw-p 0000b000 fe:00 148 /lib/libresolv.so.2 3ff7fd7000-3ff7fd9000 rw-p 00000000 00:00 0 3ff7fd9000-3ff7fdb000 r--p 00000000 00:00 0 [vvar] 3ff7fdb000-3ff7fdc000 r-xp 00000000 00:00 0 [vdso] 3ff7fdc000-3ff7ffc000 r-xp 00000000 fe:00 135 /lib/ld-linux-riscv64-lp64d.so.1 3ff7ffc000-3ff7ffe000 r--p 0001f000 fe:00 135 /lib/ld-linux-riscv64-lp64d.so.1 3ff7ffe000-3ff8000000 rw-p 00021000 fe:00 135 /lib/ld-linux-riscv64-lp64d.so.1 3ffffdf000-4000000000 rw-p 00000000 00:00 0 [stack] ~~~~~~~~~~~~~~~~~~~~~~ For the ilp32-abi userspace rootfs, the virtual memory layouts are the same: 
(lp64-abi/rv64ilp32-abi kernel + ilp32-abi user rootfs) $ cat /proc/1/maps 55555000-55637000 r-xp 00000000 fe:00 17 /bin/busybox 55637000-55639000 r--p 000e1000 fe:00 17 /bin/busybox 55639000-5563a000 rw-p 000e3000 fe:00 17 /bin/busybox 5563a000-5565c000 rw-p 00000000 00:00 0 [heap] 77e63000-77fbe000 r-xp 00000000 fe:00 145 /lib/libc.so.6 77fbe000-77fc0000 r--p 0015a000 fe:00 145 /lib/libc.so.6 77fc0000-77fc1000 rw-p 0015c000 fe:00 145 /lib/libc.so.6 77fc1000-77fcb000 rw-p 00000000 00:00 0 77fcb000-77fd5000 r-xp 00000000 fe:00 154 /lib/libresolv.so.2 77fd5000-77fd6000 r--p 0000a000 fe:00 154 /lib/libresolv.so.2 77fd6000-77fd7000 rw-p 0000b000 fe:00 154 /lib/libresolv.so.2 77fd7000-77fd9000 rw-p 00000000 00:00 0 77fd9000-77fdb000 r--p 00000000 00:00 0 [vvar] 77fdb000-77fdc000 r-xp 00000000 00:00 0 [vdso] 77fdc000-77ffe000 r-xp 00000000 fe:00 138 /lib/ld-linux-riscv32-ilp32d.so.1 77ffe000-77fff000 r--p 00022000 fe:00 138 /lib/ld-linux-riscv32-ilp32d.so.1 77fff000-78000000 rw-p 00023000 fe:00 138 /lib/ld-linux-riscv32-ilp32d.so.1 7ffdf000-80000000 rw-p 00000000 00:00 0 [stack] Kernel Virtual Memory Layout ============================ Here is the comparison on rv64ilp32-abi kernel and lp64-abi kernel: Virtual kernel memory layout (rv64ilp32-abi kernel): fixmap : 0x94a00000 - 0x94ffffff (6144 kB) pci io : 0x95000000 - 0x95ffffff ( 16 MB) vmemmap : 0x96000000 - 0x97ffffff ( 32 MB) vmalloc : 0x98000000 - 0xb7ffffff ( 512 MB) modules : 0xb8000000 - 0xbbffffff ( 64 MB) lowmem : 0xc0000000 - 0xc7ffffff ( 128 MB) kasan : 0x80000000 - 0x8fffffff ( 256 MB) kernel : 0xbc000000 - 0xbfffffff ( 64 MB) Virtual kernel memory layout (lp64-abi kernel): fixmap : 0xffffffc4fea00000 - 0xffffffc4feffffff (6144 kB) pci io : 0xffffffc4ff000000 - 0xffffffc4ffffffff ( 16 MB) vmemmap : 0xffffffc500000000 - 0xffffffc5ffffffff (4096 MB) vmalloc : 0xffffffc600000000 - 0xffffffd5ffffffff ( 64 GB) modules : 0xffffffff01591000 - 0xffffffff7fffffff (2026 MB) lowmem : 0xffffffd600000000 - 
0xffffffd607ffffff ( 128 MB) kasan : 0xfffffff700000000 - 0xfffffffeffffffff ( 32 GB) kernel : 0xffffffff80000000 - 0xfffffffffffffffe (2047 MB) Memory Info Comparison ====================== $ ./start-qemu-rv64.sh Image_rv64ilp32 u64lp64_rootfs.ext2 $ cat /proc/meminfo MemTotal: 107916 kB MemFree: 96200 kB MemAvailable: 95932 kB Buffers: 448 kB Cached: 2268 kB SwapCached: 0 kB Active: 2432 kB Inactive: 832 kB Active(anon): 44 kB Inactive(anon): 548 kB Active(file): 2388 kB Inactive(file): 284 kB Unevictable: 0 kB Mlocked: 0 kB SwapTotal: 0 kB SwapFree: 0 kB Dirty: 108 kB Writeback: 0 kB AnonPages: 648 kB Mapped: 1672 kB Shmem: 44 kB KReclaimable: 688 kB Slab: 4996 kB SReclaimable: 688 kB SUnreclaim: 4308 kB KernelStack: 768 kB PageTables: 164 kB SecPageTables: 0 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 53956 kB Committed_AS: 2240 kB VmallocTotal: 524288 kB VmallocUsed: 924 kB VmallocChunk: 0 kB Percpu: 76 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 2048 kB Hugetlb: 0 kB $ ./start-qemu-rv64.sh Image_rv64lp64 u64lp64_rootfs.ext2 $ cat /proc/meminfo MemTotal: 91428 kB MemFree: 77048 kB MemAvailable: 77172 kB Buffers: 448 kB Cached: 2268 kB SwapCached: 0 kB Active: 2492 kB Inactive: 768 kB Active(anon): 48 kB Inactive(anon): 540 kB Active(file): 2444 kB Inactive(file): 228 kB Unevictable: 0 kB Mlocked: 0 kB SwapTotal: 0 kB SwapFree: 0 kB Dirty: 32 kB Writeback: 0 kB AnonPages: 648 kB Mapped: 1768 kB Shmem: 44 kB KReclaimable: 1140 kB Slab: 7220 kB SReclaimable: 1140 kB SUnreclaim: 6080 kB KernelStack: 768 kB PageTables: 408 kB SecPageTables: 0 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 45712 kB Committed_AS: 2240 kB VmallocTotal: 67108864 kB VmallocUsed: 864 kB VmallocChunk: 0 kB Percpu: 88 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 2048 kB Hugetlb: 0 kB CPU Info Comparison =================== After disabling sv48 and sv57, 
there is no difference in the "/proc/cpuinfo". $ ./start-qemu-rv64.sh Image_rv64lp64 u64lp64_rootfs.ext2 $ cat /proc/cpuinfo processor : 0 hart : 0 isa : rv64imafdch_zicbom_zicboz_zicntr_zicsr_zifencei_zihintpause_zihpm_zawrs_zfa_zca_zcd_zba_zbb_zbc_zbs_sstc_svadu mmu : sv39 mvendorid : 0x0 marchid : 0x0 mimpid : 0x0 hart isa : rv64imafdch_zicbom_zicboz_zicntr_zicsr_zifencei_zihintpause_zihpm_zawrs_zfa_zca_zcd_zba_zbb_zbc_zbs_sstc_svadu $ ./start-qemu-rv64.sh Image_rv64ilp32 u64lp64_rootfs.ext2 $ cat /proc/cpuinfo processor : 0 hart : 0 isa : rv64imafdch_zicbom_zicboz_zicntr_zicsr_zifencei_zihintpause_zihpm_zawrs_zfa_zca_zcd_zba_zbb_zbc_zbs_sstc_svadu mmu : sv39 mvendorid : 0x0 marchid : 0x0 mimpid : 0x0 hart isa : rv64imafdch_zicbom_zicboz_zicntr_zicsr_zifencei_zihintpause_zihpm_zawrs_zfa_zca_zcd_zba_zbb_zbc_zbs_sstc_svadu .config Difference ================== The patchset adds CONFIG_ABI_RV64ILP32, which depends on CONFIG_64BIT, to Kconfig; it switches the compile options from "-mabi=lp64 -melf64lriscv" to "-mabi=ilp32 -melf32lriscv". So, the Kconfig differences with and without ABI_RV64ILP32 are few: - CONFIG_PAGE_OFFSET Changed to 0xc0000000. - CONFIG_ILLEGAL_POINTER_VALUE Changed to 0x0. - CONFIG_ABI_RV64ILP32 Compile option depends on CONFIG_64BIT. - CONFIG_HAVE_CMPXCHG_DOUBLE The rv64ilp32-abi kernel offers a new feature. - CONFIG_ZONE_DMA32 Unnecessary for the rv64ilp32-abi kernel. - CONFIG_CSD_LOCK_WAIT_DEBUG Not supported by the rv64ilp32-abi kernel because BITS_PER_LONG = 32.
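To make the CONFIG_PAGE_OFFSET change concrete, the lowmem range in the layout tables (0xc0000000 - 0xc7ffffff) can be reproduced with a simple offset translation. This is a sketch under two assumptions: that lowmem is a plain linear mapping, and that RAM starts at physical 0x60000000 as in the demo's boot log. The sketch_ names are hypothetical and this is not the kernel's actual helper.

```c
#include <stdint.h>

#define SKETCH_PAGE_OFFSET	0xc0000000u		/* CONFIG_PAGE_OFFSET with ABI_RV64ILP32 */
#define SKETCH_PHYS_RAM_BASE	UINT64_C(0x60000000)	/* assumed RAM base of the demo machine */

/* Linear map: physical address (may be 64-bit) to 32-bit kernel virtual address. */
static uint32_t sketch_pa_to_va(uint64_t pa)
{
	return (uint32_t)(pa - SKETCH_PHYS_RAM_BASE) + SKETCH_PAGE_OFFSET;
}

/* Inverse translation: 32-bit kernel virtual address back to physical. */
static uint64_t sketch_va_to_pa(uint32_t va)
{
	return (uint64_t)(va - SKETCH_PAGE_OFFSET) + SKETCH_PHYS_RAM_BASE;
}
```

With 128 MB of RAM the first and last physical bytes, 0x60000000 and 0x67ffffff, land on 0xc0000000 and 0xc7ffffff, matching the lowmem line of the rv64ilp32-abi layout.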
$ diff build-rv64lp64/.config build-rv64ilp32/.config 296c296 < CONFIG_PAGE_OFFSET=0xff60000000000000 --- > CONFIG_PAGE_OFFSET=0xc0000000 308c308 < CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000 --- > CONFIG_ILLEGAL_POINTER_VALUE=0 352c352 < # CONFIG_ABI_RV64ILP32 is not set --- > CONFIG_ABI_RV64ILP32=y 609a610 > CONFIG_HAVE_CMPXCHG_DOUBLE=y 837d837 < CONFIG_ZONE_DMA32=y 7240d7239 < # CONFIG_CSD_LOCK_WAIT_DEBUG is not set 6 differences Use "zcat /proc/config.gz" to get the .config file in our qemu demo. Dmesg Difference ================ $ diff rv64lp64.log rv64ilp32.log 1c1 < Linux version 6.14.0-rc1-00041-g804ac3b4d679 (ren.guo at ea134-sw12.eng.xrvm.cn) (riscv64-unknown-elf-gcc (gf9ffd92f861-dirty) 13.2.0, GNU ld (GNU Binutils) 2.42) #1 SMP Sat Mar 15 11:57:20 CST 2025 --- > Linux version 6.14.0-rc1-00041-g804ac3b4d679 (ren.guo at ea134-sw12.eng.xrvm.cn) (riscv64-unknown-elf-gcc (gf9ffd92f861-dirty) 13.2.0, GNU ld (GNU Binutils) 2.42) #1 SMP Sat Mar 15 11:55:33 CST 2025 13,14d12 < Disabled 5-level paging < Disabled 4-level and 5-level paging 19,20c17 < DMA32 [mem 0x0000000060000000-0x0000000067ffffff] < Normal empty --- > Normal [mem 0x0000000060000000-0x0000000067ffffff] 31c28 < percpu: Embedded 22 pages/cpu s49384 r8192 d32536 u90112 --- > percpu: Embedded 16 pages/cpu s34264 r8192 d23080 u65536 33,35c30,33 < printk: log buffer data + meta data: 131072 + 458752 = 589824 bytes < Dentry cache hash table entries: 16384 (order: 5, 131072 bytes, linear) < Inode-cache hash table entries: 8192 (order: 4, 65536 bytes, linear) --- > Unknown kernel command line parameters "no5lvl no4lvl", will be passed to user space. 
> printk: log buffer data + meta data: 131072 + 409600 = 540672 bytes > Dentry cache hash table entries: 16384 (order: 4, 65536 bytes, linear) > Inode-cache hash table entries: 8192 (order: 3, 32768 bytes, linear) 40c38 < software IO TLB: mapped [mem 0x0000000067f6b000-0x0000000067fab000] (0MB) --- > software IO TLB: mapped [mem 0x0000000067f8e000-0x0000000067fce000] (0MB) 42,48c40,46 < fixmap : 0xffffffc4fea00000 - 0xffffffc4feffffff (6144 kB) < pci io : 0xffffffc4ff000000 - 0xffffffc4ffffffff ( 16 MB) < vmemmap : 0xffffffc500000000 - 0xffffffc5ffffffff (4096 MB) < vmalloc : 0xffffffc600000000 - 0xffffffd5ffffffff ( 64 GB) < modules : 0xffffffff0158d000 - 0xffffffff7fffffff (2026 MB) < lowmem : 0xffffffd600000000 - 0xffffffd607ffffff ( 128 MB) < kernel : 0xffffffff80000000 - 0xfffffffffffffffe (2047 MB) --- > fixmap : 0x94a00000 - 0x94ffffff (6144 kB) > pci io : 0x95000000 - 0x95ffffff ( 16 MB) > vmemmap : 0x96000000 - 0x97ffffff ( 32 MB) > vmalloc : 0x98000000 - 0xb7ffffff ( 512 MB) > modules : 0xb8000000 - 0xbbffffff ( 64 MB) > lowmem : 0xc0000000 - 0xc7ffffff ( 128 MB) > kernel : 0xbc000000 - 0xbfffffff ( 64 MB) 68,69c66,67 < Mount-cache hash table entries: 512 (order: 0, 4096 bytes, linear) < Mountpoint-cache hash table entries: 512 (order: 0, 4096 bytes, linear) --- > Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear) > Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear) 77c75 < Memory: 87584K/131072K available (9731K kernel code, 4933K rwdata, 4096K rodata, 2307K init, 484K bss, 41948K reserved, 0K cma-reserved) --- > Memory: 104768K/131072K available (10043K kernel code, 4722K rwdata, 4096K rodata, 2265K init, 371K bss, 25420K reserved, 0K cma-reserved) 85d82 < DMA: preallocated 128 KiB GFP_KERNEL|GFP_DMA32 pool for atomic allocations 88c85 < audit: type=2000 audit(0.104:1): state=initialized audit_enabled=0 res=1 --- > audit: type=2000 audit(0.152:1): state=initialized audit_enabled=0 res=1 92c89 < HugeTLB: 28 KiB 
vmemmap can be freed for a 2.00 MiB page --- > HugeTLB: 0 KiB vmemmap can be freed for a 2.00 MiB page 106c103 < tcp_listen_portaddr_hash hash table entries: 128 (order: 0, 4096 bytes, linear) --- > tcp_listen_portaddr_hash hash table entries: 256 (order: 0, 5120 bytes, linear) 108,109c105,106 < TCP established hash table entries: 1024 (order: 1, 8192 bytes, linear) < TCP bind hash table entries: 1024 (order: 4, 65536 bytes, linear) --- > TCP established hash table entries: 1024 (order: 0, 4096 bytes, linear) > TCP bind hash table entries: 1024 (order: 3, 40960 bytes, linear) 111,112c108,109 < UDP hash table entries: 256 (order: 3, 40960 bytes, linear) < UDP-Lite hash table entries: 256 (order: 3, 40960 bytes, linear) --- > UDP hash table entries: 256 (order: 2, 20480 bytes, linear) > UDP-Lite hash table entries: 256 (order: 2, 20480 bytes, linear) 120c117 < workingset: timestamp_bits=46 max_order=15 bucket_order=0 --- > workingset: timestamp_bits=14 max_order=15 bucket_order=1 165c162 < goldfish_rtc 101000.rtc: setting system clock to 2025-03-15T08:48:58 UTC (1742028538) --- > goldfish_rtc 101000.rtc: setting system clock to 2025-03-15T08:51:36 UTC (1742028696) 191c188 < Freeing unused kernel image (initmem) memory: 2304K --- > Freeing unused kernel image (initmem) memory: 2264K 18 differences References ========== [1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/381 [2] https://github.com/ruyisdk/riscv-gnu-toolchain-rv64ilp32/releases/tag/2025.03.24 Changelog ========= v3: - Base on CONFIG_64BIT instead of CONFIG_32BIT - Add lp64-abi userspace support - Remove rv64ilp32-abi userspace support - Rebase on v6.14-rc1 v2: https://lore.kernel.org/linux-riscv/20231112061514.2306187-1-guoren at kernel.org/ - Add u64ilp32 support - Rebase v6.5-rc1 - Enable 64ilp32 vgettimeofday for benchmarking v1: https://lore.kernel.org/linux-riscv/20230518131013.3366406-1-guoren at kernel.org/ Guo Ren (Alibaba DAMO Academy) (43): rv64ilp32_abi: uapi: Reuse lp64 ABI 
interface rv64ilp32_abi: riscv: Adapt Makefile and Kconfig rv64ilp32_abi: riscv: Adapt ULL & UL definition rv64ilp32_abi: riscv: Introduce xlen_t to adapt __riscv_xlen != BITS_PER_LONG rv64ilp32_abi: riscv: crc32: Utilize 64-bit width to improve the performance rv64ilp32_abi: riscv: csum: Utilize 64-bit width to improve the performance rv64ilp32_abi: riscv: arch_hweight: Adapt cpopw & cpop of zbb extension rv64ilp32_abi: riscv: bitops: Adapt ctzw & clzw of zbb extension rv64ilp32_abi: riscv: Reuse LP64 SBI interface rv64ilp32_abi: riscv: Update SATP.MODE.ASID width rv64ilp32_abi: riscv: Introduce PTR_L and PTR_S rv64ilp32_abi: riscv: Introduce cmpxchg_double rv64ilp32_abi: riscv: Correct stackframe layout rv64ilp32_abi: riscv: Adapt kernel module code rv64ilp32_abi: riscv: mm: Adapt MMU_SV39 for 2GiB address space rv64ilp32_abi: riscv: Support physical addresses >= 0x80000000 rv64ilp32_abi: riscv: Adapt kasan memory layout rv64ilp32_abi: riscv: kvm: Initial support rv64ilp32_abi: irqchip: irq-riscv-intc: Use xlen_t instead of ulong rv64ilp32_abi: drivers/perf: Adapt xlen_t of sbiret rv64ilp32_abi: asm-generic: Add custom BITS_PER_LONG definition rv64ilp32_abi: bpf: Change KERN_ARENA_SZ to 256MiB rv64ilp32_abi: compat: Correct compat_ulong_t cast rv64ilp32_abi: compiler_types: Add "long long" into __native_word() rv64ilp32_abi: exec: Adapt 64lp64 env and argv rv64ilp32_abi: file_ref: Use 32-bit width for refcnt rv64ilp32_abi: input: Adapt BITS_PER_LONG to dword rv64ilp32_abi: iov_iter: Resize kvec to match iov_iter's size rv64ilp32_abi: locking/atomic: Use BITS_PER_LONG for scripts rv64ilp32_abi: kernel/smp: Disable CSD_LOCK_WAIT_DEBUG rv64ilp32_abi: maple_tree: Use BITS_PER_LONG instead of CONFIG_64BIT rv64ilp32_abi: mm: Remove _folio_nr_pages rv64ilp32_abi: mm/auxvec: Adapt mm->saved_auxv[] to Elf64 rv64ilp32_abi: mm: Adapt vm_flags_t struct rv64ilp32_abi: net: Use BITS_PER_LONG in struct dst_entry rv64ilp32_abi: printf: Use BITS_PER_LONG instead of CONFIG_64BIT 
rv64ilp32_abi: random: Adapt fast_pool struct rv64ilp32_abi: syscall: Use CONFIG_64BIT instead of BITS_PER_LONG rv64ilp32_abi: sysinfo: Adapt sysinfo structure to lp64 uapi rv64ilp32_abi: tracepoint-defs: Using u64 for trace_print_flags.mask rv64ilp32_abi: tty: Adapt ptr_to_compat rv64ilp32_abi: memfd: Use vm_flag_t riscv: Fixup address space overlay of print_mlk arch/riscv/Kconfig | 15 +- arch/riscv/Makefile | 17 ++ arch/riscv/configs/rv64ilp32.config | 1 + arch/riscv/include/asm/arch_hweight.h | 8 +- arch/riscv/include/asm/asm.h | 13 +- arch/riscv/include/asm/bitops.h | 21 +- arch/riscv/include/asm/checksum.h | 4 + arch/riscv/include/asm/cmpxchg.h | 57 ++++- arch/riscv/include/asm/cpu_ops_sbi.h | 4 +- arch/riscv/include/asm/csr.h | 227 +++++++++--------- arch/riscv/include/asm/kasan.h | 6 +- arch/riscv/include/asm/kvm_aia.h | 32 +-- arch/riscv/include/asm/kvm_host.h | 192 +++++++-------- arch/riscv/include/asm/kvm_nacl.h | 26 +- arch/riscv/include/asm/kvm_vcpu_insn.h | 4 +- arch/riscv/include/asm/kvm_vcpu_pmu.h | 8 +- arch/riscv/include/asm/kvm_vcpu_sbi.h | 4 +- arch/riscv/include/asm/page.h | 23 +- arch/riscv/include/asm/pgtable-64.h | 55 +++-- arch/riscv/include/asm/pgtable.h | 60 ++++- arch/riscv/include/asm/processor.h | 12 +- arch/riscv/include/asm/ptrace.h | 92 +++---- arch/riscv/include/asm/sbi.h | 32 +-- arch/riscv/include/asm/scs.h | 4 +- arch/riscv/include/asm/sparsemem.h | 2 +- arch/riscv/include/asm/stacktrace.h | 6 + arch/riscv/include/asm/switch_to.h | 4 +- arch/riscv/include/asm/syscall_table.h | 2 +- arch/riscv/include/asm/thread_info.h | 2 +- arch/riscv/include/asm/timex.h | 4 +- arch/riscv/include/asm/unistd.h | 4 +- arch/riscv/include/uapi/asm/bitsperlong.h | 6 + arch/riscv/include/uapi/asm/elf.h | 4 +- arch/riscv/include/uapi/asm/kvm.h | 56 ++--- arch/riscv/include/uapi/asm/ptrace.h | 97 ++++---- arch/riscv/include/uapi/asm/ucontext.h | 7 +- arch/riscv/include/uapi/asm/unistd.h | 2 +- arch/riscv/kernel/compat_signal.c | 4 +- 
arch/riscv/kernel/cpu.c | 4 +- arch/riscv/kernel/cpu_ops_sbi.c | 4 +- arch/riscv/kernel/entry.S | 32 +-- arch/riscv/kernel/head.S | 120 ++++++++- arch/riscv/kernel/module.c | 2 +- arch/riscv/kernel/process.c | 8 +- arch/riscv/kernel/sbi_ecall.c | 22 +- arch/riscv/kernel/signal.c | 4 +- arch/riscv/kernel/traps.c | 4 +- arch/riscv/kernel/vector.c | 2 +- arch/riscv/kvm/aia.c | 26 +- arch/riscv/kvm/aia_imsic.c | 6 +- arch/riscv/kvm/main.c | 2 +- arch/riscv/kvm/mmu.c | 10 +- arch/riscv/kvm/tlb.c | 76 +++--- arch/riscv/kvm/vcpu.c | 10 +- arch/riscv/kvm/vcpu_exit.c | 4 +- arch/riscv/kvm/vcpu_insn.c | 12 +- arch/riscv/kvm/vcpu_onereg.c | 18 +- arch/riscv/kvm/vcpu_pmu.c | 8 +- arch/riscv/kvm/vcpu_sbi_base.c | 2 +- arch/riscv/kvm/vmid.c | 4 +- arch/riscv/lib/crc32-riscv.c | 35 +-- arch/riscv/lib/csum.c | 48 ++-- arch/riscv/mm/context.c | 12 +- arch/riscv/mm/fault.c | 12 +- arch/riscv/mm/init.c | 63 +++-- arch/riscv/mm/kasan_init.c | 2 +- arch/riscv/mm/pageattr.c | 4 +- arch/riscv/mm/pgtable.c | 2 +- arch/riscv/net/bpf_jit_comp64.c | 6 +- drivers/char/random.c | 8 + drivers/input/input.c | 4 + drivers/irqchip/irq-riscv-intc.c | 9 +- drivers/perf/riscv_pmu_sbi.c | 4 +- drivers/tty/tty_io.c | 4 + fs/exec.c | 4 + fs/proc/loadavg.c | 10 +- fs/proc/task_mmu.c | 9 +- include/asm-generic/bitsperlong.h | 2 + include/asm-generic/module.h | 2 +- include/linux/atomic/atomic-long.h | 174 +++++++------- include/linux/compiler_types.h | 7 + include/linux/file_ref.h | 4 +- include/linux/maple_tree.h | 2 +- include/linux/memfd.h | 4 +- include/linux/mm.h | 14 +- include/linux/mm_types.h | 10 +- include/linux/sched/loadavg.h | 4 + include/linux/smp_types.h | 2 +- include/linux/socket.h | 35 +++ include/linux/tracepoint-defs.h | 4 + include/linux/uio.h | 6 + include/net/dst.h | 6 +- include/uapi/asm-generic/siginfo.h | 50 ++++ include/uapi/asm-generic/signal.h | 35 +++ include/uapi/asm-generic/stat.h | 25 ++ include/uapi/linux/atm.h | 7 + include/uapi/linux/atmdev.h | 7 + 
include/uapi/linux/auto_fs.h | 6 + include/uapi/linux/blkpg.h | 7 + include/uapi/linux/btrfs.h | 19 ++ include/uapi/linux/capi.h | 11 + include/uapi/linux/fs.h | 12 + include/uapi/linux/futex.h | 18 ++ include/uapi/linux/if.h | 6 + include/uapi/linux/netfilter/x_tables.h | 8 + include/uapi/linux/netfilter_ipv4/ip_tables.h | 7 + include/uapi/linux/nfs4_mount.h | 14 ++ include/uapi/linux/ppp-ioctl.h | 7 + include/uapi/linux/sctp.h | 3 + include/uapi/linux/sem.h | 38 +++ include/uapi/linux/socket.h | 7 + include/uapi/linux/sysctl.h | 32 +++ include/uapi/linux/sysinfo.h | 20 ++ include/uapi/linux/uhid.h | 7 + include/uapi/linux/uio.h | 11 + include/uapi/linux/usb/tmc.h | 14 ++ include/uapi/linux/usbdevice_fs.h | 50 ++++ include/uapi/linux/uvcvideo.h | 14 ++ include/uapi/linux/vfio.h | 7 + include/uapi/linux/videodev2.h | 7 + kernel/bpf/arena.c | 19 +- kernel/compat.c | 15 +- kernel/sched/loadavg.c | 4 + kernel/sys.c | 8 + lib/Kconfig.debug | 1 + lib/vsprintf.c | 2 +- mm/debug.c | 4 + mm/internal.h | 2 +- mm/memfd.c | 8 +- mm/memory.c | 4 + scripts/atomic/gen-atomic-long.sh | 4 +- scripts/checksyscalls.sh | 2 +- 132 files changed, 1728 insertions(+), 803 deletions(-) create mode 100644 arch/riscv/configs/rv64ilp32.config -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:15:42 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:15:42 -0400 Subject: [RFC PATCH V3 01/43] rv64ilp32_abi: uapi: Reuse lp64 ABI interface In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-2-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The rv64ilp32 abi kernel accommodates the lp64 abi userspace and leverages the lp64 abi Linux interface. Hence, unify the BITS_PER_LONG = 32 memory layout to match BITS_PER_LONG = 64. 
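To see the layout problem in isolation, here is a minimal userspace sketch. It is not taken from the patch, and every type and struct name in it is hypothetical. It assumes a 4-byte integer as a stand-in for a pointer as an ILP32 kernel would see it, and it shows that wrapping such a field in a union with a 64-bit member gives the structure the same size and field offsets an lp64 userspace uses.

```c
/* Illustrative sketch only: names are hypothetical, not from the patch. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a pointer as seen by an ILP32 kernel: 4 bytes wide. */
typedef uint32_t ilp32_ptr_t;

/* Layout an lp64 userspace produces: the pointer slot is 8 bytes. */
struct iobuf_lp64 {
	int length;
	uint64_t buffer;
};

/* Layout an unmodified ILP32 kernel would expect: the slot is 4 bytes. */
struct iobuf_ilp32 {
	int length;
	ilp32_ptr_t buffer;
};

/*
 * Padded variant: the 64-bit union member gives the slot lp64 size and
 * alignment. The patch uses "__"-prefixed names for the padding member;
 * buffer_pad is used here to stay out of the reserved namespace.
 */
struct iobuf_padded {
	int length;
	union {
		ilp32_ptr_t buffer;
		uint64_t buffer_pad;
	};
};

/* Returns 1 when the padded layout matches the lp64 layout. */
int padded_matches_lp64(void)
{
	return sizeof(struct iobuf_padded) == sizeof(struct iobuf_lp64) &&
	       offsetof(struct iobuf_padded, buffer) ==
	       offsetof(struct iobuf_lp64, buffer);
}

/* Returns 1 when the unpadded ILP32 layout differs from the lp64 one. */
int plain_differs_from_lp64(void)
{
	return sizeof(struct iobuf_ilp32) != sizeof(struct iobuf_lp64);
}
```

On a little-endian target the narrow member overlays the low half of the 64-bit slot, so this only works if the upper half written by userspace is zero.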
#if (__riscv_xlen == 64) && (BITS_PER_LONG == 32) union { void *datap; __u64 __datap; }; #else void *datap; #endif This is inspired by include/uapi/linux/kvm.h: struct kvm_dirty_log { ... union { void __user *dirty_bitmap; /* one bit per page */ __u64 padding2; }; }; This is a suggested solution for __riscv_xlen == 64, but we need a general way to determine CONFIG_64BIT/32BIT in uapi. Any help is welcome. TODO: Find a general way to replace __riscv_xlen for uapi headers. Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- include/linux/socket.h | 35 +++++++++++++ include/uapi/asm-generic/siginfo.h | 50 +++++++++++++++++++ include/uapi/asm-generic/signal.h | 35 +++++++++++++ include/uapi/asm-generic/stat.h | 25 ++++++++++ include/uapi/linux/atm.h | 7 +++ include/uapi/linux/atmdev.h | 7 +++ include/uapi/linux/blkpg.h | 7 +++ include/uapi/linux/btrfs.h | 19 +++++++ include/uapi/linux/capi.h | 11 ++++ include/uapi/linux/fs.h | 12 +++++ include/uapi/linux/futex.h | 18 +++++++ include/uapi/linux/if.h | 6 +++ include/uapi/linux/netfilter/x_tables.h | 8 +++ include/uapi/linux/netfilter_ipv4/ip_tables.h | 7 +++ include/uapi/linux/nfs4_mount.h | 14 ++++++ include/uapi/linux/ppp-ioctl.h | 7 +++ include/uapi/linux/sctp.h | 3 ++ include/uapi/linux/sem.h | 38 ++++++++++++++ include/uapi/linux/socket.h | 7 +++ include/uapi/linux/sysctl.h | 32 ++++++++++++ include/uapi/linux/uhid.h | 7 +++ include/uapi/linux/uio.h | 11 ++++ include/uapi/linux/usb/tmc.h | 14 ++++++ include/uapi/linux/usbdevice_fs.h | 50 +++++++++++++++++++ include/uapi/linux/uvcvideo.h | 14 ++++++ include/uapi/linux/vfio.h | 7 +++ include/uapi/linux/videodev2.h | 7 +++ 27 files changed, 458 insertions(+) diff --git a/include/linux/socket.h b/include/linux/socket.h index d18cc47e89bd..a1bc6e2b809e 100644 --- a/include/linux/socket.h +++ b/include/linux/socket.h @@ -81,12 +81,47 @@ struct msghdr { }; struct user_msghdr { +#if __riscv_xlen == 64 + union { + void __user *msg_name; /* ptr to socket address structure
*/ + u64 __msg_name; + }; +#else void __user *msg_name; /* ptr to socket address structure */ +#endif int msg_namelen; /* size of socket address structure */ +#if __riscv_xlen == 64 + union { + struct iovec __user *msg_iov; /* scatter/gather array */ + u64 __msg_iov; + }; +#else struct iovec __user *msg_iov; /* scatter/gather array */ +#endif +#if __riscv_xlen == 64 + union { + __kernel_size_t msg_iovlen; /* # elements in msg_iov */ + u64 __msg_iovlen; + }; +#else __kernel_size_t msg_iovlen; /* # elements in msg_iov */ +#endif +#if __riscv_xlen == 64 + union { + void __user *msg_control; /* ancillary data */ + u64 __msg_control; + }; +#else void __user *msg_control; /* ancillary data */ +#endif +#if __riscv_xlen == 64 + union { + __kernel_size_t msg_controllen; /* ancillary data buffer length */ + u64 __msg_controllen; + }; +#else __kernel_size_t msg_controllen; /* ancillary data buffer length */ +#endif unsigned int msg_flags; /* flags on received message */ }; diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h index 5a1ca43b5fc6..5c87b85d7858 100644 --- a/include/uapi/asm-generic/siginfo.h +++ b/include/uapi/asm-generic/siginfo.h @@ -7,7 +7,14 @@ typedef union sigval { int sival_int; +#if __riscv_xlen == 64 + union { + void __user *sival_ptr; + __u64 __sival_ptr; + }; +#else void __user *sival_ptr; +#endif } sigval_t; #define SI_MAX_SIZE 128 @@ -67,7 +74,14 @@ union __sifields { /* SIGILL, SIGFPE, SIGSEGV, SIGBUS, SIGTRAP, SIGEMT */ struct { +#if __riscv_xlen == 64 + union { + void __user *_addr; /* faulting insn/memory ref. */ + __u64 ___addr; + }; +#else void __user *_addr; /* faulting insn/memory ref. */ +#endif #define __ADDR_BND_PKEY_PAD (__alignof__(void *) < sizeof(short) ? 
\ sizeof(short) : __alignof__(void *)) @@ -82,8 +96,23 @@ union __sifields { /* used when si_code=SEGV_BNDERR */ struct { char _dummy_bnd[__ADDR_BND_PKEY_PAD]; +#if __riscv_xlen == 64 + union { + void __user *_lower; + __u64 ___lower; + }; +#else void __user *_lower; +#endif + +#if __riscv_xlen == 64 + union { + void __user *_upper; + __u64 ___upper; + }; +#else void __user *_upper; +#endif } _addr_bnd; /* used when si_code=SEGV_PKUERR */ struct { @@ -92,7 +121,14 @@ union __sifields { } _addr_pkey; /* used when si_code=TRAP_PERF */ struct { +#if __riscv_xlen == 64 + union { + unsigned long _data; + __u64 ___data; + }; +#else unsigned long _data; +#endif __u32 _type; __u32 _flags; } _perf; @@ -101,13 +137,27 @@ union __sifields { /* SIGPOLL */ struct { +#if __riscv_xlen == 64 + union { + __ARCH_SI_BAND_T _band; /* POLL_IN, POLL_OUT, POLL_MSG */ + __u64 ___band; + }; +#else __ARCH_SI_BAND_T _band; /* POLL_IN, POLL_OUT, POLL_MSG */ +#endif int _fd; } _sigpoll; /* SIGSYS */ struct { +#if __riscv_xlen == 64 + union { + void __user *_call_addr; /* calling user insn */ + __u64 ___call_addr; + }; +#else void __user *_call_addr; /* calling user insn */ +#endif int _syscall; /* triggering system call number */ unsigned int _arch; /* AUDIT_ARCH_* of syscall */ } _sigsys; diff --git a/include/uapi/asm-generic/signal.h b/include/uapi/asm-generic/signal.h index 0eb69dc8e572..efcd31a677ee 100644 --- a/include/uapi/asm-generic/signal.h +++ b/include/uapi/asm-generic/signal.h @@ -73,19 +73,54 @@ typedef unsigned long old_sigset_t; #ifndef __KERNEL__ struct sigaction { +#if __riscv_xlen == 64 + union { + __sighandler_t sa_handler; + __u64 __sa_handler; + }; +#else __sighandler_t sa_handler; +#endif +#if __riscv_xlen == 64 + union { + unsigned long sa_flags; + __u64 __sa_flags; + }; +#else unsigned long sa_flags; +#endif #ifdef SA_RESTORER +#if __riscv_xlen == 64 + union { + __sigrestore_t sa_restorer; + __u64 __sa_restorer; + }; +#else __sigrestore_t sa_restorer; +#endif #endif 
sigset_t sa_mask; /* mask last for extensibility */ }; #endif typedef struct sigaltstack { +#if __riscv_xlen == 64 + union { + void __user *ss_sp; + __u64 __ss_sp; + }; +#else void __user *ss_sp; +#endif int ss_flags; +#if __riscv_xlen == 64 + union { + __kernel_size_t ss_size; + __u64 __ss_size; + }; +#else __kernel_size_t ss_size; +#endif } stack_t; #endif /* __ASSEMBLY__ */ diff --git a/include/uapi/asm-generic/stat.h b/include/uapi/asm-generic/stat.h index 0d962ecd1663..c8908df5213f 100644 --- a/include/uapi/asm-generic/stat.h +++ b/include/uapi/asm-generic/stat.h @@ -21,6 +21,30 @@ #define STAT_HAVE_NSEC 1 +#if __riscv_xlen == 64 +struct stat { + unsigned long long st_dev; /* Device. */ + unsigned long long st_ino; /* File serial number. */ + unsigned int st_mode; /* File mode. */ + unsigned int st_nlink; /* Link count. */ + unsigned int st_uid; /* User ID of the file's owner. */ + unsigned int st_gid; /* Group ID of the file's group. */ + unsigned long long st_rdev; /* Device number, if device. */ + unsigned long long __pad1; + long long st_size; /* Size of file, in bytes. */ + int st_blksize; /* Optimal block size for I/O. */ + int __pad2; + long long st_blocks; /* Number 512-byte blocks allocated. */ + long long st_atime; /* Time of last access. */ + unsigned long long st_atime_nsec; + long long st_mtime; /* Time of last modification. */ + unsigned long long st_mtime_nsec; + long long st_ctime; /* Time of last status change. */ + unsigned long long st_ctime_nsec; + unsigned int __unused4; + unsigned int __unused5; +}; +#else struct stat { unsigned long st_dev; /* Device. */ unsigned long st_ino; /* File serial number. */ @@ -43,6 +67,7 @@ struct stat { unsigned int __unused4; unsigned int __unused5; }; +#endif /* This matches struct stat64 in glibc2.1. Only used for 32 bit. 
*/ #if __BITS_PER_LONG != 64 || defined(__ARCH_WANT_STAT64) diff --git a/include/uapi/linux/atm.h b/include/uapi/linux/atm.h index 95ebdcf4fe88..fe0da6a5e26d 100644 --- a/include/uapi/linux/atm.h +++ b/include/uapi/linux/atm.h @@ -234,7 +234,14 @@ static __inline__ int atmpvc_addr_in_use(struct sockaddr_atmpvc addr) struct atmif_sioc { int number; int length; +#if __riscv_xlen == 64 + union { + void __user *arg; + __u64 __arg; + }; +#else void __user *arg; +#endif }; diff --git a/include/uapi/linux/atmdev.h b/include/uapi/linux/atmdev.h index 20b0215084fc..e0456ed8b698 100644 --- a/include/uapi/linux/atmdev.h +++ b/include/uapi/linux/atmdev.h @@ -155,7 +155,14 @@ struct atm_dev_stats { struct atm_iobuf { int length; +#if __riscv_xlen == 64 + union { + void __user *buffer; + __u64 __buffer; + }; +#else void __user *buffer; +#endif }; /* for ATM_GETCIRANGE / ATM_SETCIRANGE */ diff --git a/include/uapi/linux/blkpg.h b/include/uapi/linux/blkpg.h index d0a64ee97c6d..31f70c9114c2 100644 --- a/include/uapi/linux/blkpg.h +++ b/include/uapi/linux/blkpg.h @@ -12,7 +12,14 @@ struct blkpg_ioctl_arg { int op; int flags; int datalen; +#if __riscv_xlen == 64 + union { + void __user *data; + __u64 __data; + }; +#else void __user *data; +#endif }; /* The subfunctions (for the op field) */ diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h index d3b222d7af24..25a9570cbb1c 100644 --- a/include/uapi/linux/btrfs.h +++ b/include/uapi/linux/btrfs.h @@ -838,7 +838,14 @@ struct btrfs_ioctl_received_subvol_args { struct btrfs_ioctl_send_args { __s64 send_fd; /* in */ __u64 clone_sources_count; /* in */ +#if __riscv_xlen == 64 + union { + __u64 __user *clone_sources; /* in */ + __u64 __pad; + }; +#else __u64 __user *clone_sources; /* in */ +#endif __u64 parent_root; /* in */ __u64 flags; /* in */ __u32 version; /* in */ @@ -959,9 +966,21 @@ struct btrfs_ioctl_encoded_io_args { * increase in the future). This must also be less than or equal to * unencoded_len. 
*/ +#if __riscv_xlen == 64 + union { + const struct iovec __user *iov; + const __u64 __iov; + }; + /* Number of iovecs. */ + union { + unsigned long iovcnt; + __u64 __iovcnt; + }; +#else const struct iovec __user *iov; /* Number of iovecs. */ unsigned long iovcnt; +#endif /* * Offset in file. * diff --git a/include/uapi/linux/capi.h b/include/uapi/linux/capi.h index 31f946f8a88d..dab4bb8e3ebb 100644 --- a/include/uapi/linux/capi.h +++ b/include/uapi/linux/capi.h @@ -77,8 +77,19 @@ typedef struct capi_profile { #define CAPI_GET_PROFILE _IOWR('C',0x09,struct capi_profile) typedef struct capi_manufacturer_cmd { +#if __riscv_xlen == 64 + union { + unsigned long cmd; + __u64 __cmd; + }; + union { + void __user *data; + __u64 __data; + }; +#else unsigned long cmd; void __user *data; +#endif } capi_manufacturer_cmd; /* diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h index 2bbe00cf1248..3ccd123a23a2 100644 --- a/include/uapi/linux/fs.h +++ b/include/uapi/linux/fs.h @@ -122,15 +122,27 @@ struct file_dedupe_range { /* And dynamically-tunable limits and defaults: */ struct files_stat_struct { +#if __riscv_xlen == 64 + unsigned long long nr_files; /* read only */ + unsigned long long nr_free_files; /* read only */ + unsigned long long max_files; /* tunable */ +#else unsigned long nr_files; /* read only */ unsigned long nr_free_files; /* read only */ unsigned long max_files; /* tunable */ +#endif }; struct inodes_stat_t { +#if __riscv_xlen == 64 + long long nr_inodes; + long long nr_unused; + long long dummy[5]; /* padding for sysctl ABI compatibility */ +#else long nr_inodes; long nr_unused; long dummy[5]; /* padding for sysctl ABI compatibility */ +#endif }; diff --git a/include/uapi/linux/futex.h b/include/uapi/linux/futex.h index d2ee625ea189..ae4ee8a66de1 100644 --- a/include/uapi/linux/futex.h +++ b/include/uapi/linux/futex.h @@ -108,7 +108,14 @@ struct futex_waitv { * changed. 
*/ struct robust_list { +#if __riscv_xlen == 64 + union { + struct robust_list __user *next; + u64 __next; + }; +#else struct robust_list __user *next; +#endif }; /* @@ -131,7 +138,11 @@ struct robust_list_head { * we keep userspace flexible, to freely shape its data-structure, * without hardcoding any particular offset into the kernel: */ +#if __riscv_xlen == 64 + long long futex_offset; +#else long futex_offset; +#endif /* * The death of the thread may race with userspace setting @@ -143,7 +154,14 @@ struct robust_list_head { * _might_ have taken. We check the owner TID in any case, * so only truly owned locks will be handled. */ +#if __riscv_xlen == 64 + union { + struct robust_list __user *list_op_pending; + u64 __list_op_pending; + }; +#else struct robust_list __user *list_op_pending; +#endif }; /* diff --git a/include/uapi/linux/if.h b/include/uapi/linux/if.h index 797ba2c1562a..232ab74922fe 100644 --- a/include/uapi/linux/if.h +++ b/include/uapi/linux/if.h @@ -219,6 +219,9 @@ struct if_settings { /* interface settings */ sync_serial_settings __user *sync; te1_settings __user *te1; +#if __riscv_xlen == 64 + __u64 unused; +#endif } ifs_ifsu; }; @@ -288,6 +291,9 @@ struct ifconf { union { char __user *ifcu_buf; struct ifreq __user *ifcu_req; +#if __riscv_xlen == 64 + __u64 unused; +#endif } ifc_ifcu; }; #endif /* __UAPI_DEF_IF_IFCONF */ diff --git a/include/uapi/linux/netfilter/x_tables.h b/include/uapi/linux/netfilter/x_tables.h index 796af83a963a..7e02e34c6fad 100644 --- a/include/uapi/linux/netfilter/x_tables.h +++ b/include/uapi/linux/netfilter/x_tables.h @@ -18,7 +18,11 @@ struct xt_entry_match { __u8 revision; } user; struct { +#if __riscv_xlen == 64 + __u64 match_size; +#else __u16 match_size; +#endif /* Used inside the kernel */ struct xt_match *match; @@ -41,7 +45,11 @@ struct xt_entry_target { __u8 revision; } user; struct { +#if __riscv_xlen == 64 + __u64 target_size; +#else __u16 target_size; +#endif /* Used inside the kernel */ struct xt_target 
*target; diff --git a/include/uapi/linux/netfilter_ipv4/ip_tables.h b/include/uapi/linux/netfilter_ipv4/ip_tables.h index 1485df28b239..3a78f8f7bf5d 100644 --- a/include/uapi/linux/netfilter_ipv4/ip_tables.h +++ b/include/uapi/linux/netfilter_ipv4/ip_tables.h @@ -200,7 +200,14 @@ struct ipt_replace { /* Number of counters (must be equal to current number of entries). */ unsigned int num_counters; /* The old entries' counters. */ +#if __riscv_xlen == 64 + union { + struct xt_counters __user *counters; + __u64 __counters; + }; +#else struct xt_counters __user *counters; +#endif /* The entries (hang off end: not really an array). */ struct ipt_entry entries[]; diff --git a/include/uapi/linux/nfs4_mount.h b/include/uapi/linux/nfs4_mount.h index d20bb869bb99..6ec3cec66b6f 100644 --- a/include/uapi/linux/nfs4_mount.h +++ b/include/uapi/linux/nfs4_mount.h @@ -21,7 +21,14 @@ struct nfs_string { unsigned int len; +#if __riscv_xlen == 64 + union { + const char __user * data; + __u64 __data; + }; +#else const char __user * data; +#endif }; struct nfs4_mount_data { @@ -53,7 +60,14 @@ struct nfs4_mount_data { /* Pseudo-flavours to use for authentication. 
See RFC2623 */ int auth_flavourlen; /* 1 */ +#if __riscv_xlen == 64 + union { + int __user *auth_flavours; /* 1 */ + __u64 __auth_flavours; + }; +#else int __user *auth_flavours; /* 1 */ +#endif }; /* bits in the flags field */ diff --git a/include/uapi/linux/ppp-ioctl.h b/include/uapi/linux/ppp-ioctl.h index 1cc5ce0ae062..8d48eab430c1 100644 --- a/include/uapi/linux/ppp-ioctl.h +++ b/include/uapi/linux/ppp-ioctl.h @@ -59,7 +59,14 @@ struct npioctl { /* Structure describing a CCP configuration option, for PPPIOCSCOMPRESS */ struct ppp_option_data { +#if __riscv_xlen == 64 + union { + __u8 __user *ptr; + __u64 __ptr; + }; +#else __u8 __user *ptr; +#endif __u32 length; int transmit; }; diff --git a/include/uapi/linux/sctp.h b/include/uapi/linux/sctp.h index b7d91d4cf0db..46a06fddcd2f 100644 --- a/include/uapi/linux/sctp.h +++ b/include/uapi/linux/sctp.h @@ -1024,6 +1024,9 @@ struct sctp_getaddrs_old { #else struct sockaddr *addrs; #endif +#if (__riscv_xlen == 64) && (__SIZEOF_LONG__ == 4) + __u32 unused; +#endif }; struct sctp_getaddrs { diff --git a/include/uapi/linux/sem.h b/include/uapi/linux/sem.h index 75aa3b273cd9..de9f441913cd 100644 --- a/include/uapi/linux/sem.h +++ b/include/uapi/linux/sem.h @@ -26,10 +26,29 @@ struct semid_ds { struct ipc_perm sem_perm; /* permissions .. 
see ipc.h */ __kernel_old_time_t sem_otime; /* last semop time */ __kernel_old_time_t sem_ctime; /* create/last semctl() time */ +#if __riscv_xlen == 64 + union { + struct sem *sem_base; /* ptr to first semaphore in array */ + __u64 __sem_base; + }; + union { + struct sem_queue *sem_pending; /* pending operations to be processed */ + __u64 __sem_pending; + }; + union { + struct sem_queue **sem_pending_last; /* last pending operation */ + __u64 __sem_pending_last; + }; + union { + struct sem_undo *undo; /* undo requests on this array */ + __u64 __undo; + }; +#else struct sem *sem_base; /* ptr to first semaphore in array */ struct sem_queue *sem_pending; /* pending operations to be processed */ struct sem_queue **sem_pending_last; /* last pending operation */ struct sem_undo *undo; /* undo requests on this array */ +#endif unsigned short sem_nsems; /* no. of semaphores in array */ }; @@ -46,10 +65,29 @@ struct sembuf { /* arg for semctl system calls. */ union semun { int val; /* value for SETVAL */ +#if __riscv_xlen == 64 + union { + struct semid_ds __user *buf; /* buffer for IPC_STAT & IPC_SET */ + __u64 ___buf; + }; + union { + unsigned short __user *array; /* array for GETALL & SETALL */ + __u64 __array; + }; + union { + struct seminfo __user *__buf; /* buffer for IPC_INFO */ + __u64 ____buf; + }; + union { + void __user *__pad; + __u64 ____pad; + }; +#else struct semid_ds __user *buf; /* buffer for IPC_STAT & IPC_SET */ unsigned short __user *array; /* array for GETALL & SETALL */ struct seminfo __user *__buf; /* buffer for IPC_INFO */ void __user *__pad; +#endif }; struct seminfo { diff --git a/include/uapi/linux/socket.h b/include/uapi/linux/socket.h index d3fcd3b5ec53..5f7a83649395 100644 --- a/include/uapi/linux/socket.h +++ b/include/uapi/linux/socket.h @@ -22,7 +22,14 @@ struct __kernel_sockaddr_storage { /* space to achieve desired size, */ /* _SS_MAXSIZE value minus size of ss_family */ }; +#if __riscv_xlen == 64 + union { + void *__align; /* 
implementation specific desired alignment */ + u64 ___align; + }; +#else void *__align; /* implementation specific desired alignment */ +#endif }; }; diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h index 8981f00204db..8ed7b29897f9 100644 --- a/include/uapi/linux/sysctl.h +++ b/include/uapi/linux/sysctl.h @@ -33,13 +33,45 @@ member of a struct __sysctl_args to have? */ struct __sysctl_args { +#if __riscv_xlen == 64 + union { + int __user *name; + __u64 __name; + }; +#else int __user *name; +#endif int nlen; +#if __riscv_xlen == 64 + union { + void __user *oldval; + __u64 __oldval; + }; +#else void __user *oldval; +#endif +#if __riscv_xlen == 64 + union { + size_t __user *oldlenp; + __u64 __oldlenp; + }; +#else size_t __user *oldlenp; +#endif +#if __riscv_xlen == 64 + union { + void __user *newval; + __u64 __newval; + }; +#else void __user *newval; +#endif size_t newlen; +#if __riscv_xlen == 64 + unsigned long long __unused[4]; +#else unsigned long __unused[4]; +#endif }; /* Define sysctl names first */ diff --git a/include/uapi/linux/uhid.h b/include/uapi/linux/uhid.h index cef7534d2d19..4a774dbd3de8 100644 --- a/include/uapi/linux/uhid.h +++ b/include/uapi/linux/uhid.h @@ -130,7 +130,14 @@ struct uhid_create_req { __u8 name[128]; __u8 phys[64]; __u8 uniq[64]; +#if __riscv_xlen == 64 + union { + __u8 __user *rd_data; + __u64 __rd_data; + }; +#else __u8 __user *rd_data; +#endif __u16 rd_size; __u16 bus; diff --git a/include/uapi/linux/uio.h b/include/uapi/linux/uio.h index 649739e0c404..27dfd6032dc6 100644 --- a/include/uapi/linux/uio.h +++ b/include/uapi/linux/uio.h @@ -16,8 +16,19 @@ struct iovec { +#if __riscv_xlen == 64 + union { + void __user *iov_base; /* BSD uses caddr_t (1003.1g requires void *) */ + __u64 __iov_base; + }; + union { + __kernel_size_t iov_len; /* Must be size_t (1003.1g) */ + __u64 __iov_len; + }; +#else void __user *iov_base; /* BSD uses caddr_t (1003.1g requires void *) */ __kernel_size_t iov_len; /* Must be size_t 
(1003.1g) */ +#endif }; struct dmabuf_cmsg { diff --git a/include/uapi/linux/usb/tmc.h b/include/uapi/linux/usb/tmc.h index d791cc58a7f0..443ec5356caf 100644 --- a/include/uapi/linux/usb/tmc.h +++ b/include/uapi/linux/usb/tmc.h @@ -51,7 +51,14 @@ struct usbtmc_request { struct usbtmc_ctrlrequest { struct usbtmc_request req; +#if __riscv_xlen == 64 + union { + void __user *data; /* pointer to user space */ + __u64 __data; /* pointer to user space */ + }; +#else void __user *data; /* pointer to user space */ +#endif } __attribute__ ((packed)); struct usbtmc_termchar { @@ -70,7 +77,14 @@ struct usbtmc_message { __u32 transfer_size; /* size of bytes to transfer */ __u32 transferred; /* size of received/written bytes */ __u32 flags; /* bit 0: 0 = synchronous; 1 = asynchronous */ +#if __riscv_xlen == 64 + union { + void __user *message; /* pointer to header and data in user space */ + __u64 __message; + }; +#else void __user *message; /* pointer to header and data in user space */ +#endif } __attribute__ ((packed)); /* Request values for USBTMC driver's ioctl entry point */ diff --git a/include/uapi/linux/usbdevice_fs.h b/include/uapi/linux/usbdevice_fs.h index 74a84e02422a..8c8efef74c3c 100644 --- a/include/uapi/linux/usbdevice_fs.h +++ b/include/uapi/linux/usbdevice_fs.h @@ -44,14 +44,28 @@ struct usbdevfs_ctrltransfer { __u16 wIndex; __u16 wLength; __u32 timeout; /* in milliseconds */ +#if __riscv_xlen == 64 + union { + void __user *data; + __u64 __data; + }; +#else void __user *data; +#endif }; struct usbdevfs_bulktransfer { unsigned int ep; unsigned int len; unsigned int timeout; /* in milliseconds */ +#if __riscv_xlen == 64 + union { + void __user *data; + __u64 __data; + }; +#else void __user *data; +#endif }; struct usbdevfs_setinterface { @@ -61,7 +75,14 @@ struct usbdevfs_setinterface { struct usbdevfs_disconnectsignal { unsigned int signr; +#if __riscv_xlen == 64 + union { + void __user *context; + __u64 __context; + }; +#else void __user *context; +#endif }; 
#define USBDEVFS_MAXDRIVERNAME 255 @@ -119,7 +140,14 @@ struct usbdevfs_urb { unsigned char endpoint; int status; unsigned int flags; +#if __riscv_xlen == 64 + union { + void __user *buffer; + __u64 __buffer; + }; +#else void __user *buffer; +#endif int buffer_length; int actual_length; int start_frame; @@ -130,7 +158,14 @@ struct usbdevfs_urb { int error_count; unsigned int signr; /* signal to be sent on completion, or 0 if none should be sent. */ +#if __riscv_xlen == 64 + union { + void __user *usercontext; + __u64 __usercontext; + }; +#else void __user *usercontext; +#endif struct usbdevfs_iso_packet_desc iso_frame_desc[]; }; @@ -139,7 +174,14 @@ struct usbdevfs_ioctl { int ifno; /* interface 0..N ; negative numbers reserved */ int ioctl_code; /* MUST encode size + direction of data so the * macros in give correct values */ +#if __riscv_xlen == 64 + union { + void __user *data; /* param buffer (in, or out) */ + __u64 __pad; + }; +#else void __user *data; /* param buffer (in, or out) */ +#endif }; /* You can do most things with hubs just through control messages, @@ -195,9 +237,17 @@ struct usbdevfs_streams { #define USBDEVFS_SUBMITURB _IOR('U', 10, struct usbdevfs_urb) #define USBDEVFS_SUBMITURB32 _IOR('U', 10, struct usbdevfs_urb32) #define USBDEVFS_DISCARDURB _IO('U', 11) +#if __riscv_xlen == 64 +#define USBDEVFS_REAPURB _IOW('U', 12, __u64) +#else #define USBDEVFS_REAPURB _IOW('U', 12, void *) +#endif #define USBDEVFS_REAPURB32 _IOW('U', 12, __u32) +#if __riscv_xlen == 64 +#define USBDEVFS_REAPURBNDELAY _IOW('U', 13, __u64) +#else #define USBDEVFS_REAPURBNDELAY _IOW('U', 13, void *) +#endif #define USBDEVFS_REAPURBNDELAY32 _IOW('U', 13, __u32) #define USBDEVFS_DISCSIGNAL _IOR('U', 14, struct usbdevfs_disconnectsignal) #define USBDEVFS_DISCSIGNAL32 _IOR('U', 14, struct usbdevfs_disconnectsignal32) diff --git a/include/uapi/linux/uvcvideo.h b/include/uapi/linux/uvcvideo.h index f86185456dc5..3ccb99039a43 100644 --- a/include/uapi/linux/uvcvideo.h +++ 
b/include/uapi/linux/uvcvideo.h @@ -54,7 +54,14 @@ struct uvc_xu_control_mapping { __u32 v4l2_type; __u32 data_type; +#if __riscv_xlen == 64 + union { + struct uvc_menu_info __user *menu_info; + __u64 __menu_info; + }; +#else struct uvc_menu_info __user *menu_info; +#endif __u32 menu_count; __u32 reserved[4]; @@ -66,7 +73,14 @@ struct uvc_xu_control_query { __u8 query; /* Video Class-Specific Request Code, */ /* defined in linux/usb/video.h A.8. */ __u16 size; +#if __riscv_xlen == 64 + union { + __u8 __user *data; + __u64 __data; + }; +#else __u8 __user *data; +#endif }; #define UVCIOC_CTRL_MAP _IOWR('u', 0x20, struct uvc_xu_control_mapping) diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index c8dbf8219c4f..0a1dc2a780fb 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -1570,7 +1570,14 @@ struct vfio_iommu_type1_dma_map { struct vfio_bitmap { __u64 pgsize; /* page size for bitmap in bytes */ __u64 size; /* in bytes */ + #if __riscv_xlen == 64 + union { + __u64 __user *data; /* one bit per page */ + __u64 __data; + }; + #else __u64 __user *data; /* one bit per page */ + #endif }; /** diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h index e7c4dce39007..8e5391f07626 100644 --- a/include/uapi/linux/videodev2.h +++ b/include/uapi/linux/videodev2.h @@ -1898,7 +1898,14 @@ struct v4l2_ext_controls { __u32 error_idx; __s32 request_fd; __u32 reserved[1]; +#if __riscv_xlen == 64 + union { + struct v4l2_ext_control *controls; + __u64 __controls; + }; +#else struct v4l2_ext_control *controls; +#endif }; #define V4L2_CTRL_ID_MASK (0x0fffffff) -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:15:43 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:15:43 -0400 Subject: [RFC PATCH V3 02/43] rv64ilp32_abi: riscv: Adapt Makefile and Kconfig In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: 
<20250325121624.523258-3-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" Extend the ARCH_RV64I base with ABI_RV64ILP32 to compile the Linux kernel itself as ILP32 on CONFIG_64BIT=y, minimizing the kernel's memory footprint and cache occupation. The 'cmd_cpp_lds_S' in scripts/Makefile.build uses cpp_flags for .lds generation, so add "-mabi=xxx" to KBUILD_CPPFLAGS, as is already done for KBUILD_CFLAGS and KBUILD_AFLAGS. cmd_cpp_lds_S = $(CPP) $(cpp_flags) -P -U$(ARCH) The rv64ilp32 ABI reuses an rv64 toolchain whose default "-mabi=" is lp64, so add "-mabi=ilp32" to correct it. Add config entry with rv64ilp32.config fragment in Makefile: - rv64ilp32_defconfig Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- arch/riscv/Kconfig | 12 ++++++++++-- arch/riscv/Makefile | 17 +++++++++++++++++ arch/riscv/configs/rv64ilp32.config | 1 + 3 files changed, 28 insertions(+), 2 deletions(-) create mode 100644 arch/riscv/configs/rv64ilp32.config diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 7612c52e9b1e..da2111b0111c 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -213,7 +213,7 @@ config RISCV select TRACE_IRQFLAGS_SUPPORT select UACCESS_MEMCPY if !MMU select USER_STACKTRACE_SUPPORT - select ZONE_DMA32 if 64BIT + select ZONE_DMA32 if 64BIT && !ABI_RV64ILP32 config RUSTC_SUPPORTS_RISCV def_bool y @@ -298,6 +298,7 @@ config PAGE_OFFSET config KASAN_SHADOW_OFFSET hex depends on KASAN_GENERIC + default 0x70000000 if ABI_RV64ILP32 default 0xdfffffff00000000 if 64BIT default 0xffffffff if 32BIT @@ -341,7 +342,7 @@ config FIX_EARLYCON_MEM config ILLEGAL_POINTER_VALUE hex - default 0 if 32BIT + default 0 if 32BIT || ABI_RV64ILP32 default 0xdead000000000000 if 64BIT config PGTABLE_LEVELS @@ -418,6 +419,13 @@ config ARCH_RV64I endchoice +config ABI_RV64ILP32 + bool "ABI RV64ILP32" + depends on 64BIT + help + Compile linux kernel self into RV64ILP32 ABI of RISC-V psabi + specification.
+ # We must be able to map all physical memory into the kernel, but the compiler # is still a bit more efficient when generating code if it's setup in a manner # such that it can only map 2GiB of memory. diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile index 13fbc0f94238..76db01020a22 100644 --- a/arch/riscv/Makefile +++ b/arch/riscv/Makefile @@ -30,10 +30,21 @@ ifeq ($(CONFIG_ARCH_RV64I),y) BITS := 64 UTS_MACHINE := riscv64 +ifeq ($(CONFIG_ABI_RV64ILP32),y) + KBUILD_CPPFLAGS += -mabi=ilp32 + + KBUILD_CFLAGS += -mabi=ilp32 + KBUILD_AFLAGS += -mabi=ilp32 + + KBUILD_LDFLAGS += -melf32lriscv +else + KBUILD_CPPFLAGS += -mabi=lp64 + KBUILD_CFLAGS += -mabi=lp64 KBUILD_AFLAGS += -mabi=lp64 KBUILD_LDFLAGS += -melf64lriscv +endif KBUILD_RUSTFLAGS += -Ctarget-cpu=generic-rv64 --target=riscv64imac-unknown-none-elf \ -Cno-redzone @@ -41,6 +52,8 @@ else BITS := 32 UTS_MACHINE := riscv32 + KBUILD_CPPFLAGS += -mabi=ilp32 + KBUILD_CFLAGS += -mabi=ilp32 KBUILD_AFLAGS += -mabi=ilp32 KBUILD_LDFLAGS += -melf32lriscv @@ -224,6 +237,10 @@ PHONY += rv32_nommu_virt_defconfig rv32_nommu_virt_defconfig: $(Q)$(MAKE) -f $(srctree)/Makefile nommu_virt_defconfig 32-bit.config +PHONY += rv64ilp32_defconfig +rv64ilp32_defconfig: + $(Q)$(MAKE) -f $(srctree)/Makefile defconfig rv64ilp32.config + define archhelp echo ' Image - Uncompressed kernel image (arch/riscv/boot/Image)' echo ' Image.gz - Compressed kernel image (arch/riscv/boot/Image.gz)' diff --git a/arch/riscv/configs/rv64ilp32.config b/arch/riscv/configs/rv64ilp32.config new file mode 100644 index 000000000000..07536586e169 --- /dev/null +++ b/arch/riscv/configs/rv64ilp32.config @@ -0,0 +1 @@ +CONFIG_ABI_RV64ILP32=y -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:15:44 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:15:44 -0400 Subject: [RFC PATCH V3 03/43] rv64ilp32_abi: riscv: Adapt ULL & UL definition In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: 
<20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-4-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" On 64-bit systems with ILP32 ABI, BITS_PER_LONG is 32, making the register width different from UL's. Thus, correct them into ULL. Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- arch/riscv/include/asm/cmpxchg.h | 4 +- arch/riscv/include/asm/csr.h | 212 ++++++++++++++++--------------- arch/riscv/net/bpf_jit_comp64.c | 6 +- 3 files changed, 115 insertions(+), 107 deletions(-) diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h index 4cadc56220fe..938d50194dba 100644 --- a/arch/riscv/include/asm/cmpxchg.h +++ b/arch/riscv/include/asm/cmpxchg.h @@ -29,7 +29,7 @@ } else { \ u32 *__ptr32b = (u32 *)((ulong)(p) & ~0x3); \ ulong __s = ((ulong)(p) & (0x4 - sizeof(*p))) * BITS_PER_BYTE; \ - ulong __mask = GENMASK(((sizeof(*p)) * BITS_PER_BYTE) - 1, 0) \ + ulong __mask = GENMASK_ULL(((sizeof(*p)) * BITS_PER_BYTE) - 1, 0) \ << __s; \ ulong __newx = (ulong)(n) << __s; \ ulong __retx; \ @@ -145,7 +145,7 @@ } else { \ u32 *__ptr32b = (u32 *)((ulong)(p) & ~0x3); \ ulong __s = ((ulong)(p) & (0x4 - sizeof(*p))) * BITS_PER_BYTE; \ - ulong __mask = GENMASK(((sizeof(*p)) * BITS_PER_BYTE) - 1, 0) \ + ulong __mask = GENMASK_ULL(((sizeof(*p)) * BITS_PER_BYTE) - 1, 0) \ << __s; \ ulong __newx = (ulong)(n) << __s; \ ulong __oldx = (ulong)(o) << __s; \ diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h index 6fed42e37705..25f7c5afea3a 100644 --- a/arch/riscv/include/asm/csr.h +++ b/arch/riscv/include/asm/csr.h @@ -9,74 +9,82 @@ #include #include +#if __riscv_xlen == 64 +#define UXL ULL +#define GENMASK_UXL GENMASK_ULL +#else +#define UXL UL +#define GENMASK_UXL GENMASK +#endif + /* Status register flags */ -#define SR_SIE _AC(0x00000002, UL) /* Supervisor Interrupt Enable */ -#define SR_MIE _AC(0x00000008, UL) /* Machine Interrupt Enable */ -#define SR_SPIE _AC(0x00000020, UL) /* Previous Supervisor IE 
*/ -#define SR_MPIE _AC(0x00000080, UL) /* Previous Machine IE */ -#define SR_SPP _AC(0x00000100, UL) /* Previously Supervisor */ -#define SR_MPP _AC(0x00001800, UL) /* Previously Machine */ -#define SR_SUM _AC(0x00040000, UL) /* Supervisor User Memory Access */ - -#define SR_FS _AC(0x00006000, UL) /* Floating-point Status */ -#define SR_FS_OFF _AC(0x00000000, UL) -#define SR_FS_INITIAL _AC(0x00002000, UL) -#define SR_FS_CLEAN _AC(0x00004000, UL) -#define SR_FS_DIRTY _AC(0x00006000, UL) - -#define SR_VS _AC(0x00000600, UL) /* Vector Status */ -#define SR_VS_OFF _AC(0x00000000, UL) -#define SR_VS_INITIAL _AC(0x00000200, UL) -#define SR_VS_CLEAN _AC(0x00000400, UL) -#define SR_VS_DIRTY _AC(0x00000600, UL) - -#define SR_VS_THEAD _AC(0x01800000, UL) /* xtheadvector Status */ -#define SR_VS_OFF_THEAD _AC(0x00000000, UL) -#define SR_VS_INITIAL_THEAD _AC(0x00800000, UL) -#define SR_VS_CLEAN_THEAD _AC(0x01000000, UL) -#define SR_VS_DIRTY_THEAD _AC(0x01800000, UL) - -#define SR_XS _AC(0x00018000, UL) /* Extension Status */ -#define SR_XS_OFF _AC(0x00000000, UL) -#define SR_XS_INITIAL _AC(0x00008000, UL) -#define SR_XS_CLEAN _AC(0x00010000, UL) -#define SR_XS_DIRTY _AC(0x00018000, UL) +#define SR_SIE _AC(0x00000002, UXL) /* Supervisor Interrupt Enable */ +#define SR_MIE _AC(0x00000008, UXL) /* Machine Interrupt Enable */ +#define SR_SPIE _AC(0x00000020, UXL) /* Previous Supervisor IE */ +#define SR_MPIE _AC(0x00000080, UXL) /* Previous Machine IE */ +#define SR_SPP _AC(0x00000100, UXL) /* Previously Supervisor */ +#define SR_MPP _AC(0x00001800, UXL) /* Previously Machine */ +#define SR_SUM _AC(0x00040000, UXL) /* Supervisor User Memory Access */ + +#define SR_FS _AC(0x00006000, UXL) /* Floating-point Status */ +#define SR_FS_OFF _AC(0x00000000, UXL) +#define SR_FS_INITIAL _AC(0x00002000, UXL) +#define SR_FS_CLEAN _AC(0x00004000, UXL) +#define SR_FS_DIRTY _AC(0x00006000, UXL) + +#define SR_VS _AC(0x00000600, UXL) /* Vector Status */ +#define SR_VS_OFF _AC(0x00000000, UXL) 
+#define SR_VS_INITIAL _AC(0x00000200, UXL) +#define SR_VS_CLEAN _AC(0x00000400, UXL) +#define SR_VS_DIRTY _AC(0x00000600, UXL) + +#define SR_VS_THEAD _AC(0x01800000, UXL) /* xtheadvector Status */ +#define SR_VS_OFF_THEAD _AC(0x00000000, UXL) +#define SR_VS_INITIAL_THEAD _AC(0x00800000, UXL) +#define SR_VS_CLEAN_THEAD _AC(0x01000000, UXL) +#define SR_VS_DIRTY_THEAD _AC(0x01800000, UXL) + +#define SR_XS _AC(0x00018000, UXL) /* Extension Status */ +#define SR_XS_OFF _AC(0x00000000, UXL) +#define SR_XS_INITIAL _AC(0x00008000, UXL) +#define SR_XS_CLEAN _AC(0x00010000, UXL) +#define SR_XS_DIRTY _AC(0x00018000, UXL) #define SR_FS_VS (SR_FS | SR_VS) /* Vector and Floating-Point Unit */ -#ifndef CONFIG_64BIT -#define SR_SD _AC(0x80000000, UL) /* FS/VS/XS dirty */ +#if __riscv_xlen == 32 +#define SR_SD _AC(0x80000000, UXL) /* FS/VS/XS dirty */ #else -#define SR_SD _AC(0x8000000000000000, UL) /* FS/VS/XS dirty */ +#define SR_SD _AC(0x8000000000000000, UXL) /* FS/VS/XS dirty */ #endif -#ifdef CONFIG_64BIT -#define SR_UXL _AC(0x300000000, UL) /* XLEN mask for U-mode */ -#define SR_UXL_32 _AC(0x100000000, UL) /* XLEN = 32 for U-mode */ -#define SR_UXL_64 _AC(0x200000000, UL) /* XLEN = 64 for U-mode */ +#if __riscv_xlen == 64 +#define SR_UXL _AC(0x300000000, UXL) /* XLEN mask for U-mode */ +#define SR_UXL_32 _AC(0x100000000, UXL) /* XLEN = 32 for U-mode */ +#define SR_UXL_64 _AC(0x200000000, UXL) /* XLEN = 64 for U-mode */ #endif /* SATP flags */ -#ifndef CONFIG_64BIT -#define SATP_PPN _AC(0x003FFFFF, UL) -#define SATP_MODE_32 _AC(0x80000000, UL) +#if __riscv_xlen == 32 +#define SATP_PPN _AC(0x003FFFFF, UXL) +#define SATP_MODE_32 _AC(0x80000000, UXL) #define SATP_MODE_SHIFT 31 #define SATP_ASID_BITS 9 #define SATP_ASID_SHIFT 22 -#define SATP_ASID_MASK _AC(0x1FF, UL) +#define SATP_ASID_MASK _AC(0x1FF, UXL) #else -#define SATP_PPN _AC(0x00000FFFFFFFFFFF, UL) -#define SATP_MODE_39 _AC(0x8000000000000000, UL) -#define SATP_MODE_48 _AC(0x9000000000000000, UL) -#define SATP_MODE_57 
_AC(0xa000000000000000, UL) +#define SATP_PPN _AC(0x00000FFFFFFFFFFF, UXL) +#define SATP_MODE_39 _AC(0x8000000000000000, UXL) +#define SATP_MODE_48 _AC(0x9000000000000000, UXL) +#define SATP_MODE_57 _AC(0xa000000000000000, UXL) #define SATP_MODE_SHIFT 60 #define SATP_ASID_BITS 16 #define SATP_ASID_SHIFT 44 -#define SATP_ASID_MASK _AC(0xFFFF, UL) +#define SATP_ASID_MASK _AC(0xFFFF, UXL) #endif /* Exception cause high bit - is an interrupt if set */ -#define CAUSE_IRQ_FLAG (_AC(1, UL) << (__riscv_xlen - 1)) +#define CAUSE_IRQ_FLAG (_AC(1, UXL) << (__riscv_xlen - 1)) /* Interrupt causes (minus the high bit) */ #define IRQ_S_SOFT 1 @@ -91,7 +99,7 @@ #define IRQ_S_GEXT 12 #define IRQ_PMU_OVF 13 #define IRQ_LOCAL_MAX (IRQ_PMU_OVF + 1) -#define IRQ_LOCAL_MASK GENMASK((IRQ_LOCAL_MAX - 1), 0) +#define IRQ_LOCAL_MASK GENMASK_UXL((IRQ_LOCAL_MAX - 1), 0) /* Exception causes */ #define EXC_INST_MISALIGNED 0 @@ -124,45 +132,45 @@ #define PMP_L 0x80 /* HSTATUS flags */ -#ifdef CONFIG_64BIT -#define HSTATUS_HUPMM _AC(0x3000000000000, UL) -#define HSTATUS_HUPMM_PMLEN_0 _AC(0x0000000000000, UL) -#define HSTATUS_HUPMM_PMLEN_7 _AC(0x2000000000000, UL) -#define HSTATUS_HUPMM_PMLEN_16 _AC(0x3000000000000, UL) -#define HSTATUS_VSXL _AC(0x300000000, UL) +#if __riscv_xlen == 64 +#define HSTATUS_HUPMM _AC(0x3000000000000, UXL) +#define HSTATUS_HUPMM_PMLEN_0 _AC(0x0000000000000, UXL) +#define HSTATUS_HUPMM_PMLEN_7 _AC(0x2000000000000, UXL) +#define HSTATUS_HUPMM_PMLEN_16 _AC(0x3000000000000, UXL) +#define HSTATUS_VSXL _AC(0x300000000, UXL) #define HSTATUS_VSXL_SHIFT 32 #endif -#define HSTATUS_VTSR _AC(0x00400000, UL) -#define HSTATUS_VTW _AC(0x00200000, UL) -#define HSTATUS_VTVM _AC(0x00100000, UL) -#define HSTATUS_VGEIN _AC(0x0003f000, UL) +#define HSTATUS_VTSR _AC(0x00400000, UXL) +#define HSTATUS_VTW _AC(0x00200000, UXL) +#define HSTATUS_VTVM _AC(0x00100000, UXL) +#define HSTATUS_VGEIN _AC(0x0003f000, UXL) #define HSTATUS_VGEIN_SHIFT 12 -#define HSTATUS_HU _AC(0x00000200, UL) -#define 
HSTATUS_SPVP _AC(0x00000100, UL) -#define HSTATUS_SPV _AC(0x00000080, UL) -#define HSTATUS_GVA _AC(0x00000040, UL) -#define HSTATUS_VSBE _AC(0x00000020, UL) +#define HSTATUS_HU _AC(0x00000200, UXL) +#define HSTATUS_SPVP _AC(0x00000100, UXL) +#define HSTATUS_SPV _AC(0x00000080, UXL) +#define HSTATUS_GVA _AC(0x00000040, UXL) +#define HSTATUS_VSBE _AC(0x00000020, UXL) /* HGATP flags */ -#define HGATP_MODE_OFF _AC(0, UL) -#define HGATP_MODE_SV32X4 _AC(1, UL) -#define HGATP_MODE_SV39X4 _AC(8, UL) -#define HGATP_MODE_SV48X4 _AC(9, UL) -#define HGATP_MODE_SV57X4 _AC(10, UL) +#define HGATP_MODE_OFF _AC(0, UXL) +#define HGATP_MODE_SV32X4 _AC(1, UXL) +#define HGATP_MODE_SV39X4 _AC(8, UXL) +#define HGATP_MODE_SV48X4 _AC(9, UXL) +#define HGATP_MODE_SV57X4 _AC(10, UXL) #define HGATP32_MODE_SHIFT 31 #define HGATP32_VMID_SHIFT 22 -#define HGATP32_VMID GENMASK(28, 22) -#define HGATP32_PPN GENMASK(21, 0) +#define HGATP32_VMID GENMASK_UXL(28, 22) +#define HGATP32_PPN GENMASK_UXL(21, 0) #define HGATP64_MODE_SHIFT 60 #define HGATP64_VMID_SHIFT 44 -#define HGATP64_VMID GENMASK(57, 44) -#define HGATP64_PPN GENMASK(43, 0) +#define HGATP64_VMID GENMASK_UXL(57, 44) +#define HGATP64_PPN GENMASK_UXL(43, 0) #define HGATP_PAGE_SHIFT 12 -#ifdef CONFIG_64BIT +#if __riscv_xlen == 64 #define HGATP_PPN HGATP64_PPN #define HGATP_VMID_SHIFT HGATP64_VMID_SHIFT #define HGATP_VMID HGATP64_VMID @@ -176,31 +184,31 @@ /* VSIP & HVIP relation */ #define VSIP_TO_HVIP_SHIFT (IRQ_VS_SOFT - IRQ_S_SOFT) -#define VSIP_VALID_MASK ((_AC(1, UL) << IRQ_S_SOFT) | \ - (_AC(1, UL) << IRQ_S_TIMER) | \ - (_AC(1, UL) << IRQ_S_EXT) | \ - (_AC(1, UL) << IRQ_PMU_OVF)) +#define VSIP_VALID_MASK ((_AC(1, UXL) << IRQ_S_SOFT) | \ + (_AC(1, UXL) << IRQ_S_TIMER) | \ + (_AC(1, UXL) << IRQ_S_EXT) | \ + (_AC(1, UXL) << IRQ_PMU_OVF)) /* AIA CSR bits */ #define TOPI_IID_SHIFT 16 -#define TOPI_IID_MASK GENMASK(11, 0) -#define TOPI_IPRIO_MASK GENMASK(7, 0) +#define TOPI_IID_MASK GENMASK_UXL(11, 0) +#define TOPI_IPRIO_MASK GENMASK_UXL(7, 0) 
#define TOPI_IPRIO_BITS 8 #define TOPEI_ID_SHIFT 16 -#define TOPEI_ID_MASK GENMASK(10, 0) -#define TOPEI_PRIO_MASK GENMASK(10, 0) +#define TOPEI_ID_MASK GENMASK_UXL(10, 0) +#define TOPEI_PRIO_MASK GENMASK_UXL(10, 0) #define ISELECT_IPRIO0 0x30 #define ISELECT_IPRIO15 0x3f -#define ISELECT_MASK GENMASK(8, 0) +#define ISELECT_MASK GENMASK_UXL(8, 0) #define HVICTL_VTI BIT(30) -#define HVICTL_IID GENMASK(27, 16) +#define HVICTL_IID GENMASK_UXL(27, 16) #define HVICTL_IID_SHIFT 16 #define HVICTL_DPR BIT(9) #define HVICTL_IPRIOM BIT(8) -#define HVICTL_IPRIO GENMASK(7, 0) +#define HVICTL_IPRIO GENMASK_UXL(7, 0) /* xENVCFG flags */ #define ENVCFG_STCE (_AC(1, ULL) << 63) @@ -210,14 +218,14 @@ #define ENVCFG_PMM_PMLEN_0 (_AC(0x0, ULL) << 32) #define ENVCFG_PMM_PMLEN_7 (_AC(0x2, ULL) << 32) #define ENVCFG_PMM_PMLEN_16 (_AC(0x3, ULL) << 32) -#define ENVCFG_CBZE (_AC(1, UL) << 7) -#define ENVCFG_CBCFE (_AC(1, UL) << 6) +#define ENVCFG_CBZE (_AC(1, UXL) << 7) +#define ENVCFG_CBCFE (_AC(1, UXL) << 6) #define ENVCFG_CBIE_SHIFT 4 -#define ENVCFG_CBIE (_AC(0x3, UL) << ENVCFG_CBIE_SHIFT) -#define ENVCFG_CBIE_ILL _AC(0x0, UL) -#define ENVCFG_CBIE_FLUSH _AC(0x1, UL) -#define ENVCFG_CBIE_INV _AC(0x3, UL) -#define ENVCFG_FIOM _AC(0x1, UL) +#define ENVCFG_CBIE (_AC(0x3, UXL) << ENVCFG_CBIE_SHIFT) +#define ENVCFG_CBIE_ILL _AC(0x0, UXL) +#define ENVCFG_CBIE_FLUSH _AC(0x1, UXL) +#define ENVCFG_CBIE_INV _AC(0x3, UXL) +#define ENVCFG_FIOM _AC(0x1, UXL) /* Smstateen bits */ #define SMSTATEEN0_AIA_IMSIC_SHIFT 58 @@ -446,12 +454,12 @@ /* Scalar Crypto Extension - Entropy */ #define CSR_SEED 0x015 -#define SEED_OPST_MASK _AC(0xC0000000, UL) -#define SEED_OPST_BIST _AC(0x00000000, UL) -#define SEED_OPST_WAIT _AC(0x40000000, UL) -#define SEED_OPST_ES16 _AC(0x80000000, UL) -#define SEED_OPST_DEAD _AC(0xC0000000, UL) -#define SEED_ENTROPY_MASK _AC(0xFFFF, UL) +#define SEED_OPST_MASK _AC(0xC0000000, UXL) +#define SEED_OPST_BIST _AC(0x00000000, UXL) +#define SEED_OPST_WAIT _AC(0x40000000, UXL) +#define 
SEED_OPST_ES16 _AC(0x80000000, UXL) +#define SEED_OPST_DEAD _AC(0xC0000000, UXL) +#define SEED_ENTROPY_MASK _AC(0xFFFF, UXL) #ifdef CONFIG_RISCV_M_MODE # define CSR_STATUS CSR_MSTATUS @@ -504,14 +512,14 @@ # define RV_IRQ_TIMER IRQ_S_TIMER # define RV_IRQ_EXT IRQ_S_EXT # define RV_IRQ_PMU IRQ_PMU_OVF -# define SIP_LCOFIP (_AC(0x1, UL) << IRQ_PMU_OVF) +# define SIP_LCOFIP (_AC(0x1, UXL) << IRQ_PMU_OVF) #endif /* !CONFIG_RISCV_M_MODE */ /* IE/IP (Supervisor/Machine Interrupt Enable/Pending) flags */ -#define IE_SIE (_AC(0x1, UL) << RV_IRQ_SOFT) -#define IE_TIE (_AC(0x1, UL) << RV_IRQ_TIMER) -#define IE_EIE (_AC(0x1, UL) << RV_IRQ_EXT) +#define IE_SIE (_AC(0x1, UXL) << RV_IRQ_SOFT) +#define IE_TIE (_AC(0x1, UXL) << RV_IRQ_TIMER) +#define IE_EIE (_AC(0x1, UXL) << RV_IRQ_EXT) #ifndef __ASSEMBLY__ diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c index ca60db75199d..4f958722ca41 100644 --- a/arch/riscv/net/bpf_jit_comp64.c +++ b/arch/riscv/net/bpf_jit_comp64.c @@ -136,7 +136,7 @@ static u8 rv_tail_call_reg(struct rv_jit_context *ctx) static bool is_32b_int(s64 val) { - return -(1L << 31) <= val && val < (1L << 31); + return -(1LL << 31) <= val && val < (1LL << 31); } static bool in_auipc_jalr_range(s64 val) @@ -145,8 +145,8 @@ static bool in_auipc_jalr_range(s64 val) * auipc+jalr can reach any signed PC-relative offset in the range * [-2^31 - 2^11, 2^31 - 2^11). 
*/ - return (-(1L << 31) - (1L << 11)) <= val && - val < ((1L << 31) - (1L << 11)); + return (-(1LL << 31) - (1LL << 11)) <= val && + val < ((1LL << 31) - (1LL << 11)); } /* Modify rd pointer to alternate reg to avoid corrupting original reg */ -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:15:45 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:15:45 -0400 Subject: [RFC PATCH V3 04/43] rv64ilp32_abi: riscv: Introduce xlen_t to adapt __riscv_xlen != BITS_PER_LONG In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-5-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" Under the RV64ILP32 ABI, BITS_PER_LONG cannot be used to determine XLEN, because it is 32 even when CONFIG_64BIT=y. Hence, introduce xlen_t and use CONFIG_64BIT or __riscv_xlen == 64 to determine the register width. Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- arch/riscv/include/asm/checksum.h | 4 ++ arch/riscv/include/asm/csr.h | 15 ++-- arch/riscv/include/asm/processor.h | 10 +-- arch/riscv/include/asm/ptrace.h | 92 ++++++++++++------------ arch/riscv/include/asm/sparsemem.h | 2 +- arch/riscv/include/asm/switch_to.h | 4 +- arch/riscv/include/asm/thread_info.h | 2 +- arch/riscv/include/asm/timex.h | 4 +- arch/riscv/include/uapi/asm/elf.h | 4 +- arch/riscv/include/uapi/asm/ptrace.h | 97 ++++++++++++++------------ arch/riscv/include/uapi/asm/ucontext.h | 7 +- arch/riscv/include/uapi/asm/unistd.h | 2 +- arch/riscv/kernel/compat_signal.c | 4 +- arch/riscv/kernel/process.c | 8 +-- arch/riscv/kernel/signal.c | 4 +- arch/riscv/kernel/traps.c | 4 +- arch/riscv/kernel/vector.c | 2 +- arch/riscv/mm/fault.c | 2 +- 18 files changed, 143 insertions(+), 124 deletions(-) diff --git a/arch/riscv/include/asm/checksum.h b/arch/riscv/include/asm/checksum.h index 88e6f1499e88..e887f0983b69 100644 --- a/arch/riscv/include/asm/checksum.h +++ b/arch/riscv/include/asm/checksum.h
@@ -36,7 +36,11 @@ __sum16 csum_ipv6_magic(const struct in6_addr *saddr, */ static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl) { +#if __riscv_xlen == 64 + unsigned long long csum = 0; +#else unsigned long csum = 0; +#endif int pos = 0; do { diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h index 25f7c5afea3a..4339600e3c56 100644 --- a/arch/riscv/include/asm/csr.h +++ b/arch/riscv/include/asm/csr.h @@ -522,10 +522,11 @@ #define IE_EIE (_AC(0x1, UXL) << RV_IRQ_EXT) #ifndef __ASSEMBLY__ +#include #define csr_swap(csr, val) \ ({ \ - unsigned long __v = (unsigned long)(val); \ + xlen_t __v = (xlen_t)(val); \ __asm__ __volatile__ ("csrrw %0, " __ASM_STR(csr) ", %1"\ : "=r" (__v) : "rK" (__v) \ : "memory"); \ @@ -534,7 +535,7 @@ #define csr_read(csr) \ ({ \ - register unsigned long __v; \ + register xlen_t __v; \ __asm__ __volatile__ ("csrr %0, " __ASM_STR(csr) \ : "=r" (__v) : \ : "memory"); \ @@ -543,7 +544,7 @@ #define csr_write(csr, val) \ ({ \ - unsigned long __v = (unsigned long)(val); \ + xlen_t __v = (xlen_t)(val); \ __asm__ __volatile__ ("csrw " __ASM_STR(csr) ", %0" \ : : "rK" (__v) \ : "memory"); \ @@ -551,7 +552,7 @@ #define csr_read_set(csr, val) \ ({ \ - unsigned long __v = (unsigned long)(val); \ + xlen_t __v = (xlen_t)(val); \ __asm__ __volatile__ ("csrrs %0, " __ASM_STR(csr) ", %1"\ : "=r" (__v) : "rK" (__v) \ : "memory"); \ @@ -560,7 +561,7 @@ #define csr_set(csr, val) \ ({ \ - unsigned long __v = (unsigned long)(val); \ + xlen_t __v = (xlen_t)(val); \ __asm__ __volatile__ ("csrs " __ASM_STR(csr) ", %0" \ : : "rK" (__v) \ : "memory"); \ @@ -568,7 +569,7 @@ #define csr_read_clear(csr, val) \ ({ \ - unsigned long __v = (unsigned long)(val); \ + xlen_t __v = (xlen_t)(val); \ __asm__ __volatile__ ("csrrc %0, " __ASM_STR(csr) ", %1"\ : "=r" (__v) : "rK" (__v) \ : "memory"); \ @@ -577,7 +578,7 @@ #define csr_clear(csr, val) \ ({ \ - unsigned long __v = (unsigned long)(val); \ + xlen_t __v = (xlen_t)(val); \ __asm__ 
__volatile__ ("csrc " __ASM_STR(csr) ", %0" \ : : "rK" (__v) \ : "memory"); \ diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h index 5f56eb9d114a..ca57a650c3d2 100644 --- a/arch/riscv/include/asm/processor.h +++ b/arch/riscv/include/asm/processor.h @@ -45,7 +45,7 @@ * This decides where the kernel will search for a free chunk of vm * space during mmap's. */ -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 #define TASK_UNMAPPED_BASE PAGE_ALIGN((UL(1) << MMAP_MIN_VA_BITS) / 3) #else #define TASK_UNMAPPED_BASE PAGE_ALIGN(TASK_SIZE / 3) @@ -99,10 +99,10 @@ struct thread_struct { /* Callee-saved registers */ unsigned long ra; unsigned long sp; /* Kernel mode stack */ - unsigned long s[12]; /* s[0]: frame pointer */ + xlen_t s[12]; /* s[0]: frame pointer */ struct __riscv_d_ext_state fstate; unsigned long bad_cause; - unsigned long envcfg; + xlen_t envcfg; u32 riscv_v_flags; u32 vstate_ctrl; struct __riscv_v_ext_state vstate; @@ -133,8 +133,8 @@ static inline void arch_thread_struct_whitelist(unsigned long *offset, ((struct pt_regs *)(task_stack_page(tsk) + THREAD_SIZE \ - ALIGN(sizeof(struct pt_regs), STACK_ALIGN))) -#define KSTK_EIP(tsk) (task_pt_regs(tsk)->epc) -#define KSTK_ESP(tsk) (task_pt_regs(tsk)->sp) +#define KSTK_EIP(tsk) (ulong)(task_pt_regs(tsk)->epc) +#define KSTK_ESP(tsk) (ulong)(task_pt_regs(tsk)->sp) /* Do necessary setup to start up a newly executed thread. 
*/ diff --git a/arch/riscv/include/asm/ptrace.h b/arch/riscv/include/asm/ptrace.h index b5b0adcc85c1..a0ed27c2346b 100644 --- a/arch/riscv/include/asm/ptrace.h +++ b/arch/riscv/include/asm/ptrace.h @@ -13,51 +13,51 @@ #ifndef __ASSEMBLY__ struct pt_regs { - unsigned long epc; - unsigned long ra; - unsigned long sp; - unsigned long gp; - unsigned long tp; - unsigned long t0; - unsigned long t1; - unsigned long t2; - unsigned long s0; - unsigned long s1; - unsigned long a0; - unsigned long a1; - unsigned long a2; - unsigned long a3; - unsigned long a4; - unsigned long a5; - unsigned long a6; - unsigned long a7; - unsigned long s2; - unsigned long s3; - unsigned long s4; - unsigned long s5; - unsigned long s6; - unsigned long s7; - unsigned long s8; - unsigned long s9; - unsigned long s10; - unsigned long s11; - unsigned long t3; - unsigned long t4; - unsigned long t5; - unsigned long t6; + xlen_t epc; + xlen_t ra; + xlen_t sp; + xlen_t gp; + xlen_t tp; + xlen_t t0; + xlen_t t1; + xlen_t t2; + xlen_t s0; + xlen_t s1; + xlen_t a0; + xlen_t a1; + xlen_t a2; + xlen_t a3; + xlen_t a4; + xlen_t a5; + xlen_t a6; + xlen_t a7; + xlen_t s2; + xlen_t s3; + xlen_t s4; + xlen_t s5; + xlen_t s6; + xlen_t s7; + xlen_t s8; + xlen_t s9; + xlen_t s10; + xlen_t s11; + xlen_t t3; + xlen_t t4; + xlen_t t5; + xlen_t t6; /* Supervisor/Machine CSRs */ - unsigned long status; - unsigned long badaddr; - unsigned long cause; + xlen_t status; + xlen_t badaddr; + xlen_t cause; /* a0 value before the syscall */ - unsigned long orig_a0; + xlen_t orig_a0; }; #define PTRACE_SYSEMU 0x1f #define PTRACE_SYSEMU_SINGLESTEP 0x20 #ifdef CONFIG_64BIT -#define REG_FMT "%016lx" +#define REG_FMT "%016llx" #else #define REG_FMT "%08lx" #endif @@ -69,12 +69,12 @@ struct pt_regs { /* Helpers for working with the instruction pointer */ static inline unsigned long instruction_pointer(struct pt_regs *regs) { - return regs->epc; + return (unsigned long)regs->epc; } static inline void instruction_pointer_set(struct 
pt_regs *regs, unsigned long val) { - regs->epc = val; + regs->epc = (xlen_t)val; } #define profile_pc(regs) instruction_pointer(regs) @@ -82,40 +82,40 @@ static inline void instruction_pointer_set(struct pt_regs *regs, /* Helpers for working with the user stack pointer */ static inline unsigned long user_stack_pointer(struct pt_regs *regs) { - return regs->sp; + return (unsigned long)regs->sp; } static inline void user_stack_pointer_set(struct pt_regs *regs, unsigned long val) { - regs->sp = val; + regs->sp = (xlen_t)val; } /* Valid only for Kernel mode traps. */ static inline unsigned long kernel_stack_pointer(struct pt_regs *regs) { - return regs->sp; + return (unsigned long)regs->sp; } /* Helpers for working with the frame pointer */ static inline unsigned long frame_pointer(struct pt_regs *regs) { - return regs->s0; + return (unsigned long)regs->s0; } static inline void frame_pointer_set(struct pt_regs *regs, unsigned long val) { - regs->s0 = val; + regs->s0 = (xlen_t)val; } static inline unsigned long regs_return_value(struct pt_regs *regs) { - return regs->a0; + return (unsigned long)regs->a0; } static inline void regs_set_return_value(struct pt_regs *regs, unsigned long val) { - regs->a0 = val; + regs->a0 = (xlen_t)val; } extern int regs_query_register_offset(const char *name); diff --git a/arch/riscv/include/asm/sparsemem.h b/arch/riscv/include/asm/sparsemem.h index 2f901a410586..68907698caa6 100644 --- a/arch/riscv/include/asm/sparsemem.h +++ b/arch/riscv/include/asm/sparsemem.h @@ -4,7 +4,7 @@ #define _ASM_RISCV_SPARSEMEM_H #ifdef CONFIG_SPARSEMEM -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 #define MAX_PHYSMEM_BITS 56 #else #define MAX_PHYSMEM_BITS 32 diff --git a/arch/riscv/include/asm/switch_to.h b/arch/riscv/include/asm/switch_to.h index 0e71eb82f920..6d01b0fc5a25 100644 --- a/arch/riscv/include/asm/switch_to.h +++ b/arch/riscv/include/asm/switch_to.h @@ -71,9 +71,9 @@ static __always_inline bool has_fpu(void) { return false; } #endif static inline 
void envcfg_update_bits(struct task_struct *task, - unsigned long mask, unsigned long val) + xlen_t mask, xlen_t val) { - unsigned long envcfg; + xlen_t envcfg; envcfg = (task->thread.envcfg & ~mask) | val; task->thread.envcfg = envcfg; diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h index f5916a70879a..637a46fc7ed8 100644 --- a/arch/riscv/include/asm/thread_info.h +++ b/arch/riscv/include/asm/thread_info.h @@ -71,7 +71,7 @@ struct thread_info { * Used in handle_exception() to save a0, a1 and a2 before knowing if we * can access the kernel stack. */ - unsigned long a0, a1, a2; + xlen_t a0, a1, a2; #endif }; diff --git a/arch/riscv/include/asm/timex.h b/arch/riscv/include/asm/timex.h index a06697846e69..b5ca67b30d0b 100644 --- a/arch/riscv/include/asm/timex.h +++ b/arch/riscv/include/asm/timex.h @@ -8,7 +8,7 @@ #include -typedef unsigned long cycles_t; +typedef xlen_t cycles_t; #ifdef CONFIG_RISCV_M_MODE @@ -84,7 +84,7 @@ static inline u64 get_cycles64(void) #define ARCH_HAS_READ_CURRENT_TIMER static inline int read_current_timer(unsigned long *timer_val) { - *timer_val = get_cycles(); + *timer_val = (unsigned long)get_cycles(); return 0; } diff --git a/arch/riscv/include/uapi/asm/elf.h b/arch/riscv/include/uapi/asm/elf.h index 11a71b8533d5..9fc8c2e3556b 100644 --- a/arch/riscv/include/uapi/asm/elf.h +++ b/arch/riscv/include/uapi/asm/elf.h @@ -15,7 +15,7 @@ #include /* ELF register definitions */ -typedef unsigned long elf_greg_t; +typedef xlen_t elf_greg_t; typedef struct user_regs_struct elf_gregset_t; #define ELF_NGREG (sizeof(elf_gregset_t) / sizeof(elf_greg_t)) @@ -24,7 +24,7 @@ typedef __u64 elf_fpreg_t; typedef union __riscv_fp_state elf_fpregset_t; #define ELF_NFPREG (sizeof(struct __riscv_d_ext_state) / sizeof(elf_fpreg_t)) -#if __riscv_xlen == 64 +#if BITS_PER_LONG == 64 #define ELF_RISCV_R_SYM(r_info) ELF64_R_SYM(r_info) #define ELF_RISCV_R_TYPE(r_info) ELF64_R_TYPE(r_info) #else diff --git 
a/arch/riscv/include/uapi/asm/ptrace.h b/arch/riscv/include/uapi/asm/ptrace.h index a38268b19c3d..f040a2ba07b0 100644 --- a/arch/riscv/include/uapi/asm/ptrace.h +++ b/arch/riscv/include/uapi/asm/ptrace.h @@ -15,6 +15,14 @@ #define PTRACE_GETFDPIC_EXEC 0 #define PTRACE_GETFDPIC_INTERP 1 +#if __riscv_xlen == 64 +typedef u64 xlen_t; +#endif + +#if __riscv_xlen == 32 +typedef ulong xlen_t; +#endif + /* * User-mode register state for core dumps, ptrace, sigcontext * @@ -22,38 +30,38 @@ * struct user_regs_struct must form a prefix of struct pt_regs. */ struct user_regs_struct { - unsigned long pc; - unsigned long ra; - unsigned long sp; - unsigned long gp; - unsigned long tp; - unsigned long t0; - unsigned long t1; - unsigned long t2; - unsigned long s0; - unsigned long s1; - unsigned long a0; - unsigned long a1; - unsigned long a2; - unsigned long a3; - unsigned long a4; - unsigned long a5; - unsigned long a6; - unsigned long a7; - unsigned long s2; - unsigned long s3; - unsigned long s4; - unsigned long s5; - unsigned long s6; - unsigned long s7; - unsigned long s8; - unsigned long s9; - unsigned long s10; - unsigned long s11; - unsigned long t3; - unsigned long t4; - unsigned long t5; - unsigned long t6; + xlen_t pc; + xlen_t ra; + xlen_t sp; + xlen_t gp; + xlen_t tp; + xlen_t t0; + xlen_t t1; + xlen_t t2; + xlen_t s0; + xlen_t s1; + xlen_t a0; + xlen_t a1; + xlen_t a2; + xlen_t a3; + xlen_t a4; + xlen_t a5; + xlen_t a6; + xlen_t a7; + xlen_t s2; + xlen_t s3; + xlen_t s4; + xlen_t s5; + xlen_t s6; + xlen_t s7; + xlen_t s8; + xlen_t s9; + xlen_t s10; + xlen_t s11; + xlen_t t3; + xlen_t t4; + xlen_t t5; + xlen_t t6; }; struct __riscv_f_ext_state { @@ -98,12 +106,15 @@ union __riscv_fp_state { }; struct __riscv_v_ext_state { - unsigned long vstart; - unsigned long vl; - unsigned long vtype; - unsigned long vcsr; - unsigned long vlenb; - void *datap; + xlen_t vstart; + xlen_t vl; + xlen_t vtype; + xlen_t vcsr; + xlen_t vlenb; + union { + void *datap; + xlen_t pad; + }; /* 
* In signal handler, datap will be set a correct user stack offset * and vector registers will be copied to the address of datap @@ -112,11 +123,11 @@ struct __riscv_v_ext_state { }; struct __riscv_v_regset_state { - unsigned long vstart; - unsigned long vl; - unsigned long vtype; - unsigned long vcsr; - unsigned long vlenb; + xlen_t vstart; + xlen_t vl; + xlen_t vtype; + xlen_t vcsr; + xlen_t vlenb; char vreg[]; }; diff --git a/arch/riscv/include/uapi/asm/ucontext.h b/arch/riscv/include/uapi/asm/ucontext.h index 516bd0bb0da5..572b96c3ccf4 100644 --- a/arch/riscv/include/uapi/asm/ucontext.h +++ b/arch/riscv/include/uapi/asm/ucontext.h @@ -11,8 +11,11 @@ #include struct ucontext { - unsigned long uc_flags; - struct ucontext *uc_link; + xlen_t uc_flags; + union { + struct ucontext *uc_link; + xlen_t pad; + }; stack_t uc_stack; sigset_t uc_sigmask; /* diff --git a/arch/riscv/include/uapi/asm/unistd.h b/arch/riscv/include/uapi/asm/unistd.h index 81896bbbf727..e33dd5161b8d 100644 --- a/arch/riscv/include/uapi/asm/unistd.h +++ b/arch/riscv/include/uapi/asm/unistd.h @@ -16,7 +16,7 @@ */ #include -#if __BITS_PER_LONG == 64 +#if __riscv_xlen == 64 #include #else #include diff --git a/arch/riscv/kernel/compat_signal.c b/arch/riscv/kernel/compat_signal.c index 6ec4e34255a9..859104618f34 100644 --- a/arch/riscv/kernel/compat_signal.c +++ b/arch/riscv/kernel/compat_signal.c @@ -126,7 +126,7 @@ COMPAT_SYSCALL_DEFINE0(rt_sigreturn) /* Always make any pending restarted system calls return -EINTR */ current->restart_block.fn = do_no_restart_syscall; - frame = (struct compat_rt_sigframe __user *)regs->sp; + frame = (struct compat_rt_sigframe __user *)(ulong)regs->sp; if (!access_ok(frame, sizeof(*frame))) goto badframe; @@ -150,7 +150,7 @@ COMPAT_SYSCALL_DEFINE0(rt_sigreturn) pr_info_ratelimited( "%s[%d]: bad frame in %s: frame=%p pc=%p sp=%p\n", task->comm, task_pid_nr(task), __func__, - frame, (void *)regs->epc, (void *)regs->sp); + frame, (void *)(ulong)regs->epc, (void 
*)(ulong)regs->sp); } force_sig(SIGSEGV); return 0; diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c index 7c244de77180..5c827761f84b 100644 --- a/arch/riscv/kernel/process.c +++ b/arch/riscv/kernel/process.c @@ -65,8 +65,8 @@ void __show_regs(struct pt_regs *regs) show_regs_print_info(KERN_DEFAULT); if (!user_mode(regs)) { - pr_cont("epc : %pS\n", (void *)regs->epc); - pr_cont(" ra : %pS\n", (void *)regs->ra); + pr_cont("epc : %pS\n", (void *)(ulong)regs->epc); + pr_cont(" ra : %pS\n", (void *)(ulong)regs->ra); } pr_cont("epc : " REG_FMT " ra : " REG_FMT " sp : " REG_FMT "\n", @@ -272,7 +272,7 @@ long set_tagged_addr_ctrl(struct task_struct *task, unsigned long arg) unsigned long valid_mask = PR_PMLEN_MASK | PR_TAGGED_ADDR_ENABLE; struct thread_info *ti = task_thread_info(task); struct mm_struct *mm = task->mm; - unsigned long pmm; + xlen_t pmm; u8 pmlen; if (is_compat_thread(ti)) @@ -352,7 +352,7 @@ long get_tagged_addr_ctrl(struct task_struct *task) return ret; } -static bool try_to_set_pmm(unsigned long value) +static bool try_to_set_pmm(xlen_t value) { csr_set(CSR_ENVCFG, value); return (csr_read_clear(CSR_ENVCFG, ENVCFG_PMM) & ENVCFG_PMM) == value; diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c index 94e905eea1de..b3eb4154faf7 100644 --- a/arch/riscv/kernel/signal.c +++ b/arch/riscv/kernel/signal.c @@ -239,7 +239,7 @@ SYSCALL_DEFINE0(rt_sigreturn) /* Always make any pending restarted system calls return -EINTR */ current->restart_block.fn = do_no_restart_syscall; - frame = (struct rt_sigframe __user *)regs->sp; + frame = (struct rt_sigframe __user *)(ulong)regs->sp; if (!access_ok(frame, frame_size)) goto badframe; @@ -265,7 +265,7 @@ SYSCALL_DEFINE0(rt_sigreturn) pr_info_ratelimited( "%s[%d]: bad frame in %s: frame=%p pc=%p sp=%p\n", task->comm, task_pid_nr(task), __func__, - frame, (void *)regs->epc, (void *)regs->sp); + frame, (void *)(ulong)regs->epc, (void *)(ulong)regs->sp); } force_sig(SIGSEGV); return 0; 
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 8ff8e8b36524..1fada4c7ddfa 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -118,7 +118,7 @@ void do_trap(struct pt_regs *regs, int signo, int code, unsigned long addr) if (show_unhandled_signals && unhandled_signal(tsk, signo) && printk_ratelimit()) { pr_info("%s[%d]: unhandled signal %d code 0x%x at 0x" REG_FMT, - tsk->comm, task_pid_nr(tsk), signo, code, addr); + tsk->comm, task_pid_nr(tsk), signo, code, (xlen_t)addr); print_vma_addr(KERN_CONT " in ", instruction_pointer(regs)); pr_cont("\n"); __show_regs(regs); @@ -281,7 +281,7 @@ void handle_break(struct pt_regs *regs) current->thread.bad_cause = regs->cause; if (user_mode(regs)) - force_sig_fault(SIGTRAP, TRAP_BRKPT, (void __user *)regs->epc); + force_sig_fault(SIGTRAP, TRAP_BRKPT, (void __user *)instruction_pointer(regs)); #ifdef CONFIG_KGDB else if (notify_die(DIE_TRAP, "EBREAK", regs, 0, regs->cause, SIGTRAP) == NOTIFY_STOP) diff --git a/arch/riscv/kernel/vector.c b/arch/riscv/kernel/vector.c index 184f780c932d..884edd99e6b0 100644 --- a/arch/riscv/kernel/vector.c +++ b/arch/riscv/kernel/vector.c @@ -180,7 +180,7 @@ EXPORT_SYMBOL_GPL(riscv_v_vstate_ctrl_user_allowed); bool riscv_v_first_use_handler(struct pt_regs *regs) { - u32 __user *epc = (u32 __user *)regs->epc; + u32 __user *epc = (u32 __user *)(ulong)regs->epc; u32 insn = (u32)regs->badaddr; if (!(has_vector() || has_xtheadvector())) diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c index 0194324a0c50..fcc23350610e 100644 --- a/arch/riscv/mm/fault.c +++ b/arch/riscv/mm/fault.c @@ -78,7 +78,7 @@ static void die_kernel_fault(const char *msg, unsigned long addr, { bust_spinlocks(1); - pr_alert("Unable to handle kernel %s at virtual address " REG_FMT "\n", msg, + pr_alert("Unable to handle kernel %s at virtual address %08lx\n", msg, addr); bust_spinlocks(0); -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:15:46 2025 From: guoren at kernel.org 
(guoren at kernel.org) Date: Tue, 25 Mar 2025 08:15:46 -0400 Subject: [RFC PATCH V3 05/43] rv64ilp32_abi: riscv: crc32: Utilize 64-bit width to improve the performance In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-6-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The RV64ILP32 ABI, derived from a 64-bit ISA, uses 32-bit BITS_PER_LONG. Therefore, crc32 algorithm could utilize 64-bit width to improve the performance. Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- arch/riscv/lib/crc32-riscv.c | 35 ++++++++++++++++++----------------- 1 file changed, 18 insertions(+), 17 deletions(-) diff --git a/arch/riscv/lib/crc32-riscv.c b/arch/riscv/lib/crc32-riscv.c index 53d56ab422c7..68dfb0565696 100644 --- a/arch/riscv/lib/crc32-riscv.c +++ b/arch/riscv/lib/crc32-riscv.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include @@ -59,12 +60,12 @@ */ # define CRC32_POLY_QT_BE 0x04d101df481b4e5a -static inline u64 crc32_le_prep(u32 crc, unsigned long const *ptr) +static inline u64 crc32_le_prep(u32 crc, u64 const *ptr) { return (u64)crc ^ (__force u64)__cpu_to_le64(*ptr); } -static inline u32 crc32_le_zbc(unsigned long s, u32 poly, unsigned long poly_qt) +static inline u32 crc32_le_zbc(u64 s, u32 poly, u64 poly_qt) { u32 crc; @@ -85,7 +86,7 @@ static inline u32 crc32_le_zbc(unsigned long s, u32 poly, unsigned long poly_qt) return crc; } -static inline u64 crc32_be_prep(u32 crc, unsigned long const *ptr) +static inline u64 crc32_be_prep(u32 crc, u64 const *ptr) { return ((u64)crc << 32) ^ (__force u64)__cpu_to_be64(*ptr); } @@ -131,7 +132,7 @@ static inline u32 crc32_be_prep(u32 crc, unsigned long const *ptr) # error "Unexpected __riscv_xlen" #endif -static inline u32 crc32_be_zbc(unsigned long s) +static inline u32 crc32_be_zbc(xlen_t s) { u32 crc; @@ -156,16 +157,16 @@ typedef u32 (*fallback)(u32 crc, unsigned char const *p, size_t len); static inline 
u32 crc32_le_unaligned(u32 crc, unsigned char const *p, size_t len, u32 poly, - unsigned long poly_qt) + xlen_t poly_qt) { size_t bits = len * 8; - unsigned long s = 0; + xlen_t s = 0; u32 crc_low = 0; for (int i = 0; i < len; i++) - s = ((unsigned long)*p++ << (__riscv_xlen - 8)) | (s >> 8); + s = ((xlen_t)*p++ << (__riscv_xlen - 8)) | (s >> 8); - s ^= (unsigned long)crc << (__riscv_xlen - bits); + s ^= (xlen_t)crc << (__riscv_xlen - bits); if (__riscv_xlen == 32 || len < sizeof(u32)) crc_low = crc >> bits; @@ -177,12 +178,12 @@ static inline u32 crc32_le_unaligned(u32 crc, unsigned char const *p, static inline u32 __pure crc32_le_generic(u32 crc, unsigned char const *p, size_t len, u32 poly, - unsigned long poly_qt, + xlen_t poly_qt, fallback crc_fb) { size_t offset, head_len, tail_len; - unsigned long const *p_ul; - unsigned long s; + xlen_t const *p_ul; + xlen_t s; asm goto(ALTERNATIVE("j %l[legacy]", "nop", 0, RISCV_ISA_EXT_ZBC, 1) @@ -199,7 +200,7 @@ static inline u32 __pure crc32_le_generic(u32 crc, unsigned char const *p, tail_len = len & OFFSET_MASK; len = len >> STEP_ORDER; - p_ul = (unsigned long const *)p; + p_ul = (xlen_t const *)p; for (int i = 0; i < len; i++) { s = crc32_le_prep(crc, p_ul); @@ -236,7 +237,7 @@ static inline u32 crc32_be_unaligned(u32 crc, unsigned char const *p, size_t len) { size_t bits = len * 8; - unsigned long s = 0; + xlen_t s = 0; u32 crc_low = 0; s = 0; @@ -247,7 +248,7 @@ static inline u32 crc32_be_unaligned(u32 crc, unsigned char const *p, s ^= crc >> (32 - bits); crc_low = crc << bits; } else { - s ^= (unsigned long)crc << (bits - 32); + s ^= (xlen_t)crc << (bits - 32); } crc = crc32_be_zbc(s); @@ -259,8 +260,8 @@ static inline u32 crc32_be_unaligned(u32 crc, unsigned char const *p, u32 __pure crc32_be_arch(u32 crc, const u8 *p, size_t len) { size_t offset, head_len, tail_len; - unsigned long const *p_ul; - unsigned long s; + xlen_t const *p_ul; + xlen_t s; asm goto(ALTERNATIVE("j %l[legacy]", "nop", 0, RISCV_ISA_EXT_ZBC, 
1) @@ -277,7 +278,7 @@ u32 __pure crc32_be_arch(u32 crc, const u8 *p, size_t len) tail_len = len & OFFSET_MASK; len = len >> STEP_ORDER; - p_ul = (unsigned long const *)p; + p_ul = (xlen_t const *)p; for (int i = 0; i < len; i++) { s = crc32_be_prep(crc, p_ul); -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:15:47 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:15:47 -0400 Subject: [RFC PATCH V3 06/43] rv64ilp32_abi: riscv: csum: Utilize 64-bit width to improve the performance In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-7-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The RV64ILP32 ABI, derived from a 64-bit ISA, uses 32-bit BITS_PER_LONG. Therefore, checksum algorithm could utilize 64-bit width to improve the performance. Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- arch/riscv/lib/csum.c | 48 +++++++++++++++++++++---------------------- 1 file changed, 24 insertions(+), 24 deletions(-) diff --git a/arch/riscv/lib/csum.c b/arch/riscv/lib/csum.c index 7fb12c59e571..7139ab855349 100644 --- a/arch/riscv/lib/csum.c +++ b/arch/riscv/lib/csum.c @@ -22,17 +22,17 @@ __sum16 csum_ipv6_magic(const struct in6_addr *saddr, __u32 len, __u8 proto, __wsum csum) { unsigned int ulen, uproto; - unsigned long sum = (__force unsigned long)csum; + xlen_t sum = (__force xlen_t)csum; - sum += (__force unsigned long)saddr->s6_addr32[0]; - sum += (__force unsigned long)saddr->s6_addr32[1]; - sum += (__force unsigned long)saddr->s6_addr32[2]; - sum += (__force unsigned long)saddr->s6_addr32[3]; + sum += (__force xlen_t)saddr->s6_addr32[0]; + sum += (__force xlen_t)saddr->s6_addr32[1]; + sum += (__force xlen_t)saddr->s6_addr32[2]; + sum += (__force xlen_t)saddr->s6_addr32[3]; - sum += (__force unsigned long)daddr->s6_addr32[0]; - sum += (__force unsigned long)daddr->s6_addr32[1]; - sum += (__force unsigned long)daddr->s6_addr32[2]; - sum += 
(__force unsigned long)daddr->s6_addr32[3]; + sum += (__force xlen_t)daddr->s6_addr32[0]; + sum += (__force xlen_t)daddr->s6_addr32[1]; + sum += (__force xlen_t)daddr->s6_addr32[2]; + sum += (__force xlen_t)daddr->s6_addr32[3]; ulen = (__force unsigned int)htonl((unsigned int)len); sum += ulen; @@ -46,7 +46,7 @@ __sum16 csum_ipv6_magic(const struct in6_addr *saddr, */ if (IS_ENABLED(CONFIG_RISCV_ISA_ZBB) && IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) { - unsigned long fold_temp; + xlen_t fold_temp; /* * Zbb is likely available when the kernel is compiled with Zbb @@ -85,12 +85,12 @@ EXPORT_SYMBOL(csum_ipv6_magic); #define OFFSET_MASK 7 #endif -static inline __no_sanitize_address unsigned long -do_csum_common(const unsigned long *ptr, const unsigned long *end, - unsigned long data) +static inline __no_sanitize_address xlen_t +do_csum_common(const xlen_t *ptr, const xlen_t *end, + xlen_t data) { unsigned int shift; - unsigned long csum = 0, carry = 0; + xlen_t csum = 0, carry = 0; /* * Do 32-bit reads on RV32 and 64-bit reads otherwise. 
This should be @@ -130,8 +130,8 @@ static inline __no_sanitize_address unsigned int do_csum_with_alignment(const unsigned char *buff, int len) { unsigned int offset, shift; - unsigned long csum, data; - const unsigned long *ptr, *end; + xlen_t csum, data; + const xlen_t *ptr, *end; /* * Align address to closest word (double word on rv64) that comes before @@ -140,7 +140,7 @@ do_csum_with_alignment(const unsigned char *buff, int len) */ offset = (unsigned long)buff & OFFSET_MASK; kasan_check_read(buff, len); - ptr = (const unsigned long *)(buff - offset); + ptr = (const xlen_t *)(buff - offset); /* * Clear the most significant bytes that were over-read if buff was not @@ -153,7 +153,7 @@ do_csum_with_alignment(const unsigned char *buff, int len) #else data = (data << shift) >> shift; #endif - end = (const unsigned long *)(buff + len); + end = (const xlen_t *)(buff + len); csum = do_csum_common(ptr, end, data); #ifdef CC_HAS_ASM_GOTO_TIED_OUTPUT @@ -163,7 +163,7 @@ do_csum_with_alignment(const unsigned char *buff, int len) */ if (IS_ENABLED(CONFIG_RISCV_ISA_ZBB) && IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) { - unsigned long fold_temp; + xlen_t fold_temp; /* * Zbb is likely available when the kernel is compiled with Zbb @@ -233,15 +233,15 @@ do_csum_with_alignment(const unsigned char *buff, int len) static inline __no_sanitize_address unsigned int do_csum_no_alignment(const unsigned char *buff, int len) { - unsigned long csum, data; - const unsigned long *ptr, *end; + xlen_t csum, data; + const xlen_t *ptr, *end; - ptr = (const unsigned long *)(buff); + ptr = (const xlen_t *)(buff); data = *(ptr++); kasan_check_read(buff, len); - end = (const unsigned long *)(buff + len); + end = (const xlen_t *)(buff + len); csum = do_csum_common(ptr, end, data); /* @@ -250,7 +250,7 @@ do_csum_no_alignment(const unsigned char *buff, int len) */ if (IS_ENABLED(CONFIG_RISCV_ISA_ZBB) && IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) { - unsigned long fold_temp; + xlen_t fold_temp; /* * Zbb is likely 
available when the kernel is compiled with Zbb -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:15:48 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:15:48 -0400 Subject: [RFC PATCH V3 07/43] rv64ilp32_abi: riscv: arch_hweight: Adapt cpopw & cpop of zbb extension In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-8-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The RV64ILP32 ABI is based on 64-bit ISA, but BITS_PER_LONG is 32. Use cpopw for u32_weight and cpop for u64_weight. Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- arch/riscv/include/asm/arch_hweight.h | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/arch/riscv/include/asm/arch_hweight.h b/arch/riscv/include/asm/arch_hweight.h index 613769b9cdc9..42577965f5bb 100644 --- a/arch/riscv/include/asm/arch_hweight.h +++ b/arch/riscv/include/asm/arch_hweight.h @@ -12,7 +12,11 @@ #if (BITS_PER_LONG == 64) #define CPOPW "cpopw " #elif (BITS_PER_LONG == 32) +#ifdef CONFIG_64BIT +#define CPOPW "cpopw " +#else #define CPOPW "cpop " +#endif #else #error "Unexpected BITS_PER_LONG" #endif @@ -47,7 +51,7 @@ static inline unsigned int __arch_hweight8(unsigned int w) return __arch_hweight32(w & 0xff); } -#if BITS_PER_LONG == 64 +#ifdef CONFIG_64BIT static __always_inline unsigned long __arch_hweight64(__u64 w) { # ifdef CONFIG_RISCV_ISA_ZBB @@ -61,7 +65,7 @@ static __always_inline unsigned long __arch_hweight64(__u64 w) ".option pop\n" : "=r" (w) : "r" (w) :); - return w; + return (unsigned long)w; legacy: # endif -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:15:49 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:15:49 -0400 Subject: [RFC PATCH V3 08/43] rv64ilp32_abi: riscv: bitops: Adapt ctzw & clzw of zbb extension In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: 
<20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-9-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The RV64ILP32 ABI is based on 64-bit ISA, but BITS_PER_LONG is 32. Use ctzw and clzw for int and long types instead of ctz and clz. Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- arch/riscv/include/asm/bitops.h | 21 +++++++++++++++++---- 1 file changed, 17 insertions(+), 4 deletions(-) diff --git a/arch/riscv/include/asm/bitops.h b/arch/riscv/include/asm/bitops.h index c6bd3d8354a9..d041b9e3ba84 100644 --- a/arch/riscv/include/asm/bitops.h +++ b/arch/riscv/include/asm/bitops.h @@ -35,14 +35,27 @@ #include #include -#if (BITS_PER_LONG == 64) +#if (__riscv_xlen == 64) #define CTZW "ctzw " #define CLZW "clzw " + +#if (BITS_PER_LONG == 64) +#define CTZ "ctz " +#define CLZ "clz " #elif (BITS_PER_LONG == 32) +#define CTZ "ctzw " +#define CLZ "clzw " +#else +#error "Unexpected BITS_PER_LONG" +#endif + +#elif (__riscv_xlen == 32) #define CTZW "ctz " #define CLZW "clz " +#define CTZ "ctz " +#define CLZ "clz " #else -#error "Unexpected BITS_PER_LONG" +#error "Unexpected __riscv_xlen" #endif static __always_inline unsigned long variable__ffs(unsigned long word) @@ -53,7 +66,7 @@ static __always_inline unsigned long variable__ffs(unsigned long word) asm volatile (".option push\n" ".option arch,+zbb\n" - "ctz %0, %1\n" + CTZ "%0, %1\n" ".option pop\n" : "=r" (word) : "r" (word) :); @@ -82,7 +95,7 @@ static __always_inline unsigned long variable__fls(unsigned long word) asm volatile (".option push\n" ".option arch,+zbb\n" - "clz %0, %1\n" + CLZ "%0, %1\n" ".option pop\n" : "=r" (word) : "r" (word) :); -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:15:50 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:15:50 -0400 Subject: [RFC PATCH V3 09/43] rv64ilp32_abi: riscv: Reuse LP64 SBI interface In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> 
Message-ID: <20250325121624.523258-10-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The RV64ILP32 ABI leverages the LP64 SBI interface, enabling the RV64ILP32 Linux kernel to run seamlessly on LP64 OpenSBI or KVM. Using RV64ILP32 Linux doesn't require changing the bootloader, firmware, or hypervisor; it could replace the LP64 kernel directly. Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- arch/riscv/include/asm/cpu_ops_sbi.h | 4 ++-- arch/riscv/include/asm/sbi.h | 22 +++++++++++----------- arch/riscv/kernel/cpu_ops_sbi.c | 4 ++-- arch/riscv/kernel/sbi_ecall.c | 22 +++++++++++----------- 4 files changed, 26 insertions(+), 26 deletions(-) diff --git a/arch/riscv/include/asm/cpu_ops_sbi.h b/arch/riscv/include/asm/cpu_ops_sbi.h index d6e4665b3195..d967adad6b48 100644 --- a/arch/riscv/include/asm/cpu_ops_sbi.h +++ b/arch/riscv/include/asm/cpu_ops_sbi.h @@ -19,8 +19,8 @@ extern const struct cpu_operations cpu_ops_sbi; * @stack_ptr: A pointer to the hart specific sp */ struct sbi_hart_boot_data { - void *task_ptr; - void *stack_ptr; + xlen_t task_ptr; + xlen_t stack_ptr; }; #endif diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h index 3d250824178b..fd9a9c723ec6 100644 --- a/arch/riscv/include/asm/sbi.h +++ b/arch/riscv/include/asm/sbi.h @@ -138,16 +138,16 @@ enum sbi_ext_pmu_fid { }; union sbi_pmu_ctr_info { - unsigned long value; + xlen_t value; struct { - unsigned long csr:12; - unsigned long width:6; + xlen_t csr:12; + xlen_t width:6; #if __riscv_xlen == 32 - unsigned long reserved:13; + xlen_t reserved:13; #else - unsigned long reserved:45; + xlen_t reserved:45; #endif - unsigned long type:1; + xlen_t type:1; }; }; @@ -422,15 +422,15 @@ enum sbi_ext_nacl_feature { extern unsigned long sbi_spec_version; struct sbiret { - long error; - long value; + xlen_t error; + xlen_t value; }; void sbi_init(void); long __sbi_base_ecall(int fid); -struct sbiret __sbi_ecall(unsigned long arg0, unsigned long arg1, - unsigned long arg2, unsigned 
long arg3, - unsigned long arg4, unsigned long arg5, +struct sbiret __sbi_ecall(xlen_t arg0, xlen_t arg1, + xlen_t arg2, xlen_t arg3, + xlen_t arg4, xlen_t arg5, int fid, int ext); #define sbi_ecall(e, f, a0, a1, a2, a3, a4, a5) \ __sbi_ecall(a0, a1, a2, a3, a4, a5, f, e) diff --git a/arch/riscv/kernel/cpu_ops_sbi.c b/arch/riscv/kernel/cpu_ops_sbi.c index e6fbaaf54956..f9ef3c0155f4 100644 --- a/arch/riscv/kernel/cpu_ops_sbi.c +++ b/arch/riscv/kernel/cpu_ops_sbi.c @@ -71,8 +71,8 @@ static int sbi_cpu_start(unsigned int cpuid, struct task_struct *tidle) /* Make sure tidle is updated */ smp_mb(); - bdata->task_ptr = tidle; - bdata->stack_ptr = task_pt_regs(tidle); + bdata->task_ptr = (ulong)tidle; + bdata->stack_ptr = (ulong)task_pt_regs(tidle); /* Make sure boot data is updated */ smp_mb(); hsm_data = __pa(bdata); diff --git a/arch/riscv/kernel/sbi_ecall.c b/arch/riscv/kernel/sbi_ecall.c index 24aabb4fbde3..ee22e69d70da 100644 --- a/arch/riscv/kernel/sbi_ecall.c +++ b/arch/riscv/kernel/sbi_ecall.c @@ -17,23 +17,23 @@ long __sbi_base_ecall(int fid) } EXPORT_SYMBOL(__sbi_base_ecall); -struct sbiret __sbi_ecall(unsigned long arg0, unsigned long arg1, - unsigned long arg2, unsigned long arg3, - unsigned long arg4, unsigned long arg5, +struct sbiret __sbi_ecall(xlen_t arg0, xlen_t arg1, + xlen_t arg2, xlen_t arg3, + xlen_t arg4, xlen_t arg5, int fid, int ext) { struct sbiret ret; trace_sbi_call(ext, fid); - register uintptr_t a0 asm ("a0") = (uintptr_t)(arg0); - register uintptr_t a1 asm ("a1") = (uintptr_t)(arg1); - register uintptr_t a2 asm ("a2") = (uintptr_t)(arg2); - register uintptr_t a3 asm ("a3") = (uintptr_t)(arg3); - register uintptr_t a4 asm ("a4") = (uintptr_t)(arg4); - register uintptr_t a5 asm ("a5") = (uintptr_t)(arg5); - register uintptr_t a6 asm ("a6") = (uintptr_t)(fid); - register uintptr_t a7 asm ("a7") = (uintptr_t)(ext); + register xlen_t a0 asm ("a0") = (xlen_t)(arg0); + register xlen_t a1 asm ("a1") = (xlen_t)(arg1); + register xlen_t a2 asm ("a2") 
= (xlen_t)(arg2); + register xlen_t a3 asm ("a3") = (xlen_t)(arg3); + register xlen_t a4 asm ("a4") = (xlen_t)(arg4); + register xlen_t a5 asm ("a5") = (xlen_t)(arg5); + register xlen_t a6 asm ("a6") = (xlen_t)(fid); + register xlen_t a7 asm ("a7") = (xlen_t)(ext); asm volatile ("ecall" : "+r" (a0), "+r" (a1) : "r" (a2), "r" (a3), "r" (a4), "r" (a5), "r" (a6), "r" (a7) -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:15:51 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:15:51 -0400 Subject: [RFC PATCH V3 10/43] rv64ilp32_abi: riscv: Update SATP.MODE.ASID width In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-11-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" RV32 uses a 9-bit asid_bits because its satp CSR is constrained to xlen=32, whereas the RV64ILP32 ABI, rooted in the RV64 ISA, has a 64-bit satp CSR. Hence, the rv64ilp32 ABI adopts the same ASID mechanism as the 64-bit architecture.
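The ASID width probe that this patch adapts can be sketched in plain C as follows. This is only an illustration under stated assumptions: `fls64_sketch` and `asid_bits_probe` are hypothetical names, the CSR write and read-back are replaced by a plain argument, and the field positions used in the usage note (shift 44 with mask 0xFFFF for a 64-bit satp, shift 22 with mask 0x1FF for a 32-bit satp) follow the privileged specification rather than anything stated in this mail.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's fls64(): 1-based index of the
 * highest set bit, or 0 when no bit is set.
 */
static unsigned int fls64_sketch(uint64_t x)
{
	unsigned int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/*
 * satp_readback models the value read from satp after every ASID bit
 * was written as one; unimplemented ASID bits read back as zero.
 */
static unsigned int asid_bits_probe(uint64_t satp_readback,
				    unsigned int asid_shift,
				    uint64_t asid_mask)
{
	assert(asid_shift < 64);
	return fls64_sketch((satp_readback >> asid_shift) & asid_mask);
}
```

With a 64-bit satp, `asid_bits_probe(0xFFFFULL << 44, 44, 0xFFFF)` yields 16, while the 32-bit layout tops out at 9, which is why a 32-bit `long` cannot hold the probed value on rv64ilp32.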
Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- arch/riscv/mm/context.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c index 4abe3de23225..c3f9926d9337 100644 --- a/arch/riscv/mm/context.c +++ b/arch/riscv/mm/context.c @@ -226,14 +226,18 @@ static inline void set_mm(struct mm_struct *prev, static int __init asids_init(void) { - unsigned long asid_bits, old; + xlen_t asid_bits, old; /* Figure-out number of ASID bits in HW */ old = csr_read(CSR_SATP); asid_bits = old | (SATP_ASID_MASK << SATP_ASID_SHIFT); csr_write(CSR_SATP, asid_bits); asid_bits = (csr_read(CSR_SATP) >> SATP_ASID_SHIFT) & SATP_ASID_MASK; - asid_bits = fls_long(asid_bits); +#if __riscv_xlen == 64 + asid_bits = fls64(asid_bits); +#else + asid_bits = fls(asid_bits); +#endif csr_write(CSR_SATP, old); /* @@ -265,9 +269,9 @@ static int __init asids_init(void) static_branch_enable(&use_asid_allocator); pr_info("ASID allocator using %lu bits (%lu entries)\n", - asid_bits, num_asids); + (ulong)asid_bits, num_asids); } else { - pr_info("ASID allocator disabled (%lu bits)\n", asid_bits); + pr_info("ASID allocator disabled (%lu bits)\n", (ulong)asid_bits); } return 0; -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:15:52 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:15:52 -0400 Subject: [RFC PATCH V3 11/43] rv64ilp32_abi: riscv: Introduce PTR_L and PTR_S In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-12-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" REG_L and REG_S cannot satisfy the rv64ilp32 ABI requirements, because BITS_PER_LONG != __riscv_xlen. So introduce new PTR_L and PTR_S macros to help head.S and entry.S deal with the pointer data type.
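The distinction between register-width and pointer-width accesses can be modelled with a small C sketch. The helpers `reg_load`, `ptr_load` and `needs_ptr_macros` are hypothetical names used only for illustration; they mirror the idea that REG_L follows the register width while PTR_L follows the pointer size, and they are not the kernel macros themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of REG_L: the load width follows the register width. */
static const char *reg_load(unsigned int xlen_bits)
{
	assert(xlen_bits == 32 || xlen_bits == 64);
	return xlen_bits == 64 ? "ld" : "lw";
}

/* Hypothetical model of PTR_L: the load width follows the pointer size. */
static const char *ptr_load(size_t pointer_bytes)
{
	assert(pointer_bytes == 4 || pointer_bytes == 8);
	return pointer_bytes == 8 ? "ld" : "lw";
}

/* Returns 1 when pointer accesses need a different width than register accesses. */
static int needs_ptr_macros(unsigned int xlen_bits, size_t pointer_bytes)
{
	return strcmp(reg_load(xlen_bits), ptr_load(pointer_bytes)) != 0;
}
```

Only the rv64ilp32 combination (64-bit registers, 4-byte pointers) makes `needs_ptr_macros` return 1; on plain RV32 and LP64 the two selections coincide, so REG_L and PTR_L would expand to the same instruction there.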
Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- arch/riscv/include/asm/asm.h | 13 +++++++++---- arch/riscv/include/asm/scs.h | 4 ++-- arch/riscv/kernel/entry.S | 32 ++++++++++++++++---------------- arch/riscv/kernel/head.S | 8 ++++---- 4 files changed, 31 insertions(+), 26 deletions(-) diff --git a/arch/riscv/include/asm/asm.h b/arch/riscv/include/asm/asm.h index 776354895b81..e37d73abbedd 100644 --- a/arch/riscv/include/asm/asm.h +++ b/arch/riscv/include/asm/asm.h @@ -38,6 +38,7 @@ #define RISCV_SZPTR "8" #define RISCV_LGPTR "3" #endif +#define __PTR_SEL(a, b) __ASM_STR(a) #elif __SIZEOF_POINTER__ == 4 #ifdef __ASSEMBLY__ #define RISCV_PTR .word @@ -48,10 +49,14 @@ #define RISCV_SZPTR "4" #define RISCV_LGPTR "2" #endif +#define __PTR_SEL(a, b) __ASM_STR(b) #else #error "Unexpected __SIZEOF_POINTER__" #endif +#define PTR_L __PTR_SEL(ld, lw) +#define PTR_S __PTR_SEL(sd, sw) + #if (__SIZEOF_INT__ == 4) #define RISCV_INT __ASM_STR(.word) #define RISCV_SZINT __ASM_STR(4) @@ -83,18 +88,18 @@ .endm #ifdef CONFIG_SMP -#ifdef CONFIG_32BIT +#if BITS_PER_LONG == 32 #define PER_CPU_OFFSET_SHIFT 2 #else #define PER_CPU_OFFSET_SHIFT 3 #endif .macro asm_per_cpu dst sym tmp - REG_L \tmp, TASK_TI_CPU_NUM(tp) + PTR_L \tmp, TASK_TI_CPU_NUM(tp) slli \tmp, \tmp, PER_CPU_OFFSET_SHIFT la \dst, __per_cpu_offset add \dst, \dst, \tmp - REG_L \tmp, 0(\dst) + PTR_L \tmp, 0(\dst) la \dst, \sym add \dst, \dst, \tmp .endm @@ -106,7 +111,7 @@ .macro load_per_cpu dst ptr tmp asm_per_cpu \dst \ptr \tmp - REG_L \dst, 0(\dst) + PTR_L \dst, 0(\dst) .endm #ifdef CONFIG_SHADOW_CALL_STACK diff --git a/arch/riscv/include/asm/scs.h b/arch/riscv/include/asm/scs.h index 0e45db78b24b..30929afb4e1a 100644 --- a/arch/riscv/include/asm/scs.h +++ b/arch/riscv/include/asm/scs.h @@ -20,7 +20,7 @@ /* Load task_scs_sp(current) to gp. */ .macro scs_load_current - REG_L gp, TASK_TI_SCS_SP(tp) + PTR_L gp, TASK_TI_SCS_SP(tp) .endm /* Load task_scs_sp(current) to gp, but only if tp has changed. 
*/ @@ -32,7 +32,7 @@ /* Save gp to task_scs_sp(current). */ .macro scs_save_current - REG_S gp, TASK_TI_SCS_SP(tp) + PTR_S gp, TASK_TI_SCS_SP(tp) .endm #else /* CONFIG_SHADOW_CALL_STACK */ diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index 33a5a9f2a0d4..2cf36e3ab6b9 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -117,19 +117,19 @@ SYM_CODE_START(handle_exception) new_vmalloc_check #endif - REG_S sp, TASK_TI_KERNEL_SP(tp) + PTR_S sp, TASK_TI_KERNEL_SP(tp) #ifdef CONFIG_VMAP_STACK addi sp, sp, -(PT_SIZE_ON_STACK) srli sp, sp, THREAD_SHIFT andi sp, sp, 0x1 bnez sp, handle_kernel_stack_overflow - REG_L sp, TASK_TI_KERNEL_SP(tp) + PTR_L sp, TASK_TI_KERNEL_SP(tp) #endif .Lsave_context: - REG_S sp, TASK_TI_USER_SP(tp) - REG_L sp, TASK_TI_KERNEL_SP(tp) + PTR_S sp, TASK_TI_USER_SP(tp) + PTR_L sp, TASK_TI_KERNEL_SP(tp) addi sp, sp, -(PT_SIZE_ON_STACK) REG_S x1, PT_RA(sp) REG_S x3, PT_GP(sp) @@ -145,7 +145,7 @@ SYM_CODE_START(handle_exception) */ li t0, SR_SUM | SR_FS_VS - REG_L s0, TASK_TI_USER_SP(tp) + PTR_L s0, TASK_TI_USER_SP(tp) csrrc s1, CSR_STATUS, t0 csrr s2, CSR_EPC csrr s3, CSR_TVAL @@ -193,7 +193,7 @@ SYM_CODE_START(handle_exception) add t0, t1, t0 /* Check if exception code lies within bounds */ bgeu t0, t2, 3f - REG_L t1, 0(t0) + PTR_L t1, 0(t0) 2: jalr t1 j ret_from_exception 3: @@ -226,7 +226,7 @@ SYM_CODE_START_NOALIGN(ret_from_exception) /* Save unwound kernel stack pointer in thread_info */ addi s0, sp, PT_SIZE_ON_STACK - REG_S s0, TASK_TI_KERNEL_SP(tp) + PTR_S s0, TASK_TI_KERNEL_SP(tp) /* Save the kernel shadow call stack pointer */ scs_save_current @@ -301,7 +301,7 @@ SYM_CODE_START_LOCAL(handle_kernel_stack_overflow) REG_S x5, PT_T0(sp) save_from_x6_to_x31 - REG_L s0, TASK_TI_KERNEL_SP(tp) + PTR_L s0, TASK_TI_KERNEL_SP(tp) csrr s1, CSR_STATUS csrr s2, CSR_EPC csrr s3, CSR_TVAL @@ -341,8 +341,8 @@ SYM_CODE_END(ret_from_fork) SYM_FUNC_START(call_on_irq_stack) /* Create a frame record to save ra and s0 (fp) */ 
addi sp, sp, -STACKFRAME_SIZE_ON_STACK - REG_S ra, STACKFRAME_RA(sp) - REG_S s0, STACKFRAME_FP(sp) + PTR_S ra, STACKFRAME_RA(sp) + PTR_S s0, STACKFRAME_FP(sp) addi s0, sp, STACKFRAME_SIZE_ON_STACK /* Switch to the per-CPU shadow call stack */ @@ -360,8 +360,8 @@ SYM_FUNC_START(call_on_irq_stack) /* Switch back to the thread stack and restore ra and s0 */ addi sp, s0, -STACKFRAME_SIZE_ON_STACK - REG_L ra, STACKFRAME_RA(sp) - REG_L s0, STACKFRAME_FP(sp) + PTR_L ra, STACKFRAME_RA(sp) + PTR_L s0, STACKFRAME_FP(sp) addi sp, sp, STACKFRAME_SIZE_ON_STACK ret @@ -383,8 +383,8 @@ SYM_FUNC_START(__switch_to) li a4, TASK_THREAD_RA add a3, a0, a4 add a4, a1, a4 - REG_S ra, TASK_THREAD_RA_RA(a3) - REG_S sp, TASK_THREAD_SP_RA(a3) + PTR_S ra, TASK_THREAD_RA_RA(a3) + PTR_S sp, TASK_THREAD_SP_RA(a3) REG_S s0, TASK_THREAD_S0_RA(a3) REG_S s1, TASK_THREAD_S1_RA(a3) REG_S s2, TASK_THREAD_S2_RA(a3) @@ -400,8 +400,8 @@ SYM_FUNC_START(__switch_to) /* Save the kernel shadow call stack pointer */ scs_save_current /* Restore context from next->thread */ - REG_L ra, TASK_THREAD_RA_RA(a4) - REG_L sp, TASK_THREAD_SP_RA(a4) + PTR_L ra, TASK_THREAD_RA_RA(a4) + PTR_L sp, TASK_THREAD_SP_RA(a4) REG_L s0, TASK_THREAD_S0_RA(a4) REG_L s1, TASK_THREAD_S1_RA(a4) REG_L s2, TASK_THREAD_S2_RA(a4) diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S index 356d5397b2a2..e55a92be12b1 100644 --- a/arch/riscv/kernel/head.S +++ b/arch/riscv/kernel/head.S @@ -42,7 +42,7 @@ SYM_CODE_START(_start) /* Image load offset (0MB) from start of RAM for M-mode */ .dword 0 #else -#if __riscv_xlen == 64 +#ifdef CONFIG_64BIT /* Image load offset(2MB) from start of RAM */ .dword 0x200000 #else @@ -75,7 +75,7 @@ relocate_enable_mmu: /* Relocate return address */ la a1, kernel_map XIP_FIXUP_OFFSET a1 - REG_L a1, KERNEL_MAP_VIRT_ADDR(a1) + PTR_L a1, KERNEL_MAP_VIRT_ADDR(a1) la a2, _start sub a1, a1, a2 add ra, ra, a1 @@ -349,8 +349,8 @@ SYM_CODE_START(_start_kernel) */ .Lwait_for_cpu_up: /* FIXME: We should WFI to save 
some energy here. */ - REG_L sp, (a1) - REG_L tp, (a2) + PTR_L sp, (a1) + PTR_L tp, (a2) beqz sp, .Lwait_for_cpu_up beqz tp, .Lwait_for_cpu_up fence -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:15:53 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:15:53 -0400 Subject: [RFC PATCH V3 12/43] rv64ilp32_abi: riscv: Introduce cmpxchg_double In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-13-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The rv64ilp32 abi has the ability to exclusively load and store (ld/sd) a pair of words from an address. Then the SLUB can take advantage of a cmpxchg_double implementation to avoid taking some locks. This patch provides an implementation of cmpxchg_double for 32-bit pairs, and activates the logic required for the SLUB to use these functions (HAVE_ALIGNED_STRUCT_PAGE and HAVE_CMPXCHG_DOUBLE). Inspired from the commit: 5284e1b4bc8a ("arm64: xchg: Implement cmpxchg_double") Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- arch/riscv/Kconfig | 1 + arch/riscv/include/asm/cmpxchg.h | 53 ++++++++++++++++++++++++++++++++ 2 files changed, 54 insertions(+) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index da2111b0111c..884235cf4092 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -141,6 +141,7 @@ config RISCV select HAVE_ARCH_USERFAULTFD_MINOR if 64BIT && USERFAULTFD select HAVE_ARCH_VMAP_STACK if MMU && 64BIT select HAVE_ASM_MODVERSIONS + select HAVE_CMPXCHG_DOUBLE if ABI_RV64ILP32 select HAVE_CONTEXT_TRACKING_USER select HAVE_DEBUG_KMEMLEAK select HAVE_DMA_CONTIGUOUS if MMU diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h index 938d50194dba..944f6d825f78 100644 --- a/arch/riscv/include/asm/cmpxchg.h +++ b/arch/riscv/include/asm/cmpxchg.h @@ -7,6 +7,7 @@ #define _ASM_RISCV_CMPXCHG_H #include +#include #include #include @@ -409,6 +410,58 @@ static 
__always_inline void __cmpwait(volatile void *ptr, #define __cmpwait_relaxed(ptr, val) \ __cmpwait((ptr), (unsigned long)(val), sizeof(*(ptr))) + +#ifdef CONFIG_HAVE_CMPXCHG_DOUBLE +#define system_has_cmpxchg_double() 1 + +#define __cmpxchg_double_check(ptr1, ptr2) \ +({ \ + if (sizeof(*(ptr1)) != 4) \ + BUILD_BUG(); \ + if (sizeof(*(ptr2)) != 4) \ + BUILD_BUG(); \ + VM_BUG_ON((ulong *)(ptr2) - (ulong *)(ptr1) != 1); \ + VM_BUG_ON(((ulong)ptr1 & 0x7) != 0); \ +}) + +#define __cmpxchg_double(old1, old2, new1, new2, ptr) \ +({ \ + __typeof__(ptr) __ptr = (ptr); \ + register unsigned int __ret; \ + u64 __old; \ + u64 __new; \ + u64 __tmp; \ + switch (sizeof(*(ptr))) { \ + case 4: \ + __old = ((u64)old2 << 32) | (u64)old1; \ + __new = ((u64)new2 << 32) | (u64)new1; \ + __asm__ __volatile__ ( \ + "0: lr.d %0, %2\n" \ + " bne %0, %z3, 1f\n" \ + " sc.d %1, %z4, %2\n" \ + " bnez %1, 0b\n" \ + "1:\n" \ + : "=&r" (__tmp), "=&r" (__ret), "+A" (*__ptr) \ + : "rJ" (__old), "rJ" (__new) \ + : "memory"); \ + __ret = (__old == __tmp); \ + break; \ + default: \ + BUILD_BUG(); \ + } \ + __ret; \ +}) + +#define arch_cmpxchg_double(ptr1, ptr2, o1, o2, n1, n2) \ +({ \ + int __ret; \ + __cmpxchg_double_check(ptr1, ptr2); \ + __ret = __cmpxchg_double((ulong)(o1), (ulong)(o2), \ + (ulong)(n1), (ulong)(n2), \ + ptr1); \ + __ret; \ +}) +#endif #endif #endif /* _ASM_RISCV_CMPXCHG_H */ -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:15:54 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:15:54 -0400 Subject: [RFC PATCH V3 13/43] rv64ilp32_abi: riscv: Correct stackframe layout In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-14-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" In RV64ILP32 ABI, the callee saved fp & ra are 64-bit width, not long size. This patch corrects the layout for the struct stackframe. 
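A rough C model of the layout problem, assuming a 32-bit `long` is represented by `uint32_t` and the saved fp/ra slots by `uint64_t`. The struct and field names here (`stackframe_sketch`, `fp_hi`, `ra_hi`, and so on) are hypothetical and chosen for illustration; only the idea of padding each 32-bit field out to a 64-bit slot comes from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What the callee actually saves on an RV64 ISA: two 64-bit slots. */
struct saved_pair_rv64 {
	uint64_t fp;
	uint64_t ra;
};

/* Unpadded view with 32-bit longs: ra lands on the upper half of fp. */
struct stackframe_unpadded {
	uint32_t fp;
	uint32_t ra;
};

/* Padded view: each 32-bit field is followed by a 32-bit filler word. */
struct stackframe_sketch {
	uint32_t fp;
	uint32_t fp_hi;	/* hypothetical name for the filler after fp */
	uint32_t ra;
	uint32_t ra_hi;	/* hypothetical name for the filler after ra */
};

/* Returns 1 when the padded view lines up with the saved 64-bit slots. */
static int frame_layout_matches(void)
{
	return offsetof(struct stackframe_sketch, fp) ==
		       offsetof(struct saved_pair_rv64, fp) &&
	       offsetof(struct stackframe_sketch, ra) ==
		       offsetof(struct saved_pair_rv64, ra) &&
	       sizeof(struct stackframe_sketch) ==
		       sizeof(struct saved_pair_rv64);
}
```

In the unpadded view `ra` sits at byte offset 4 instead of 8, so a stack walker would read the wrong word as the return address.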
echo c > /proc/sysrq-trigger Before the patch: sysrq: Trigger a crash Kernel panic - not syncing: sysrq triggered crash CPU: 0 PID: 102 Comm: sh Not tainted ... Hardware name: riscv-virtio,qemu (DT) Call Trace: ---[ end Kernel panic - not syncing: sysrq triggered crash ]--- After the patch: sysrq: Trigger a crash Kernel panic - not syncing: sysrq triggered crash CPU: 0 PID: 102 Comm: sh Not tainted ... Hardware name: riscv-virtio,qemu (DT) Call Trace: [] dump_backtrace+0x1e/0x26 [] show_stack+0x2e/0x3c [] dump_stack_lvl+0x40/0x5a [] dump_stack+0x16/0x1e [] panic+0x10c/0x2a8 [] sysrq_reset_seq_param_set+0x0/0x76 [] __handle_sysrq+0x9c/0x19c [] write_sysrq_trigger+0x64/0x78 [] proc_reg_write+0x4a/0xa2 [] vfs_write+0xac/0x308 [] ksys_write+0x62/0xda [] sys_write+0xe/0x16 [] do_trap_ecall_u+0xd8/0xda [] ret_from_exception+0x0/0x66 ---[ end Kernel panic - not syncing: sysrq triggered crash ]--- Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- arch/riscv/include/asm/stacktrace.h | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/arch/riscv/include/asm/stacktrace.h b/arch/riscv/include/asm/stacktrace.h index b1495a7e06ce..556655cab09d 100644 --- a/arch/riscv/include/asm/stacktrace.h +++ b/arch/riscv/include/asm/stacktrace.h @@ -8,7 +8,13 @@ struct stackframe { unsigned long fp; +#if IS_ENABLED(CONFIG_64BIT) && (BITS_PER_LONG == 32) + unsigned long __fp; +#endif unsigned long ra; +#if IS_ENABLED(CONFIG_64BIT) && (BITS_PER_LONG == 32) + unsigned long __ra; +#endif }; extern void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs, -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:15:55 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:15:55 -0400 Subject: [RFC PATCH V3 14/43] rv64ilp32_abi: riscv: Adapt kernel module code In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-15-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO 
Academy)" Because riscv_insn_valid_32bit_offset is always true for ILP32, use BITS_PER_LONG instead of CONFIG_64BIT. Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- arch/riscv/kernel/module.c | 2 +- include/asm-generic/module.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c index 47d0ebeec93c..d7360878e618 100644 --- a/arch/riscv/kernel/module.c +++ b/arch/riscv/kernel/module.c @@ -45,7 +45,7 @@ struct relocation_handlers { */ static bool riscv_insn_valid_32bit_offset(ptrdiff_t val) { -#ifdef CONFIG_32BIT +#if BITS_PER_LONG == 32 return true; #else return (-(1L << 31) - (1L << 11)) <= val && val < ((1L << 31) - (1L << 11)); diff --git a/include/asm-generic/module.h b/include/asm-generic/module.h index 98e1541b72b7..f870171b14a8 100644 --- a/include/asm-generic/module.h +++ b/include/asm-generic/module.h @@ -12,7 +12,7 @@ struct mod_arch_specific }; #endif -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 #define Elf_Shdr Elf64_Shdr #define Elf_Phdr Elf64_Phdr #define Elf_Sym Elf64_Sym -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:15:56 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:15:56 -0400 Subject: [RFC PATCH V3 15/43] rv64ilp32_abi: riscv: mm: Adapt MMU_SV39 for 2GiB address space In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-16-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The RV64ILP32 ABI has two independent 2GiB address spaces, one for the kernel and one for user space. There is no Sv32 MMU mode in an xlen=64 ISA. This commit enables MMU_SV39 for RV64ILP32 to satisfy the user & kernel 2GiB mapping requirements. Sv39 is the mandatory MMU mode when the rv64 satp mode is not Bare, so we needn't care about Sv48 & Sv57.
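As a rough illustration of how a 32-bit address maps onto the Sv39 top-level table, the following C sketch computes the top-level index (virtual address bits 38:30, per the Sv39 format) for both the zero-extended and the sign-extended form of an address. The helper names are hypothetical, and the sample addresses in the usage note are examples picked for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend a 32-bit virtual address to 64 bits. */
static uint64_t zero_extend32(uint32_t va)
{
	return (uint64_t)va;
}

/* Sign-extend a 32-bit virtual address to 64 bits. */
static uint64_t sign_extend32(uint32_t va)
{
	return (va & 0x80000000u) ? (0xffffffff00000000ULL | va) : (uint64_t)va;
}

/* Top-level Sv39 page table index: virtual address bits 38:30. */
static unsigned int sv39_top_index(uint64_t va)
{
	/* Sv39 addresses are canonical: bits 63:38 are all zero or all one. */
	assert((va >> 38) == 0 || (va >> 38) == 0x3ffffffULL);
	return (unsigned int)((va >> 30) & 0x1ff);
}
```

For 0xc0000000 the zero-extended form selects slot 3 and the sign-extended form selects slot 511; for 0x80000000 the slots are 2 and 510. An address below 0x80000000 selects the same slot either way, so only the upper 2GiB can show up at two different top-level entries.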
2GiB virtual userspace memory layout (u64lp64 ABI): 55555000-5560c000 r-xp 00000000 fe:00 17 /bin/busybox 5560c000-5560f000 r--p 000b7000 fe:00 17 /bin/busybox 5560f000-55610000 rw-p 000ba000 fe:00 17 /bin/busybox 55610000-55631000 rw-p 00000000 00:00 0 [heap] 77e69000-77e6b000 rw-p 00000000 00:00 0 77e6b000-77fba000 r-xp 00000000 fe:00 140 /lib/libc.so.6 77fba000-77fbd000 r--p 0014f000 fe:00 140 /lib/libc.so.6 77fbd000-77fbf000 rw-p 00152000 fe:00 140 /lib/libc.so.6 77fbf000-77fcb000 rw-p 00000000 00:00 0 77fcb000-77fd5000 r-xp 00000000 fe:00 148 /lib/libresolv.so.2 77fd5000-77fd6000 r--p 0000a000 fe:00 148 /lib/libresolv.so.2 77fd6000-77fd7000 rw-p 0000b000 fe:00 148 /lib/libresolv.so.2 77fd7000-77fd9000 rw-p 00000000 00:00 0 77fd9000-77fdb000 r--p 00000000 00:00 0 [vvar] 77fdb000-77fdc000 r-xp 00000000 00:00 0 [vdso] 77fdc000-77ffc000 r-xp 00000000 fe:00 135 /lib/ld-linux-riscv64-lp64d.so.1 77ffc000-77ffe000 r--p 0001f000 fe:00 135 /lib/ld-linux-riscv64-lp64d.so.1 77ffe000-78000000 rw-p 00021000 fe:00 135 /lib/ld-linux-riscv64-lp64d.so.1 7ffdf000-80000000 rw-p 00000000 00:00 0 [stack] 2GiB virtual kernel memory layout: fixmap : 0x90a00000 - 0x90ffffff (6144 kB) pci io : 0x91000000 - 0x91ffffff ( 16 MB) vmemmap : 0x92000000 - 0x93ffffff ( 32 MB) vmalloc : 0x94000000 - 0xb3ffffff ( 512 MB) modules : 0xb4000000 - 0xb7ffffff ( 64 MB) lowmem : 0xc0000000 - 0xc7ffffff ( 128 MB) kasan : 0x80000000 - 0x8fffffff ( 256 MB) kernel : 0xb8000000 - 0xbfffffff ( 128 MB) For satp=sv39, introduce a double mapping to make the sign-extended virtual address identical to the zero-extended virtual address: +--------+ +---------+ +--------+ | | +--| 511:PUD1| | | | | | +---------+ | | | | | | 510:PUD0|--+ | | | | | +---------+ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | INVALID | | | | | | | | | | | | | .... | | | | | | .... 
| | | | | | | | | | | | +---------+ | | | | | +--| 3:PUD1 | | | | | | | +---------+ | | | | | | | 2:PUD0 |--+ | | | | | +---------+ | | | | | | |1:USR_PUD| | | | | | | +---------+ | | | | | | |0:USR_PUD| | | | +--------+<--+ +---------+ +-->+--------+ PUD1 ^ PGD PUD0 1GB | 4GB 1GB | +----------+ | Sv39 PGDP| +----------+ SATP Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- arch/riscv/Kconfig | 2 +- arch/riscv/include/asm/page.h | 23 ++++++----- arch/riscv/include/asm/pgtable-64.h | 55 ++++++++++++++------------ arch/riscv/include/asm/pgtable.h | 60 ++++++++++++++++++++++++----- arch/riscv/include/asm/processor.h | 2 +- arch/riscv/kernel/cpu.c | 4 +- arch/riscv/mm/fault.c | 10 ++--- arch/riscv/mm/init.c | 55 ++++++++++++++++++-------- arch/riscv/mm/pageattr.c | 4 +- arch/riscv/mm/pgtable.c | 2 +- 10 files changed, 145 insertions(+), 72 deletions(-) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 884235cf4092..9469cdc51ba4 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -293,7 +293,7 @@ config PAGE_OFFSET hex default 0x80000000 if !MMU && RISCV_M_MODE default 0x80200000 if !MMU - default 0xc0000000 if 32BIT + default 0xc0000000 if 32BIT || ABI_RV64ILP32 default 0xff60000000000000 if 64BIT config KASAN_SHADOW_OFFSET diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h index 125f5ecd9565..45091a9de0d4 100644 --- a/arch/riscv/include/asm/page.h +++ b/arch/riscv/include/asm/page.h @@ -24,7 +24,7 @@ * When not using MMU this corresponds to the first free page in * physical memory (aligned on a page boundary). 
*/ -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 #ifdef CONFIG_MMU #define PAGE_OFFSET kernel_map.page_offset #else @@ -38,7 +38,7 @@ #define PAGE_OFFSET_L3 _AC(0xffffffd600000000, UL) #else #define PAGE_OFFSET _AC(CONFIG_PAGE_OFFSET, UL) -#endif /* CONFIG_64BIT */ +#endif /* BITS_PER_LONG == 64 */ #ifndef __ASSEMBLY__ @@ -56,19 +56,24 @@ void clear_page(void *page); /* * Use struct definitions to apply C type checking */ +#if CONFIG_PGTABLE_LEVELS > 2 +typedef u64 ptval_t; +#else +typedef ulong ptval_t; +#endif /* Page Global Directory entry */ typedef struct { - unsigned long pgd; + ptval_t pgd; } pgd_t; /* Page Table entry */ typedef struct { - unsigned long pte; + ptval_t pte; } pte_t; typedef struct { - unsigned long pgprot; + ptval_t pgprot; } pgprot_t; typedef struct page *pgtable_t; @@ -81,13 +86,13 @@ typedef struct page *pgtable_t; #define __pgd(x) ((pgd_t) { (x) }) #define __pgprot(x) ((pgprot_t) { (x) }) -#ifdef CONFIG_64BIT -#define PTE_FMT "%016lx" +#if CONFIG_PGTABLE_LEVELS > 2 +#define PTE_FMT "%016llx" #else #define PTE_FMT "%08lx" #endif -#if defined(CONFIG_64BIT) && defined(CONFIG_MMU) +#if (CONFIG_PGTABLE_LEVELS > 2) && defined(CONFIG_MMU) /* * We override this value as its generic definition uses __pa too early in * the boot process (before kernel_map.va_pa_offset is set). 
@@ -128,7 +133,7 @@ extern unsigned long vmemmap_start_pfn; ((x) >= kernel_map.virt_addr && (x) < (kernel_map.virt_addr + kernel_map.size)) #define is_linear_mapping(x) \ - ((x) >= PAGE_OFFSET && (!IS_ENABLED(CONFIG_64BIT) || (x) < PAGE_OFFSET + KERN_VIRT_SIZE)) + ((x) >= PAGE_OFFSET && ((BITS_PER_LONG == 32) || (x) < PAGE_OFFSET + KERN_VIRT_SIZE)) #ifndef CONFIG_DEBUG_VIRTUAL #define linear_mapping_pa_to_va(x) ((void *)((unsigned long)(x) + kernel_map.va_pa_offset)) diff --git a/arch/riscv/include/asm/pgtable-64.h b/arch/riscv/include/asm/pgtable-64.h index 0897dd99ab8d..401c012d0b66 100644 --- a/arch/riscv/include/asm/pgtable-64.h +++ b/arch/riscv/include/asm/pgtable-64.h @@ -19,7 +19,12 @@ extern bool pgtable_l5_enabled; #define PGDIR_SHIFT (pgtable_l5_enabled ? PGDIR_SHIFT_L5 : \ (pgtable_l4_enabled ? PGDIR_SHIFT_L4 : PGDIR_SHIFT_L3)) /* Size of region mapped by a page global directory */ +#if BITS_PER_LONG == 64 #define PGDIR_SIZE (_AC(1, UL) << PGDIR_SHIFT) +#else +#define PGDIR_SIZE (_AC(1, ULL) << PGDIR_SHIFT) +#endif + #define PGDIR_MASK (~(PGDIR_SIZE - 1)) /* p4d is folded into pgd in case of 4-level page table */ @@ -28,7 +33,7 @@ extern bool pgtable_l5_enabled; #define P4D_SHIFT_L5 39 #define P4D_SHIFT (pgtable_l5_enabled ? P4D_SHIFT_L5 : \ (pgtable_l4_enabled ? 
P4D_SHIFT_L4 : P4D_SHIFT_L3)) -#define P4D_SIZE (_AC(1, UL) << P4D_SHIFT) +#define P4D_SIZE (_AC(1, ULL) << P4D_SHIFT) #define P4D_MASK (~(P4D_SIZE - 1)) /* pud is folded into pgd in case of 3-level page table */ @@ -43,7 +48,7 @@ extern bool pgtable_l5_enabled; /* Page 4th Directory entry */ typedef struct { - unsigned long p4d; + u64 p4d; } p4d_t; #define p4d_val(x) ((x).p4d) @@ -52,7 +57,7 @@ typedef struct { /* Page Upper Directory entry */ typedef struct { - unsigned long pud; + u64 pud; } pud_t; #define pud_val(x) ((x).pud) @@ -61,7 +66,7 @@ typedef struct { /* Page Middle Directory entry */ typedef struct { - unsigned long pmd; + u64 pmd; } pmd_t; #define pmd_val(x) ((x).pmd) @@ -74,7 +79,7 @@ typedef struct { * | 63 | 62 61 | 60 54 | 53 10 | 9 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 * N MT RSV PFN reserved for SW D A G U X W R V */ -#define _PAGE_PFN_MASK GENMASK(53, 10) +#define _PAGE_PFN_MASK GENMASK_ULL(53, 10) /* * [63] Svnapot definitions: @@ -82,7 +87,7 @@ typedef struct { * 1 Svnapot enabled */ #define _PAGE_NAPOT_SHIFT 63 -#define _PAGE_NAPOT BIT(_PAGE_NAPOT_SHIFT) +#define _PAGE_NAPOT BIT_ULL(_PAGE_NAPOT_SHIFT) /* * Only 64KB (order 4) napot ptes supported. 
*/ @@ -100,9 +105,9 @@ enum napot_cont_order { #define napot_cont_order(val) (__builtin_ctzl((val.pte >> _PAGE_PFN_SHIFT) << 1)) #define napot_cont_shift(order) ((order) + PAGE_SHIFT) -#define napot_cont_size(order) BIT(napot_cont_shift(order)) +#define napot_cont_size(order) BIT_ULL(napot_cont_shift(order)) #define napot_cont_mask(order) (~(napot_cont_size(order) - 1UL)) -#define napot_pte_num(order) BIT(order) +#define napot_pte_num(order) BIT_ULL(order) #ifdef CONFIG_RISCV_ISA_SVNAPOT #define HUGE_MAX_HSTATE (2 + (NAPOT_ORDER_MAX - NAPOT_CONT_ORDER_BASE)) @@ -118,8 +123,8 @@ enum napot_cont_order { * 10 - IO Non-cacheable, non-idempotent, strongly-ordered I/O memory * 11 - Rsvd Reserved for future standard use */ -#define _PAGE_NOCACHE_SVPBMT (1UL << 61) -#define _PAGE_IO_SVPBMT (1UL << 62) +#define _PAGE_NOCACHE_SVPBMT (1ULL << 61) +#define _PAGE_IO_SVPBMT (1ULL << 62) #define _PAGE_MTMASK_SVPBMT (_PAGE_NOCACHE_SVPBMT | _PAGE_IO_SVPBMT) /* @@ -133,10 +138,10 @@ enum napot_cont_order { * 01110 - PMA Weakly-ordered, Cacheable, Bufferable, Shareable, Non-trustable * 10010 - IO Strongly-ordered, Non-cacheable, Non-bufferable, Shareable, Non-trustable */ -#define _PAGE_PMA_THEAD ((1UL << 62) | (1UL << 61) | (1UL << 60)) -#define _PAGE_NOCACHE_THEAD ((1UL << 61) | (1UL << 60)) -#define _PAGE_IO_THEAD ((1UL << 63) | (1UL << 60)) -#define _PAGE_MTMASK_THEAD (_PAGE_PMA_THEAD | _PAGE_IO_THEAD | (1UL << 59)) +#define _PAGE_PMA_THEAD ((1ULL << 62) | (1ULL << 61) | (1ULL << 60)) +#define _PAGE_NOCACHE_THEAD ((1ULL << 61) | (1ULL << 60)) +#define _PAGE_IO_THEAD ((1ULL << 63) | (1ULL << 60)) +#define _PAGE_MTMASK_THEAD (_PAGE_PMA_THEAD | _PAGE_IO_THEAD | (1ULL << 59)) static inline u64 riscv_page_mtmask(void) { @@ -167,7 +172,7 @@ static inline u64 riscv_page_io(void) #define _PAGE_MTMASK riscv_page_mtmask() /* Set of bits to preserve across pte_modify() */ -#define _PAGE_CHG_MASK (~(unsigned long)(_PAGE_PRESENT | _PAGE_READ | \ +#define _PAGE_CHG_MASK (~(u64)(_PAGE_PRESENT | 
_PAGE_READ | \ _PAGE_WRITE | _PAGE_EXEC | \ _PAGE_USER | _PAGE_GLOBAL | \ _PAGE_MTMASK)) @@ -208,12 +213,12 @@ static inline void pud_clear(pud_t *pudp) set_pud(pudp, __pud(0)); } -static inline pud_t pfn_pud(unsigned long pfn, pgprot_t prot) +static inline pud_t pfn_pud(u64 pfn, pgprot_t prot) { return __pud((pfn << _PAGE_PFN_SHIFT) | pgprot_val(prot)); } -static inline unsigned long _pud_pfn(pud_t pud) +static inline u64 _pud_pfn(pud_t pud) { return __page_val_to_pfn(pud_val(pud)); } @@ -248,16 +253,16 @@ static inline bool mm_pud_folded(struct mm_struct *mm) #define pmd_index(addr) (((addr) >> PMD_SHIFT) & (PTRS_PER_PMD - 1)) -static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot) +static inline pmd_t pfn_pmd(u64 pfn, pgprot_t prot) { - unsigned long prot_val = pgprot_val(prot); + u64 prot_val = pgprot_val(prot); ALT_THEAD_PMA(prot_val); return __pmd((pfn << _PAGE_PFN_SHIFT) | prot_val); } -static inline unsigned long _pmd_pfn(pmd_t pmd) +static inline u64 _pmd_pfn(pmd_t pmd) { return __page_val_to_pfn(pmd_val(pmd)); } @@ -265,13 +270,13 @@ static inline unsigned long _pmd_pfn(pmd_t pmd) #define mk_pmd(page, prot) pfn_pmd(page_to_pfn(page), prot) #define pmd_ERROR(e) \ - pr_err("%s:%d: bad pmd %016lx.\n", __FILE__, __LINE__, pmd_val(e)) + pr_err("%s:%d: bad pmd " PTE_FMT ".\n", __FILE__, __LINE__, pmd_val(e)) #define pud_ERROR(e) \ - pr_err("%s:%d: bad pud %016lx.\n", __FILE__, __LINE__, pud_val(e)) + pr_err("%s:%d: bad pud " PTE_FMT ".\n", __FILE__, __LINE__, pud_val(e)) #define p4d_ERROR(e) \ - pr_err("%s:%d: bad p4d %016lx.\n", __FILE__, __LINE__, p4d_val(e)) + pr_err("%s:%d: bad p4d " PTE_FMT ".\n", __FILE__, __LINE__, p4d_val(e)) static inline void set_p4d(p4d_t *p4dp, p4d_t p4d) { @@ -311,12 +316,12 @@ static inline void p4d_clear(p4d_t *p4d) set_p4d(p4d, __p4d(0)); } -static inline p4d_t pfn_p4d(unsigned long pfn, pgprot_t prot) +static inline p4d_t pfn_p4d(u64 pfn, pgprot_t prot) { return __p4d((pfn << _PAGE_PFN_SHIFT) | pgprot_val(prot)); } 
-static inline unsigned long _p4d_pfn(p4d_t p4d) +static inline u64 _p4d_pfn(p4d_t p4d) { return __page_val_to_pfn(p4d_val(p4d)); } diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 050fdc49b5ad..5f1b48cb3311 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -9,6 +9,7 @@ #include #include +#include #include #ifndef CONFIG_MMU @@ -19,8 +20,13 @@ #define ADDRESS_SPACE_END (UL(-1)) #ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 /* Leave 2GB for kernel and BPF at the end of the address space */ #define KERNEL_LINK_ADDR (ADDRESS_SPACE_END - SZ_2G + 1) +#elif BITS_PER_LONG == 32 +/* Leave 64MB for kernel and BPF below PAGE_OFFSET */ +#define KERNEL_LINK_ADDR (PAGE_OFFSET - SZ_64M) +#endif #else #define KERNEL_LINK_ADDR PAGE_OFFSET #endif @@ -34,31 +40,45 @@ * Half of the kernel address space (1/4 of the entries of the page global * directory) is for the direct mapping. */ +#if (BITS_PER_LONG == 32) && (CONFIG_PGTABLE_LEVELS > 2) +#define KERN_VIRT_SIZE (PTRS_PER_PGD * PMD_SIZE) +#else #define KERN_VIRT_SIZE ((PTRS_PER_PGD / 2 * PGDIR_SIZE) / 2) +#endif #define VMALLOC_SIZE (KERN_VIRT_SIZE >> 1) +#if defined(CONFIG_64BIT) && (BITS_PER_LONG == 32) +#define VMALLOC_END MODULES_LOWEST_VADDR +#else #define VMALLOC_END PAGE_OFFSET -#define VMALLOC_START (PAGE_OFFSET - VMALLOC_SIZE) +#endif +#define VMALLOC_START (VMALLOC_END - VMALLOC_SIZE) #define BPF_JIT_REGION_SIZE (SZ_128M) -#ifdef CONFIG_64BIT #define BPF_JIT_REGION_START (BPF_JIT_REGION_END - BPF_JIT_REGION_SIZE) +#if BITS_PER_LONG == 64 #define BPF_JIT_REGION_END (MODULES_END) #else -#define BPF_JIT_REGION_START (PAGE_OFFSET - BPF_JIT_REGION_SIZE) #define BPF_JIT_REGION_END (VMALLOC_END) #endif /* Modules always live before the kernel */ -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 /* This is used to define the end of the KASAN shadow region */ #define MODULES_LOWEST_VADDR (KERNEL_LINK_ADDR - SZ_2G) #define MODULES_VADDR (PFN_ALIGN((unsigned 
long)&_end) - SZ_2G) #define MODULES_END (PFN_ALIGN((unsigned long)&_start)) #else +#ifdef CONFIG_64BIT +#define MODULES_LOWEST_VADDR (KERNEL_LINK_ADDR - SZ_64M) +#define MODULES_VADDR MODULES_LOWEST_VADDR +#define MODULES_END KERNEL_LINK_ADDR +#else +#define MODULES_LOWEST_VADDR VMALLOC_START #define MODULES_VADDR VMALLOC_START #define MODULES_END VMALLOC_END #endif +#endif /* * Roughly size the vmemmap space to be large enough to fit enough @@ -66,7 +86,7 @@ * position vmemmap directly below the VMALLOC region. */ #define VA_BITS_SV32 32 -#ifdef CONFIG_64BIT +#if defined(CONFIG_64BIT) && (BITS_PER_LONG == 64) #define VA_BITS_SV39 39 #define VA_BITS_SV48 48 #define VA_BITS_SV57 57 @@ -126,9 +146,14 @@ #define MMAP_VA_BITS_64 ((VA_BITS >= VA_BITS_SV48) ? VA_BITS_SV48 : VA_BITS) #define MMAP_MIN_VA_BITS_64 (VA_BITS_SV39) +#if BITS_PER_LONG == 64 #define MMAP_VA_BITS (is_compat_task() ? VA_BITS_SV32 : MMAP_VA_BITS_64) #define MMAP_MIN_VA_BITS (is_compat_task() ? VA_BITS_SV32 : MMAP_MIN_VA_BITS_64) #else +#define MMAP_VA_BITS VA_BITS_SV32 +#define MMAP_MIN_VA_BITS VA_BITS_SV32 +#endif +#else #include #endif /* CONFIG_64BIT */ @@ -252,7 +277,7 @@ static inline void pmd_clear(pmd_t *pmdp) static inline pgd_t pfn_pgd(unsigned long pfn, pgprot_t prot) { - unsigned long prot_val = pgprot_val(prot); + ptval_t prot_val = pgprot_val(prot); ALT_THEAD_PMA(prot_val); @@ -591,7 +616,11 @@ extern int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long a static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long address, pte_t *ptep) { +#if CONFIG_PGTABLE_LEVELS > 2 + pte_t pte = __pte(atomic64_xchg((atomic64_t *)ptep, 0)); +#else pte_t pte = __pte(atomic_long_xchg((atomic_long_t *)ptep, 0)); +#endif page_table_check_pte_clear(mm, pte); @@ -602,7 +631,11 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm, static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long address, pte_t *ptep) { +#if CONFIG_PGTABLE_LEVELS > 2 + 
atomic64_and(~(u64)_PAGE_WRITE, (atomic64_t *)ptep); +#else atomic_long_and(~(unsigned long)_PAGE_WRITE, (atomic_long_t *)ptep); +#endif } #define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH @@ -636,7 +669,7 @@ static inline pgprot_t pgprot_nx(pgprot_t _prot) #define pgprot_noncached pgprot_noncached static inline pgprot_t pgprot_noncached(pgprot_t _prot) { - unsigned long prot = pgprot_val(_prot); + ptval_t prot = pgprot_val(_prot); prot &= ~_PAGE_MTMASK; prot |= _PAGE_IO; @@ -647,7 +680,7 @@ static inline pgprot_t pgprot_noncached(pgprot_t _prot) #define pgprot_writecombine pgprot_writecombine static inline pgprot_t pgprot_writecombine(pgprot_t _prot) { - unsigned long prot = pgprot_val(_prot); + ptval_t prot = pgprot_val(_prot); prot &= ~_PAGE_MTMASK; prot |= _PAGE_NOCACHE; @@ -905,8 +938,12 @@ static inline pte_t pte_swp_clear_exclusive(pte_t pte) * and give the kernel the other (upper) half. */ #ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 #define KERN_VIRT_START (-(BIT(VA_BITS)) + TASK_SIZE) #else +#define KERN_VIRT_START TASK_SIZE_32 +#endif +#else #define KERN_VIRT_START FIXADDR_START #endif @@ -915,6 +952,7 @@ static inline pte_t pte_swp_clear_exclusive(pte_t pte) * Note that PGDIR_SIZE must evenly divide TASK_SIZE. * Task size is: * - 0x9fc00000 (~2.5GB) for RV32. + * - 0x80000000 ( 2GB) for RV32_COMPAT & RV64ILP32 * - 0x4000000000 ( 256GB) for RV64 using SV39 mmu * - 0x800000000000 ( 128TB) for RV64 using SV48 mmu * - 0x100000000000000 ( 64PB) for RV64 using SV57 mmu @@ -928,15 +966,19 @@ static inline pte_t pte_swp_clear_exclusive(pte_t pte) #ifdef CONFIG_64BIT #define TASK_SIZE_64 (PGDIR_SIZE * PTRS_PER_PGD / 2) #define TASK_SIZE_MAX LONG_MAX +#define TASK_SIZE_32 _AC(0x80000000, UL) +#if BITS_PER_LONG == 64 #ifdef CONFIG_COMPAT -#define TASK_SIZE_32 (_AC(0x80000000, UL) - PAGE_SIZE) #define TASK_SIZE (is_compat_task() ? 
\ TASK_SIZE_32 : TASK_SIZE_64) #else #define TASK_SIZE TASK_SIZE_64 #endif +#else +#define TASK_SIZE TASK_SIZE_32 +#endif #else #define TASK_SIZE FIXADDR_START #endif diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h index ca57a650c3d2..9f4e0be595fd 100644 --- a/arch/riscv/include/asm/processor.h +++ b/arch/riscv/include/asm/processor.h @@ -24,7 +24,7 @@ base; \ }) -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 #define DEFAULT_MAP_WINDOW (UL(1) << (MMAP_VA_BITS - 1)) #define STACK_TOP_MAX TASK_SIZE_64 #else diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c index f6b13e9f5e6c..ce1440c63606 100644 --- a/arch/riscv/kernel/cpu.c +++ b/arch/riscv/kernel/cpu.c @@ -291,9 +291,9 @@ static void print_mmu(struct seq_file *f) const char *sv_type; #ifdef CONFIG_MMU -#if defined(CONFIG_32BIT) +#if CONFIG_PGTABLE_LEVELS == 2 sv_type = "sv32"; -#elif defined(CONFIG_64BIT) +#else if (pgtable_l5_enabled) sv_type = "sv57"; else if (pgtable_l4_enabled) diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c index fcc23350610e..1e854e9633b3 100644 --- a/arch/riscv/mm/fault.c +++ b/arch/riscv/mm/fault.c @@ -40,25 +40,25 @@ static void show_pte(unsigned long addr) pgdp = pgd_offset(mm, addr); pgd = pgdp_get(pgdp); - pr_alert("[%016lx] pgd=%016lx", addr, pgd_val(pgd)); + pr_alert("[%016lx] pgd=" REG_FMT, addr, pgd_val(pgd)); if (pgd_none(pgd) || pgd_bad(pgd) || pgd_leaf(pgd)) goto out; p4dp = p4d_offset(pgdp, addr); p4d = p4dp_get(p4dp); - pr_cont(", p4d=%016lx", p4d_val(p4d)); + pr_cont(", p4d=" REG_FMT, p4d_val(p4d)); if (p4d_none(p4d) || p4d_bad(p4d) || p4d_leaf(p4d)) goto out; pudp = pud_offset(p4dp, addr); pud = pudp_get(pudp); - pr_cont(", pud=%016lx", pud_val(pud)); + pr_cont(", pud=" REG_FMT, pud_val(pud)); if (pud_none(pud) || pud_bad(pud) || pud_leaf(pud)) goto out; pmdp = pmd_offset(pudp, addr); pmd = pmdp_get(pmdp); - pr_cont(", pmd=%016lx", pmd_val(pmd)); + pr_cont(", pmd=" REG_FMT, pmd_val(pmd)); if (pmd_none(pmd) || 
pmd_bad(pmd) || pmd_leaf(pmd)) goto out; @@ -67,7 +67,7 @@ static void show_pte(unsigned long addr) goto out; pte = ptep_get(ptep); - pr_cont(", pte=%016lx", pte_val(pte)); + pr_cont(", pte=" REG_FMT, pte_val(pte)); pte_unmap(ptep); out: pr_cont("\n"); diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index 15b2eda4c364..3cdbb033860e 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -46,16 +46,20 @@ EXPORT_SYMBOL(kernel_map); #define kernel_map (*(struct kernel_mapping *)XIP_FIXUP(&kernel_map)) #endif -#ifdef CONFIG_64BIT +#if CONFIG_PGTABLE_LEVELS > 2 +#if BITS_PER_LONG == 64 u64 satp_mode __ro_after_init = !IS_ENABLED(CONFIG_XIP_KERNEL) ? SATP_MODE_57 : SATP_MODE_39; #else +u64 satp_mode __ro_after_init = SATP_MODE_39; +#endif +#else u64 satp_mode __ro_after_init = SATP_MODE_32; #endif EXPORT_SYMBOL(satp_mode); #ifdef CONFIG_64BIT -bool pgtable_l4_enabled __ro_after_init = !IS_ENABLED(CONFIG_XIP_KERNEL); -bool pgtable_l5_enabled __ro_after_init = !IS_ENABLED(CONFIG_XIP_KERNEL); +bool pgtable_l4_enabled __ro_after_init = !IS_ENABLED(CONFIG_XIP_KERNEL) && (BITS_PER_LONG == 64); +bool pgtable_l5_enabled __ro_after_init = !IS_ENABLED(CONFIG_XIP_KERNEL) && (BITS_PER_LONG == 64); EXPORT_SYMBOL(pgtable_l4_enabled); EXPORT_SYMBOL(pgtable_l5_enabled); #endif @@ -117,7 +121,7 @@ static inline void print_mlg(char *name, unsigned long b, unsigned long t) (((t) - (b)) >> LOG2_SZ_1G)); } -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 static inline void print_mlt(char *name, unsigned long b, unsigned long t) { pr_notice("%12s : 0x%08lx - 0x%08lx (%4ld TB)\n", name, b, t, @@ -131,7 +135,7 @@ static inline void print_ml(char *name, unsigned long b, unsigned long t) { unsigned long diff = t - b; - if (IS_ENABLED(CONFIG_64BIT) && (diff >> LOG2_SZ_1T) >= 10) + if ((BITS_PER_LONG == 64) && (diff >> LOG2_SZ_1T) >= 10) print_mlt(name, b, t); else if ((diff >> LOG2_SZ_1G) >= 10) print_mlg(name, b, t); @@ -164,7 +168,9 @@ static void __init print_vm_layout(void) 
#endif print_ml("kernel", (unsigned long)kernel_map.virt_addr, - (unsigned long)ADDRESS_SPACE_END); + (BITS_PER_LONG == 64) ? + (unsigned long)ADDRESS_SPACE_END : + (unsigned long)PAGE_OFFSET); } } #else @@ -173,7 +179,8 @@ static void print_vm_layout(void) { } void __init mem_init(void) { - bool swiotlb = max_pfn > PFN_DOWN(dma32_phys_limit); + bool swiotlb = (BITS_PER_LONG == 32) ? false: + (max_pfn > PFN_DOWN(dma32_phys_limit)); #ifdef CONFIG_FLATMEM BUG_ON(!mem_map); #endif /* CONFIG_FLATMEM */ @@ -319,7 +326,7 @@ static void __init setup_bootmem(void) memblock_reserve(dtb_early_pa, fdt_totalsize(dtb_early_va)); dma_contiguous_reserve(dma32_phys_limit); - if (IS_ENABLED(CONFIG_64BIT)) + if (BITS_PER_LONG == 64) hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT); } @@ -685,16 +692,26 @@ void __meminit create_pgd_mapping(pgd_t *pgdp, uintptr_t va, phys_addr_t pa, phy pgd_next_t *nextp; phys_addr_t next_phys; uintptr_t pgd_idx = pgd_index(va); +#if (CONFIG_PGTABLE_LEVELS > 2) && (BITS_PER_LONG == 32) + uintptr_t pgd_idh = pgd_index(sign_extend64((u64)va, 31)); +#endif if (sz == PGDIR_SIZE) { - if (pgd_val(pgdp[pgd_idx]) == 0) + if (pgd_val(pgdp[pgd_idx]) == 0) { pgdp[pgd_idx] = pfn_pgd(PFN_DOWN(pa), prot); +#if (CONFIG_PGTABLE_LEVELS > 2) && (BITS_PER_LONG == 32) + pgdp[pgd_idh] = pfn_pgd(PFN_DOWN(pa), prot); +#endif + } return; } if (pgd_val(pgdp[pgd_idx]) == 0) { next_phys = alloc_pgd_next(va); pgdp[pgd_idx] = pfn_pgd(PFN_DOWN(next_phys), PAGE_TABLE); +#if (CONFIG_PGTABLE_LEVELS > 2) && (BITS_PER_LONG == 32) + pgdp[pgd_idh] = pfn_pgd(PFN_DOWN(next_phys), PAGE_TABLE); +#endif nextp = get_pgd_next_virt(next_phys); memset(nextp, 0, PAGE_SIZE); } else { @@ -775,7 +792,7 @@ static __meminit pgprot_t pgprot_from_va(uintptr_t va) } #endif /* CONFIG_STRICT_KERNEL_RWX */ -#if defined(CONFIG_64BIT) && !defined(CONFIG_XIP_KERNEL) +#if (BITS_PER_LONG == 64) && !defined(CONFIG_XIP_KERNEL) u64 __pi_set_satp_mode_from_cmdline(uintptr_t dtb_pa); static void __init 
disable_pgtable_l5(void) @@ -981,8 +998,8 @@ static void __init create_fdt_early_page_table(uintptr_t fix_fdt_va, /* Make sure the fdt fixmap address is always aligned on PMD size */ BUILD_BUG_ON(FIX_FDT % (PMD_SIZE / PAGE_SIZE)); - /* In 32-bit only, the fdt lies in its own PGD */ - if (!IS_ENABLED(CONFIG_64BIT)) { + /* In Sv32 only, the fdt lies in its own PGD */ + if (CONFIG_PGTABLE_LEVELS == 2) { create_pgd_mapping(early_pg_dir, fix_fdt_va, pa, MAX_FDT_SIZE, PAGE_KERNEL); } else { @@ -1108,7 +1125,7 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa) kernel_map.virt_addr = KERNEL_LINK_ADDR + kernel_map.virt_offset; #ifdef CONFIG_XIP_KERNEL -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 kernel_map.page_offset = PAGE_OFFSET_L3; #else kernel_map.page_offset = _AC(CONFIG_PAGE_OFFSET, UL); @@ -1133,7 +1150,7 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa) kernel_map.va_kernel_pa_offset = kernel_map.virt_addr - kernel_map.phys_addr; #endif -#if defined(CONFIG_64BIT) && !defined(CONFIG_XIP_KERNEL) +#if (BITS_PER_LONG == 64) && !defined(CONFIG_XIP_KERNEL) set_satp_mode(dtb_pa); set_mmap_rnd_bits_max(); #endif @@ -1164,7 +1181,11 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa) * The last 4K bytes of the addressable memory can not be mapped because * of IS_ERR_VALUE macro. 
*/ +#if BITS_PER_LONG == 64 BUG_ON((kernel_map.virt_addr + kernel_map.size) > ADDRESS_SPACE_END - SZ_4K); +#else + BUG_ON((kernel_map.virt_addr + kernel_map.size) > PAGE_OFFSET - SZ_4K); +#endif #endif #ifdef CONFIG_RELOCATABLE @@ -1246,7 +1267,7 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa) fix_bmap_epmd = fixmap_pmd[pmd_index(__fix_to_virt(FIX_BTMAP_END))]; if (pmd_val(fix_bmap_spmd) != pmd_val(fix_bmap_epmd)) { WARN_ON(1); - pr_warn("fixmap btmap start [%08lx] != end [%08lx]\n", + pr_warn("fixmap btmap start [" PTE_FMT "] != end [" PTE_FMT "]\n", pmd_val(fix_bmap_spmd), pmd_val(fix_bmap_epmd)); pr_warn("fix_to_virt(FIX_BTMAP_BEGIN): %08lx\n", fix_to_virt(FIX_BTMAP_BEGIN)); @@ -1336,7 +1357,7 @@ static void __init create_linear_mapping_page_table(void) static void __init setup_vm_final(void) { /* Setup swapper PGD for fixmap */ -#if !defined(CONFIG_64BIT) +#if CONFIG_PGTABLE_LEVELS == 2 /* * In 32-bit, the device tree lies in a pgd entry, so it must be copied * directly in swapper_pg_dir in addition to the pgd entry that points @@ -1354,7 +1375,7 @@ static void __init setup_vm_final(void) create_linear_mapping_page_table(); /* Map the kernel */ - if (IS_ENABLED(CONFIG_64BIT)) + if (CONFIG_PGTABLE_LEVELS > 2) create_kernel_page_table(swapper_pg_dir, false); #ifdef CONFIG_KASAN diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c index d815448758a1..45927f713cb9 100644 --- a/arch/riscv/mm/pageattr.c +++ b/arch/riscv/mm/pageattr.c @@ -15,10 +15,10 @@ struct pageattr_masks { pgprot_t clear_mask; }; -static unsigned long set_pageattr_masks(unsigned long val, struct mm_walk *walk) +static unsigned long set_pageattr_masks(ptval_t val, struct mm_walk *walk) { struct pageattr_masks *masks = walk->private; - unsigned long new_val = val; + ptval_t new_val = val; new_val &= ~(pgprot_val(masks->clear_mask)); new_val |= (pgprot_val(masks->set_mask)); diff --git a/arch/riscv/mm/pgtable.c b/arch/riscv/mm/pgtable.c index 4ae67324f992..564679b4c48e 100644 --- 
a/arch/riscv/mm/pgtable.c +++ b/arch/riscv/mm/pgtable.c @@ -37,7 +37,7 @@ int ptep_test_and_clear_young(struct vm_area_struct *vma, { if (!pte_young(ptep_get(ptep))) return 0; - return test_and_clear_bit(_PAGE_ACCESSED_OFFSET, &pte_val(*ptep)); + return test_and_clear_bit(_PAGE_ACCESSED_OFFSET, (unsigned long *)&pte_val(*ptep)); } EXPORT_SYMBOL_GPL(ptep_test_and_clear_young); -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:15:57 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:15:57 -0400 Subject: [RFC PATCH V3 16/43] rv64ilp32_abi: riscv: Support physical addresses >= 0x80000000 In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-17-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The RV64ILP32 ABI has two independent 2GiB address spaces: - 0 ~ 0x7fffffff - 0xffffffff80000000 ~ 0xffffffffffffffff In the rv64ilp32 ABI, the address 0x80000000 is illegal because it lies in neither range; hence, use a temporary trap handler to zero-extend the address for jalr, load and store operations.
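Illustration (not part of the patch): the zero-extension the handler applies is the RV64 shift pair "slli rd, rs, 32; srli rd, rd, 32". The sketch below models that pair in C on a 64-bit register value; the function name zext_w is an assumption chosen to match the "zext.w" comments in the assembly, not a kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "slli rd, rs, 32; srli rd, rd, 32" on a 64-bit register. */
static uint64_t zext_w(uint64_t reg)
{
	reg <<= 32;	/* slli: shift the upper 32 bits out of the register */
	reg >>= 32;	/* srli: logical shift back, filling with zeros */
	return reg;
}
```

With this model, a sign-extended value such as 0xffffffff80000000 becomes 0x0000000080000000, while a value below 0x80000000 is left unchanged.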
Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- arch/riscv/kernel/head.S | 112 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 112 insertions(+) diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S index e55a92be12b1..bd2f30aa6d01 100644 --- a/arch/riscv/kernel/head.S +++ b/arch/riscv/kernel/head.S @@ -170,6 +170,118 @@ secondary_start_sbi: .align 2 .Lsecondary_park: +#ifdef CONFIG_ABI_RV64ILP32 +.option push +.option norelax +.option norvc + addiw sp, sp, -32 + + /* zext.w sp */ + slli sp, sp, 32 + srli sp, sp, 32 + + /* zext.w ra */ + slli ra, ra, 32 + srli ra, ra, 32 + + /* zext.w fp */ + slli fp, fp, 32 + srli fp, fp, 32 + + /* zext.w tp */ + slli tp, tp, 32 + srli tp, tp, 32 + + /* save tmp reg */ + REG_S ra, 24(sp) + REG_S fp, 16(sp) + REG_S tp, 8(sp) + REG_S gp, 0(sp) + + /* zext.w epc */ + csrr ra, CSR_EPC + slli ra, ra, 32 + srli ra, ra, 32 + csrw CSR_SEPC, ra + + csrr gp, CSR_CAUSE + + /* EXC_INST_ACCESS */ + addiw fp, gp, -1 + beqz fp, 6f + + /* EXC_LOAD_ACCESS */ + addiw fp, gp, -5 + beqz fp, 1f + + /* EXC_STORE_ACCESS */ + addiw fp, gp, -7 + beqz fp, 1f + + j 7f +1: + /* get inst */ + lw ra, 0(ra) + andi gp, ra, 0x3 + + /* c.(lw/sw/ld/sd)sp */ + addiw fp, gp, -2 + beqz fp, 6f + + /* lw/sw/ld/sd */ + addiw fp, gp, -3 + beqz fp, 2f + + /* c.(lw/sw/ld/sd) */ + li fp, 0x7 + slli fp, fp, 7 + and ra, fp, ra + slli ra, ra, 8 + j 3f + +2: + /* get rs1 */ + li fp, 0x1f + slli fp, fp, 15 + and ra, fp, ra + +3: + /* copy rs1 to rd */ + mv fp, ra + srli fp, fp, 8 + or ra, fp, ra + + /* modify slli */ + la fp, 4f + lw tp, 0(fp) + mv gp, tp + or tp, ra, tp + sw tp, 0(fp) + fence.i +4: slli x0, x0, 32 + sw gp, 0(fp) + + /* modify srli */ + la fp, 5f + lw tp, 0(fp) + mv gp, tp + or tp, ra, tp + sw tp, 0(fp) + fence.i +5: srli x0, x0, 32 + sw gp, 0(fp) + +6: + /* restore tmp reg */ + REG_L ra, 24(sp) + REG_L fp, 16(sp) + REG_L tp, 8(sp) + REG_L gp, 0(sp) + addi sp, sp, 32 + sret +.option pop +7: +#endif /* * Park this hart if we: * - have too many 
harts on CONFIG_RISCV_BOOT_SPINWAIT -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:15:58 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:15:58 -0400 Subject: [RFC PATCH V3 17/43] rv64ilp32_abi: riscv: Adapt kasan memory layout In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-18-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" For generic KASAN, each memory granule is 8 bytes, so the shadow needs 1/8 of the address space. The kernel space is 2GiB in rv64ilp32, so we need a 256MiB range (0x80000000 ~ 0x90000000), and the offset is 0x70000000 for the whole 4GiB address space. Virtual kernel memory layout: fixmap : 0x90a00000 - 0x90ffffff (6144 kB) pci io : 0x91000000 - 0x91ffffff ( 16 MB) vmemmap : 0x92000000 - 0x93ffffff ( 32 MB) vmalloc : 0x94000000 - 0xb3ffffff ( 512 MB) modules : 0xb4000000 - 0xb7ffffff ( 64 MB) lowmem : 0xc0000000 - 0xc7ffffff ( 128 MB) kasan : 0x80000000 - 0x8fffffff ( 256 MB) <= kernel : 0xb8000000 - 0xbfffffff ( 128 MB) Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- arch/riscv/include/asm/kasan.h | 6 +++++- arch/riscv/mm/kasan_init.c | 2 +- 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/arch/riscv/include/asm/kasan.h b/arch/riscv/include/asm/kasan.h index e6a0071bdb56..dd3a211bc5d0 100644 --- a/arch/riscv/include/asm/kasan.h +++ b/arch/riscv/include/asm/kasan.h @@ -21,7 +21,7 @@ * [KASAN_SHADOW_OFFSET, KASAN_SHADOW_END) cover all 64-bits of virtual * addresses. So KASAN_SHADOW_OFFSET should satisfy the following equation: * KASAN_SHADOW_OFFSET = KASAN_SHADOW_END - - * (1ULL << (64 - KASAN_SHADOW_SCALE_SHIFT)) + * (1ULL << (BITS_PER_LONG - KASAN_SHADOW_SCALE_SHIFT)) */ #define KASAN_SHADOW_SCALE_SHIFT 3 @@ -31,7 +31,11 @@ * aligned on PGDIR_SIZE, so force its alignment to ease its population.
*/ #define KASAN_SHADOW_START ((KASAN_SHADOW_END - KASAN_SHADOW_SIZE) & PGDIR_MASK) +#if defined(CONFIG_64BIT) && (BITS_PER_LONG == 32) +#define KASAN_SHADOW_END 0x90000000UL +#else #define KASAN_SHADOW_END MODULES_LOWEST_VADDR +#endif #ifdef CONFIG_KASAN #define KASAN_SHADOW_OFFSET _AC(CONFIG_KASAN_SHADOW_OFFSET, UL) diff --git a/arch/riscv/mm/kasan_init.c b/arch/riscv/mm/kasan_init.c index 41c635d6aca4..1e864598779a 100644 --- a/arch/riscv/mm/kasan_init.c +++ b/arch/riscv/mm/kasan_init.c @@ -324,7 +324,7 @@ asmlinkage void __init kasan_early_init(void) uintptr_t i; BUILD_BUG_ON(KASAN_SHADOW_OFFSET != - KASAN_SHADOW_END - (1UL << (64 - KASAN_SHADOW_SCALE_SHIFT))); + KASAN_SHADOW_END - (1UL << (BITS_PER_LONG - KASAN_SHADOW_SCALE_SHIFT))); for (i = 0; i < PTRS_PER_PTE; ++i) set_pte(kasan_early_shadow_pte + i, -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:15:59 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:15:59 -0400 Subject: [RFC PATCH V3 18/43] rv64ilp32_abi: riscv: kvm: Initial support In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-19-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" This is the initial KVM support for the rv64ilp32 ABI; it has not yet passed the KVM selftests. It can support rv64ilp32 and rv64lp64 Linux guest kernels.
Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- arch/riscv/include/asm/kvm_aia.h | 32 ++--- arch/riscv/include/asm/kvm_host.h | 192 ++++++++++++------------- arch/riscv/include/asm/kvm_nacl.h | 26 ++-- arch/riscv/include/asm/kvm_vcpu_insn.h | 4 +- arch/riscv/include/asm/kvm_vcpu_pmu.h | 8 +- arch/riscv/include/asm/kvm_vcpu_sbi.h | 4 +- arch/riscv/include/asm/sbi.h | 10 +- arch/riscv/include/uapi/asm/kvm.h | 56 ++++---- arch/riscv/kvm/aia.c | 26 ++-- arch/riscv/kvm/aia_imsic.c | 6 +- arch/riscv/kvm/main.c | 2 +- arch/riscv/kvm/mmu.c | 10 +- arch/riscv/kvm/tlb.c | 76 +++++----- arch/riscv/kvm/vcpu.c | 10 +- arch/riscv/kvm/vcpu_exit.c | 4 +- arch/riscv/kvm/vcpu_insn.c | 12 +- arch/riscv/kvm/vcpu_onereg.c | 18 +-- arch/riscv/kvm/vcpu_pmu.c | 8 +- arch/riscv/kvm/vcpu_sbi_base.c | 2 +- arch/riscv/kvm/vmid.c | 4 +- 20 files changed, 256 insertions(+), 254 deletions(-) diff --git a/arch/riscv/include/asm/kvm_aia.h b/arch/riscv/include/asm/kvm_aia.h index 1f37b600ca47..d7dae9128b5e 100644 --- a/arch/riscv/include/asm/kvm_aia.h +++ b/arch/riscv/include/asm/kvm_aia.h @@ -50,13 +50,13 @@ struct kvm_aia { }; struct kvm_vcpu_aia_csr { - unsigned long vsiselect; - unsigned long hviprio1; - unsigned long hviprio2; - unsigned long vsieh; - unsigned long hviph; - unsigned long hviprio1h; - unsigned long hviprio2h; + xlen_t vsiselect; + xlen_t hviprio1; + xlen_t hviprio2; + xlen_t vsieh; + xlen_t hviph; + xlen_t hviprio1h; + xlen_t hviprio2h; }; struct kvm_vcpu_aia { @@ -95,8 +95,8 @@ int kvm_riscv_vcpu_aia_imsic_update(struct kvm_vcpu *vcpu); #define KVM_RISCV_AIA_IMSIC_TOPEI (ISELECT_MASK + 1) int kvm_riscv_vcpu_aia_imsic_rmw(struct kvm_vcpu *vcpu, unsigned long isel, - unsigned long *val, unsigned long new_val, - unsigned long wr_mask); + xlen_t *val, xlen_t new_val, + xlen_t wr_mask); int kvm_riscv_aia_imsic_rw_attr(struct kvm *kvm, unsigned long type, bool write, unsigned long *val); int kvm_riscv_aia_imsic_has_attr(struct kvm *kvm, unsigned long type); @@ -131,19 +131,19 @@ 
void kvm_riscv_vcpu_aia_load(struct kvm_vcpu *vcpu, int cpu); void kvm_riscv_vcpu_aia_put(struct kvm_vcpu *vcpu); int kvm_riscv_vcpu_aia_get_csr(struct kvm_vcpu *vcpu, unsigned long reg_num, - unsigned long *out_val); + xlen_t *out_val); int kvm_riscv_vcpu_aia_set_csr(struct kvm_vcpu *vcpu, unsigned long reg_num, - unsigned long val); + xlen_t val); int kvm_riscv_vcpu_aia_rmw_topei(struct kvm_vcpu *vcpu, unsigned int csr_num, - unsigned long *val, - unsigned long new_val, - unsigned long wr_mask); + xlen_t *val, + xlen_t new_val, + xlen_t wr_mask); int kvm_riscv_vcpu_aia_rmw_ireg(struct kvm_vcpu *vcpu, unsigned int csr_num, - unsigned long *val, unsigned long new_val, - unsigned long wr_mask); + xlen_t *val, xlen_t new_val, + xlen_t wr_mask); #define KVM_RISCV_VCPU_AIA_CSR_FUNCS \ { .base = CSR_SIREG, .count = 1, .func = kvm_riscv_vcpu_aia_rmw_ireg }, \ { .base = CSR_STOPEI, .count = 1, .func = kvm_riscv_vcpu_aia_rmw_topei }, diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h index cc33e35cd628..166cae2c74cf 100644 --- a/arch/riscv/include/asm/kvm_host.h +++ b/arch/riscv/include/asm/kvm_host.h @@ -64,8 +64,8 @@ enum kvm_riscv_hfence_type { struct kvm_riscv_hfence { enum kvm_riscv_hfence_type type; - unsigned long asid; - unsigned long order; + xlen_t asid; + xlen_t order; gpa_t addr; gpa_t size; }; @@ -102,8 +102,8 @@ struct kvm_vmid { * Writes to vmid_version and vmid happen with vmid_lock held * whereas reads happen without any lock held. 
*/ - unsigned long vmid_version; - unsigned long vmid; + xlen_t vmid_version; + xlen_t vmid; }; struct kvm_arch { @@ -122,75 +122,75 @@ struct kvm_arch { }; struct kvm_cpu_trap { - unsigned long sepc; - unsigned long scause; - unsigned long stval; - unsigned long htval; - unsigned long htinst; + xlen_t sepc; + xlen_t scause; + xlen_t stval; + xlen_t htval; + xlen_t htinst; }; struct kvm_cpu_context { - unsigned long zero; - unsigned long ra; - unsigned long sp; - unsigned long gp; - unsigned long tp; - unsigned long t0; - unsigned long t1; - unsigned long t2; - unsigned long s0; - unsigned long s1; - unsigned long a0; - unsigned long a1; - unsigned long a2; - unsigned long a3; - unsigned long a4; - unsigned long a5; - unsigned long a6; - unsigned long a7; - unsigned long s2; - unsigned long s3; - unsigned long s4; - unsigned long s5; - unsigned long s6; - unsigned long s7; - unsigned long s8; - unsigned long s9; - unsigned long s10; - unsigned long s11; - unsigned long t3; - unsigned long t4; - unsigned long t5; - unsigned long t6; - unsigned long sepc; - unsigned long sstatus; - unsigned long hstatus; + xlen_t zero; + xlen_t ra; + xlen_t sp; + xlen_t gp; + xlen_t tp; + xlen_t t0; + xlen_t t1; + xlen_t t2; + xlen_t s0; + xlen_t s1; + xlen_t a0; + xlen_t a1; + xlen_t a2; + xlen_t a3; + xlen_t a4; + xlen_t a5; + xlen_t a6; + xlen_t a7; + xlen_t s2; + xlen_t s3; + xlen_t s4; + xlen_t s5; + xlen_t s6; + xlen_t s7; + xlen_t s8; + xlen_t s9; + xlen_t s10; + xlen_t s11; + xlen_t t3; + xlen_t t4; + xlen_t t5; + xlen_t t6; + xlen_t sepc; + xlen_t sstatus; + xlen_t hstatus; union __riscv_fp_state fp; struct __riscv_v_ext_state vector; }; struct kvm_vcpu_csr { - unsigned long vsstatus; - unsigned long vsie; - unsigned long vstvec; - unsigned long vsscratch; - unsigned long vsepc; - unsigned long vscause; - unsigned long vstval; - unsigned long hvip; - unsigned long vsatp; - unsigned long scounteren; - unsigned long senvcfg; + xlen_t vsstatus; + xlen_t vsie; + xlen_t vstvec; + 
xlen_t vsscratch; + xlen_t vsepc; + xlen_t vscause; + xlen_t vstval; + xlen_t hvip; + xlen_t vsatp; + xlen_t scounteren; + xlen_t senvcfg; }; struct kvm_vcpu_config { u64 henvcfg; u64 hstateen0; - unsigned long hedeleg; + xlen_t hedeleg; }; struct kvm_vcpu_smstateen_csr { - unsigned long sstateen0; + xlen_t sstateen0; }; struct kvm_vcpu_arch { @@ -204,16 +204,16 @@ struct kvm_vcpu_arch { DECLARE_BITMAP(isa, RISCV_ISA_EXT_MAX); /* Vendor, Arch, and Implementation details */ - unsigned long mvendorid; - unsigned long marchid; - unsigned long mimpid; + xlen_t mvendorid; + xlen_t marchid; + xlen_t mimpid; /* SSCRATCH, STVEC, and SCOUNTEREN of Host */ - unsigned long host_sscratch; - unsigned long host_stvec; - unsigned long host_scounteren; - unsigned long host_senvcfg; - unsigned long host_sstateen0; + xlen_t host_sscratch; + xlen_t host_stvec; + xlen_t host_scounteren; + xlen_t host_senvcfg; + xlen_t host_sstateen0; /* CPU context of Host */ struct kvm_cpu_context host_context; @@ -252,8 +252,8 @@ struct kvm_vcpu_arch { /* HFENCE request queue */ spinlock_t hfence_lock; - unsigned long hfence_head; - unsigned long hfence_tail; + xlen_t hfence_head; + xlen_t hfence_tail; struct kvm_riscv_hfence hfence_queue[KVM_RISCV_VCPU_MAX_HFENCE]; /* MMIO instruction details */ @@ -305,24 +305,24 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {} #define KVM_RISCV_GSTAGE_TLB_MIN_ORDER 12 -void kvm_riscv_local_hfence_gvma_vmid_gpa(unsigned long vmid, +void kvm_riscv_local_hfence_gvma_vmid_gpa(xlen_t vmid, gpa_t gpa, gpa_t gpsz, - unsigned long order); -void kvm_riscv_local_hfence_gvma_vmid_all(unsigned long vmid); + xlen_t order); +void kvm_riscv_local_hfence_gvma_vmid_all(xlen_t vmid); void kvm_riscv_local_hfence_gvma_gpa(gpa_t gpa, gpa_t gpsz, - unsigned long order); + xlen_t order); void kvm_riscv_local_hfence_gvma_all(void); -void kvm_riscv_local_hfence_vvma_asid_gva(unsigned long vmid, - unsigned long asid, - unsigned long gva, - unsigned long gvsz, - unsigned long 
order); -void kvm_riscv_local_hfence_vvma_asid_all(unsigned long vmid, - unsigned long asid); -void kvm_riscv_local_hfence_vvma_gva(unsigned long vmid, - unsigned long gva, unsigned long gvsz, - unsigned long order); -void kvm_riscv_local_hfence_vvma_all(unsigned long vmid); +void kvm_riscv_local_hfence_vvma_asid_gva(xlen_t vmid, + xlen_t asid, + xlen_t gva, + xlen_t gvsz, + xlen_t order); +void kvm_riscv_local_hfence_vvma_asid_all(xlen_t vmid, + xlen_t asid); +void kvm_riscv_local_hfence_vvma_gva(xlen_t vmid, + xlen_t gva, xlen_t gvsz, + xlen_t order); +void kvm_riscv_local_hfence_vvma_all(xlen_t vmid); void kvm_riscv_local_tlb_sanitize(struct kvm_vcpu *vcpu); @@ -332,26 +332,26 @@ void kvm_riscv_hfence_vvma_all_process(struct kvm_vcpu *vcpu); void kvm_riscv_hfence_process(struct kvm_vcpu *vcpu); void kvm_riscv_fence_i(struct kvm *kvm, - unsigned long hbase, unsigned long hmask); + xlen_t hbase, xlen_t hmask); void kvm_riscv_hfence_gvma_vmid_gpa(struct kvm *kvm, - unsigned long hbase, unsigned long hmask, + xlen_t hbase, xlen_t hmask, gpa_t gpa, gpa_t gpsz, - unsigned long order); + xlen_t order); void kvm_riscv_hfence_gvma_vmid_all(struct kvm *kvm, - unsigned long hbase, unsigned long hmask); + xlen_t hbase, xlen_t hmask); void kvm_riscv_hfence_vvma_asid_gva(struct kvm *kvm, - unsigned long hbase, unsigned long hmask, - unsigned long gva, unsigned long gvsz, - unsigned long order, unsigned long asid); + xlen_t hbase, xlen_t hmask, + xlen_t gva, xlen_t gvsz, + xlen_t order, xlen_t asid); void kvm_riscv_hfence_vvma_asid_all(struct kvm *kvm, - unsigned long hbase, unsigned long hmask, - unsigned long asid); + xlen_t hbase, xlen_t hmask, + xlen_t asid); void kvm_riscv_hfence_vvma_gva(struct kvm *kvm, - unsigned long hbase, unsigned long hmask, - unsigned long gva, unsigned long gvsz, - unsigned long order); + xlen_t hbase, xlen_t hmask, + xlen_t gva, xlen_t gvsz, + xlen_t order); void kvm_riscv_hfence_vvma_all(struct kvm *kvm, - unsigned long hbase, unsigned long 
hmask); + xlen_t hbase, xlen_t hmask); int kvm_riscv_gstage_ioremap(struct kvm *kvm, gpa_t gpa, phys_addr_t hpa, unsigned long size, @@ -369,7 +369,7 @@ unsigned long __init kvm_riscv_gstage_mode(void); int kvm_riscv_gstage_gpa_bits(void); void __init kvm_riscv_gstage_vmid_detect(void); -unsigned long kvm_riscv_gstage_vmid_bits(void); +xlen_t kvm_riscv_gstage_vmid_bits(void); int kvm_riscv_gstage_vmid_init(struct kvm *kvm); bool kvm_riscv_gstage_vmid_ver_changed(struct kvm_vmid *vmid); void kvm_riscv_gstage_vmid_update(struct kvm_vcpu *vcpu); diff --git a/arch/riscv/include/asm/kvm_nacl.h b/arch/riscv/include/asm/kvm_nacl.h index 4124d5e06a0f..59be64c068fc 100644 --- a/arch/riscv/include/asm/kvm_nacl.h +++ b/arch/riscv/include/asm/kvm_nacl.h @@ -68,26 +68,26 @@ int kvm_riscv_nacl_init(void); #define nacl_shmem() \ this_cpu_ptr(&kvm_riscv_nacl)->shmem -#define nacl_scratch_read_long(__shmem, __offset) \ +#define nacl_scratch_read_csr(__shmem, __offset) \ ({ \ - unsigned long *__p = (__shmem) + \ + xlen_t *__p = (__shmem) + \ SBI_NACL_SHMEM_SCRATCH_OFFSET + \ (__offset); \ lelong_to_cpu(*__p); \ }) -#define nacl_scratch_write_long(__shmem, __offset, __val) \ +#define nacl_scratch_write_csr(__shmem, __offset, __val) \ do { \ - unsigned long *__p = (__shmem) + \ + xlen_t *__p = (__shmem) + \ SBI_NACL_SHMEM_SCRATCH_OFFSET + \ (__offset); \ *__p = cpu_to_lelong(__val); \ } while (0) -#define nacl_scratch_write_longs(__shmem, __offset, __array, __count) \ +#define nacl_scratch_write_csrs(__shmem, __offset, __array, __count) \ do { \ unsigned int __i; \ - unsigned long *__p = (__shmem) + \ + xlen_t *__p = (__shmem) + \ SBI_NACL_SHMEM_SCRATCH_OFFSET + \ (__offset); \ for (__i = 0; __i < (__count); __i++) \ @@ -100,7 +100,7 @@ do { \ #define nacl_hfence_mkconfig(__type, __order, __vmid, __asid) \ ({ \ - unsigned long __c = SBI_NACL_SHMEM_HFENCE_CONFIG_PEND; \ + xlen_t __c = SBI_NACL_SHMEM_HFENCE_CONFIG_PEND; \ __c |= ((__type) & SBI_NACL_SHMEM_HFENCE_CONFIG_TYPE_MASK) \ << 
SBI_NACL_SHMEM_HFENCE_CONFIG_TYPE_SHIFT; \ __c |= (((__order) - SBI_NACL_SHMEM_HFENCE_ORDER_BASE) & \ @@ -168,7 +168,7 @@ __kvm_riscv_nacl_hfence(__shmem, \ #define nacl_csr_read(__shmem, __csr) \ ({ \ - unsigned long *__a = (__shmem) + SBI_NACL_SHMEM_CSR_OFFSET; \ + xlen_t *__a = (__shmem) + SBI_NACL_SHMEM_CSR_OFFSET; \ lelong_to_cpu(__a[SBI_NACL_SHMEM_CSR_INDEX(__csr)]); \ }) @@ -176,7 +176,7 @@ __kvm_riscv_nacl_hfence(__shmem, \ do { \ void *__s = (__shmem); \ unsigned int __i = SBI_NACL_SHMEM_CSR_INDEX(__csr); \ - unsigned long *__a = (__s) + SBI_NACL_SHMEM_CSR_OFFSET; \ + xlen_t *__a = (__s) + SBI_NACL_SHMEM_CSR_OFFSET; \ u8 *__b = (__s) + SBI_NACL_SHMEM_DBITMAP_OFFSET; \ __a[__i] = cpu_to_lelong(__val); \ __b[__i >> 3] |= 1U << (__i & 0x7); \ @@ -186,9 +186,9 @@ do { \ ({ \ void *__s = (__shmem); \ unsigned int __i = SBI_NACL_SHMEM_CSR_INDEX(__csr); \ - unsigned long *__a = (__s) + SBI_NACL_SHMEM_CSR_OFFSET; \ + xlen_t *__a = (__s) + SBI_NACL_SHMEM_CSR_OFFSET; \ u8 *__b = (__s) + SBI_NACL_SHMEM_DBITMAP_OFFSET; \ - unsigned long __r = lelong_to_cpu(__a[__i]); \ + xlen_t __r = lelong_to_cpu(__a[__i]); \ __a[__i] = cpu_to_lelong(__val); \ __b[__i >> 3] |= 1U << (__i & 0x7); \ __r; \ @@ -210,7 +210,7 @@ do { \ #define ncsr_read(__csr) \ ({ \ - unsigned long __r; \ + xlen_t __r; \ if (kvm_riscv_nacl_available()) \ __r = nacl_csr_read(nacl_shmem(), __csr); \ else \ @@ -228,7 +228,7 @@ do { \ #define ncsr_swap(__csr, __val) \ ({ \ - unsigned long __r; \ + xlen_t __r; \ if (kvm_riscv_nacl_sync_csr_available()) \ __r = nacl_csr_swap(nacl_shmem(), __csr, __val); \ else \ diff --git a/arch/riscv/include/asm/kvm_vcpu_insn.h b/arch/riscv/include/asm/kvm_vcpu_insn.h index 350011c83581..a0da75683894 100644 --- a/arch/riscv/include/asm/kvm_vcpu_insn.h +++ b/arch/riscv/include/asm/kvm_vcpu_insn.h @@ -11,7 +11,7 @@ struct kvm_run; struct kvm_cpu_trap; struct kvm_mmio_decode { - unsigned long insn; + xlen_t insn; int insn_len; int len; int shift; @@ -19,7 +19,7 @@ struct 
kvm_mmio_decode { }; struct kvm_csr_decode { - unsigned long insn; + xlen_t insn; int return_handled; }; diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h index 1d85b6617508..e69b102bde49 100644 --- a/arch/riscv/include/asm/kvm_vcpu_pmu.h +++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h @@ -74,8 +74,8 @@ struct kvm_pmu { int kvm_riscv_vcpu_pmu_incr_fw(struct kvm_vcpu *vcpu, unsigned long fid); int kvm_riscv_vcpu_pmu_read_hpm(struct kvm_vcpu *vcpu, unsigned int csr_num, - unsigned long *val, unsigned long new_val, - unsigned long wr_mask); + xlen_t *val, xlen_t new_val, + xlen_t wr_mask); int kvm_riscv_vcpu_pmu_num_ctrs(struct kvm_vcpu *vcpu, struct kvm_vcpu_sbi_return *retdata); int kvm_riscv_vcpu_pmu_ctr_info(struct kvm_vcpu *vcpu, unsigned long cidx, @@ -106,8 +106,8 @@ struct kvm_pmu { }; static inline int kvm_riscv_vcpu_pmu_read_legacy(struct kvm_vcpu *vcpu, unsigned int csr_num, - unsigned long *val, unsigned long new_val, - unsigned long wr_mask) + xlen_t *val, xlen_t new_val, + xlen_t wr_mask) { if (csr_num == CSR_CYCLE || csr_num == CSR_INSTRET) { *val = 0; diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h index 4ed6203cdd30..83d786111450 100644 --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h @@ -27,8 +27,8 @@ struct kvm_vcpu_sbi_context { }; struct kvm_vcpu_sbi_return { - unsigned long out_val; - unsigned long err_val; + xlen_t out_val; + xlen_t err_val; struct kvm_cpu_trap *utrap; bool uexit; }; diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h index fd9a9c723ec6..df73a0eb231b 100644 --- a/arch/riscv/include/asm/sbi.h +++ b/arch/riscv/include/asm/sbi.h @@ -343,7 +343,7 @@ enum sbi_ext_nacl_feature { #define SBI_NACL_SHMEM_HFENCE_CONFIG_PEND_SHIFT \ (__riscv_xlen - SBI_NACL_SHMEM_HFENCE_CONFIG_PEND_BITS) #define SBI_NACL_SHMEM_HFENCE_CONFIG_PEND_MASK \ - ((1UL << SBI_NACL_SHMEM_HFENCE_CONFIG_PEND_BITS) - 1) + ((_AC(1, 
UXL) << SBI_NACL_SHMEM_HFENCE_CONFIG_PEND_BITS) - 1) #define SBI_NACL_SHMEM_HFENCE_CONFIG_PEND \ (SBI_NACL_SHMEM_HFENCE_CONFIG_PEND_MASK << \ SBI_NACL_SHMEM_HFENCE_CONFIG_PEND_SHIFT) @@ -358,7 +358,7 @@ enum sbi_ext_nacl_feature { (SBI_NACL_SHMEM_HFENCE_CONFIG_RSVD1_SHIFT - \ SBI_NACL_SHMEM_HFENCE_CONFIG_TYPE_BITS) #define SBI_NACL_SHMEM_HFENCE_CONFIG_TYPE_MASK \ - ((1UL << SBI_NACL_SHMEM_HFENCE_CONFIG_TYPE_BITS) - 1) + ((_AC(1, UXL) << SBI_NACL_SHMEM_HFENCE_CONFIG_TYPE_BITS) - 1) #define SBI_NACL_SHMEM_HFENCE_TYPE_GVMA 0x0 #define SBI_NACL_SHMEM_HFENCE_TYPE_GVMA_ALL 0x1 @@ -379,7 +379,7 @@ enum sbi_ext_nacl_feature { (SBI_NACL_SHMEM_HFENCE_CONFIG_RSVD2_SHIFT - \ SBI_NACL_SHMEM_HFENCE_CONFIG_ORDER_BITS) #define SBI_NACL_SHMEM_HFENCE_CONFIG_ORDER_MASK \ - ((1UL << SBI_NACL_SHMEM_HFENCE_CONFIG_ORDER_BITS) - 1) + ((_AC(1, UXL) << SBI_NACL_SHMEM_HFENCE_CONFIG_ORDER_BITS) - 1) #define SBI_NACL_SHMEM_HFENCE_ORDER_BASE 12 #if __riscv_xlen == 32 @@ -392,9 +392,9 @@ enum sbi_ext_nacl_feature { #define SBI_NACL_SHMEM_HFENCE_CONFIG_VMID_SHIFT \ SBI_NACL_SHMEM_HFENCE_CONFIG_ASID_BITS #define SBI_NACL_SHMEM_HFENCE_CONFIG_ASID_MASK \ - ((1UL << SBI_NACL_SHMEM_HFENCE_CONFIG_ASID_BITS) - 1) + ((_AC(1, UXL) << SBI_NACL_SHMEM_HFENCE_CONFIG_ASID_BITS) - 1) #define SBI_NACL_SHMEM_HFENCE_CONFIG_VMID_MASK \ - ((1UL << SBI_NACL_SHMEM_HFENCE_CONFIG_VMID_BITS) - 1) + ((_AC(1, UXL) << SBI_NACL_SHMEM_HFENCE_CONFIG_VMID_BITS) - 1) #define SBI_NACL_SHMEM_AUTOSWAP_FLAG_HSTATUS BIT(0) #define SBI_NACL_SHMEM_AUTOSWAP_HSTATUS ((__riscv_xlen / 8) * 1) diff --git a/arch/riscv/include/uapi/asm/kvm.h b/arch/riscv/include/uapi/asm/kvm.h index f06bc5efcd79..9001e8081ce2 100644 --- a/arch/riscv/include/uapi/asm/kvm.h +++ b/arch/riscv/include/uapi/asm/kvm.h @@ -48,13 +48,13 @@ struct kvm_sregs { /* CONFIG registers for KVM_GET_ONE_REG and KVM_SET_ONE_REG */ struct kvm_riscv_config { - unsigned long isa; - unsigned long zicbom_block_size; - unsigned long mvendorid; - unsigned long marchid; - unsigned long 
mimpid; - unsigned long zicboz_block_size; - unsigned long satp_mode; + xlen_t isa; + xlen_t zicbom_block_size; + xlen_t mvendorid; + xlen_t marchid; + xlen_t mimpid; + xlen_t zicboz_block_size; + xlen_t satp_mode; }; /* CORE registers for KVM_GET_ONE_REG and KVM_SET_ONE_REG */ @@ -69,33 +69,33 @@ struct kvm_riscv_core { /* General CSR registers for KVM_GET_ONE_REG and KVM_SET_ONE_REG */ struct kvm_riscv_csr { - unsigned long sstatus; - unsigned long sie; - unsigned long stvec; - unsigned long sscratch; - unsigned long sepc; - unsigned long scause; - unsigned long stval; - unsigned long sip; - unsigned long satp; - unsigned long scounteren; - unsigned long senvcfg; + xlen_t sstatus; + xlen_t sie; + xlen_t stvec; + xlen_t sscratch; + xlen_t sepc; + xlen_t scause; + xlen_t stval; + xlen_t sip; + xlen_t satp; + xlen_t scounteren; + xlen_t senvcfg; }; /* AIA CSR registers for KVM_GET_ONE_REG and KVM_SET_ONE_REG */ struct kvm_riscv_aia_csr { - unsigned long siselect; - unsigned long iprio1; - unsigned long iprio2; - unsigned long sieh; - unsigned long siph; - unsigned long iprio1h; - unsigned long iprio2h; + xlen_t siselect; + xlen_t iprio1; + xlen_t iprio2; + xlen_t sieh; + xlen_t siph; + xlen_t iprio1h; + xlen_t iprio2h; }; /* Smstateen CSR for KVM_GET_ONE_REG and KVM_SET_ONE_REG */ struct kvm_riscv_smstateen_csr { - unsigned long sstateen0; + xlen_t sstateen0; }; /* TIMER registers for KVM_GET_ONE_REG and KVM_SET_ONE_REG */ @@ -207,8 +207,8 @@ enum KVM_RISCV_SBI_EXT_ID { /* SBI STA extension registers for KVM_GET_ONE_REG and KVM_SET_ONE_REG */ struct kvm_riscv_sbi_sta { - unsigned long shmem_lo; - unsigned long shmem_hi; + xlen_t shmem_lo; + xlen_t shmem_hi; }; /* Possible states for kvm_riscv_timer */ diff --git a/arch/riscv/kvm/aia.c b/arch/riscv/kvm/aia.c index 19afd1f23537..77f6943292a3 100644 --- a/arch/riscv/kvm/aia.c +++ b/arch/riscv/kvm/aia.c @@ -200,31 +200,31 @@ void kvm_riscv_vcpu_aia_put(struct kvm_vcpu *vcpu) int kvm_riscv_vcpu_aia_get_csr(struct 
kvm_vcpu *vcpu, unsigned long reg_num, - unsigned long *out_val) + xlen_t *out_val) { struct kvm_vcpu_aia_csr *csr = &vcpu->arch.aia_context.guest_csr; - if (reg_num >= sizeof(struct kvm_riscv_aia_csr) / sizeof(unsigned long)) + if (reg_num >= sizeof(struct kvm_riscv_aia_csr) / sizeof(xlen_t)) return -ENOENT; *out_val = 0; if (kvm_riscv_aia_available()) - *out_val = ((unsigned long *)csr)[reg_num]; + *out_val = ((xlen_t *)csr)[reg_num]; return 0; } int kvm_riscv_vcpu_aia_set_csr(struct kvm_vcpu *vcpu, unsigned long reg_num, - unsigned long val) + xlen_t val) { struct kvm_vcpu_aia_csr *csr = &vcpu->arch.aia_context.guest_csr; - if (reg_num >= sizeof(struct kvm_riscv_aia_csr) / sizeof(unsigned long)) + if (reg_num >= sizeof(struct kvm_riscv_aia_csr) / sizeof(xlen_t)) return -ENOENT; if (kvm_riscv_aia_available()) { - ((unsigned long *)csr)[reg_num] = val; + ((xlen_t *)csr)[reg_num] = val; #ifdef CONFIG_32BIT if (reg_num == KVM_REG_RISCV_CSR_AIA_REG(siph)) @@ -237,9 +237,9 @@ int kvm_riscv_vcpu_aia_set_csr(struct kvm_vcpu *vcpu, int kvm_riscv_vcpu_aia_rmw_topei(struct kvm_vcpu *vcpu, unsigned int csr_num, - unsigned long *val, - unsigned long new_val, - unsigned long wr_mask) + xlen_t *val, + xlen_t new_val, + xlen_t wr_mask) { /* If AIA not available then redirect trap */ if (!kvm_riscv_aia_available()) @@ -271,7 +271,7 @@ static int aia_irq2bitpos[] = { static u8 aia_get_iprio8(struct kvm_vcpu *vcpu, unsigned int irq) { - unsigned long hviprio; + xlen_t hviprio; int bitpos = aia_irq2bitpos[irq]; if (bitpos < 0) @@ -396,8 +396,8 @@ static int aia_rmw_iprio(struct kvm_vcpu *vcpu, unsigned int isel, } int kvm_riscv_vcpu_aia_rmw_ireg(struct kvm_vcpu *vcpu, unsigned int csr_num, - unsigned long *val, unsigned long new_val, - unsigned long wr_mask) + xlen_t *val, xlen_t new_val, + xlen_t wr_mask) { unsigned int isel; @@ -408,7 +408,7 @@ int kvm_riscv_vcpu_aia_rmw_ireg(struct kvm_vcpu *vcpu, unsigned int csr_num, /* First try to emulate in kernel space */ isel = 
ncsr_read(CSR_VSISELECT) & ISELECT_MASK; if (isel >= ISELECT_IPRIO0 && isel <= ISELECT_IPRIO15) - return aia_rmw_iprio(vcpu, isel, val, new_val, wr_mask); + return aia_rmw_iprio(vcpu, isel, (ulong *)val, new_val, wr_mask); else if (isel >= IMSIC_FIRST && isel <= IMSIC_LAST && kvm_riscv_aia_initialized(vcpu->kvm)) return kvm_riscv_vcpu_aia_imsic_rmw(vcpu, isel, val, new_val, diff --git a/arch/riscv/kvm/aia_imsic.c b/arch/riscv/kvm/aia_imsic.c index a8085cd8215e..3c7f13b7a2ba 100644 --- a/arch/riscv/kvm/aia_imsic.c +++ b/arch/riscv/kvm/aia_imsic.c @@ -839,8 +839,8 @@ int kvm_riscv_vcpu_aia_imsic_update(struct kvm_vcpu *vcpu) } int kvm_riscv_vcpu_aia_imsic_rmw(struct kvm_vcpu *vcpu, unsigned long isel, - unsigned long *val, unsigned long new_val, - unsigned long wr_mask) + xlen_t *val, xlen_t new_val, + xlen_t wr_mask) { u32 topei; struct imsic_mrif_eix *eix; @@ -866,7 +866,7 @@ int kvm_riscv_vcpu_aia_imsic_rmw(struct kvm_vcpu *vcpu, unsigned long isel, } } else { r = imsic_mrif_rmw(imsic->swfile, imsic->nr_eix, isel, - val, new_val, wr_mask); + (ulong *)val, (ulong)new_val, (ulong)wr_mask); /* Forward unknown IMSIC register to user-space */ if (r) rc = (r == -ENOENT) ? 
0 : KVM_INSN_ILLEGAL_TRAP; diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c index 1fa8be5ee509..34d053ae09a9 100644 --- a/arch/riscv/kvm/main.c +++ b/arch/riscv/kvm/main.c @@ -152,7 +152,7 @@ static int __init riscv_kvm_init(void) } kvm_info("using %s G-stage page table format\n", str); - kvm_info("VMID %ld bits available\n", kvm_riscv_gstage_vmid_bits()); + kvm_info("VMID %ld bits available\n", (ulong)kvm_riscv_gstage_vmid_bits()); if (kvm_riscv_aia_available()) kvm_info("AIA available with %d guest external interrupts\n", diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c index 1087ea74567b..a89e5701076d 100644 --- a/arch/riscv/kvm/mmu.c +++ b/arch/riscv/kvm/mmu.c @@ -20,7 +20,7 @@ #include #ifdef CONFIG_64BIT -static unsigned long gstage_mode __ro_after_init = (HGATP_MODE_SV39X4 << HGATP_MODE_SHIFT); +static xlen_t gstage_mode __ro_after_init = (HGATP_MODE_SV39X4 << HGATP_MODE_SHIFT); static unsigned long gstage_pgd_levels __ro_after_init = 3; #define gstage_index_bits 9 #else @@ -30,11 +30,11 @@ static unsigned long gstage_pgd_levels __ro_after_init = 2; #endif #define gstage_pgd_xbits 2 -#define gstage_pgd_size (1UL << (HGATP_PAGE_SHIFT + gstage_pgd_xbits)) +#define gstage_pgd_size (_AC(1, UXL) << (HGATP_PAGE_SHIFT + gstage_pgd_xbits)) #define gstage_gpa_bits (HGATP_PAGE_SHIFT + \ (gstage_pgd_levels * gstage_index_bits) + \ gstage_pgd_xbits) -#define gstage_gpa_size ((gpa_t)(1ULL << gstage_gpa_bits)) +#define gstage_gpa_size ((gpa_t)(_AC(1, UXL) << gstage_gpa_bits)) #define gstage_pte_leaf(__ptep) \ (pte_val(*(__ptep)) & (_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC)) @@ -623,7 +623,7 @@ int kvm_riscv_gstage_map(struct kvm_vcpu *vcpu, vma_pageshift = huge_page_shift(hstate_vma(vma)); else vma_pageshift = PAGE_SHIFT; - vma_pagesize = 1ULL << vma_pageshift; + vma_pagesize = _AC(1, UXL) << vma_pageshift; if (logging || (vma->vm_flags & VM_PFNMAP)) vma_pagesize = PAGE_SIZE; @@ -725,7 +725,7 @@ void kvm_riscv_gstage_free_pgd(struct kvm *kvm) void 
kvm_riscv_gstage_update_hgatp(struct kvm_vcpu *vcpu) { - unsigned long hgatp = gstage_mode; + xlen_t hgatp = gstage_mode; struct kvm_arch *k = &vcpu->kvm->arch; hgatp |= (READ_ONCE(k->vmid.vmid) << HGATP_VMID_SHIFT) & HGATP_VMID; diff --git a/arch/riscv/kvm/tlb.c b/arch/riscv/kvm/tlb.c index 2f91ea5f8493..01d581763849 100644 --- a/arch/riscv/kvm/tlb.c +++ b/arch/riscv/kvm/tlb.c @@ -18,9 +18,9 @@ #define has_svinval() riscv_has_extension_unlikely(RISCV_ISA_EXT_SVINVAL) -void kvm_riscv_local_hfence_gvma_vmid_gpa(unsigned long vmid, +void kvm_riscv_local_hfence_gvma_vmid_gpa(xlen_t vmid, gpa_t gpa, gpa_t gpsz, - unsigned long order) + xlen_t order) { gpa_t pos; @@ -42,13 +42,13 @@ void kvm_riscv_local_hfence_gvma_vmid_gpa(unsigned long vmid, } } -void kvm_riscv_local_hfence_gvma_vmid_all(unsigned long vmid) +void kvm_riscv_local_hfence_gvma_vmid_all(xlen_t vmid) { asm volatile(HFENCE_GVMA(zero, %0) : : "r" (vmid) : "memory"); } void kvm_riscv_local_hfence_gvma_gpa(gpa_t gpa, gpa_t gpsz, - unsigned long order) + xlen_t order) { gpa_t pos; @@ -75,13 +75,14 @@ void kvm_riscv_local_hfence_gvma_all(void) asm volatile(HFENCE_GVMA(zero, zero) : : : "memory"); } -void kvm_riscv_local_hfence_vvma_asid_gva(unsigned long vmid, - unsigned long asid, - unsigned long gva, - unsigned long gvsz, - unsigned long order) +void kvm_riscv_local_hfence_vvma_asid_gva(xlen_t vmid, + xlen_t asid, + xlen_t gva, + xlen_t gvsz, + xlen_t order) { - unsigned long pos, hgatp; + xlen_t pos; + xlen_t hgatp; if (PTRS_PER_PTE < (gvsz >> order)) { kvm_riscv_local_hfence_vvma_asid_all(vmid, asid); @@ -105,10 +106,10 @@ void kvm_riscv_local_hfence_vvma_asid_gva(unsigned long vmid, csr_write(CSR_HGATP, hgatp); } -void kvm_riscv_local_hfence_vvma_asid_all(unsigned long vmid, - unsigned long asid) +void kvm_riscv_local_hfence_vvma_asid_all(xlen_t vmid, + xlen_t asid) { - unsigned long hgatp; + xlen_t hgatp; hgatp = csr_swap(CSR_HGATP, vmid << HGATP_VMID_SHIFT); @@ -117,11 +118,12 @@ void 
kvm_riscv_local_hfence_vvma_asid_all(unsigned long vmid, csr_write(CSR_HGATP, hgatp); } -void kvm_riscv_local_hfence_vvma_gva(unsigned long vmid, - unsigned long gva, unsigned long gvsz, - unsigned long order) +void kvm_riscv_local_hfence_vvma_gva(xlen_t vmid, + xlen_t gva, xlen_t gvsz, + xlen_t order) { - unsigned long pos, hgatp; + xlen_t pos; + xlen_t hgatp; if (PTRS_PER_PTE < (gvsz >> order)) { kvm_riscv_local_hfence_vvma_all(vmid); @@ -145,9 +147,9 @@ void kvm_riscv_local_hfence_vvma_gva(unsigned long vmid, csr_write(CSR_HGATP, hgatp); } -void kvm_riscv_local_hfence_vvma_all(unsigned long vmid) +void kvm_riscv_local_hfence_vvma_all(xlen_t vmid) { - unsigned long hgatp; + xlen_t hgatp; hgatp = csr_swap(CSR_HGATP, vmid << HGATP_VMID_SHIFT); @@ -158,7 +160,7 @@ void kvm_riscv_local_hfence_vvma_all(unsigned long vmid) void kvm_riscv_local_tlb_sanitize(struct kvm_vcpu *vcpu) { - unsigned long vmid; + xlen_t vmid; if (!kvm_riscv_gstage_vmid_bits() || vcpu->arch.last_exit_cpu == vcpu->cpu) @@ -188,7 +190,7 @@ void kvm_riscv_fence_i_process(struct kvm_vcpu *vcpu) void kvm_riscv_hfence_gvma_vmid_all_process(struct kvm_vcpu *vcpu) { struct kvm_vmid *v = &vcpu->kvm->arch.vmid; - unsigned long vmid = READ_ONCE(v->vmid); + xlen_t vmid = READ_ONCE(v->vmid); if (kvm_riscv_nacl_available()) nacl_hfence_gvma_vmid_all(nacl_shmem(), vmid); @@ -199,7 +201,7 @@ void kvm_riscv_hfence_gvma_vmid_all_process(struct kvm_vcpu *vcpu) void kvm_riscv_hfence_vvma_all_process(struct kvm_vcpu *vcpu) { struct kvm_vmid *v = &vcpu->kvm->arch.vmid; - unsigned long vmid = READ_ONCE(v->vmid); + xlen_t vmid = READ_ONCE(v->vmid); if (kvm_riscv_nacl_available()) nacl_hfence_vvma_all(nacl_shmem(), vmid); @@ -258,7 +260,7 @@ static bool vcpu_hfence_enqueue(struct kvm_vcpu *vcpu, void kvm_riscv_hfence_process(struct kvm_vcpu *vcpu) { - unsigned long vmid; + xlen_t vmid; struct kvm_riscv_hfence d = { 0 }; struct kvm_vmid *v = &vcpu->kvm->arch.vmid; @@ -310,7 +312,7 @@ void kvm_riscv_hfence_process(struct 
kvm_vcpu *vcpu) } static void make_xfence_request(struct kvm *kvm, - unsigned long hbase, unsigned long hmask, + xlen_t hbase, xlen_t hmask, unsigned int req, unsigned int fallback_req, const struct kvm_riscv_hfence *data) { @@ -346,16 +348,16 @@ static void make_xfence_request(struct kvm *kvm, } void kvm_riscv_fence_i(struct kvm *kvm, - unsigned long hbase, unsigned long hmask) + xlen_t hbase, xlen_t hmask) { make_xfence_request(kvm, hbase, hmask, KVM_REQ_FENCE_I, KVM_REQ_FENCE_I, NULL); } void kvm_riscv_hfence_gvma_vmid_gpa(struct kvm *kvm, - unsigned long hbase, unsigned long hmask, + xlen_t hbase, xlen_t hmask, gpa_t gpa, gpa_t gpsz, - unsigned long order) + xlen_t order) { struct kvm_riscv_hfence data; @@ -369,16 +371,16 @@ void kvm_riscv_hfence_gvma_vmid_gpa(struct kvm *kvm, } void kvm_riscv_hfence_gvma_vmid_all(struct kvm *kvm, - unsigned long hbase, unsigned long hmask) + xlen_t hbase, xlen_t hmask) { make_xfence_request(kvm, hbase, hmask, KVM_REQ_HFENCE_GVMA_VMID_ALL, KVM_REQ_HFENCE_GVMA_VMID_ALL, NULL); } void kvm_riscv_hfence_vvma_asid_gva(struct kvm *kvm, - unsigned long hbase, unsigned long hmask, - unsigned long gva, unsigned long gvsz, - unsigned long order, unsigned long asid) + xlen_t hbase, xlen_t hmask, + xlen_t gva, xlen_t gvsz, + xlen_t order, xlen_t asid) { struct kvm_riscv_hfence data; @@ -392,8 +394,8 @@ void kvm_riscv_hfence_vvma_asid_gva(struct kvm *kvm, } void kvm_riscv_hfence_vvma_asid_all(struct kvm *kvm, - unsigned long hbase, unsigned long hmask, - unsigned long asid) + xlen_t hbase, xlen_t hmask, + xlen_t asid) { struct kvm_riscv_hfence data; @@ -405,9 +407,9 @@ void kvm_riscv_hfence_vvma_asid_all(struct kvm *kvm, } void kvm_riscv_hfence_vvma_gva(struct kvm *kvm, - unsigned long hbase, unsigned long hmask, - unsigned long gva, unsigned long gvsz, - unsigned long order) + xlen_t hbase, xlen_t hmask, + xlen_t gva, xlen_t gvsz, + xlen_t order) { struct kvm_riscv_hfence data; @@ -421,7 +423,7 @@ void kvm_riscv_hfence_vvma_gva(struct kvm 
*kvm, } void kvm_riscv_hfence_vvma_all(struct kvm *kvm, - unsigned long hbase, unsigned long hmask) + xlen_t hbase, xlen_t hmask) { make_xfence_request(kvm, hbase, hmask, KVM_REQ_HFENCE_VVMA_ALL, KVM_REQ_HFENCE_VVMA_ALL, NULL); diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c index 60d684c76c58..144e25ead287 100644 --- a/arch/riscv/kvm/vcpu.c +++ b/arch/riscv/kvm/vcpu.c @@ -797,11 +797,11 @@ static void noinstr kvm_riscv_vcpu_enter_exit(struct kvm_vcpu *vcpu, if (kvm_riscv_nacl_autoswap_csr_available()) { hcntx->hstatus = nacl_csr_read(nsh, CSR_HSTATUS); - nacl_scratch_write_long(nsh, + nacl_scratch_write_csr(nsh, SBI_NACL_SHMEM_AUTOSWAP_OFFSET + SBI_NACL_SHMEM_AUTOSWAP_HSTATUS, gcntx->hstatus); - nacl_scratch_write_long(nsh, + nacl_scratch_write_csr(nsh, SBI_NACL_SHMEM_AUTOSWAP_OFFSET, SBI_NACL_SHMEM_AUTOSWAP_FLAG_HSTATUS); } else if (kvm_riscv_nacl_sync_csr_available()) { @@ -811,7 +811,7 @@ static void noinstr kvm_riscv_vcpu_enter_exit(struct kvm_vcpu *vcpu, hcntx->hstatus = csr_swap(CSR_HSTATUS, gcntx->hstatus); } - nacl_scratch_write_longs(nsh, + nacl_scratch_write_csrs(nsh, SBI_NACL_SHMEM_SRET_OFFSET + SBI_NACL_SHMEM_SRET_X(1), &gcntx->ra, @@ -821,10 +821,10 @@ static void noinstr kvm_riscv_vcpu_enter_exit(struct kvm_vcpu *vcpu, SBI_EXT_NACL_SYNC_SRET); if (kvm_riscv_nacl_autoswap_csr_available()) { - nacl_scratch_write_long(nsh, + nacl_scratch_write_csr(nsh, SBI_NACL_SHMEM_AUTOSWAP_OFFSET, 0); - gcntx->hstatus = nacl_scratch_read_long(nsh, + gcntx->hstatus = nacl_scratch_read_csr(nsh, SBI_NACL_SHMEM_AUTOSWAP_OFFSET + SBI_NACL_SHMEM_AUTOSWAP_HSTATUS); } else { diff --git a/arch/riscv/kvm/vcpu_exit.c b/arch/riscv/kvm/vcpu_exit.c index 6e0c18412795..0f6b80d87825 100644 --- a/arch/riscv/kvm/vcpu_exit.c +++ b/arch/riscv/kvm/vcpu_exit.c @@ -246,11 +246,11 @@ int kvm_riscv_vcpu_exit(struct kvm_vcpu *vcpu, struct kvm_run *run, /* Print details in-case of error */ if (ret < 0) { kvm_err("VCPU exit error %d\n", ret); - kvm_err("SEPC=0x%lx SSTATUS=0x%lx 
HSTATUS=0x%lx\n", + kvm_err("SEPC=0x" REG_FMT " SSTATUS=0x" REG_FMT " HSTATUS=0x" REG_FMT "\n", vcpu->arch.guest_context.sepc, vcpu->arch.guest_context.sstatus, vcpu->arch.guest_context.hstatus); - kvm_err("SCAUSE=0x%lx STVAL=0x%lx HTVAL=0x%lx HTINST=0x%lx\n", + kvm_err("SCAUSE=0x" REG_FMT " STVAL=0x" REG_FMT " HTVAL=0x" REG_FMT " HTINST=0x" REG_FMT "\n", trap->scause, trap->stval, trap->htval, trap->htinst); } diff --git a/arch/riscv/kvm/vcpu_insn.c b/arch/riscv/kvm/vcpu_insn.c index 97dec18e6989..c25415d63d96 100644 --- a/arch/riscv/kvm/vcpu_insn.c +++ b/arch/riscv/kvm/vcpu_insn.c @@ -221,13 +221,13 @@ struct csr_func { * "struct insn_func". */ int (*func)(struct kvm_vcpu *vcpu, unsigned int csr_num, - unsigned long *val, unsigned long new_val, - unsigned long wr_mask); + xlen_t *val, xlen_t new_val, + xlen_t wr_mask); }; static int seed_csr_rmw(struct kvm_vcpu *vcpu, unsigned int csr_num, - unsigned long *val, unsigned long new_val, - unsigned long wr_mask) + xlen_t *val, xlen_t new_val, + xlen_t wr_mask) { if (!riscv_isa_extension_available(vcpu->arch.isa, ZKR)) return KVM_INSN_ILLEGAL_TRAP; @@ -275,9 +275,9 @@ static int csr_insn(struct kvm_vcpu *vcpu, struct kvm_run *run, ulong insn) int i, rc = KVM_INSN_ILLEGAL_TRAP; unsigned int csr_num = insn >> SH_RS2; unsigned int rs1_num = (insn >> SH_RS1) & MASK_RX; - ulong rs1_val = GET_RS1(insn, &vcpu->arch.guest_context); + xlen_t rs1_val = GET_RS1(insn, &vcpu->arch.guest_context); const struct csr_func *tcfn, *cfn = NULL; - ulong val = 0, wr_mask = 0, new_val = 0; + xlen_t val = 0, wr_mask = 0, new_val = 0; /* Decode the CSR instruction */ switch (GET_FUNCT3(insn)) { diff --git a/arch/riscv/kvm/vcpu_onereg.c b/arch/riscv/kvm/vcpu_onereg.c index f6d27b59c641..34e11fbe27e8 100644 --- a/arch/riscv/kvm/vcpu_onereg.c +++ b/arch/riscv/kvm/vcpu_onereg.c @@ -448,7 +448,7 @@ static int kvm_riscv_vcpu_set_reg_core(struct kvm_vcpu *vcpu, static int kvm_riscv_vcpu_general_get_csr(struct kvm_vcpu *vcpu, unsigned long reg_num, -
unsigned long *out_val) + xlen_t *out_val) { struct kvm_vcpu_csr *csr = &vcpu->arch.guest_csr; @@ -494,24 +494,24 @@ static inline int kvm_riscv_vcpu_smstateen_set_csr(struct kvm_vcpu *vcpu, struct kvm_vcpu_smstateen_csr *csr = &vcpu->arch.smstateen_csr; if (reg_num >= sizeof(struct kvm_riscv_smstateen_csr) / - sizeof(unsigned long)) + sizeof(xlen_t)) return -EINVAL; - ((unsigned long *)csr)[reg_num] = reg_val; + ((xlen_t *)csr)[reg_num] = reg_val; return 0; } static int kvm_riscv_vcpu_smstateen_get_csr(struct kvm_vcpu *vcpu, unsigned long reg_num, - unsigned long *out_val) + xlen_t *out_val) { struct kvm_vcpu_smstateen_csr *csr = &vcpu->arch.smstateen_csr; if (reg_num >= sizeof(struct kvm_riscv_smstateen_csr) / - sizeof(unsigned long)) + sizeof(xlen_t)) return -EINVAL; - *out_val = ((unsigned long *)csr)[reg_num]; + *out_val = ((xlen_t *)csr)[reg_num]; return 0; } @@ -519,12 +519,12 @@ static int kvm_riscv_vcpu_get_reg_csr(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) { int rc; - unsigned long __user *uaddr = - (unsigned long __user *)(unsigned long)reg->addr; + xlen_t __user *uaddr = + (xlen_t __user *)(unsigned long)reg->addr; unsigned long reg_num = reg->id & ~(KVM_REG_ARCH_MASK | KVM_REG_SIZE_MASK | KVM_REG_RISCV_CSR); - unsigned long reg_val, reg_subtype; + xlen_t reg_val, reg_subtype; if (KVM_REG_SIZE(reg->id) != sizeof(unsigned long)) return -EINVAL; diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c index 2707a51b082c..3bfecda72150 100644 --- a/arch/riscv/kvm/vcpu_pmu.c +++ b/arch/riscv/kvm/vcpu_pmu.c @@ -198,7 +198,7 @@ static int pmu_get_pmc_index(struct kvm_pmu *pmu, unsigned long eidx, } static int pmu_fw_ctr_read_hi(struct kvm_vcpu *vcpu, unsigned long cidx, - unsigned long *out_val) + xlen_t *out_val) { struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu); struct kvm_pmc *pmc; @@ -228,7 +228,7 @@ static int pmu_fw_ctr_read_hi(struct kvm_vcpu *vcpu, unsigned long cidx, } static int pmu_ctr_read(struct kvm_vcpu *vcpu, unsigned long cidx, - 
unsigned long *out_val) + xlen_t *out_val) { struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu); struct kvm_pmc *pmc; @@ -354,8 +354,8 @@ int kvm_riscv_vcpu_pmu_incr_fw(struct kvm_vcpu *vcpu, unsigned long fid) } int kvm_riscv_vcpu_pmu_read_hpm(struct kvm_vcpu *vcpu, unsigned int csr_num, - unsigned long *val, unsigned long new_val, - unsigned long wr_mask) + xlen_t *val, xlen_t new_val, + xlen_t wr_mask) { struct kvm_pmu *kvpmu = vcpu_to_pmu(vcpu); int cidx, ret = KVM_INSN_CONTINUE_NEXT_SEPC; diff --git a/arch/riscv/kvm/vcpu_sbi_base.c b/arch/riscv/kvm/vcpu_sbi_base.c index 5bc570b984f4..a243339a73fd 100644 --- a/arch/riscv/kvm/vcpu_sbi_base.c +++ b/arch/riscv/kvm/vcpu_sbi_base.c @@ -18,7 +18,7 @@ static int kvm_sbi_ext_base_handler(struct kvm_vcpu *vcpu, struct kvm_run *run, { struct kvm_cpu_context *cp = &vcpu->arch.guest_context; const struct kvm_vcpu_sbi_extension *sbi_ext; - unsigned long *out_val = &retdata->out_val; + xlen_t *out_val = &retdata->out_val; switch (cp->a6) { case SBI_EXT_BASE_GET_SPEC_VERSION: diff --git a/arch/riscv/kvm/vmid.c b/arch/riscv/kvm/vmid.c index ddc98714ce8e..17744dfaf008 100644 --- a/arch/riscv/kvm/vmid.c +++ b/arch/riscv/kvm/vmid.c @@ -17,7 +17,7 @@ static unsigned long vmid_version = 1; static unsigned long vmid_next; -static unsigned long vmid_bits __ro_after_init; +static xlen_t vmid_bits __ro_after_init; static DEFINE_SPINLOCK(vmid_lock); void __init kvm_riscv_gstage_vmid_detect(void) @@ -40,7 +40,7 @@ void __init kvm_riscv_gstage_vmid_detect(void) vmid_bits = 0; } -unsigned long kvm_riscv_gstage_vmid_bits(void) +xlen_t kvm_riscv_gstage_vmid_bits(void) { return vmid_bits; } -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:00 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:00 -0400 Subject: [RFC PATCH V3 19/43] rv64ilp32_abi: irqchip: irq-riscv-intc: Use xlen_t instead of ulong In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> 
Message-ID: <20250325121624.523258-20-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The RV64ILP32 ABI is based on CONFIG_64BIT, so use xlen/xlen_t instead of BITS_PER_LONG/ulong. Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- drivers/irqchip/irq-riscv-intc.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/irqchip/irq-riscv-intc.c b/drivers/irqchip/irq-riscv-intc.c index f653c13de62b..4fc7d5704acf 100644 --- a/drivers/irqchip/irq-riscv-intc.c +++ b/drivers/irqchip/irq-riscv-intc.c @@ -20,18 +20,19 @@ #include #include +#include static struct irq_domain *intc_domain; -static unsigned int riscv_intc_nr_irqs __ro_after_init = BITS_PER_LONG; -static unsigned int riscv_intc_custom_base __ro_after_init = BITS_PER_LONG; +static unsigned int riscv_intc_nr_irqs __ro_after_init = __riscv_xlen; +static unsigned int riscv_intc_custom_base __ro_after_init = __riscv_xlen; static unsigned int riscv_intc_custom_nr_irqs __ro_after_init; static void riscv_intc_irq(struct pt_regs *regs) { - unsigned long cause = regs->cause & ~CAUSE_IRQ_FLAG; + xlen_t cause = regs->cause & ~CAUSE_IRQ_FLAG; if (generic_handle_domain_irq(intc_domain, cause)) - pr_warn_ratelimited("Failed to handle interrupt (cause: %ld)\n", cause); + pr_warn_ratelimited("Failed to handle interrupt (cause: " REG_FMT ")\n", cause); } static void riscv_intc_aia_irq(struct pt_regs *regs) -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:01 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:01 -0400 Subject: [RFC PATCH V3 20/43] rv64ilp32_abi: drivers/perf: Adapt xlen_t of sbiret In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-21-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" Because rv64ilp32_abi change the sbiret struct, from: struct sbiret { long error; long value; }; to: struct sbiret { xlen_t error; xlen_t value; }; So, use a 
cast to long to prevent compile warnings. Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- drivers/perf/riscv_pmu_sbi.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c index 698de8ddf895..e2c4d1bcbc7c 100644 --- a/drivers/perf/riscv_pmu_sbi.c +++ b/drivers/perf/riscv_pmu_sbi.c @@ -638,7 +638,7 @@ static int pmu_sbi_snapshot_setup(struct riscv_pmu *pmu, int cpu) /* Free up the snapshot area memory and fall back to SBI PMU calls without snapshot */ if (ret.error) { if (ret.error != SBI_ERR_NOT_SUPPORTED) - pr_warn("pmu snapshot setup failed with error %ld\n", ret.error); + pr_warn("pmu snapshot setup failed with error %ld\n", (long)ret.error); return sbi_err_map_linux_errno(ret.error); } @@ -679,7 +679,7 @@ static u64 pmu_sbi_ctr_read(struct perf_event *event) val |= ((u64)ret.value << 32); else WARN_ONCE(1, "Unable to read upper 32 bits of firmware counter error: %ld\n", - ret.error); + (long)ret.error); } } else { val = riscv_pmu_ctr_read_csr(info.csr); -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:02 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:02 -0400 Subject: [RFC PATCH V3 21/43] rv64ilp32_abi: asm-generic: Add custom BITS_PER_LONG definition In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-22-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The RV64ILP32 ABI Linux kernel is based on CONFIG_64BIT, but BITS_PER_LONG is 32. So, give a custom architectural definition of BITS_PER_LONG to match the correct macro definition.
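
As a rough user-space illustration of why CONFIG_64BIT and BITS_PER_LONG can diverge, the sketch below derives a word width from the pointer size, in the same spirit as the uapi header change. The SKETCH_ names are hypothetical and not part of the patch; __SIZEOF_POINTER__ is assumed to be provided by the compiler (GCC and Clang predefine it).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for __BITS_PER_LONG; not a kernel macro. */
#define SKETCH_BITS_PER_PTR (__SIZEOF_POINTER__ * 8)

/* Select the "long" width from the pointer width, not from a 64-bit config flag. */
#if SKETCH_BITS_PER_PTR == 64
#define SKETCH_BITS_PER_LONG 64
#else
#define SKETCH_BITS_PER_LONG 32
#endif

/* Returns the word width that was selected at compile time. */
static int sketch_bits_per_long(void)
{
	return SKETCH_BITS_PER_LONG;
}
```

On an ILP32 build the function would return 32 even if the target hardware has 64-bit registers, which is the situation the patch describes.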
Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- arch/riscv/include/uapi/asm/bitsperlong.h | 6 ++++++ include/asm-generic/bitsperlong.h | 2 ++ 2 files changed, 8 insertions(+) diff --git a/arch/riscv/include/uapi/asm/bitsperlong.h b/arch/riscv/include/uapi/asm/bitsperlong.h index 7d0b32e3b701..fec2ad91597c 100644 --- a/arch/riscv/include/uapi/asm/bitsperlong.h +++ b/arch/riscv/include/uapi/asm/bitsperlong.h @@ -9,6 +9,12 @@ #define __BITS_PER_LONG (__SIZEOF_POINTER__ * 8) +#if __BITS_PER_LONG == 64 +#define BITS_PER_LONG 64 +#else +#define BITS_PER_LONG 32 +#endif + #include #endif /* _UAPI_ASM_RISCV_BITSPERLONG_H */ diff --git a/include/asm-generic/bitsperlong.h b/include/asm-generic/bitsperlong.h index 1023e2a4bd37..7ccbb7ce6610 100644 --- a/include/asm-generic/bitsperlong.h +++ b/include/asm-generic/bitsperlong.h @@ -6,7 +6,9 @@ #ifdef CONFIG_64BIT +#ifndef BITS_PER_LONG #define BITS_PER_LONG 64 +#endif #else #define BITS_PER_LONG 32 #endif /* CONFIG_64BIT */ -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:03 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:03 -0400 Subject: [RFC PATCH V3 22/43] rv64ilp32_abi: bpf: Change KERN_ARENA_SZ to 256MiB In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-23-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The RV64ILP32 ABI limits the vmalloc range to 512MB, hence the arena kernel range is set to 256MiB instead of 4GiB. 
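
A minimal sketch of the size selection and the resulting range check, written as plain user-space C. The sketch_ helpers and SKETCH_ constants are hypothetical; only the 4GiB and 256MiB limits and the "greater than the limit is rejected" comparison are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SZ_256M (256ULL << 20)
#define SKETCH_SZ_4G   (4ULL << 30)

/* Hypothetical helper: pick the arena limit from the width of long. */
static uint64_t sketch_arena_limit(unsigned int bits_per_long)
{
	return bits_per_long == 64 ? SKETCH_SZ_4G : SKETCH_SZ_256M;
}

/* Models the check in arena_map_alloc(): a range above the limit is rejected. */
static bool sketch_arena_range_ok(uint64_t max_entries, uint64_t page_size,
				  unsigned int bits_per_long)
{
	return max_entries * page_size <= sketch_arena_limit(bits_per_long);
}
```

With 4KiB pages, 65536 entries is exactly 256MiB and would still be accepted when long is 32 bits wide; one more page would not.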
Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- kernel/bpf/arena.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/kernel/bpf/arena.c b/kernel/bpf/arena.c index 870aeb51d70a..4eb99f83d4a1 100644 --- a/kernel/bpf/arena.c +++ b/kernel/bpf/arena.c @@ -40,7 +40,14 @@ /* number of bytes addressable by LDX/STX insn with 16-bit 'off' field */ #define GUARD_SZ (1ull << sizeof_field(struct bpf_insn, off) * 8) -#define KERN_VM_SZ (SZ_4G + GUARD_SZ) + +#if BITS_PER_LONG == 64 +#define KERN_ARENA_SZ SZ_4G +#else +#define KERN_ARENA_SZ SZ_256M +#endif + +#define KERN_VM_SZ (KERN_ARENA_SZ + GUARD_SZ) struct bpf_arena { struct bpf_map map; @@ -115,7 +122,7 @@ static struct bpf_map *arena_map_alloc(union bpf_attr *attr) return ERR_PTR(-EINVAL); vm_range = (u64)attr->max_entries * PAGE_SIZE; - if (vm_range > SZ_4G) + if (vm_range > KERN_ARENA_SZ) return ERR_PTR(-E2BIG); if ((attr->map_extra >> 32) != ((attr->map_extra + vm_range - 1) >> 32)) @@ -321,7 +328,7 @@ static unsigned long arena_get_unmapped_area(struct file *filp, unsigned long ad if (pgoff) return -EINVAL; - if (len > SZ_4G) + if (len > KERN_ARENA_SZ) return -E2BIG; /* if user_vm_start was specified at arena creation time */ @@ -337,12 +344,14 @@ static unsigned long arena_get_unmapped_area(struct file *filp, unsigned long ad ret = mm_get_unmapped_area(current->mm, filp, addr, len * 2, 0, flags); if (IS_ERR_VALUE(ret)) return ret; +#if BITS_PER_LONG == 64 if ((ret >> 32) == ((ret + len - 1) >> 32)) return ret; +#endif if (WARN_ON_ONCE(arena->user_vm_start)) /* checks at map creation time should prevent this */ return -EFAULT; - return round_up(ret, SZ_4G); + return round_up(ret, KERN_ARENA_SZ); } static int arena_map_mmap(struct bpf_map *map, struct vm_area_struct *vma) @@ -366,7 +375,7 @@ static int arena_map_mmap(struct bpf_map *map, struct vm_area_struct *vma) return -EBUSY; /* Earlier checks should prevent this */ - if (WARN_ON_ONCE(vma->vm_end - vma->vm_start > SZ_4G || 
vma->vm_pgoff)) + if (WARN_ON_ONCE(vma->vm_end - vma->vm_start > KERN_ARENA_SZ || vma->vm_pgoff)) return -EFAULT; if (remember_vma(arena, vma)) -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:04 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:04 -0400 Subject: [RFC PATCH V3 23/43] rv64ilp32_abi: compat: Correct compat_ulong_t cast In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-24-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" RV64ILP32 ABI systems have BITS_PER_LONG set to 32, matching sizeof(compat_ulong_t). Adjust code involving compat_ulong_t accordingly. Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- include/uapi/linux/auto_fs.h | 6 ++++++ kernel/compat.c | 15 ++++++++++++--- 2 files changed, 18 insertions(+), 3 deletions(-) diff --git a/include/uapi/linux/auto_fs.h b/include/uapi/linux/auto_fs.h index 8081df849743..7d925ee810b6 100644 --- a/include/uapi/linux/auto_fs.h +++ b/include/uapi/linux/auto_fs.h @@ -80,9 +80,15 @@ enum { #define AUTOFS_IOC_SETTIMEOUT32 _IOWR(AUTOFS_IOCTL, \ AUTOFS_IOC_SETTIMEOUT_CMD, \ compat_ulong_t) +#if __riscv_xlen == 64 +#define AUTOFS_IOC_SETTIMEOUT _IOWR(AUTOFS_IOCTL, \ + AUTOFS_IOC_SETTIMEOUT_CMD, \ + unsigned long long) +#else #define AUTOFS_IOC_SETTIMEOUT _IOWR(AUTOFS_IOCTL, \ AUTOFS_IOC_SETTIMEOUT_CMD, \ unsigned long) +#endif #define AUTOFS_IOC_EXPIRE _IOR(AUTOFS_IOCTL, \ AUTOFS_IOC_EXPIRE_CMD, \ struct autofs_packet_expire) diff --git a/kernel/compat.c b/kernel/compat.c index fb50f29d9b36..46ffdc5e7cc4 100644 --- a/kernel/compat.c +++ b/kernel/compat.c @@ -203,11 +203,17 @@ long compat_get_bitmap(unsigned long *mask, const compat_ulong_t __user *umask, return -EFAULT; while (nr_compat_longs > 1) { - compat_ulong_t l1, l2; + compat_ulong_t l1; unsafe_get_user(l1, umask++, Efault); + nr_compat_longs -= 1; +#if BITS_PER_LONG == 64 + compat_ulong_t l2; unsafe_get_user(l2, 
umask++, Efault); *mask++ = ((unsigned long)l2 << BITS_PER_COMPAT_LONG) | l1; - nr_compat_longs -= 2; + nr_compat_longs -= 1; +#else + *mask++ = l1; +#endif } if (nr_compat_longs) unsafe_get_user(*mask, umask++, Efault); @@ -234,8 +240,11 @@ long compat_put_bitmap(compat_ulong_t __user *umask, unsigned long *mask, while (nr_compat_longs > 1) { unsigned long m = *mask++; unsafe_put_user((compat_ulong_t)m, umask++, Efault); + nr_compat_longs -= 1; +#if BITS_PER_LONG == 64 unsafe_put_user(m >> BITS_PER_COMPAT_LONG, umask++, Efault); - nr_compat_longs -= 2; + nr_compat_longs -= 1; +#endif } if (nr_compat_longs) unsafe_put_user((compat_ulong_t)*mask, umask++, Efault); -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:05 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:05 -0400 Subject: [RFC PATCH V3 24/43] rv64ilp32_abi: compiler_types: Add "long long" into __native_word() In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-25-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The rv64ilp32 abi supports native atomic64 operations. The atomic64_t is defined by "long long," so add "long long" into __native_word(). 
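
The effect of the extra clause can be sketched in user space as follows. The SKETCH_ macro names and struct sketch_pair are hypothetical; the size comparisons mirror the two variants of __native_word() in the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the existing variant: sizes up to sizeof(long) count as native. */
#define SKETCH_NATIVE_WORD_BASE(t) \
	(sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \
	 sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))

/* Sketch of the CONFIG_64BIT variant: sizeof(long long) is accepted as well. */
#define SKETCH_NATIVE_WORD_64BIT(t) \
	(SKETCH_NATIVE_WORD_BASE(t) || sizeof(t) == sizeof(long long))

/* A 16-byte type that neither variant should treat as a native word. */
struct sketch_pair {
	uint64_t a;
	uint64_t b;
};
```

On an ILP32 data model, where long is 4 bytes, only the second variant would accept an 8-byte type such as the one backing atomic64_t.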
Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- include/linux/compiler_types.h | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h index 981cc3d7e3aa..6cf36a8e9570 100644 --- a/include/linux/compiler_types.h +++ b/include/linux/compiler_types.h @@ -505,9 +505,16 @@ struct ftrace_likely_data { default: (x))) /* Is this type a native word size -- useful for atomic operations */ +#ifdef CONFIG_64BIT +#define __native_word(t) \ + (sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \ + sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long) || \ + sizeof(t) == sizeof(long long)) +#else #define __native_word(t) \ (sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \ sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long)) +#endif #ifdef __OPTIMIZE__ # define __compiletime_assert(condition, msg, prefix, suffix) \ -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:06 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:06 -0400 Subject: [RFC PATCH V3 25/43] rv64ilp32_abi: exec: Adapt 64lp64 env and argv In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-26-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The rv64ilp32 abi reuses the env and argv memory layout of the lp64 abi, so leave the space to fit the lp64 struct layout. 
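
The index doubling can be pictured with the following user-space sketch. It assumes a little-endian layout (as on RISC-V) and models the lp64 argv array as 8-byte slots; sketch_get_arg_ptr is a hypothetical helper, not the kernel function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Read the nr-th pointer from an lp64-style argv array using a 32-bit load.
 * Each lp64 slot is 8 bytes, i.e. two 32-bit words, so the 32-bit index is
 * doubled. On a little-endian layout the low word of the slot comes first.
 */
static uint32_t sketch_get_arg_ptr(const void *argv, int nr)
{
	uint32_t lo;
	size_t word_index = (size_t)nr * 2;

	memcpy(&lo, (const unsigned char *)argv + word_index * sizeof(uint32_t),
	       sizeof(lo));
	return lo;
}
```

Without the doubling, index 1 would land on the upper half of slot 0 rather than on the second argument.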
Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- fs/exec.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/fs/exec.c b/fs/exec.c index 506cd411f4ac..548d18b7ae92 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -424,6 +424,10 @@ static const char __user *get_user_arg_ptr(struct user_arg_ptr argv, int nr) } #endif +#if defined(CONFIG_64BIT) && (BITS_PER_LONG == 32) + nr = nr * 2; +#endif + if (get_user(native, argv.ptr.native + nr)) return ERR_PTR(-EFAULT); -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:07 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:07 -0400 Subject: [RFC PATCH V3 26/43] rv64ilp32_abi: file_ref: Use 32-bit width for refcnt In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-27-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The sizeof(atomic_t) is 4 in rv64ilp32 abi linux kernel, which could provide a higher density of cache and a smaller memory footprint. 
Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- include/linux/file_ref.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/linux/file_ref.h b/include/linux/file_ref.h index 9b3a8d9b17ab..ce9b47359e14 100644 --- a/include/linux/file_ref.h +++ b/include/linux/file_ref.h @@ -27,7 +27,7 @@ * 0xFFFFFFFFFFFFFFFFUL */ -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 #define FILE_REF_ONEREF 0x0000000000000000UL #define FILE_REF_MAXREF 0x7FFFFFFFFFFFFFFFUL #define FILE_REF_SATURATED 0xA000000000000000UL @@ -44,7 +44,7 @@ #endif typedef struct { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 atomic64_t refcnt; #else atomic_t refcnt; -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:08 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:08 -0400 Subject: [RFC PATCH V3 27/43] rv64ilp32_abi: input: Adapt BITS_PER_LONG to dword In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-28-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The RV64ILP32 ABI linux kernel is based on CONFIG_64BIT, but BITS_PER_LONG is 32. So, adapt bits to dword with BITS_PER_LONG. 
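
A small sketch of the word selection, under the assumption that the value being printed is carried in a 64-bit integer for illustration. sketch_first_dword is a hypothetical helper; in the kernel the choice is made at compile time with #if, since shifting a 32-bit long by 32 would be undefined behaviour in C.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pick the first 32-bit word to print: the upper half when long is 64 bits
 * wide, or the whole value when long is already 32 bits wide.
 */
static uint32_t sketch_first_dword(uint64_t bits, unsigned int bits_per_long)
{
	if (bits_per_long == 64)
		return (uint32_t)(bits >> 32);
	return (uint32_t)bits;
}
```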
Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- drivers/input/input.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/input/input.c b/drivers/input/input.c index c9e3ac64bcd0..7af5e8c66f25 100644 --- a/drivers/input/input.c +++ b/drivers/input/input.c @@ -1006,7 +1006,11 @@ static int input_bits_to_string(char *buf, int buf_size, int len = 0; if (in_compat_syscall()) { +#if BITS_PER_LONG == 64 u32 dword = bits >> 32; +#else + u32 dword = bits; +#endif if (dword || !skip_empty) len += snprintf(buf, buf_size, "%x ", dword); -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:09 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:09 -0400 Subject: [RFC PATCH V3 28/43] rv64ilp32_abi: iov_iter: Resize kvec to match iov_iter's size In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-29-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" As drivers/vhost/vringh.c uses BUILD_BUG_ON(sizeof(struct iovec) != sizeof(struct kvec)), make them the same. 
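
The layout requirement can be checked with a fixed-width model such as the one below. Both structs are hypothetical stand-ins: the first assumes struct iovec keeps the lp64 layout (64-bit pointer and length), the second models a kvec with 32-bit pointer and size_t plus the two padding words from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of an lp64 struct iovec: 64-bit pointer and 64-bit length. */
struct sketch_iovec_lp64 {
	uint64_t iov_base;
	uint64_t iov_len;
};

/* Model of the padded kvec: 32-bit fields, each followed by a pad word. */
struct sketch_kvec_ilp32 {
	uint32_t iov_base;
	uint32_t pad1;
	uint32_t iov_len;
	uint32_t pad2;
};

/* The same kind of compile-time check that vringh.c relies on. */
static_assert(sizeof(struct sketch_kvec_ilp32) ==
	      sizeof(struct sketch_iovec_lp64), "sizes must match");
static_assert(offsetof(struct sketch_kvec_ilp32, iov_len) ==
	      offsetof(struct sketch_iovec_lp64, iov_len),
	      "iov_len offsets must match");
```

Dropping either pad word from the model would make one of the static assertions fail, which mirrors the BUILD_BUG_ON the commit message mentions.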
Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- include/linux/uio.h | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/include/linux/uio.h b/include/linux/uio.h index 8ada84e85447..0e1ca023374c 100644 --- a/include/linux/uio.h +++ b/include/linux/uio.h @@ -17,7 +17,13 @@ typedef unsigned int __bitwise iov_iter_extraction_t; struct kvec { void *iov_base; /* and that should *never* hold a userland pointer */ +#if defined(CONFIG_64BIT) && (BITS_PER_LONG == 32) + u32 __pad1; +#endif size_t iov_len; +#if defined(CONFIG_64BIT) && (BITS_PER_LONG == 32) + u32 __pad2; +#endif }; enum iter_type { -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:10 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:10 -0400 Subject: [RFC PATCH V3 29/43] rv64ilp32_abi: locking/atomic: Use BITS_PER_LONG for scripts In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-30-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" In RV64ILP32 ABI systems, BITS_PER_LONG equals 32 and determines code selection, not CONFIG_64BIT. 
Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- include/linux/atomic/atomic-long.h | 174 ++++++++++++++--------------- scripts/atomic/gen-atomic-long.sh | 4 +- 2 files changed, 89 insertions(+), 89 deletions(-) diff --git a/include/linux/atomic/atomic-long.h b/include/linux/atomic/atomic-long.h index f86b29d90877..e31e0bdf9e26 100644 --- a/include/linux/atomic/atomic-long.h +++ b/include/linux/atomic/atomic-long.h @@ -9,7 +9,7 @@ #include #include -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 typedef atomic64_t atomic_long_t; #define ATOMIC_LONG_INIT(i) ATOMIC64_INIT(i) #define atomic_long_cond_read_acquire atomic64_cond_read_acquire @@ -34,7 +34,7 @@ typedef atomic_t atomic_long_t; static __always_inline long raw_atomic_long_read(const atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_read(v); #else return raw_atomic_read(v); @@ -54,7 +54,7 @@ raw_atomic_long_read(const atomic_long_t *v) static __always_inline long raw_atomic_long_read_acquire(const atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_read_acquire(v); #else return raw_atomic_read_acquire(v); @@ -75,7 +75,7 @@ raw_atomic_long_read_acquire(const atomic_long_t *v) static __always_inline void raw_atomic_long_set(atomic_long_t *v, long i) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 raw_atomic64_set(v, i); #else raw_atomic_set(v, i); @@ -96,7 +96,7 @@ raw_atomic_long_set(atomic_long_t *v, long i) static __always_inline void raw_atomic_long_set_release(atomic_long_t *v, long i) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 raw_atomic64_set_release(v, i); #else raw_atomic_set_release(v, i); @@ -117,7 +117,7 @@ raw_atomic_long_set_release(atomic_long_t *v, long i) static __always_inline void raw_atomic_long_add(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 raw_atomic64_add(i, v); #else raw_atomic_add(i, v); @@ -138,7 +138,7 @@ raw_atomic_long_add(long i, atomic_long_t *v) static __always_inline long 
raw_atomic_long_add_return(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_add_return(i, v); #else return raw_atomic_add_return(i, v); @@ -159,7 +159,7 @@ raw_atomic_long_add_return(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_add_return_acquire(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_add_return_acquire(i, v); #else return raw_atomic_add_return_acquire(i, v); @@ -180,7 +180,7 @@ raw_atomic_long_add_return_acquire(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_add_return_release(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_add_return_release(i, v); #else return raw_atomic_add_return_release(i, v); @@ -201,7 +201,7 @@ raw_atomic_long_add_return_release(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_add_return_relaxed(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_add_return_relaxed(i, v); #else return raw_atomic_add_return_relaxed(i, v); @@ -222,7 +222,7 @@ raw_atomic_long_add_return_relaxed(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_add(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_add(i, v); #else return raw_atomic_fetch_add(i, v); @@ -243,7 +243,7 @@ raw_atomic_long_fetch_add(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_add_acquire(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_add_acquire(i, v); #else return raw_atomic_fetch_add_acquire(i, v); @@ -264,7 +264,7 @@ raw_atomic_long_fetch_add_acquire(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_add_release(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_add_release(i, v); #else return 
raw_atomic_fetch_add_release(i, v); @@ -285,7 +285,7 @@ raw_atomic_long_fetch_add_release(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_add_relaxed(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_add_relaxed(i, v); #else return raw_atomic_fetch_add_relaxed(i, v); @@ -306,7 +306,7 @@ raw_atomic_long_fetch_add_relaxed(long i, atomic_long_t *v) static __always_inline void raw_atomic_long_sub(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 raw_atomic64_sub(i, v); #else raw_atomic_sub(i, v); @@ -327,7 +327,7 @@ raw_atomic_long_sub(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_sub_return(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_sub_return(i, v); #else return raw_atomic_sub_return(i, v); @@ -348,7 +348,7 @@ raw_atomic_long_sub_return(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_sub_return_acquire(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_sub_return_acquire(i, v); #else return raw_atomic_sub_return_acquire(i, v); @@ -369,7 +369,7 @@ raw_atomic_long_sub_return_acquire(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_sub_return_release(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_sub_return_release(i, v); #else return raw_atomic_sub_return_release(i, v); @@ -390,7 +390,7 @@ raw_atomic_long_sub_return_release(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_sub_return_relaxed(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_sub_return_relaxed(i, v); #else return raw_atomic_sub_return_relaxed(i, v); @@ -411,7 +411,7 @@ raw_atomic_long_sub_return_relaxed(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_sub(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if 
BITS_PER_LONG == 64 return raw_atomic64_fetch_sub(i, v); #else return raw_atomic_fetch_sub(i, v); @@ -432,7 +432,7 @@ raw_atomic_long_fetch_sub(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_sub_acquire(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_sub_acquire(i, v); #else return raw_atomic_fetch_sub_acquire(i, v); @@ -453,7 +453,7 @@ raw_atomic_long_fetch_sub_acquire(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_sub_release(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_sub_release(i, v); #else return raw_atomic_fetch_sub_release(i, v); @@ -474,7 +474,7 @@ raw_atomic_long_fetch_sub_release(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_sub_relaxed(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_sub_relaxed(i, v); #else return raw_atomic_fetch_sub_relaxed(i, v); @@ -494,7 +494,7 @@ raw_atomic_long_fetch_sub_relaxed(long i, atomic_long_t *v) static __always_inline void raw_atomic_long_inc(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 raw_atomic64_inc(v); #else raw_atomic_inc(v); @@ -514,7 +514,7 @@ raw_atomic_long_inc(atomic_long_t *v) static __always_inline long raw_atomic_long_inc_return(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_inc_return(v); #else return raw_atomic_inc_return(v); @@ -534,7 +534,7 @@ raw_atomic_long_inc_return(atomic_long_t *v) static __always_inline long raw_atomic_long_inc_return_acquire(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_inc_return_acquire(v); #else return raw_atomic_inc_return_acquire(v); @@ -554,7 +554,7 @@ raw_atomic_long_inc_return_acquire(atomic_long_t *v) static __always_inline long raw_atomic_long_inc_return_release(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 
return raw_atomic64_inc_return_release(v); #else return raw_atomic_inc_return_release(v); @@ -574,7 +574,7 @@ raw_atomic_long_inc_return_release(atomic_long_t *v) static __always_inline long raw_atomic_long_inc_return_relaxed(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_inc_return_relaxed(v); #else return raw_atomic_inc_return_relaxed(v); @@ -594,7 +594,7 @@ raw_atomic_long_inc_return_relaxed(atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_inc(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_inc(v); #else return raw_atomic_fetch_inc(v); @@ -614,7 +614,7 @@ raw_atomic_long_fetch_inc(atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_inc_acquire(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_inc_acquire(v); #else return raw_atomic_fetch_inc_acquire(v); @@ -634,7 +634,7 @@ raw_atomic_long_fetch_inc_acquire(atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_inc_release(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_inc_release(v); #else return raw_atomic_fetch_inc_release(v); @@ -654,7 +654,7 @@ raw_atomic_long_fetch_inc_release(atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_inc_relaxed(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_inc_relaxed(v); #else return raw_atomic_fetch_inc_relaxed(v); @@ -674,7 +674,7 @@ raw_atomic_long_fetch_inc_relaxed(atomic_long_t *v) static __always_inline void raw_atomic_long_dec(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 raw_atomic64_dec(v); #else raw_atomic_dec(v); @@ -694,7 +694,7 @@ raw_atomic_long_dec(atomic_long_t *v) static __always_inline long raw_atomic_long_dec_return(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_dec_return(v); #else return raw_atomic_dec_return(v); @@ -714,7 
+714,7 @@ raw_atomic_long_dec_return(atomic_long_t *v) static __always_inline long raw_atomic_long_dec_return_acquire(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_dec_return_acquire(v); #else return raw_atomic_dec_return_acquire(v); @@ -734,7 +734,7 @@ raw_atomic_long_dec_return_acquire(atomic_long_t *v) static __always_inline long raw_atomic_long_dec_return_release(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_dec_return_release(v); #else return raw_atomic_dec_return_release(v); @@ -754,7 +754,7 @@ raw_atomic_long_dec_return_release(atomic_long_t *v) static __always_inline long raw_atomic_long_dec_return_relaxed(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_dec_return_relaxed(v); #else return raw_atomic_dec_return_relaxed(v); @@ -774,7 +774,7 @@ raw_atomic_long_dec_return_relaxed(atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_dec(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_dec(v); #else return raw_atomic_fetch_dec(v); @@ -794,7 +794,7 @@ raw_atomic_long_fetch_dec(atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_dec_acquire(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_dec_acquire(v); #else return raw_atomic_fetch_dec_acquire(v); @@ -814,7 +814,7 @@ raw_atomic_long_fetch_dec_acquire(atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_dec_release(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_dec_release(v); #else return raw_atomic_fetch_dec_release(v); @@ -834,7 +834,7 @@ raw_atomic_long_fetch_dec_release(atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_dec_relaxed(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_dec_relaxed(v); #else return raw_atomic_fetch_dec_relaxed(v); @@ -855,7 +855,7 @@ 
raw_atomic_long_fetch_dec_relaxed(atomic_long_t *v) static __always_inline void raw_atomic_long_and(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 raw_atomic64_and(i, v); #else raw_atomic_and(i, v); @@ -876,7 +876,7 @@ raw_atomic_long_and(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_and(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_and(i, v); #else return raw_atomic_fetch_and(i, v); @@ -897,7 +897,7 @@ raw_atomic_long_fetch_and(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_and_acquire(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_and_acquire(i, v); #else return raw_atomic_fetch_and_acquire(i, v); @@ -918,7 +918,7 @@ raw_atomic_long_fetch_and_acquire(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_and_release(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_and_release(i, v); #else return raw_atomic_fetch_and_release(i, v); @@ -939,7 +939,7 @@ raw_atomic_long_fetch_and_release(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_and_relaxed(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_and_relaxed(i, v); #else return raw_atomic_fetch_and_relaxed(i, v); @@ -960,7 +960,7 @@ raw_atomic_long_fetch_and_relaxed(long i, atomic_long_t *v) static __always_inline void raw_atomic_long_andnot(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 raw_atomic64_andnot(i, v); #else raw_atomic_andnot(i, v); @@ -981,7 +981,7 @@ raw_atomic_long_andnot(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_andnot(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_andnot(i, v); #else return raw_atomic_fetch_andnot(i, v); @@ -1002,7 +1002,7 @@ 
raw_atomic_long_fetch_andnot(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_andnot_acquire(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_andnot_acquire(i, v); #else return raw_atomic_fetch_andnot_acquire(i, v); @@ -1023,7 +1023,7 @@ raw_atomic_long_fetch_andnot_acquire(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_andnot_release(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_andnot_release(i, v); #else return raw_atomic_fetch_andnot_release(i, v); @@ -1044,7 +1044,7 @@ raw_atomic_long_fetch_andnot_release(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_andnot_relaxed(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_andnot_relaxed(i, v); #else return raw_atomic_fetch_andnot_relaxed(i, v); @@ -1065,7 +1065,7 @@ raw_atomic_long_fetch_andnot_relaxed(long i, atomic_long_t *v) static __always_inline void raw_atomic_long_or(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 raw_atomic64_or(i, v); #else raw_atomic_or(i, v); @@ -1086,7 +1086,7 @@ raw_atomic_long_or(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_or(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_or(i, v); #else return raw_atomic_fetch_or(i, v); @@ -1107,7 +1107,7 @@ raw_atomic_long_fetch_or(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_or_acquire(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_or_acquire(i, v); #else return raw_atomic_fetch_or_acquire(i, v); @@ -1128,7 +1128,7 @@ raw_atomic_long_fetch_or_acquire(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_or_release(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return 
raw_atomic64_fetch_or_release(i, v); #else return raw_atomic_fetch_or_release(i, v); @@ -1149,7 +1149,7 @@ raw_atomic_long_fetch_or_release(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_or_relaxed(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_or_relaxed(i, v); #else return raw_atomic_fetch_or_relaxed(i, v); @@ -1170,7 +1170,7 @@ raw_atomic_long_fetch_or_relaxed(long i, atomic_long_t *v) static __always_inline void raw_atomic_long_xor(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 raw_atomic64_xor(i, v); #else raw_atomic_xor(i, v); @@ -1191,7 +1191,7 @@ raw_atomic_long_xor(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_xor(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_xor(i, v); #else return raw_atomic_fetch_xor(i, v); @@ -1212,7 +1212,7 @@ raw_atomic_long_fetch_xor(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_xor_acquire(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_xor_acquire(i, v); #else return raw_atomic_fetch_xor_acquire(i, v); @@ -1233,7 +1233,7 @@ raw_atomic_long_fetch_xor_acquire(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_xor_release(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_xor_release(i, v); #else return raw_atomic_fetch_xor_release(i, v); @@ -1254,7 +1254,7 @@ raw_atomic_long_fetch_xor_release(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_fetch_xor_relaxed(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_xor_relaxed(i, v); #else return raw_atomic_fetch_xor_relaxed(i, v); @@ -1275,7 +1275,7 @@ raw_atomic_long_fetch_xor_relaxed(long i, atomic_long_t *v) static __always_inline long raw_atomic_long_xchg(atomic_long_t *v, 
long new) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_xchg(v, new); #else return raw_atomic_xchg(v, new); @@ -1296,7 +1296,7 @@ raw_atomic_long_xchg(atomic_long_t *v, long new) static __always_inline long raw_atomic_long_xchg_acquire(atomic_long_t *v, long new) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_xchg_acquire(v, new); #else return raw_atomic_xchg_acquire(v, new); @@ -1317,7 +1317,7 @@ raw_atomic_long_xchg_acquire(atomic_long_t *v, long new) static __always_inline long raw_atomic_long_xchg_release(atomic_long_t *v, long new) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_xchg_release(v, new); #else return raw_atomic_xchg_release(v, new); @@ -1338,7 +1338,7 @@ raw_atomic_long_xchg_release(atomic_long_t *v, long new) static __always_inline long raw_atomic_long_xchg_relaxed(atomic_long_t *v, long new) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_xchg_relaxed(v, new); #else return raw_atomic_xchg_relaxed(v, new); @@ -1361,7 +1361,7 @@ raw_atomic_long_xchg_relaxed(atomic_long_t *v, long new) static __always_inline long raw_atomic_long_cmpxchg(atomic_long_t *v, long old, long new) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_cmpxchg(v, old, new); #else return raw_atomic_cmpxchg(v, old, new); @@ -1384,7 +1384,7 @@ raw_atomic_long_cmpxchg(atomic_long_t *v, long old, long new) static __always_inline long raw_atomic_long_cmpxchg_acquire(atomic_long_t *v, long old, long new) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_cmpxchg_acquire(v, old, new); #else return raw_atomic_cmpxchg_acquire(v, old, new); @@ -1407,7 +1407,7 @@ raw_atomic_long_cmpxchg_acquire(atomic_long_t *v, long old, long new) static __always_inline long raw_atomic_long_cmpxchg_release(atomic_long_t *v, long old, long new) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_cmpxchg_release(v, old, new); #else return raw_atomic_cmpxchg_release(v, old, 
new); @@ -1430,7 +1430,7 @@ raw_atomic_long_cmpxchg_release(atomic_long_t *v, long old, long new) static __always_inline long raw_atomic_long_cmpxchg_relaxed(atomic_long_t *v, long old, long new) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_cmpxchg_relaxed(v, old, new); #else return raw_atomic_cmpxchg_relaxed(v, old, new); @@ -1454,7 +1454,7 @@ raw_atomic_long_cmpxchg_relaxed(atomic_long_t *v, long old, long new) static __always_inline bool raw_atomic_long_try_cmpxchg(atomic_long_t *v, long *old, long new) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_try_cmpxchg(v, (s64 *)old, new); #else return raw_atomic_try_cmpxchg(v, (int *)old, new); @@ -1478,7 +1478,7 @@ raw_atomic_long_try_cmpxchg(atomic_long_t *v, long *old, long new) static __always_inline bool raw_atomic_long_try_cmpxchg_acquire(atomic_long_t *v, long *old, long new) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_try_cmpxchg_acquire(v, (s64 *)old, new); #else return raw_atomic_try_cmpxchg_acquire(v, (int *)old, new); @@ -1502,7 +1502,7 @@ raw_atomic_long_try_cmpxchg_acquire(atomic_long_t *v, long *old, long new) static __always_inline bool raw_atomic_long_try_cmpxchg_release(atomic_long_t *v, long *old, long new) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_try_cmpxchg_release(v, (s64 *)old, new); #else return raw_atomic_try_cmpxchg_release(v, (int *)old, new); @@ -1526,7 +1526,7 @@ raw_atomic_long_try_cmpxchg_release(atomic_long_t *v, long *old, long new) static __always_inline bool raw_atomic_long_try_cmpxchg_relaxed(atomic_long_t *v, long *old, long new) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_try_cmpxchg_relaxed(v, (s64 *)old, new); #else return raw_atomic_try_cmpxchg_relaxed(v, (int *)old, new); @@ -1547,7 +1547,7 @@ raw_atomic_long_try_cmpxchg_relaxed(atomic_long_t *v, long *old, long new) static __always_inline bool raw_atomic_long_sub_and_test(long i, atomic_long_t *v) { -#ifdef 
CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_sub_and_test(i, v); #else return raw_atomic_sub_and_test(i, v); @@ -1567,7 +1567,7 @@ raw_atomic_long_sub_and_test(long i, atomic_long_t *v) static __always_inline bool raw_atomic_long_dec_and_test(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_dec_and_test(v); #else return raw_atomic_dec_and_test(v); @@ -1587,7 +1587,7 @@ raw_atomic_long_dec_and_test(atomic_long_t *v) static __always_inline bool raw_atomic_long_inc_and_test(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_inc_and_test(v); #else return raw_atomic_inc_and_test(v); @@ -1608,7 +1608,7 @@ raw_atomic_long_inc_and_test(atomic_long_t *v) static __always_inline bool raw_atomic_long_add_negative(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_add_negative(i, v); #else return raw_atomic_add_negative(i, v); @@ -1629,7 +1629,7 @@ raw_atomic_long_add_negative(long i, atomic_long_t *v) static __always_inline bool raw_atomic_long_add_negative_acquire(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_add_negative_acquire(i, v); #else return raw_atomic_add_negative_acquire(i, v); @@ -1650,7 +1650,7 @@ raw_atomic_long_add_negative_acquire(long i, atomic_long_t *v) static __always_inline bool raw_atomic_long_add_negative_release(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_add_negative_release(i, v); #else return raw_atomic_add_negative_release(i, v); @@ -1671,7 +1671,7 @@ raw_atomic_long_add_negative_release(long i, atomic_long_t *v) static __always_inline bool raw_atomic_long_add_negative_relaxed(long i, atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_add_negative_relaxed(i, v); #else return raw_atomic_add_negative_relaxed(i, v); @@ -1694,7 +1694,7 @@ raw_atomic_long_add_negative_relaxed(long i, atomic_long_t 
*v) static __always_inline long raw_atomic_long_fetch_add_unless(atomic_long_t *v, long a, long u) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_fetch_add_unless(v, a, u); #else return raw_atomic_fetch_add_unless(v, a, u); @@ -1717,7 +1717,7 @@ raw_atomic_long_fetch_add_unless(atomic_long_t *v, long a, long u) static __always_inline bool raw_atomic_long_add_unless(atomic_long_t *v, long a, long u) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_add_unless(v, a, u); #else return raw_atomic_add_unless(v, a, u); @@ -1738,7 +1738,7 @@ raw_atomic_long_add_unless(atomic_long_t *v, long a, long u) static __always_inline bool raw_atomic_long_inc_not_zero(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_inc_not_zero(v); #else return raw_atomic_inc_not_zero(v); @@ -1759,7 +1759,7 @@ raw_atomic_long_inc_not_zero(atomic_long_t *v) static __always_inline bool raw_atomic_long_inc_unless_negative(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_inc_unless_negative(v); #else return raw_atomic_inc_unless_negative(v); @@ -1780,7 +1780,7 @@ raw_atomic_long_inc_unless_negative(atomic_long_t *v) static __always_inline bool raw_atomic_long_dec_unless_positive(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_dec_unless_positive(v); #else return raw_atomic_dec_unless_positive(v); @@ -1801,7 +1801,7 @@ raw_atomic_long_dec_unless_positive(atomic_long_t *v) static __always_inline long raw_atomic_long_dec_if_positive(atomic_long_t *v) { -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return raw_atomic64_dec_if_positive(v); #else return raw_atomic_dec_if_positive(v); @@ -1809,4 +1809,4 @@ raw_atomic_long_dec_if_positive(atomic_long_t *v) } #endif /* _LINUX_ATOMIC_LONG_H */ -// eadf183c3600b8b92b91839dd3be6bcc560c752d +// 1b27315f1248fc8d43401372db7dd5895889c5be diff --git a/scripts/atomic/gen-atomic-long.sh b/scripts/atomic/gen-atomic-long.sh 
index 9826be3ba986..7667305381fc 100755 --- a/scripts/atomic/gen-atomic-long.sh +++ b/scripts/atomic/gen-atomic-long.sh @@ -55,7 +55,7 @@ cat < #include -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 typedef atomic64_t atomic_long_t; #define ATOMIC_LONG_INIT(i) ATOMIC64_INIT(i) #define atomic_long_cond_read_acquire atomic64_cond_read_acquire -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:11 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:11 -0400 Subject: [RFC PATCH V3 30/43] rv64ilp32_abi: kernel/smp: Disable CSD_LOCK_WAIT_DEBUG In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-31-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The rv64ilp32 abi is based on CONFIG_64BIT, but uses ILP32 for a smaller cache & memory footprint. So, disable CSD_LOCK_WAIT_DEBUG to get smaller csd struct. Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- include/linux/smp_types.h | 2 +- lib/Kconfig.debug | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/include/linux/smp_types.h b/include/linux/smp_types.h index 2e8461af8df6..5912b694059f 100644 --- a/include/linux/smp_types.h +++ b/include/linux/smp_types.h @@ -61,7 +61,7 @@ struct __call_single_node { unsigned int u_flags; atomic_t a_flags; }; -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 u16 src, dst; #endif }; diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 1af972a92d06..f55f0ded826c 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1613,6 +1613,7 @@ config CSD_LOCK_WAIT_DEBUG depends on DEBUG_KERNEL depends on SMP depends on 64BIT + depends on !ABI_RV64ILP32 default n help This option enables debug prints when CPUs are slow to respond -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:12 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:12 -0400 Subject: [RFC PATCH V3 31/43] rv64ilp32_abi: maple_tree: Use 
BITS_PER_LONG instead of CONFIG_64BIT In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-32-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The Maple tree algorithm uses ulong type for each element. The number of slots is based on BITS_PER_LONG for RV64ILP32 ABI, so use BITS_PER_LONG instead of CONFIG_64BIT. Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- include/linux/maple_tree.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h index cbbcd18d4186..ff6265b6468b 100644 --- a/include/linux/maple_tree.h +++ b/include/linux/maple_tree.h @@ -24,7 +24,7 @@ * * Nodes in the tree point to their parent unless bit 0 is set. */ -#if defined(CONFIG_64BIT) || defined(BUILD_VDSO32_64) +#if (BITS_PER_LONG == 64) || defined(BUILD_VDSO32_64) /* 64bit sizes */ #define MAPLE_NODE_SLOTS 31 /* 256 bytes including ->parent */ #define MAPLE_RANGE64_SLOTS 16 /* 256 bytes */ -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:13 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:13 -0400 Subject: [RFC PATCH V3 32/43] rv64ilp32_abi: mm: Remove _folio_nr_pages In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-33-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" BITS_PER_LONG defines the layout of the struct page, and _folio_nr_pages would conflict with page->_mapcount for RV64ILP32 ABI. 
Here is the problem:

BUG: Bad page state in process kworker/u4:0 pfn:61eed
page: refcount:0 mapcount:5 mapping:00000000 index:0x0 pfn:0x61eed
flags: 0x0(zone=0)
raw: 00000000 00000000 ffffffff 00000000 00000000 00000000 00000004 00000000
raw: 00000000 00000000
page dumped because: nonzero mapcount
Modules linked in:
CPU: 0 UID: 0 PID: 11 Comm: kworker/u4:0 Not tainted 6.13.0-rc4
Hardware name: riscv-virtio,qemu (DT)
Workqueue: async async_run_entry_fn
Call Trace:
[] dump_backtrace+0x1e/0x26
[] show_stack+0x2a/0x38
[] dump_stack_lvl+0x4a/0x68
[] dump_stack+0x16/0x1e
[] bad_page+0x120/0x142
[] free_unref_page+0x510/0x5f8
[] __folio_put+0x6a/0xbc
[] free_large_kmalloc+0x6a/0xb8
[] kfree+0x23c/0x300
[] unpack_to_rootfs+0x27c/0x2c0
[] do_populate_rootfs+0x24/0x12e
[] async_run_entry_fn+0x26/0xcc
[] process_one_work+0x136/0x224
[] worker_thread+0x234/0x30a
[] kthread+0xca/0xe6
[] ret_from_fork+0xe/0x18
Disabling lock debugging due to kernel taint

So, remove _folio_nr_pages, as is already done for CONFIG_32BIT, and use "_flags_1 & 0xff" instead.
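For illustration only (this is not part of the patch), the fallback can be sketched in plain C. The names demo_folio, demo_set_order and demo_nr_pages are hypothetical stand-ins; the sketch assumes, as the diff below suggests, that the folio order lives in the low 8 bits of _flags_1, so the page count can be recomputed instead of being cached in a separate field.

```c
#include <assert.h>

/* Hypothetical stand-in for the part of struct folio that matters here. */
struct demo_folio {
	unsigned long flags_1;	/* low 8 bits hold the folio order */
};

/* Mirrors folio_set_order(): store the order in the low byte only. */
static inline void demo_set_order(struct demo_folio *folio, unsigned int order)
{
	folio->flags_1 = (folio->flags_1 & ~0xffUL) | order;
}

/* Mirrors the "#else" branch of folio_nr_pages(): derive the count from the order. */
static inline long demo_nr_pages(const struct demo_folio *folio)
{
	return 1L << (folio->flags_1 & 0xff);
}
```

Because the count is derived on demand, no extra field has to fit into the page-sized layout when long is 32 bits wide.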
Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- include/linux/mm.h | 4 ++-- include/linux/mm_types.h | 2 +- mm/internal.h | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 7b1068ddcbb7..454fb8ca724c 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2058,7 +2058,7 @@ static inline long folio_nr_pages(const struct folio *folio) { if (!folio_test_large(folio)) return 1; -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return folio->_folio_nr_pages; #else return 1L << (folio->_flags_1 & 0xff); @@ -2083,7 +2083,7 @@ static inline unsigned long compound_nr(struct page *page) if (!test_bit(PG_head, &folio->flags)) return 1; -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 return folio->_folio_nr_pages; #else return 1L << (folio->_flags_1 & 0xff); diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 6b27db7f9496..da3ba1a79ad5 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -370,7 +370,7 @@ struct folio { atomic_t _entire_mapcount; atomic_t _nr_pages_mapped; atomic_t _pincount; -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 unsigned int _folio_nr_pages; #endif /* private: the union with struct page is transitional */ diff --git a/mm/internal.h b/mm/internal.h index 109ef30fee11..c9372a8552ba 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -682,7 +682,7 @@ static inline void folio_set_order(struct folio *folio, unsigned int order) return; folio->_flags_1 = (folio->_flags_1 & ~0xffUL) | order; -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 folio->_folio_nr_pages = 1U << order; #endif } -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:14 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:14 -0400 Subject: [RFC PATCH V3 33/43] rv64ilp32_abi: mm/auxvec: Adapt mm->saved_auxv[] to Elf64 In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: 
<20250325121624.523258-34-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" Unable to handle kernel paging request at virtual address 60723de0 Oops [#1] Modules linked in: CPU: 0 UID: 0 PID: 1 Comm: init Not tainted 6.13.0-rc4-00031-g01dc3ca797b3-dirty #161 Hardware name: riscv-virtio,qemu (DT) epc : percpu_counter_add_batch+0x38/0xc4 ra : filemap_map_pages+0x3ec/0x54c epc : ffffffffbc4ea02e ra : ffffffffbc1722e4 sp : ffffffffc1c4fc60 gp : ffffffffbd6d3918 tp : ffffffffc1c50000 t0 : 0000000000000000 t1 : 000000003fffefff t2 : 0000000000000000 s0 : ffffffffc1c4fca0 s1 : 0000000000000022 a0 : ffffffffc25c8250 a1 : 0000000000000003 a2 : 0000000000000020 a3 : 000000003fffefff a4 : 000000000b1c2000 a5 : 0000000060723de0 a6 : ffffffffbffff000 a7 : 000000003fffffff s2 : ffffffffc25c8250 s3 : ffffffffc246e240 s4 : ffffffffc2138240 s5 : ffffffffbd70c4d0 s6 : 0000000000000003 s7 : 0000000000000000 s8 : ffffffff9a02d780 s9 : 0000000000000100 s10: ffffffffc1c4fda8 s11: 0000000000000003 t3 : 0000000000000000 t4 : 00000000000004f7 t5 : 0000000000000000 t6 : 0000000000000001 status: 0000000200000100 badaddr: 0000000060723de0 cause: 000000000000000d [] percpu_counter_add_batch+0x38/0xc4 [] filemap_map_pages+0x3ec/0x54c [] handle_mm_fault+0xb6c/0xe9c [] handle_page_fault+0xd0/0x418 [] do_page_fault+0x20/0x3a [] _new_vmalloc_restore_context_a0+0xb0/0xbc Code: 8a93 4baa 511c 171b 0027 873b 00ea 4318 2481 9fb9 (aa03) 0007 Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- include/linux/mm_types.h | 4 ++++ kernel/sys.c | 8 ++++++++ 2 files changed, 12 insertions(+) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index da3ba1a79ad5..0d436b0217fd 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -962,7 +962,11 @@ struct mm_struct { unsigned long start_brk, brk, start_stack; unsigned long arg_start, arg_end, env_start, env_end; +#ifdef CONFIG_64BIT + unsigned long long saved_auxv[AT_VECTOR_SIZE]; /* for /proc/PID/auxv */ +#else unsigned long 
saved_auxv[AT_VECTOR_SIZE]; /* for /proc/PID/auxv */ +#endif struct percpu_counter rss_stat[NR_MM_COUNTERS]; diff --git a/kernel/sys.c b/kernel/sys.c index cb366ff8703a..81c0d94ff50d 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -2008,7 +2008,11 @@ static int validate_prctl_map_addr(struct prctl_mm_map *prctl_map) static int prctl_set_mm_map(int opt, const void __user *addr, unsigned long data_size) { struct prctl_mm_map prctl_map = { .exe_fd = (u32)-1, }; +#ifdef CONFIG_64BIT + unsigned long long user_auxv[AT_VECTOR_SIZE]; +#else unsigned long user_auxv[AT_VECTOR_SIZE]; +#endif struct mm_struct *mm = current->mm; int error; @@ -2122,7 +2126,11 @@ static int prctl_set_auxv(struct mm_struct *mm, unsigned long addr, * up to the caller to provide sane values here, otherwise userspace * tools which use this vector might be unhappy. */ +#ifdef CONFIG_64BIT + unsigned long long user_auxv[AT_VECTOR_SIZE] = {}; +#else unsigned long user_auxv[AT_VECTOR_SIZE] = {}; +#endif if (len > sizeof(user_auxv)) return -EINVAL; -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:15 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:15 -0400 Subject: [RFC PATCH V3 34/43] rv64ilp32_abi: mm: Adapt vm_flags_t struct In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-35-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The RV64ILP32 ABI Linux kernel is based on CONFIG_64BIT, so use unsigned long long as the vm_flags_t type.
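A rough sketch of why the wider type matters, written as standalone C rather than kernel code. It assumes that unsigned long is 32 bits wide under ILP32 and models that with uint32_t; demo_ulong_t, demo_vm_flags_t, DEMO_BIT_ULL and the helper functions are hypothetical names. Flag bits at position 32 and above (the diff below touches bits 38, 39, 40 and 63) would be lost in a 32-bit vm_flags_t.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: model the 32-bit "unsigned long" of an ILP32 build. */
typedef uint32_t demo_ulong_t;
/* Mirrors the patched typedef: 64 bits regardless of the width of long. */
typedef unsigned long long demo_vm_flags_t;

/* Same idea as the kernel's BIT_ULL(): the shift is done in 64 bits. */
#define DEMO_BIT_ULL(nr) (1ULL << (nr))

/* Returns nonzero if the flag bit would survive storage in a 32-bit long. */
static inline int demo_flag_survives_ulong(unsigned int bit)
{
	demo_vm_flags_t flags = DEMO_BIT_ULL(bit);
	demo_ulong_t narrowed = (demo_ulong_t)flags;	/* truncates bits 32..63 */

	return narrowed != 0;
}

/* Tests a flag bit in the 64-bit flags word. */
static inline int demo_flag_test(demo_vm_flags_t flags, unsigned int bit)
{
	return (flags & DEMO_BIT_ULL(bit)) != 0;
}
```

The same reasoning is why the format strings that print vm_flags have to switch from %lx to %llx once the type is widened.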
Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- fs/proc/task_mmu.c | 9 +++++++-- include/linux/mm.h | 10 +++++++--- include/linux/mm_types.h | 4 ++++ mm/debug.c | 4 ++++ mm/memory.c | 4 ++++ 5 files changed, 26 insertions(+), 5 deletions(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index f02cd362309a..6c4eaba794da 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -905,6 +905,11 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end, return 0; } +#ifdef CONFIG_64BIT +#define MNEMONICS_SZ 64 +#else +#define MNEMONICS_SZ 32 +#endif static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) { /* @@ -917,11 +922,11 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma) * -Werror=unterminated-string-initialization warning * with GCC 15 */ - static const char mnemonics[BITS_PER_LONG][3] = { + static const char mnemonics[MNEMONICS_SZ][3] = { /* * In case if we meet a flag we don't know about. */ - [0 ... (BITS_PER_LONG-1)] = "??", + [0 ... 
(MNEMONICS_SZ-1)] = "??", [ilog2(VM_READ)] = "rd", [ilog2(VM_WRITE)] = "wr", diff --git a/include/linux/mm.h b/include/linux/mm.h index 454fb8ca724c..d9735cd7efe9 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -412,7 +412,11 @@ extern unsigned int kobjsize(const void *objp); #ifdef CONFIG_HAVE_ARCH_USERFAULTFD_MINOR # define VM_UFFD_MINOR_BIT 38 +#ifdef CONFIG_64BIT +# define VM_UFFD_MINOR BIT_ULL(VM_UFFD_MINOR_BIT) /* UFFD minor faults */ +#else # define VM_UFFD_MINOR BIT(VM_UFFD_MINOR_BIT) /* UFFD minor faults */ +#endif #else /* !CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */ # define VM_UFFD_MINOR VM_NONE #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */ @@ -426,14 +430,14 @@ extern unsigned int kobjsize(const void *objp); */ #ifdef CONFIG_64BIT #define VM_ALLOW_ANY_UNCACHED_BIT 39 -#define VM_ALLOW_ANY_UNCACHED BIT(VM_ALLOW_ANY_UNCACHED_BIT) +#define VM_ALLOW_ANY_UNCACHED BIT_ULL(VM_ALLOW_ANY_UNCACHED_BIT) #else #define VM_ALLOW_ANY_UNCACHED VM_NONE #endif #ifdef CONFIG_64BIT #define VM_DROPPABLE_BIT 40 -#define VM_DROPPABLE BIT(VM_DROPPABLE_BIT) +#define VM_DROPPABLE BIT_ULL(VM_DROPPABLE_BIT) #elif defined(CONFIG_PPC32) #define VM_DROPPABLE VM_ARCH_1 #else @@ -442,7 +446,7 @@ extern unsigned int kobjsize(const void *objp); #ifdef CONFIG_64BIT /* VM is sealed, in vm_flags */ -#define VM_SEALED _BITUL(63) +#define VM_SEALED _BITULL(63) #endif /* Bits set in the VMA until the stack is in its final location */ diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 0d436b0217fd..900665c5eca8 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -571,7 +571,11 @@ static inline void *folio_get_private(struct folio *folio) return folio->private; } +#ifdef CONFIG_64BIT +typedef unsigned long long vm_flags_t; +#else typedef unsigned long vm_flags_t; +#endif /* * A region containing a mapping of a non-memory backed file under NOMMU diff --git a/mm/debug.c b/mm/debug.c index 8d2acf432385..0fcb85e6efea 100644 --- a/mm/debug.c +++ 
b/mm/debug.c @@ -181,7 +181,11 @@ void dump_vma(const struct vm_area_struct *vma) pr_emerg("vma %px start %px end %px mm %px\n" "prot %lx anon_vma %px vm_ops %px\n" "pgoff %lx file %px private_data %px\n" +#ifdef CONFIG_64BIT + "flags: %#llx(%pGv)\n", +#else "flags: %#lx(%pGv)\n", +#endif vma, (void *)vma->vm_start, (void *)vma->vm_end, vma->vm_mm, (unsigned long)pgprot_val(vma->vm_page_prot), vma->anon_vma, vma->vm_ops, vma->vm_pgoff, diff --git a/mm/memory.c b/mm/memory.c index 539c0f7c6d54..3c4a9663c094 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -533,7 +533,11 @@ static void print_bad_pte(struct vm_area_struct *vma, unsigned long addr, (long long)pte_val(pte), (long long)pmd_val(*pmd)); if (page) dump_page(page, "bad pte"); +#ifdef CONFIG_64BIT + pr_alert("addr:%px vm_flags:%08llx anon_vma:%px mapping:%px index:%lx\n", +#else pr_alert("addr:%px vm_flags:%08lx anon_vma:%px mapping:%px index:%lx\n", +#endif (void *)addr, vma->vm_flags, vma->anon_vma, mapping, index); pr_alert("file:%pD fault:%ps mmap:%ps read_folio:%ps\n", vma->vm_file, -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:16 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:16 -0400 Subject: [RFC PATCH V3 35/43] rv64ilp32_abi: net: Use BITS_PER_LONG in struct dst_entry In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-36-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The rv64ilp32 ABI depends on CONFIG_64BIT but uses the smaller ILP32 data types. To align with the ILP32 layout, change CONFIG_64BIT to BITS_PER_LONG in struct dst_entry.
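As a hedged illustration of the pattern (not the real struct dst_entry), the sketch below selects a struct layout from the width of long rather than from a configuration symbol. DEMO_BITS_PER_LONG, struct demo_entry and the helper functions are hypothetical; the width is derived from ULONG_MAX here because an RV64ILP32 build cannot be reproduced on an ordinary host, whereas the kernel would use BITS_PER_LONG.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumption: stand-in for the kernel's BITS_PER_LONG, derived from <limits.h>. */
#if ULONG_MAX > 0xffffffffUL
#define DEMO_BITS_PER_LONG 64
#else
#define DEMO_BITS_PER_LONG 32
#endif

/* Hypothetical struct whose field order depends on the width of long. */
struct demo_entry {
	void *dev;
	unsigned long expires;
#if DEMO_BITS_PER_LONG == 64
	int refcnt;	/* 64-bit layout: reference count placed early */
#endif
	int use;
#if DEMO_BITS_PER_LONG == 32
	void *state;
	int refcnt;	/* 32-bit layout: reference count placed later */
#endif
};

/* Reports which layout the preprocessor selected. */
static inline int demo_bits_per_long(void)
{
	return DEMO_BITS_PER_LONG;
}

/* Reports where the reference count ended up in the selected layout. */
static inline size_t demo_refcnt_offset(void)
{
	return offsetof(struct demo_entry, refcnt);
}
```

Keying the layout on the width of long keeps the offsets tuned for the data model actually in use, which is the point of replacing the CONFIG_64BIT checks.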
Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- include/net/dst.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/include/net/dst.h b/include/net/dst.h index 78c78cdce0e9..af1c74c4836e 100644 --- a/include/net/dst.h +++ b/include/net/dst.h @@ -65,7 +65,7 @@ struct dst_entry { * __rcuref wants to be on a different cache line from * input/output/ops or performance tanks badly */ -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 rcuref_t __rcuref; /* 64-bit offset 64 */ #endif int __use; @@ -74,7 +74,7 @@ struct dst_entry { short error; short __pad; __u32 tclassid; -#ifndef CONFIG_64BIT +#if BITS_PER_LONG == 32 struct lwtunnel_state *lwtstate; rcuref_t __rcuref; /* 32-bit offset 64 */ #endif @@ -89,7 +89,7 @@ struct dst_entry { */ struct list_head rt_uncached; struct uncached_list *rt_uncached_list; -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 struct lwtunnel_state *lwtstate; #endif }; -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:17 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:17 -0400 Subject: [RFC PATCH V3 36/43] rv64ilp32_abi: printf: Use BITS_PER_LONG instead of CONFIG_64BIT In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-37-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" RV64ILP32 ABI systems have BITS_PER_LONG set to 32. Use BITS_PER_LONG instead of CONFIG_64BIT. Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- lib/vsprintf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/vsprintf.c b/lib/vsprintf.c index 56fe96319292..2d719be86945 100644 --- a/lib/vsprintf.c +++ b/lib/vsprintf.c @@ -771,7 +771,7 @@ static inline int __ptr_to_hashval(const void *ptr, unsigned long *hashval_out) /* Pairs with smp_wmb() after writing ptr_key. 
*/ smp_rmb(); -#ifdef CONFIG_64BIT +#if BITS_PER_LONG == 64 hashval = (unsigned long)siphash_1u64((u64)ptr, &ptr_key); /* * Mask off the first 32 bits, this makes explicit that we have -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:18 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:18 -0400 Subject: [RFC PATCH V3 37/43] rv64ilp32_abi: random: Adapt fast_pool struct In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-38-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" RV64ILP32 ABI systems have BITS_PER_LONG set to 32, matching sizeof(compat_ulong_t). Adjust code Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- drivers/char/random.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/char/random.c b/drivers/char/random.c index 2581186fa61b..0bfbe02ee255 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -1015,7 +1015,11 @@ EXPORT_SYMBOL_GPL(unregister_random_vmfork_notifier); #endif struct fast_pool { +#ifdef CONFIG_64BIT + u64 pool[4]; +#else unsigned long pool[4]; +#endif unsigned long last; unsigned int count; struct timer_list mix; @@ -1040,7 +1044,11 @@ static DEFINE_PER_CPU(struct fast_pool, irq_randomness) = { * and therefore this has no security on its own. s represents the * four-word SipHash state, while v represents a two-word input. 
*/ +#ifdef CONFIG_64BIT +static void fast_mix(u64 s[4], u64 v1, u64 v2) +#else static void fast_mix(unsigned long s[4], unsigned long v1, unsigned long v2) +#endif { s[3] ^= v1; FASTMIX_PERM(s[0], s[1], s[2], s[3]); -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:19 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:19 -0400 Subject: [RFC PATCH V3 38/43] rv64ilp32_abi: syscall: Use CONFIG_64BIT instead of BITS_PER_LONG In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-39-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The RV64ILP32 ABI adopts the syscall rules from CONFIG_64BIT and directly uses 64BIT, replacing BITS_PER_LONG representation. Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- arch/riscv/include/asm/syscall_table.h | 2 +- arch/riscv/include/asm/unistd.h | 4 ++-- scripts/checksyscalls.sh | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/riscv/include/asm/syscall_table.h b/arch/riscv/include/asm/syscall_table.h index 0c2d61782813..aab2bc0ddf4e 100644 --- a/arch/riscv/include/asm/syscall_table.h +++ b/arch/riscv/include/asm/syscall_table.h @@ -1,6 +1,6 @@ #include -#if __BITS_PER_LONG == 64 +#ifdef CONFIG_64BIT #include #else #include diff --git a/arch/riscv/include/asm/unistd.h b/arch/riscv/include/asm/unistd.h index e6d904fa67c5..86b9c1712f24 100644 --- a/arch/riscv/include/asm/unistd.h +++ b/arch/riscv/include/asm/unistd.h @@ -16,10 +16,10 @@ #define __ARCH_WANT_COMPAT_FADVISE64_64 #endif -#if defined(__LP64__) && !defined(__SYSCALL_COMPAT) +#if defined(CONFIG_64BIT) && !defined(__SYSCALL_COMPAT) #define __ARCH_WANT_NEW_STAT #define __ARCH_WANT_SET_GET_RLIMIT -#endif /* __LP64__ */ +#endif /* CONFIG_64BIT */ #define __ARCH_WANT_MEMFD_SECRET diff --git a/scripts/checksyscalls.sh b/scripts/checksyscalls.sh index 1e5d2eeb726d..9cc4f9086dfe 100755 --- a/scripts/checksyscalls.sh +++ 
b/scripts/checksyscalls.sh @@ -76,7 +76,7 @@ cat << EOF #endif /* System calls for 32-bit kernels only */ -#if BITS_PER_LONG == 64 +#ifdef CONFIG_64BIT #define __IGNORE_sendfile64 #define __IGNORE_ftruncate64 #define __IGNORE_truncate64 -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:20 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:20 -0400 Subject: [RFC PATCH V3 39/43] rv64ilp32_abi: sysinfo: Adapt sysinfo structure to lp64 uapi In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-40-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The RISC-V 64ilp32 ABI leverages LP64 uapi and accommodates LP64 ABI userspace directly, necessitating updates to the sysinfo struct's unsigned long and array types with u64. Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- fs/proc/loadavg.c | 10 +++++++--- include/linux/sched/loadavg.h | 4 ++++ include/uapi/linux/sysinfo.h | 20 ++++++++++++++++++++ kernel/sched/loadavg.c | 4 ++++ 4 files changed, 35 insertions(+), 3 deletions(-) diff --git a/fs/proc/loadavg.c b/fs/proc/loadavg.c index 817981e57223..643e06de3446 100644 --- a/fs/proc/loadavg.c +++ b/fs/proc/loadavg.c @@ -13,14 +13,18 @@ static int loadavg_proc_show(struct seq_file *m, void *v) { +#if defined(CONFIG_64BIT) && (BITS_PER_LONG == 32) + unsigned long long avnrun[3]; +#else unsigned long avnrun[3]; +#endif get_avenrun(avnrun, FIXED_1/200, 0); seq_printf(m, "%lu.%02lu %lu.%02lu %lu.%02lu %u/%d %d\n", - LOAD_INT(avnrun[0]), LOAD_FRAC(avnrun[0]), - LOAD_INT(avnrun[1]), LOAD_FRAC(avnrun[1]), - LOAD_INT(avnrun[2]), LOAD_FRAC(avnrun[2]), + LOAD_INT((ulong)avnrun[0]), LOAD_FRAC((ulong)avnrun[0]), + LOAD_INT((ulong)avnrun[1]), LOAD_FRAC((ulong)avnrun[1]), + LOAD_INT((ulong)avnrun[2]), LOAD_FRAC((ulong)avnrun[2]), nr_running(), nr_threads, idr_get_cursor(&task_active_pid_ns(current)->idr) - 1); return 0; diff --git 
a/include/linux/sched/loadavg.h b/include/linux/sched/loadavg.h index 83ec54b65e79..8f2d6a827ee9 100644 --- a/include/linux/sched/loadavg.h +++ b/include/linux/sched/loadavg.h @@ -13,7 +13,11 @@ * 11 bit fractions. */ extern unsigned long avenrun[]; /* Load averages */ +#if defined(CONFIG_64BIT) && (BITS_PER_LONG == 32) +extern void get_avenrun(unsigned long long *loads, unsigned long offset, int shift); +#else extern void get_avenrun(unsigned long *loads, unsigned long offset, int shift); +#endif #define FSHIFT 11 /* nr of bits of precision */ #define FIXED_1 (1< #define SI_LOAD_SHIFT 16 + +#if (__riscv_xlen == 64) && (__BITS_PER_LONG == 32) +struct sysinfo { + __s64 uptime; /* Seconds since boot */ + __u64 loads[3]; /* 1, 5, and 15 minute load averages */ + __u64 totalram; /* Total usable main memory size */ + __u64 freeram; /* Available memory size */ + __u64 sharedram; /* Amount of shared memory */ + __u64 bufferram; /* Memory used by buffers */ + __u64 totalswap; /* Total swap space size */ + __u64 freeswap; /* swap space still available */ + __u16 procs; /* Number of current processes */ + __u16 pad; /* Explicit padding for m68k */ + __u64 totalhigh; /* Total high memory size */ + __u64 freehigh; /* Available high memory size */ + __u32 mem_unit; /* Memory unit size in bytes */ + char _f[20-2*sizeof(__u64)-sizeof(__u32)]; /* Padding: libc5 uses this.. */ +}; +#else struct sysinfo { __kernel_long_t uptime; /* Seconds since boot */ __kernel_ulong_t loads[3]; /* 1, 5, and 15 minute load averages */ @@ -21,5 +40,6 @@ struct sysinfo { __u32 mem_unit; /* Memory unit size in bytes */ char _f[20-2*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /* Padding: libc5 uses this.. 
*/ }; +#endif #endif /* _LINUX_SYSINFO_H */ diff --git a/kernel/sched/loadavg.c b/kernel/sched/loadavg.c index c48900b856a2..f1f5abc64dea 100644 --- a/kernel/sched/loadavg.c +++ b/kernel/sched/loadavg.c @@ -68,7 +68,11 @@ EXPORT_SYMBOL(avenrun); /* should be removed */ * * These values are estimates at best, so no need for locking. */ +#if defined(CONFIG_64BIT) && (BITS_PER_LONG == 32) +void get_avenrun(unsigned long long *loads, unsigned long offset, int shift) +#else void get_avenrun(unsigned long *loads, unsigned long offset, int shift) +#endif { loads[0] = (avenrun[0] + offset) << shift; loads[1] = (avenrun[1] + offset) << shift; -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:21 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:21 -0400 Subject: [RFC PATCH V3 40/43] rv64ilp32_abi: tracepoint-defs: Using u64 for trace_print_flags.mask In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-41-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The rv64ilp32 ABI relies on CONFIG_64BIT, and mmflags.h defines VMA flags with BIT_ULL. Consequently, use "unsigned long long" for trace_print_flags.mask to align with VMAflag's type size. 
Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- include/linux/tracepoint-defs.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/include/linux/tracepoint-defs.h b/include/linux/tracepoint-defs.h index aebf0571c736..3b51ede18e32 100644 --- a/include/linux/tracepoint-defs.h +++ b/include/linux/tracepoint-defs.h @@ -14,7 +14,11 @@ struct static_call_key; struct trace_print_flags { +#ifdef CONFIG_64BIT + unsigned long long mask; +#else unsigned long mask; +#endif const char *name; }; -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:22 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:22 -0400 Subject: [RFC PATCH V3 41/43] rv64ilp32_abi: tty: Adapt ptr_to_compat In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-42-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" The RV64ILP32 ABI is based on 64-bit ISA, but BITS_PER_LONG is 32. So, the size of unsigned long is the same as compat_ulong_t and no need "(unsigned long)v.iomem_base >> 32 ? 0xfffffff : ..." detection. Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- drivers/tty/tty_io.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c index 449dbd216460..75e256e879d0 100644 --- a/drivers/tty/tty_io.c +++ b/drivers/tty/tty_io.c @@ -2873,8 +2873,12 @@ static int compat_tty_tiocgserial(struct tty_struct *tty, err = tty->ops->get_serial(tty, &v); if (!err) { memcpy(&v32, &v, offsetof(struct serial_struct32, iomem_base)); +#if BITS_PER_LONG == 64 v32.iomem_base = (unsigned long)v.iomem_base >> 32 ? 
0xfffffff : ptr_to_compat(v.iomem_base); +#else + v32.iomem_base = ptr_to_compat(v.iomem_base); +#endif v32.iomem_reg_shift = v.iomem_reg_shift; v32.port_high = v.port_high; if (copy_to_user(ss, &v32, sizeof(v32))) -- 2.40.1 From guoren at kernel.org Tue Mar 25 05:16:23 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:23 -0400 Subject: [RFC PATCH V3 42/43] rv64ilp32_abi: memfd: Use vm_flag_t In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-43-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" RV64ILP32 ABI linux kernel is based on CONFIG_64BIT, and uses unsigned long long as vm_flags_t. Using unsigned long would break rv64ilp32 abi. The definition of vm_flag_t exists, hence its usage is preferred even if it's not essential. Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- include/linux/memfd.h | 4 ++-- mm/memfd.c | 8 ++++---- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/include/linux/memfd.h b/include/linux/memfd.h index 246daadbfde8..6f606d9573c3 100644 --- a/include/linux/memfd.h +++ b/include/linux/memfd.h @@ -14,7 +14,7 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx); * We also update VMA flags if appropriate by manipulating the VMA flags pointed * to by vm_flags_ptr. 
*/ -int memfd_check_seals_mmap(struct file *file, unsigned long *vm_flags_ptr); +int memfd_check_seals_mmap(struct file *file, vm_flags_t *vm_flags_ptr); #else static inline long memfd_fcntl(struct file *f, unsigned int c, unsigned int a) { @@ -25,7 +25,7 @@ static inline struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx) return ERR_PTR(-EINVAL); } static inline int memfd_check_seals_mmap(struct file *file, - unsigned long *vm_flags_ptr) + vm_flags_t *vm_flags_ptr) { return 0; } diff --git a/mm/memfd.c b/mm/memfd.c index 37f7be57c2f5..50dad90ffedc 100644 --- a/mm/memfd.c +++ b/mm/memfd.c @@ -332,10 +332,10 @@ static inline bool is_write_sealed(unsigned int seals) return seals & (F_SEAL_WRITE | F_SEAL_FUTURE_WRITE); } -static int check_write_seal(unsigned long *vm_flags_ptr) +static int check_write_seal(vm_flags_t *vm_flags_ptr) { - unsigned long vm_flags = *vm_flags_ptr; - unsigned long mask = vm_flags & (VM_SHARED | VM_WRITE); + vm_flags_t vm_flags = *vm_flags_ptr; + vm_flags_t mask = vm_flags & (VM_SHARED | VM_WRITE); /* If a private matting then writability is irrelevant. 
*/ if (!(mask & VM_SHARED)) @@ -357,7 +357,7 @@ static int check_write_seal(unsigned long *vm_flags_ptr) return 0; } -int memfd_check_seals_mmap(struct file *file, unsigned long *vm_flags_ptr) +int memfd_check_seals_mmap(struct file *file, vm_flags_t *vm_flags_ptr) { int err = 0; unsigned int *seals_ptr = memfd_file_seals_ptr(file); -- 2.40.1 From peterz at infradead.org Tue Mar 25 05:26:40 2025 From: peterz at infradead.org (Peter Zijlstra) Date: Tue, 25 Mar 2025 13:26:40 +0100 Subject: [RFC PATCH V3 00/43] rv64ilp32_abi: Build CONFIG_64BIT kernel-self with ILP32 ABI In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325122640.GK36322@noisy.programming.kicks-ass.net> On Tue, Mar 25, 2025 at 08:15:41AM -0400, guoren at kernel.org wrote: > From: "Guo Ren (Alibaba DAMO Academy)" > > Since 2001, the CONFIG_64BIT kernel has been built with the LP64 ABI, > but this patchset allows the CONFIG_64BIT kernel to use an ILP32 ABI I'm thinking you're going to be finding a metric ton of assumptions about 'unsigned long' being 64bit when 64BIT=y throughout the kernel. I know of a couple of places where 64BIT will result in different math such that a 32bit 'unsigned long' will trivially overflow. Please, don't do this. This adds a significant maintenance burden on all of us. 
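As an illustration of the overflow concern raised in the message above (a sketch only, not code from this thread or from the kernel): the two helpers below model an 'unsigned long' that is 32 bits wide, as on an ILP32 build, and one that is 64 bits wide, as on an LP64 build, using fixed-width types so the difference shows up on any host. SKETCH_PAGE_SHIFT and both function names are hypothetical and chosen only for this example.

```c
#include <stdint.h>

/* Assumed page size of 4 KiB; hypothetical constant for this sketch. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Models 'unsigned long' on an ILP32 build: unsigned arithmetic is
 * performed modulo 2^32, so large page counts silently wrap.
 */
static uint32_t pages_to_bytes_ulong32(uint32_t nr_pages)
{
	return nr_pages << SKETCH_PAGE_SHIFT;
}

/* Models 'unsigned long' on an LP64 build: the same math has headroom. */
static uint64_t pages_to_bytes_ulong64(uint64_t nr_pages)
{
	return nr_pages << SKETCH_PAGE_SHIFT;
}
```

With nr_pages = 1 << 20 (4 GiB worth of 4 KiB pages), the 32-bit variant wraps to 0 while the 64-bit variant yields 4294967296. Code written under the assumption that CONFIG_64BIT implies the second behaviour would get the first on a CONFIG_64BIT kernel built with a 32-bit long.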
From guoren at kernel.org Tue Mar 25 05:16:24 2025 From: guoren at kernel.org (guoren at kernel.org) Date: Tue, 25 Mar 2025 08:16:24 -0400 Subject: [RFC PATCH V3 43/43] riscv: Fixup address space overlay of print_mlk In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250325121624.523258-44-guoren@kernel.org> From: "Guo Ren (Alibaba DAMO Academy)" If physical memory is 1GiB for ilp32 linux, then print_mlk would be: lowmem : 0xc0000000 - 0x00000000 ( 1024 MB) After fixup: lowmem : 0xc0000000 - 0xffffffff ( 1024 MB) Signed-off-by: Guo Ren (Alibaba DAMO Academy) --- arch/riscv/mm/init.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index 3cdbb033860e..e09286d4916a 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -105,26 +105,26 @@ static void __init zone_sizes_init(void) static inline void print_mlk(char *name, unsigned long b, unsigned long t) { - pr_notice("%12s : 0x%08lx - 0x%08lx (%4ld kB)\n", name, b, t, + pr_notice("%12s : 0x%08lx - 0x%08lx (%4ld kB)\n", name, b, t - 1, (((t) - (b)) >> LOG2_SZ_1K)); } static inline void print_mlm(char *name, unsigned long b, unsigned long t) { - pr_notice("%12s : 0x%08lx - 0x%08lx (%4ld MB)\n", name, b, t, + pr_notice("%12s : 0x%08lx - 0x%08lx (%4ld MB)\n", name, b, t - 1, (((t) - (b)) >> LOG2_SZ_1M)); } static inline void print_mlg(char *name, unsigned long b, unsigned long t) { - pr_notice("%12s : 0x%08lx - 0x%08lx (%4ld GB)\n", name, b, t, + pr_notice("%12s : 0x%08lx - 0x%08lx (%4ld GB)\n", name, b, t - 1, (((t) - (b)) >> LOG2_SZ_1G)); } #if BITS_PER_LONG == 64 static inline void print_mlt(char *name, unsigned long b, unsigned long t) { - pr_notice("%12s : 0x%08lx - 0x%08lx (%4ld TB)\n", name, b, t, + pr_notice("%12s : 0x%08lx - 0x%08lx (%4ld TB)\n", name, b, t - 1, (((t) - (b)) >> LOG2_SZ_1T)); } #else -- 2.40.1 From guoren at kernel.org Tue Mar 25 06:13:57 2025
From: guoren at kernel.org (Guo Ren)
Date: Tue, 25 Mar 2025 21:13:57 +0800
Subject: [RFC PATCH V3 00/43] rv64ilp32_abi: Build CONFIG_64BIT kernel-self with ILP32 ABI
In-Reply-To: <20250325122640.GK36322@noisy.programming.kicks-ass.net>
References: <20250325121624.523258-1-guoren@kernel.org>
 <20250325122640.GK36322@noisy.programming.kicks-ass.net>
Message-ID:

On Tue, Mar 25, 2025 at 8:27 PM Peter Zijlstra wrote:
>
> On Tue, Mar 25, 2025 at 08:15:41AM -0400, guoren at kernel.org wrote:
> > From: "Guo Ren (Alibaba DAMO Academy)"
> >
> > Since 2001, the CONFIG_64BIT kernel has been built with the LP64 ABI,
> > but this patchset allows the CONFIG_64BIT kernel to use an ILP32 ABI
>
> I'm thinking you're going to be finding a metric ton of assumptions
> about 'unsigned long' being 64bit when 64BIT=y throughout the kernel.
Less than you imagined. Most code is compatible with ILP32 ABI due to
the CONFIG_32BIT. In my practice, it's deemed acceptable.

>
> I know of a couple of places where 64BIT will result in different math
> such that a 32bit 'unsigned long' will trivially overflow.
I would be grateful if you could share some with me.

>
> Please, don't do this. This adds a significant maintenance burden on all
> of us.
The 64ILP32 ABI would bear the maintenance burden, not traditional
64-bit or 32-bit ABIs. The patch set won't impact other CONFIG_64BIT or
CONFIG_32BIT. Numerous RV64 chips require the RV64ILP32 ABI to reduce
the memory and cache footprint; we will bear the burden. The core code
maintainers would receive patches that would make them use
BITS_PER_LONG and CONFIG_64BIT more accurately.
--
Best Regards
Guo Ren

From arnd at arndb.de Tue Mar 25 06:17:44 2025
From: arnd at arndb.de (Arnd Bergmann)
Date: Tue, 25 Mar 2025 14:17:44 +0100
Subject: [RFC PATCH V3 00/43] rv64ilp32_abi: Build CONFIG_64BIT kernel-self with ILP32 ABI
In-Reply-To: <20250325122640.GK36322@noisy.programming.kicks-ass.net>
References: <20250325121624.523258-1-guoren@kernel.org>
 <20250325122640.GK36322@noisy.programming.kicks-ass.net>
Message-ID:

On Tue, Mar 25, 2025, at 13:26, Peter Zijlstra wrote:
> On Tue, Mar 25, 2025 at 08:15:41AM -0400, guoren at kernel.org wrote:
>> From: "Guo Ren (Alibaba DAMO Academy)"
>>
>> Since 2001, the CONFIG_64BIT kernel has been built with the LP64 ABI,
>> but this patchset allows the CONFIG_64BIT kernel to use an ILP32 ABI
>
> Please, don't do this. This adds a significant maintenance burden on all
> of us.

It would be easier to do this with CONFIG_64BIT disabled and continue
treating CONFIG_64BIT to be the same as BITS_PER_LONG=64, but I still
think it's fundamentally a bad idea to support this in mainline kernels
in any variation, other than supporting regular 32-bit compat mode
tasks on a regular 64-bit kernel.

>> The patchset targets RISC-V and is built on the RV64ILP32 ABI, which
>> was introduced into RISC-V's psABI in January 2025 [1]. This patchset
>> equips an rv64ilp32-abi kernel with all the functionalities of a
>> traditional lp64-abi kernel, yet restricts the address space to 2GiB.
>> Hence, the rv64ilp32-abi kernel simultaneously supports lp64-abi
>> userspace and ilp32-abi (compat) userspace, the same as the
>> traditional lp64-abi kernel.

You declare the syscall ABI to be the native 64-bit ABI, but this is
fundamentally not true because many uapi structures are defined in
terms of 'long' or pointer values, in particular in the ioctl call.
This might work for an rv64ilp32 userspace that uses the same headers
and the same types, but you explicitly say that the goal is to run
native rv64 or compat rv32 tasks, not rv64ilp32 (thanks!).
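To make the uapi layout point above concrete, here is a minimal C sketch (an illustration only, not code from the thread). It assumes a hypothetical ioctl argument declared as "struct demo_req { long len; void *buf; };" and spells out, with fixed-width types, the layout an LP64 task would use versus the layout an ILP32-built kernel would derive from the same declaration. The struct and function names are invented for this example and do not correspond to any real uapi header.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout of the hypothetical struct as seen by an LP64 (native rv64) task. */
struct demo_req_lp64 {
	int64_t  len;	/* 'long' is 64 bits wide */
	uint64_t buf;	/* pointers are 64 bits wide */
};

/* Layout the same declaration produces when compiled with an ILP32 ABI. */
struct demo_req_ilp32 {
	int32_t  len;	/* 'long' is 32 bits wide */
	uint32_t buf;	/* pointers are 32 bits wide */
};

/* Returns nonzero only if both views agree on size and field offsets. */
static int demo_layouts_match(void)
{
	return sizeof(struct demo_req_lp64) == sizeof(struct demo_req_ilp32) &&
	       offsetof(struct demo_req_lp64, buf) ==
	       offsetof(struct demo_req_ilp32, buf);
}
```

Here the LP64 view is 16 bytes with buf at offset 8, while the ILP32 view is 8 bytes with buf at offset 4, so demo_layouts_match() returns 0. A kernel that reads such a structure with its own ILP32 definition would misinterpret what an LP64 task wrote unless every affected structure gets an explicit fixed-width definition or a translation step.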
As far as I can tell, there is no way to rectify this design flaw other than to drop support for 64-bit userspace and only support regular rv32 userspace. I'm also skeptical that supporting rv64 userspace helps in practice other than for testing, since generally most memory overhead is in userspace rather than the kernel, and there is much more to gain from shrinking the larger userspace by running rv32 compat mode binaries on a 64-bit kernel than the other way round. If you remove the CONFIG_64BIT changes that Peter mentioned and the support for ilp64 userland from your series, you end up with a kernel that is very similar to a native rv32 kernel but executes as rv64ilp32 and runs rv32 userspace. I don't have any objections to that approach, and the same thing has come up on arm64 as a possible idea as well, but I don't know if that actually brings any notable advantage over an rv32 kernel. Are there CPUs that can run rv64 kernels and rv32 userspace but not rv32 kernels, similar to what we have on Arm Cortex-A76 and Cortex-A510? Arnd From jean-philippe at linaro.org Tue Mar 25 09:00:28 2025 From: jean-philippe at linaro.org (Jean-Philippe Brucker) Date: Tue, 25 Mar 2025 16:00:28 +0000 Subject: [kvm-unit-tests PATCH v3 0/5] arm64: Change the default QEMU CPU type to "max" Message-ID: <20250325160031.2390504-3-jean-philippe@linaro.org> This is v3 of the series that cleans up the configure flags and sets the default CPU type to "max" on arm64, in order to test the latest Arm features. Since v2 [1] I moved the CPU selection to ./configure, and improved the help text. Unfortunately I couldn't keep most of the Review tags since there were small changes all over. 
[1] https://lore.kernel.org/all/20250314154904.3946484-2-jean-philippe at linaro.org/ Alexandru Elisei (3): configure: arm64: Don't display 'aarch64' as the default architecture configure: arm/arm64: Display the correct default processor arm64: Implement the ./configure --processor option Jean-Philippe Brucker (2): configure: Add --qemu-cpu option arm64: Use -cpu max as the default for TCG scripts/mkstandalone.sh | 3 ++- arm/run | 15 ++++++----- riscv/run | 8 +++--- configure | 55 +++++++++++++++++++++++++++++++++++------ arm/Makefile.arm | 1 - arm/Makefile.common | 1 + 6 files changed, 63 insertions(+), 20 deletions(-) -- 2.49.0 From jean-philippe at linaro.org Tue Mar 25 09:00:30 2025 From: jean-philippe at linaro.org (Jean-Philippe Brucker) Date: Tue, 25 Mar 2025 16:00:30 +0000 Subject: [kvm-unit-tests PATCH v3 2/5] configure: arm/arm64: Display the correct default processor In-Reply-To: <20250325160031.2390504-3-jean-philippe@linaro.org> References: <20250325160031.2390504-3-jean-philippe@linaro.org> Message-ID: <20250325160031.2390504-5-jean-philippe@linaro.org> From: Alexandru Elisei The help text for the --processor option displays the architecture name as the default processor type. But the default for arm is cortex-a15, and for arm64 is cortex-a57. Teach configure to display the correct default processor type for these two architectures. 
Signed-off-by: Alexandru Elisei Signed-off-by: Jean-Philippe Brucker --- configure | 30 ++++++++++++++++++++++-------- 1 file changed, 22 insertions(+), 8 deletions(-) diff --git a/configure b/configure index 010c68ff..b4875ef3 100755 --- a/configure +++ b/configure @@ -5,6 +5,24 @@ if [ -z "${BASH_VERSINFO[0]}" ] || [ "${BASH_VERSINFO[0]}" -lt 4 ] ; then exit 1 fi +# Return the default CPU type to compile for +function get_default_processor() +{ + local arch="$1" + + case "$arch" in + "arm") + echo "cortex-a15" + ;; + "arm64") + echo "cortex-a57" + ;; + *) + echo "$arch" + ;; + esac +} + srcdir=$(cd "$(dirname "$0")"; pwd) prefix=/usr/local cc=gcc @@ -44,13 +62,14 @@ fi usage() { [ "$arch" = "aarch64" ] && arch="arm64" + [ -z "$processor" ] && processor=$(get_default_processor $arch) cat <<-EOF Usage: $0 [options] Options include: --arch=ARCH architecture to compile for ($arch). ARCH can be one of: arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 - --processor=PROCESSOR processor to compile for ($arch) + --processor=PROCESSOR processor to compile for ($processor) --target=TARGET target platform that the tests will be running on (qemu or kvmtool, default is qemu) (arm/arm64 only) --cross-prefix=PREFIX cross compiler prefix @@ -326,13 +345,8 @@ if [ "$earlycon" ]; then fi fi -[ -z "$processor" ] && processor="$arch" - -if [ "$processor" = "arm64" ]; then - processor="cortex-a57" -elif [ "$processor" = "arm" ]; then - processor="cortex-a15" -fi +# $arch will have changed when cross-compiling. 
+[ -z "$processor" ] && processor=$(get_default_processor $arch) if [ "$arch" = "i386" ] || [ "$arch" = "x86_64" ]; then testdir=x86 -- 2.49.0 From jean-philippe at linaro.org Tue Mar 25 09:00:31 2025 From: jean-philippe at linaro.org (Jean-Philippe Brucker) Date: Tue, 25 Mar 2025 16:00:31 +0000 Subject: [kvm-unit-tests PATCH v3 3/5] arm64: Implement the ./configure --processor option In-Reply-To: <20250325160031.2390504-3-jean-philippe@linaro.org> References: <20250325160031.2390504-3-jean-philippe@linaro.org> Message-ID: <20250325160031.2390504-6-jean-philippe@linaro.org> From: Alexandru Elisei The help text for the ./configure --processor option says: --processor=PROCESSOR processor to compile for (cortex-a57) but, unlike arm, the build system does not pass a -mcpu argument to the compiler. Fix it, and bring arm64 at parity with arm. Note that this introduces a regression, which is also present on arm: if the --processor argument is something that the compiler doesn't understand, but qemu does (like 'max'), then compilation fails. This will be fixed in a following patch; another fix is to specify a CPU model that gcc implements by using --cflags=-mcpu=. Reviewed-by: Andrew Jones Reviewed-by: Eric Auger Signed-off-by: Alexandru Elisei Signed-off-by: Jean-Philippe Brucker --- arm/Makefile.arm | 1 - arm/Makefile.common | 1 + 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/arm/Makefile.arm b/arm/Makefile.arm index 7fd39f3a..d6250b7f 100644 --- a/arm/Makefile.arm +++ b/arm/Makefile.arm @@ -12,7 +12,6 @@ $(error Cannot build arm32 tests as EFI apps) endif CFLAGS += $(machine) -CFLAGS += -mcpu=$(PROCESSOR) CFLAGS += -mno-unaligned-access ifeq ($(TARGET),qemu) diff --git a/arm/Makefile.common b/arm/Makefile.common index f828dbe0..a5d97bcf 100644 --- a/arm/Makefile.common +++ b/arm/Makefile.common @@ -25,6 +25,7 @@ AUXFLAGS ?= 0x0 # stack.o relies on frame pointers. 
KEEP_FRAME_POINTER := y +CFLAGS += -mcpu=$(PROCESSOR) CFLAGS += -std=gnu99 CFLAGS += -ffreestanding CFLAGS += -O2 -- 2.49.0 From jean-philippe at linaro.org Tue Mar 25 09:00:29 2025 From: jean-philippe at linaro.org (Jean-Philippe Brucker) Date: Tue, 25 Mar 2025 16:00:29 +0000 Subject: [kvm-unit-tests PATCH v3 1/5] configure: arm64: Don't display 'aarch64' as the default architecture In-Reply-To: <20250325160031.2390504-3-jean-philippe@linaro.org> References: <20250325160031.2390504-3-jean-philippe@linaro.org> Message-ID: <20250325160031.2390504-4-jean-philippe@linaro.org> From: Alexandru Elisei --arch=aarch64, intentional or not, has been supported since the initial arm64 support, commit 39ac3f8494be ("arm64: initial drop"). However, "aarch64" does not show up in the list of supported architectures, but it's displayed as the default architecture if doing ./configure --help on an arm64 machine. Keep everything consistent and make sure that the default value for $arch is "arm64", but still allow --arch=aarch64, in case there are users that use this configuration for kvm-unit-tests. The help text for --arch changes from: --arch=ARCH architecture to compile for (aarch64). ARCH can be one of: arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 to: --arch=ARCH architecture to compile for (arm64).
ARCH can be one of: arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 Signed-off-by: Alexandru Elisei Signed-off-by: Jean-Philippe Brucker --- configure | 1 + 1 file changed, 1 insertion(+) diff --git a/configure b/configure index 52904d3a..010c68ff 100755 --- a/configure +++ b/configure @@ -43,6 +43,7 @@ else fi usage() { + [ "$arch" = "aarch64" ] && arch="arm64" cat <<-EOF Usage: $0 [options] -- 2.49.0 From jean-philippe at linaro.org Tue Mar 25 09:00:33 2025 From: jean-philippe at linaro.org (Jean-Philippe Brucker) Date: Tue, 25 Mar 2025 16:00:33 +0000 Subject: [kvm-unit-tests PATCH v3 5/5] arm64: Use -cpu max as the default for TCG In-Reply-To: <20250325160031.2390504-3-jean-philippe@linaro.org> References: <20250325160031.2390504-3-jean-philippe@linaro.org> Message-ID: <20250325160031.2390504-8-jean-philippe@linaro.org> In order to test all the latest features, default to "max" as the QEMU CPU type on arm64. Leave the default 32-bit CPU as cortex-a15, because that allows running the 32-bit tests with both qemu-system-arm, and with qemu-system-aarch64 whose default "max" CPU doesn't boot in 32-bit mode. Signed-off-by: Jean-Philippe Brucker --- configure | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/configure b/configure index b79145a5..b86ccc0c 100755 --- a/configure +++ b/configure @@ -33,7 +33,7 @@ function get_default_qemu_cpu() echo "cortex-a15" ;; "arm64") - echo "cortex-a57" + echo "max" ;; esac } -- 2.49.0 From jean-philippe at linaro.org Tue Mar 25 09:00:32 2025 From: jean-philippe at linaro.org (Jean-Philippe Brucker) Date: Tue, 25 Mar 2025 16:00:32 +0000 Subject: [kvm-unit-tests PATCH v3 4/5] configure: Add --qemu-cpu option In-Reply-To: <20250325160031.2390504-3-jean-philippe@linaro.org> References: <20250325160031.2390504-3-jean-philippe@linaro.org> Message-ID: <20250325160031.2390504-7-jean-philippe@linaro.org> Add the --qemu-cpu option to let users set the CPU type to run on. 
At the moment --processor allows to set both GCC -mcpu flag and QEMU
-cpu. On Arm we'd like to pass `-cpu max` to QEMU in order to enable all
the TCG features by default, and it could also be nice to let users
modify the CPU capabilities by setting extra -cpu options. Since GCC
-mcpu doesn't accept "max" or "host", separate the compiler and QEMU
arguments.

`--processor` is now exclusively for compiler options, as indicated by
its documentation ("processor to compile for"). So use $QEMU_CPU on
RISC-V as well.

Suggested-by: Andrew Jones
Signed-off-by: Jean-Philippe Brucker
---
 scripts/mkstandalone.sh |  3 ++-
 arm/run                 | 15 +++++++++------
 riscv/run               |  8 ++++----
 configure               | 24 ++++++++++++++++++++++++
 4 files changed, 39 insertions(+), 11 deletions(-)

diff --git a/scripts/mkstandalone.sh b/scripts/mkstandalone.sh
index 2318a85f..9b4f983d 100755
--- a/scripts/mkstandalone.sh
+++ b/scripts/mkstandalone.sh
@@ -42,7 +42,8 @@ generate_test ()
 	config_export ARCH
 	config_export ARCH_NAME
-	config_export PROCESSOR
+	config_export QEMU_CPU
+	config_export DEFAULT_QEMU_CPU
 	echo "echo BUILD_HEAD=$(cat build-head)"
diff --git a/arm/run b/arm/run
index efdd44ce..4675398f 100755
--- a/arm/run
+++ b/arm/run
@@ -8,7 +8,7 @@ if [ -z "$KUT_STANDALONE" ]; then
 	source config.mak
 	source scripts/arch-run.bash
 fi
-processor="$PROCESSOR"
+qemu_cpu="$QEMU_CPU"
 if [ "$QEMU" ] && [ -z "$ACCEL" ] &&
    [ "$HOST" = "aarch64" ] && [ "$ARCH" = "arm" ] &&
@@ -37,12 +37,15 @@ if [ "$ACCEL" = "kvm" ]; then
 	fi
 fi
-if [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ]; then
-	if [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ]; then
-		processor="host"
+if [ -z "$qemu_cpu" ]; then
+	if ( [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ] ) &&
+	   ( [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ] ); then
+		qemu_cpu="host"
 		if [ "$ARCH" = "arm" ] && [ "$HOST" = "aarch64" ]; then
-			processor+=",aarch64=off"
+			qemu_cpu+=",aarch64=off"
 		fi
+	else
+		qemu_cpu="$DEFAULT_QEMU_CPU"
 	fi
 fi
@@ -71,7 +74,7 @@ if $qemu $M -device '?' | grep -q pci-testdev; then
 fi
 A="-accel $ACCEL$ACCEL_PROPS"
-command="$qemu -nodefaults $M $A -cpu $processor $chr_testdev $pci_testdev"
+command="$qemu -nodefaults $M $A -cpu $qemu_cpu $chr_testdev $pci_testdev"
 command+=" -display none -serial stdio"
 command="$(migration_cmd) $(timeout_cmd) $command"
diff --git a/riscv/run b/riscv/run
index e2f5a922..02fcf0c0 100755
--- a/riscv/run
+++ b/riscv/run
@@ -11,12 +11,12 @@ fi
 # Allow user overrides of some config.mak variables
 mach=$MACHINE_OVERRIDE
-processor=$PROCESSOR_OVERRIDE
+qemu_cpu=$QEMU_CPU_OVERRIDE
 firmware=$FIRMWARE_OVERRIDE
-[ "$PROCESSOR" = "$ARCH" ] && PROCESSOR="max"
+[ -z "$QEMU_CPU" ] && QEMU_CPU="max"
 : "${mach:=virt}"
-: "${processor:=$PROCESSOR}"
+: "${qemu_cpu:=$QEMU_CPU}"
 : "${firmware:=$FIRMWARE}"
 [ "$firmware" ] && firmware="-bios $firmware"
@@ -32,7 +32,7 @@ fi
 mach="-machine $mach"
 command="$qemu -nodefaults -nographic -serial mon:stdio"
-command+=" $mach $acc $firmware -cpu $processor "
+command+=" $mach $acc $firmware -cpu $qemu_cpu "
 command="$(migration_cmd) $(timeout_cmd) $command"
 if [ "$UEFI_SHELL_RUN" = "y" ]; then
diff --git a/configure b/configure
index b4875ef3..b79145a5 100755
--- a/configure
+++ b/configure
@@ -23,6 +23,21 @@ function get_default_processor()
     esac
 }
+# Return the default CPU type to run on
+function get_default_qemu_cpu()
+{
+    local arch="$1"
+
+    case "$arch" in
+    "arm")
+        echo "cortex-a15"
+        ;;
+    "arm64")
+        echo "cortex-a57"
+        ;;
+    esac
+}
+
 srcdir=$(cd "$(dirname "$0")"; pwd)
 prefix=/usr/local
 cc=gcc
@@ -52,6 +67,7 @@ earlycon=
 console=
 efi=
 efi_direct=
+qemu_cpu=
 # Enable -Werror by default for git repositories only (i.e. developer builds)
 if [ -e "$srcdir"/.git ]; then
@@ -70,6 +86,9 @@ usage() {
 	    --arch=ARCH            architecture to compile for ($arch). ARCH can be one of:
 	                           arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64
 	    --processor=PROCESSOR  processor to compile for ($processor)
+	    --qemu-cpu=CPU         the CPU model to run on. If left unset, the run script
+	                           selects the best value based on the host system and the
+	                           test configuration.
 	    --target=TARGET        target platform that the tests will be running on (qemu or
 	                           kvmtool, default is qemu) (arm/arm64 only)
 	    --cross-prefix=PREFIX  cross compiler prefix
@@ -146,6 +165,9 @@ while [[ $optno -le $argc ]]; do
 	--processor)
 	    processor="$arg"
 	    ;;
+	--qemu-cpu)
+	    qemu_cpu="$arg"
+	    ;;
 	--target)
 	    target="$arg"
 	    ;;
@@ -471,6 +493,8 @@ ARCH=$arch
 ARCH_NAME=$arch_name
 ARCH_LIBDIR=$arch_libdir
 PROCESSOR=$processor
+QEMU_CPU=$qemu_cpu
+DEFAULT_QEMU_CPU=$(get_default_qemu_cpu $arch)
 CC=$cc
 CFLAGS=$cflags
 LD=$cross_prefix$ld
--
2.49.0

From s.shtylyov at omp.ru Tue Mar 25 10:19:39 2025
From: s.shtylyov at omp.ru (Sergey Shtylyov)
Date: Tue, 25 Mar 2025 20:19:39 +0300
Subject: [RFC PATCH V3 25/43] rv64ilp32_abi: exec: Adapt 64lp64 env and argv
In-Reply-To: <20250325121624.523258-26-guoren@kernel.org>
References: <20250325121624.523258-1-guoren@kernel.org> <20250325121624.523258-26-guoren@kernel.org>
Message-ID: <05fec753-cdaa-45a5-a029-b6435c30eb07@omp.ru>

On 3/25/25 3:16 PM, guoren at kernel.org wrote:

> From: "Guo Ren (Alibaba DAMO Academy)"
>
> The rv64ilp32 abi reuses the env and argv memory layout of the
> lp64 abi, so leave the space to fit the lp64 struct layout.
>
> Signed-off-by: Guo Ren (Alibaba DAMO Academy)
> ---
>  fs/exec.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/fs/exec.c b/fs/exec.c
> index 506cd411f4ac..548d18b7ae92 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -424,6 +424,10 @@ static const char __user *get_user_arg_ptr(struct user_arg_ptr argv, int nr)
>  	}
>  #endif
>
> +#if defined(CONFIG_64BIT) && (BITS_PER_LONG == 32)

Parens don't seem necessary...

> +	nr = nr * 2;

Why not nr *= 2?

[...]
MBR, Sergey

From david at redhat.com Tue Mar 25 11:51:47 2025
From: david at redhat.com (David Hildenbrand)
Date: Tue, 25 Mar 2025 19:51:47 +0100
Subject: [RFC PATCH V3 00/43] rv64ilp32_abi: Build CONFIG_64BIT kernel-self with ILP32 ABI
In-Reply-To: <20250325122640.GK36322@noisy.programming.kicks-ass.net>
References: <20250325121624.523258-1-guoren@kernel.org> <20250325122640.GK36322@noisy.programming.kicks-ass.net>
Message-ID: <0e1d8823-620f-420c-86a5-35495ccbd10f@redhat.com>

On 25.03.25 13:26, Peter Zijlstra wrote:
> On Tue, Mar 25, 2025 at 08:15:41AM -0400, guoren at kernel.org wrote:
>> From: "Guo Ren (Alibaba DAMO Academy)"
>>
>> Since 2001, the CONFIG_64BIT kernel has been built with the LP64 ABI,
>> but this patchset allows the CONFIG_64BIT kernel to use an ILP32 ABI
>
> I'm thinking you're going to be finding a metric ton of assumptions
> about 'unsigned long' being 64bit when 64BIT=y throughout the kernel.
>
> I know of a couple of places where 64BIT will result in different math
> such that a 32bit 'unsigned long' will trivially overflow.
>
> Please, don't do this. This adds a significant maintenance burden on all
> of us.
>

Fully agreed.

--
Cheers,

David / dhildenb

From Liam.Howlett at oracle.com Tue Mar 25 12:09:37 2025
From: Liam.Howlett at oracle.com (Liam R. Howlett)
Date: Tue, 25 Mar 2025 15:09:37 -0400
Subject: [RFC PATCH V3 31/43] rv64ilp32_abi: maple_tree: Use BITS_PER_LONG instead of CONFIG_64BIT
In-Reply-To: <20250325121624.523258-32-guoren@kernel.org>
References: <20250325121624.523258-1-guoren@kernel.org> <20250325121624.523258-32-guoren@kernel.org>
Message-ID: <4gph4xikdbg6loy2id72uyxgldsldecc7gquhymusl3vreiw3a@ephk5ahhrdw7>

* guoren at kernel.org [250325 08:24]:
> From: "Guo Ren (Alibaba DAMO Academy)"
>
> The Maple tree algorithm uses ulong type for each element. The
> number of slots is based on BITS_PER_LONG for RV64ILP32 ABI, so
> use BITS_PER_LONG instead of CONFIG_64BIT.
>
> Signed-off-by: Guo Ren (Alibaba DAMO Academy)
> ---
>  include/linux/maple_tree.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h
> index cbbcd18d4186..ff6265b6468b 100644
> --- a/include/linux/maple_tree.h
> +++ b/include/linux/maple_tree.h
> @@ -24,7 +24,7 @@
>   *
>   * Nodes in the tree point to their parent unless bit 0 is set.
>   */
> -#if defined(CONFIG_64BIT) || defined(BUILD_VDSO32_64)
> +#if (BITS_PER_LONG == 64) || defined(BUILD_VDSO32_64)

This will break my userspace testing, if you do not update the testing as
well. This can be found in tools/testing/radix-tree. Please also look
at the Makefile as well since it will generate a build flag for the
userspace.

This raises other concerns as the code is found with a grep command, so
I'm not sure why it was missed and if anything else is missed?

If you consider this email to be the (unasked) question about what to do
here, then please CC me, the maintainer of the files including the one
you are updating here.

Thank you,
Liam

From Liam.Howlett at oracle.com Tue Mar 25 12:23:39 2025
From: Liam.Howlett at oracle.com (Liam R. Howlett)
Date: Tue, 25 Mar 2025 15:23:39 -0400
Subject: [RFC PATCH V3 00/43] rv64ilp32_abi: Build CONFIG_64BIT kernel-self with ILP32 ABI
In-Reply-To: <0e1d8823-620f-420c-86a5-35495ccbd10f@redhat.com>
References: <20250325121624.523258-1-guoren@kernel.org> <20250325122640.GK36322@noisy.programming.kicks-ass.net> <0e1d8823-620f-420c-86a5-35495ccbd10f@redhat.com>
Message-ID:

* David Hildenbrand [250325 14:52]:
> On 25.03.25 13:26, Peter Zijlstra wrote:
> > On Tue, Mar 25, 2025 at 08:15:41AM -0400, guoren at kernel.org wrote:
> > > From: "Guo Ren (Alibaba DAMO Academy)"
> > >
> > > Since 2001, the CONFIG_64BIT kernel has been built with the LP64 ABI,
> > > but this patchset allows the CONFIG_64BIT kernel to use an ILP32 ABI
> >
> > I'm thinking you're going to be finding a metric ton of assumptions
> > about 'unsigned long' being 64bit when 64BIT=y throughout the kernel.
> >
> > I know of a couple of places where 64BIT will result in different math
> > such that a 32bit 'unsigned long' will trivially overflow.
> >
> > Please, don't do this. This adds a significant maintenance burden on all
> > of us.
> >
>
> Fully agreed.

I would go further and say I do not want this to go in.

The open ended maintenance burden is not worth extending hardware life
of a board with 16mb of ram (If I understand your 2023 LPC slides
correctly).
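
To make the quoted overflow concern concrete, here is a minimal sketch. It
is not taken from the kernel: the helper names and SKETCH_PAGE_SHIFT are
made up for illustration, and fixed-width types stand in for what
'unsigned long' would be under each ABI. The same page-count-to-bytes
arithmetic is fine with a 64-bit 'unsigned long' and wraps to zero with a
32-bit one:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 4 KiB pages; hypothetical constant for this sketch only. */
#define SKETCH_PAGE_SHIFT 12

/* Stand-in for code where 'unsigned long' is 64 bits wide (LP64). */
static uint64_t pages_to_bytes_lp64(uint64_t nr_pages)
{
	return nr_pages << SKETCH_PAGE_SHIFT;
}

/*
 * Stand-in for the same code where 'unsigned long' is 32 bits wide
 * (ILP32): the result is reduced modulo 2^32.
 */
static uint32_t pages_to_bytes_ilp32(uint32_t nr_pages)
{
	return nr_pages << SKETCH_PAGE_SHIFT;
}

/* 2^20 pages is 4 GiB: representable in 64 bits, wraps to 0 in 32 bits. */
static void pages_to_bytes_examples(void)
{
	assert(pages_to_bytes_lp64(UINT64_C(1) << 20) == (UINT64_C(1) << 32));
	assert(pages_to_bytes_ilp32(UINT32_C(1) << 20) == 0);
}
```

Code guarded by CONFIG_64BIT that assumes the first behaviour would get the
second one if 'unsigned long' silently became 32 bits wide.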
Thank you, Liam From ej at inai.de Tue Mar 25 13:30:53 2025 From: ej at inai.de (Jan Engelhardt) Date: Tue, 25 Mar 2025 21:30:53 +0100 (CET) Subject: [RFC PATCH V3 01/43] rv64ilp32_abi: uapi: Reuse lp64 ABI interface In-Reply-To: <20250325121624.523258-2-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> <20250325121624.523258-2-guoren@kernel.org> Message-ID: <0024788o-35r0-73q1-1s54-q564p457q33s@vanv.qr> On Tuesday 2025-03-25 13:15, guoren at kernel.org wrote: >diff --git a/include/uapi/linux/netfilter/x_tables.h b/include/uapi/linux/netfilter/x_tables.h >index 796af83a963a..7e02e34c6fad 100644 >--- a/include/uapi/linux/netfilter/x_tables.h >+++ b/include/uapi/linux/netfilter/x_tables.h >@@ -18,7 +18,11 @@ struct xt_entry_match { > __u8 revision; > } user; > struct { >+#if __riscv_xlen == 64 >+ __u64 match_size; >+#else > __u16 match_size; >+#endif > > /* Used inside the kernel */ > struct xt_match *match; The __u16 is the common prefix of the union which is exposed to userspace. If anything, you need to use __attribute__((aligned(8))) to move `match` to a fixed location. However, that sub-struct is only used inside the kernel and never exposed, so the alignment of `match` should not play a role. Moreover, change from u16 to u64 would break RISC-V Big-Endian. Even if there currently is no big-endian variant, let's not introduce such breakage. >--- a/include/uapi/linux/netfilter_ipv4/ip_tables.h >+++ b/include/uapi/linux/netfilter_ipv4/ip_tables.h >@@ -200,7 +200,14 @@ struct ipt_replace { > /* Number of counters (must be equal to current number of entries). */ > unsigned int num_counters; > /* The old entries' counters. */ >+#if __riscv_xlen == 64 >+ union { >+ struct xt_counters __user *counters; >+ __u64 __counters; >+ }; >+#else > struct xt_counters __user *counters; >+#endif > > /* The entries (hang off end: not really an array). 
*/ > struct ipt_entry entries[]; This seems ok, but perhaps there is a better name for __riscv_xlen (ifdef CONFIG_????ilp32), so it is not strictly tied to riscv, in case other platform wants to try ilp32-self mode. >+#if __riscv_xlen == 64 >+ union { >+ int __user *auth_flavours; /* 1 */ >+ __u64 __auth_flavours; >+ }; >+#else > int __user *auth_flavours; /* 1 */ >+#endif > }; > > /* bits in the flags field */ >diff --git a/include/uapi/linux/ppp-ioctl.h b/include/uapi/linux/ppp-ioctl.h >index 1cc5ce0ae062..8d48eab430c1 100644 >--- a/include/uapi/linux/ppp-ioctl.h >+++ b/include/uapi/linux/ppp-ioctl.h >@@ -59,7 +59,14 @@ struct npioctl { > > /* Structure describing a CCP configuration option, for PPPIOCSCOMPRESS */ > struct ppp_option_data { >+#if __riscv_xlen == 64 >+ union { >+ __u8 __user *ptr; >+ __u64 __ptr; >+ }; >+#else > __u8 __user *ptr; >+#endif > __u32 length; > int transmit; > }; >diff --git a/include/uapi/linux/sctp.h b/include/uapi/linux/sctp.h >index b7d91d4cf0db..46a06fddcd2f 100644 >--- a/include/uapi/linux/sctp.h >+++ b/include/uapi/linux/sctp.h >@@ -1024,6 +1024,9 @@ struct sctp_getaddrs_old { > #else > struct sockaddr *addrs; > #endif >+#if (__riscv_xlen == 64) && (__SIZEOF_LONG__ == 4) >+ __u32 unused; >+#endif > }; > > struct sctp_getaddrs { >diff --git a/include/uapi/linux/sem.h b/include/uapi/linux/sem.h >index 75aa3b273cd9..de9f441913cd 100644 >--- a/include/uapi/linux/sem.h >+++ b/include/uapi/linux/sem.h >@@ -26,10 +26,29 @@ struct semid_ds { > struct ipc_perm sem_perm; /* permissions .. 
see ipc.h */ > __kernel_old_time_t sem_otime; /* last semop time */ > __kernel_old_time_t sem_ctime; /* create/last semctl() time */ >+#if __riscv_xlen == 64 >+ union { >+ struct sem *sem_base; /* ptr to first semaphore in array */ >+ __u64 __sem_base; >+ }; >+ union { >+ struct sem_queue *sem_pending; /* pending operations to be processed */ >+ __u64 __sem_pending; >+ }; >+ union { >+ struct sem_queue **sem_pending_last; /* last pending operation */ >+ __u64 __sem_pending_last; >+ }; >+ union { >+ struct sem_undo *undo; /* undo requests on this array */ >+ __u64 __undo; >+ }; >+#else > struct sem *sem_base; /* ptr to first semaphore in array */ > struct sem_queue *sem_pending; /* pending operations to be processed */ > struct sem_queue **sem_pending_last; /* last pending operation */ > struct sem_undo *undo; /* undo requests on this array */ >+#endif > unsigned short sem_nsems; /* no. of semaphores in array */ > }; > >@@ -46,10 +65,29 @@ struct sembuf { > /* arg for semctl system calls. 
*/ > union semun { > int val; /* value for SETVAL */ >+#if __riscv_xlen == 64 >+ union { >+ struct semid_ds __user *buf; /* buffer for IPC_STAT & IPC_SET */ >+ __u64 ___buf; >+ }; >+ union { >+ unsigned short __user *array; /* array for GETALL & SETALL */ >+ __u64 __array; >+ }; >+ union { >+ struct seminfo __user *__buf; /* buffer for IPC_INFO */ >+ __u64 ____buf; >+ }; >+ union { >+ void __user *__pad; >+ __u64 ____pad; >+ }; >+#else > struct semid_ds __user *buf; /* buffer for IPC_STAT & IPC_SET */ > unsigned short __user *array; /* array for GETALL & SETALL */ > struct seminfo __user *__buf; /* buffer for IPC_INFO */ > void __user *__pad; >+#endif > }; > > struct seminfo { >diff --git a/include/uapi/linux/socket.h b/include/uapi/linux/socket.h >index d3fcd3b5ec53..5f7a83649395 100644 >--- a/include/uapi/linux/socket.h >+++ b/include/uapi/linux/socket.h >@@ -22,7 +22,14 @@ struct __kernel_sockaddr_storage { > /* space to achieve desired size, */ > /* _SS_MAXSIZE value minus size of ss_family */ > }; >+#if __riscv_xlen == 64 >+ union { >+ void *__align; /* implementation specific desired alignment */ >+ u64 ___align; >+ }; >+#else > void *__align; /* implementation specific desired alignment */ >+#endif > }; > }; > >diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h >index 8981f00204db..8ed7b29897f9 100644 >--- a/include/uapi/linux/sysctl.h >+++ b/include/uapi/linux/sysctl.h >@@ -33,13 +33,45 @@ > member of a struct __sysctl_args to have? 
*/ > > struct __sysctl_args { >+#if __riscv_xlen == 64 >+ union { >+ int __user *name; >+ __u64 __name; >+ }; >+#else > int __user *name; >+#endif > int nlen; >+#if __riscv_xlen == 64 >+ union { >+ void __user *oldval; >+ __u64 __oldval; >+ }; >+#else > void __user *oldval; >+#endif >+#if __riscv_xlen == 64 >+ union { >+ size_t __user *oldlenp; >+ __u64 __oldlenp; >+ }; >+#else > size_t __user *oldlenp; >+#endif >+#if __riscv_xlen == 64 >+ union { >+ void __user *newval; >+ __u64 __newval; >+ }; >+#else > void __user *newval; >+#endif > size_t newlen; >+#if __riscv_xlen == 64 >+ unsigned long long __unused[4]; >+#else > unsigned long __unused[4]; >+#endif > }; > > /* Define sysctl names first */ >diff --git a/include/uapi/linux/uhid.h b/include/uapi/linux/uhid.h >index cef7534d2d19..4a774dbd3de8 100644 >--- a/include/uapi/linux/uhid.h >+++ b/include/uapi/linux/uhid.h >@@ -130,7 +130,14 @@ struct uhid_create_req { > __u8 name[128]; > __u8 phys[64]; > __u8 uniq[64]; >+#if __riscv_xlen == 64 >+ union { >+ __u8 __user *rd_data; >+ __u64 __rd_data; >+ }; >+#else > __u8 __user *rd_data; >+#endif > __u16 rd_size; > > __u16 bus; >diff --git a/include/uapi/linux/uio.h b/include/uapi/linux/uio.h >index 649739e0c404..27dfd6032dc6 100644 >--- a/include/uapi/linux/uio.h >+++ b/include/uapi/linux/uio.h >@@ -16,8 +16,19 @@ > > struct iovec > { >+#if __riscv_xlen == 64 >+ union { >+ void __user *iov_base; /* BSD uses caddr_t (1003.1g requires void *) */ >+ __u64 __iov_base; >+ }; >+ union { >+ __kernel_size_t iov_len; /* Must be size_t (1003.1g) */ >+ __u64 __iov_len; >+ }; >+#else > void __user *iov_base; /* BSD uses caddr_t (1003.1g requires void *) */ > __kernel_size_t iov_len; /* Must be size_t (1003.1g) */ >+#endif > }; > > struct dmabuf_cmsg { >diff --git a/include/uapi/linux/usb/tmc.h b/include/uapi/linux/usb/tmc.h >index d791cc58a7f0..443ec5356caf 100644 >--- a/include/uapi/linux/usb/tmc.h >+++ b/include/uapi/linux/usb/tmc.h >@@ -51,7 +51,14 @@ struct usbtmc_request { > > 
struct usbtmc_ctrlrequest { > struct usbtmc_request req; >+#if __riscv_xlen == 64 >+ union { >+ void __user *data; /* pointer to user space */ >+ __u64 __data; /* pointer to user space */ >+ }; >+#else > void __user *data; /* pointer to user space */ >+#endif > } __attribute__ ((packed)); > > struct usbtmc_termchar { >@@ -70,7 +77,14 @@ struct usbtmc_message { > __u32 transfer_size; /* size of bytes to transfer */ > __u32 transferred; /* size of received/written bytes */ > __u32 flags; /* bit 0: 0 = synchronous; 1 = asynchronous */ >+#if __riscv_xlen == 64 >+ union { >+ void __user *message; /* pointer to header and data in user space */ >+ __u64 __message; >+ }; >+#else > void __user *message; /* pointer to header and data in user space */ >+#endif > } __attribute__ ((packed)); > > /* Request values for USBTMC driver's ioctl entry point */ >diff --git a/include/uapi/linux/usbdevice_fs.h b/include/uapi/linux/usbdevice_fs.h >index 74a84e02422a..8c8efef74c3c 100644 >--- a/include/uapi/linux/usbdevice_fs.h >+++ b/include/uapi/linux/usbdevice_fs.h >@@ -44,14 +44,28 @@ struct usbdevfs_ctrltransfer { > __u16 wIndex; > __u16 wLength; > __u32 timeout; /* in milliseconds */ >+#if __riscv_xlen == 64 >+ union { >+ void __user *data; >+ __u64 __data; >+ }; >+#else > void __user *data; >+#endif > }; > > struct usbdevfs_bulktransfer { > unsigned int ep; > unsigned int len; > unsigned int timeout; /* in milliseconds */ >+#if __riscv_xlen == 64 >+ union { >+ void __user *data; >+ __u64 __data; >+ }; >+#else > void __user *data; >+#endif > }; > > struct usbdevfs_setinterface { >@@ -61,7 +75,14 @@ struct usbdevfs_setinterface { > > struct usbdevfs_disconnectsignal { > unsigned int signr; >+#if __riscv_xlen == 64 >+ union { >+ void __user *context; >+ __u64 __context; >+ }; >+#else > void __user *context; >+#endif > }; > > #define USBDEVFS_MAXDRIVERNAME 255 >@@ -119,7 +140,14 @@ struct usbdevfs_urb { > unsigned char endpoint; > int status; > unsigned int flags; >+#if __riscv_xlen == 
64 >+ union { >+ void __user *buffer; >+ __u64 __buffer; >+ }; >+#else > void __user *buffer; >+#endif > int buffer_length; > int actual_length; > int start_frame; >@@ -130,7 +158,14 @@ struct usbdevfs_urb { > int error_count; > unsigned int signr; /* signal to be sent on completion, > or 0 if none should be sent. */ >+#if __riscv_xlen == 64 >+ union { >+ void __user *usercontext; >+ __u64 __usercontext; >+ }; >+#else > void __user *usercontext; >+#endif > struct usbdevfs_iso_packet_desc iso_frame_desc[]; > }; > >@@ -139,7 +174,14 @@ struct usbdevfs_ioctl { > int ifno; /* interface 0..N ; negative numbers reserved */ > int ioctl_code; /* MUST encode size + direction of data so the > * macros in give correct values */ >+#if __riscv_xlen == 64 >+ union { >+ void __user *data; /* param buffer (in, or out) */ >+ __u64 __pad; >+ }; >+#else > void __user *data; /* param buffer (in, or out) */ >+#endif > }; > > /* You can do most things with hubs just through control messages, >@@ -195,9 +237,17 @@ struct usbdevfs_streams { > #define USBDEVFS_SUBMITURB _IOR('U', 10, struct usbdevfs_urb) > #define USBDEVFS_SUBMITURB32 _IOR('U', 10, struct usbdevfs_urb32) > #define USBDEVFS_DISCARDURB _IO('U', 11) >+#if __riscv_xlen == 64 >+#define USBDEVFS_REAPURB _IOW('U', 12, __u64) >+#else > #define USBDEVFS_REAPURB _IOW('U', 12, void *) >+#endif > #define USBDEVFS_REAPURB32 _IOW('U', 12, __u32) >+#if __riscv_xlen == 64 >+#define USBDEVFS_REAPURBNDELAY _IOW('U', 13, __u64) >+#else > #define USBDEVFS_REAPURBNDELAY _IOW('U', 13, void *) >+#endif > #define USBDEVFS_REAPURBNDELAY32 _IOW('U', 13, __u32) > #define USBDEVFS_DISCSIGNAL _IOR('U', 14, struct usbdevfs_disconnectsignal) > #define USBDEVFS_DISCSIGNAL32 _IOR('U', 14, struct usbdevfs_disconnectsignal32) >diff --git a/include/uapi/linux/uvcvideo.h b/include/uapi/linux/uvcvideo.h >index f86185456dc5..3ccb99039a43 100644 >--- a/include/uapi/linux/uvcvideo.h >+++ b/include/uapi/linux/uvcvideo.h >@@ -54,7 +54,14 @@ struct 
uvc_xu_control_mapping { > __u32 v4l2_type; > __u32 data_type; > >+#if __riscv_xlen == 64 >+ union { >+ struct uvc_menu_info __user *menu_info; >+ __u64 __menu_info; >+ }; >+#else > struct uvc_menu_info __user *menu_info; >+#endif > __u32 menu_count; > > __u32 reserved[4]; >@@ -66,7 +73,14 @@ struct uvc_xu_control_query { > __u8 query; /* Video Class-Specific Request Code, */ > /* defined in linux/usb/video.h A.8. */ > __u16 size; >+#if __riscv_xlen == 64 >+ union { >+ __u8 __user *data; >+ __u64 __data; >+ }; >+#else > __u8 __user *data; >+#endif > }; > > #define UVCIOC_CTRL_MAP _IOWR('u', 0x20, struct uvc_xu_control_mapping) >diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h >index c8dbf8219c4f..0a1dc2a780fb 100644 >--- a/include/uapi/linux/vfio.h >+++ b/include/uapi/linux/vfio.h >@@ -1570,7 +1570,14 @@ struct vfio_iommu_type1_dma_map { > struct vfio_bitmap { > __u64 pgsize; /* page size for bitmap in bytes */ > __u64 size; /* in bytes */ >+ #if __riscv_xlen == 64 >+ union { >+ __u64 __user *data; /* one bit per page */ >+ __u64 __data; >+ }; >+ #else > __u64 __user *data; /* one bit per page */ >+ #endif > }; > > /** >diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h >index e7c4dce39007..8e5391f07626 100644 >--- a/include/uapi/linux/videodev2.h >+++ b/include/uapi/linux/videodev2.h >@@ -1898,7 +1898,14 @@ struct v4l2_ext_controls { > __u32 error_idx; > __s32 request_fd; > __u32 reserved[1]; >+#if __riscv_xlen == 64 >+ union { >+ struct v4l2_ext_control *controls; >+ __u64 __controls; >+ }; >+#else > struct v4l2_ext_control *controls; >+#endif > }; > > #define V4L2_CTRL_ID_MASK (0x0fffffff) >-- >2.40.1 > > From torvalds at linux-foundation.org Tue Mar 25 13:41:30 2025 From: torvalds at linux-foundation.org (Linus Torvalds) Date: Tue, 25 Mar 2025 13:41:30 -0700 Subject: [RFC PATCH V3 01/43] rv64ilp32_abi: uapi: Reuse lp64 ABI interface In-Reply-To: <20250325121624.523258-2-guoren@kernel.org> References: 
<20250325121624.523258-1-guoren@kernel.org> <20250325121624.523258-2-guoren@kernel.org>
Message-ID:

On Tue, 25 Mar 2025 at 05:17, wrote:
>
> The rv64ilp32 abi kernel accommodates the lp64 abi userspace and
> leverages the lp64 abi Linux interface. Hence, unify the
> BITS_PER_LONG = 32 memory layout to match BITS_PER_LONG = 64.

No.

This isn't happening.

You can't do crazy things in the RISC-V code and then expect the rest
of the kernel to just go "ok, we'll do crazy things".

We're not doing crazy __riscv_xlen hackery with random structures
containing 64-bit values that the kernel then only looks at the low 32
bits. That's wrong on *so* many levels.

I'm willing to say "big-endian is dead", but I'm not willing to accept
this kind of crazy hackery.

Not today, not ever.

If you want to run a ilp32 kernel on 64-bit hardware (and support
64-bit ABI just in a 32-bit virtual memory size), I would suggest you

 (a) treat the kernel as natively 32-bit (obviously you can then tell
the compiler to use the rv64 instructions, which I presume you're
already doing - I didn't look)

 (b) look at making the compat stuff do the conversion the "wrong way".

And btw, that (b) implies *not* just ignoring the high bits. If
user-space gives 64-bit pointer, you don't just treat it as a 32-bit
one by dropping the high bits. You add some logic to convert it to an
invalid pointer so that user space gets -EFAULT.
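
One possible reading of point (b), as a user-space sketch only: the names
below (narrow_user_ptr, truncate_user_ptr, INVALID_UPTR32) are hypothetical
and do not exist in the kernel, and the sentinel value is an assumption
about an address that is never mapped for user space. The point is that a
64-bit pointer value with high bits set is turned into a known-invalid
pointer rather than truncated, so a later user access would fail and the
syscall could return -EFAULT:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sentinel for this sketch: an address assumed to be
 * unmapped in user space, so any access through it faults.
 */
#define INVALID_UPTR32 UINT32_C(0xffffffff)

/*
 * Narrow a 64-bit user pointer value for a kernel with 32-bit pointers.
 * Values that fit pass through unchanged; anything with high bits set
 * becomes the invalid sentinel instead of being silently truncated.
 */
static uint32_t narrow_user_ptr(uint64_t uptr64)
{
	if (uptr64 > UINT32_MAX)
		return INVALID_UPTR32;
	return (uint32_t)uptr64;
}

/* Plain truncation, shown only for contrast: it aliases another address. */
static uint32_t truncate_user_ptr(uint64_t uptr64)
{
	return (uint32_t)uptr64;
}

static void narrow_user_ptr_examples(void)
{
	/* An in-range pointer is unchanged. */
	assert(narrow_user_ptr(UINT64_C(0x1000)) == UINT32_C(0x1000));
	/* An out-of-range pointer is rejected ... */
	assert(narrow_user_ptr(UINT64_C(0x100001000)) == INVALID_UPTR32);
	/* ... whereas truncation would quietly point at 0x1000 instead. */
	assert(truncate_user_ptr(UINT64_C(0x100001000)) == UINT32_C(0x1000));
}
```

Where such a conversion would live (for example in a compat-style
translation layer, as the message suggests) is not specified in this thread.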
Linus

From guoren at kernel.org Tue Mar 25 20:35:43 2025
From: guoren at kernel.org (Guo Ren)
Date: Wed, 26 Mar 2025 11:35:43 +0800
Subject: [RFC PATCH V3 01/43] rv64ilp32_abi: uapi: Reuse lp64 ABI interface
In-Reply-To: <0024788o-35r0-73q1-1s54-q564p457q33s@vanv.qr>
References: <20250325121624.523258-1-guoren@kernel.org> <20250325121624.523258-2-guoren@kernel.org> <0024788o-35r0-73q1-1s54-q564p457q33s@vanv.qr>
Message-ID:

On Wed, Mar 26, 2025 at 4:31 AM Jan Engelhardt wrote:
>
>
> On Tuesday 2025-03-25 13:15, guoren at kernel.org wrote:
>
> >diff --git a/include/uapi/linux/netfilter/x_tables.h b/include/uapi/linux/netfilter/x_tables.h
> >index 796af83a963a..7e02e34c6fad 100644
> >--- a/include/uapi/linux/netfilter/x_tables.h
> >+++ b/include/uapi/linux/netfilter/x_tables.h
> >@@ -18,7 +18,11 @@ struct xt_entry_match {
> >			__u8 revision;
> >		} user;
> >		struct {
> >+#if __riscv_xlen == 64
> >+			__u64 match_size;
> >+#else
> >			__u16 match_size;
> >+#endif
> >
> >			/* Used inside the kernel */
> >			struct xt_match *match;
>
> The __u16 is the common prefix of the union which is exposed to userspace.
> If anything, you need to use __attribute__((aligned(8))) to move
> `match` to a fixed location.
>
> However, that sub-struct is only used inside the kernel and never exposed,
> so the alignment of `match` should not play a role.
>
> Moreover, change from u16 to u64 would break RISC-V Big-Endian. Even if there
> currently is no big-endian variant, let's not introduce such breakage.
You're correct. The __u64 modification is too raw from the proof of
concept. It's not correct, so I would accept your advice.

>
>
> >--- a/include/uapi/linux/netfilter_ipv4/ip_tables.h
> >+++ b/include/uapi/linux/netfilter_ipv4/ip_tables.h
> >@@ -200,7 +200,14 @@ struct ipt_replace {
> >	/* Number of counters (must be equal to current number of entries). */
> >	unsigned int num_counters;
> >	/* The old entries' counters.
*/ > >+#if __riscv_xlen == 64 > >+ union { > >+ struct xt_counters __user *counters; > >+ __u64 __counters; > >+ }; > >+#else > > struct xt_counters __user *counters; > >+#endif > > > > /* The entries (hang off end: not really an array). */ > > struct ipt_entry entries[]; > > This seems ok, but perhaps there is a better name for __riscv_xlen (ifdef > CONFIG_????ilp32), so it is not strictly tied to riscv, > in case other platform wants to try ilp32-self mode. Yes, I want that macro, but Linus has suggested "compat stuff". I would have to try. Thx for the reviewing! > > >+#if __riscv_xlen == 64 > >+ union { > >+ int __user *auth_flavours; /* 1 */ > >+ __u64 __auth_flavours; > >+ }; > >+#else > > int __user *auth_flavours; /* 1 */ > >+#endif > > }; > > > > /* bits in the flags field */ > >diff --git a/include/uapi/linux/ppp-ioctl.h b/include/uapi/linux/ppp-ioctl.h > >index 1cc5ce0ae062..8d48eab430c1 100644 > >--- a/include/uapi/linux/ppp-ioctl.h > >+++ b/include/uapi/linux/ppp-ioctl.h > >@@ -59,7 +59,14 @@ struct npioctl { > > > > /* Structure describing a CCP configuration option, for PPPIOCSCOMPRESS */ > > struct ppp_option_data { > >+#if __riscv_xlen == 64 > >+ union { > >+ __u8 __user *ptr; > >+ __u64 __ptr; > >+ }; > >+#else > > __u8 __user *ptr; > >+#endif > > __u32 length; > > int transmit; > > }; > >diff --git a/include/uapi/linux/sctp.h b/include/uapi/linux/sctp.h > >index b7d91d4cf0db..46a06fddcd2f 100644 > >--- a/include/uapi/linux/sctp.h > >+++ b/include/uapi/linux/sctp.h > >@@ -1024,6 +1024,9 @@ struct sctp_getaddrs_old { > > #else > > struct sockaddr *addrs; > > #endif > >+#if (__riscv_xlen == 64) && (__SIZEOF_LONG__ == 4) > >+ __u32 unused; > >+#endif > > }; > > > > > > struct sctp_getaddrs { > >diff --git a/include/uapi/linux/sem.h b/include/uapi/linux/sem.h > >index 75aa3b273cd9..de9f441913cd 100644 > >--- a/include/uapi/linux/sem.h > >+++ b/include/uapi/linux/sem.h > >@@ -26,10 +26,29 @@ struct semid_ds { > > struct ipc_perm sem_perm; /* permissions 
.. see ipc.h */ > > __kernel_old_time_t sem_otime; /* last semop time */ > > __kernel_old_time_t sem_ctime; /* create/last semctl() time */ > >+#if __riscv_xlen == 64 > >+ union { > >+ struct sem *sem_base; /* ptr to first semaphore in array */ > >+ __u64 __sem_base; > >+ }; > >+ union { > >+ struct sem_queue *sem_pending; /* pending operations to be processed */ > >+ __u64 __sem_pending; > >+ }; > >+ union { > >+ struct sem_queue **sem_pending_last; /* last pending operation */ > >+ __u64 __sem_pending_last; > >+ }; > >+ union { > >+ struct sem_undo *undo; /* undo requests on this array */ > >+ __u64 __undo; > >+ }; > >+#else > > struct sem *sem_base; /* ptr to first semaphore in array */ > > struct sem_queue *sem_pending; /* pending operations to be processed */ > > struct sem_queue **sem_pending_last; /* last pending operation */ > > struct sem_undo *undo; /* undo requests on this array */ > >+#endif > > unsigned short sem_nsems; /* no. of semaphores in array */ > > }; > > > >@@ -46,10 +65,29 @@ struct sembuf { > > /* arg for semctl system calls. 
*/ > > union semun { > > int val; /* value for SETVAL */ > >+#if __riscv_xlen == 64 > >+ union { > >+ struct semid_ds __user *buf; /* buffer for IPC_STAT & IPC_SET */ > >+ __u64 ___buf; > >+ }; > >+ union { > >+ unsigned short __user *array; /* array for GETALL & SETALL */ > >+ __u64 __array; > >+ }; > >+ union { > >+ struct seminfo __user *__buf; /* buffer for IPC_INFO */ > >+ __u64 ____buf; > >+ }; > >+ union { > >+ void __user *__pad; > >+ __u64 ____pad; > >+ }; > >+#else > > struct semid_ds __user *buf; /* buffer for IPC_STAT & IPC_SET */ > > unsigned short __user *array; /* array for GETALL & SETALL */ > > struct seminfo __user *__buf; /* buffer for IPC_INFO */ > > void __user *__pad; > >+#endif > > }; > > > > struct seminfo { > >diff --git a/include/uapi/linux/socket.h b/include/uapi/linux/socket.h > >index d3fcd3b5ec53..5f7a83649395 100644 > >--- a/include/uapi/linux/socket.h > >+++ b/include/uapi/linux/socket.h > >@@ -22,7 +22,14 @@ struct __kernel_sockaddr_storage { > > /* space to achieve desired size, */ > > /* _SS_MAXSIZE value minus size of ss_family */ > > }; > >+#if __riscv_xlen == 64 > >+ union { > >+ void *__align; /* implementation specific desired alignment */ > >+ u64 ___align; > >+ }; > >+#else > > void *__align; /* implementation specific desired alignment */ > >+#endif > > }; > > }; > > > >diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h > >index 8981f00204db..8ed7b29897f9 100644 > >--- a/include/uapi/linux/sysctl.h > >+++ b/include/uapi/linux/sysctl.h > >@@ -33,13 +33,45 @@ > > member of a struct __sysctl_args to have? 
*/ > > > > struct __sysctl_args { > >+#if __riscv_xlen == 64 > >+ union { > >+ int __user *name; > >+ __u64 __name; > >+ }; > >+#else > > int __user *name; > >+#endif > > int nlen; > >+#if __riscv_xlen == 64 > >+ union { > >+ void __user *oldval; > >+ __u64 __oldval; > >+ }; > >+#else > > void __user *oldval; > >+#endif > >+#if __riscv_xlen == 64 > >+ union { > >+ size_t __user *oldlenp; > >+ __u64 __oldlenp; > >+ }; > >+#else > > size_t __user *oldlenp; > >+#endif > >+#if __riscv_xlen == 64 > >+ union { > >+ void __user *newval; > >+ __u64 __newval; > >+ }; > >+#else > > void __user *newval; > >+#endif > > size_t newlen; > >+#if __riscv_xlen == 64 > >+ unsigned long long __unused[4]; > >+#else > > unsigned long __unused[4]; > >+#endif > > }; > > > > /* Define sysctl names first */ > >diff --git a/include/uapi/linux/uhid.h b/include/uapi/linux/uhid.h > >index cef7534d2d19..4a774dbd3de8 100644 > >--- a/include/uapi/linux/uhid.h > >+++ b/include/uapi/linux/uhid.h > >@@ -130,7 +130,14 @@ struct uhid_create_req { > > __u8 name[128]; > > __u8 phys[64]; > > __u8 uniq[64]; > >+#if __riscv_xlen == 64 > >+ union { > >+ __u8 __user *rd_data; > >+ __u64 __rd_data; > >+ }; > >+#else > > __u8 __user *rd_data; > >+#endif > > __u16 rd_size; > > > > __u16 bus; > >diff --git a/include/uapi/linux/uio.h b/include/uapi/linux/uio.h > >index 649739e0c404..27dfd6032dc6 100644 > >--- a/include/uapi/linux/uio.h > >+++ b/include/uapi/linux/uio.h > >@@ -16,8 +16,19 @@ > > > > struct iovec > > { > >+#if __riscv_xlen == 64 > >+ union { > >+ void __user *iov_base; /* BSD uses caddr_t (1003.1g requires void *) */ > >+ __u64 __iov_base; > >+ }; > >+ union { > >+ __kernel_size_t iov_len; /* Must be size_t (1003.1g) */ > >+ __u64 __iov_len; > >+ }; > >+#else > > void __user *iov_base; /* BSD uses caddr_t (1003.1g requires void *) */ > > __kernel_size_t iov_len; /* Must be size_t (1003.1g) */ > >+#endif > > }; > > > > struct dmabuf_cmsg { > >diff --git a/include/uapi/linux/usb/tmc.h 
b/include/uapi/linux/usb/tmc.h > >index d791cc58a7f0..443ec5356caf 100644 > >--- a/include/uapi/linux/usb/tmc.h > >+++ b/include/uapi/linux/usb/tmc.h > >@@ -51,7 +51,14 @@ struct usbtmc_request { > > > > struct usbtmc_ctrlrequest { > > struct usbtmc_request req; > >+#if __riscv_xlen == 64 > >+ union { > >+ void __user *data; /* pointer to user space */ > >+ __u64 __data; /* pointer to user space */ > >+ }; > >+#else > > void __user *data; /* pointer to user space */ > >+#endif > > } __attribute__ ((packed)); > > > > struct usbtmc_termchar { > >@@ -70,7 +77,14 @@ struct usbtmc_message { > > __u32 transfer_size; /* size of bytes to transfer */ > > __u32 transferred; /* size of received/written bytes */ > > __u32 flags; /* bit 0: 0 = synchronous; 1 = asynchronous */ > >+#if __riscv_xlen == 64 > >+ union { > >+ void __user *message; /* pointer to header and data in user space */ > >+ __u64 __message; > >+ }; > >+#else > > void __user *message; /* pointer to header and data in user space */ > >+#endif > > } __attribute__ ((packed)); > > > > /* Request values for USBTMC driver's ioctl entry point */ > >diff --git a/include/uapi/linux/usbdevice_fs.h b/include/uapi/linux/usbdevice_fs.h > >index 74a84e02422a..8c8efef74c3c 100644 > >--- a/include/uapi/linux/usbdevice_fs.h > >+++ b/include/uapi/linux/usbdevice_fs.h > >@@ -44,14 +44,28 @@ struct usbdevfs_ctrltransfer { > > __u16 wIndex; > > __u16 wLength; > > __u32 timeout; /* in milliseconds */ > >+#if __riscv_xlen == 64 > >+ union { > >+ void __user *data; > >+ __u64 __data; > >+ }; > >+#else > > void __user *data; > >+#endif > > }; > > > > struct usbdevfs_bulktransfer { > > unsigned int ep; > > unsigned int len; > > unsigned int timeout; /* in milliseconds */ > >+#if __riscv_xlen == 64 > >+ union { > >+ void __user *data; > >+ __u64 __data; > >+ }; > >+#else > > void __user *data; > >+#endif > > }; > > > > struct usbdevfs_setinterface { > >@@ -61,7 +75,14 @@ struct usbdevfs_setinterface { > > > > struct 
usbdevfs_disconnectsignal { > > unsigned int signr; > >+#if __riscv_xlen == 64 > >+ union { > >+ void __user *context; > >+ __u64 __context; > >+ }; > >+#else > > void __user *context; > >+#endif > > }; > > > > #define USBDEVFS_MAXDRIVERNAME 255 > >@@ -119,7 +140,14 @@ struct usbdevfs_urb { > > unsigned char endpoint; > > int status; > > unsigned int flags; > >+#if __riscv_xlen == 64 > >+ union { > >+ void __user *buffer; > >+ __u64 __buffer; > >+ }; > >+#else > > void __user *buffer; > >+#endif > > int buffer_length; > > int actual_length; > > int start_frame; > >@@ -130,7 +158,14 @@ struct usbdevfs_urb { > > int error_count; > > unsigned int signr; /* signal to be sent on completion, > > or 0 if none should be sent. */ > >+#if __riscv_xlen == 64 > >+ union { > >+ void __user *usercontext; > >+ __u64 __usercontext; > >+ }; > >+#else > > void __user *usercontext; > >+#endif > > struct usbdevfs_iso_packet_desc iso_frame_desc[]; > > }; > > > >@@ -139,7 +174,14 @@ struct usbdevfs_ioctl { > > int ifno; /* interface 0..N ; negative numbers reserved */ > > int ioctl_code; /* MUST encode size + direction of data so the > > * macros in give correct values */ > >+#if __riscv_xlen == 64 > >+ union { > >+ void __user *data; /* param buffer (in, or out) */ > >+ __u64 __pad; > >+ }; > >+#else > > void __user *data; /* param buffer (in, or out) */ > >+#endif > > }; > > > > /* You can do most things with hubs just through control messages, > >@@ -195,9 +237,17 @@ struct usbdevfs_streams { > > #define USBDEVFS_SUBMITURB _IOR('U', 10, struct usbdevfs_urb) > > #define USBDEVFS_SUBMITURB32 _IOR('U', 10, struct usbdevfs_urb32) > > #define USBDEVFS_DISCARDURB _IO('U', 11) > >+#if __riscv_xlen == 64 > >+#define USBDEVFS_REAPURB _IOW('U', 12, __u64) > >+#else > > #define USBDEVFS_REAPURB _IOW('U', 12, void *) > >+#endif > > #define USBDEVFS_REAPURB32 _IOW('U', 12, __u32) > >+#if __riscv_xlen == 64 > >+#define USBDEVFS_REAPURBNDELAY _IOW('U', 13, __u64) > >+#else > > #define 
USBDEVFS_REAPURBNDELAY _IOW('U', 13, void *) > >+#endif > > #define USBDEVFS_REAPURBNDELAY32 _IOW('U', 13, __u32) > > #define USBDEVFS_DISCSIGNAL _IOR('U', 14, struct usbdevfs_disconnectsignal) > > #define USBDEVFS_DISCSIGNAL32 _IOR('U', 14, struct usbdevfs_disconnectsignal32) > >diff --git a/include/uapi/linux/uvcvideo.h b/include/uapi/linux/uvcvideo.h > >index f86185456dc5..3ccb99039a43 100644 > >--- a/include/uapi/linux/uvcvideo.h > >+++ b/include/uapi/linux/uvcvideo.h > >@@ -54,7 +54,14 @@ struct uvc_xu_control_mapping { > > __u32 v4l2_type; > > __u32 data_type; > > > >+#if __riscv_xlen == 64 > >+ union { > >+ struct uvc_menu_info __user *menu_info; > >+ __u64 __menu_info; > >+ }; > >+#else > > struct uvc_menu_info __user *menu_info; > >+#endif > > __u32 menu_count; > > > > __u32 reserved[4]; > >@@ -66,7 +73,14 @@ struct uvc_xu_control_query { > > __u8 query; /* Video Class-Specific Request Code, */ > > /* defined in linux/usb/video.h A.8. */ > > __u16 size; > >+#if __riscv_xlen == 64 > >+ union { > >+ __u8 __user *data; > >+ __u64 __data; > >+ }; > >+#else > > __u8 __user *data; > >+#endif > > }; > > > > #define UVCIOC_CTRL_MAP _IOWR('u', 0x20, struct uvc_xu_control_mapping) > >diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h > >index c8dbf8219c4f..0a1dc2a780fb 100644 > >--- a/include/uapi/linux/vfio.h > >+++ b/include/uapi/linux/vfio.h > >@@ -1570,7 +1570,14 @@ struct vfio_iommu_type1_dma_map { > > struct vfio_bitmap { > > __u64 pgsize; /* page size for bitmap in bytes */ > > __u64 size; /* in bytes */ > >+ #if __riscv_xlen == 64 > >+ union { > >+ __u64 __user *data; /* one bit per page */ > >+ __u64 __data; > >+ }; > >+ #else > > __u64 __user *data; /* one bit per page */ > >+ #endif > > }; > > > > /** > >diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h > >index e7c4dce39007..8e5391f07626 100644 > >--- a/include/uapi/linux/videodev2.h > >+++ b/include/uapi/linux/videodev2.h > >@@ -1898,7 +1898,14 @@ struct 
v4l2_ext_controls { > > __u32 error_idx; > > __s32 request_fd; > > __u32 reserved[1]; > >+#if __riscv_xlen == 64 > >+ union { > >+ struct v4l2_ext_control *controls; > >+ __u64 __controls; > >+ }; > >+#else > > struct v4l2_ext_control *controls; > >+#endif > > }; > > > > #define V4L2_CTRL_ID_MASK (0x0fffffff) > >-- > >2.40.1 > > > > -- Best Regards Guo Ren From guoren at kernel.org Tue Mar 25 23:07:47 2025 From: guoren at kernel.org (Guo Ren) Date: Wed, 26 Mar 2025 14:07:47 +0800 Subject: [RFC PATCH V3 00/43] rv64ilp32_abi: Build CONFIG_64BIT kernel-self with ILP32 ABI In-Reply-To: References: <20250325121624.523258-1-guoren@kernel.org> <20250325122640.GK36322@noisy.programming.kicks-ass.net> Message-ID: On Tue, Mar 25, 2025 at 9:18 PM Arnd Bergmann wrote: > > On Tue, Mar 25, 2025, at 13:26, Peter Zijlstra wrote: > > On Tue, Mar 25, 2025 at 08:15:41AM -0400, guoren at kernel.org wrote: > >> From: "Guo Ren (Alibaba DAMO Academy)" > >> > >> Since 2001, the CONFIG_64BIT kernel has been built with the LP64 ABI, > >> but this patchset allows the CONFIG_64BIT kernel to use an ILP32 ABI > > > > Please, don't do this. This adds a significant maintenance burden on all > > of us. > > It would be easier to do this with CONFIG_64BIT disabled and continue > treating CONFIG_64BIT to be the same as BITS_PER_LONG=64, but I still > think it's fundamentally a bad idea to support this in mainline > kernels in any variation, other than supporting regular 32-bit > compat mode tasks on a regular 64-bit kernel. > > >> The patchset targets RISC-V and is built on the RV64ILP32 ABI, which > >> was introduced into RISC-V's psABI in January 2025 [1]. This patchset > >> equips an rv64ilp32-abi kernel with all the functionalities of a > >> traditional lp64-abi kernel, yet restricts the address space to 2GiB. > >> Hence, the rv64ilp32-abi kernel simultaneously supports lp64-abi > >> userspace and ilp32-abi (compat) userspace, the same as the > >> traditional lp64-abi kernel.
> > You declare the syscall ABI to be the native 64-bit ABI, but this > is fundamentally not true because a many uapi structures are > defined in terms of 'long' or pointer values, in particular in > the ioctl call. I modified uapi with void __user *msg_name; -> union {void __user *msg_name; u64 __msg_name;}; to make native 64-bit ABI. I would look at compat stuff instead of using __riscv_xlen macro. > This might work for an rv64ilp32 userspace that > uses the same headers and the same types, but you explicitly > say that the goal is to run native rv64 or compat rv32 tasks, > not rv64ilp32 (thanks!). It's not for rv64ilp32-abi userspace, no rv64ilp32-abi userspace introduced in the patch set. It's for native lp64-abi. Let's discuss this in the first patch thread: uapi: Reuse lp64 ABI interface > > As far as I can tell, there is no way to rectify this design flaw > other than to drop support for 64-bit userspace and only support > regular rv32 userspace. I'm also skeptical that supporting rv64 > userspace helps in practice other than for testing, since > generally most memory overhead is in userspace rather than the > kernel, and there is much more to gain from shrinking the larger > userspace by running rv32 compat mode binaries on a 64-bit kernel > than the other way round. The lp64-abi userspace rootfs works fine in this patch set, which proves the technique is valid. But the modification on uapi is raw, and I'm looking at compat stuff. Supporting lp64-abi userspace is essential because riscv lp64-abi and ilp32-abi userspace are hybrid deployments when the target is ilp32-abi userspace. The lp64-abi provides a good supplement to ilp32-abi which eases the development. > > If you remove the CONFIG_64BIT changes that Peter mentioned and > the support for ilp64 userland from your series, you end up > with a kernel that is very similar to a native rv32 kernel > but executes as rv64ilp32 and runs rv32 userspace. 
I don't have > any objections to that approach, and the same thing has come > up on arm64 as a possible idea as well, but I don't know if > that actually brings any notable advantage over an rv32 kernel. > > Are there CPUs that can run rv64 kernels and rv32 userspace > but not rv32 kernels, similar to what we have on Arm Cortex-A76 > and Cortex-A510? Yes, there is, and it only supports rv32 userspace, not rv32 kernel. https://www.xrvm.com/product/xuantie/C908 Here are the products: https://developer.canaan-creative.com/k230_canmv/en/dev/userguide/boards/canmv_k230d.html http://riscv.org/ecosystem-news/2024/07/unpacking-the-canmv-k230-risc-v-board/ -- Best Regards Guo Ren From guoren at kernel.org Tue Mar 25 23:34:32 2025 From: guoren at kernel.org (Guo Ren) Date: Wed, 26 Mar 2025 14:34:32 +0800 Subject: [RFC PATCH V3 01/43] rv64ilp32_abi: uapi: Reuse lp64 ABI interface In-Reply-To: References: <20250325121624.523258-1-guoren@kernel.org> <20250325121624.523258-2-guoren@kernel.org> Message-ID: On Wed, Mar 26, 2025 at 4:41 AM Linus Torvalds wrote: > > On Tue, 25 Mar 2025 at 05:17, wrote: > > > > The rv64ilp32 abi kernel accommodates the lp64 abi userspace and > > leverages the lp64 abi Linux interface. Hence, unify the > > BITS_PER_LONG = 32 memory layout to match BITS_PER_LONG = 64. > > No. > > This isn't happening. > > You can't do crazy things in the RISC-V code and then expect the rest > of the kernel to just go "ok, we'll do crazy things". > > We're not doing crazy __riscv_xlen hackery with random structures > containing 64-bit values that the kernel then only looks at the low 32 > bits. That's wrong on *so* many levels. > > I'm willing to say "big-endian is dead", but I'm not willing to accept > this kind of crazy hackery. > > Not today, not ever.
> > If you want to run an ilp32 kernel on 64-bit hardware (and support > 64-bit ABI just in a 32-bit virtual memory size), I would suggest you > > (a) treat the kernel as natively 32-bit (obviously you can then tell > the compiler to use the rv64 instructions, which I presume you're > already doing - I didn't look) I used CONFIG_32BIT in v1 and v2, but I've abandoned them because, based on CONFIG_64BIT, I gain more functionality by inheriting the lp64-abi kernel. I want the full functionality of the CONFIG_64BIT Linux kernel, so that the two kernels are equivalent and can be used interchangeably and seamlessly. > > (b) look at making the compat stuff do the conversion the "wrong way". > > And btw, that (b) implies *not* just ignoring the high bits. If > user-space gives a 64-bit pointer, you don't just treat it as a 32-bit > one by dropping the high bits. You add some logic to convert it to an > invalid pointer so that user space gets -EFAULT. Thanks for the advice. I'm looking at how to make the compat stuff work. -- Best Regards Guo Ren From arnd at arndb.de Tue Mar 25 23:55:17 2025 From: arnd at arndb.de (Arnd Bergmann) Date: Wed, 26 Mar 2025 07:55:17 +0100 Subject: [RFC PATCH V3 00/43] rv64ilp32_abi: Build CONFIG_64BIT kernel-self with ILP32 ABI In-Reply-To: References: <20250325121624.523258-1-guoren@kernel.org> <20250325122640.GK36322@noisy.programming.kicks-ass.net> Message-ID: On Wed, Mar 26, 2025, at 07:07, Guo Ren wrote: > On Tue, Mar 25, 2025 at 9:18 PM Arnd Bergmann wrote: >> On Tue, Mar 25, 2025, at 13:26, Peter Zijlstra wrote: >> > On Tue, Mar 25, 2025 at 08:15:41AM -0400, guoren at kernel.org wrote: >> >> You declare the syscall ABI to be the native 64-bit ABI, but this >> is fundamentally not true because many uapi structures are >> defined in terms of 'long' or pointer values, in particular in >> the ioctl call. > > I modified uapi with > void __user *msg_name; > -> > union {void __user *msg_name; u64 __msg_name;}; > to make native 64-bit ABI.
> > I would look at compat stuff instead of using __riscv_xlen macro. The problem I see here is that there are many more drivers that you did not modify than drivers that you did change this way. The union is particularly ugly, but even if you find a nicer method of doing this, you now also put the burden on future driver writers to do this right for your platform. >> As far as I can tell, there is no way to rectify this design flaw >> other than to drop support for 64-bit userspace and only support >> regular rv32 userspace. I'm also skeptical that supporting rv64 >> userspace helps in practice other than for testing, since >> generally most memory overhead is in userspace rather than the >> kernel, and there is much more to gain from shrinking the larger >> userspace by running rv32 compat mode binaries on a 64-bit kernel >> than the other way round. > > The lp64-abi userspace rootfs works fine in this patch set, which > proves the technique is valid. But the modification on uapi is raw, > and I'm looking at compat stuff. There is a big difference between making it work for a particular set of userspace binaries and making it correct for the entire kernel ABI. I agree that limiting the hacks to the compat side while keeping the native ABI as ilp32 as in your previous versions is better, but I also don't think this can be easily done without major changes to how compat mode works in general, and that still seems like a show-stopper for two reasons: - it still puts the burden on driver writers to get it right for your platform. The scope is a bit smaller than in the current version because that would be limited to the compat handlers and not change the native codepath, but that's still a lot of drivers. - the way that I would imagine this to be implemented in practice would require changing the compat code in a way that allows multiple compat ABIs, so drivers can separate the normal 32-on-64 handling from the 64-on-32 version you need. 
We have discussed something like this in the past, but Linus has already made it very clear that he doesn't want it done that way. Whichever way you do it, this is unlikely to find consensus. > Supporting lp64-abi userspace is essential because riscv lp64-abi and > ilp32-abi userspace are hybrid deployments when the target is > ilp32-abi userspace. The lp64-abi provides a good supplement to > ilp32-abi which eases the development. I'm not following here, please clarify. I do understand that having a mixed 32/64 userspace can help for development, but that can already be done on a 64-bit kernel and it doesn't seem to be useful for deployment because having two sets of support libraries makes this counterproductive for the goal of saving RAM. >> If you remove the CONFIG_64BIT changes that Peter mentioned and >> the support for ilp64 userland from your series, you end up >> with a kernel that is very similar to a native rv32 kernel >> but executes as rv64ilp32 and runs rv32 userspace. I don't have >> any objections to that approach, and the same thing has come >> up on arm64 as a possible idea as well, but I don't know if >> that actually brings any notable advantage over an rv32 kernel. >> >> Are there CPUs that can run rv64 kernels and rv32 userspace >> but not rv32 kernels, similar to what we have on Arm Cortex-A76 >> and Cortex-A510? > > Yes, there is, and it only supports rv32 userspace, not rv32 kernel. > https://www.xrvm.com/product/xuantie/C908 Ok, thanks for the link. 
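For readers following the layout argument in this thread, here is a minimal sketch (not taken from the posted series) of what the union trick does to a structure such as struct iovec, and of the "reject, don't truncate" pointer handling suggested in point (b). The names ptr32_t, size32_t, the iovec_* structs, INVALID_UPTR32 and narrow_user_ptr() are all hypothetical stand-ins chosen so the sizes are identical on any build host; the real uapi types are void __user * and __kernel_size_t, and the real compat code would look different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-width stand-ins for a 32-bit pointer and size_t. */
typedef uint32_t ptr32_t;
typedef uint32_t size32_t;

/* struct iovec as an ILP32 compiler would lay it out: 2 x 4 bytes. */
struct iovec_ilp32 {
	ptr32_t  iov_base;
	size32_t iov_len;
};

/* struct iovec as an LP64 compiler would lay it out: 2 x 8 bytes. */
struct iovec_lp64 {
	uint64_t iov_base;
	uint64_t iov_len;
};

/* The union approach: every 32-bit member is padded out to 64 bits. */
struct iovec_padded {
	union { ptr32_t  iov_base; uint64_t __iov_base; };
	union { size32_t iov_len;  uint64_t __iov_len;  };
};

/* The padded layout matches LP64 in size and member offsets. */
static_assert(sizeof(struct iovec_ilp32) == 8, "ILP32 layout is 8 bytes");
static_assert(sizeof(struct iovec_padded) == sizeof(struct iovec_lp64),
	      "padded layout has the LP64 size");
static_assert(offsetof(struct iovec_padded, iov_len) ==
	      offsetof(struct iovec_lp64, iov_len),
	      "padded layout has the LP64 offsets");

/* Hypothetical marker for an address that can never be mapped. */
#define INVALID_UPTR32 UINT32_MAX

/*
 * Narrow a 64-bit user pointer to 32 bits. A value with any of the
 * high 32 bits set is turned into an invalid pointer instead of being
 * silently truncated, so a later access would fail with -EFAULT.
 */
ptr32_t narrow_user_ptr(uint64_t uptr)
{
	return (uptr >> 32) ? INVALID_UPTR32 : (ptr32_t)uptr;
}
```

The sketch only covers one structure. The objection raised above is that every driver with a 'long' or pointer member in its ioctl structures would need equivalent treatment.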
Arnd From apatel at ventanamicro.com Tue Mar 25 23:56:34 2025 From: apatel at ventanamicro.com (Anup Patel) Date: Wed, 26 Mar 2025 12:26:34 +0530 Subject: [kvmtool PATCH 00/10] Add SBI system suspend and cpu-type option Message-ID: <20250326065644.73765-1-apatel@ventanamicro.com> This series does the following improvements: 1) Add Svvptc, Zabha, and Ziccrse extension support (PATCH2 to PATCH3) 2) Add SBI system suspend support (PATCH5 to PATCH6) 3) Add "--cpu-type" command-line option supporting "min" and "max" CPU types where "max" is the default (PATCH8 to PATCH10) These patches can also be found in the riscv_more_exts_round6_v1 branch at: https://github.com/avpatel/kvmtool.git Andrew Jones (3): riscv: Add SBI system suspend support riscv: Make system suspend time configurable riscv: Fix no params with nodefault segfault Anup Patel (7): Sync-up headers with Linux-6.14 kernel riscv: Add Svvptc extension support riscv: Add Zabha extension support riscv: Add Ziccrse extension support riscv: Include single-letter extensions in isa_info_arr[] riscv: Add cpu-type command-line option riscv: Allow including extensions in the min CPU type using command-line arm/aarch64/include/asm/kvm.h | 3 - include/linux/kvm.h | 8 +- include/linux/virtio_pci.h | 14 ++ riscv/aia.c | 2 +- riscv/fdt.c | 247 ++++++++++++++++++++-------- riscv/include/asm/kvm.h | 7 +- riscv/include/kvm/kvm-arch.h | 2 + riscv/include/kvm/kvm-config-arch.h | 21 +++ riscv/include/kvm/sbi.h | 9 + riscv/kvm-cpu.c | 36 ++++ riscv/kvm.c | 2 + x86/include/asm/kvm.h | 1 + 12 files changed, 268 insertions(+), 84 deletions(-) -- 2.43.0 From apatel at ventanamicro.com Tue Mar 25 23:56:35 2025 From: apatel at ventanamicro.com (Anup Patel) Date: Wed, 26 Mar 2025 12:26:35 +0530 Subject: [kvmtool PATCH 01/10] Sync-up headers with Linux-6.14 kernel In-Reply-To: <20250326065644.73765-1-apatel@ventanamicro.com> References: <20250326065644.73765-1-apatel@ventanamicro.com> Message-ID: 
<20250326065644.73765-2-apatel@ventanamicro.com> We sync-up Linux headers to get latest KVM RISC-V headers having newly added ISA extensions in ONE_REG interface. Signed-off-by: Anup Patel --- arm/aarch64/include/asm/kvm.h | 3 --- include/linux/kvm.h | 8 ++++---- include/linux/virtio_pci.h | 14 ++++++++++++++ riscv/include/asm/kvm.h | 7 ++++--- x86/include/asm/kvm.h | 1 + 5 files changed, 23 insertions(+), 10 deletions(-) diff --git a/arm/aarch64/include/asm/kvm.h b/arm/aarch64/include/asm/kvm.h index 66736ff..568bf85 100644 --- a/arm/aarch64/include/asm/kvm.h +++ b/arm/aarch64/include/asm/kvm.h @@ -43,9 +43,6 @@ #define KVM_COALESCED_MMIO_PAGE_OFFSET 1 #define KVM_DIRTY_LOG_PAGE_OFFSET 64 -#define KVM_REG_SIZE(id) \ - (1U << (((id) & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT)) - struct kvm_regs { struct user_pt_regs regs; /* sp = sp_el0 */ diff --git a/include/linux/kvm.h b/include/linux/kvm.h index 502ea63..45e6d8f 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -617,10 +617,6 @@ struct kvm_ioeventfd { #define KVM_X86_DISABLE_EXITS_HLT (1 << 1) #define KVM_X86_DISABLE_EXITS_PAUSE (1 << 2) #define KVM_X86_DISABLE_EXITS_CSTATE (1 << 3) -#define KVM_X86_DISABLE_VALID_EXITS (KVM_X86_DISABLE_EXITS_MWAIT | \ - KVM_X86_DISABLE_EXITS_HLT | \ - KVM_X86_DISABLE_EXITS_PAUSE | \ - KVM_X86_DISABLE_EXITS_CSTATE) /* for KVM_ENABLE_CAP */ struct kvm_enable_cap { @@ -1070,6 +1066,10 @@ struct kvm_dirty_tlb { #define KVM_REG_SIZE_SHIFT 52 #define KVM_REG_SIZE_MASK 0x00f0000000000000ULL + +#define KVM_REG_SIZE(id) \ + (1U << (((id) & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT)) + #define KVM_REG_SIZE_U8 0x0000000000000000ULL #define KVM_REG_SIZE_U16 0x0010000000000000ULL #define KVM_REG_SIZE_U32 0x0020000000000000ULL diff --git a/include/linux/virtio_pci.h b/include/linux/virtio_pci.h index 1beb317..8549d45 100644 --- a/include/linux/virtio_pci.h +++ b/include/linux/virtio_pci.h @@ -116,6 +116,8 @@ #define VIRTIO_PCI_CAP_PCI_CFG 5 /* Additional shared memory capability 
*/ #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8 +/* PCI vendor data configuration */ +#define VIRTIO_PCI_CAP_VENDOR_CFG 9 /* This is the PCI capability header: */ struct virtio_pci_cap { @@ -130,6 +132,18 @@ struct virtio_pci_cap { __le32 length; /* Length of the structure, in bytes. */ }; +/* This is the PCI vendor data capability header: */ +struct virtio_pci_vndr_data { + __u8 cap_vndr; /* Generic PCI field: PCI_CAP_ID_VNDR */ + __u8 cap_next; /* Generic PCI field: next ptr. */ + __u8 cap_len; /* Generic PCI field: capability length */ + __u8 cfg_type; /* Identifies the structure. */ + __u16 vendor_id; /* Identifies the vendor-specific format. */ + /* For Vendor Definition */ + /* Pads structure to a multiple of 4 bytes */ + /* Reads must not have side effects */ +}; + struct virtio_pci_cap64 { struct virtio_pci_cap cap; __le32 offset_hi; /* Most sig 32 bits of offset */ diff --git a/riscv/include/asm/kvm.h b/riscv/include/asm/kvm.h index 3482c9a..f06bc5e 100644 --- a/riscv/include/asm/kvm.h +++ b/riscv/include/asm/kvm.h @@ -179,6 +179,9 @@ enum KVM_RISCV_ISA_EXT_ID { KVM_RISCV_ISA_EXT_SSNPM, KVM_RISCV_ISA_EXT_SVADE, KVM_RISCV_ISA_EXT_SVADU, + KVM_RISCV_ISA_EXT_SVVPTC, + KVM_RISCV_ISA_EXT_ZABHA, + KVM_RISCV_ISA_EXT_ZICCRSE, KVM_RISCV_ISA_EXT_MAX, }; @@ -198,6 +201,7 @@ enum KVM_RISCV_SBI_EXT_ID { KVM_RISCV_SBI_EXT_VENDOR, KVM_RISCV_SBI_EXT_DBCN, KVM_RISCV_SBI_EXT_STA, + KVM_RISCV_SBI_EXT_SUSP, KVM_RISCV_SBI_EXT_MAX, }; @@ -211,9 +215,6 @@ struct kvm_riscv_sbi_sta { #define KVM_RISCV_TIMER_STATE_OFF 0 #define KVM_RISCV_TIMER_STATE_ON 1 -#define KVM_REG_SIZE(id) \ - (1U << (((id) & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT)) - /* If you need to interpret the index values, here is the key: */ #define KVM_REG_RISCV_TYPE_MASK 0x00000000FF000000 #define KVM_REG_RISCV_TYPE_SHIFT 24 diff --git a/x86/include/asm/kvm.h b/x86/include/asm/kvm.h index 88585c1..9e75da9 100644 --- a/x86/include/asm/kvm.h +++ b/x86/include/asm/kvm.h @@ -925,5 +925,6 @@ struct kvm_hyperv_eventfd { 
#define KVM_X86_SEV_VM 2 #define KVM_X86_SEV_ES_VM 3 #define KVM_X86_SNP_VM 4 +#define KVM_X86_TDX_VM 5 #endif /* _ASM_X86_KVM_H */ -- 2.43.0 From apatel at ventanamicro.com Tue Mar 25 23:56:36 2025 From: apatel at ventanamicro.com (Anup Patel) Date: Wed, 26 Mar 2025 12:26:36 +0530 Subject: [kvmtool PATCH 02/10] riscv: Add Svvptc extension support In-Reply-To: <20250326065644.73765-1-apatel@ventanamicro.com> References: <20250326065644.73765-1-apatel@ventanamicro.com> Message-ID: <20250326065644.73765-3-apatel@ventanamicro.com> When the Svvptc extension is available expose it to the guest via device tree so that guest can use it. Signed-off-by: Anup Patel --- riscv/fdt.c | 1 + riscv/include/kvm/kvm-config-arch.h | 3 +++ 2 files changed, 4 insertions(+) diff --git a/riscv/fdt.c b/riscv/fdt.c index 03113cc..c1e688d 100644 --- a/riscv/fdt.c +++ b/riscv/fdt.c @@ -27,6 +27,7 @@ struct isa_ext_info isa_info_arr[] = { {"svinval", KVM_RISCV_ISA_EXT_SVINVAL}, {"svnapot", KVM_RISCV_ISA_EXT_SVNAPOT}, {"svpbmt", KVM_RISCV_ISA_EXT_SVPBMT}, + {"svvptc", KVM_RISCV_ISA_EXT_SVVPTC}, {"zacas", KVM_RISCV_ISA_EXT_ZACAS}, {"zawrs", KVM_RISCV_ISA_EXT_ZAWRS}, {"zba", KVM_RISCV_ISA_EXT_ZBA}, diff --git a/riscv/include/kvm/kvm-config-arch.h b/riscv/include/kvm/kvm-config-arch.h index e56610b..ae01e14 100644 --- a/riscv/include/kvm/kvm-config-arch.h +++ b/riscv/include/kvm/kvm-config-arch.h @@ -58,6 +58,9 @@ struct kvm_config_arch { OPT_BOOLEAN('\0', "disable-svpbmt", \ &(cfg)->ext_disabled[KVM_RISCV_ISA_EXT_SVPBMT], \ "Disable Svpbmt Extension"), \ + OPT_BOOLEAN('\0', "disable-svvptc", \ + &(cfg)->ext_disabled[KVM_RISCV_ISA_EXT_SVVPTC], \ + "Disable Svvptc Extension"), \ OPT_BOOLEAN('\0', "disable-zacas", \ &(cfg)->ext_disabled[KVM_RISCV_ISA_EXT_ZACAS], \ "Disable Zacas Extension"), \ -- 2.43.0 From apatel at ventanamicro.com Tue Mar 25 23:56:37 2025 From: apatel at ventanamicro.com (Anup Patel) Date: Wed, 26 Mar 2025 12:26:37 +0530 Subject: [kvmtool PATCH 03/10] riscv: Add Zabha extension 
support In-Reply-To: <20250326065644.73765-1-apatel@ventanamicro.com> References: <20250326065644.73765-1-apatel@ventanamicro.com> Message-ID: <20250326065644.73765-4-apatel@ventanamicro.com> When the Zabha extension is available expose it to the guest via device tree so that guest can use it. Signed-off-by: Anup Patel --- riscv/fdt.c | 1 + riscv/include/kvm/kvm-config-arch.h | 3 +++ 2 files changed, 4 insertions(+) diff --git a/riscv/fdt.c b/riscv/fdt.c index c1e688d..ddd0b28 100644 --- a/riscv/fdt.c +++ b/riscv/fdt.c @@ -28,6 +28,7 @@ struct isa_ext_info isa_info_arr[] = { {"svnapot", KVM_RISCV_ISA_EXT_SVNAPOT}, {"svpbmt", KVM_RISCV_ISA_EXT_SVPBMT}, {"svvptc", KVM_RISCV_ISA_EXT_SVVPTC}, + {"zabha", KVM_RISCV_ISA_EXT_ZABHA}, {"zacas", KVM_RISCV_ISA_EXT_ZACAS}, {"zawrs", KVM_RISCV_ISA_EXT_ZAWRS}, {"zba", KVM_RISCV_ISA_EXT_ZBA}, diff --git a/riscv/include/kvm/kvm-config-arch.h b/riscv/include/kvm/kvm-config-arch.h index ae01e14..d86158d 100644 --- a/riscv/include/kvm/kvm-config-arch.h +++ b/riscv/include/kvm/kvm-config-arch.h @@ -61,6 +61,9 @@ struct kvm_config_arch { OPT_BOOLEAN('\0', "disable-svvptc", \ &(cfg)->ext_disabled[KVM_RISCV_ISA_EXT_SVVPTC], \ "Disable Svvptc Extension"), \ + OPT_BOOLEAN('\0', "disable-zabha", \ + &(cfg)->ext_disabled[KVM_RISCV_ISA_EXT_ZABHA], \ + "Disable Zabha Extension"), \ OPT_BOOLEAN('\0', "disable-zacas", \ &(cfg)->ext_disabled[KVM_RISCV_ISA_EXT_ZACAS], \ "Disable Zacas Extension"), \ -- 2.43.0 From apatel at ventanamicro.com Tue Mar 25 23:56:38 2025 From: apatel at ventanamicro.com (Anup Patel) Date: Wed, 26 Mar 2025 12:26:38 +0530 Subject: [kvmtool PATCH 04/10] riscv: Add Ziccrse extension support In-Reply-To: <20250326065644.73765-1-apatel@ventanamicro.com> References: <20250326065644.73765-1-apatel@ventanamicro.com> Message-ID: <20250326065644.73765-5-apatel@ventanamicro.com> When the Ziccrse extension is available expose it to the guest via device tree so that guest can use it. 
Signed-off-by: Anup Patel --- riscv/fdt.c | 1 + riscv/include/kvm/kvm-config-arch.h | 3 +++ 2 files changed, 4 insertions(+) diff --git a/riscv/fdt.c b/riscv/fdt.c index ddd0b28..3ee20a9 100644 --- a/riscv/fdt.c +++ b/riscv/fdt.c @@ -48,6 +48,7 @@ struct isa_ext_info isa_info_arr[] = { {"zfhmin", KVM_RISCV_ISA_EXT_ZFHMIN}, {"zicbom", KVM_RISCV_ISA_EXT_ZICBOM}, {"zicboz", KVM_RISCV_ISA_EXT_ZICBOZ}, + {"ziccrse", KVM_RISCV_ISA_EXT_ZICCRSE}, {"zicntr", KVM_RISCV_ISA_EXT_ZICNTR}, {"zicond", KVM_RISCV_ISA_EXT_ZICOND}, {"zicsr", KVM_RISCV_ISA_EXT_ZICSR}, diff --git a/riscv/include/kvm/kvm-config-arch.h b/riscv/include/kvm/kvm-config-arch.h index d86158d..5badb74 100644 --- a/riscv/include/kvm/kvm-config-arch.h +++ b/riscv/include/kvm/kvm-config-arch.h @@ -121,6 +121,9 @@ struct kvm_config_arch { OPT_BOOLEAN('\0', "disable-zicboz", \ &(cfg)->ext_disabled[KVM_RISCV_ISA_EXT_ZICBOZ], \ "Disable Zicboz Extension"), \ + OPT_BOOLEAN('\0', "disable-ziccrse", \ + &(cfg)->ext_disabled[KVM_RISCV_ISA_EXT_ZICCRSE], \ + "Disable Ziccrse Extension"), \ OPT_BOOLEAN('\0', "disable-zicntr", \ &(cfg)->ext_disabled[KVM_RISCV_ISA_EXT_ZICNTR], \ "Disable Zicntr Extension"), \ -- 2.43.0 From apatel at ventanamicro.com Tue Mar 25 23:56:39 2025 From: apatel at ventanamicro.com (Anup Patel) Date: Wed, 26 Mar 2025 12:26:39 +0530 Subject: [kvmtool PATCH 05/10] riscv: Add SBI system suspend support In-Reply-To: <20250326065644.73765-1-apatel@ventanamicro.com> References: <20250326065644.73765-1-apatel@ventanamicro.com> Message-ID: <20250326065644.73765-6-apatel@ventanamicro.com> From: Andrew Jones Provide a handler for SBI system suspend requests forwarded to the VMM by KVM and enable system suspend by default. This is just for testing purposes, so the handler only sleeps 5 seconds before resuming the guest. Resuming a system suspend only requires running the system suspend invoking hart again, everything else is handled by KVM. 
Note, resuming after a suspend doesn't work with 9p used for the guest's rootfs. Testing with a disk works though, e.g. lkvm-static run --nodefaults -m 256 -c 2 -d rootfs.ext2 -k Image \ -p 'console=ttyS0 rootwait root=/dev/vda ro no_console_suspend' Signed-off-by: Andrew Jones Signed-off-by: Anup Patel --- riscv/include/kvm/kvm-config-arch.h | 3 +++ riscv/include/kvm/sbi.h | 9 ++++++++ riscv/kvm-cpu.c | 36 +++++++++++++++++++++++++++++ 3 files changed, 48 insertions(+) diff --git a/riscv/include/kvm/kvm-config-arch.h b/riscv/include/kvm/kvm-config-arch.h index 5badb74..0553004 100644 --- a/riscv/include/kvm/kvm-config-arch.h +++ b/riscv/include/kvm/kvm-config-arch.h @@ -238,6 +238,9 @@ struct kvm_config_arch { OPT_BOOLEAN('\0', "disable-sbi-dbcn", \ &(cfg)->sbi_ext_disabled[KVM_RISCV_SBI_EXT_DBCN], \ "Disable SBI DBCN Extension"), \ + OPT_BOOLEAN('\0', "disable-sbi-susp", \ + &(cfg)->sbi_ext_disabled[KVM_RISCV_SBI_EXT_SUSP], \ + "Disable SBI SUSP Extension"), \ OPT_BOOLEAN('\0', "disable-sbi-sta", \ &(cfg)->sbi_ext_disabled[KVM_RISCV_SBI_EXT_STA], \ "Disable SBI STA Extension"), diff --git a/riscv/include/kvm/sbi.h b/riscv/include/kvm/sbi.h index a0f2c70..10bbe1b 100644 --- a/riscv/include/kvm/sbi.h +++ b/riscv/include/kvm/sbi.h @@ -21,6 +21,7 @@ enum sbi_ext_id { SBI_EXT_0_1_SHUTDOWN = 0x8, SBI_EXT_BASE = 0x10, SBI_EXT_DBCN = 0x4442434E, + SBI_EXT_SUSP = 0x53555350, }; enum sbi_ext_base_fid { @@ -39,6 +40,14 @@ enum sbi_ext_dbcn_fid { SBI_EXT_DBCN_CONSOLE_WRITE_BYTE = 2, }; +enum sbi_ext_susp_fid { + SBI_EXT_SUSP_SYSTEM_SUSPEND = 0, +}; + +enum sbi_ext_susp_sleep_type { + SBI_SUSP_SLEEP_TYPE_SUSPEND_TO_RAM = 0, +}; + #define SBI_SPEC_VERSION_DEFAULT 0x1 #define SBI_SPEC_VERSION_MAJOR_OFFSET 24 #define SBI_SPEC_VERSION_MAJOR_MASK 0x7f diff --git a/riscv/kvm-cpu.c b/riscv/kvm-cpu.c index 0c171da..ad68b58 100644 --- a/riscv/kvm-cpu.c +++ b/riscv/kvm-cpu.c @@ -1,3 +1,5 @@ +#include + #include "kvm/csr.h" #include "kvm/kvm-cpu.h" #include "kvm/kvm.h" @@ -117,6 +119,17 
@@ struct kvm_cpu *kvm_cpu__arch_init(struct kvm *kvm, unsigned long cpu_id) KVM_RISCV_SBI_EXT_DBCN); } + /* Force enable SBI system suspend if not disabled from command line */ + if (!kvm->cfg.arch.sbi_ext_disabled[KVM_RISCV_SBI_EXT_SUSP]) { + id = 1; + reg.id = RISCV_SBI_EXT_REG(KVM_REG_RISCV_SBI_SINGLE, + KVM_RISCV_SBI_EXT_SUSP); + reg.addr = (unsigned long)&id; + if (ioctl(vcpu->vcpu_fd, KVM_SET_ONE_REG, &reg) < 0) + pr_warning("KVM_SET_ONE_REG failed (sbi_ext %d)", + KVM_RISCV_SBI_EXT_SUSP); + } + /* Populate the vcpu structure. */ vcpu->kvm = kvm; vcpu->cpu_id = cpu_id; @@ -203,6 +216,29 @@ static bool kvm_cpu_riscv_sbi(struct kvm_cpu *vcpu) break; } break; + case SBI_EXT_SUSP: + { + unsigned long susp_type, ret = SBI_SUCCESS; + + switch (vcpu->kvm_run->riscv_sbi.function_id) { + case SBI_EXT_SUSP_SYSTEM_SUSPEND: + susp_type = vcpu->kvm_run->riscv_sbi.args[0]; + if (susp_type != SBI_SUSP_SLEEP_TYPE_SUSPEND_TO_RAM) { + ret = SBI_ERR_INVALID_PARAM; + break; + } + + sleep(5); + + break; + default: + ret = SBI_ERR_NOT_SUPPORTED; + break; + } + + vcpu->kvm_run->riscv_sbi.ret[0] = ret; + break; + } default: dprintf(dfd, "Unhandled SBI call\n"); dprintf(dfd, "extension_id=0x%lx function_id=0x%lx\n", -- 2.43.0 From apatel at ventanamicro.com Tue Mar 25 23:56:40 2025 From: apatel at ventanamicro.com (Anup Patel) Date: Wed, 26 Mar 2025 12:26:40 +0530 Subject: [kvmtool PATCH 06/10] riscv: Make system suspend time configurable In-Reply-To: <20250326065644.73765-1-apatel@ventanamicro.com> References: <20250326065644.73765-1-apatel@ventanamicro.com> Message-ID: <20250326065644.73765-7-apatel@ventanamicro.com> From: Andrew Jones If the default of 5 seconds for a system suspend test is too long or too short, then feel free to change it.
Signed-off-by: Andrew Jones Signed-off-by: Anup Patel --- riscv/include/kvm/kvm-config-arch.h | 4 ++++ riscv/kvm-cpu.c | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/riscv/include/kvm/kvm-config-arch.h b/riscv/include/kvm/kvm-config-arch.h index 0553004..7e54d8a 100644 --- a/riscv/include/kvm/kvm-config-arch.h +++ b/riscv/include/kvm/kvm-config-arch.h @@ -5,6 +5,7 @@ struct kvm_config_arch { const char *dump_dtb_filename; + u64 suspend_seconds; u64 custom_mvendorid; u64 custom_marchid; u64 custom_mimpid; @@ -16,6 +17,9 @@ struct kvm_config_arch { pfx, \ OPT_STRING('\0', "dump-dtb", &(cfg)->dump_dtb_filename, \ ".dtb file", "Dump generated .dtb to specified file"),\ + OPT_U64('\0', "suspend-seconds", \ + &(cfg)->suspend_seconds, \ + "Number of seconds to suspend for system suspend (default is 5)"), \ OPT_U64('\0', "custom-mvendorid", \ &(cfg)->custom_mvendorid, \ "Show custom mvendorid to Guest VCPU"), \ diff --git a/riscv/kvm-cpu.c b/riscv/kvm-cpu.c index ad68b58..7a86d71 100644 --- a/riscv/kvm-cpu.c +++ b/riscv/kvm-cpu.c @@ -228,7 +228,7 @@ static bool kvm_cpu_riscv_sbi(struct kvm_cpu *vcpu) break; } - sleep(5); + sleep(vcpu->kvm->cfg.arch.suspend_seconds ? : 5); break; default: -- 2.43.0 From apatel at ventanamicro.com Tue Mar 25 23:56:41 2025 From: apatel at ventanamicro.com (Anup Patel) Date: Wed, 26 Mar 2025 12:26:41 +0530 Subject: [kvmtool PATCH 07/10] riscv: Fix no params with nodefault segfault In-Reply-To: <20250326065644.73765-1-apatel@ventanamicro.com> References: <20250326065644.73765-1-apatel@ventanamicro.com> Message-ID: <20250326065644.73765-8-apatel@ventanamicro.com> From: Andrew Jones Fix segfault received when using --nodefault without --params. 
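As an illustration of the failure mode (a sketch under assumptions, not the kvmtool code): to my understanding libfdt's fdt_property_string() takes strlen() of the string it is given, so passing a NULL command line crashes. The helpers below, string_prop_len() and bootargs_prop_len(), are hypothetical stand-ins that avoid a libfdt dependency and only model the length computation and the NULL guard.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the length a string property would carry:
 * the string plus its terminating NUL. Like strlen() itself, this
 * must never be called with a NULL pointer.
 */
size_t string_prop_len(const char *str)
{
	return strlen(str) + 1;
}

/*
 * Guarded variant: a missing command line means "omit the property"
 * (reported here as length 0) rather than dereferencing NULL.
 */
size_t bootargs_prop_len(const char *cmdline)
{
	if (!cmdline)
		return 0;

	return string_prop_len(cmdline);
}
```

The fix in this patch applies the same idea by testing the command line pointer before emitting "bootargs".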
Fixes: 7c9aac003925 ("riscv: Generate FDT at runtime for Guest/VM") Suggested-by: Alexandru Elisei Signed-off-by: Andrew Jones Reviewed-by: Alexandru Elisei Link: https://lore.kernel.org/r/20250123151339.185908-2-ajones at ventanamicro.com Signed-off-by: Anup Patel --- riscv/fdt.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/riscv/fdt.c b/riscv/fdt.c index 3ee20a9..251821e 100644 --- a/riscv/fdt.c +++ b/riscv/fdt.c @@ -263,9 +263,10 @@ static int setup_fdt(struct kvm *kvm) if (kvm->cfg.kernel_cmdline) _FDT(fdt_property_string(fdt, "bootargs", kvm->cfg.kernel_cmdline)); - } else + } else if (kvm->cfg.real_cmdline) { _FDT(fdt_property_string(fdt, "bootargs", kvm->cfg.real_cmdline)); + } _FDT(fdt_property_string(fdt, "stdout-path", "serial0")); -- 2.43.0 From apatel at ventanamicro.com Tue Mar 25 23:56:42 2025 From: apatel at ventanamicro.com (Anup Patel) Date: Wed, 26 Mar 2025 12:26:42 +0530 Subject: [kvmtool PATCH 08/10] riscv: Include single-letter extensions in isa_info_arr[] In-Reply-To: <20250326065644.73765-1-apatel@ventanamicro.com> References: <20250326065644.73765-1-apatel@ventanamicro.com> Message-ID: <20250326065644.73765-9-apatel@ventanamicro.com> Currently, the isa_info_arr[] only include multi-letter extensions but the KVM ONE_REG interface covers both single-letter and multi-letter extensions so extend isa_info_arr[] to include single-letter extensions. 
Signed-off-by: Anup Patel --- riscv/fdt.c | 138 +++++++++++++++++++++++++++++----------------------- 1 file changed, 76 insertions(+), 62 deletions(-) diff --git a/riscv/fdt.c b/riscv/fdt.c index 251821e..46efb47 100644 --- a/riscv/fdt.c +++ b/riscv/fdt.c @@ -12,71 +12,81 @@ struct isa_ext_info { const char *name; unsigned long ext_id; + bool multi_letter; }; struct isa_ext_info isa_info_arr[] = { - /* sorted alphabetically */ - {"smnpm", KVM_RISCV_ISA_EXT_SMNPM}, - {"smstateen", KVM_RISCV_ISA_EXT_SMSTATEEN}, - {"ssaia", KVM_RISCV_ISA_EXT_SSAIA}, - {"sscofpmf", KVM_RISCV_ISA_EXT_SSCOFPMF}, - {"ssnpm", KVM_RISCV_ISA_EXT_SSNPM}, - {"sstc", KVM_RISCV_ISA_EXT_SSTC}, - {"svade", KVM_RISCV_ISA_EXT_SVADE}, - {"svadu", KVM_RISCV_ISA_EXT_SVADU}, - {"svinval", KVM_RISCV_ISA_EXT_SVINVAL}, - {"svnapot", KVM_RISCV_ISA_EXT_SVNAPOT}, - {"svpbmt", KVM_RISCV_ISA_EXT_SVPBMT}, - {"svvptc", KVM_RISCV_ISA_EXT_SVVPTC}, - {"zabha", KVM_RISCV_ISA_EXT_ZABHA}, - {"zacas", KVM_RISCV_ISA_EXT_ZACAS}, - {"zawrs", KVM_RISCV_ISA_EXT_ZAWRS}, - {"zba", KVM_RISCV_ISA_EXT_ZBA}, - {"zbb", KVM_RISCV_ISA_EXT_ZBB}, - {"zbc", KVM_RISCV_ISA_EXT_ZBC}, - {"zbkb", KVM_RISCV_ISA_EXT_ZBKB}, - {"zbkc", KVM_RISCV_ISA_EXT_ZBKC}, - {"zbkx", KVM_RISCV_ISA_EXT_ZBKX}, - {"zbs", KVM_RISCV_ISA_EXT_ZBS}, - {"zca", KVM_RISCV_ISA_EXT_ZCA}, - {"zcb", KVM_RISCV_ISA_EXT_ZCB}, - {"zcd", KVM_RISCV_ISA_EXT_ZCD}, - {"zcf", KVM_RISCV_ISA_EXT_ZCF}, - {"zcmop", KVM_RISCV_ISA_EXT_ZCMOP}, - {"zfa", KVM_RISCV_ISA_EXT_ZFA}, - {"zfh", KVM_RISCV_ISA_EXT_ZFH}, - {"zfhmin", KVM_RISCV_ISA_EXT_ZFHMIN}, - {"zicbom", KVM_RISCV_ISA_EXT_ZICBOM}, - {"zicboz", KVM_RISCV_ISA_EXT_ZICBOZ}, - {"ziccrse", KVM_RISCV_ISA_EXT_ZICCRSE}, - {"zicntr", KVM_RISCV_ISA_EXT_ZICNTR}, - {"zicond", KVM_RISCV_ISA_EXT_ZICOND}, - {"zicsr", KVM_RISCV_ISA_EXT_ZICSR}, - {"zifencei", KVM_RISCV_ISA_EXT_ZIFENCEI}, - {"zihintntl", KVM_RISCV_ISA_EXT_ZIHINTNTL}, - {"zihintpause", KVM_RISCV_ISA_EXT_ZIHINTPAUSE}, - {"zihpm", KVM_RISCV_ISA_EXT_ZIHPM}, - {"zimop", 
KVM_RISCV_ISA_EXT_ZIMOP}, - {"zknd", KVM_RISCV_ISA_EXT_ZKND}, - {"zkne", KVM_RISCV_ISA_EXT_ZKNE}, - {"zknh", KVM_RISCV_ISA_EXT_ZKNH}, - {"zkr", KVM_RISCV_ISA_EXT_ZKR}, - {"zksed", KVM_RISCV_ISA_EXT_ZKSED}, - {"zksh", KVM_RISCV_ISA_EXT_ZKSH}, - {"zkt", KVM_RISCV_ISA_EXT_ZKT}, - {"ztso", KVM_RISCV_ISA_EXT_ZTSO}, - {"zvbb", KVM_RISCV_ISA_EXT_ZVBB}, - {"zvbc", KVM_RISCV_ISA_EXT_ZVBC}, - {"zvfh", KVM_RISCV_ISA_EXT_ZVFH}, - {"zvfhmin", KVM_RISCV_ISA_EXT_ZVFHMIN}, - {"zvkb", KVM_RISCV_ISA_EXT_ZVKB}, - {"zvkg", KVM_RISCV_ISA_EXT_ZVKG}, - {"zvkned", KVM_RISCV_ISA_EXT_ZVKNED}, - {"zvknha", KVM_RISCV_ISA_EXT_ZVKNHA}, - {"zvknhb", KVM_RISCV_ISA_EXT_ZVKNHB}, - {"zvksed", KVM_RISCV_ISA_EXT_ZVKSED}, - {"zvksh", KVM_RISCV_ISA_EXT_ZVKSH}, - {"zvkt", KVM_RISCV_ISA_EXT_ZVKT}, + /* single-letter */ + {"a", KVM_RISCV_ISA_EXT_A, false}, + {"c", KVM_RISCV_ISA_EXT_C, false}, + {"d", KVM_RISCV_ISA_EXT_D, false}, + {"f", KVM_RISCV_ISA_EXT_F, false}, + {"h", KVM_RISCV_ISA_EXT_H, false}, + {"i", KVM_RISCV_ISA_EXT_I, false}, + {"m", KVM_RISCV_ISA_EXT_M, false}, + {"v", KVM_RISCV_ISA_EXT_V, false}, + /* multi-letter sorted alphabetically */ + {"smnpm", KVM_RISCV_ISA_EXT_SMNPM, true}, + {"smstateen", KVM_RISCV_ISA_EXT_SMSTATEEN, true}, + {"ssaia", KVM_RISCV_ISA_EXT_SSAIA, true}, + {"sscofpmf", KVM_RISCV_ISA_EXT_SSCOFPMF, true}, + {"ssnpm", KVM_RISCV_ISA_EXT_SSNPM, true}, + {"sstc", KVM_RISCV_ISA_EXT_SSTC, true}, + {"svade", KVM_RISCV_ISA_EXT_SVADE, true}, + {"svadu", KVM_RISCV_ISA_EXT_SVADU, true}, + {"svinval", KVM_RISCV_ISA_EXT_SVINVAL, true}, + {"svnapot", KVM_RISCV_ISA_EXT_SVNAPOT, true}, + {"svpbmt", KVM_RISCV_ISA_EXT_SVPBMT, true}, + {"svvptc", KVM_RISCV_ISA_EXT_SVVPTC, true}, + {"zabha", KVM_RISCV_ISA_EXT_ZABHA, true}, + {"zacas", KVM_RISCV_ISA_EXT_ZACAS, true}, + {"zawrs", KVM_RISCV_ISA_EXT_ZAWRS, true}, + {"zba", KVM_RISCV_ISA_EXT_ZBA, true}, + {"zbb", KVM_RISCV_ISA_EXT_ZBB, true}, + {"zbc", KVM_RISCV_ISA_EXT_ZBC, true}, + {"zbkb", KVM_RISCV_ISA_EXT_ZBKB, true}, + {"zbkc", 
KVM_RISCV_ISA_EXT_ZBKC, true}, + {"zbkx", KVM_RISCV_ISA_EXT_ZBKX, true}, + {"zbs", KVM_RISCV_ISA_EXT_ZBS, true}, + {"zca", KVM_RISCV_ISA_EXT_ZCA, true}, + {"zcb", KVM_RISCV_ISA_EXT_ZCB, true}, + {"zcd", KVM_RISCV_ISA_EXT_ZCD, true}, + {"zcf", KVM_RISCV_ISA_EXT_ZCF, true}, + {"zcmop", KVM_RISCV_ISA_EXT_ZCMOP, true}, + {"zfa", KVM_RISCV_ISA_EXT_ZFA, true}, + {"zfh", KVM_RISCV_ISA_EXT_ZFH, true}, + {"zfhmin", KVM_RISCV_ISA_EXT_ZFHMIN, true}, + {"zicbom", KVM_RISCV_ISA_EXT_ZICBOM, true}, + {"zicboz", KVM_RISCV_ISA_EXT_ZICBOZ, true}, + {"ziccrse", KVM_RISCV_ISA_EXT_ZICCRSE, true}, + {"zicntr", KVM_RISCV_ISA_EXT_ZICNTR, true}, + {"zicond", KVM_RISCV_ISA_EXT_ZICOND, true}, + {"zicsr", KVM_RISCV_ISA_EXT_ZICSR, true}, + {"zifencei", KVM_RISCV_ISA_EXT_ZIFENCEI, true}, + {"zihintntl", KVM_RISCV_ISA_EXT_ZIHINTNTL, true}, + {"zihintpause", KVM_RISCV_ISA_EXT_ZIHINTPAUSE, true}, + {"zihpm", KVM_RISCV_ISA_EXT_ZIHPM, true}, + {"zimop", KVM_RISCV_ISA_EXT_ZIMOP, true}, + {"zknd", KVM_RISCV_ISA_EXT_ZKND, true}, + {"zkne", KVM_RISCV_ISA_EXT_ZKNE, true}, + {"zknh", KVM_RISCV_ISA_EXT_ZKNH, true}, + {"zkr", KVM_RISCV_ISA_EXT_ZKR, true}, + {"zksed", KVM_RISCV_ISA_EXT_ZKSED, true}, + {"zksh", KVM_RISCV_ISA_EXT_ZKSH, true}, + {"zkt", KVM_RISCV_ISA_EXT_ZKT, true}, + {"ztso", KVM_RISCV_ISA_EXT_ZTSO, true}, + {"zvbb", KVM_RISCV_ISA_EXT_ZVBB, true}, + {"zvbc", KVM_RISCV_ISA_EXT_ZVBC, true}, + {"zvfh", KVM_RISCV_ISA_EXT_ZVFH, true}, + {"zvfhmin", KVM_RISCV_ISA_EXT_ZVFHMIN, true}, + {"zvkb", KVM_RISCV_ISA_EXT_ZVKB, true}, + {"zvkg", KVM_RISCV_ISA_EXT_ZVKG, true}, + {"zvkned", KVM_RISCV_ISA_EXT_ZVKNED, true}, + {"zvknha", KVM_RISCV_ISA_EXT_ZVKNHA, true}, + {"zvknhb", KVM_RISCV_ISA_EXT_ZVKNHB, true}, + {"zvksed", KVM_RISCV_ISA_EXT_ZVKSED, true}, + {"zvksh", KVM_RISCV_ISA_EXT_ZVKSH, true}, + {"zvkt", KVM_RISCV_ISA_EXT_ZVKT, true}, }; static void dump_fdt(const char *dtb_file, void *fdt) @@ -129,6 +139,10 @@ static void generate_cpu_nodes(void *fdt, struct kvm *kvm) } for (i = 0; i < arr_sz; i++) { + 
/* Skip single-letter extensions since these are taken care */ + if (!isa_info_arr[i].multi_letter) + continue; + reg.id = RISCV_ISA_EXT_REG(isa_info_arr[i].ext_id); reg.addr = (unsigned long)&isa_ext_out; if (ioctl(vcpu->vcpu_fd, KVM_GET_ONE_REG, &reg) < 0) -- 2.43.0 From apatel at ventanamicro.com Tue Mar 25 23:56:43 2025 From: apatel at ventanamicro.com (Anup Patel) Date: Wed, 26 Mar 2025 12:26:43 +0530 Subject: [kvmtool PATCH 09/10] riscv: Add cpu-type command-line option In-Reply-To: <20250326065644.73765-1-apatel@ventanamicro.com> References: <20250326065644.73765-1-apatel@ventanamicro.com> Message-ID: <20250326065644.73765-10-apatel@ventanamicro.com> Currently, the KVMTOOL always creates a VM with all available ISA extensions virtualized by the in-kernel KVM module. For better flexibility, add cpu-type command-line option using which users can select one of the available CPU types for VM. There are two CPU types supported at the moment namely "min" and "max". The "min" CPU type implies VCPU with rv[64|32]imafdc ISA whereas the "max" CPU type implies VCPU with all available ISA extensions.
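[Editorial sketch, not part of the posted patch.] To make the "min" versus "max" distinction concrete, the following stand-alone example builds an ISA string from a small table carrying the same two flags the patch adds (multi_letter, min_cpu_included). The table is a tiny illustrative subset, and the sketch assumes every listed extension is available; the real code queries KVM for each one through the ONE_REG interface. The names ext_table, ext_disabled() and build_isa_string() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct ext_info {
	const char *name;
	bool multi_letter;
	bool min_cpu_included;
};

/* Illustrative subset; single-letter entries come first, in canonical order. */
static const struct ext_info ext_table[] = {
	{"i", false, true},
	{"m", false, true},
	{"a", false, true},
	{"f", false, true},
	{"d", false, true},
	{"c", false, true},
	{"v", false, false},
	{"sstc", true, false},
	{"zba", true, false},
};

/* With the "min" CPU type, anything not flagged as included is dropped. */
static bool ext_disabled(const char *cpu_type, const struct ext_info *info)
{
	return !strcmp(cpu_type, "min") && !info->min_cpu_included;
}

/* Writes e.g. "rv64imafdc" into buf; multi-letter names get a '_' prefix. */
static void build_isa_string(char *buf, size_t len, int xlen, const char *cpu_type)
{
	size_t i, pos;

	assert(buf && len > 0 && cpu_type);

	pos = (size_t)snprintf(buf, len, "rv%d", xlen);
	for (i = 0; i < sizeof(ext_table) / sizeof(ext_table[0]) && pos < len; i++) {
		const struct ext_info *info = &ext_table[i];

		if (ext_disabled(cpu_type, info))
			continue;

		pos += (size_t)snprintf(buf + pos, len - pos, "%s%s",
					info->multi_letter ? "_" : "", info->name);
	}
}
```

Because the single-letter entries sit in the table in canonical order, a single pass over the table is enough to emit a correctly ordered string, which appears to be why the patch reorders them.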
Signed-off-by: Anup Patel --- riscv/aia.c | 2 +- riscv/fdt.c | 220 +++++++++++++++++----------- riscv/include/kvm/kvm-arch.h | 2 + riscv/include/kvm/kvm-config-arch.h | 5 + riscv/kvm.c | 2 + 5 files changed, 143 insertions(+), 88 deletions(-) diff --git a/riscv/aia.c b/riscv/aia.c index 21d9704..cad53d4 100644 --- a/riscv/aia.c +++ b/riscv/aia.c @@ -209,7 +209,7 @@ void aia__create(struct kvm *kvm) .flags = 0, }; - if (kvm->cfg.arch.ext_disabled[KVM_RISCV_ISA_EXT_SSAIA]) + if (riscv__isa_extension_disabled(kvm, KVM_RISCV_ISA_EXT_SSAIA)) return; err = ioctl(kvm->vm_fd, KVM_CREATE_DEVICE, &aia_device); diff --git a/riscv/fdt.c b/riscv/fdt.c index 46efb47..4c018c8 100644 --- a/riscv/fdt.c +++ b/riscv/fdt.c @@ -13,82 +13,134 @@ struct isa_ext_info { const char *name; unsigned long ext_id; bool multi_letter; + bool min_cpu_included; }; struct isa_ext_info isa_info_arr[] = { - /* single-letter */ - {"a", KVM_RISCV_ISA_EXT_A, false}, - {"c", KVM_RISCV_ISA_EXT_C, false}, - {"d", KVM_RISCV_ISA_EXT_D, false}, - {"f", KVM_RISCV_ISA_EXT_F, false}, - {"h", KVM_RISCV_ISA_EXT_H, false}, - {"i", KVM_RISCV_ISA_EXT_I, false}, - {"m", KVM_RISCV_ISA_EXT_M, false}, - {"v", KVM_RISCV_ISA_EXT_V, false}, + /* single-letter ordered canonically as "IEMAFDQCLBJTPVNSUHKORWXYZG" */ + {"i", KVM_RISCV_ISA_EXT_I, false, true}, + {"m", KVM_RISCV_ISA_EXT_M, false, true}, + {"a", KVM_RISCV_ISA_EXT_A, false, true}, + {"f", KVM_RISCV_ISA_EXT_F, false, true}, + {"d", KVM_RISCV_ISA_EXT_D, false, true}, + {"c", KVM_RISCV_ISA_EXT_C, false, true}, + {"v", KVM_RISCV_ISA_EXT_V, false, false}, + {"h", KVM_RISCV_ISA_EXT_H, false, false}, /* multi-letter sorted alphabetically */ - {"smnpm", KVM_RISCV_ISA_EXT_SMNPM, true}, - {"smstateen", KVM_RISCV_ISA_EXT_SMSTATEEN, true}, - {"ssaia", KVM_RISCV_ISA_EXT_SSAIA, true}, - {"sscofpmf", KVM_RISCV_ISA_EXT_SSCOFPMF, true}, - {"ssnpm", KVM_RISCV_ISA_EXT_SSNPM, true}, - {"sstc", KVM_RISCV_ISA_EXT_SSTC, true}, - {"svade", KVM_RISCV_ISA_EXT_SVADE, true}, - {"svadu", 
KVM_RISCV_ISA_EXT_SVADU, true}, - {"svinval", KVM_RISCV_ISA_EXT_SVINVAL, true}, - {"svnapot", KVM_RISCV_ISA_EXT_SVNAPOT, true}, - {"svpbmt", KVM_RISCV_ISA_EXT_SVPBMT, true}, - {"svvptc", KVM_RISCV_ISA_EXT_SVVPTC, true}, - {"zabha", KVM_RISCV_ISA_EXT_ZABHA, true}, - {"zacas", KVM_RISCV_ISA_EXT_ZACAS, true}, - {"zawrs", KVM_RISCV_ISA_EXT_ZAWRS, true}, - {"zba", KVM_RISCV_ISA_EXT_ZBA, true}, - {"zbb", KVM_RISCV_ISA_EXT_ZBB, true}, - {"zbc", KVM_RISCV_ISA_EXT_ZBC, true}, - {"zbkb", KVM_RISCV_ISA_EXT_ZBKB, true}, - {"zbkc", KVM_RISCV_ISA_EXT_ZBKC, true}, - {"zbkx", KVM_RISCV_ISA_EXT_ZBKX, true}, - {"zbs", KVM_RISCV_ISA_EXT_ZBS, true}, - {"zca", KVM_RISCV_ISA_EXT_ZCA, true}, - {"zcb", KVM_RISCV_ISA_EXT_ZCB, true}, - {"zcd", KVM_RISCV_ISA_EXT_ZCD, true}, - {"zcf", KVM_RISCV_ISA_EXT_ZCF, true}, - {"zcmop", KVM_RISCV_ISA_EXT_ZCMOP, true}, - {"zfa", KVM_RISCV_ISA_EXT_ZFA, true}, - {"zfh", KVM_RISCV_ISA_EXT_ZFH, true}, - {"zfhmin", KVM_RISCV_ISA_EXT_ZFHMIN, true}, - {"zicbom", KVM_RISCV_ISA_EXT_ZICBOM, true}, - {"zicboz", KVM_RISCV_ISA_EXT_ZICBOZ, true}, - {"ziccrse", KVM_RISCV_ISA_EXT_ZICCRSE, true}, - {"zicntr", KVM_RISCV_ISA_EXT_ZICNTR, true}, - {"zicond", KVM_RISCV_ISA_EXT_ZICOND, true}, - {"zicsr", KVM_RISCV_ISA_EXT_ZICSR, true}, - {"zifencei", KVM_RISCV_ISA_EXT_ZIFENCEI, true}, - {"zihintntl", KVM_RISCV_ISA_EXT_ZIHINTNTL, true}, - {"zihintpause", KVM_RISCV_ISA_EXT_ZIHINTPAUSE, true}, - {"zihpm", KVM_RISCV_ISA_EXT_ZIHPM, true}, - {"zimop", KVM_RISCV_ISA_EXT_ZIMOP, true}, - {"zknd", KVM_RISCV_ISA_EXT_ZKND, true}, - {"zkne", KVM_RISCV_ISA_EXT_ZKNE, true}, - {"zknh", KVM_RISCV_ISA_EXT_ZKNH, true}, - {"zkr", KVM_RISCV_ISA_EXT_ZKR, true}, - {"zksed", KVM_RISCV_ISA_EXT_ZKSED, true}, - {"zksh", KVM_RISCV_ISA_EXT_ZKSH, true}, - {"zkt", KVM_RISCV_ISA_EXT_ZKT, true}, - {"ztso", KVM_RISCV_ISA_EXT_ZTSO, true}, - {"zvbb", KVM_RISCV_ISA_EXT_ZVBB, true}, - {"zvbc", KVM_RISCV_ISA_EXT_ZVBC, true}, - {"zvfh", KVM_RISCV_ISA_EXT_ZVFH, true}, - {"zvfhmin", KVM_RISCV_ISA_EXT_ZVFHMIN, true}, - 
{"zvkb", KVM_RISCV_ISA_EXT_ZVKB, true}, - {"zvkg", KVM_RISCV_ISA_EXT_ZVKG, true}, - {"zvkned", KVM_RISCV_ISA_EXT_ZVKNED, true}, - {"zvknha", KVM_RISCV_ISA_EXT_ZVKNHA, true}, - {"zvknhb", KVM_RISCV_ISA_EXT_ZVKNHB, true}, - {"zvksed", KVM_RISCV_ISA_EXT_ZVKSED, true}, - {"zvksh", KVM_RISCV_ISA_EXT_ZVKSH, true}, - {"zvkt", KVM_RISCV_ISA_EXT_ZVKT, true}, + {"smnpm", KVM_RISCV_ISA_EXT_SMNPM, true, false}, + {"smstateen", KVM_RISCV_ISA_EXT_SMSTATEEN, true, false}, + {"ssaia", KVM_RISCV_ISA_EXT_SSAIA, true, false}, + {"sscofpmf", KVM_RISCV_ISA_EXT_SSCOFPMF, true, false}, + {"ssnpm", KVM_RISCV_ISA_EXT_SSNPM, true, false}, + {"sstc", KVM_RISCV_ISA_EXT_SSTC, true, false}, + {"svade", KVM_RISCV_ISA_EXT_SVADE, true, false}, + {"svadu", KVM_RISCV_ISA_EXT_SVADU, true, false}, + {"svinval", KVM_RISCV_ISA_EXT_SVINVAL, true, false}, + {"svnapot", KVM_RISCV_ISA_EXT_SVNAPOT, true, false}, + {"svpbmt", KVM_RISCV_ISA_EXT_SVPBMT, true, false}, + {"svvptc", KVM_RISCV_ISA_EXT_SVVPTC, true, false}, + {"zabha", KVM_RISCV_ISA_EXT_ZABHA, true, false}, + {"zacas", KVM_RISCV_ISA_EXT_ZACAS, true, false}, + {"zawrs", KVM_RISCV_ISA_EXT_ZAWRS, true, false}, + {"zba", KVM_RISCV_ISA_EXT_ZBA, true, false}, + {"zbb", KVM_RISCV_ISA_EXT_ZBB, true, false}, + {"zbc", KVM_RISCV_ISA_EXT_ZBC, true, false}, + {"zbkb", KVM_RISCV_ISA_EXT_ZBKB, true, false}, + {"zbkc", KVM_RISCV_ISA_EXT_ZBKC, true, false}, + {"zbkx", KVM_RISCV_ISA_EXT_ZBKX, true, false}, + {"zbs", KVM_RISCV_ISA_EXT_ZBS, true, false}, + {"zca", KVM_RISCV_ISA_EXT_ZCA, true, false}, + {"zcb", KVM_RISCV_ISA_EXT_ZCB, true, false}, + {"zcd", KVM_RISCV_ISA_EXT_ZCD, true, false}, + {"zcf", KVM_RISCV_ISA_EXT_ZCF, true, false}, + {"zcmop", KVM_RISCV_ISA_EXT_ZCMOP, true, false}, + {"zfa", KVM_RISCV_ISA_EXT_ZFA, true, false}, + {"zfh", KVM_RISCV_ISA_EXT_ZFH, true, false}, + {"zfhmin", KVM_RISCV_ISA_EXT_ZFHMIN, true, false}, + {"zicbom", KVM_RISCV_ISA_EXT_ZICBOM, true, false}, + {"zicboz", KVM_RISCV_ISA_EXT_ZICBOZ, true, false}, + {"ziccrse", 
KVM_RISCV_ISA_EXT_ZICCRSE, true, false}, + {"zicntr", KVM_RISCV_ISA_EXT_ZICNTR, true, false}, + {"zicond", KVM_RISCV_ISA_EXT_ZICOND, true, false}, + {"zicsr", KVM_RISCV_ISA_EXT_ZICSR, true, false}, + {"zifencei", KVM_RISCV_ISA_EXT_ZIFENCEI, true, false}, + {"zihintntl", KVM_RISCV_ISA_EXT_ZIHINTNTL, true, false}, + {"zihintpause", KVM_RISCV_ISA_EXT_ZIHINTPAUSE, true, false}, + {"zihpm", KVM_RISCV_ISA_EXT_ZIHPM, true, false}, + {"zimop", KVM_RISCV_ISA_EXT_ZIMOP, true, false}, + {"zknd", KVM_RISCV_ISA_EXT_ZKND, true, false}, + {"zkne", KVM_RISCV_ISA_EXT_ZKNE, true, false}, + {"zknh", KVM_RISCV_ISA_EXT_ZKNH, true, false}, + {"zkr", KVM_RISCV_ISA_EXT_ZKR, true, false}, + {"zksed", KVM_RISCV_ISA_EXT_ZKSED, true, false}, + {"zksh", KVM_RISCV_ISA_EXT_ZKSH, true, false}, + {"zkt", KVM_RISCV_ISA_EXT_ZKT, true, false}, + {"ztso", KVM_RISCV_ISA_EXT_ZTSO, true, false}, + {"zvbb", KVM_RISCV_ISA_EXT_ZVBB, true, false}, + {"zvbc", KVM_RISCV_ISA_EXT_ZVBC, true, false}, + {"zvfh", KVM_RISCV_ISA_EXT_ZVFH, true, false}, + {"zvfhmin", KVM_RISCV_ISA_EXT_ZVFHMIN, true, false}, + {"zvkb", KVM_RISCV_ISA_EXT_ZVKB, true, false}, + {"zvkg", KVM_RISCV_ISA_EXT_ZVKG, true, false}, + {"zvkned", KVM_RISCV_ISA_EXT_ZVKNED, true, false}, + {"zvknha", KVM_RISCV_ISA_EXT_ZVKNHA, true, false}, + {"zvknhb", KVM_RISCV_ISA_EXT_ZVKNHB, true, false}, + {"zvksed", KVM_RISCV_ISA_EXT_ZVKSED, true, false}, + {"zvksh", KVM_RISCV_ISA_EXT_ZVKSH, true, false}, + {"zvkt", KVM_RISCV_ISA_EXT_ZVKT, true, false}, }; +static bool __isa_ext_disabled(struct kvm *kvm, struct isa_ext_info *info) +{ + if (!strncmp(kvm->cfg.arch.cpu_type, "min", 3) && + !info->min_cpu_included) + return true; + + return kvm->cfg.arch.ext_disabled[info->ext_id]; +} + +static bool __isa_ext_warn_disable_failure(struct kvm *kvm, struct isa_ext_info *info) +{ + if (!strncmp(kvm->cfg.arch.cpu_type, "min", 3) && + !info->min_cpu_included) + return false; + + return true; +} + +bool riscv__isa_extension_disabled(struct kvm *kvm, unsigned long 
isa_ext_id) +{ + struct isa_ext_info *info = NULL; + unsigned long i; + + for (i = 0; i < ARRAY_SIZE(isa_info_arr); i++) { + if (isa_info_arr[i].ext_id == isa_ext_id) { + info = &isa_info_arr[i]; + break; + } + } + if (!info) + return true; + + return __isa_ext_disabled(kvm, info); +} + +int riscv__cpu_type_parser(const struct option *opt, const char *arg, int unset) +{ + struct kvm *kvm = opt->ptr; + + if ((strncmp(arg, "min", 3) && strncmp(arg, "max", 3)) || strlen(arg) != 3) + die("Invalid CPU type %s\n", arg); + + if (!strncmp(arg, "max", 3)) + kvm->cfg.arch.cpu_type = "max"; + + if (!strncmp(arg, "min", 3)) + kvm->cfg.arch.cpu_type = "min"; + + return 0; +} + static void dump_fdt(const char *dtb_file, void *fdt) { int count, fd; @@ -108,10 +160,8 @@ static void dump_fdt(const char *dtb_file, void *fdt) #define CPU_NAME_MAX_LEN 15 static void generate_cpu_nodes(void *fdt, struct kvm *kvm) { - int cpu, pos, i, index, valid_isa_len; - const char *valid_isa_order = "IEMAFDQCLBJTPVNSUHKORWXYZG"; - int arr_sz = ARRAY_SIZE(isa_info_arr); unsigned long cbom_blksz = 0, cboz_blksz = 0, satp_mode = 0; + int i, cpu, pos, arr_sz = ARRAY_SIZE(isa_info_arr); _FDT(fdt_begin_node(fdt, "cpus")); _FDT(fdt_property_cell(fdt, "#address-cells", 0x1)); @@ -131,18 +181,8 @@ static void generate_cpu_nodes(void *fdt, struct kvm *kvm) snprintf(cpu_isa, CPU_ISA_MAX_LEN, "rv%ld", vcpu->riscv_xlen); pos = strlen(cpu_isa); - valid_isa_len = strlen(valid_isa_order); - for (i = 0; i < valid_isa_len; i++) { - index = valid_isa_order[i] - 'A'; - if (vcpu->riscv_isa & (1 << (index))) - cpu_isa[pos++] = 'a' + index; - } for (i = 0; i < arr_sz; i++) { - /* Skip single-letter extensions since these are taken care */ - if (!isa_info_arr[i].multi_letter) - continue; - reg.id = RISCV_ISA_EXT_REG(isa_info_arr[i].ext_id); reg.addr = (unsigned long)&isa_ext_out; if (ioctl(vcpu->vcpu_fd, KVM_GET_ONE_REG, &reg) < 0) @@ -151,9 +191,10 @@ static void generate_cpu_nodes(void *fdt, struct kvm *kvm) /* This
extension is not available in hardware */ continue; - if (kvm->cfg.arch.ext_disabled[isa_info_arr[i].ext_id]) { + if (__isa_ext_disabled(kvm, &isa_info_arr[i])) { isa_ext_out = 0; - if (ioctl(vcpu->vcpu_fd, KVM_SET_ONE_REG, &reg) < 0) + if ((ioctl(vcpu->vcpu_fd, KVM_SET_ONE_REG, &reg) < 0) && + __isa_ext_warn_disable_failure(kvm, &isa_info_arr[i])) pr_warning("Failed to disable %s ISA exension\n", isa_info_arr[i].name); continue; @@ -178,8 +219,13 @@ static void generate_cpu_nodes(void *fdt, struct kvm *kvm) isa_info_arr[i].name); break; } - pos += snprintf(cpu_isa + pos, CPU_ISA_MAX_LEN - pos, "_%s", - isa_info_arr[i].name); + + if (isa_info_arr[i].multi_letter) + pos += snprintf(cpu_isa + pos, CPU_ISA_MAX_LEN - pos, "_%s", + isa_info_arr[i].name); + else + pos += snprintf(cpu_isa + pos, CPU_ISA_MAX_LEN - pos, "%s", + isa_info_arr[i].name); } cpu_isa[pos] = '\0'; diff --git a/riscv/include/kvm/kvm-arch.h b/riscv/include/kvm/kvm-arch.h index f0f469f..1bb2d32 100644 --- a/riscv/include/kvm/kvm-arch.h +++ b/riscv/include/kvm/kvm-arch.h @@ -90,6 +90,8 @@ enum irqchip_type { IRQCHIP_AIA }; +bool riscv__isa_extension_disabled(struct kvm *kvm, unsigned long ext_id); + extern enum irqchip_type riscv_irqchip; extern bool riscv_irqchip_inkernel; extern void (*riscv_irqchip_trigger)(struct kvm *kvm, int irq, diff --git a/riscv/include/kvm/kvm-config-arch.h b/riscv/include/kvm/kvm-config-arch.h index 7e54d8a..26b1b50 100644 --- a/riscv/include/kvm/kvm-config-arch.h +++ b/riscv/include/kvm/kvm-config-arch.h @@ -4,6 +4,7 @@ #include "kvm/parse-options.h" struct kvm_config_arch { + const char *cpu_type; const char *dump_dtb_filename; u64 suspend_seconds; u64 custom_mvendorid; @@ -13,8 +14,12 @@ struct kvm_config_arch { bool sbi_ext_disabled[KVM_RISCV_SBI_EXT_MAX]; }; +int riscv__cpu_type_parser(const struct option *opt, const char *arg, int unset); + #define OPT_ARCH_RUN(pfx, cfg) \ pfx, \ + OPT_CALLBACK('\0', "cpu-type", kvm, "min or max", \ + "Choose the cpu type (default is max).",
riscv__cpu_type_parser, kvm),\ OPT_STRING('\0', "dump-dtb", &(cfg)->dump_dtb_filename, \ ".dtb file", "Dump generated .dtb to specified file"),\ OPT_U64('\0', "suspend-seconds", \ diff --git a/riscv/kvm.c b/riscv/kvm.c index 1d49479..6a1b154 100644 --- a/riscv/kvm.c +++ b/riscv/kvm.c @@ -20,6 +20,8 @@ u64 kvm__arch_default_ram_address(void) void kvm__arch_validate_cfg(struct kvm *kvm) { + if (!kvm->cfg.arch.cpu_type) + kvm->cfg.arch.cpu_type = "max"; } bool kvm__arch_cpu_supports_vm(void) -- 2.43.0 From apatel at ventanamicro.com Tue Mar 25 23:56:44 2025 From: apatel at ventanamicro.com (Anup Patel) Date: Wed, 26 Mar 2025 12:26:44 +0530 Subject: [kvmtool PATCH 10/10] riscv: Allow including extensions in the min CPU type using command-line In-Reply-To: <20250326065644.73765-1-apatel@ventanamicro.com> References: <20250326065644.73765-1-apatel@ventanamicro.com> Message-ID: <20250326065644.73765-11-apatel@ventanamicro.com> It is useful to allow including extensions in the min CPU type on need basis via command-line. To achieve this, parse extension names as comma separated values appended to the "min" CPU type using command-line. For example, to include Sstc and Ssaia in the min CPU type use "--cpu-type min,sstc,ssaia" command-line option. 
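[Editorial sketch, not part of the posted patch.] The comma-separated form "min,sstc,ssaia" can be tokenised without modifying the argument by walking it with strchr(), in the same spirit as the loop in the diff. Everything named here (struct cpu_type_spec, parse_cpu_type(), MAX_EXTS, MAX_EXT_NAME) is hypothetical. The sketch also requires a ',' or the end of the string directly after "min", which is a stricter check chosen for this example rather than a statement about the posted parser.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_EXTS	8
#define MAX_EXT_NAME	16

struct cpu_type_spec {
	const char *base;			/* "min" or "max" */
	size_t n_exts;				/* extra extensions requested */
	char exts[MAX_EXTS][MAX_EXT_NAME];	/* NUL-terminated names */
};

/* Returns 0 on success, -1 on an invalid or oversized argument. */
static int parse_cpu_type(const char *arg, struct cpu_type_spec *spec)
{
	const char *str, *end;
	size_t len;

	assert(arg && spec);
	memset(spec, 0, sizeof(*spec));

	if (!strcmp(arg, "max")) {
		spec->base = "max";
		return 0;
	}

	if (strncmp(arg, "min", 3) || (arg[3] != '\0' && arg[3] != ','))
		return -1;

	spec->base = "min";
	str = arg + 3;
	while (*str) {
		if (*str == ',') {	/* skip separators and empty fields */
			str++;
			continue;
		}

		end = strchr(str, ',');
		if (!end)
			end = str + strlen(str);

		len = (size_t)(end - str);
		if (spec->n_exts == MAX_EXTS || len >= MAX_EXT_NAME)
			return -1;

		memcpy(spec->exts[spec->n_exts], str, len);
		spec->exts[spec->n_exts][len] = '\0';
		spec->n_exts++;
		str = end;
	}

	return 0;
}
```

Each collected name would then be looked up in the extension table and its min_cpu_included flag set, which is the role __min_cpu_include() plays in the patch.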
Signed-off-by: Anup Patel --- riscv/fdt.c | 41 +++++++++++++++++++++++++++++++++++++++-- 1 file changed, 39 insertions(+), 2 deletions(-) diff --git a/riscv/fdt.c b/riscv/fdt.c index 4c018c8..9cefd2f 100644 --- a/riscv/fdt.c +++ b/riscv/fdt.c @@ -108,6 +108,20 @@ static bool __isa_ext_warn_disable_failure(struct kvm *kvm, struct isa_ext_info return true; } +static void __min_cpu_include(const char *ext, size_t ext_len) +{ + struct isa_ext_info *info; + unsigned long i; + + for (i = 0; i < ARRAY_SIZE(isa_info_arr); i++) { + info = &isa_info_arr[i]; + if (strlen(info->name) != ext_len) + continue; + if (!strncmp(ext, info->name, ext_len)) + info->min_cpu_included = true; + } +} + bool riscv__isa_extension_disabled(struct kvm *kvm, unsigned long isa_ext_id) { struct isa_ext_info *info = NULL; @@ -128,16 +142,39 @@ bool riscv__isa_extension_disabled(struct kvm *kvm, unsigned long isa_ext_id) int riscv__cpu_type_parser(const struct option *opt, const char *arg, int unset) { struct kvm *kvm = opt->ptr; + const char *str, *nstr; + int len; - if ((strncmp(arg, "min", 3) && strncmp(arg, "max", 3)) || strlen(arg) != 3) + if ((strncmp(arg, "min", 3) || strlen(arg) < 3) && + (strncmp(arg, "max", 3) || strlen(arg) != 3)) die("Invalid CPU type %s\n", arg); if (!strncmp(arg, "max", 3)) kvm->cfg.arch.cpu_type = "max"; - if (!strncmp(arg, "min", 3)) + if (!strncmp(arg, "min", 3)) { kvm->cfg.arch.cpu_type = "min"; + str = arg; + str += 3; + while (*str) { + if (*str == ',') { + str++; + continue; + } + + nstr = strchr(str, ','); + if (!nstr) + nstr = str + strlen(str); + + len = nstr - str; + if (len) { + __min_cpu_include(str, len); + str += len; + } + } + } + return 0; } -- 2.43.0 From guoren at kernel.org Wed Mar 26 02:22:41 2025 From: guoren at kernel.org (Guo Ren) Date: Wed, 26 Mar 2025 17:22:41 +0800 Subject: [RFC PATCH V3 25/43] rv64ilp32_abi: exec: Adapt 64lp64 env and argv In-Reply-To: <05fec753-cdaa-45a5-a029-b6435c30eb07@omp.ru> References: 
<20250325121624.523258-1-guoren@kernel.org> <20250325121624.523258-26-guoren@kernel.org> <05fec753-cdaa-45a5-a029-b6435c30eb07@omp.ru> Message-ID: On Wed, Mar 26, 2025 at 1:19?AM Sergey Shtylyov wrote: > > On 3/25/25 3:16 PM, guoren at kernel.org wrote: > > > From: "Guo Ren (Alibaba DAMO Academy)" > > > > The rv64ilp32 abi reuses the env and argv memory layout of the > > lp64 abi, so leave the space to fit the lp64 struct layout. > > > > Signed-off-by: Guo Ren (Alibaba DAMO Academy) > > --- > > fs/exec.c | 4 ++++ > > 1 file changed, 4 insertions(+) > > > > diff --git a/fs/exec.c b/fs/exec.c > > index 506cd411f4ac..548d18b7ae92 100644 > > --- a/fs/exec.c > > +++ b/fs/exec.c > > @@ -424,6 +424,10 @@ static const char __user *get_user_arg_ptr(struct user_arg_ptr argv, int nr) > > } > > #endif > > > > +#if defined(CONFIG_64BIT) && (BITS_PER_LONG == 32) okay, #if defined(CONFIG_64BIT) && BITS_PER_LONG == 32 > > Parens don't seem necessary... > > > + nr = nr * 2; > > Why not nr *= 2? okay, nr *= 2; -- Best Regards Guo Ren From andrew.jones at linux.dev Wed Mar 26 11:51:35 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Wed, 26 Mar 2025 19:51:35 +0100 Subject: [kvm-unit-tests PATCH v3 0/5] arm64: Change the default QEMU CPU type to "max" In-Reply-To: <20250325160031.2390504-3-jean-philippe@linaro.org> References: <20250325160031.2390504-3-jean-philippe@linaro.org> Message-ID: <20250326-adedac8bee1bcff68cb0b849@orel> On Tue, Mar 25, 2025 at 04:00:28PM +0000, Jean-Philippe Brucker wrote: > This is v3 of the series that cleans up the configure flags and sets the > default CPU type to "max" on arm64, in order to test the latest Arm > features. > > Since v2 [1] I moved the CPU selection to ./configure, and improved the > help text. Unfortunately I couldn't keep most of the Review tags since > there were small changes all over. 
> > [1] https://lore.kernel.org/all/20250314154904.3946484-2-jean-philippe at linaro.org/ > > Alexandru Elisei (3): > configure: arm64: Don't display 'aarch64' as the default architecture > configure: arm/arm64: Display the correct default processor > arm64: Implement the ./configure --processor option > > Jean-Philippe Brucker (2): > configure: Add --qemu-cpu option > arm64: Use -cpu max as the default for TCG > > scripts/mkstandalone.sh | 3 ++- > arm/run | 15 ++++++----- > riscv/run | 8 +++--- > configure | 55 +++++++++++++++++++++++++++++++++++------ > arm/Makefile.arm | 1 - > arm/Makefile.common | 1 + > 6 files changed, 63 insertions(+), 20 deletions(-) > > -- > 2.49.0 Thanks Jean-Philippe. I'll let Alex and Eric give it another look, but LGTM. drew From seanjc at google.com Wed Mar 26 18:55:55 2025 From: seanjc at google.com (Sean Christopherson) Date: Wed, 26 Mar 2025 18:55:55 -0700 Subject: [PATCH] RISC-V: KVM: Teardown riscv specific bits after kvm_exit In-Reply-To: References: <20250317-kvm_exit_fix-v1-1-aa5240c5dbd2@rivosinc.com> Message-ID: On Thu, Mar 20, 2025, Atish Kumar Patra wrote: > On Mon, Mar 17, 2025 at 9:08?AM Sean Christopherson wrote: > > On Mon, Mar 17, 2025, Atish Patra wrote: > > > arch/riscv/kvm/main.c | 4 ++-- > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c > > > index 1fa8be5ee509..4b24705dc63a 100644 > > > --- a/arch/riscv/kvm/main.c > > > +++ b/arch/riscv/kvm/main.c > > > @@ -172,8 +172,8 @@ module_init(riscv_kvm_init); > > > > > > static void __exit riscv_kvm_exit(void) > > > { > > > - kvm_riscv_teardown(); > > > - > > > kvm_exit(); > > > + > > > + kvm_riscv_teardown(); > > > > I wonder if there's a way we can guard against kvm_init()/kvm_exit() being called > > too early/late. x86 had similar bugs for a very long time, e.g. see commit > > e32b120071ea ("KVM: VMX: Do _all_ initialization before exposing /dev/kvm to userspace"). > > > > E.g. 
maybe we do something like create+destroy a VM at the end of kvm_init() and > > the beginning of kvm_exit()? Not sure if that would work for kvm_exit(), but it > > should definitely be fine for kvm_init(). > > > Yes. That would be super useful. I am not sure about the exact > mechanism to achieve that though. Me either :-) > Do you just test code guarded within a new config that just > creates/destroys a dummy VM ? That's only idea I could come up with too, but I don't particulary like it. > May be kunit test for KVM fits here in some way ? I don't think a kunit test would be a good fit, there are likely too many dependencies, and I'm pretty sure we'd still need to hack KVM. From patchwork-bot+linux-riscv at kernel.org Wed Mar 26 20:24:46 2025 From: patchwork-bot+linux-riscv at kernel.org (patchwork-bot+linux-riscv at kernel.org) Date: Thu, 27 Mar 2025 03:24:46 +0000 Subject: [GIT PULL] KVM/riscv fixes for 6.14 take #1 In-Reply-To: References: Message-ID: <174304588624.1549280.15557437967851536574.git-patchwork-notify@kernel.org> Hello: This pull request was applied to riscv/linux.git (for-next) by Paolo Bonzini : On Fri, 21 Feb 2025 22:37:42 +0530 you wrote: > Hi Paolo, > > We have a bunch of SBI related fixes and one fix to remove > a redundant vcpu kick for the 6.14 kernel. > > Please pull. > > [...] Here is the summary with links: - [GIT,PULL] KVM/riscv fixes for 6.14 take #1 https://git.kernel.org/riscv/c/e93d78e05abb You are awesome, thank you! -- Deet-doot-dot, I am a bot. 
https://korg.docs.kernel.org/patchwork/pwbot.html From patchwork-bot+linux-riscv at kernel.org Wed Mar 26 20:24:50 2025 From: patchwork-bot+linux-riscv at kernel.org (patchwork-bot+linux-riscv at kernel.org) Date: Thu, 27 Mar 2025 03:24:50 +0000 Subject: [PATCH v2] riscv: KVM: Remove unnecessary vcpu kick In-Reply-To: <20250221104538.2147-1-xiangwencheng@lanxincomputing.com> References: <20250221104538.2147-1-xiangwencheng@lanxincomputing.com> Message-ID: <174304589076.1549280.903710261634215317.git-patchwork-notify@kernel.org> Hello: This patch was applied to riscv/linux.git (for-next) by Anup Patel : On Fri, 21 Feb 2025 18:45:38 +0800 you wrote: > Remove the unnecessary kick to the vCPU after writing to the vs_file > of IMSIC in kvm_riscv_vcpu_aia_imsic_inject. > > For vCPUs that are running, writing to the vs_file directly forwards > the interrupt as an MSI to them and does not need an extra kick. > > For vCPUs that are descheduled after emulating WFI, KVM will enable > the guest external interrupt for that vCPU in > kvm_riscv_aia_wakeon_hgei. This means that writing to the vs_file > will cause a guest external interrupt, which will cause KVM to wake > up the vCPU in hgei_interrupt to handle the interrupt properly. > > [...] Here is the summary with links: - [v2] riscv: KVM: Remove unnecessary vcpu kick https://git.kernel.org/riscv/c/d252435aca44 You are awesome, thank you! -- Deet-doot-dot, I am a bot. 
https://korg.docs.kernel.org/patchwork/pwbot.html From patchwork-bot+linux-riscv at kernel.org Wed Mar 26 20:24:52 2025 From: patchwork-bot+linux-riscv at kernel.org (patchwork-bot+linux-riscv at kernel.org) Date: Thu, 27 Mar 2025 03:24:52 +0000 Subject: [PATCH 0/7] KVM: x86: nVMX IRQ fix and VM teardown cleanups In-Reply-To: <20250224235542.2562848-1-seanjc@google.com> References: <20250224235542.2562848-1-seanjc@google.com> Message-ID: <174304589224.1549280.1623157194395422949.git-patchwork-notify@kernel.org> Hello: This series was applied to riscv/linux.git (for-next) by Paolo Bonzini : On Mon, 24 Feb 2025 15:55:35 -0800 you wrote: > This was _supposed_ to be a tiny one-off patch to fix a nVMX bug where KVM > fails to detect that, after nested VM-Exit, L1 has a pending IRQ (or NMI). > But because x86's nested teardown flows are garbage (KVM simply forces a > nested VM-Exit to put the vCPU back into L1), that simple fix snowballed. > > The immediate issue is that checking for a pending interrupt accesses the > legacy PIC, and x86's kvm_arch_destroy_vm() currently frees the PIC before > destroying vCPUs, i.e. checking for IRQs during the forced nested VM-Exit > results in a NULL pointer deref (or use-after-free if KVM didn't nullify > the PIC pointer). That's patch 1. > > [...] 
Here is the summary with links: - [1/7] KVM: x86: Free vCPUs before freeing VM state https://git.kernel.org/riscv/c/17bcd7144263 - [2/7] KVM: nVMX: Process events on nested VM-Exit if injectable IRQ or NMI is pending https://git.kernel.org/riscv/c/982caaa11504 - [3/7] KVM: Assert that a destroyed/freed vCPU is no longer visible (no matching commit) - [4/7] KVM: x86: Don't load/put vCPU when unloading its MMU during teardown (no matching commit) - [5/7] KVM: x86: Unload MMUs during vCPU destruction, not before (no matching commit) - [6/7] KVM: x86: Fold guts of kvm_arch_sync_events() into kvm_arch_pre_destroy_vm() (no matching commit) - [7/7] KVM: Drop kvm_arch_sync_events() now that all implementations are nops (no matching commit) You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html From patchwork-bot+linux-riscv at kernel.org Wed Mar 26 20:25:05 2025 From: patchwork-bot+linux-riscv at kernel.org (patchwork-bot+linux-riscv at kernel.org) Date: Thu, 27 Mar 2025 03:25:05 +0000 Subject: [PATCH] riscv: Remove duplicate CLINT_TIMER selections In-Reply-To: <448dba3309fe341f4095b227b75ae5fc6b05f51a.1733242214.git.geert+renesas@glider.be> References: <448dba3309fe341f4095b227b75ae5fc6b05f51a.1733242214.git.geert+renesas@glider.be> Message-ID: <174304590525.1549280.12521028074935026513.git-patchwork-notify@kernel.org> Hello: This patch was applied to riscv/linux.git (for-next) by Alexandre Ghiti : On Tue, 3 Dec 2024 17:15:53 +0100 you wrote: > Since commit f862bbf4cdca696e ("riscv: Allow NOMMU kernels to run in > S-mode") in v6.10, CLINT_TIMER is selected by the main RISCV symbol when > RISCV_M_MODE is enabled. > > Signed-off-by: Geert Uytterhoeven > --- > arch/riscv/Kconfig.socs | 2 -- > 1 file changed, 2 deletions(-) Here is the summary with links: - riscv: Remove duplicate CLINT_TIMER selections https://git.kernel.org/riscv/c/72770690e02c You are awesome, thank you! -- Deet-doot-dot, I am a bot. 
https://korg.docs.kernel.org/patchwork/pwbot.html From andrew.jones at linux.dev Thu Mar 27 01:25:42 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Thu, 27 Mar 2025 09:25:42 +0100 Subject: [kvm-unit-tests PATCH v3 4/5] configure: Add --qemu-cpu option In-Reply-To: <20250325160031.2390504-7-jean-philippe@linaro.org> References: <20250325160031.2390504-3-jean-philippe@linaro.org> <20250325160031.2390504-7-jean-philippe@linaro.org> Message-ID: <20250327-96bca62a41b4bb0aa44303d4@orel> On Tue, Mar 25, 2025 at 04:00:32PM +0000, Jean-Philippe Brucker wrote: > Add the --qemu-cpu option to let users set the CPU type to run on. > At the moment --processor allows to set both GCC -mcpu flag and QEMU > -cpu. On Arm we'd like to pass `-cpu max` to QEMU in order to enable all > the TCG features by default, and it could also be nice to let users > modify the CPU capabilities by setting extra -cpu options. Since GCC > -mcpu doesn't accept "max" or "host", separate the compiler and QEMU > arguments. > > `--processor` is now exclusively for compiler options, as indicated by > its documentation ("processor to compile for"). So use $QEMU_CPU on > RISC-V as well. 
> > Suggested-by: Andrew Jones > Signed-off-by: Jean-Philippe Brucker > --- > scripts/mkstandalone.sh | 3 ++- > arm/run | 15 +++++++++------ > riscv/run | 8 ++++---- > configure | 24 ++++++++++++++++++++++++ > 4 files changed, 39 insertions(+), 11 deletions(-) > > diff --git a/scripts/mkstandalone.sh b/scripts/mkstandalone.sh > index 2318a85f..9b4f983d 100755 > --- a/scripts/mkstandalone.sh > +++ b/scripts/mkstandalone.sh > @@ -42,7 +42,8 @@ generate_test () > > config_export ARCH > config_export ARCH_NAME > - config_export PROCESSOR > + config_export QEMU_CPU > + config_export DEFAULT_QEMU_CPU > > echo "echo BUILD_HEAD=$(cat build-head)" > > diff --git a/arm/run b/arm/run > index efdd44ce..4675398f 100755 > --- a/arm/run > +++ b/arm/run > @@ -8,7 +8,7 @@ if [ -z "$KUT_STANDALONE" ]; then > source config.mak > source scripts/arch-run.bash > fi > -processor="$PROCESSOR" > +qemu_cpu="$QEMU_CPU" > > if [ "$QEMU" ] && [ -z "$ACCEL" ] && > [ "$HOST" = "aarch64" ] && [ "$ARCH" = "arm" ] && > @@ -37,12 +37,15 @@ if [ "$ACCEL" = "kvm" ]; then > fi > fi > > -if [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ]; then > - if [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ]; then > - processor="host" > +if [ -z "$qemu_cpu" ]; then > + if ( [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ] ) && > + ( [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ] ); then > + qemu_cpu="host" > if [ "$ARCH" = "arm" ] && [ "$HOST" = "aarch64" ]; then > - processor+=",aarch64=off" > + qemu_cpu+=",aarch64=off" > fi > + else > + qemu_cpu="$DEFAULT_QEMU_CPU" > fi > fi > > @@ -71,7 +74,7 @@ if $qemu $M -device '?' 
| grep -q pci-testdev; then > fi > > A="-accel $ACCEL$ACCEL_PROPS" > -command="$qemu -nodefaults $M $A -cpu $processor $chr_testdev $pci_testdev" > +command="$qemu -nodefaults $M $A -cpu $qemu_cpu $chr_testdev $pci_testdev" > command+=" -display none -serial stdio" > command="$(migration_cmd) $(timeout_cmd) $command" > > diff --git a/riscv/run b/riscv/run > index e2f5a922..02fcf0c0 100755 > --- a/riscv/run > +++ b/riscv/run > @@ -11,12 +11,12 @@ fi > > # Allow user overrides of some config.mak variables > mach=$MACHINE_OVERRIDE > -processor=$PROCESSOR_OVERRIDE > +qemu_cpu=$QEMU_CPU_OVERRIDE > firmware=$FIRMWARE_OVERRIDE > > -[ "$PROCESSOR" = "$ARCH" ] && PROCESSOR="max" > +[ -z "$QEMU_CPU" ] && QEMU_CPU="max" > : "${mach:=virt}" > -: "${processor:=$PROCESSOR}" > +: "${qemu_cpu:=$QEMU_CPU}" > : "${firmware:=$FIRMWARE}" > [ "$firmware" ] && firmware="-bios $firmware" > > @@ -32,7 +32,7 @@ fi > mach="-machine $mach" > > command="$qemu -nodefaults -nographic -serial mon:stdio" > -command+=" $mach $acc $firmware -cpu $processor " > +command+=" $mach $acc $firmware -cpu $qemu_cpu " > command="$(migration_cmd) $(timeout_cmd) $command" > > if [ "$UEFI_SHELL_RUN" = "y" ]; then > diff --git a/configure b/configure > index b4875ef3..b79145a5 100755 > --- a/configure > +++ b/configure > @@ -23,6 +23,21 @@ function get_default_processor() > esac > } > > +# Return the default CPU type to run on > +function get_default_qemu_cpu() > +{ > + local arch="$1" > + > + case "$arch" in > + "arm") > + echo "cortex-a15" > + ;; > + "arm64") > + echo "cortex-a57" > + ;; > + esac > +} > + > srcdir=$(cd "$(dirname "$0")"; pwd) > prefix=/usr/local > cc=gcc > @@ -52,6 +67,7 @@ earlycon= > console= > efi= > efi_direct= > +qemu_cpu= > > # Enable -Werror by default for git repositories only (i.e. developer builds) > if [ -e "$srcdir"/.git ]; then > @@ -70,6 +86,9 @@ usage() { > --arch=ARCH architecture to compile for ($arch). 
ARCH can be one of: > arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 > --processor=PROCESSOR processor to compile for ($processor) > + --qemu-cpu=CPU the CPU model to run on. If left unset, the run script > + selects the best value based on the host system and the > + test configuration. I'm starting to think we should name this '--target-cpu'. We already have '--target' which allows us to change the target to kvmtool and we may support other emulators/vmms someday. riscv kvmtool is already gaining support for different cpu types[1] and other vmms we add may have that support as well. There's no reason to add a separate configure command line option for each. So let's rename QEMU_CPU to TARGET_CPU, but keep DEFAULT_QEMU_CPU as it is, since that's qemu specific. run scripts will need to check TARGET is "qemu" to decide if it should use DEFAULT_QEMU_CPU when TARGET_CPU isn't set. Since TARGET is currently arm-only we can drop the riscv changes from this patch. I can do them with a follow-on series. 
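The selection rule described above could be sketched roughly as follows. This is only an illustration, not code from the series: `select_target_cpu` is a hypothetical helper name, `TARGET_CPU` is the rename proposed here, and how non-qemu targets pick their default is left open.

```shell
# Hypothetical helper for a run script (not part of the patch).
# $1: TARGET ("qemu", "kvmtool", ...)
# $2: TARGET_CPU as set by configure (may be empty)
# $3: DEFAULT_QEMU_CPU (only meaningful when TARGET is "qemu")
select_target_cpu() {
    local target="$1" target_cpu="$2" default_qemu_cpu="$3"

    if [ -n "$target_cpu" ]; then
        # An explicit user choice always wins.
        echo "$target_cpu"
    elif [ "$target" = "qemu" ]; then
        # Fall back to the qemu-specific default only for qemu.
        echo "$default_qemu_cpu"
    fi
    # Other targets: print nothing and let the VMM use its own default.
}
```

A run script would then use something like `cpu=$(select_target_cpu "$TARGET" "$TARGET_CPU" "$DEFAULT_QEMU_CPU")` and only pass a CPU option when `$cpu` is non-empty.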
[1] https://lore.kernel.org/all/20250326065644.73765-1-apatel at ventanamicro.com/

Thanks,
drew

> --target=TARGET target platform that the tests will be running on (qemu or
> kvmtool, default is qemu) (arm/arm64 only)
> --cross-prefix=PREFIX cross compiler prefix
> @@ -146,6 +165,9 @@ while [[ $optno -le $argc ]]; do
> --processor)
> processor="$arg"
> ;;
> + --qemu-cpu)
> + qemu_cpu="$arg"
> + ;;
> --target)
> target="$arg"
> ;;
> @@ -471,6 +493,8 @@ ARCH=$arch
> ARCH_NAME=$arch_name
> ARCH_LIBDIR=$arch_libdir
> PROCESSOR=$processor
> +QEMU_CPU=$qemu_cpu
> +DEFAULT_QEMU_CPU=$(get_default_qemu_cpu $arch)
> CC=$cc
> CFLAGS=$cflags
> LD=$cross_prefix$ld
> --
> 2.49.0
>
>
> --
> kvm-riscv mailing list
> kvm-riscv at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kvm-riscv

From guoren at kernel.org Thu Mar 27 05:47:15 2025
From: guoren at kernel.org (Guo Ren)
Date: Thu, 27 Mar 2025 20:47:15 +0800
Subject: [RFC PATCH V3 31/43] rv64ilp32_abi: maple_tree: Use BITS_PER_LONG instead of CONFIG_64BIT
In-Reply-To: <4gph4xikdbg6loy2id72uyxgldsldecc7gquhymusl3vreiw3a@ephk5ahhrdw7>
References: <20250325121624.523258-1-guoren@kernel.org> <20250325121624.523258-32-guoren@kernel.org> <4gph4xikdbg6loy2id72uyxgldsldecc7gquhymusl3vreiw3a@ephk5ahhrdw7>
Message-ID:

On Wed, Mar 26, 2025 at 3:10 AM Liam R. Howlett wrote:
>
> * guoren at kernel.org [250325 08:24]:
> > From: "Guo Ren (Alibaba DAMO Academy)"
> >
> > The Maple tree algorithm uses ulong type for each element. The
> > number of slots is based on BITS_PER_LONG for RV64ILP32 ABI, so
> > use BITS_PER_LONG instead of CONFIG_64BIT.
> > > > Signed-off-by: Guo Ren (Alibaba DAMO Academy) > > --- > > include/linux/maple_tree.h | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h > > index cbbcd18d4186..ff6265b6468b 100644 > > --- a/include/linux/maple_tree.h > > +++ b/include/linux/maple_tree.h > > @@ -24,7 +24,7 @@ > > * > > * Nodes in the tree point to their parent unless bit 0 is set. > > */ > > -#if defined(CONFIG_64BIT) || defined(BUILD_VDSO32_64) > > +#if (BITS_PER_LONG == 64) || defined(BUILD_VDSO32_64) > > This will break my userspace testing, if you do not update the testing as > well. This can be found in tools/testing/radix-tree. Please also look > at the Makefile as well since it will generate a build flag for the > userspace. I think you are talking about the following: ============================================================ ../shared/shared.mk: ifndef LONG_BIT LONG_BIT := $(shell getconf LONG_BIT) endif generated/bit-length.h: FORCE @mkdir -p generated @if ! 
grep -qws CONFIG_$(LONG_BIT)BIT generated/bit-length.h; then \ echo "Generating $@"; \ echo "#define CONFIG_$(LONG_BIT)BIT 1" > $@; \ echo "#define CONFIG_PHYS_ADDR_T_$(LONG_BIT)BIT 1" >> $@; \ fi $ grep CONFIG_64BIT * -r -A 2 generated/bit-length.h:#define CONFIG_64BIT 1 generated/bit-length.h-#define CONFIG_PHYS_ADDR_T_64BIT 1 -- maple.c:#if defined(CONFIG_64BIT) maple.c-static noinline void __init check_erase2_testset(struct maple_tree *mt, maple.c- const unsigned long *set, unsigned long size) -- maple.c:#if CONFIG_64BIT maple.c- MT_BUG_ON(mt, data_end != mas_data_end(&mas)); maple.c-#endif -- maple.c:#if CONFIG_64BIT maple.c- MT_BUG_ON(mt, data_end - 2 != mas_data_end(&mas)); maple.c-#endif -- maple.c:#if CONFIG_64BIT maple.c- MT_BUG_ON(mt, data_end - 4 != mas_data_end(&mas)); maple.c-#endif -- maple.c:#if defined(CONFIG_64BIT) maple.c- /* Captures from VMs that found previous errors */ maple.c- mt_init_flags(&tree, 0); ============================================================ First, we don't introduce rv64ilp32-abi user space, which means these testing codes can't run on rv64ilp32-abi userspace currently. So, the problem you mentioned doesn't exist. Second, CONFIG_32BIT is determined by LONG_BIT, so there's no issue in maple.c with future rv64ilp32-abi userspace. That means rv64ilp32-abi userspace would use CONFIG_32BIT to test radix-tree. It's okay. > > This raises other concerns as the code is found with a grep command, so > I'm not sure why it was missed and if anything else is missed? > > If you consider this email to be the (unasked) question about what to do > here, then please CC me, the maintainer of the files including the one > you are updating here. 
>
> Thank you,
> Liam
>

--
Best Regards
 Guo Ren

From guoren at kernel.org Thu Mar 27 06:13:43 2025
From: guoren at kernel.org (Guo Ren)
Date: Thu, 27 Mar 2025 21:13:43 +0800
Subject: [RFC PATCH V3 00/43] rv64ilp32_abi: Build CONFIG_64BIT kernel-self with ILP32 ABI
In-Reply-To:
References: <20250325121624.523258-1-guoren@kernel.org> <20250325122640.GK36322@noisy.programming.kicks-ass.net>
Message-ID:

On Wed, Mar 26, 2025 at 2:56 PM Arnd Bergmann wrote:
>
> On Wed, Mar 26, 2025, at 07:07, Guo Ren wrote:
> > On Tue, Mar 25, 2025 at 9:18 PM Arnd Bergmann wrote:
> >> On Tue, Mar 25, 2025, at 13:26, Peter Zijlstra wrote:
> >> > On Tue, Mar 25, 2025 at 08:15:41AM -0400, guoren at kernel.org wrote:
> >>
> >> You declare the syscall ABI to be the native 64-bit ABI, but this
> >> is fundamentally not true because a many uapi structures are
> >> defined in terms of 'long' or pointer values, in particular in
> >> the ioctl call.
> >
> > I modified uapi with
> > void __user *msg_name;
> > ->
> > union {void __user *msg_name; u64 __msg_name;};
> > to make native 64-bit ABI.
> >
> > I would look at compat stuff instead of using __riscv_xlen macro.
>
> The problem I see here is that there are many more drivers
> that you did not modify than drivers that you did change this
> way. The union is particularly ugly, but even if you find
> a nicer method of doing this, you now also put the burden
> on future driver writers to do this right for your platform.

Got it.

>
> >> As far as I can tell, there is no way to rectify this design flaw
> >> other than to drop support for 64-bit userspace and only support
> >> regular rv32 userspace. I'm also skeptical that supporting rv64
> >> userspace helps in practice other than for testing, since
> >> generally most memory overhead is in userspace rather than the
> >> kernel, and there is much more to gain from shrinking the larger
> >> userspace by running rv32 compat mode binaries on a 64-bit kernel
> >> than the other way round.
> > > > The lp64-abi userspace rootfs works fine in this patch set, which > > proves the technique is valid. But the modification on uapi is raw, > > and I'm looking at compat stuff. > > There is a big difference between making it work for a particular > set of userspace binaries and making it correct for the entire > kernel ABI. > > I agree that limiting the hacks to the compat side while keeping > the native ABI as ilp32 as in your previous versions is better, > but I also don't think this can be easily done without major > changes to how compat mode works in general, and that still > seems like a show-stopper for two reasons: > > - it still puts the burden on driver writers to get it right > for your platform. The scope is a bit smaller than in the > current version because that would be limited to the compat > handlers and not change the native codepath, but that's > still a lot of drivers. > > - the way that I would imagine this to be implemented in > practice would require changing the compat code in a way that > allows multiple compat ABIs, so drivers can separate the > normal 32-on-64 handling from the 64-on-32 version you need. > We have discussed something like this in the past, but Linus > has already made it very clear that he doesn't want it done > that way. Whichever way you do it, this is unlikely to > find consensus. Got it, thanks for analysing. > > > Supporting lp64-abi userspace is essential because riscv lp64-abi and > > ilp32-abi userspace are hybrid deployments when the target is > > ilp32-abi userspace. The lp64-abi provides a good supplement to > > ilp32-abi which eases the development. > > I'm not following here, please clarify. I do understand that > having a mixed 32/64 userspace can help for development, but > that can already be done on a 64-bit kernel and it doesn't > seem to be useful for deployment because having two sets of > support libraries makes this counterproductive for the goal > of saving RAM. 
In my case, most binaries and libraries are based on 32-bit, but a small part would remain on 64-bit, which may be statically linked. For RISC-V, the rv64 ecosystem is more complete than the rv32's. So, rv64-abi is always necessary, and rv32-abi is a supplement. > > >> If you remove the CONFIG_64BIT changes that Peter mentioned and > >> the support for ilp64 userland from your series, you end up > >> with a kernel that is very similar to a native rv32 kernel > >> but executes as rv64ilp32 and runs rv32 userspace. I don't have > >> any objections to that approach, and the same thing has come > >> up on arm64 as a possible idea as well, but I don't know if > >> that actually brings any notable advantage over an rv32 kernel. > >> > >> Are there CPUs that can run rv64 kernels and rv32 userspace > >> but not rv32 kernels, similar to what we have on Arm Cortex-A76 > >> and Cortex-A510? > > > > Yes, there is, and it only supports rv32 userspace, not rv32 kernel. > > https://www.xrvm.com/product/xuantie/C908 > > Ok, thanks for the link. > > Arnd > -- Best Regards Guo Ren From palmer at dabbelt.com Thu Mar 27 09:20:09 2025 From: palmer at dabbelt.com (Palmer Dabbelt) Date: Thu, 27 Mar 2025 09:20:09 -0700 (PDT) Subject: [RFC PATCH V3 00/43] rv64ilp32_abi: Build CONFIG_64BIT kernel-self with ILP32 ABI In-Reply-To: Message-ID: On Tue, 25 Mar 2025 12:23:39 PDT (-0700), Liam.Howlett at oracle.com wrote: > * David Hildenbrand [250325 14:52]: >> On 25.03.25 13:26, Peter Zijlstra wrote: >> > On Tue, Mar 25, 2025 at 08:15:41AM -0400, guoren at kernel.org wrote: >> > > From: "Guo Ren (Alibaba DAMO Academy)" >> > > >> > > Since 2001, the CONFIG_64BIT kernel has been built with the LP64 ABI, >> > > but this patchset allows the CONFIG_64BIT kernel to use an ILP32 ABI >> > >> > I'm thinking you're going to be finding a metric ton of assumptions >> > about 'unsigned long' being 64bit when 64BIT=y throughout the kernel. 
>> > >> > I know of a couple of places where 64BIT will result in different math >> > such that a 32bit 'unsigned long' will trivially overflow. Ya, I write code that assumes "unsigned long" is the size of a register pretty regularly. >> > >> > Please, don't do this. This adds a significant maintenance burden on all >> > of us. >> > >> >> Fully agreed. > > I would go further and say I do not want this to go in. Seems reasonable to me, and I think it's also been the general sentiment when this stuff comes up. This specific implementation seems particularly clunky, but I agree that it's going to be painful to do this sort of thing. > The open ended maintenance burden is not worth extending hardware life > of a board with 16mb of ram (If I understand your 2023 LPC slides > correctly). We can already run full 32-bit kernels on 64-bit hardware. The hardware needs to support configurable XLEN, but there's systems out there that do already. It's not like any of the existing RISC-V stuff ships in meaningful volumes. So I think it's fine to just say that vendors who want tiny memories should build hardware that plays nice with those constraints, and vendors who build hardware that doesn't make any sense get to pick up the pieces. I get RISC-V is where people go to have crazy ideas, but there's got to be a line somewhere... > > Thank you, > Liam From palmer at dabbelt.com Thu Mar 27 09:20:11 2025 From: palmer at dabbelt.com (Palmer Dabbelt) Date: Thu, 27 Mar 2025 09:20:11 -0700 (PDT) Subject: [RFC PATCH V3 01/43] rv64ilp32_abi: uapi: Reuse lp64 ABI interface In-Reply-To: Message-ID: On Tue, 25 Mar 2025 13:41:30 PDT (-0700), Linus Torvalds wrote: > On Tue, 25 Mar 2025 at 05:17, wrote: >> >> The rv64ilp32 abi kernel accommodates the lp64 abi userspace and >> leverages the lp64 abi Linux interface. Hence, unify the >> BITS_PER_LONG = 32 memory layout to match BITS_PER_LONG = 64. > > No. > > This isn't happening. 
> > You can't do crazy things in the RISC-V code and then expect the rest > of the kernel to just go "ok, we'll do crazy things". > > We're not doing crazy __riscv_xlen hackery with random structures > containing 64-bit values that the kernel then only looks at the low 32 > bits. That's wrong on *so* many levels. FWIW: this has come up a few times and we've generally said "nobody wants this", but that doesn't seem to stick... > I'm willing to say "big-endian is dead", but I'm not willing to accept > this kind of crazy hackery. > > Not today, not ever. OK, maybe that will stick ;) > If you want to run a ilp32 kernel on 64-bit hardware (and support > 64-bit ABI just in a 32-bit virtual memory size), I would suggest you > > (a) treat the kernel as natively 32-bit (obviously you can then tell > the compiler to use the rv64 instructions, which I presume you're > already doing - I didn't look) > > (b) look at making the compat stuff do the conversion the "wrong way". > > And btw, that (b) implies *not* just ignoring the high bits. If > user-space gives 64-bit pointer, you don't just treat it as a 32-bit > one by dropping the high bits. You add some logic to convert it to an > invalid pointer so that user space gets -EFAULT. > > Linus From alexandru.elisei at arm.com Thu Mar 27 10:11:23 2025 From: alexandru.elisei at arm.com (Alexandru Elisei) Date: Thu, 27 Mar 2025 17:11:23 +0000 Subject: [kvm-unit-tests PATCH v3 1/5] configure: arm64: Don't display 'aarch64' as the default architecture In-Reply-To: <20250325160031.2390504-4-jean-philippe@linaro.org> References: <20250325160031.2390504-3-jean-philippe@linaro.org> <20250325160031.2390504-4-jean-philippe@linaro.org> Message-ID: Hi Jean-Philippe On Tue, Mar 25, 2025 at 04:00:29PM +0000, Jean-Philippe Brucker wrote: > From: Alexandru Elisei > > --arch=aarch64, intentional or not, has been supported since the initial > arm64 support, commit 39ac3f8494be ("arm64: initial drop"). 
However, > "aarch64" does not show up in the list of supported architectures, but > it's displayed as the default architecture if doing ./configure --help > on an arm64 machine. > > Keep everything consistent and make sure that the default value for > $arch is "arm64", but still allow --arch=aarch64, in case they are users > that use this configuration for kvm-unit-tests. You can drop this paragraph, since the change to the default value for $arch was dropped. With this change: Reviewed-by: Alexandru Elisei Thanks, Alex > > The help text for --arch changes from: > > --arch=ARCH architecture to compile for (aarch64). ARCH can be one of: > arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 > > to: > > --arch=ARCH architecture to compile for (arm64). ARCH can be one of: > arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 > > Signed-off-by: Alexandru Elisei > Signed-off-by: Jean-Philippe Brucker > --- > configure | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/configure b/configure > index 52904d3a..010c68ff 100755 > --- a/configure > +++ b/configure > @@ -43,6 +43,7 @@ else > fi > > usage() { > + [ "$arch" = "aarch64" ] && arch="arm64" > cat <<-EOF > Usage: $0 [options] > > -- > 2.49.0 > From alexandru.elisei at arm.com Thu Mar 27 10:11:38 2025 From: alexandru.elisei at arm.com (Alexandru Elisei) Date: Thu, 27 Mar 2025 17:11:38 +0000 Subject: [kvm-unit-tests PATCH v3 2/5] configure: arm/arm64: Display the correct default processor In-Reply-To: <20250325160031.2390504-5-jean-philippe@linaro.org> References: <20250325160031.2390504-3-jean-philippe@linaro.org> <20250325160031.2390504-5-jean-philippe@linaro.org> Message-ID: Hi Jean-Philippe, On Tue, Mar 25, 2025 at 04:00:30PM +0000, Jean-Philippe Brucker wrote: > From: Alexandru Elisei > > The help text for the --processor option displays the architecture name as > the default processor type. But the default for arm is cortex-a15, and for > arm64 is cortex-a57. 
Teach configure to display the correct default > processor type for these two architectures. Looks good to me: Reviewed-by: Alexandru Elisei Thanks, Alex > > Signed-off-by: Alexandru Elisei > Signed-off-by: Jean-Philippe Brucker > --- > configure | 30 ++++++++++++++++++++++-------- > 1 file changed, 22 insertions(+), 8 deletions(-) > > diff --git a/configure b/configure > index 010c68ff..b4875ef3 100755 > --- a/configure > +++ b/configure > @@ -5,6 +5,24 @@ if [ -z "${BASH_VERSINFO[0]}" ] || [ "${BASH_VERSINFO[0]}" -lt 4 ] ; then > exit 1 > fi > > +# Return the default CPU type to compile for > +function get_default_processor() > +{ > + local arch="$1" > + > + case "$arch" in > + "arm") > + echo "cortex-a15" > + ;; > + "arm64") > + echo "cortex-a57" > + ;; > + *) > + echo "$arch" > + ;; > + esac > +} > + > srcdir=$(cd "$(dirname "$0")"; pwd) > prefix=/usr/local > cc=gcc > @@ -44,13 +62,14 @@ fi > > usage() { > [ "$arch" = "aarch64" ] && arch="arm64" > + [ -z "$processor" ] && processor=$(get_default_processor $arch) > cat <<-EOF > Usage: $0 [options] > > Options include: > --arch=ARCH architecture to compile for ($arch). ARCH can be one of: > arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 > - --processor=PROCESSOR processor to compile for ($arch) > + --processor=PROCESSOR processor to compile for ($processor) > --target=TARGET target platform that the tests will be running on (qemu or > kvmtool, default is qemu) (arm/arm64 only) > --cross-prefix=PREFIX cross compiler prefix > @@ -326,13 +345,8 @@ if [ "$earlycon" ]; then > fi > fi > > -[ -z "$processor" ] && processor="$arch" > - > -if [ "$processor" = "arm64" ]; then > - processor="cortex-a57" > -elif [ "$processor" = "arm" ]; then > - processor="cortex-a15" > -fi > +# $arch will have changed when cross-compiling. 
> +[ -z "$processor" ] && processor=$(get_default_processor $arch) > > if [ "$arch" = "i386" ] || [ "$arch" = "x86_64" ]; then > testdir=x86 > -- > 2.49.0 > From alexandru.elisei at arm.com Thu Mar 27 10:14:39 2025 From: alexandru.elisei at arm.com (Alexandru Elisei) Date: Thu, 27 Mar 2025 17:14:39 +0000 Subject: [kvm-unit-tests PATCH v3 4/5] configure: Add --qemu-cpu option In-Reply-To: <20250325160031.2390504-7-jean-philippe@linaro.org> References: <20250325160031.2390504-3-jean-philippe@linaro.org> <20250325160031.2390504-7-jean-philippe@linaro.org> Message-ID: Hi Jean-Philippe, On Tue, Mar 25, 2025 at 04:00:32PM +0000, Jean-Philippe Brucker wrote: > Add the --qemu-cpu option to let users set the CPU type to run on. > At the moment --processor allows to set both GCC -mcpu flag and QEMU > -cpu. On Arm we'd like to pass `-cpu max` to QEMU in order to enable all > the TCG features by default, and it could also be nice to let users > modify the CPU capabilities by setting extra -cpu options. Since GCC > -mcpu doesn't accept "max" or "host", separate the compiler and QEMU > arguments. > > `--processor` is now exclusively for compiler options, as indicated by > its documentation ("processor to compile for"). So use $QEMU_CPU on > RISC-V as well. 
> > Suggested-by: Andrew Jones > Signed-off-by: Jean-Philippe Brucker > --- > scripts/mkstandalone.sh | 3 ++- > arm/run | 15 +++++++++------ > riscv/run | 8 ++++---- > configure | 24 ++++++++++++++++++++++++ > 4 files changed, 39 insertions(+), 11 deletions(-) > > diff --git a/scripts/mkstandalone.sh b/scripts/mkstandalone.sh > index 2318a85f..9b4f983d 100755 > --- a/scripts/mkstandalone.sh > +++ b/scripts/mkstandalone.sh > @@ -42,7 +42,8 @@ generate_test () > > config_export ARCH > config_export ARCH_NAME > - config_export PROCESSOR > + config_export QEMU_CPU > + config_export DEFAULT_QEMU_CPU > > echo "echo BUILD_HEAD=$(cat build-head)" > > diff --git a/arm/run b/arm/run > index efdd44ce..4675398f 100755 > --- a/arm/run > +++ b/arm/run > @@ -8,7 +8,7 @@ if [ -z "$KUT_STANDALONE" ]; then > source config.mak > source scripts/arch-run.bash > fi > -processor="$PROCESSOR" > +qemu_cpu="$QEMU_CPU" > > if [ "$QEMU" ] && [ -z "$ACCEL" ] && > [ "$HOST" = "aarch64" ] && [ "$ARCH" = "arm" ] && > @@ -37,12 +37,15 @@ if [ "$ACCEL" = "kvm" ]; then > fi > fi > > -if [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ]; then > - if [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ]; then > - processor="host" > +if [ -z "$qemu_cpu" ]; then > + if ( [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ] ) && > + ( [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ] ); then > + qemu_cpu="host" > if [ "$ARCH" = "arm" ] && [ "$HOST" = "aarch64" ]; then > - processor+=",aarch64=off" > + qemu_cpu+=",aarch64=off" > fi > + else > + qemu_cpu="$DEFAULT_QEMU_CPU" > fi > fi > > @@ -71,7 +74,7 @@ if $qemu $M -device '?' 
| grep -q pci-testdev; then > fi > > A="-accel $ACCEL$ACCEL_PROPS" > -command="$qemu -nodefaults $M $A -cpu $processor $chr_testdev $pci_testdev" > +command="$qemu -nodefaults $M $A -cpu $qemu_cpu $chr_testdev $pci_testdev" > command+=" -display none -serial stdio" > command="$(migration_cmd) $(timeout_cmd) $command" > > diff --git a/riscv/run b/riscv/run > index e2f5a922..02fcf0c0 100755 > --- a/riscv/run > +++ b/riscv/run > @@ -11,12 +11,12 @@ fi > > # Allow user overrides of some config.mak variables > mach=$MACHINE_OVERRIDE > -processor=$PROCESSOR_OVERRIDE > +qemu_cpu=$QEMU_CPU_OVERRIDE > firmware=$FIRMWARE_OVERRIDE > > -[ "$PROCESSOR" = "$ARCH" ] && PROCESSOR="max" > +[ -z "$QEMU_CPU" ] && QEMU_CPU="max" If you make DEFAULT_QEMU_CPU=max for riscv, you can use it instead of "max", just like arm/arm64 does. Not sure if it's worth another respin. Otherwise the patch looks good to me. Thanks, Alex > : "${mach:=virt}" > -: "${processor:=$PROCESSOR}" > +: "${qemu_cpu:=$QEMU_CPU}" > : "${firmware:=$FIRMWARE}" > [ "$firmware" ] && firmware="-bios $firmware" > > @@ -32,7 +32,7 @@ fi > mach="-machine $mach" > > command="$qemu -nodefaults -nographic -serial mon:stdio" > -command+=" $mach $acc $firmware -cpu $processor " > +command+=" $mach $acc $firmware -cpu $qemu_cpu " > command="$(migration_cmd) $(timeout_cmd) $command" > > if [ "$UEFI_SHELL_RUN" = "y" ]; then > diff --git a/configure b/configure > index b4875ef3..b79145a5 100755 > --- a/configure > +++ b/configure > @@ -23,6 +23,21 @@ function get_default_processor() > esac > } > > +# Return the default CPU type to run on > +function get_default_qemu_cpu() > +{ > + local arch="$1" > + > + case "$arch" in > + "arm") > + echo "cortex-a15" > + ;; > + "arm64") > + echo "cortex-a57" > + ;; > + esac > +} > + > srcdir=$(cd "$(dirname "$0")"; pwd) > prefix=/usr/local > cc=gcc > @@ -52,6 +67,7 @@ earlycon= > console= > efi= > efi_direct= > +qemu_cpu= > > # Enable -Werror by default for git repositories only (i.e. 
developer builds) > if [ -e "$srcdir"/.git ]; then > @@ -70,6 +86,9 @@ usage() { > --arch=ARCH architecture to compile for ($arch). ARCH can be one of: > arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 > --processor=PROCESSOR processor to compile for ($processor) > + --qemu-cpu=CPU the CPU model to run on. If left unset, the run script > + selects the best value based on the host system and the > + test configuration. > --target=TARGET target platform that the tests will be running on (qemu or > kvmtool, default is qemu) (arm/arm64 only) > --cross-prefix=PREFIX cross compiler prefix > @@ -146,6 +165,9 @@ while [[ $optno -le $argc ]]; do > --processor) > processor="$arg" > ;; > + --qemu-cpu) > + qemu_cpu="$arg" > + ;; > --target) > target="$arg" > ;; > @@ -471,6 +493,8 @@ ARCH=$arch > ARCH_NAME=$arch_name > ARCH_LIBDIR=$arch_libdir > PROCESSOR=$processor > +QEMU_CPU=$qemu_cpu > +DEFAULT_QEMU_CPU=$(get_default_qemu_cpu $arch) > CC=$cc > CFLAGS=$cflags > LD=$cross_prefix$ld > -- > 2.49.0 > From alexandru.elisei at arm.com Thu Mar 27 10:17:42 2025 From: alexandru.elisei at arm.com (Alexandru Elisei) Date: Thu, 27 Mar 2025 17:17:42 +0000 Subject: [kvm-unit-tests PATCH v3 0/5] arm64: Change the default QEMU CPU type to "max" In-Reply-To: <20250326-adedac8bee1bcff68cb0b849@orel> References: <20250325160031.2390504-3-jean-philippe@linaro.org> <20250326-adedac8bee1bcff68cb0b849@orel> Message-ID: Hi Drew, On Wed, Mar 26, 2025 at 07:51:35PM +0100, Andrew Jones wrote: > On Tue, Mar 25, 2025 at 04:00:28PM +0000, Jean-Philippe Brucker wrote: > > This is v3 of the series that cleans up the configure flags and sets the > > default CPU type to "max" on arm64, in order to test the latest Arm > > features. > > > > Since v2 [1] I moved the CPU selection to ./configure, and improved the > > help text. Unfortunately I couldn't keep most of the Review tags since > > there were small changes all over. 
> >
> > [1] https://lore.kernel.org/all/20250314154904.3946484-2-jean-philippe at linaro.org/
> >
> > Alexandru Elisei (3):
> >   configure: arm64: Don't display 'aarch64' as the default architecture
> >   configure: arm/arm64: Display the correct default processor
> >   arm64: Implement the ./configure --processor option
> >
> > Jean-Philippe Brucker (2):
> >   configure: Add --qemu-cpu option
> >   arm64: Use -cpu max as the default for TCG
> >
> >  scripts/mkstandalone.sh | 3 ++-
> >  arm/run | 15 ++++++-----
> >  riscv/run | 8 +++---
> >  configure | 55 +++++++++++++++++++++++++++++++++++------
> >  arm/Makefile.arm | 1 -
> >  arm/Makefile.common | 1 +
> >  6 files changed, 63 insertions(+), 20 deletions(-)
> >
> > --
> > 2.49.0
>
> Thanks Jean-Philippe. I'll let Alex and Eric give it another look, but
> LGTM.

Looks good to me too, just two nitpicks, not sure if it's worth respinning
just for them, or if you want to make the changes when you apply the
patches (or just ignore them, that's fine too!).

Thanks,
Alex

From atishp at rivosinc.com Thu Mar 27 12:35:41 2025
From: atishp at rivosinc.com (Atish Patra)
Date: Thu, 27 Mar 2025 12:35:41 -0700
Subject: [PATCH v5 00/21] Add Counter delegation ISA extension support
Message-ID: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com>

This series adds the counter delegation extension support. It is based on
very early PoC work done by Kevin Xue and mostly rewritten after that.
The counter delegation ISA extension (Smcdeleg/Ssccfg) actually depends
on multiple ISA extensions.

1. S[m|s]csrind: The indirect CSR extension[1], which defines 5 additional
([M|S|VS]IREG2-[M|S|VS]IREG6) registers to address the size limitation of
the RISC-V CSR address space.
2. Smstateen: The stateen bit[60] controls the access to the registers
indirectly via the above indirect registers.
3.
Smcdeleg/Ssccfg: The counter delegation extensions[2] The counter delegation extension allows Supervisor mode to program the hpmevent and hpmcounters directly without needing assistance from M-mode via SBI calls. This results in faster perf profiling and very few traps. This extension also introduces a scountinhibit CSR which allows stopping/starting any counter directly from S-mode. As the counter delegation extension potentially can have more than 100 CSRs, the specification leverages the indirect CSR extension to save the precious CSR address range. Due to the dependency of these extensions, the following extensions must be enabled in qemu to use the counter delegation feature in S-mode. "smstateen=true,sscofpmf=true,ssccfg=true,smcdeleg=true,smcsrind=true,sscsrind=true" or Virt machine users can just use the "max" cpu instead. When we access the counters directly in S-mode, we also need to solve the following problems. 1. Event to counter mapping 2. Event encoding discovery The RISC-V ISA doesn't define any standard either for event encoding or the event to counter mapping rules. Until now, the SBI PMU implementation relies on device tree binding[3] to discover the event to counter mapping in RISC-V platform in the firmware. The SBI PMU specification[4] defines event encoding for standard perf events as well. Thus, the kernel can query the appropriate counter for a given event from the firmware. However, the kernel doesn't need any firmware interaction for hardware counters if counter delegation is available in the hardware. Thus, the driver needs to discover the above mappings/encodings by itself without any assistance from firmware. Solution to Problem #1: This patch series solves the above problem #1 by extending the perf tool in a way so that the event json file can specify the counter constraints of each event and that can be passed to the driver to choose the best counter for a given event.
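As a rough illustration of the counter constraint idea above (not code from this series): a constraint written as a comma separated list of counter indices, such as the "0,1" values that show up in the empty-pmu-events.c diff, could be folded into a bitmask along these lines. The function name counters_to_mask is hypothetical and is not taken from jevents.py.

```python
def counters_to_mask(counters: str) -> int:
    """Turn a comma separated list of counter indices into a bitmask.

    Hypothetical helper for illustration only; bit N set means that
    counter N is allowed for the event.
    """
    mask = 0
    for field in counters.split(","):
        field = field.strip()
        if not field:
            # An empty string means no constraint was given.
            continue
        mask |= 1 << int(field)
    return mask


if __name__ == "__main__":
    # Counters 0 and 1 allowed.
    print(hex(counters_to_mask("0,1")))
```

With a mask in hand, a driver only needs a bitwise AND against its set of free counters to find a usable one, which is presumably why a mask is easier to pass through a config field than a list.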
The perf stat metric series[5] from Weilin already extends the perf tool to parse the "Counter" property to specify the hardware counter restriction. As that series was not revised in a while, I have rebased it and included it in this series. I can include only the necessary parts from that patch required for this series if required. This series extends that support by converting the comma separated string to a bitmap. The counter constraint bitmap is passed to the perf driver via the newly introduced "counterid_mask" property set in "config2". However, it results in the following event string which has repeated information about the counters both in list and bitmask format. I am not sure how I can pass the list information to the driver directly. That's why I added a counterid_mask property. Additionally, the PATCH5 in [5] parses the bitmask information from the string and puts it into the metric group structure. We can just convert it in python easily and pass it to the metric group instead. The PATCH19 does exactly that and sets the counterid_mask property. @Weilin @Ian : Please let me know if there is a better way to solve the problem I described. Due to the new counterid_mask property, the layout in empty-pmu-events.c got changed which is patched in PATCH 21 based on existing script. Possible solutions to Problem #2: 1. Extend the PMU DT parsing support to kernel as well. However, that requires additional support in ACPI based systems. It also needs more infrastructure in the virtualization as well. 2. Rename perf legacy events to riscv specific names. This will require users to use perf differently than other ISAs which is not ideal. 3. Define an architecture specific override function for legacy events. Earlier RFC version did that but it is not preferred as arch specific behavior in perf tool has other ramifications on the tests. 4. Ian graciously helped and sent a generic fix[6] for #3 that prefers json over legacy encoding.
Unfortunately, it had some regressions and the discussions are ongoing about whether it is a viable solution. 5. Specify the encodings in the driver. There were earlier concerns of managing these in the driver as these encodings are vendor specific in the absence of ISA guidelines. However, we also need to support counter virtualization and legacy event users (without perf tool) as described in [7]. That's why this series adopts this solution, similar to other ISAs. The vendors can define their pmu event encoding and event to counter mapping in the driver. Note: This solution is still compatible with solution #4 by Ian. It gives vendors flexibility to define legacy event encoding in either the driver or json file if Ian's series [6] is merged. If we can get rid of the legacy events in the future, we can just rely on the json encodings. I have not added a json file for qemu as I have not included Ian's patches in this series. But I have verified them with a virt machine specific json file. The Qemu patches are available in upstream now. The Linux kernel patches can be found here: https://github.com/atishp04/linux/tree/b4/counter_delegation_v4 [1] https://github.com/riscv/riscv-indirect-csr-access [2] https://github.com/riscv/riscv-smcdeleg-ssccfg [3] https://www.kernel.org/doc/Documentation/devicetree/bindings/perf/riscv%2Cpmu.yaml [4] https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/ext-pmu.adoc [5] https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang at intel.com/ [6] https://lore.kernel.org/lkml/20250109222109.567031-1-irogers at google.com/ [7] https://lore.kernel.org/lkml/20241026121758.143259-1-irogers at google.com/T/#m653a6b98919a365a361a698032502bd26af9f6ba Signed-off-by: Atish Patra --- Changes in v5: - Fixed dt_binding_check errors. - Added the ISA extension dependency for counter delegation extensions. - Replaced the boolean variables with static key conditional check required at boot time. - Miscellaneous minor code restructuring.
- Link to v4: https://lore.kernel.org/r/20250205-counter_delegation-v4-0-835cfa88e3b1 at rivosinc.com Changes in v4: - Added ISA dependencies as per dt schema instead of description. - Fixed few compilation issues due to patch reordering in v3. - Link to v3: https://lore.kernel.org/r/20250127-counter_delegation-v3-0-64894d7e16d5 at rivosinc.com Changes in v3: - Fixed the dtb binding check failures. - Included the fix reported by Rajnesh Kanwal for guest counter overflow. - Rearranged the overflow handling more efficiently for better modularity. - Link to v2: https://lore.kernel.org/r/20250114-counter_delegation-v2-0-8ba74cdb851b at rivosinc.com Changes in v2: - Dropped architecture specific overrides for event encoding. - Dropped hwprobe bits. - Added a vendor specific event encoding table to support vendor specific event encoding and counter mapping. - Fixed few bugs and cleanup. - Link to v1: https://lore.kernel.org/r/20240217005738.3744121-1-atishp at rivosinc.com --- Atish Patra (17): RISC-V: Add Sxcsrind ISA extension definition and parsing dt-bindings: riscv: add Sxcsrind ISA extension description RISC-V: Define indirect CSR access helpers RISC-V: Add Smcntrpmf extension parsing dt-bindings: riscv: add Smcntrpmf ISA extension description RISC-V: Add Ssccfg/Smcdeleg ISA extension definition and parsing dt-bindings: riscv: add Counter delegation ISA extensions description RISC-V: perf: Restructure the SBI PMU code RISC-V: perf: Modify the counter discovery mechanism RISC-V: perf: Add a mechanism to defined legacy event encoding RISC-V: perf: Implement supervisor counter delegation support RISC-V: perf: Use config2/vendor table for event to counter mapping RISC-V: perf: Add legacy event encodings via sysfs RISC-V: perf: Add Qemu virt machine events tools/perf: Support event code for arch standard events tools/perf: Pass the Counter constraint values in the pmu events Sync empty-pmu-events.c with autogenerated one Charlie Jenkins (1): RISC-V: perf: Skip PMU SBI
extension when not implemented Kaiwen Xue (2): RISC-V: Add Sxcsrind ISA extension CSR definitions RISC-V: Add Sscfg extension CSR definition Weilin Wang (1): perf pmu-events: Add functions in jevent.py to parse counter and event info for hardware aware grouping .../devicetree/bindings/riscv/extensions.yaml | 67 ++ MAINTAINERS | 4 +- arch/riscv/include/asm/csr.h | 57 ++ arch/riscv/include/asm/csr_ind.h | 42 + arch/riscv/include/asm/hwcap.h | 8 + arch/riscv/include/asm/kvm_vcpu_pmu.h | 4 +- arch/riscv/include/asm/kvm_vcpu_sbi.h | 2 +- arch/riscv/include/asm/vendorid_list.h | 4 + arch/riscv/kernel/cpufeature.c | 27 + arch/riscv/kvm/Makefile | 4 +- arch/riscv/kvm/vcpu_sbi.c | 2 +- drivers/perf/Kconfig | 16 +- drivers/perf/Makefile | 4 +- drivers/perf/{riscv_pmu.c => riscv_pmu_common.c} | 0 drivers/perf/{riscv_pmu_sbi.c => riscv_pmu_dev.c} | 982 +++++++++++++++++---- include/linux/perf/riscv_pmu.h | 26 +- .../perf/pmu-events/arch/riscv/arch-standard.json | 10 + tools/perf/pmu-events/empty-pmu-events.c | 299 ++++--- tools/perf/pmu-events/jevents.py | 218 ++++- tools/perf/pmu-events/pmu-events.h | 32 +- 20 files changed, 1490 insertions(+), 318 deletions(-) --- base-commit: 74e48c9286409128a3dccd0efd4b83b3f9ef95fd change-id: 20240715-counter_delegation-628a32f8c9cc -- Regards, Atish patra From atishp at rivosinc.com Thu Mar 27 12:35:43 2025 From: atishp at rivosinc.com (Atish Patra) Date: Thu, 27 Mar 2025 12:35:43 -0700 Subject: [PATCH v5 02/21] RISC-V: Add Sxcsrind ISA extension CSR definitions In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> Message-ID: <20250327-counter_delegation-v5-2-1ee538468d1b@rivosinc.com> From: Kaiwen Xue This adds definitions of new CSRs and bits defined in Sxcsrind ISA extension. These CSRs enable an indirect access mechanism to access any CSRs in M-, S-, and VS-mode.
The range of the select values and ireg will be defined by the ISA extension using Sxcsrind extension. Signed-off-by: Kaiwen Xue Reviewed-by: Clément Léger Signed-off-by: Atish Patra --- arch/riscv/include/asm/csr.h | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h index 6fed42e37705..bce56a83c384 100644 --- a/arch/riscv/include/asm/csr.h +++ b/arch/riscv/include/asm/csr.h @@ -333,6 +333,12 @@ /* Supervisor-Level Window to Indirectly Accessed Registers (AIA) */ #define CSR_SISELECT 0x150 #define CSR_SIREG 0x151 +/* Supervisor-Level Window to Indirectly Accessed Registers (Sxcsrind) */ +#define CSR_SIREG2 0x152 +#define CSR_SIREG3 0x153 +#define CSR_SIREG4 0x155 +#define CSR_SIREG5 0x156 +#define CSR_SIREG6 0x157 /* Supervisor-Level Interrupts (AIA) */ #define CSR_STOPEI 0x15c @@ -380,6 +386,14 @@ /* VS-Level Window to Indirectly Accessed Registers (H-extension with AIA) */ #define CSR_VSISELECT 0x250 #define CSR_VSIREG 0x251 +/* + * VS-Level Window to Indirectly Accessed Registers (H-extension with Sxcsrind) + */ +#define CSR_VSIREG2 0x252 +#define CSR_VSIREG3 0x253 +#define CSR_VSIREG4 0x255 +#define CSR_VSIREG5 0x256 +#define CSR_VISREG6 0x257 /* VS-Level Interrupts (H-extension with AIA) */ #define CSR_VSTOPEI 0x25c @@ -422,6 +436,12 @@ /* Machine-Level Window to Indirectly Accessed Registers (AIA) */ #define CSR_MISELECT 0x350 #define CSR_MIREG 0x351 +/* Machine-Level Window to Indirectly Accessed Registers (Sxcsrind) */ +#define CSR_MIREG2 0x352 +#define CSR_MIREG3 0x353 +#define CSR_MIREG4 0x355 +#define CSR_MIREG5 0x356 +#define CSR_MIREG6 0x357 /* Machine-Level Interrupts (AIA) */ #define CSR_MTOPEI 0x35c @@ -467,6 +487,11 @@ # define CSR_IEH CSR_MIEH # define CSR_ISELECT CSR_MISELECT # define CSR_IREG CSR_MIREG +# define CSR_IREG2 CSR_MIREG2 +# define CSR_IREG3 CSR_MIREG3 +# define CSR_IREG4 CSR_MIREG4 +# define CSR_IREG5 CSR_MIREG5 +# define CSR_IREG6 CSR_MIREG6 # 
define CSR_IPH CSR_MIPH # define CSR_TOPEI CSR_MTOPEI # define CSR_TOPI CSR_MTOPI @@ -492,6 +517,11 @@ # define CSR_IEH CSR_SIEH # define CSR_ISELECT CSR_SISELECT # define CSR_IREG CSR_SIREG +# define CSR_IREG2 CSR_SIREG2 +# define CSR_IREG3 CSR_SIREG3 +# define CSR_IREG4 CSR_SIREG4 +# define CSR_IREG5 CSR_SIREG5 +# define CSR_IREG6 CSR_SIREG6 # define CSR_IPH CSR_SIPH # define CSR_TOPEI CSR_STOPEI # define CSR_TOPI CSR_STOPI -- 2.43.0 From atishp at rivosinc.com Thu Mar 27 12:35:42 2025 From: atishp at rivosinc.com (Atish Patra) Date: Thu, 27 Mar 2025 12:35:42 -0700 Subject: [PATCH v5 01/21] perf pmu-events: Add functions in jevent.py to parse counter and event info for hardware aware grouping In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> Message-ID: <20250327-counter_delegation-v5-1-1ee538468d1b@rivosinc.com> From: Weilin Wang These functions are added to parse event counter restrictions and counter availability info from json files so that the metric grouping method could do grouping based on the counter restriction of events and the counters that are available on the system. 
Signed-off-by: Weilin Wang --- tools/perf/pmu-events/empty-pmu-events.c | 299 ++++++++++++++++++++----------- tools/perf/pmu-events/jevents.py | 205 ++++++++++++++++++++- tools/perf/pmu-events/pmu-events.h | 32 +++- 3 files changed, 419 insertions(+), 117 deletions(-) diff --git a/tools/perf/pmu-events/empty-pmu-events.c b/tools/perf/pmu-events/empty-pmu-events.c index 1c7a2cfa321f..3a7ec31576f5 100644 --- a/tools/perf/pmu-events/empty-pmu-events.c +++ b/tools/perf/pmu-events/empty-pmu-events.c @@ -20,73 +20,73 @@ struct pmu_table_entry { static const char *const big_c_string = /* offset=0 */ "tool\000" -/* offset=5 */ "duration_time\000tool\000Wall clock interval time in nanoseconds\000config=1\000\00000\000\000" -/* offset=78 */ "user_time\000tool\000User (non-kernel) time in nanoseconds\000config=2\000\00000\000\000" -/* offset=145 */ "system_time\000tool\000System/kernel time in nanoseconds\000config=3\000\00000\000\000" -/* offset=210 */ "has_pmem\000tool\0001 if persistent memory installed otherwise 0\000config=4\000\00000\000\000" -/* offset=283 */ "num_cores\000tool\000Number of cores. A core consists of 1 or more thread, with each thread being associated with a logical Linux CPU\000config=5\000\00000\000\000" -/* offset=425 */ "num_cpus\000tool\000Number of logical Linux CPUs. There may be multiple such CPUs on a core\000config=6\000\00000\000\000" -/* offset=525 */ "num_cpus_online\000tool\000Number of online logical Linux CPUs. There may be multiple such CPUs on a core\000config=7\000\00000\000\000" -/* offset=639 */ "num_dies\000tool\000Number of dies. Each die has 1 or more cores\000config=8\000\00000\000\000" -/* offset=712 */ "num_packages\000tool\000Number of packages. 
Each package has 1 or more die\000config=9\000\00000\000\000" -/* offset=795 */ "slots\000tool\000Number of functional units that in parallel can execute parts of an instruction\000config=0xa\000\00000\000\000" -/* offset=902 */ "smt_on\000tool\0001 if simultaneous multithreading (aka hyperthreading) is enable otherwise 0\000config=0xb\000\00000\000\000" -/* offset=1006 */ "system_tsc_freq\000tool\000The amount a Time Stamp Counter (TSC) increases per second\000config=0xc\000\00000\000\000" -/* offset=1102 */ "default_core\000" -/* offset=1115 */ "bp_l1_btb_correct\000branch\000L1 BTB Correction\000event=0x8a\000\00000\000\000" -/* offset=1174 */ "bp_l2_btb_correct\000branch\000L2 BTB Correction\000event=0x8b\000\00000\000\000" -/* offset=1233 */ "l3_cache_rd\000cache\000L3 cache access, read\000event=0x40\000\00000\000Attributable Level 3 cache access, read\000" -/* offset=1328 */ "segment_reg_loads.any\000other\000Number of segment register loads\000event=6,period=200000,umask=0x80\000\00000\000\000" -/* offset=1427 */ "dispatch_blocked.any\000other\000Memory cluster signals to block micro-op dispatch for any reason\000event=9,period=200000,umask=0x20\000\00000\000\000" -/* offset=1557 */ "eist_trans\000other\000Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions\000event=0x3a,period=200000\000\00000\000\000" -/* offset=1672 */ "hisi_sccl,ddrc\000" -/* offset=1687 */ "uncore_hisi_ddrc.flux_wcmd\000uncore\000DDRC write commands\000event=2\000\00000\000DDRC write commands\000" -/* offset=1773 */ "uncore_cbox\000" -/* offset=1785 */ "unc_cbo_xsnp_response.miss_eviction\000uncore\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000event=0x22,umask=0x81\000\00000\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000" -/* offset=2016 */ "event-hyphen\000uncore\000UNC_CBO_HYPHEN\000event=0xe0\000\00000\000UNC_CBO_HYPHEN\000" -/* offset=2081 */ 
"event-two-hyph\000uncore\000UNC_CBO_TWO_HYPH\000event=0xc0\000\00000\000UNC_CBO_TWO_HYPH\000" -/* offset=2152 */ "hisi_sccl,l3c\000" -/* offset=2166 */ "uncore_hisi_l3c.rd_hit_cpipe\000uncore\000Total read hits\000event=7\000\00000\000Total read hits\000" -/* offset=2246 */ "uncore_imc_free_running\000" -/* offset=2270 */ "uncore_imc_free_running.cache_miss\000uncore\000Total cache misses\000event=0x12\000\00000\000Total cache misses\000" -/* offset=2365 */ "uncore_imc\000" -/* offset=2376 */ "uncore_imc.cache_hits\000uncore\000Total cache hits\000event=0x34\000\00000\000Total cache hits\000" -/* offset=2454 */ "uncore_sys_ddr_pmu\000" -/* offset=2473 */ "sys_ddr_pmu.write_cycles\000uncore\000ddr write-cycles event\000event=0x2b\000v8\00000\000\000" -/* offset=2546 */ "uncore_sys_ccn_pmu\000" -/* offset=2565 */ "sys_ccn_pmu.read_cycles\000uncore\000ccn read-cycles event\000config=0x2c\0000x01\00000\000\000" -/* offset=2639 */ "uncore_sys_cmn_pmu\000" -/* offset=2658 */ "sys_cmn_pmu.hnf_cache_miss\000uncore\000Counts total cache misses in first lookup result (high priority)\000eventid=1,type=5\000(434|436|43c|43a).*\00000\000\000" -/* offset=2798 */ "CPI\000\0001 / IPC\000\000\000\000\000\000\000\00000" -/* offset=2820 */ "IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\00000" -/* offset=2883 */ "Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\00000" -/* offset=3049 */ "dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000" -/* offset=3113 */ "icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000" -/* offset=3180 */ "cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\00000" -/* offset=3251 */ "DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + 
l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\00000" -/* offset=3345 */ "DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\00000" -/* offset=3479 */ "DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\00000" -/* offset=3543 */ "DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\00000" -/* offset=3611 */ "DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\00000" -/* offset=3681 */ "M1\000\000ipc + M2\000\000\000\000\000\000\000\00000" -/* offset=3703 */ "M2\000\000ipc + M1\000\000\000\000\000\000\000\00000" -/* offset=3725 */ "M3\000\0001 / M3\000\000\000\000\000\000\000\00000" -/* offset=3745 */ "L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\00000" +/* offset=5 */ "duration_time\000tool\000Wall clock interval time in nanoseconds\000config=1\000\00000\000\000\000" +/* offset=79 */ "user_time\000tool\000User (non-kernel) time in nanoseconds\000config=2\000\00000\000\000\000" +/* offset=147 */ "system_time\000tool\000System/kernel time in nanoseconds\000config=3\000\00000\000\000\000" +/* offset=213 */ "has_pmem\000tool\0001 if persistent memory installed otherwise 0\000config=4\000\00000\000\000\000" +/* offset=287 */ "num_cores\000tool\000Number of cores. A core consists of 1 or more thread, with each thread being associated with a logical Linux CPU\000config=5\000\00000\000\000\000" +/* offset=430 */ "num_cpus\000tool\000Number of logical Linux CPUs. There may be multiple such CPUs on a core\000config=6\000\00000\000\000\000" +/* offset=531 */ "num_cpus_online\000tool\000Number of online logical Linux CPUs. There may be multiple such CPUs on a core\000config=7\000\00000\000\000\000" +/* offset=646 */ "num_dies\000tool\000Number of dies. 
Each die has 1 or more cores\000config=8\000\00000\000\000\000" +/* offset=720 */ "num_packages\000tool\000Number of packages. Each package has 1 or more die\000config=9\000\00000\000\000\000" +/* offset=804 */ "slots\000tool\000Number of functional units that in parallel can execute parts of an instruction\000config=0xa\000\00000\000\000\000" +/* offset=912 */ "smt_on\000tool\0001 if simultaneous multithreading (aka hyperthreading) is enable otherwise 0\000config=0xb\000\00000\000\000\000" +/* offset=1017 */ "system_tsc_freq\000tool\000The amount a Time Stamp Counter (TSC) increases per second\000config=0xc\000\00000\000\000\000" +/* offset=1114 */ "default_core\000" +/* offset=1127 */ "bp_l1_btb_correct\000branch\000L1 BTB Correction\000event=0x8a\000\00000\000\000\000" +/* offset=1187 */ "bp_l2_btb_correct\000branch\000L2 BTB Correction\000event=0x8b\000\00000\000\000\000" +/* offset=1247 */ "l3_cache_rd\000cache\000L3 cache access, read\000event=0x40\000\00000\000Attributable Level 3 cache access, read\000\000" +/* offset=1343 */ "segment_reg_loads.any\000other\000Number of segment register loads\000event=6,period=200000,umask=0x80\000\00000\000\0000,1\000" +/* offset=1446 */ "dispatch_blocked.any\000other\000Memory cluster signals to block micro-op dispatch for any reason\000event=9,period=200000,umask=0x20\000\00000\000\0000,1\000" +/* offset=1580 */ "eist_trans\000other\000Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions\000event=0x3a,period=200000\000\00000\000\0000,1\000" +/* offset=1699 */ "hisi_sccl,ddrc\000" +/* offset=1714 */ "uncore_hisi_ddrc.flux_wcmd\000uncore\000DDRC write commands\000event=2\000\00000\000DDRC write commands\000\000" +/* offset=1801 */ "uncore_cbox\000" +/* offset=1813 */ "unc_cbo_xsnp_response.miss_eviction\000uncore\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000event=0x22,umask=0x81\000\00000\000A cross-core snoop resulted from L3 Eviction which misses in some processor 
core\0000,1\000" +/* offset=2048 */ "event-hyphen\000uncore\000UNC_CBO_HYPHEN\000event=0xe0\000\00000\000UNC_CBO_HYPHEN\000\000" +/* offset=2114 */ "event-two-hyph\000uncore\000UNC_CBO_TWO_HYPH\000event=0xc0\000\00000\000UNC_CBO_TWO_HYPH\000\000" +/* offset=2186 */ "hisi_sccl,l3c\000" +/* offset=2200 */ "uncore_hisi_l3c.rd_hit_cpipe\000uncore\000Total read hits\000event=7\000\00000\000Total read hits\000\000" +/* offset=2281 */ "uncore_imc_free_running\000" +/* offset=2305 */ "uncore_imc_free_running.cache_miss\000uncore\000Total cache misses\000event=0x12\000\00000\000Total cache misses\000\000" +/* offset=2401 */ "uncore_imc\000" +/* offset=2412 */ "uncore_imc.cache_hits\000uncore\000Total cache hits\000event=0x34\000\00000\000Total cache hits\000\000" +/* offset=2491 */ "uncore_sys_ddr_pmu\000" +/* offset=2510 */ "sys_ddr_pmu.write_cycles\000uncore\000ddr write-cycles event\000event=0x2b\000v8\00000\000\000\000" +/* offset=2584 */ "uncore_sys_ccn_pmu\000" +/* offset=2603 */ "sys_ccn_pmu.read_cycles\000uncore\000ccn read-cycles event\000config=0x2c\0000x01\00000\000\000\000" +/* offset=2678 */ "uncore_sys_cmn_pmu\000" +/* offset=2697 */ "sys_cmn_pmu.hnf_cache_miss\000uncore\000Counts total cache misses in first lookup result (high priority)\000eventid=1,type=5\000(434|436|43c|43a).*\00000\000\000\000" +/* offset=2838 */ "CPI\000\0001 / IPC\000\000\000\000\000\000\000\00000" +/* offset=2860 */ "IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\00000" +/* offset=2923 */ "Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\00000" +/* offset=3089 */ "dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000" +/* offset=3153 */ "icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000" +/* offset=3220 */ 
"cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\00000" +/* offset=3291 */ "DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\00000" +/* offset=3385 */ "DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\00000" +/* offset=3519 */ "DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\00000" +/* offset=3583 */ "DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\00000" +/* offset=3651 */ "DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\00000" +/* offset=3721 */ "M1\000\000ipc + M2\000\000\000\000\000\000\000\00000" +/* offset=3743 */ "M2\000\000ipc + M1\000\000\000\000\000\000\000\00000" +/* offset=3765 */ "M3\000\0001 / M3\000\000\000\000\000\000\000\00000" +/* offset=3785 */ "L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\00000" ; static const struct compact_pmu_event pmu_events__common_tool[] = { -{ 5 }, /* duration_time\000tool\000Wall clock interval time in nanoseconds\000config=1\000\00000\000\000 */ -{ 210 }, /* has_pmem\000tool\0001 if persistent memory installed otherwise 0\000config=4\000\00000\000\000 */ -{ 283 }, /* num_cores\000tool\000Number of cores. A core consists of 1 or more thread, with each thread being associated with a logical Linux CPU\000config=5\000\00000\000\000 */ -{ 425 }, /* num_cpus\000tool\000Number of logical Linux CPUs. There may be multiple such CPUs on a core\000config=6\000\00000\000\000 */ -{ 525 }, /* num_cpus_online\000tool\000Number of online logical Linux CPUs. There may be multiple such CPUs on a core\000config=7\000\00000\000\000 */ -{ 639 }, /* num_dies\000tool\000Number of dies. 
Each die has 1 or more cores\000config=8\000\00000\000\000 */ -{ 712 }, /* num_packages\000tool\000Number of packages. Each package has 1 or more die\000config=9\000\00000\000\000 */ -{ 795 }, /* slots\000tool\000Number of functional units that in parallel can execute parts of an instruction\000config=0xa\000\00000\000\000 */ -{ 902 }, /* smt_on\000tool\0001 if simultaneous multithreading (aka hyperthreading) is enable otherwise 0\000config=0xb\000\00000\000\000 */ -{ 145 }, /* system_time\000tool\000System/kernel time in nanoseconds\000config=3\000\00000\000\000 */ -{ 1006 }, /* system_tsc_freq\000tool\000The amount a Time Stamp Counter (TSC) increases per second\000config=0xc\000\00000\000\000 */ -{ 78 }, /* user_time\000tool\000User (non-kernel) time in nanoseconds\000config=2\000\00000\000\000 */ +{ 5 }, /* duration_time\000tool\000Wall clock interval time in nanoseconds\000config=1\000\00000\000\000\000 */ +{ 213 }, /* has_pmem\000tool\0001 if persistent memory installed otherwise 0\000config=4\000\00000\000\000\000 */ +{ 287 }, /* num_cores\000tool\000Number of cores. A core consists of 1 or more thread, with each thread being associated with a logical Linux CPU\000config=5\000\00000\000\000\000 */ +{ 430 }, /* num_cpus\000tool\000Number of logical Linux CPUs. There may be multiple such CPUs on a core\000config=6\000\00000\000\000\000 */ +{ 531 }, /* num_cpus_online\000tool\000Number of online logical Linux CPUs. There may be multiple such CPUs on a core\000config=7\000\00000\000\000\000 */ +{ 646 }, /* num_dies\000tool\000Number of dies. Each die has 1 or more cores\000config=8\000\00000\000\000\000 */ +{ 720 }, /* num_packages\000tool\000Number of packages. 
Each package has 1 or more die\000config=9\000\00000\000\000\000 */ +{ 804 }, /* slots\000tool\000Number of functional units that in parallel can execute parts of an instruction\000config=0xa\000\00000\000\000\000 */ +{ 912 }, /* smt_on\000tool\0001 if simultaneous multithreading (aka hyperthreading) is enable otherwise 0\000config=0xb\000\00000\000\000\000 */ +{ 147 }, /* system_time\000tool\000System/kernel time in nanoseconds\000config=3\000\00000\000\000\000 */ +{ 1017 }, /* system_tsc_freq\000tool\000The amount a Time Stamp Counter (TSC) increases per second\000config=0xc\000\00000\000\000\000 */ +{ 79 }, /* user_time\000tool\000User (non-kernel) time in nanoseconds\000config=2\000\00000\000\000\000 */ }; @@ -99,29 +99,29 @@ const struct pmu_table_entry pmu_events__common[] = { }; static const struct compact_pmu_event pmu_events__test_soc_cpu_default_core[] = { -{ 1115 }, /* bp_l1_btb_correct\000branch\000L1 BTB Correction\000event=0x8a\000\00000\000\000 */ -{ 1174 }, /* bp_l2_btb_correct\000branch\000L2 BTB Correction\000event=0x8b\000\00000\000\000 */ -{ 1427 }, /* dispatch_blocked.any\000other\000Memory cluster signals to block micro-op dispatch for any reason\000event=9,period=200000,umask=0x20\000\00000\000\000 */ -{ 1557 }, /* eist_trans\000other\000Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions\000event=0x3a,period=200000\000\00000\000\000 */ -{ 1233 }, /* l3_cache_rd\000cache\000L3 cache access, read\000event=0x40\000\00000\000Attributable Level 3 cache access, read\000 */ -{ 1328 }, /* segment_reg_loads.any\000other\000Number of segment register loads\000event=6,period=200000,umask=0x80\000\00000\000\000 */ +{ 1127 }, /* bp_l1_btb_correct\000branch\000L1 BTB Correction\000event=0x8a\000\00000\000\000\000 */ +{ 1187 }, /* bp_l2_btb_correct\000branch\000L2 BTB Correction\000event=0x8b\000\00000\000\000\000 */ +{ 1446 }, /* dispatch_blocked.any\000other\000Memory cluster signals to block micro-op dispatch for any 
reason\000event=9,period=200000,umask=0x20\000\00000\000\0000,1\000 */ +{ 1580 }, /* eist_trans\000other\000Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions\000event=0x3a,period=200000\000\00000\000\0000,1\000 */ +{ 1247 }, /* l3_cache_rd\000cache\000L3 cache access, read\000event=0x40\000\00000\000Attributable Level 3 cache access, read\000\000 */ +{ 1343 }, /* segment_reg_loads.any\000other\000Number of segment register loads\000event=6,period=200000,umask=0x80\000\00000\000\0000,1\000 */ }; static const struct compact_pmu_event pmu_events__test_soc_cpu_hisi_sccl_ddrc[] = { -{ 1687 }, /* uncore_hisi_ddrc.flux_wcmd\000uncore\000DDRC write commands\000event=2\000\00000\000DDRC write commands\000 */ +{ 1714 }, /* uncore_hisi_ddrc.flux_wcmd\000uncore\000DDRC write commands\000event=2\000\00000\000DDRC write commands\000\000 */ }; static const struct compact_pmu_event pmu_events__test_soc_cpu_hisi_sccl_l3c[] = { -{ 2166 }, /* uncore_hisi_l3c.rd_hit_cpipe\000uncore\000Total read hits\000event=7\000\00000\000Total read hits\000 */ +{ 2200 }, /* uncore_hisi_l3c.rd_hit_cpipe\000uncore\000Total read hits\000event=7\000\00000\000Total read hits\000\000 */ }; static const struct compact_pmu_event pmu_events__test_soc_cpu_uncore_cbox[] = { -{ 2016 }, /* event-hyphen\000uncore\000UNC_CBO_HYPHEN\000event=0xe0\000\00000\000UNC_CBO_HYPHEN\000 */ -{ 2081 }, /* event-two-hyph\000uncore\000UNC_CBO_TWO_HYPH\000event=0xc0\000\00000\000UNC_CBO_TWO_HYPH\000 */ -{ 1785 }, /* unc_cbo_xsnp_response.miss_eviction\000uncore\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000event=0x22,umask=0x81\000\00000\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000 */ +{ 2048 }, /* event-hyphen\000uncore\000UNC_CBO_HYPHEN\000event=0xe0\000\00000\000UNC_CBO_HYPHEN\000\000 */ +{ 2114 }, /* event-two-hyph\000uncore\000UNC_CBO_TWO_HYPH\000event=0xc0\000\00000\000UNC_CBO_TWO_HYPH\000\000 */ +{ 1813 }, /* 
unc_cbo_xsnp_response.miss_eviction\000uncore\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000event=0x22,umask=0x81\000\00000\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\0000,1\000 */ }; static const struct compact_pmu_event pmu_events__test_soc_cpu_uncore_imc[] = { -{ 2376 }, /* uncore_imc.cache_hits\000uncore\000Total cache hits\000event=0x34\000\00000\000Total cache hits\000 */ +{ 2412 }, /* uncore_imc.cache_hits\000uncore\000Total cache hits\000event=0x34\000\00000\000Total cache hits\000\000 */ }; static const struct compact_pmu_event pmu_events__test_soc_cpu_uncore_imc_free_running[] = { -{ 2270 }, /* uncore_imc_free_running.cache_miss\000uncore\000Total cache misses\000event=0x12\000\00000\000Total cache misses\000 */ +{ 2305 }, /* uncore_imc_free_running.cache_miss\000uncore\000Total cache misses\000event=0x12\000\00000\000Total cache misses\000\000 */ }; @@ -129,51 +129,51 @@ const struct pmu_table_entry pmu_events__test_soc_cpu[] = { { .entries = pmu_events__test_soc_cpu_default_core, .num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_default_core), - .pmu_name = { 1102 /* default_core\000 */ }, + .pmu_name = { 1114 /* default_core\000 */ }, }, { .entries = pmu_events__test_soc_cpu_hisi_sccl_ddrc, .num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_hisi_sccl_ddrc), - .pmu_name = { 1672 /* hisi_sccl,ddrc\000 */ }, + .pmu_name = { 1699 /* hisi_sccl,ddrc\000 */ }, }, { .entries = pmu_events__test_soc_cpu_hisi_sccl_l3c, .num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_hisi_sccl_l3c), - .pmu_name = { 2152 /* hisi_sccl,l3c\000 */ }, + .pmu_name = { 2186 /* hisi_sccl,l3c\000 */ }, }, { .entries = pmu_events__test_soc_cpu_uncore_cbox, .num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_uncore_cbox), - .pmu_name = { 1773 /* uncore_cbox\000 */ }, + .pmu_name = { 1801 /* uncore_cbox\000 */ }, }, { .entries = pmu_events__test_soc_cpu_uncore_imc, .num_entries = 
ARRAY_SIZE(pmu_events__test_soc_cpu_uncore_imc), - .pmu_name = { 2365 /* uncore_imc\000 */ }, + .pmu_name = { 2401 /* uncore_imc\000 */ }, }, { .entries = pmu_events__test_soc_cpu_uncore_imc_free_running, .num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_uncore_imc_free_running), - .pmu_name = { 2246 /* uncore_imc_free_running\000 */ }, + .pmu_name = { 2281 /* uncore_imc_free_running\000 */ }, }, }; static const struct compact_pmu_event pmu_metrics__test_soc_cpu_default_core[] = { -{ 2798 }, /* CPI\000\0001 / IPC\000\000\000\000\000\000\000\00000 */ -{ 3479 }, /* DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\00000 */ -{ 3251 }, /* DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\00000 */ -{ 3345 }, /* DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\00000 */ -{ 3543 }, /* DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\00000 */ -{ 3611 }, /* DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\00000 */ -{ 2883 }, /* Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\00000 */ -{ 2820 }, /* IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\00000 */ -{ 3745 }, /* L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\00000 */ -{ 3681 }, /* M1\000\000ipc + M2\000\000\000\000\000\000\000\00000 */ -{ 3703 }, /* M2\000\000ipc + M1\000\000\000\000\000\000\000\00000 */ -{ 3725 }, /* M3\000\0001 / M3\000\000\000\000\000\000\000\00000 */ -{ 3180 }, /* cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\00000 */ 
-{ 3049 }, /* dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000 */ -{ 3113 }, /* icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000 */ +{ 2838 }, /* CPI\000\0001 / IPC\000\000\000\000\000\000\000\00000 */ +{ 3519 }, /* DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\00000 */ +{ 3291 }, /* DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\00000 */ +{ 3385 }, /* DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\00000 */ +{ 3583 }, /* DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\00000 */ +{ 3651 }, /* DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\00000 */ +{ 2923 }, /* Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\00000 */ +{ 2860 }, /* IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\00000 */ +{ 3785 }, /* L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\00000 */ +{ 3721 }, /* M1\000\000ipc + M2\000\000\000\000\000\000\000\00000 */ +{ 3743 }, /* M2\000\000ipc + M1\000\000\000\000\000\000\000\00000 */ +{ 3765 }, /* M3\000\0001 / M3\000\000\000\000\000\000\000\00000 */ +{ 3220 }, /* cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\00000 */ +{ 3089 }, /* dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000 */ +{ 3153 }, /* icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000 */ }; @@ -181,18 +181,18 @@ 
const struct pmu_table_entry pmu_metrics__test_soc_cpu[] = { { .entries = pmu_metrics__test_soc_cpu_default_core, .num_entries = ARRAY_SIZE(pmu_metrics__test_soc_cpu_default_core), - .pmu_name = { 1102 /* default_core\000 */ }, + .pmu_name = { 1114 /* default_core\000 */ }, }, }; static const struct compact_pmu_event pmu_events__test_soc_sys_uncore_sys_ccn_pmu[] = { -{ 2565 }, /* sys_ccn_pmu.read_cycles\000uncore\000ccn read-cycles event\000config=0x2c\0000x01\00000\000\000 */ +{ 2603 }, /* sys_ccn_pmu.read_cycles\000uncore\000ccn read-cycles event\000config=0x2c\0000x01\00000\000\000\000 */ }; static const struct compact_pmu_event pmu_events__test_soc_sys_uncore_sys_cmn_pmu[] = { -{ 2658 }, /* sys_cmn_pmu.hnf_cache_miss\000uncore\000Counts total cache misses in first lookup result (high priority)\000eventid=1,type=5\000(434|436|43c|43a).*\00000\000\000 */ +{ 2697 }, /* sys_cmn_pmu.hnf_cache_miss\000uncore\000Counts total cache misses in first lookup result (high priority)\000eventid=1,type=5\000(434|436|43c|43a).*\00000\000\000\000 */ }; static const struct compact_pmu_event pmu_events__test_soc_sys_uncore_sys_ddr_pmu[] = { -{ 2473 }, /* sys_ddr_pmu.write_cycles\000uncore\000ddr write-cycles event\000event=0x2b\000v8\00000\000\000 */ +{ 2510 }, /* sys_ddr_pmu.write_cycles\000uncore\000ddr write-cycles event\000event=0x2b\000v8\00000\000\000\000 */ }; @@ -200,17 +200,17 @@ const struct pmu_table_entry pmu_events__test_soc_sys[] = { { .entries = pmu_events__test_soc_sys_uncore_sys_ccn_pmu, .num_entries = ARRAY_SIZE(pmu_events__test_soc_sys_uncore_sys_ccn_pmu), - .pmu_name = { 2546 /* uncore_sys_ccn_pmu\000 */ }, + .pmu_name = { 2584 /* uncore_sys_ccn_pmu\000 */ }, }, { .entries = pmu_events__test_soc_sys_uncore_sys_cmn_pmu, .num_entries = ARRAY_SIZE(pmu_events__test_soc_sys_uncore_sys_cmn_pmu), - .pmu_name = { 2639 /* uncore_sys_cmn_pmu\000 */ }, + .pmu_name = { 2678 /* uncore_sys_cmn_pmu\000 */ }, }, { .entries = pmu_events__test_soc_sys_uncore_sys_ddr_pmu, 
.num_entries = ARRAY_SIZE(pmu_events__test_soc_sys_uncore_sys_ddr_pmu), - .pmu_name = { 2454 /* uncore_sys_ddr_pmu\000 */ }, + .pmu_name = { 2491 /* uncore_sys_ddr_pmu\000 */ }, }, }; @@ -227,6 +227,12 @@ struct pmu_metrics_table { uint32_t num_pmus; }; +/* Struct used to make the PMU counter layout table implementation opaque to callers. */ +struct pmu_layouts_table { + const struct compact_pmu_event *entries; + size_t length; +}; + /* * Map a CPU to its table of PMU events. The CPU is identified by the * cpuid field, which is an arch-specific identifier for the CPU. @@ -240,6 +246,7 @@ struct pmu_events_map { const char *cpuid; struct pmu_events_table event_table; struct pmu_metrics_table metric_table; + struct pmu_layouts_table layout_table; }; /* @@ -273,6 +280,7 @@ const struct pmu_events_map pmu_events_map[] = { .cpuid = 0, .event_table = { 0, 0 }, .metric_table = { 0, 0 }, + .layout_table = { 0, 0 }, } }; @@ -317,6 +325,8 @@ static void decompress_event(int offset, struct pmu_event *pe) pe->unit = (*p == '\0' ? NULL : p); while (*p++); pe->long_desc = (*p == '\0' ? NULL : p); + while (*p++); + pe->counters_list = (*p == '\0' ? NULL : p); } static void decompress_metric(int offset, struct pmu_metric *pm) @@ -348,6 +358,19 @@ static void decompress_metric(int offset, struct pmu_metric *pm) pm->event_grouping = *p - '0'; } +static void decompress_layout(int offset, struct pmu_layout *pm) +{ + const char *p = &big_c_string[offset]; + + pm->pmu = (*p == '\0' ? NULL : p); + while (*p++); + pm->desc = (*p == '\0' ? 
NULL : p); + p++; + pm->counters_num_gp = *p - '0'; + p++; + pm->counters_num_fixed = *p - '0'; +} + static int pmu_events_table__for_each_event_pmu(const struct pmu_events_table *table, const struct pmu_table_entry *pmu, pmu_event_iter_fn fn, @@ -503,6 +526,21 @@ int pmu_metrics_table__for_each_metric(const struct pmu_metrics_table *table, return 0; } +int pmu_layouts_table__for_each_layout(const struct pmu_layouts_table *table, + pmu_layout_iter_fn fn, + void *data) { + for (size_t i = 0; i < table->length; i++) { + struct pmu_layout pm; + int ret; + + decompress_layout(table->entries[i].offset, &pm); + ret = fn(&pm, data); + if (ret) + return ret; + } + return 0; +} + static const struct pmu_events_map *map_for_cpu(struct perf_cpu cpu) { static struct { @@ -595,6 +633,34 @@ const struct pmu_metrics_table *pmu_metrics_table__find(void) return map ? &map->metric_table : NULL; } +const struct pmu_layouts_table *perf_pmu__find_layouts_table(void) +{ + const struct pmu_layouts_table *table = NULL; + struct perf_cpu cpu = {-1}; + char *cpuid = get_cpuid_allow_env_override(cpu); + int i; + + /* on some platforms which uses cpus map, cpuid can be NULL for + * PMUs other than CORE PMUs. 
+ */ + if (!cpuid) + return NULL; + + i = 0; + for (;;) { + const struct pmu_events_map *map = &pmu_events_map[i++]; + if (!map->arch) + break; + + if (!strcmp_cpuid_str(map->cpuid, cpuid)) { + table = &map->layout_table; + break; + } + } + free(cpuid); + return table; +} + const struct pmu_events_table *find_core_events_table(const char *arch, const char *cpuid) { for (const struct pmu_events_map *tables = &pmu_events_map[0]; @@ -616,6 +682,16 @@ const struct pmu_metrics_table *find_core_metrics_table(const char *arch, const } return NULL; } +const struct pmu_layouts_table *find_core_layouts_table(const char *arch, const char *cpuid) +{ + for (const struct pmu_events_map *tables = &pmu_events_map[0]; + tables->arch; + tables++) { + if (!strcmp(tables->arch, arch) && !strcmp_cpuid_str(tables->cpuid, cpuid)) + return &tables->layout_table; + } + return NULL; +} int pmu_for_each_core_event(pmu_event_iter_fn fn, void *data) { @@ -644,6 +720,19 @@ int pmu_for_each_core_metric(pmu_metric_iter_fn fn, void *data) return 0; } +int pmu_for_each_core_layout(pmu_layout_iter_fn fn, void *data) +{ + for (const struct pmu_events_map *tables = &pmu_events_map[0]; + tables->arch; + tables++) { + int ret = pmu_layouts_table__for_each_layout(&tables->layout_table, fn, data); + + if (ret) + return ret; + } + return 0; +} + const struct pmu_events_table *find_sys_events_table(const char *name) { for (const struct pmu_sys_events *tables = &pmu_sys_event_tables[0]; diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py index 3e204700b59a..fa7c466a5ef3 100755 --- a/tools/perf/pmu-events/jevents.py +++ b/tools/perf/pmu-events/jevents.py @@ -23,6 +23,8 @@ _metric_tables = [] _sys_metric_tables = [] # Mapping between sys event table names and sys metric table names. _sys_event_table_to_metric_table_mapping = {} +# List of regular PMU counter layout tables. +_pmu_layouts_tables = [] # Map from an event name to an architecture standard # JsonEvent. 
Architecture standard events are in json files in the top # f'{_args.starting_dir}/{_args.arch}' directory. @@ -31,6 +33,10 @@ _arch_std_events = {} _pending_events = [] # Name of events table to be written out _pending_events_tblname = None +# PMU counter layout to write out when the layout table is closed +_pending_pmu_counts = [] +# Name of PMU counter layout table to be written out +_pending_pmu_counts_tblname = None # Metrics to write out when the table is closed _pending_metrics = [] # Name of metrics table to be written out @@ -51,6 +57,11 @@ _json_event_attributes = [ 'long_desc' ] +# Attributes that are in pmu_unit_layout. +_json_layout_attributes = [ + 'pmu', 'desc' +] + # Attributes that are in pmu_metric rather than pmu_event. _json_metric_attributes = [ 'metric_name', 'metric_group', 'metric_expr', 'metric_threshold', @@ -265,7 +276,7 @@ class JsonEvent: def unit_to_pmu(unit: str) -> Optional[str]: """Convert a JSON Unit to Linux PMU name.""" - if not unit: + if not unit or unit == "core": return 'default_core' # Comment brought over from jevents.c: # it's not realistic to keep adding these, we need something more scalable ... @@ -336,6 +347,19 @@ class JsonEvent: if 'Errata' in jd: extra_desc += ' Spec update: ' + jd['Errata'] self.pmu = unit_to_pmu(jd.get('Unit')) + # The list of counter(s) the event could be collected with + class Counter: + gp = str() + fixed = str() + self.counters = {'list': str(), 'num': Counter()} + self.counters['list'] = jd.get('Counter') + # Number of generic counter + self.counters['num'].gp = jd.get('CountersNumGeneric') + # Number of fixed counter + self.counters['num'].fixed = jd.get('CountersNumFixed') + # If the event uses an MSR, other event uses the same MSR could not be + # schedule to collect at the same time. 
+ self.msr = jd.get('MSRIndex') filter = jd.get('Filter') self.unit = jd.get('ScaleUnit') self.perpkg = jd.get('PerPkg') @@ -411,8 +435,20 @@ class JsonEvent: s += f'\t{attr} = {value},\n' return s + '}' - def build_c_string(self, metric: bool) -> str: + def build_c_string(self, metric: bool, layout: bool) -> str: s = '' + if layout: + for attr in _json_layout_attributes: + x = getattr(self, attr) + if attr in _json_enum_attributes: + s += x if x else '0' + else: + s += f'{x}\\000' if x else '\\000' + x = self.counters['num'].gp + s += x if x else '0' + x = self.counters['num'].fixed + s += x if x else '0' + return s for attr in _json_metric_attributes if metric else _json_event_attributes: x = getattr(self, attr) if metric and x and attr == 'metric_expr': @@ -425,15 +461,18 @@ class JsonEvent: s += x if x else '0' else: s += f'{x}\\000' if x else '\\000' + if not metric: + x = self.counters['list'] + s += f'{x}\\000' if x else '\\000' return s - def to_c_string(self, metric: bool) -> str: + def to_c_string(self, metric: bool, layout: bool) -> str: """Representation of the event as a C struct initializer.""" def fix_comment(s: str) -> str: return s.replace('*/', r'\*\/') - s = self.build_c_string(metric) + s = self.build_c_string(metric, layout) return f'{{ { _bcs.offsets[s] } }}, /* {fix_comment(s)} */\n' @@ -472,6 +511,8 @@ def preprocess_arch_std_files(archpath: str) -> None: _arch_std_events[event.name.lower()] = event if event.metric_name: _arch_std_events[event.metric_name.lower()] = event + if event.counters['num'].gp: + _arch_std_events[event.pmu.lower()] = event except Exception as e: raise RuntimeError(f'Failure processing \'{item.name}\' in \'{archpath}\'') from e @@ -483,6 +524,8 @@ def add_events_table_entries(item: os.DirEntry, topic: str) -> None: _pending_events.append(e) if e.metric_name: _pending_metrics.append(e) + if e.counters['num'].gp: + _pending_pmu_counts.append(e) def print_pending_events() -> None: @@ -526,8 +569,8 @@ def 
print_pending_events() -> None: last_pmu = event.pmu pmus.add((event.pmu, pmu_name)) - _args.output_file.write(event.to_c_string(metric=False)) last_name = event.name + _args.output_file.write(event.to_c_string(metric=False, layout=False)) _pending_events = [] _args.output_file.write(f""" @@ -582,7 +625,7 @@ def print_pending_metrics() -> None: last_pmu = metric.pmu pmus.add((metric.pmu, pmu_name)) - _args.output_file.write(metric.to_c_string(metric=True)) + _args.output_file.write(metric.to_c_string(metric=True, layout=False)) _pending_metrics = [] _args.output_file.write(f""" @@ -600,6 +643,35 @@ const struct pmu_table_entry {_pending_metrics_tblname}[] = {{ """) _args.output_file.write('};\n\n') +def print_pending_pmu_counter_layout_table() -> None: + '''Print counter layout data from counter.json file to counter layout table in + c-string''' + + def pmu_counts_cmp_key(j: JsonEvent) -> Tuple[bool, str, str]: + def fix_none(s: Optional[str]) -> str: + if s is None: + return '' + return s + + return (j.desc is not None, fix_none(j.pmu)) + + global _pending_pmu_counts + if not _pending_pmu_counts: + return + + global _pending_pmu_counts_tblname + global pmu_layouts_tables + _pmu_layouts_tables.append(_pending_pmu_counts_tblname) + + _args.output_file.write( + f'static const struct compact_pmu_event {_pending_pmu_counts_tblname}[] = {{\n') + + for pmu_layout in sorted(_pending_pmu_counts, key=pmu_counts_cmp_key): + _args.output_file.write(pmu_layout.to_c_string(metric=False, layout=True)) + _pending_pmu_counts = [] + + _args.output_file.write('};\n\n') + def get_topic(topic: str) -> str: if topic.endswith('metrics.json'): return 'metrics' @@ -636,10 +708,12 @@ def preprocess_one_file(parents: Sequence[str], item: os.DirEntry) -> None: pmu_name = f"{event.pmu}\\000" if event.name: _bcs.add(pmu_name, metric=False) - _bcs.add(event.build_c_string(metric=False), metric=False) + _bcs.add(event.build_c_string(metric=False, layout=False), metric=False) if 
event.metric_name: _bcs.add(pmu_name, metric=True) - _bcs.add(event.build_c_string(metric=True), metric=True) + _bcs.add(event.build_c_string(metric=True, layout=False), metric=True) + if event.counters['num'].gp: + _bcs.add(event.build_c_string(metric=False, layout=True), metric=False) def process_one_file(parents: Sequence[str], item: os.DirEntry) -> None: """Process a JSON file during the main walk.""" @@ -656,11 +730,14 @@ def process_one_file(parents: Sequence[str], item: os.DirEntry) -> None: if item.is_dir() and is_leaf_dir_ignoring_sys(item.path): print_pending_events() print_pending_metrics() + print_pending_pmu_counter_layout_table() global _pending_events_tblname _pending_events_tblname = file_name_to_table_name('pmu_events_', parents, item.name) global _pending_metrics_tblname _pending_metrics_tblname = file_name_to_table_name('pmu_metrics_', parents, item.name) + global _pending_pmu_counts_tblname + _pending_pmu_counts_tblname = file_name_to_table_name('pmu_layouts_', parents, item.name) if item.name == 'sys': _sys_event_table_to_metric_table_mapping[_pending_events_tblname] = _pending_metrics_tblname @@ -694,6 +771,12 @@ struct pmu_metrics_table { uint32_t num_pmus; }; +/* Struct used to make the PMU counter layout table implementation opaque to callers. */ +struct pmu_layouts_table { + const struct compact_pmu_event *entries; + size_t length; +}; + /* * Map a CPU to its table of PMU events. The CPU is identified by the * cpuid field, which is an arch-specific identifier for the CPU. 
@@ -707,6 +790,7 @@ struct pmu_events_map { const char *cpuid; struct pmu_events_table event_table; struct pmu_metrics_table metric_table; + struct pmu_layouts_table layout_table; }; /* @@ -762,6 +846,12 @@ const struct pmu_events_map pmu_events_map[] = { metric_size = '0' if event_size == '0' and metric_size == '0': continue + layout_tblname = file_name_to_table_name('pmu_layouts_', [], row[2].replace('/', '_')) + if layout_tblname in _pmu_layouts_tables: + layout_size = f'ARRAY_SIZE({layout_tblname})' + else: + layout_tblname = 'NULL' + layout_size = '0' cpuid = row[0].replace('\\', '\\\\') _args.output_file.write(f"""{{ \t.arch = "{arch}", @@ -773,6 +863,10 @@ const struct pmu_events_map pmu_events_map[] = { \t.metric_table = {{ \t\t.pmus = {metric_tblname}, \t\t.num_pmus = {metric_size} +\t}}, +\t.layout_table = {{ +\t\t.entries = {layout_tblname}, +\t\t.length = {layout_size} \t}} }}, """) @@ -783,6 +877,7 @@ const struct pmu_events_map pmu_events_map[] = { \t.cpuid = 0, \t.event_table = { 0, 0 }, \t.metric_table = { 0, 0 }, +\t.layout_table = { 0, 0 }, } }; """) @@ -851,6 +946,9 @@ static void decompress_event(int offset, struct pmu_event *pe) _args.output_file.write('\tp++;') else: _args.output_file.write('\twhile (*p++);') + _args.output_file.write('\twhile (*p++);') + _args.output_file.write(f'\n\tpe->counters_list = ') + _args.output_file.write("(*p == '\\0' ? NULL : p);\n") _args.output_file.write("""} static void decompress_metric(int offset, struct pmu_metric *pm) @@ -871,6 +969,30 @@ static void decompress_metric(int offset, struct pmu_metric *pm) _args.output_file.write('\twhile (*p++);') _args.output_file.write("""} +static void decompress_layout(int offset, struct pmu_layout *pm) +{ +\tconst char *p = &big_c_string[offset]; +""") + for attr in _json_layout_attributes: + _args.output_file.write(f'\n\tpm->{attr} = ') + if attr in _json_enum_attributes: + _args.output_file.write("*p - '0';\n") + else: + _args.output_file.write("(*p == '\\0' ? 
NULL : p);\n") + if attr == _json_layout_attributes[-1]: + continue + if attr in _json_enum_attributes: + _args.output_file.write('\tp++;') + else: + _args.output_file.write('\twhile (*p++);') + _args.output_file.write('\tp++;') + _args.output_file.write(f'\n\tpm->counters_num_gp = ') + _args.output_file.write("*p - '0';\n") + _args.output_file.write('\tp++;') + _args.output_file.write(f'\n\tpm->counters_num_fixed = ') + _args.output_file.write("*p - '0';\n") + _args.output_file.write("""} + static int pmu_events_table__for_each_event_pmu(const struct pmu_events_table *table, const struct pmu_table_entry *pmu, pmu_event_iter_fn fn, @@ -1026,6 +1148,21 @@ int pmu_metrics_table__for_each_metric(const struct pmu_metrics_table *table, return 0; } +int pmu_layouts_table__for_each_layout(const struct pmu_layouts_table *table, + pmu_layout_iter_fn fn, + void *data) { + for (size_t i = 0; i < table->length; i++) { + struct pmu_layout pm; + int ret; + + decompress_layout(table->entries[i].offset, &pm); + ret = fn(&pm, data); + if (ret) + return ret; + } + return 0; +} + static const struct pmu_events_map *map_for_cpu(struct perf_cpu cpu) { static struct { @@ -1118,6 +1255,34 @@ const struct pmu_metrics_table *pmu_metrics_table__find(void) return map ? &map->metric_table : NULL; } +const struct pmu_layouts_table *perf_pmu__find_layouts_table(void) +{ + const struct pmu_layouts_table *table = NULL; + struct perf_cpu cpu = {-1}; + char *cpuid = get_cpuid_allow_env_override(cpu); + int i; + + /* on some platforms which uses cpus map, cpuid can be NULL for + * PMUs other than CORE PMUs. 
+ */ + if (!cpuid) + return NULL; + + i = 0; + for (;;) { + const struct pmu_events_map *map = &pmu_events_map[i++]; + if (!map->arch) + break; + + if (!strcmp_cpuid_str(map->cpuid, cpuid)) { + table = &map->layout_table; + break; + } + } + free(cpuid); + return table; +} + const struct pmu_events_table *find_core_events_table(const char *arch, const char *cpuid) { for (const struct pmu_events_map *tables = &pmu_events_map[0]; @@ -1139,6 +1304,16 @@ const struct pmu_metrics_table *find_core_metrics_table(const char *arch, const } return NULL; } +const struct pmu_layouts_table *find_core_layouts_table(const char *arch, const char *cpuid) +{ + for (const struct pmu_events_map *tables = &pmu_events_map[0]; + tables->arch; + tables++) { + if (!strcmp(tables->arch, arch) && !strcmp_cpuid_str(tables->cpuid, cpuid)) + return &tables->layout_table; + } + return NULL; +} int pmu_for_each_core_event(pmu_event_iter_fn fn, void *data) { @@ -1167,6 +1342,19 @@ int pmu_for_each_core_metric(pmu_metric_iter_fn fn, void *data) return 0; } +int pmu_for_each_core_layout(pmu_layout_iter_fn fn, void *data) +{ + for (const struct pmu_events_map *tables = &pmu_events_map[0]; + tables->arch; + tables++) { + int ret = pmu_layouts_table__for_each_layout(&tables->layout_table, fn, data); + + if (ret) + return ret; + } + return 0; +} + const struct pmu_events_table *find_sys_events_table(const char *name) { for (const struct pmu_sys_events *tables = &pmu_sys_event_tables[0]; @@ -1330,6 +1518,7 @@ struct pmu_table_entry { ftw(arch_path, [], process_one_file) print_pending_events() print_pending_metrics() + print_pending_pmu_counter_layout_table() print_mapping_table(archs) print_system_mapping_table() diff --git a/tools/perf/pmu-events/pmu-events.h b/tools/perf/pmu-events/pmu-events.h index 675562e6f770..9a5cbec32513 100644 --- a/tools/perf/pmu-events/pmu-events.h +++ b/tools/perf/pmu-events/pmu-events.h @@ -45,6 +45,11 @@ struct pmu_event { const char *desc; const char *topic; const char 
*long_desc; + /** + * The list of counter(s) the event could be collected on. + * eg., "0,1,2,3,4,5,6,7". + */ + const char *counters_list; const char *pmu; const char *unit; bool perpkg; @@ -67,8 +72,18 @@ struct pmu_metric { enum metric_event_groups event_grouping; }; +struct pmu_layout { + const char *pmu; + const char *desc; + /** Total number of generic counters*/ + int counters_num_gp; + /** Total number of fixed counters. Set to zero if no fixed counter on the unit.*/ + int counters_num_fixed; +}; + struct pmu_events_table; struct pmu_metrics_table; +struct pmu_layouts_table; #define PMU_EVENTS__NOT_FOUND -1000 @@ -80,6 +95,9 @@ typedef int (*pmu_metric_iter_fn)(const struct pmu_metric *pm, const struct pmu_metrics_table *table, void *data); +typedef int (*pmu_layout_iter_fn)(const struct pmu_layout *pm, + void *data); + int pmu_events_table__for_each_event(const struct pmu_events_table *table, struct perf_pmu *pmu, pmu_event_iter_fn fn, @@ -92,10 +110,13 @@ int pmu_events_table__for_each_event(const struct pmu_events_table *table, * search of all tables. 
*/ int pmu_events_table__find_event(const struct pmu_events_table *table, - struct perf_pmu *pmu, - const char *name, - pmu_event_iter_fn fn, - void *data); + struct perf_pmu *pmu, + const char *name, + pmu_event_iter_fn fn, + void *data); +int pmu_layouts_table__for_each_layout(const struct pmu_layouts_table *table, + pmu_layout_iter_fn fn, + void *data); size_t pmu_events_table__num_events(const struct pmu_events_table *table, struct perf_pmu *pmu); @@ -104,10 +125,13 @@ int pmu_metrics_table__for_each_metric(const struct pmu_metrics_table *table, pm const struct pmu_events_table *perf_pmu__find_events_table(struct perf_pmu *pmu); const struct pmu_metrics_table *pmu_metrics_table__find(void); +const struct pmu_layouts_table *perf_pmu__find_layouts_table(void); const struct pmu_events_table *find_core_events_table(const char *arch, const char *cpuid); const struct pmu_metrics_table *find_core_metrics_table(const char *arch, const char *cpuid); +const struct pmu_layouts_table *find_core_layouts_table(const char *arch, const char *cpuid); int pmu_for_each_core_event(pmu_event_iter_fn fn, void *data); int pmu_for_each_core_metric(pmu_metric_iter_fn fn, void *data); +int pmu_for_each_core_layout(pmu_layout_iter_fn fn, void *data); const struct pmu_events_table *find_sys_events_table(const char *name); const struct pmu_metrics_table *find_sys_metrics_table(const char *name); -- 2.43.0 From atishp at rivosinc.com Thu Mar 27 12:35:44 2025 From: atishp at rivosinc.com (Atish Patra) Date: Thu, 27 Mar 2025 12:35:44 -0700 Subject: [PATCH v5 03/21] RISC-V: Add Sxcsrind ISA extension definition and parsing In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> Message-ID: <20250327-counter_delegation-v5-3-1ee538468d1b@rivosinc.com> The S[m|s]csrind extension extends the indirect CSR access mechanism defined in Smaia/Ssaia extensions. This patch just enables the definition and parsing. 
Signed-off-by: Atish Patra
---
 arch/riscv/include/asm/hwcap.h | 5 +++++
 arch/riscv/kernel/cpufeature.c | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index 869da082252a..3d6e706fc5b2 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -100,6 +100,8 @@
 #define RISCV_ISA_EXT_ZICCRSE 91
 #define RISCV_ISA_EXT_SVADE 92
 #define RISCV_ISA_EXT_SVADU 93
+#define RISCV_ISA_EXT_SSCSRIND 94
+#define RISCV_ISA_EXT_SMCSRIND 95
 
 #define RISCV_ISA_EXT_XLINUXENVCFG 127
 
@@ -109,9 +111,12 @@
 #ifdef CONFIG_RISCV_M_MODE
 #define RISCV_ISA_EXT_SxAIA RISCV_ISA_EXT_SMAIA
 #define RISCV_ISA_EXT_SUPM RISCV_ISA_EXT_SMNPM
+#define RISCV_ISA_EXT_SxCSRIND RISCV_ISA_EXT_SMCSRIND
 #else
 #define RISCV_ISA_EXT_SxAIA RISCV_ISA_EXT_SSAIA
 #define RISCV_ISA_EXT_SUPM RISCV_ISA_EXT_SSNPM
+#define RISCV_ISA_EXT_SxAIA RISCV_ISA_EXT_SSAIA
+#define RISCV_ISA_EXT_SxCSRIND RISCV_ISA_EXT_SSCSRIND
 #endif
 
 #endif /* _ASM_RISCV_HWCAP_H */
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 40ac72e407b6..eddbab038301 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -394,11 +394,13 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
 	__RISCV_ISA_EXT_BUNDLE(zvksg, riscv_zvksg_bundled_exts),
 	__RISCV_ISA_EXT_DATA(zvkt, RISCV_ISA_EXT_ZVKT),
 	__RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA),
+	__RISCV_ISA_EXT_DATA(smcsrind, RISCV_ISA_EXT_SMCSRIND),
 	__RISCV_ISA_EXT_DATA(smmpm, RISCV_ISA_EXT_SMMPM),
 	__RISCV_ISA_EXT_SUPERSET(smnpm, RISCV_ISA_EXT_SMNPM, riscv_xlinuxenvcfg_exts),
 	__RISCV_ISA_EXT_DATA(smstateen, RISCV_ISA_EXT_SMSTATEEN),
 	__RISCV_ISA_EXT_DATA(ssaia, RISCV_ISA_EXT_SSAIA),
 	__RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF),
+	__RISCV_ISA_EXT_DATA(sscsrind, RISCV_ISA_EXT_SSCSRIND),
 	__RISCV_ISA_EXT_SUPERSET(ssnpm, RISCV_ISA_EXT_SSNPM, riscv_xlinuxenvcfg_exts),
 	__RISCV_ISA_EXT_DATA(sstc, RISCV_ISA_EXT_SSTC),
 	__RISCV_ISA_EXT_DATA(svade,
RISCV_ISA_EXT_SVADE),
-- 
2.43.0

From atishp at rivosinc.com Thu Mar 27 12:35:45 2025
From: atishp at rivosinc.com (Atish Patra)
Date: Thu, 27 Mar 2025 12:35:45 -0700
Subject: [PATCH v5 04/21] dt-bindings: riscv: add Sxcsrind ISA extension description
In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com>
References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com>
Message-ID: <20250327-counter_delegation-v5-4-1ee538468d1b@rivosinc.com>

Add the S[m|s]csrind ISA extension description.

Acked-by: Rob Herring (Arm)
Signed-off-by: Atish Patra
---
 Documentation/devicetree/bindings/riscv/extensions.yaml | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
index a63b994e0763..0520a9d8b1cd 100644
--- a/Documentation/devicetree/bindings/riscv/extensions.yaml
+++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
@@ -128,6 +128,14 @@ properties:
             changes to interrupts as frozen at commit ccbddab ("Merge pull
             request #42 from riscv/jhauser-2023-RC4") of riscv-aia.
 
+        - const: smcsrind
+          description: |
+            The standard Smcsrind supervisor-level extension extends the
+            indirect CSR access mechanism defined by the Smaia extension. This
+            extension allows other ISA extension to use indirect CSR access
+            mechanism in M-mode as ratified in the 20240326 version of the
+            privileged ISA specification.
+
         - const: smmpm
           description: |
             The standard Smmpm extension for M-mode pointer masking as
@@ -146,6 +154,14 @@ properties:
             added by other RISC-V extensions in H/S/VS/U/VU modes and as
             ratified at commit a28bfae (Ratified (#7)) of riscv-state-enable.
 
+        - const: sscsrind
+          description: |
+            The standard Sscsrind supervisor-level extension extends the
+            indirect CSR access mechanism defined by the Ssaia extension.
This + extension allows other ISA extension to use indirect CSR access + mechanism in S-mode as ratified in the 20240326 version of the + privileged ISA specification. + - const: ssaia description: | The standard Ssaia supervisor-level extension for the advanced -- 2.43.0 From atishp at rivosinc.com Thu Mar 27 12:35:46 2025 From: atishp at rivosinc.com (Atish Patra) Date: Thu, 27 Mar 2025 12:35:46 -0700 Subject: [PATCH v5 05/21] RISC-V: Define indirect CSR access helpers In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> Message-ID: <20250327-counter_delegation-v5-5-1ee538468d1b@rivosinc.com> An indirect CSR access requires multiple instructions to read or write a CSR. Add a few helper functions for ease of use. Signed-off-by: Atish Patra --- arch/riscv/include/asm/csr_ind.h | 42 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+) diff --git a/arch/riscv/include/asm/csr_ind.h b/arch/riscv/include/asm/csr_ind.h new file mode 100644 index 000000000000..d36e1e06ed2b --- /dev/null +++ b/arch/riscv/include/asm/csr_ind.h @@ -0,0 +1,42 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (C) 2024 Rivos Inc. 
+ */ + +#ifndef _ASM_RISCV_CSR_IND_H +#define _ASM_RISCV_CSR_IND_H + +#include + +#define csr_ind_read(iregcsr, iselbase, iseloff) ({ \ + unsigned long value = 0; \ + unsigned long flags; \ + local_irq_save(flags); \ + csr_write(CSR_ISELECT, iselbase + iseloff); \ + value = csr_read(iregcsr); \ + local_irq_restore(flags); \ + value; \ +}) + +#define csr_ind_write(iregcsr, iselbase, iseloff, value) ({ \ + unsigned long flags; \ + local_irq_save(flags); \ + csr_write(CSR_ISELECT, iselbase + iseloff); \ + csr_write(iregcsr, value); \ + local_irq_restore(flags); \ +}) + +#define csr_ind_warl(iregcsr, iselbase, iseloff, warl_val) ({ \ + unsigned long old_val = 0, value = 0; \ + unsigned long flags; \ + local_irq_save(flags); \ + csr_write(CSR_ISELECT, iselbase + iseloff); \ + old_val = csr_read(iregcsr); \ + csr_write(iregcsr, warl_val); \ + value = csr_read(iregcsr); \ + csr_write(iregcsr, old_val); \ + local_irq_restore(flags); \ + value; \ +}) + +#endif -- 2.43.0 From atishp at rivosinc.com Thu Mar 27 12:35:47 2025 From: atishp at rivosinc.com (Atish Patra) Date: Thu, 27 Mar 2025 12:35:47 -0700 Subject: [PATCH v5 06/21] RISC-V: Add Smcntrpmf extension parsing In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> Message-ID: <20250327-counter_delegation-v5-6-1ee538468d1b@rivosinc.com> The Smcntrpmf extension allows M-mode to enable privilege mode filtering for the cycle/instret counters. However, the cyclecfg/instretcfg CSRs are available through Ssccfg only if Smcntrpmf is present. That is why the kernel needs to detect the presence of the Smcntrpmf extension and enable privilege mode filtering for the cycle/instret counters. 
Reviewed-by: Clément Léger Signed-off-by: Atish Patra --- arch/riscv/include/asm/hwcap.h | 1 + arch/riscv/kernel/cpufeature.c | 1 + 2 files changed, 2 insertions(+) diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h index 3d6e706fc5b2..b4eddcb57842 100644 --- a/arch/riscv/include/asm/hwcap.h +++ b/arch/riscv/include/asm/hwcap.h @@ -102,6 +102,7 @@ #define RISCV_ISA_EXT_SVADU 93 #define RISCV_ISA_EXT_SSCSRIND 94 #define RISCV_ISA_EXT_SMCSRIND 95 +#define RISCV_ISA_EXT_SMCNTRPMF 96 #define RISCV_ISA_EXT_XLINUXENVCFG 127 diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c index eddbab038301..e3e40cfe7967 100644 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@ -394,6 +394,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = { __RISCV_ISA_EXT_BUNDLE(zvksg, riscv_zvksg_bundled_exts), __RISCV_ISA_EXT_DATA(zvkt, RISCV_ISA_EXT_ZVKT), __RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA), + __RISCV_ISA_EXT_DATA(smcntrpmf, RISCV_ISA_EXT_SMCNTRPMF), __RISCV_ISA_EXT_DATA(smcsrind, RISCV_ISA_EXT_SMCSRIND), __RISCV_ISA_EXT_DATA(smmpm, RISCV_ISA_EXT_SMMPM), __RISCV_ISA_EXT_SUPERSET(smnpm, RISCV_ISA_EXT_SMNPM, riscv_xlinuxenvcfg_exts), -- 2.43.0 From atishp at rivosinc.com Thu Mar 27 12:35:48 2025 From: atishp at rivosinc.com (Atish Patra) Date: Thu, 27 Mar 2025 12:35:48 -0700 Subject: [PATCH v5 07/21] dt-bindings: riscv: add Smcntrpmf ISA extension description In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> Message-ID: <20250327-counter_delegation-v5-7-1ee538468d1b@rivosinc.com> Add the description for Smcntrpmf ISA extension Acked-by: Rob Herring (Arm) Signed-off-by: Atish Patra --- Documentation/devicetree/bindings/riscv/extensions.yaml | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml 
index 0520a9d8b1cd..c2025949295f 100644 --- a/Documentation/devicetree/bindings/riscv/extensions.yaml +++ b/Documentation/devicetree/bindings/riscv/extensions.yaml @@ -136,6 +136,12 @@ properties: mechanism in M-mode as ratified in the 20240326 version of the privileged ISA specification. + - const: smcntrpmf + description: | + The standard Smcntrpmf supervisor-level extension for the machine mode + to enable privilege mode filtering for cycle and instret counters as + ratified in the 20240326 version of the privileged ISA specification. + - const: smmpm description: | The standard Smmpm extension for M-mode pointer masking as -- 2.43.0 From atishp at rivosinc.com Thu Mar 27 12:35:49 2025 From: atishp at rivosinc.com (Atish Patra) Date: Thu, 27 Mar 2025 12:35:49 -0700 Subject: [PATCH v5 08/21] RISC-V: Add Sscfg extension CSR definition In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> Message-ID: <20250327-counter_delegation-v5-8-1ee538468d1b@rivosinc.com> From: Kaiwen Xue This adds the scountinhibit CSR definition and S-mode accessible hpmevent bits defined by smcdeleg/ssccfg. scountinhibit allows S-mode to start/stop counters directly from S-mode without invoking SBI calls to M-mode. It is also used to figure out the counters delegated to S-mode by the M-mode as well. Signed-off-by: Kaiwen Xue Reviewed-by: Cl?ment L?ger --- arch/riscv/include/asm/csr.h | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h index bce56a83c384..3d2d4f886c77 100644 --- a/arch/riscv/include/asm/csr.h +++ b/arch/riscv/include/asm/csr.h @@ -230,6 +230,31 @@ #define SMSTATEEN0_HSENVCFG (_ULL(1) << SMSTATEEN0_HSENVCFG_SHIFT) #define SMSTATEEN0_SSTATEEN0_SHIFT 63 #define SMSTATEEN0_SSTATEEN0 (_ULL(1) << SMSTATEEN0_SSTATEEN0_SHIFT) +/* HPMEVENT bits. 
These are accessible in S-mode via Smcdeleg/Ssccfg */ +#ifdef CONFIG_64BIT +#define HPMEVENT_OF (BIT_ULL(63)) +#define HPMEVENT_MINH (BIT_ULL(62)) +#define HPMEVENT_SINH (BIT_ULL(61)) +#define HPMEVENT_UINH (BIT_ULL(60)) +#define HPMEVENT_VSINH (BIT_ULL(59)) +#define HPMEVENT_VUINH (BIT_ULL(58)) +#else +#define HPMEVENTH_OF (BIT_ULL(31)) +#define HPMEVENTH_MINH (BIT_ULL(30)) +#define HPMEVENTH_SINH (BIT_ULL(29)) +#define HPMEVENTH_UINH (BIT_ULL(28)) +#define HPMEVENTH_VSINH (BIT_ULL(27)) +#define HPMEVENTH_VUINH (BIT_ULL(26)) + +#define HPMEVENT_OF (HPMEVENTH_OF << 32) +#define HPMEVENT_MINH (HPMEVENTH_MINH << 32) +#define HPMEVENT_SINH (HPMEVENTH_SINH << 32) +#define HPMEVENT_UINH (HPMEVENTH_UINH << 32) +#define HPMEVENT_VSINH (HPMEVENTH_VSINH << 32) +#define HPMEVENT_VUINH (HPMEVENTH_VUINH << 32) +#endif + +#define SISELECT_SSCCFG_BASE 0x40 /* mseccfg bits */ #define MSECCFG_PMM ENVCFG_PMM @@ -311,6 +336,7 @@ #define CSR_SCOUNTEREN 0x106 #define CSR_SENVCFG 0x10a #define CSR_SSTATEEN0 0x10c +#define CSR_SCOUNTINHIBIT 0x120 #define CSR_SSCRATCH 0x140 #define CSR_SEPC 0x141 #define CSR_SCAUSE 0x142 -- 2.43.0 From atishp at rivosinc.com Thu Mar 27 12:35:50 2025 From: atishp at rivosinc.com (Atish Patra) Date: Thu, 27 Mar 2025 12:35:50 -0700 Subject: [PATCH v5 09/21] RISC-V: Add Ssccfg/Smcdeleg ISA extension definition and parsing In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> Message-ID: <20250327-counter_delegation-v5-9-1ee538468d1b@rivosinc.com> The Smcdeleg extension allows M-mode to delegate selected counters to S-mode so that it can access those counters and the corresponding hpmevent CSRs without M-mode. Ssccfg ("Ss" for Privileged architecture and Supervisor-level extension, "ccfg" for Counter Configuration) provides access to delegated counters and new supervisor-level state. This patch just adds these definitions and enables parsing. 
Signed-off-by: Atish Patra --- arch/riscv/include/asm/hwcap.h | 2 ++ arch/riscv/kernel/cpufeature.c | 24 ++++++++++++++++++++++++ 2 files changed, 26 insertions(+) diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h index b4eddcb57842..fa5e01bcb990 100644 --- a/arch/riscv/include/asm/hwcap.h +++ b/arch/riscv/include/asm/hwcap.h @@ -103,6 +103,8 @@ #define RISCV_ISA_EXT_SSCSRIND 94 #define RISCV_ISA_EXT_SMCSRIND 95 #define RISCV_ISA_EXT_SMCNTRPMF 96 +#define RISCV_ISA_EXT_SSCCFG 97 +#define RISCV_ISA_EXT_SMCDELEG 98 #define RISCV_ISA_EXT_XLINUXENVCFG 127 diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c index e3e40cfe7967..f72552adb257 100644 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@ -150,6 +150,27 @@ static int riscv_ext_svadu_validate(const struct riscv_isa_ext_data *data, return 0; } +static int riscv_ext_smcdeleg_validate(const struct riscv_isa_ext_data *data, + const unsigned long *isa_bitmap) +{ + if (__riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_SSCSRIND) && + __riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_ZIHPM) && + __riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_ZICNTR)) + return 0; + + return -EPROBE_DEFER; +} + +static int riscv_ext_ssccfg_validate(const struct riscv_isa_ext_data *data, + const unsigned long *isa_bitmap) +{ + if (!riscv_ext_smcdeleg_validate(data, isa_bitmap) && + __riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_SMCDELEG)) + return 0; + + return -EPROBE_DEFER; +} + static const unsigned int riscv_zk_bundled_exts[] = { RISCV_ISA_EXT_ZBKB, RISCV_ISA_EXT_ZBKC, @@ -394,12 +415,15 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = { __RISCV_ISA_EXT_BUNDLE(zvksg, riscv_zvksg_bundled_exts), __RISCV_ISA_EXT_DATA(zvkt, RISCV_ISA_EXT_ZVKT), __RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA), + __RISCV_ISA_EXT_DATA_VALIDATE(smcdeleg, RISCV_ISA_EXT_SMCDELEG, + riscv_ext_smcdeleg_validate), __RISCV_ISA_EXT_DATA(smcntrpmf, 
RISCV_ISA_EXT_SMCNTRPMF), __RISCV_ISA_EXT_DATA(smcsrind, RISCV_ISA_EXT_SMCSRIND), __RISCV_ISA_EXT_DATA(smmpm, RISCV_ISA_EXT_SMMPM), __RISCV_ISA_EXT_SUPERSET(smnpm, RISCV_ISA_EXT_SMNPM, riscv_xlinuxenvcfg_exts), __RISCV_ISA_EXT_DATA(smstateen, RISCV_ISA_EXT_SMSTATEEN), __RISCV_ISA_EXT_DATA(ssaia, RISCV_ISA_EXT_SSAIA), + __RISCV_ISA_EXT_DATA_VALIDATE(ssccfg, RISCV_ISA_EXT_SSCCFG, riscv_ext_ssccfg_validate), __RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF), __RISCV_ISA_EXT_DATA(sscsrind, RISCV_ISA_EXT_SSCSRIND), __RISCV_ISA_EXT_SUPERSET(ssnpm, RISCV_ISA_EXT_SSNPM, riscv_xlinuxenvcfg_exts), -- 2.43.0 From atishp at rivosinc.com Thu Mar 27 12:35:51 2025 From: atishp at rivosinc.com (Atish Patra) Date: Thu, 27 Mar 2025 12:35:51 -0700 Subject: [PATCH v5 10/21] dt-bindings: riscv: add Counter delegation ISA extensions description In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> Message-ID: <20250327-counter_delegation-v5-10-1ee538468d1b@rivosinc.com> Add description for the Smcdeleg/Ssccfg extension. Signed-off-by: Atish Patra --- .../devicetree/bindings/riscv/extensions.yaml | 45 ++++++++++++++++++++++ 1 file changed, 45 insertions(+) diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml index c2025949295f..f34bc66940c0 100644 --- a/Documentation/devicetree/bindings/riscv/extensions.yaml +++ b/Documentation/devicetree/bindings/riscv/extensions.yaml @@ -128,6 +128,13 @@ properties: changes to interrupts as frozen at commit ccbddab ("Merge pull request #42 from riscv/jhauser-2023-RC4") of riscv-aia. + - const: smcdeleg + description: | + The standard Smcdeleg supervisor-level extension for the machine mode + to delegate the hpmcounters to supvervisor mode so that they are + directlyi accessible in the supervisor mode as ratified in the + 20240213 version of the privileged ISA specification. 
+ - const: smcsrind description: | The standard Smcsrind supervisor-level extension extends the @@ -175,6 +182,14 @@ properties: behavioural changes to interrupts as frozen at commit ccbddab ("Merge pull request #42 from riscv/jhauser-2023-RC4") of riscv-aia. + - const: ssccfg + description: | + The standard Ssccfg supervisor-level extension for configuring + the delegated hpmcounters to be accessible directly in supervisor + mode as ratified in the 20240213 version of the privileged ISA + specification. This extension depends on Sscsrind, Smcdeleg, Zihpm, + Zicntr extensions. + - const: sscofpmf description: | The standard Sscofpmf supervisor-level extension for count overflow @@ -695,6 +710,36 @@ properties: then: contains: const: zca + # Smcdeleg depends on Sscsrind, Zihpm, Zicntr + - if: + contains: + const: smcdeleg + then: + allOf: + - contains: + const: sscsrind + - contains: + const: zihpm + - contains: + const: zicntr + # Ssccfg depends on Smcdeleg, Sscsrind, Zihpm, Zicntr, Sscofpmf, Smcntrpmf + - if: + contains: + const: ssccfg + then: + allOf: + - contains: + const: smcdeleg + - contains: + const: sscsrind + - contains: + const: sscofpmf + - contains: + const: smcntrpmf + - contains: + const: zihpm + - contains: + const: zicntr allOf: # Zcf extension does not exist on rv64 -- 2.43.0 From atishp at rivosinc.com Thu Mar 27 12:35:52 2025 From: atishp at rivosinc.com (Atish Patra) Date: Thu, 27 Mar 2025 12:35:52 -0700 Subject: [PATCH v5 11/21] RISC-V: perf: Restructure the SBI PMU code In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> Message-ID: <20250327-counter_delegation-v5-11-1ee538468d1b@rivosinc.com> With Ssccfg/Smcdeleg, we no longer need SBI PMU extension to program/ access hpmcounter/events. However, we do need it for firmware counters. 
Rename the driver and its related code to use a generic name that covers both the SBI and the ISA mechanisms for hpmcounter-related operations. Take this opportunity to update the Kconfig names to match the new driver name closely. No functional change intended. Reviewed-by: Clément Léger Signed-off-by: Atish Patra --- MAINTAINERS | 4 +- arch/riscv/include/asm/kvm_vcpu_pmu.h | 4 +- arch/riscv/include/asm/kvm_vcpu_sbi.h | 2 +- arch/riscv/kvm/Makefile | 4 +- arch/riscv/kvm/vcpu_sbi.c | 2 +- drivers/perf/Kconfig | 16 +- drivers/perf/Makefile | 4 +- drivers/perf/{riscv_pmu.c => riscv_pmu_common.c} | 0 drivers/perf/{riscv_pmu_sbi.c => riscv_pmu_dev.c} | 214 +++++++++++++--------- include/linux/perf/riscv_pmu.h | 8 +- 10 files changed, 151 insertions(+), 107 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index ed7aa6867674..b6d174f7735e 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -20406,9 +20406,9 @@ M: Atish Patra R: Anup Patel L: linux-riscv at lists.infradead.org S: Supported -F: drivers/perf/riscv_pmu.c +F: drivers/perf/riscv_pmu_common.c +F: drivers/perf/riscv_pmu_dev.c F: drivers/perf/riscv_pmu_legacy.c -F: drivers/perf/riscv_pmu_sbi.c RISC-V SPACEMIT SoC Support M: Yixun Lan diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h index 1d85b6617508..aa75f52e9092 100644 --- a/arch/riscv/include/asm/kvm_vcpu_pmu.h +++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h @@ -13,7 +13,7 @@ #include #include -#ifdef CONFIG_RISCV_PMU_SBI +#ifdef CONFIG_RISCV_PMU #define RISCV_KVM_MAX_FW_CTRS 32 #define RISCV_KVM_MAX_HW_CTRS 32 #define RISCV_KVM_MAX_COUNTERS (RISCV_KVM_MAX_HW_CTRS + RISCV_KVM_MAX_FW_CTRS) @@ -128,5 +128,5 @@ static inline int kvm_riscv_vcpu_pmu_incr_fw(struct kvm_vcpu *vcpu, unsigned lon static inline void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu) {} static inline void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu) {} -#endif /* CONFIG_RISCV_PMU_SBI */ +#endif /* CONFIG_RISCV_PMU */ #endif /* !__KVM_VCPU_RISCV_PMU_H 
*/ diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h index 4ed6203cdd30..745690d9e8b4 100644 --- a/arch/riscv/include/asm/kvm_vcpu_sbi.h +++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h @@ -90,7 +90,7 @@ extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_sta; extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_experimental; extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_vendor; -#ifdef CONFIG_RISCV_PMU_SBI +#ifdef CONFIG_RISCV_PMU extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu; #endif #endif /* __RISCV_KVM_VCPU_SBI_H__ */ diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile index 4e0bba91d284..1ffe8c3400c0 100644 --- a/arch/riscv/kvm/Makefile +++ b/arch/riscv/kvm/Makefile @@ -23,11 +23,11 @@ kvm-y += vcpu_exit.o kvm-y += vcpu_fp.o kvm-y += vcpu_insn.o kvm-y += vcpu_onereg.o -kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o +kvm-$(CONFIG_RISCV_PMU) += vcpu_pmu.o kvm-y += vcpu_sbi.o kvm-y += vcpu_sbi_base.o kvm-y += vcpu_sbi_hsm.o -kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_sbi_pmu.o +kvm-$(CONFIG_RISCV_PMU) += vcpu_sbi_pmu.o kvm-y += vcpu_sbi_replace.o kvm-y += vcpu_sbi_sta.o kvm-y += vcpu_sbi_system.o diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c index d1c83a77735e..7bb4517921d9 100644 --- a/arch/riscv/kvm/vcpu_sbi.c +++ b/arch/riscv/kvm/vcpu_sbi.c @@ -20,7 +20,7 @@ static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_v01 = { }; #endif -#ifndef CONFIG_RISCV_PMU_SBI +#ifndef CONFIG_RISCV_PMU static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu = { .extid_start = -1UL, .extid_end = -1UL, diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig index 4e268de351c4..b3bdff2a99a4 100644 --- a/drivers/perf/Kconfig +++ b/drivers/perf/Kconfig @@ -75,7 +75,7 @@ config ARM_XSCALE_PMU depends on ARM_PMU && CPU_XSCALE def_bool y -config RISCV_PMU +config RISCV_PMU_COMMON depends on RISCV bool "RISC-V PMU framework" default y @@ -86,7 +86,7 @@ config RISCV_PMU can reuse it. 
config RISCV_PMU_LEGACY - depends on RISCV_PMU + depends on RISCV_PMU_COMMON bool "RISC-V legacy PMU implementation" default y help @@ -95,15 +95,15 @@ config RISCV_PMU_LEGACY of cycle/instruction counter and doesn't support counter overflow, or programmable counters. It will be removed in future. -config RISCV_PMU_SBI - depends on RISCV_PMU && RISCV_SBI - bool "RISC-V PMU based on SBI PMU extension" +config RISCV_PMU + depends on RISCV_PMU_COMMON && RISCV_SBI + bool "RISC-V PMU based on SBI PMU extension and/or Counter delegation extension" default y help Say y if you want to use the CPU performance monitor - using SBI PMU extension on RISC-V based systems. This option provides - full perf feature support i.e. counter overflow, privilege mode - filtering, counter configuration. + using SBI PMU extension or counter delegation ISA extension on RISC-V + based systems. This option provides full perf feature support i.e. + counter overflow, privilege mode filtering, counter configuration. config STARFIVE_STARLINK_PMU depends on ARCH_STARFIVE || COMPILE_TEST diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile index de71d2574857..0805d740c773 100644 --- a/drivers/perf/Makefile +++ b/drivers/perf/Makefile @@ -16,9 +16,9 @@ obj-$(CONFIG_FSL_IMX9_DDR_PMU) += fsl_imx9_ddr_perf.o obj-$(CONFIG_HISI_PMU) += hisilicon/ obj-$(CONFIG_QCOM_L2_PMU) += qcom_l2_pmu.o obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o -obj-$(CONFIG_RISCV_PMU) += riscv_pmu.o +obj-$(CONFIG_RISCV_PMU_COMMON) += riscv_pmu_common.o obj-$(CONFIG_RISCV_PMU_LEGACY) += riscv_pmu_legacy.o -obj-$(CONFIG_RISCV_PMU_SBI) += riscv_pmu_sbi.o +obj-$(CONFIG_RISCV_PMU) += riscv_pmu_dev.o obj-$(CONFIG_STARFIVE_STARLINK_PMU) += starfive_starlink_pmu.o obj-$(CONFIG_THUNDERX2_PMU) += thunderx2_pmu.o obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o diff --git a/drivers/perf/riscv_pmu.c b/drivers/perf/riscv_pmu_common.c similarity index 100% rename from drivers/perf/riscv_pmu.c rename to drivers/perf/riscv_pmu_common.c diff --git 
a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_dev.c similarity index 87% rename from drivers/perf/riscv_pmu_sbi.c rename to drivers/perf/riscv_pmu_dev.c index 698de8ddf895..6cebbc16bfe4 100644 --- a/drivers/perf/riscv_pmu_sbi.c +++ b/drivers/perf/riscv_pmu_dev.c @@ -8,7 +8,7 @@ * sparc64 and x86 code. */ -#define pr_fmt(fmt) "riscv-pmu-sbi: " fmt +#define pr_fmt(fmt) "riscv-pmu-dev: " fmt #include #include @@ -87,6 +87,8 @@ static const struct attribute_group *riscv_pmu_attr_groups[] = { static int sysctl_perf_user_access __read_mostly = SYSCTL_USER_ACCESS; /* + * This structure is SBI specific but counter delegation also require counter + * width, csr mapping. Reuse it for now. * RISC-V doesn't have heterogeneous harts yet. This need to be part of * per_cpu in case of harts with different pmu counters */ @@ -119,7 +121,7 @@ struct sbi_pmu_event_data { }; }; -static struct sbi_pmu_event_data pmu_hw_event_map[] = { +static struct sbi_pmu_event_data pmu_hw_event_sbi_map[] = { [PERF_COUNT_HW_CPU_CYCLES] = {.hw_gen_event = { SBI_PMU_HW_CPU_CYCLES, SBI_PMU_EVENT_TYPE_HW, 0}}, @@ -153,7 +155,7 @@ static struct sbi_pmu_event_data pmu_hw_event_map[] = { }; #define C(x) PERF_COUNT_HW_CACHE_##x -static struct sbi_pmu_event_data pmu_cache_event_map[PERF_COUNT_HW_CACHE_MAX] +static struct sbi_pmu_event_data pmu_cache_event_sbi_map[PERF_COUNT_HW_CACHE_MAX] [PERF_COUNT_HW_CACHE_OP_MAX] [PERF_COUNT_HW_CACHE_RESULT_MAX] = { [C(L1D)] = { @@ -298,7 +300,7 @@ static struct sbi_pmu_event_data pmu_cache_event_map[PERF_COUNT_HW_CACHE_MAX] }, }; -static void pmu_sbi_check_event(struct sbi_pmu_event_data *edata) +static void rvpmu_sbi_check_event(struct sbi_pmu_event_data *edata) { struct sbiret ret; @@ -313,25 +315,25 @@ static void pmu_sbi_check_event(struct sbi_pmu_event_data *edata) } } -static void pmu_sbi_check_std_events(struct work_struct *work) +static void rvpmu_sbi_check_std_events(struct work_struct *work) { - for (int i = 0; i < ARRAY_SIZE(pmu_hw_event_map); i++) - 
pmu_sbi_check_event(&pmu_hw_event_map[i]); + for (int i = 0; i < ARRAY_SIZE(pmu_hw_event_sbi_map); i++) + rvpmu_sbi_check_event(&pmu_hw_event_sbi_map[i]); - for (int i = 0; i < ARRAY_SIZE(pmu_cache_event_map); i++) - for (int j = 0; j < ARRAY_SIZE(pmu_cache_event_map[i]); j++) - for (int k = 0; k < ARRAY_SIZE(pmu_cache_event_map[i][j]); k++) - pmu_sbi_check_event(&pmu_cache_event_map[i][j][k]); + for (int i = 0; i < ARRAY_SIZE(pmu_cache_event_sbi_map); i++) + for (int j = 0; j < ARRAY_SIZE(pmu_cache_event_sbi_map[i]); j++) + for (int k = 0; k < ARRAY_SIZE(pmu_cache_event_sbi_map[i][j]); k++) + rvpmu_sbi_check_event(&pmu_cache_event_sbi_map[i][j][k]); } -static DECLARE_WORK(check_std_events_work, pmu_sbi_check_std_events); +static DECLARE_WORK(check_std_events_work, rvpmu_sbi_check_std_events); -static int pmu_sbi_ctr_get_width(int idx) +static int rvpmu_ctr_get_width(int idx) { return pmu_ctr_list[idx].width; } -static bool pmu_sbi_ctr_is_fw(int cidx) +static bool rvpmu_ctr_is_fw(int cidx) { union sbi_pmu_ctr_info *info; @@ -373,12 +375,12 @@ int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr) } EXPORT_SYMBOL_GPL(riscv_pmu_get_hpm_info); -static uint8_t pmu_sbi_csr_index(struct perf_event *event) +static uint8_t rvpmu_csr_index(struct perf_event *event) { return pmu_ctr_list[event->hw.idx].csr - CSR_CYCLE; } -static unsigned long pmu_sbi_get_filter_flags(struct perf_event *event) +static unsigned long rvpmu_sbi_get_filter_flags(struct perf_event *event) { unsigned long cflags = 0; bool guest_events = false; @@ -399,7 +401,7 @@ static unsigned long pmu_sbi_get_filter_flags(struct perf_event *event) return cflags; } -static int pmu_sbi_ctr_get_idx(struct perf_event *event) +static int rvpmu_sbi_ctr_get_idx(struct perf_event *event) { struct hw_perf_event *hwc = &event->hw; struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu); @@ -409,7 +411,7 @@ static int pmu_sbi_ctr_get_idx(struct perf_event *event) uint64_t cbase = 0, cmask = rvpmu->cmask; unsigned long 
cflags = 0; - cflags = pmu_sbi_get_filter_flags(event); + cflags = rvpmu_sbi_get_filter_flags(event); /* * In legacy mode, we have to force the fixed counters for those events @@ -446,7 +448,7 @@ static int pmu_sbi_ctr_get_idx(struct perf_event *event) return -ENOENT; /* Additional sanity check for the counter id */ - if (pmu_sbi_ctr_is_fw(idx)) { + if (rvpmu_ctr_is_fw(idx)) { if (!test_and_set_bit(idx, cpuc->used_fw_ctrs)) return idx; } else { @@ -457,7 +459,7 @@ static int pmu_sbi_ctr_get_idx(struct perf_event *event) return -ENOENT; } -static void pmu_sbi_ctr_clear_idx(struct perf_event *event) +static void rvpmu_ctr_clear_idx(struct perf_event *event) { struct hw_perf_event *hwc = &event->hw; @@ -465,13 +467,13 @@ static void pmu_sbi_ctr_clear_idx(struct perf_event *event) struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events); int idx = hwc->idx; - if (pmu_sbi_ctr_is_fw(idx)) + if (rvpmu_ctr_is_fw(idx)) clear_bit(idx, cpuc->used_fw_ctrs); else clear_bit(idx, cpuc->used_hw_ctrs); } -static int pmu_event_find_cache(u64 config) +static int sbi_pmu_event_find_cache(u64 config) { unsigned int cache_type, cache_op, cache_result, ret; @@ -487,7 +489,7 @@ static int pmu_event_find_cache(u64 config) if (cache_result >= PERF_COUNT_HW_CACHE_RESULT_MAX) return -EINVAL; - ret = pmu_cache_event_map[cache_type][cache_op][cache_result].event_idx; + ret = pmu_cache_event_sbi_map[cache_type][cache_op][cache_result].event_idx; return ret; } @@ -503,7 +505,7 @@ static bool pmu_sbi_is_fw_event(struct perf_event *event) return false; } -static int pmu_sbi_event_map(struct perf_event *event, u64 *econfig) +static int rvpmu_sbi_event_map(struct perf_event *event, u64 *econfig) { u32 type = event->attr.type; u64 config = event->attr.config; @@ -519,10 +521,10 @@ static int pmu_sbi_event_map(struct perf_event *event, u64 *econfig) case PERF_TYPE_HARDWARE: if (config >= PERF_COUNT_HW_MAX) return -EINVAL; - ret = pmu_hw_event_map[event->attr.config].event_idx; + ret = 
pmu_hw_event_sbi_map[event->attr.config].event_idx; break; case PERF_TYPE_HW_CACHE: - ret = pmu_event_find_cache(config); + ret = sbi_pmu_event_find_cache(config); break; case PERF_TYPE_RAW: /* @@ -648,7 +650,7 @@ static int pmu_sbi_snapshot_setup(struct riscv_pmu *pmu, int cpu) return 0; } -static u64 pmu_sbi_ctr_read(struct perf_event *event) +static u64 rvpmu_sbi_ctr_read(struct perf_event *event) { struct hw_perf_event *hwc = &event->hw; int idx = hwc->idx; @@ -690,25 +692,25 @@ static u64 pmu_sbi_ctr_read(struct perf_event *event) return val; } -static void pmu_sbi_set_scounteren(void *arg) +static void rvpmu_set_scounteren(void *arg) { struct perf_event *event = (struct perf_event *)arg; if (event->hw.idx != -1) csr_write(CSR_SCOUNTEREN, - csr_read(CSR_SCOUNTEREN) | BIT(pmu_sbi_csr_index(event))); + csr_read(CSR_SCOUNTEREN) | BIT(rvpmu_csr_index(event))); } -static void pmu_sbi_reset_scounteren(void *arg) +static void rvpmu_reset_scounteren(void *arg) { struct perf_event *event = (struct perf_event *)arg; if (event->hw.idx != -1) csr_write(CSR_SCOUNTEREN, - csr_read(CSR_SCOUNTEREN) & ~BIT(pmu_sbi_csr_index(event))); + csr_read(CSR_SCOUNTEREN) & ~BIT(rvpmu_csr_index(event))); } -static void pmu_sbi_ctr_start(struct perf_event *event, u64 ival) +static void rvpmu_sbi_ctr_start(struct perf_event *event, u64 ival) { struct sbiret ret; struct hw_perf_event *hwc = &event->hw; @@ -728,10 +730,10 @@ static void pmu_sbi_ctr_start(struct perf_event *event, u64 ival) if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) && (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT)) - pmu_sbi_set_scounteren((void *)event); + rvpmu_set_scounteren((void *)event); } -static void pmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag) +static void rvpmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag) { struct sbiret ret; struct hw_perf_event *hwc = &event->hw; @@ -741,7 +743,7 @@ static void pmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag) if ((hwc->flags & 
PERF_EVENT_FLAG_USER_ACCESS) && (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT)) - pmu_sbi_reset_scounteren((void *)event); + rvpmu_reset_scounteren((void *)event); if (sbi_pmu_snapshot_available()) flag |= SBI_PMU_STOP_FLAG_TAKE_SNAPSHOT; @@ -767,7 +769,7 @@ static void pmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag) } } -static int pmu_sbi_find_num_ctrs(void) +static int rvpmu_sbi_find_num_ctrs(void) { struct sbiret ret; @@ -778,7 +780,7 @@ static int pmu_sbi_find_num_ctrs(void) return sbi_err_map_linux_errno(ret.error); } -static int pmu_sbi_get_ctrinfo(int nctr, unsigned long *mask) +static int rvpmu_sbi_get_ctrinfo(int nctr, unsigned long *mask) { struct sbiret ret; int i, num_hw_ctr = 0, num_fw_ctr = 0; @@ -809,7 +811,7 @@ static int pmu_sbi_get_ctrinfo(int nctr, unsigned long *mask) return 0; } -static inline void pmu_sbi_stop_all(struct riscv_pmu *pmu) +static inline void rvpmu_sbi_stop_all(struct riscv_pmu *pmu) { /* * No need to check the error because we are disabling all the counters @@ -819,7 +821,7 @@ static inline void pmu_sbi_stop_all(struct riscv_pmu *pmu) 0, pmu->cmask, SBI_PMU_STOP_FLAG_RESET, 0, 0, 0); } -static inline void pmu_sbi_stop_hw_ctrs(struct riscv_pmu *pmu) +static inline void rvpmu_sbi_stop_hw_ctrs(struct riscv_pmu *pmu) { struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events); struct riscv_pmu_snapshot_data *sdata = cpu_hw_evt->snapshot_addr; @@ -863,8 +865,8 @@ static inline void pmu_sbi_stop_hw_ctrs(struct riscv_pmu *pmu) * while the overflowed counters need to be started with updated initialization * value. 
*/ -static inline void pmu_sbi_start_ovf_ctrs_sbi(struct cpu_hw_events *cpu_hw_evt, - u64 ctr_ovf_mask) +static inline void rvpmu_sbi_start_ovf_ctrs_sbi(struct cpu_hw_events *cpu_hw_evt, + u64 ctr_ovf_mask) { int idx = 0, i; struct perf_event *event; @@ -902,8 +904,8 @@ static inline void pmu_sbi_start_ovf_ctrs_sbi(struct cpu_hw_events *cpu_hw_evt, } } -static inline void pmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_hw_evt, - u64 ctr_ovf_mask) +static inline void rvpmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_hw_evt, + u64 ctr_ovf_mask) { int i, idx = 0; struct perf_event *event; @@ -937,18 +939,18 @@ static inline void pmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_hw_ } } -static void pmu_sbi_start_overflow_mask(struct riscv_pmu *pmu, - u64 ctr_ovf_mask) +static void rvpmu_sbi_start_overflow_mask(struct riscv_pmu *pmu, + u64 ctr_ovf_mask) { struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events); if (sbi_pmu_snapshot_available()) - pmu_sbi_start_ovf_ctrs_snapshot(cpu_hw_evt, ctr_ovf_mask); + rvpmu_sbi_start_ovf_ctrs_snapshot(cpu_hw_evt, ctr_ovf_mask); else - pmu_sbi_start_ovf_ctrs_sbi(cpu_hw_evt, ctr_ovf_mask); + rvpmu_sbi_start_ovf_ctrs_sbi(cpu_hw_evt, ctr_ovf_mask); } -static irqreturn_t pmu_sbi_ovf_handler(int irq, void *dev) +static irqreturn_t rvpmu_ovf_handler(int irq, void *dev) { struct perf_sample_data data; struct pt_regs *regs; @@ -980,7 +982,7 @@ static irqreturn_t pmu_sbi_ovf_handler(int irq, void *dev) } pmu = to_riscv_pmu(event->pmu); - pmu_sbi_stop_hw_ctrs(pmu); + rvpmu_sbi_stop_hw_ctrs(pmu); /* Overflow status register should only be read after counter are stopped */ if (sbi_pmu_snapshot_available()) @@ -1049,13 +1051,55 @@ static irqreturn_t pmu_sbi_ovf_handler(int irq, void *dev) hw_evt->state = 0; } - pmu_sbi_start_overflow_mask(pmu, overflowed_ctrs); + rvpmu_sbi_start_overflow_mask(pmu, overflowed_ctrs); perf_sample_event_took(sched_clock() - start_clock); return IRQ_HANDLED; } -static int 
pmu_sbi_starting_cpu(unsigned int cpu, struct hlist_node *node) +static void rvpmu_ctr_start(struct perf_event *event, u64 ival) +{ + rvpmu_sbi_ctr_start(event, ival); + /* TODO: Counter delegation implementation */ +} + +static void rvpmu_ctr_stop(struct perf_event *event, unsigned long flag) +{ + rvpmu_sbi_ctr_stop(event, flag); + /* TODO: Counter delegation implementation */ +} + +static int rvpmu_find_num_ctrs(void) +{ + return rvpmu_sbi_find_num_ctrs(); + /* TODO: Counter delegation implementation */ +} + +static int rvpmu_get_ctrinfo(int nctr, unsigned long *mask) +{ + return rvpmu_sbi_get_ctrinfo(nctr, mask); + /* TODO: Counter delegation implementation */ +} + +static int rvpmu_event_map(struct perf_event *event, u64 *econfig) +{ + return rvpmu_sbi_event_map(event, econfig); + /* TODO: Counter delegation implementation */ +} + +static int rvpmu_ctr_get_idx(struct perf_event *event) +{ + return rvpmu_sbi_ctr_get_idx(event); + /* TODO: Counter delegation implementation */ +} + +static u64 rvpmu_ctr_read(struct perf_event *event) +{ + return rvpmu_sbi_ctr_read(event); + /* TODO: Counter delegation implementation */ +} + +static int rvpmu_starting_cpu(unsigned int cpu, struct hlist_node *node) { struct riscv_pmu *pmu = hlist_entry_safe(node, struct riscv_pmu, node); struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events); @@ -1070,7 +1114,7 @@ static int pmu_sbi_starting_cpu(unsigned int cpu, struct hlist_node *node) csr_write(CSR_SCOUNTEREN, 0x2); /* Stop all the counters so that they can be enabled from perf */ - pmu_sbi_stop_all(pmu); + rvpmu_sbi_stop_all(pmu); if (riscv_pmu_use_irq) { cpu_hw_evt->irq = riscv_pmu_irq; @@ -1084,7 +1128,7 @@ static int pmu_sbi_starting_cpu(unsigned int cpu, struct hlist_node *node) return 0; } -static int pmu_sbi_dying_cpu(unsigned int cpu, struct hlist_node *node) +static int rvpmu_dying_cpu(unsigned int cpu, struct hlist_node *node) { if (riscv_pmu_use_irq) { disable_percpu_irq(riscv_pmu_irq); @@ -1099,7 +1143,7 @@ 
static int pmu_sbi_dying_cpu(unsigned int cpu, struct hlist_node *node) return 0; } -static int pmu_sbi_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pdev) +static int rvpmu_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pdev) { int ret; struct cpu_hw_events __percpu *hw_events = pmu->hw_events; @@ -1139,7 +1183,7 @@ static int pmu_sbi_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pde return -ENODEV; } - ret = request_percpu_irq(riscv_pmu_irq, pmu_sbi_ovf_handler, "riscv-pmu", hw_events); + ret = request_percpu_irq(riscv_pmu_irq, rvpmu_ovf_handler, "riscv-pmu", hw_events); if (ret) { pr_err("registering percpu irq failed [%d]\n", ret); return ret; @@ -1215,7 +1259,7 @@ static void riscv_pmu_destroy(struct riscv_pmu *pmu) cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_STARTING, &pmu->node); } -static void pmu_sbi_event_init(struct perf_event *event) +static void rvpmu_event_init(struct perf_event *event) { /* * The permissions are set at event_init so that we do not depend @@ -1229,7 +1273,7 @@ static void pmu_sbi_event_init(struct perf_event *event) event->hw.flags |= PERF_EVENT_FLAG_LEGACY; } -static void pmu_sbi_event_mapped(struct perf_event *event, struct mm_struct *mm) +static void rvpmu_event_mapped(struct perf_event *event, struct mm_struct *mm) { if (event->hw.flags & PERF_EVENT_FLAG_NO_USER_ACCESS) return; @@ -1257,14 +1301,14 @@ static void pmu_sbi_event_mapped(struct perf_event *event, struct mm_struct *mm) * that it is possible to do so to avoid any race. * And we must notify all cpus here because threads that currently run * on other cpus will try to directly access the counter too without - * calling pmu_sbi_ctr_start. + * calling rvpmu_sbi_ctr_start. 
*/ if (event->hw.flags & PERF_EVENT_FLAG_USER_ACCESS) on_each_cpu_mask(mm_cpumask(mm), - pmu_sbi_set_scounteren, (void *)event, 1); + rvpmu_set_scounteren, (void *)event, 1); } -static void pmu_sbi_event_unmapped(struct perf_event *event, struct mm_struct *mm) +static void rvpmu_event_unmapped(struct perf_event *event, struct mm_struct *mm) { if (event->hw.flags & PERF_EVENT_FLAG_NO_USER_ACCESS) return; @@ -1286,7 +1330,7 @@ static void pmu_sbi_event_unmapped(struct perf_event *event, struct mm_struct *m if (event->hw.flags & PERF_EVENT_FLAG_USER_ACCESS) on_each_cpu_mask(mm_cpumask(mm), - pmu_sbi_reset_scounteren, (void *)event, 1); + rvpmu_reset_scounteren, (void *)event, 1); } static void riscv_pmu_update_counter_access(void *info) @@ -1329,7 +1373,7 @@ static const struct ctl_table sbi_pmu_sysctl_table[] = { }, }; -static int pmu_sbi_device_probe(struct platform_device *pdev) +static int rvpmu_device_probe(struct platform_device *pdev) { struct riscv_pmu *pmu = NULL; int ret = -ENODEV; @@ -1340,7 +1384,7 @@ static int pmu_sbi_device_probe(struct platform_device *pdev) if (!pmu) return -ENOMEM; - num_counters = pmu_sbi_find_num_ctrs(); + num_counters = rvpmu_find_num_ctrs(); if (num_counters < 0) { pr_err("SBI PMU extension doesn't provide any counters\n"); goto out_free; @@ -1353,10 +1397,10 @@ static int pmu_sbi_device_probe(struct platform_device *pdev) } /* cache all the information about counters now */ - if (pmu_sbi_get_ctrinfo(num_counters, &cmask)) + if (rvpmu_get_ctrinfo(num_counters, &cmask)) goto out_free; - ret = pmu_sbi_setup_irqs(pmu, pdev); + ret = rvpmu_setup_irqs(pmu, pdev); if (ret < 0) { pr_info("Perf sampling/filtering is not supported as sscof extension is not available\n"); pmu->pmu.capabilities |= PERF_PMU_CAP_NO_INTERRUPT; @@ -1366,17 +1410,17 @@ static int pmu_sbi_device_probe(struct platform_device *pdev) pmu->pmu.attr_groups = riscv_pmu_attr_groups; pmu->pmu.parent = &pdev->dev; pmu->cmask = cmask; - pmu->ctr_start = pmu_sbi_ctr_start; 
- pmu->ctr_stop = pmu_sbi_ctr_stop; - pmu->event_map = pmu_sbi_event_map; - pmu->ctr_get_idx = pmu_sbi_ctr_get_idx; - pmu->ctr_get_width = pmu_sbi_ctr_get_width; - pmu->ctr_clear_idx = pmu_sbi_ctr_clear_idx; - pmu->ctr_read = pmu_sbi_ctr_read; - pmu->event_init = pmu_sbi_event_init; - pmu->event_mapped = pmu_sbi_event_mapped; - pmu->event_unmapped = pmu_sbi_event_unmapped; - pmu->csr_index = pmu_sbi_csr_index; + pmu->ctr_start = rvpmu_ctr_start; + pmu->ctr_stop = rvpmu_ctr_stop; + pmu->event_map = rvpmu_event_map; + pmu->ctr_get_idx = rvpmu_ctr_get_idx; + pmu->ctr_get_width = rvpmu_ctr_get_width; + pmu->ctr_clear_idx = rvpmu_ctr_clear_idx; + pmu->ctr_read = rvpmu_ctr_read; + pmu->event_init = rvpmu_event_init; + pmu->event_mapped = rvpmu_event_mapped; + pmu->event_unmapped = rvpmu_event_unmapped; + pmu->csr_index = rvpmu_csr_index; ret = riscv_pm_pmu_register(pmu); if (ret) @@ -1432,14 +1476,14 @@ static int pmu_sbi_device_probe(struct platform_device *pdev) return ret; } -static struct platform_driver pmu_sbi_driver = { - .probe = pmu_sbi_device_probe, +static struct platform_driver rvpmu_driver = { + .probe = rvpmu_device_probe, .driver = { - .name = RISCV_PMU_SBI_PDEV_NAME, + .name = RISCV_PMU_PDEV_NAME, }, }; -static int __init pmu_sbi_devinit(void) +static int __init rvpmu_devinit(void) { int ret; struct platform_device *pdev; @@ -1454,20 +1498,20 @@ static int __init pmu_sbi_devinit(void) ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_RISCV_STARTING, "perf/riscv/pmu:starting", - pmu_sbi_starting_cpu, pmu_sbi_dying_cpu); + rvpmu_starting_cpu, rvpmu_dying_cpu); if (ret) { pr_err("CPU hotplug notifier could not be registered: %d\n", ret); return ret; } - ret = platform_driver_register(&pmu_sbi_driver); + ret = platform_driver_register(&rvpmu_driver); if (ret) return ret; - pdev = platform_device_register_simple(RISCV_PMU_SBI_PDEV_NAME, -1, NULL, 0); + pdev = platform_device_register_simple(RISCV_PMU_PDEV_NAME, -1, NULL, 0); if (IS_ERR(pdev)) { - 
platform_driver_unregister(&pmu_sbi_driver); + platform_driver_unregister(&rvpmu_driver); return PTR_ERR(pdev); } @@ -1476,4 +1520,4 @@ static int __init pmu_sbi_devinit(void) return ret; } -device_initcall(pmu_sbi_devinit) +device_initcall(rvpmu_devinit) diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h index 701974639ff2..525acd6d96d0 100644 --- a/include/linux/perf/riscv_pmu.h +++ b/include/linux/perf/riscv_pmu.h @@ -13,7 +13,7 @@ #include #include -#ifdef CONFIG_RISCV_PMU +#ifdef CONFIG_RISCV_PMU_COMMON /* * The RISCV_MAX_COUNTERS parameter should be specified. @@ -21,7 +21,7 @@ #define RISCV_MAX_COUNTERS 64 #define RISCV_OP_UNSUPP (-EOPNOTSUPP) -#define RISCV_PMU_SBI_PDEV_NAME "riscv-pmu-sbi" +#define RISCV_PMU_PDEV_NAME "riscv-pmu" #define RISCV_PMU_LEGACY_PDEV_NAME "riscv-pmu-legacy" #define RISCV_PMU_STOP_FLAG_RESET 1 @@ -87,10 +87,10 @@ void riscv_pmu_legacy_skip_init(void); static inline void riscv_pmu_legacy_skip_init(void) {}; #endif struct riscv_pmu *riscv_pmu_alloc(void); -#ifdef CONFIG_RISCV_PMU_SBI +#ifdef CONFIG_RISCV_PMU int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr); #endif -#endif /* CONFIG_RISCV_PMU */ +#endif /* CONFIG_RISCV_PMU_COMMON */ #endif /* _RISCV_PMU_H */ -- 2.43.0 From atishp at rivosinc.com Thu Mar 27 12:35:53 2025 From: atishp at rivosinc.com (Atish Patra) Date: Thu, 27 Mar 2025 12:35:53 -0700 Subject: [PATCH v5 12/21] RISC-V: perf: Modify the counter discovery mechanism In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> Message-ID: <20250327-counter_delegation-v5-12-1ee538468d1b@rivosinc.com> If both counter delegation and SBI PMU are present, counter delegation will be used for hardware PMU counters while the SBI PMU will be used for firmware counters. Thus, the driver has to probe the counter info via SBI PMU to distinguish the firmware counters.
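The split described above can be sketched outside the kernel as follows. This is only an illustration under stated assumptions, not the driver code: `struct ctr_info`, `struct ctr_masks`, `classify_counters()` and the `CTR_TYPE_*` values are hypothetical stand-ins for the per-counter info that the SBI PMU reports, and `deleg_available` models the case where counter delegation is present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_COUNTERS 64

/* Hypothetical stand-ins for the counter types reported by firmware. */
enum ctr_type { CTR_TYPE_HW = 0, CTR_TYPE_FW = 1 };

struct ctr_info {
	bool valid;		/* false models a failed per-counter query */
	enum ctr_type type;
};

struct ctr_masks {
	uint64_t hw_mask;	/* hardware counters reached through firmware */
	uint64_t fw_mask;	/* firmware counters, always tracked */
	unsigned int num_hw;
	unsigned int num_fw;
};

/*
 * Walk the firmware-reported counters and sort them into two bitmasks.
 * Firmware counters are always recorded. Hardware counters are recorded
 * only when delegation is unavailable, because otherwise they would be
 * discovered and programmed directly instead.
 */
static struct ctr_masks classify_counters(const struct ctr_info *info,
					  size_t n, bool deleg_available)
{
	struct ctr_masks m = { 0 };

	assert(n <= MAX_COUNTERS);

	for (size_t i = 0; i < n; i++) {
		/* Counter ids need not be contiguous; skip the holes. */
		if (!info[i].valid)
			continue;

		if (info[i].type == CTR_TYPE_FW) {
			m.fw_mask |= UINT64_C(1) << i;
			m.num_fw++;
		} else if (!deleg_available) {
			m.hw_mask |= UINT64_C(1) << i;
			m.num_hw++;
		}
	}

	return m;
}
```

Keeping the firmware counters in their own mask means a later counter-matching step can restrict a firmware event to that mask without caring which interface drives the hardware counters.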
The hybrid scheme also requires improving the informational log messages to tell the user which underlying interface is used in each case. Signed-off-by: Atish Patra --- drivers/perf/riscv_pmu_dev.c | 130 ++++++++++++++++++++++++++++++++----------- 1 file changed, 96 insertions(+), 34 deletions(-) diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c index 6cebbc16bfe4..c0397bd68b91 100644 --- a/drivers/perf/riscv_pmu_dev.c +++ b/drivers/perf/riscv_pmu_dev.c @@ -66,6 +66,20 @@ static bool sbi_v2_available; static DEFINE_STATIC_KEY_FALSE(sbi_pmu_snapshot_available); #define sbi_pmu_snapshot_available() \ static_branch_unlikely(&sbi_pmu_snapshot_available) +static DEFINE_STATIC_KEY_FALSE(riscv_pmu_sbi_available); +static DEFINE_STATIC_KEY_FALSE(riscv_pmu_cdeleg_available); + +/* Avoid unnecessary code patching in the one time booting path*/ +#define riscv_pmu_cdeleg_available_boot() \ + static_key_enabled(&riscv_pmu_cdeleg_available) +#define riscv_pmu_sbi_available_boot() \ + static_key_enabled(&riscv_pmu_sbi_available) + +/* Perform a runtime code patching with static key */ +#define riscv_pmu_cdeleg_available() \ + static_branch_unlikely(&riscv_pmu_cdeleg_available) +#define riscv_pmu_sbi_available() \ + static_branch_likely(&riscv_pmu_sbi_available) static struct attribute *riscv_arch_formats_attr[] = { &format_attr_event.attr, @@ -88,7 +102,8 @@ static int sysctl_perf_user_access __read_mostly = SYSCTL_USER_ACCESS; /* * This structure is SBI specific but counter delegation also require counter - * width, csr mapping. Reuse it for now. + * width, csr mapping. Reuse it for now, as we can have firmware counters for + * platforms with counter delegation support. * RISC-V doesn't have heterogeneous harts yet.
This need to be part of * per_cpu in case of harts with different pmu counters */ @@ -100,6 +115,8 @@ static unsigned int riscv_pmu_irq; /* Cache the available counters in a bitmask */ static unsigned long cmask; +/* Cache the available firmware counters in another bitmask */ +static unsigned long firmware_cmask; struct sbi_pmu_event_data { union { @@ -780,34 +797,38 @@ static int rvpmu_sbi_find_num_ctrs(void) return sbi_err_map_linux_errno(ret.error); } -static int rvpmu_sbi_get_ctrinfo(int nctr, unsigned long *mask) +static u32 rvpmu_deleg_find_ctrs(void) +{ + /* TODO */ + return 0; +} + +static int rvpmu_sbi_get_ctrinfo(u32 nsbi_ctr, u32 *num_fw_ctr, u32 *num_hw_ctr) { struct sbiret ret; - int i, num_hw_ctr = 0, num_fw_ctr = 0; + int i; union sbi_pmu_ctr_info cinfo; - pmu_ctr_list = kcalloc(nctr, sizeof(*pmu_ctr_list), GFP_KERNEL); - if (!pmu_ctr_list) - return -ENOMEM; - - for (i = 0; i < nctr; i++) { + for (i = 0; i < nsbi_ctr; i++) { ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_GET_INFO, i, 0, 0, 0, 0, 0); if (ret.error) /* The logical counter ids are not expected to be contiguous */ continue; - *mask |= BIT(i); - cinfo.value = ret.value; - if (cinfo.type == SBI_PMU_CTR_TYPE_FW) - num_fw_ctr++; - else - num_hw_ctr++; - pmu_ctr_list[i].value = cinfo.value; + if (cinfo.type == SBI_PMU_CTR_TYPE_FW) { + /* Track firmware counters in a different mask */ + firmware_cmask |= BIT(i); + pmu_ctr_list[i].value = cinfo.value; + *num_fw_ctr = *num_fw_ctr + 1; + } else if (cinfo.type == SBI_PMU_CTR_TYPE_HW && + !riscv_pmu_cdeleg_available_boot()) { + *num_hw_ctr = *num_hw_ctr + 1; + cmask |= BIT(i); + pmu_ctr_list[i].value = cinfo.value; + } } - pr_info("%d firmware and %d hardware counters\n", num_fw_ctr, num_hw_ctr); - return 0; } @@ -1069,16 +1090,41 @@ static void rvpmu_ctr_stop(struct perf_event *event, unsigned long flag) /* TODO: Counter delegation implementation */ } -static int rvpmu_find_num_ctrs(void) +static int rvpmu_find_ctrs(void) { - return 
rvpmu_sbi_find_num_ctrs(); - /* TODO: Counter delegation implementation */ -} + u32 num_sbi_counters = 0, num_deleg_counters = 0; + u32 num_hw_ctr = 0, num_fw_ctr = 0, num_ctr = 0; + /* + * We don't know how many firmware counters are available. Just allocate + * for maximum counters the driver can support. The default is 64 anyways. + */ + pmu_ctr_list = kcalloc(RISCV_MAX_COUNTERS, sizeof(*pmu_ctr_list), + GFP_KERNEL); + if (!pmu_ctr_list) + return -ENOMEM; -static int rvpmu_get_ctrinfo(int nctr, unsigned long *mask) -{ - return rvpmu_sbi_get_ctrinfo(nctr, mask); - /* TODO: Counter delegation implementation */ + if (riscv_pmu_cdeleg_available_boot()) + num_deleg_counters = rvpmu_deleg_find_ctrs(); + + /* This is required for firmware counters even if the above is true */ + if (riscv_pmu_sbi_available_boot()) { + num_sbi_counters = rvpmu_sbi_find_num_ctrs(); + /* cache all the information about counters now */ + rvpmu_sbi_get_ctrinfo(num_sbi_counters, &num_hw_ctr, &num_fw_ctr); + } + + if (num_sbi_counters > RISCV_MAX_COUNTERS || num_deleg_counters > RISCV_MAX_COUNTERS) + return -ENOSPC; + + if (riscv_pmu_cdeleg_available_boot()) { + pr_info("%u firmware and %u hardware counters\n", num_fw_ctr, num_deleg_counters); + num_ctr = num_fw_ctr + num_deleg_counters; + } else { + pr_info("%u firmware and %u hardware counters\n", num_fw_ctr, num_hw_ctr); + num_ctr = num_sbi_counters; + } + + return num_ctr; } static int rvpmu_event_map(struct perf_event *event, u64 *econfig) @@ -1379,12 +1425,21 @@ static int rvpmu_device_probe(struct platform_device *pdev) int ret = -ENODEV; int num_counters; - pr_info("SBI PMU extension is available\n"); + if (riscv_pmu_cdeleg_available_boot()) { + pr_info("hpmcounters will use the counter delegation ISA extension\n"); + if (riscv_pmu_sbi_available_boot()) + pr_info("Firmware counters will use SBI PMU extension\n"); + else + pr_info("Firmware counters will not be available as SBI PMU extension is not present\n"); + } else if 
(riscv_pmu_sbi_available_boot()) { + pr_info("Both hpmcounters and firmware counters will use SBI PMU extension\n"); + } + pmu = riscv_pmu_alloc(); if (!pmu) return -ENOMEM; - num_counters = rvpmu_find_num_ctrs(); + num_counters = rvpmu_find_ctrs(); if (num_counters < 0) { pr_err("SBI PMU extension doesn't provide any counters\n"); goto out_free; @@ -1396,9 +1451,6 @@ static int rvpmu_device_probe(struct platform_device *pdev) pr_info("SBI returned more than maximum number of counters. Limiting the number of counters to %d\n", num_counters); } - /* cache all the information about counters now */ - if (rvpmu_get_ctrinfo(num_counters, &cmask)) - goto out_free; ret = rvpmu_setup_irqs(pmu, pdev); if (ret < 0) { @@ -1488,13 +1540,23 @@ static int __init rvpmu_devinit(void) int ret; struct platform_device *pdev; - if (sbi_spec_version < sbi_mk_version(0, 3) || - !sbi_probe_extension(SBI_EXT_PMU)) { - return 0; - } + if (sbi_spec_version >= sbi_mk_version(0, 3) && + sbi_probe_extension(SBI_EXT_PMU)) + static_branch_enable(&riscv_pmu_sbi_available); if (sbi_spec_version >= sbi_mk_version(2, 0)) sbi_v2_available = true; + /* + * We need all three extensions to be present to access the counters + * in S-mode via Supervisor Counter delegation. 
+ */ + if (riscv_isa_extension_available(NULL, SSCCFG) && + riscv_isa_extension_available(NULL, SMCDELEG) && + riscv_isa_extension_available(NULL, SSCSRIND)) + static_branch_enable(&riscv_pmu_cdeleg_available); + + if (!(riscv_pmu_sbi_available_boot() || riscv_pmu_cdeleg_available_boot())) + return 0; ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_RISCV_STARTING, "perf/riscv/pmu:starting", -- 2.43.0 From atishp at rivosinc.com Thu Mar 27 12:35:54 2025 From: atishp at rivosinc.com (Atish Patra) Date: Thu, 27 Mar 2025 12:35:54 -0700 Subject: [PATCH v5 13/21] RISC-V: perf: Add a mechanism to defined legacy event encoding In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> Message-ID: <20250327-counter_delegation-v5-13-1ee538468d1b@rivosinc.com> The RISC-V ISA doesn't define any standard event encodings or specify any event-to-counter mapping. Thus, the event encoding information and the corresponding counter mapping for those events need to be provided in the driver for each vendor. Add a framework to support that. The individual platform events will be added later. Signed-off-by: Atish Patra --- drivers/perf/riscv_pmu_dev.c | 54 +++++++++++++++++++++++++++++++++++++++++- include/linux/perf/riscv_pmu.h | 13 ++++++++++ 2 files changed, 66 insertions(+), 1 deletion(-) diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c index c0397bd68b91..6f64404a6e3d 100644 --- a/drivers/perf/riscv_pmu_dev.c +++ b/drivers/perf/riscv_pmu_dev.c @@ -317,6 +317,56 @@ static struct sbi_pmu_event_data pmu_cache_event_sbi_map[PERF_COUNT_HW_CACHE_MAX }, }; +/* + * Vendor specific PMU events.
+ */ +struct riscv_pmu_event { + u64 event_id; + u32 counter_mask; +}; + +struct riscv_vendor_pmu_events { + unsigned long vendorid; + unsigned long archid; + unsigned long implid; + const struct riscv_pmu_event *hw_event_map; + const struct riscv_pmu_event (*cache_event_map)[PERF_COUNT_HW_CACHE_OP_MAX] + [PERF_COUNT_HW_CACHE_RESULT_MAX]; +}; + +#define RISCV_VENDOR_PMU_EVENTS(_vendorid, _archid, _implid, _hw_event_map, _cache_event_map) \ + { .vendorid = _vendorid, .archid = _archid, .implid = _implid, \ + .hw_event_map = _hw_event_map, .cache_event_map = _cache_event_map }, + +static struct riscv_vendor_pmu_events pmu_vendor_events_table[] = { +}; + +const struct riscv_pmu_event *current_pmu_hw_event_map; +const struct riscv_pmu_event (*current_pmu_cache_event_map)[PERF_COUNT_HW_CACHE_OP_MAX] + [PERF_COUNT_HW_CACHE_RESULT_MAX]; + +static void rvpmu_vendor_register_events(void) +{ + int cpu = raw_smp_processor_id(); + unsigned long vendor_id = riscv_cached_mvendorid(cpu); + unsigned long impl_id = riscv_cached_mimpid(cpu); + unsigned long arch_id = riscv_cached_marchid(cpu); + + for (int i = 0; i < ARRAY_SIZE(pmu_vendor_events_table); i++) { + if (pmu_vendor_events_table[i].vendorid == vendor_id && + pmu_vendor_events_table[i].implid == impl_id && + pmu_vendor_events_table[i].archid == arch_id) { + current_pmu_hw_event_map = pmu_vendor_events_table[i].hw_event_map; + current_pmu_cache_event_map = pmu_vendor_events_table[i].cache_event_map; + break; + } + } + + if (!current_pmu_hw_event_map || !current_pmu_cache_event_map) { + pr_info("No default PMU events found\n"); + } +} + static void rvpmu_sbi_check_event(struct sbi_pmu_event_data *edata) { struct sbiret ret; @@ -1552,8 +1602,10 @@ static int __init rvpmu_devinit(void) */ if (riscv_isa_extension_available(NULL, SSCCFG) && riscv_isa_extension_available(NULL, SMCDELEG) && - riscv_isa_extension_available(NULL, SSCSRIND)) + riscv_isa_extension_available(NULL, SSCSRIND)) { 
static_branch_enable(&riscv_pmu_cdeleg_available); + rvpmu_vendor_register_events(); + } if (!(riscv_pmu_sbi_available_boot() || riscv_pmu_cdeleg_available_boot())) return 0; diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h index 525acd6d96d0..a3e1fdd5084a 100644 --- a/include/linux/perf/riscv_pmu.h +++ b/include/linux/perf/riscv_pmu.h @@ -28,6 +28,19 @@ #define RISCV_PMU_CONFIG1_GUEST_EVENTS 0x1 +#define HW_OP_UNSUPPORTED 0xFFFF +#define CACHE_OP_UNSUPPORTED 0xFFFF + +#define PERF_MAP_ALL_UNSUPPORTED \ + [0 ... PERF_COUNT_HW_MAX - 1] = {HW_OP_UNSUPPORTED, 0x0} + +#define PERF_CACHE_MAP_ALL_UNSUPPORTED \ +[0 ... C(MAX) - 1] = { \ + [0 ... C(OP_MAX) - 1] = { \ + [0 ... C(RESULT_MAX) - 1] = {CACHE_OP_UNSUPPORTED, 0x0} \ + }, \ +} + struct cpu_hw_events { /* currently enabled events */ int n_events; -- 2.43.0 From atishp at rivosinc.com Thu Mar 27 12:35:55 2025 From: atishp at rivosinc.com (Atish Patra) Date: Thu, 27 Mar 2025 12:35:55 -0700 Subject: [PATCH v5 14/21] RISC-V: perf: Implement supervisor counter delegation support In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> Message-ID: <20250327-counter_delegation-v5-14-1ee538468d1b@rivosinc.com> There are a few new RISC-V ISA extensions (ssccfg, sscsrind, smcntrpmf) which allow the hpmcounter/hpmevents to be programmed directly from S-mode. The implementation detects the ISA extensions at runtime and uses them, if available, instead of the SBI PMU extension. The SBI PMU extension will still be used for firmware counters if the user requests it. The current Linux driver relies on the event encoding defined by the SBI PMU specification for standard perf events. However, there is no standard event encoding available in the ISA. In the future, we may want to decouple the counter delegation and SBI PMU completely.
In that case, platforms supporting counter delegation must rely on the event encoding defined in the perf JSON file or in the PMU driver. For firmware events, the driver will continue to use the SBI PMU encoding, as one cannot support firmware events without SBI PMU. Signed-off-by: Atish Patra --- arch/riscv/include/asm/csr.h | 1 + drivers/perf/riscv_pmu_dev.c | 561 +++++++++++++++++++++++++++++++++-------- include/linux/perf/riscv_pmu.h | 3 + 3 files changed, 462 insertions(+), 103 deletions(-) diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h index 3d2d4f886c77..8b2f5ae1d60e 100644 --- a/arch/riscv/include/asm/csr.h +++ b/arch/riscv/include/asm/csr.h @@ -255,6 +255,7 @@ #endif #define SISELECT_SSCCFG_BASE 0x40 +#define HPMEVENT_MASK GENMASK_ULL(63, 56) /* mseccfg bits */ #define MSECCFG_PMM ENVCFG_PMM diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c index 6f64404a6e3d..7c4a1ef15866 100644 --- a/drivers/perf/riscv_pmu_dev.c +++ b/drivers/perf/riscv_pmu_dev.c @@ -27,6 +27,8 @@ #include #include #include +#include +#include #define ALT_SBI_PMU_OVERFLOW(__ovl) \ asm volatile(ALTERNATIVE_2( \ @@ -59,14 +61,31 @@ asm volatile(ALTERNATIVE( \ #define PERF_EVENT_FLAG_USER_ACCESS BIT(SYSCTL_USER_ACCESS) #define PERF_EVENT_FLAG_LEGACY BIT(SYSCTL_LEGACY) -PMU_FORMAT_ATTR(event, "config:0-47"); +#define RVPMU_SBI_PMU_FORMAT_ATTR "config:0-47" +#define RVPMU_CDELEG_PMU_FORMAT_ATTR "config:0-55" + +static ssize_t __maybe_unused rvpmu_format_show(struct device *dev, struct device_attribute *attr, + char *buf); + +#define RVPMU_ATTR_ENTRY(_name, _func, _config) ( \ + &((struct dev_ext_attribute[]) { \ + { __ATTR(_name, 0444, _func, NULL), (void *)_config } \ + })[0].attr.attr) + +#define RVPMU_FORMAT_ATTR_ENTRY(_name, _config) \ + RVPMU_ATTR_ENTRY(_name, rvpmu_format_show, (char *)_config) + PMU_FORMAT_ATTR(firmware, "config:62-63"); static bool sbi_v2_available; static DEFINE_STATIC_KEY_FALSE(sbi_pmu_snapshot_available); #define
sbi_pmu_snapshot_available() \ static_branch_unlikely(&sbi_pmu_snapshot_available) + static DEFINE_STATIC_KEY_FALSE(riscv_pmu_sbi_available); +#define riscv_pmu_sbi_available() \ + static_branch_likely(&riscv_pmu_sbi_available) + static DEFINE_STATIC_KEY_FALSE(riscv_pmu_cdeleg_available); /* Avoid unnecessary code patching in the one time booting path*/ @@ -81,19 +100,35 @@ static DEFINE_STATIC_KEY_FALSE(riscv_pmu_cdeleg_available); #define riscv_pmu_sbi_available() \ static_branch_likely(&riscv_pmu_sbi_available) -static struct attribute *riscv_arch_formats_attr[] = { - &format_attr_event.attr, +static struct attribute *riscv_sbi_pmu_formats_attr[] = { + RVPMU_FORMAT_ATTR_ENTRY(event, RVPMU_SBI_PMU_FORMAT_ATTR), + &format_attr_firmware.attr, + NULL, +}; + +static struct attribute_group riscv_sbi_pmu_format_group = { + .name = "format", + .attrs = riscv_sbi_pmu_formats_attr, +}; + +static const struct attribute_group *riscv_sbi_pmu_attr_groups[] = { + &riscv_sbi_pmu_format_group, + NULL, +}; + +static struct attribute *riscv_cdeleg_pmu_formats_attr[] = { + RVPMU_FORMAT_ATTR_ENTRY(event, RVPMU_CDELEG_PMU_FORMAT_ATTR), &format_attr_firmware.attr, NULL, }; -static struct attribute_group riscv_pmu_format_group = { +static struct attribute_group riscv_cdeleg_pmu_format_group = { .name = "format", - .attrs = riscv_arch_formats_attr, + .attrs = riscv_cdeleg_pmu_formats_attr, }; -static const struct attribute_group *riscv_pmu_attr_groups[] = { - &riscv_pmu_format_group, +static const struct attribute_group *riscv_cdeleg_pmu_attr_groups[] = { + &riscv_cdeleg_pmu_format_group, NULL, }; @@ -395,6 +430,14 @@ static void rvpmu_sbi_check_std_events(struct work_struct *work) static DECLARE_WORK(check_std_events_work, rvpmu_sbi_check_std_events); +static ssize_t rvpmu_format_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct dev_ext_attribute *eattr = container_of(attr, + struct dev_ext_attribute, attr); + return sysfs_emit(buf, "%s\n", (char 
*)eattr->var); +} + static int rvpmu_ctr_get_width(int idx) { return pmu_ctr_list[idx].width; @@ -447,6 +490,38 @@ static uint8_t rvpmu_csr_index(struct perf_event *event) return pmu_ctr_list[event->hw.idx].csr - CSR_CYCLE; } +static uint64_t get_deleg_priv_filter_bits(struct perf_event *event) +{ + u64 priv_filter_bits = 0; + bool guest_events = false; + + if (event->attr.config1 & RISCV_PMU_CONFIG1_GUEST_EVENTS) + guest_events = true; + if (event->attr.exclude_kernel) + priv_filter_bits |= guest_events ? HPMEVENT_VSINH : HPMEVENT_SINH; + if (event->attr.exclude_user) + priv_filter_bits |= guest_events ? HPMEVENT_VUINH : HPMEVENT_UINH; + if (guest_events && event->attr.exclude_hv) + priv_filter_bits |= HPMEVENT_SINH; + if (event->attr.exclude_host) + priv_filter_bits |= HPMEVENT_UINH | HPMEVENT_SINH; + if (event->attr.exclude_guest) + priv_filter_bits |= HPMEVENT_VSINH | HPMEVENT_VUINH; + + return priv_filter_bits; +} + +static bool pmu_sbi_is_fw_event(struct perf_event *event) +{ + u32 type = event->attr.type; + u64 config = event->attr.config; + + if (type == PERF_TYPE_RAW && ((config >> 63) == 1)) + return true; + else + return false; +} + static unsigned long rvpmu_sbi_get_filter_flags(struct perf_event *event) { unsigned long cflags = 0; @@ -475,7 +550,8 @@ static int rvpmu_sbi_ctr_get_idx(struct perf_event *event) struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events); struct sbiret ret; int idx; - uint64_t cbase = 0, cmask = rvpmu->cmask; + u64 cbase = 0; + unsigned long ctr_mask = rvpmu->cmask; unsigned long cflags = 0; cflags = rvpmu_sbi_get_filter_flags(event); @@ -488,21 +564,23 @@ static int rvpmu_sbi_ctr_get_idx(struct perf_event *event) if ((hwc->flags & PERF_EVENT_FLAG_LEGACY) && (event->attr.type == PERF_TYPE_HARDWARE)) { if (event->attr.config == PERF_COUNT_HW_CPU_CYCLES) { cflags |= SBI_PMU_CFG_FLAG_SKIP_MATCH; - cmask = 1; + ctr_mask = 1; } else if (event->attr.config == PERF_COUNT_HW_INSTRUCTIONS) { cflags |= SBI_PMU_CFG_FLAG_SKIP_MATCH; - 
cmask = BIT(CSR_INSTRET - CSR_CYCLE); + ctr_mask = BIT(CSR_INSTRET - CSR_CYCLE); } + } else if (pmu_sbi_is_fw_event(event)) { + ctr_mask = firmware_cmask; } /* retrieve the available counter index */ #if defined(CONFIG_32BIT) ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase, - cmask, cflags, hwc->event_base, hwc->config, + ctr_mask, cflags, hwc->event_base, hwc->config, hwc->config >> 32); #else ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase, - cmask, cflags, hwc->event_base, hwc->config, 0); + ctr_mask, cflags, hwc->event_base, hwc->config, 0); #endif if (ret.error) { pr_debug("Not able to find a counter for event %lx config %llx\n", @@ -511,7 +589,7 @@ static int rvpmu_sbi_ctr_get_idx(struct perf_event *event) } idx = ret.value; - if (!test_bit(idx, &rvpmu->cmask) || !pmu_ctr_list[idx].value) + if (!test_bit(idx, &ctr_mask) || !pmu_ctr_list[idx].value) return -ENOENT; /* Additional sanity check for the counter id */ @@ -561,17 +639,6 @@ static int sbi_pmu_event_find_cache(u64 config) return ret; } -static bool pmu_sbi_is_fw_event(struct perf_event *event) -{ - u32 type = event->attr.type; - u64 config = event->attr.config; - - if ((type == PERF_TYPE_RAW) && ((config >> 63) == 1)) - return true; - else - return false; -} - static int rvpmu_sbi_event_map(struct perf_event *event, u64 *econfig) { u32 type = event->attr.type; @@ -602,7 +669,6 @@ static int rvpmu_sbi_event_map(struct perf_event *event, u64 *econfig) * 10 - SBI firmware events * 11 - Risc-V platform specific firmware event */ - switch (config >> 62) { case 0: /* Return error any bits [48-63] is set as it is not allowed by the spec */ @@ -634,6 +700,84 @@ static int rvpmu_sbi_event_map(struct perf_event *event, u64 *econfig) return ret; } +static int cdeleg_pmu_event_find_cache(u64 config, u64 *eventid, uint32_t *counter_mask) +{ + unsigned int cache_type, cache_op, cache_result; + + if (!current_pmu_cache_event_map) + return -ENOENT; + + cache_type = (config >> 0) & 
0xff; + if (cache_type >= PERF_COUNT_HW_CACHE_MAX) + return -EINVAL; + + cache_op = (config >> 8) & 0xff; + if (cache_op >= PERF_COUNT_HW_CACHE_OP_MAX) + return -EINVAL; + + cache_result = (config >> 16) & 0xff; + if (cache_result >= PERF_COUNT_HW_CACHE_RESULT_MAX) + return -EINVAL; + + if (eventid) + *eventid = current_pmu_cache_event_map[cache_type][cache_op] + [cache_result].event_id; + if (counter_mask) + *counter_mask = current_pmu_cache_event_map[cache_type][cache_op] + [cache_result].counter_mask; + + return 0; +} + +static int rvpmu_cdeleg_event_map(struct perf_event *event, u64 *econfig) +{ + u32 type = event->attr.type; + u64 config = event->attr.config; + int ret = 0; + + /* + * There are two ways standard perf events can be mapped to platform specific + * encoding. + * 1. The vendor may specify the encodings in the driver. + * 2. The Perf tool for RISC-V may remap the standard perf event to platform + * specific encoding. + * + * As RISC-V ISA doesn't define any standard event encoding. Thus, perf tool allows + * vendor to define it via json file. The encoding defined in the json will override + * the perf legacy encoding. However, some user may want to run performance + * monitoring without perf tool as well. That's why, vendors may specify the event + * encoding in the driver as well if they want to support that use case too. + * If an encoding is defined in the json, it will be encoded as a raw event. 
+ */ + + switch (type) { + case PERF_TYPE_HARDWARE: + if (config >= PERF_COUNT_HW_MAX) + return -EINVAL; + if (!current_pmu_hw_event_map) + return -ENOENT; + + *econfig = current_pmu_hw_event_map[config].event_id; + if (*econfig == HW_OP_UNSUPPORTED) + ret = -ENOENT; + break; + case PERF_TYPE_HW_CACHE: + ret = cdeleg_pmu_event_find_cache(config, econfig, NULL); + if (*econfig == HW_OP_UNSUPPORTED) + ret = -ENOENT; + break; + case PERF_TYPE_RAW: + *econfig = config & RISCV_PMU_DELEG_RAW_EVENT_MASK; + break; + default: + ret = -ENOENT; + break; + } + + /* event_base is not used for counter delegation */ + return ret; +} + static void pmu_sbi_snapshot_free(struct riscv_pmu *pmu) { int cpu; @@ -717,7 +861,7 @@ static int pmu_sbi_snapshot_setup(struct riscv_pmu *pmu, int cpu) return 0; } -static u64 rvpmu_sbi_ctr_read(struct perf_event *event) +static u64 rvpmu_ctr_read(struct perf_event *event) { struct hw_perf_event *hwc = &event->hw; int idx = hwc->idx; @@ -794,10 +938,6 @@ static void rvpmu_sbi_ctr_start(struct perf_event *event, u64 ival) if (ret.error && (ret.error != SBI_ERR_ALREADY_STARTED)) pr_err("Starting counter idx %d failed with error %d\n", hwc->idx, sbi_err_map_linux_errno(ret.error)); - - if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) && - (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT)) - rvpmu_set_scounteren((void *)event); } static void rvpmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag) @@ -808,10 +948,6 @@ static void rvpmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag) struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events); struct riscv_pmu_snapshot_data *sdata = cpu_hw_evt->snapshot_addr; - if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) && - (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT)) - rvpmu_reset_scounteren((void *)event); - if (sbi_pmu_snapshot_available()) flag |= SBI_PMU_STOP_FLAG_TAKE_SNAPSHOT; @@ -847,12 +983,6 @@ static int rvpmu_sbi_find_num_ctrs(void) return sbi_err_map_linux_errno(ret.error); } -static 
u32 rvpmu_deleg_find_ctrs(void) -{ - /* TODO */ - return 0; -} - static int rvpmu_sbi_get_ctrinfo(u32 nsbi_ctr, u32 *num_fw_ctr, u32 *num_hw_ctr) { struct sbiret ret; @@ -930,53 +1060,75 @@ static inline void rvpmu_sbi_stop_hw_ctrs(struct riscv_pmu *pmu) } } -/* - * This function starts all the used counters in two step approach. - * Any counter that did not overflow can be start in a single step - * while the overflowed counters need to be started with updated initialization - * value. - */ -static inline void rvpmu_sbi_start_ovf_ctrs_sbi(struct cpu_hw_events *cpu_hw_evt, - u64 ctr_ovf_mask) +static void rvpmu_deleg_ctr_start_mask(unsigned long mask) { - int idx = 0, i; - struct perf_event *event; - unsigned long flag = SBI_PMU_START_FLAG_SET_INIT_VALUE; - unsigned long ctr_start_mask = 0; - uint64_t max_period; - struct hw_perf_event *hwc; - u64 init_val = 0; + unsigned long scountinhibit_val = 0; - for (i = 0; i < BITS_TO_LONGS(RISCV_MAX_COUNTERS); i++) { - ctr_start_mask = cpu_hw_evt->used_hw_ctrs[i] & ~ctr_ovf_mask; - /* Start all the counters that did not overflow in a single shot */ - sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, i * BITS_PER_LONG, ctr_start_mask, - 0, 0, 0, 0); - } + scountinhibit_val = csr_read(CSR_SCOUNTINHIBIT); + scountinhibit_val &= ~mask; + + csr_write(CSR_SCOUNTINHIBIT, scountinhibit_val); +} + +static void rvpmu_deleg_ctr_enable_irq(struct perf_event *event) +{ + unsigned long hpmevent_curr; + unsigned long of_mask; + struct hw_perf_event *hwc = &event->hw; + int counter_idx = hwc->idx; + unsigned long sip_val = csr_read(CSR_SIP); + + if (!is_sampling_event(event) || (sip_val & SIP_LCOFIP)) + return; - /* Reinitialize and start all the counter that overflowed */ - while (ctr_ovf_mask) { - if (ctr_ovf_mask & 0x01) { - event = cpu_hw_evt->events[idx]; - hwc = &event->hw; - max_period = riscv_pmu_ctr_get_width_mask(event); - init_val = local64_read(&hwc->prev_count) & max_period; #if defined(CONFIG_32BIT) - sbi_ecall(SBI_EXT_PMU, 
SBI_EXT_PMU_COUNTER_START, idx, 1, - flag, init_val, init_val >> 32, 0); + hpmevent_curr = csr_ind_read(CSR_SIREG5, SISELECT_SSCCFG_BASE, counter_idx); + of_mask = (u32)~HPMEVENTH_OF; #else - sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, idx, 1, - flag, init_val, 0, 0); + hpmevent_curr = csr_ind_read(CSR_SIREG2, SISELECT_SSCCFG_BASE, counter_idx); + of_mask = ~HPMEVENT_OF; +#endif + + hpmevent_curr &= of_mask; +#if defined(CONFIG_32BIT) + csr_ind_write(CSR_SIREG4, SISELECT_SSCCFG_BASE, counter_idx, hpmevent_curr); +#else + csr_ind_write(CSR_SIREG2, SISELECT_SSCCFG_BASE, counter_idx, hpmevent_curr); #endif - perf_event_update_userpage(event); - } - ctr_ovf_mask = ctr_ovf_mask >> 1; - idx++; - } } -static inline void rvpmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_hw_evt, - u64 ctr_ovf_mask) +static void rvpmu_deleg_ctr_start(struct perf_event *event, u64 ival) +{ + unsigned long scountinhibit_val = 0; + struct hw_perf_event *hwc = &event->hw; + +#if defined(CONFIG_32BIT) + csr_ind_write(CSR_SIREG, SISELECT_SSCCFG_BASE, hwc->idx, ival & 0xFFFFFFFF); + csr_ind_write(CSR_SIREG4, SISELECT_SSCCFG_BASE, hwc->idx, ival >> BITS_PER_LONG); +#else + csr_ind_write(CSR_SIREG, SISELECT_SSCCFG_BASE, hwc->idx, ival); +#endif + + rvpmu_deleg_ctr_enable_irq(event); + + scountinhibit_val = csr_read(CSR_SCOUNTINHIBIT); + scountinhibit_val &= ~(1 << hwc->idx); + + csr_write(CSR_SCOUNTINHIBIT, scountinhibit_val); +} + +static void rvpmu_deleg_ctr_stop_mask(unsigned long mask) +{ + unsigned long scountinhibit_val = 0; + + scountinhibit_val = csr_read(CSR_SCOUNTINHIBIT); + scountinhibit_val |= mask; + + csr_write(CSR_SCOUNTINHIBIT, scountinhibit_val); +} + +static void rvpmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_hw_evt, + u64 ctr_ovf_mask) { int i, idx = 0; struct perf_event *event; @@ -1010,15 +1162,53 @@ static inline void rvpmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_h } } -static void rvpmu_sbi_start_overflow_mask(struct riscv_pmu *pmu, - 
u64 ctr_ovf_mask) +/* + * This function starts all the used counters in two step approach. + * Any counter that did not overflow can be start in a single step + * while the overflowed counters need to be started with updated initialization + * value. + */ +static void rvpmu_start_overflow_mask(struct riscv_pmu *pmu, u64 ctr_ovf_mask) { + int idx = 0, i; + struct perf_event *event; + unsigned long ctr_start_mask = 0; + u64 max_period, init_val = 0; + struct hw_perf_event *hwc; struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events); if (sbi_pmu_snapshot_available()) - rvpmu_sbi_start_ovf_ctrs_snapshot(cpu_hw_evt, ctr_ovf_mask); - else - rvpmu_sbi_start_ovf_ctrs_sbi(cpu_hw_evt, ctr_ovf_mask); + return rvpmu_sbi_start_ovf_ctrs_snapshot(cpu_hw_evt, ctr_ovf_mask); + + /* Start all the counters that did not overflow */ + if (riscv_pmu_cdeleg_available()) { + ctr_start_mask = cpu_hw_evt->used_hw_ctrs[i] & ~ctr_ovf_mask; + rvpmu_deleg_ctr_start_mask(ctr_start_mask); + } else { + for (i = 0; i < BITS_TO_LONGS(RISCV_MAX_COUNTERS); i++) { + ctr_start_mask = cpu_hw_evt->used_hw_ctrs[i] & ~ctr_ovf_mask; + /* Start all the counters that did not overflow in a single shot */ + sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, i * BITS_PER_LONG, + ctr_start_mask, 0, 0, 0, 0); + } + } + + /* Reinitialize and start all the counter that overflowed */ + while (ctr_ovf_mask) { + if (ctr_ovf_mask & 0x01) { + event = cpu_hw_evt->events[idx]; + hwc = &event->hw; + max_period = riscv_pmu_ctr_get_width_mask(event); + init_val = local64_read(&hwc->prev_count) & max_period; + if (riscv_pmu_cdeleg_available()) + rvpmu_deleg_ctr_start(event, init_val); + else + rvpmu_sbi_ctr_start(event, init_val); + perf_event_update_userpage(event); + } + ctr_ovf_mask = ctr_ovf_mask >> 1; + idx++; + } } static irqreturn_t rvpmu_ovf_handler(int irq, void *dev) @@ -1053,7 +1243,10 @@ static irqreturn_t rvpmu_ovf_handler(int irq, void *dev) } pmu = to_riscv_pmu(event->pmu); - rvpmu_sbi_stop_hw_ctrs(pmu); 
+ if (riscv_pmu_cdeleg_available()) + rvpmu_deleg_ctr_stop_mask(cpu_hw_evt->used_hw_ctrs[0]); + else + rvpmu_sbi_stop_hw_ctrs(pmu); /* Overflow status register should only be read after counter are stopped */ if (sbi_pmu_snapshot_available()) @@ -1122,25 +1315,177 @@ static irqreturn_t rvpmu_ovf_handler(int irq, void *dev) hw_evt->state = 0; } - rvpmu_sbi_start_overflow_mask(pmu, overflowed_ctrs); + rvpmu_start_overflow_mask(pmu, overflowed_ctrs); perf_sample_event_took(sched_clock() - start_clock); return IRQ_HANDLED; } +static int get_deleg_hw_ctr_width(int counter_offset) +{ + unsigned long hpm_warl; + int num_bits; + + if (counter_offset < 3 || counter_offset > 31) + return 0; + + hpm_warl = csr_ind_warl(CSR_SIREG, SISELECT_SSCCFG_BASE, counter_offset, -1); + num_bits = __fls(hpm_warl); + +#if defined(CONFIG_32BIT) + hpm_warl = csr_ind_warl(CSR_SIREG4, SISELECT_SSCCFG_BASE, counter_offset, -1); + num_bits += __fls(hpm_warl); +#endif + return num_bits; +} + +static int rvpmu_deleg_find_ctrs(void) +{ + int i, num_hw_ctr = 0; + union sbi_pmu_ctr_info cinfo; + unsigned long scountinhibit_old = 0; + + /* Do a WARL write/read to detect which hpmcounters have been delegated */ + scountinhibit_old = csr_read(CSR_SCOUNTINHIBIT); + csr_write(CSR_SCOUNTINHIBIT, -1); + cmask = csr_read(CSR_SCOUNTINHIBIT); + + csr_write(CSR_SCOUNTINHIBIT, scountinhibit_old); + + for_each_set_bit(i, &cmask, RISCV_MAX_HW_COUNTERS) { + if (unlikely(i == 1)) + continue; /* This should never happen as TM is read only */ + cinfo.value = 0; + cinfo.type = SBI_PMU_CTR_TYPE_HW; + /* + * If counter delegation is enabled, the csr stored to the cinfo will + * be a virtual counter that the delegation attempts to read. 
+ */ + cinfo.csr = CSR_CYCLE + i; + if (i == 0 || i == 2) + cinfo.width = 63; + else + cinfo.width = get_deleg_hw_ctr_width(i); + + num_hw_ctr++; + pmu_ctr_list[i].value = cinfo.value; + } + + return num_hw_ctr; +} + +static int get_deleg_fixed_hw_idx(struct cpu_hw_events *cpuc, struct perf_event *event) +{ + return -EINVAL; +} + +static int get_deleg_next_hpm_hw_idx(struct cpu_hw_events *cpuc, struct perf_event *event) +{ + unsigned long hw_ctr_mask = 0; + + /* + * TODO: Treat every hpmcounter can monitor every event for now. + * The event to counter mapping should come from the json file. + * The mapping should also tell if sampling is supported or not. + */ + + /* Select only hpmcounters */ + hw_ctr_mask = cmask & (~0x7); + hw_ctr_mask &= ~(cpuc->used_hw_ctrs[0]); + return __ffs(hw_ctr_mask); +} + +static void update_deleg_hpmevent(int counter_idx, uint64_t event_value, uint64_t filter_bits) +{ + u64 hpmevent_value = 0; + + /* OF bit should be enable during the start if sampling is requested */ + hpmevent_value = (event_value & ~HPMEVENT_MASK) | filter_bits | HPMEVENT_OF; +#if defined(CONFIG_32BIT) + csr_ind_write(CSR_SIREG2, SISELECT_SSCCFG_BASE, counter_idx, hpmevent_value & 0xFFFFFFFF); + if (riscv_isa_extension_available(NULL, SSCOFPMF)) + csr_ind_write(CSR_SIREG5, SISELECT_SSCCFG_BASE, counter_idx, + hpmevent_value >> BITS_PER_LONG); +#else + csr_ind_write(CSR_SIREG2, SISELECT_SSCCFG_BASE, counter_idx, hpmevent_value); +#endif +} + +static int rvpmu_deleg_ctr_get_idx(struct perf_event *event) +{ + struct hw_perf_event *hwc = &event->hw; + struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu); + struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events); + unsigned long hw_ctr_max_id; + u64 priv_filter; + int idx; + + /* + * TODO: We should not rely on SBI Perf encoding to check if the event + * is a fixed one or not. 
+ */ + if (!is_sampling_event(event)) { + idx = get_deleg_fixed_hw_idx(cpuc, event); + if (idx == 0 || idx == 2) { + /* Priv mode filter bits are only available if smcntrpmf is present */ + if (riscv_isa_extension_available(NULL, SMCNTRPMF)) + goto found_idx; + else + goto skip_update; + } + } + + hw_ctr_max_id = __fls(cmask); + idx = get_deleg_next_hpm_hw_idx(cpuc, event); + if (idx < 3 || idx > hw_ctr_max_id) + goto out_err; +found_idx: + priv_filter = get_deleg_priv_filter_bits(event); + update_deleg_hpmevent(idx, hwc->config, priv_filter); +skip_update: + if (!test_and_set_bit(idx, cpuc->used_hw_ctrs)) + return idx; +out_err: + return -ENOENT; +} + static void rvpmu_ctr_start(struct perf_event *event, u64 ival) { - rvpmu_sbi_ctr_start(event, ival); - /* TODO: Counter delegation implementation */ + struct hw_perf_event *hwc = &event->hw; + + if (riscv_pmu_cdeleg_available() && !pmu_sbi_is_fw_event(event)) + rvpmu_deleg_ctr_start(event, ival); + else + rvpmu_sbi_ctr_start(event, ival); + + if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) && + (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT)) + rvpmu_set_scounteren((void *)event); } static void rvpmu_ctr_stop(struct perf_event *event, unsigned long flag) { - rvpmu_sbi_ctr_stop(event, flag); - /* TODO: Counter delegation implementation */ + struct hw_perf_event *hwc = &event->hw; + + if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) && + (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT)) + rvpmu_reset_scounteren((void *)event); + + if (riscv_pmu_cdeleg_available() && !pmu_sbi_is_fw_event(event)) { + /* + * The counter is already stopped. No need to stop again. Counter + * mapping will be reset in clear_idx function. 
+ */ + if (flag != RISCV_PMU_STOP_FLAG_RESET) + rvpmu_deleg_ctr_stop_mask((1 << hwc->idx)); + else + update_deleg_hpmevent(hwc->idx, 0, 0); + } else { + rvpmu_sbi_ctr_stop(event, flag); + } } -static int rvpmu_find_ctrs(void) +static u32 rvpmu_find_ctrs(void) { u32 num_sbi_counters = 0, num_deleg_counters = 0; u32 num_hw_ctr = 0, num_fw_ctr = 0, num_ctr = 0; @@ -1179,20 +1524,18 @@ static int rvpmu_find_ctrs(void) static int rvpmu_event_map(struct perf_event *event, u64 *econfig) { - return rvpmu_sbi_event_map(event, econfig); - /* TODO: Counter delegation implementation */ + if (riscv_pmu_cdeleg_available() && !pmu_sbi_is_fw_event(event)) + return rvpmu_cdeleg_event_map(event, econfig); + else + return rvpmu_sbi_event_map(event, econfig); } static int rvpmu_ctr_get_idx(struct perf_event *event) { - return rvpmu_sbi_ctr_get_idx(event); - /* TODO: Counter delegation implementation */ -} - -static u64 rvpmu_ctr_read(struct perf_event *event) -{ - return rvpmu_sbi_ctr_read(event); - /* TODO: Counter delegation implementation */ + if (riscv_pmu_cdeleg_available() && !pmu_sbi_is_fw_event(event)) + return rvpmu_deleg_ctr_get_idx(event); + else + return rvpmu_sbi_ctr_get_idx(event); } static int rvpmu_starting_cpu(unsigned int cpu, struct hlist_node *node) @@ -1210,7 +1553,16 @@ static int rvpmu_starting_cpu(unsigned int cpu, struct hlist_node *node) csr_write(CSR_SCOUNTEREN, 0x2); /* Stop all the counters so that they can be enabled from perf */ - rvpmu_sbi_stop_all(pmu); + if (riscv_pmu_cdeleg_available()) { + rvpmu_deleg_ctr_stop_mask(cmask); + if (riscv_pmu_sbi_available()) { + /* Stop the firmware counters as well */ + sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_STOP, 0, firmware_cmask, + 0, 0, 0, 0); + } + } else { + rvpmu_sbi_stop_all(pmu); + } if (riscv_pmu_use_irq) { cpu_hw_evt->irq = riscv_pmu_irq; @@ -1509,8 +1861,11 @@ static int rvpmu_device_probe(struct platform_device *pdev) pmu->pmu.capabilities |= PERF_PMU_CAP_NO_EXCLUDE; } - pmu->pmu.attr_groups = 
riscv_pmu_attr_groups; pmu->pmu.parent = &pdev->dev; + if (riscv_pmu_cdeleg_available_boot()) + pmu->pmu.attr_groups = riscv_cdeleg_pmu_attr_groups; + else + pmu->pmu.attr_groups = riscv_sbi_pmu_attr_groups; pmu->cmask = cmask; pmu->ctr_start = rvpmu_ctr_start; pmu->ctr_stop = rvpmu_ctr_stop; diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h index a3e1fdd5084a..9e2758c32e8b 100644 --- a/include/linux/perf/riscv_pmu.h +++ b/include/linux/perf/riscv_pmu.h @@ -20,6 +20,7 @@ */ #define RISCV_MAX_COUNTERS 64 +#define RISCV_MAX_HW_COUNTERS 32 #define RISCV_OP_UNSUPP (-EOPNOTSUPP) #define RISCV_PMU_PDEV_NAME "riscv-pmu" #define RISCV_PMU_LEGACY_PDEV_NAME "riscv-pmu-legacy" @@ -28,6 +29,8 @@ #define RISCV_PMU_CONFIG1_GUEST_EVENTS 0x1 +#define RISCV_PMU_DELEG_RAW_EVENT_MASK GENMASK_ULL(55, 0) + #define HW_OP_UNSUPPORTED 0xFFFF #define CACHE_OP_UNSUPPORTED 0xFFFF -- 2.43.0 From atishp at rivosinc.com Thu Mar 27 12:35:56 2025 From: atishp at rivosinc.com (Atish Patra) Date: Thu, 27 Mar 2025 12:35:56 -0700 Subject: [PATCH v5 15/21] RISC-V: perf: Skip PMU SBI extension when not implemented In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> Message-ID: <20250327-counter_delegation-v5-15-1ee538468d1b@rivosinc.com> From: Charlie Jenkins When the PMU SBI extension is not implemented, sbi_v2_available should not be set to true. The SBI implementation for counter config matching and firmware counter read should also be skipped when the SBI extension is not implemented. 
Signed-off-by: Atish Patra Signed-off-by: Charlie Jenkins --- drivers/perf/riscv_pmu_dev.c | 38 +++++++++++++++++++++++--------------- 1 file changed, 23 insertions(+), 15 deletions(-) diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c index 7c4a1ef15866..d1cc8310423f 100644 --- a/drivers/perf/riscv_pmu_dev.c +++ b/drivers/perf/riscv_pmu_dev.c @@ -417,18 +417,22 @@ static void rvpmu_sbi_check_event(struct sbi_pmu_event_data *edata) } } -static void rvpmu_sbi_check_std_events(struct work_struct *work) +static void rvpmu_check_std_events(struct work_struct *work) { - for (int i = 0; i < ARRAY_SIZE(pmu_hw_event_sbi_map); i++) - rvpmu_sbi_check_event(&pmu_hw_event_sbi_map[i]); - - for (int i = 0; i < ARRAY_SIZE(pmu_cache_event_sbi_map); i++) - for (int j = 0; j < ARRAY_SIZE(pmu_cache_event_sbi_map[i]); j++) - for (int k = 0; k < ARRAY_SIZE(pmu_cache_event_sbi_map[i][j]); k++) - rvpmu_sbi_check_event(&pmu_cache_event_sbi_map[i][j][k]); + if (riscv_pmu_sbi_available()) { + for (int i = 0; i < ARRAY_SIZE(pmu_hw_event_sbi_map); i++) + rvpmu_sbi_check_event(&pmu_hw_event_sbi_map[i]); + + for (int i = 0; i < ARRAY_SIZE(pmu_cache_event_sbi_map); i++) + for (int j = 0; j < ARRAY_SIZE(pmu_cache_event_sbi_map[i]); j++) + for (int k = 0; k < ARRAY_SIZE(pmu_cache_event_sbi_map[i][j]); k++) + rvpmu_sbi_check_event(&pmu_cache_event_sbi_map[i][j][k]); + } else { + DO_ONCE_LITE_IF(1, pr_err, "Boot time config matching not required for smcdeleg\n"); + } } -static DECLARE_WORK(check_std_events_work, rvpmu_sbi_check_std_events); +static DECLARE_WORK(check_std_events_work, rvpmu_check_std_events); static ssize_t rvpmu_format_show(struct device *dev, struct device_attribute *attr, char *buf) @@ -556,6 +560,9 @@ static int rvpmu_sbi_ctr_get_idx(struct perf_event *event) cflags = rvpmu_sbi_get_filter_flags(event); + if (!riscv_pmu_sbi_available()) + return -ENOENT; + /* * In legacy mode, we have to force the fixed counters for those events * but not in the user access 
mode as we want to use the other counters @@ -878,7 +885,7 @@ static u64 rvpmu_ctr_read(struct perf_event *event) return val; } - if (pmu_sbi_is_fw_event(event)) { + if (pmu_sbi_is_fw_event(event) && riscv_pmu_sbi_available()) { ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_FW_READ, hwc->idx, 0, 0, 0, 0, 0); if (ret.error) @@ -1945,12 +1952,13 @@ static int __init rvpmu_devinit(void) int ret; struct platform_device *pdev; - if (sbi_spec_version >= sbi_mk_version(0, 3) && - sbi_probe_extension(SBI_EXT_PMU)) - static_branch_enable(&riscv_pmu_sbi_available); + if (sbi_probe_extension(SBI_EXT_PMU)) { + if (sbi_spec_version >= sbi_mk_version(0, 3)) + static_branch_enable(&riscv_pmu_sbi_available); + if (sbi_spec_version >= sbi_mk_version(2, 0)) + sbi_v2_available = true; + } - if (sbi_spec_version >= sbi_mk_version(2, 0)) - sbi_v2_available = true; /* * We need all three extensions to be present to access the counters * in S-mode via Supervisor Counter delegation. -- 2.43.0 From atishp at rivosinc.com Thu Mar 27 12:35:57 2025 From: atishp at rivosinc.com (Atish Patra) Date: Thu, 27 Mar 2025 12:35:57 -0700 Subject: [PATCH v5 16/21] RISC-V: perf: Use config2/vendor table for event to counter mapping In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> Message-ID: <20250327-counter_delegation-v5-16-1ee538468d1b@rivosinc.com> The counter restriction specified in the json file is passed to the drivers via config2 parameter in perf attributes. This allows any platform vendor to define their custom mapping between event and hpmcounters without any rules defined in the ISA. For legacy events, the platform vendor may define the mapping in the driver in the vendor event table. The fixed cycle and instruction counters are fixed (0 and 2 respectively) by the ISA and map to the legacy events. The platform vendor must specify this in the driver if intended to be used while profiling.
Otherwise, they can just specify the alternate hpmcounters that may monitor and/or sample the cycle/instruction counts. Signed-off-by: Atish Patra --- drivers/perf/riscv_pmu_dev.c | 79 ++++++++++++++++++++++++++++++++++-------- include/linux/perf/riscv_pmu.h | 2 ++ 2 files changed, 67 insertions(+), 14 deletions(-) diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c index d1cc8310423f..92ff42aca44b 100644 --- a/drivers/perf/riscv_pmu_dev.c +++ b/drivers/perf/riscv_pmu_dev.c @@ -76,6 +76,7 @@ static ssize_t __maybe_unused rvpmu_format_show(struct device *dev, struct devic RVPMU_ATTR_ENTRY(_name, rvpmu_format_show, (char *)_config) PMU_FORMAT_ATTR(firmware, "config:62-63"); +PMU_FORMAT_ATTR(counterid_mask, "config2:0-31"); static bool sbi_v2_available; static DEFINE_STATIC_KEY_FALSE(sbi_pmu_snapshot_available); @@ -119,6 +120,7 @@ static const struct attribute_group *riscv_sbi_pmu_attr_groups[] = { static struct attribute *riscv_cdeleg_pmu_formats_attr[] = { RVPMU_FORMAT_ATTR_ENTRY(event, RVPMU_CDELEG_PMU_FORMAT_ATTR), &format_attr_firmware.attr, + &format_attr_counterid_mask.attr, NULL, }; @@ -1381,24 +1383,77 @@ static int rvpmu_deleg_find_ctrs(void) return num_hw_ctr; } +/* + * The json file must correctly specify counter 0 or counter 2 is available + * in the counter lists for cycle/instret events. Otherwise, the drivers have + * no way to figure out if a fixed counter must be used and pick a programmable + * counter if available. 
+ */ static int get_deleg_fixed_hw_idx(struct cpu_hw_events *cpuc, struct perf_event *event) { - return -EINVAL; + struct hw_perf_event *hwc = &event->hw; + bool guest_events = event->attr.config1 & RISCV_PMU_CONFIG1_GUEST_EVENTS; + + if (guest_events) { + if (hwc->event_base == SBI_PMU_HW_CPU_CYCLES) + return 0; + if (hwc->event_base == SBI_PMU_HW_INSTRUCTIONS) + return 2; + else + return -EINVAL; + } + + if (!event->attr.config2) + return -EINVAL; + + if (event->attr.config2 & RISCV_PMU_CYCLE_FIXED_CTR_MASK) + return 0; /* CY counter */ + else if (event->attr.config2 & RISCV_PMU_INSTRUCTION_FIXED_CTR_MASK) + return 2; /* IR counter */ + else + return -EINVAL; } static int get_deleg_next_hpm_hw_idx(struct cpu_hw_events *cpuc, struct perf_event *event) { - unsigned long hw_ctr_mask = 0; + u32 hw_ctr_mask = 0, temp_mask = 0; + u32 type = event->attr.type; + u64 config = event->attr.config; + int ret; - /* - * TODO: Treat every hpmcounter can monitor every event for now. - * The event to counter mapping should come from the json file. - * The mapping should also tell if sampling is supported or not. - */ + /* Select only available hpmcounters */ + hw_ctr_mask = cmask & (~0x7) & ~(cpuc->used_hw_ctrs[0]); + + switch (type) { + case PERF_TYPE_HARDWARE: + temp_mask = current_pmu_hw_event_map[config].counter_mask; + break; + case PERF_TYPE_HW_CACHE: + ret = cdeleg_pmu_event_find_cache(config, NULL, &temp_mask); + if (ret) + return ret; + break; + case PERF_TYPE_RAW: + /* + * Mask off the counters that can't monitor this event (specified via json) + * The counter mask for this event is set in config2 via the property 'Counter' + * in the json file or manual configuration of config2. If the config2 is not set, + * it is assumed all the available hpmcounters can monitor this event. + * Note: This assumption may fail for virtualization use case where they hypervisor + * (e.g. KVM) virtualizes the counter. 
Any event to counter mapping provided by the + * guest is meaningless from a hypervisor perspective. Thus, the hypervisor doesn't + * set config2 when creating kernel counter and relies default host mapping. + */ + if (event->attr.config2) + temp_mask = event->attr.config2; + break; + default: + break; + } + + if (temp_mask) + hw_ctr_mask &= temp_mask; - /* Select only hpmcounters */ - hw_ctr_mask = cmask & (~0x7); - hw_ctr_mask &= ~(cpuc->used_hw_ctrs[0]); return __ffs(hw_ctr_mask); } @@ -1427,10 +1482,6 @@ static int rvpmu_deleg_ctr_get_idx(struct perf_event *event) u64 priv_filter; int idx; - /* - * TODO: We should not rely on SBI Perf encoding to check if the event - * is a fixed one or not. - */ if (!is_sampling_event(event)) { idx = get_deleg_fixed_hw_idx(cpuc, event); if (idx == 0 || idx == 2) { diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h index 9e2758c32e8b..e58f83811988 100644 --- a/include/linux/perf/riscv_pmu.h +++ b/include/linux/perf/riscv_pmu.h @@ -30,6 +30,8 @@ #define RISCV_PMU_CONFIG1_GUEST_EVENTS 0x1 #define RISCV_PMU_DELEG_RAW_EVENT_MASK GENMASK_ULL(55, 0) +#define RISCV_PMU_CYCLE_FIXED_CTR_MASK 0x01 +#define RISCV_PMU_INSTRUCTION_FIXED_CTR_MASK 0x04 #define HW_OP_UNSUPPORTED 0xFFFF #define CACHE_OP_UNSUPPORTED 0xFFFF -- 2.43.0 From atishp at rivosinc.com Thu Mar 27 12:35:58 2025 From: atishp at rivosinc.com (Atish Patra) Date: Thu, 27 Mar 2025 12:35:58 -0700 Subject: [PATCH v5 17/21] RISC-V: perf: Add legacy event encodings via sysfs In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> Message-ID: <20250327-counter_delegation-v5-17-1ee538468d1b@rivosinc.com> Define sysfs details for the legacy events so that any tool can parse these to understand the minimum set of legacy events supported by the platform. 
The sysfs entry will describe both event encoding and corresponding counter map so that a perf event can be programmed accordingly. Signed-off-by: Atish Patra --- drivers/perf/riscv_pmu_dev.c | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c index 92ff42aca44b..8a079949e3a4 100644 --- a/drivers/perf/riscv_pmu_dev.c +++ b/drivers/perf/riscv_pmu_dev.c @@ -129,7 +129,20 @@ static struct attribute_group riscv_cdeleg_pmu_format_group = { .attrs = riscv_cdeleg_pmu_formats_attr, }; +#define RVPMU_EVENT_ATTR_RESOLVE(m) #m +#define RVPMU_EVENT_CMASK_ATTR(_name, _var, config, mask) \ + PMU_EVENT_ATTR_STRING(_name, rvpmu_event_attr_##_var, \ + "event=" RVPMU_EVENT_ATTR_RESOLVE(config) \ + ",counterid_mask=" RVPMU_EVENT_ATTR_RESOLVE(mask) "\n") + +#define RVPMU_EVENT_ATTR_PTR(name) (&rvpmu_event_attr_##name.attr.attr) + +static struct attribute_group riscv_cdeleg_pmu_event_group __ro_after_init = { + .name = "events", +}; + static const struct attribute_group *riscv_cdeleg_pmu_attr_groups[] = { + &riscv_cdeleg_pmu_event_group, &riscv_cdeleg_pmu_format_group, NULL, }; @@ -369,11 +382,14 @@ struct riscv_vendor_pmu_events { const struct riscv_pmu_event *hw_event_map; const struct riscv_pmu_event (*cache_event_map)[PERF_COUNT_HW_CACHE_OP_MAX] [PERF_COUNT_HW_CACHE_RESULT_MAX]; + struct attribute **attrs_events; }; -#define RISCV_VENDOR_PMU_EVENTS(_vendorid, _archid, _implid, _hw_event_map, _cache_event_map) \ +#define RISCV_VENDOR_PMU_EVENTS(_vendorid, _archid, _implid, _hw_event_map, \ + _cache_event_map, _attrs) \ { .vendorid = _vendorid, .archid = _archid, .implid = _implid, \ - .hw_event_map = _hw_event_map, .cache_event_map = _cache_event_map }, + .hw_event_map = _hw_event_map, .cache_event_map = _cache_event_map, \ + .attrs_events = _attrs }, static struct riscv_vendor_pmu_events pmu_vendor_events_table[] = { }; @@ -395,6 +411,8 @@ static void
rvpmu_vendor_register_events(void) pmu_vendor_events_table[i].archid == arch_id) { current_pmu_hw_event_map = pmu_vendor_events_table[i].hw_event_map; current_pmu_cache_event_map = pmu_vendor_events_table[i].cache_event_map; + riscv_cdeleg_pmu_event_group.attrs = + pmu_vendor_events_table[i].attrs_events; break; } } -- 2.43.0 From atishp at rivosinc.com Thu Mar 27 12:35:59 2025 From: atishp at rivosinc.com (Atish Patra) Date: Thu, 27 Mar 2025 12:35:59 -0700 Subject: [PATCH v5 18/21] RISC-V: perf: Add Qemu virt machine events In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> Message-ID: <20250327-counter_delegation-v5-18-1ee538468d1b@rivosinc.com> Qemu virt machine supports a very minimal set of legacy perf events. Add them to the vendor table so that users can use them when counter delegation is enabled. Signed-off-by: Atish Patra --- arch/riscv/include/asm/vendorid_list.h | 4 ++++ drivers/perf/riscv_pmu_dev.c | 36 ++++++++++++++++++++++++++++++++++ 2 files changed, 40 insertions(+) diff --git a/arch/riscv/include/asm/vendorid_list.h b/arch/riscv/include/asm/vendorid_list.h index a5150cdf34d8..0eefc844923e 100644 --- a/arch/riscv/include/asm/vendorid_list.h +++ b/arch/riscv/include/asm/vendorid_list.h @@ -10,4 +10,8 @@ #define SIFIVE_VENDOR_ID 0x489 #define THEAD_VENDOR_ID 0x5b7 +#define QEMU_VIRT_VENDOR_ID 0x000 +#define QEMU_VIRT_IMPL_ID 0x000 +#define QEMU_VIRT_ARCH_ID 0x000 + #endif diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c index 8a079949e3a4..cd2ac4cf34f1 100644 --- a/drivers/perf/riscv_pmu_dev.c +++ b/drivers/perf/riscv_pmu_dev.c @@ -26,6 +26,7 @@ #include #include #include +#include #include #include #include @@ -391,7 +392,42 @@ struct riscv_vendor_pmu_events { .hw_event_map = _hw_event_map, .cache_event_map = _cache_event_map, \ .attrs_events = _attrs }, +/* QEMU virt PMU events */ +static const struct riscv_pmu_event 
qemu_virt_hw_event_map[PERF_COUNT_HW_MAX] = { + PERF_MAP_ALL_UNSUPPORTED, + [PERF_COUNT_HW_CPU_CYCLES] = {0x01, 0xFFFFFFF8}, + [PERF_COUNT_HW_INSTRUCTIONS] = {0x02, 0xFFFFFFF8} +}; + +static const struct riscv_pmu_event qemu_virt_cache_event_map[PERF_COUNT_HW_CACHE_MAX] + [PERF_COUNT_HW_CACHE_OP_MAX] + [PERF_COUNT_HW_CACHE_RESULT_MAX] = { + PERF_CACHE_MAP_ALL_UNSUPPORTED, + [C(DTLB)][C(OP_READ)][C(RESULT_MISS)] = {0x10019, 0xFFFFFFF8}, + [C(DTLB)][C(OP_WRITE)][C(RESULT_MISS)] = {0x1001B, 0xFFFFFFF8}, + + [C(ITLB)][C(OP_READ)][C(RESULT_MISS)] = {0x10021, 0xFFFFFFF8}, +}; + +RVPMU_EVENT_CMASK_ATTR(cycles, cycles, 0x01, 0xFFFFFFF8); +RVPMU_EVENT_CMASK_ATTR(instructions, instructions, 0x02, 0xFFFFFFF8); +RVPMU_EVENT_CMASK_ATTR(dTLB-load-misses, dTLB_load_miss, 0x10019, 0xFFFFFFF8); +RVPMU_EVENT_CMASK_ATTR(dTLB-store-misses, dTLB_store_miss, 0x1001B, 0xFFFFFFF8); +RVPMU_EVENT_CMASK_ATTR(iTLB-load-misses, iTLB_load_miss, 0x10021, 0xFFFFFFF8); + +static struct attribute *qemu_virt_event_group[] = { + RVPMU_EVENT_ATTR_PTR(cycles), + RVPMU_EVENT_ATTR_PTR(instructions), + RVPMU_EVENT_ATTR_PTR(dTLB_load_miss), + RVPMU_EVENT_ATTR_PTR(dTLB_store_miss), + RVPMU_EVENT_ATTR_PTR(iTLB_load_miss), + NULL, +}; + static struct riscv_vendor_pmu_events pmu_vendor_events_table[] = { + RISCV_VENDOR_PMU_EVENTS(QEMU_VIRT_VENDOR_ID, QEMU_VIRT_ARCH_ID, QEMU_VIRT_IMPL_ID, + qemu_virt_hw_event_map, qemu_virt_cache_event_map, + qemu_virt_event_group) }; const struct riscv_pmu_event *current_pmu_hw_event_map; -- 2.43.0 From atishp at rivosinc.com Thu Mar 27 12:36:00 2025 From: atishp at rivosinc.com (Atish Patra) Date: Thu, 27 Mar 2025 12:36:00 -0700 Subject: [PATCH v5 19/21] tools/perf: Support event code for arch standard events In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> Message-ID: <20250327-counter_delegation-v5-19-1ee538468d1b@rivosinc.com> RISC-V relies on the event encoding from the json 
file. That includes arch standard events. If event code is present, event is already updated with correct encoding. No need to update it again which results in losing the event encoding. Signed-off-by: Atish Patra --- tools/perf/pmu-events/arch/riscv/arch-standard.json | 10 ++++++++++ tools/perf/pmu-events/jevents.py | 4 +++- 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/tools/perf/pmu-events/arch/riscv/arch-standard.json b/tools/perf/pmu-events/arch/riscv/arch-standard.json new file mode 100644 index 000000000000..96e21f088558 --- /dev/null +++ b/tools/perf/pmu-events/arch/riscv/arch-standard.json @@ -0,0 +1,10 @@ +[ + { + "EventName": "cycles", + "BriefDescription": "cycle executed" + }, + { + "EventName": "instructions", + "BriefDescription": "instruction retired" + } +] diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py index fa7c466a5ef3..fdb7ddf093d2 100755 --- a/tools/perf/pmu-events/jevents.py +++ b/tools/perf/pmu-events/jevents.py @@ -417,7 +417,9 @@ class JsonEvent: self.long_desc += extra_desc if arch_std: if arch_std.lower() in _arch_std_events: - event = _arch_std_events[arch_std.lower()].event + # No need to replace as evencode would have updated the event before + if not eventcode: + event = _arch_std_events[arch_std.lower()].event # Copy from the architecture standard event to self for undefined fields. 
for attr, value in _arch_std_events[arch_std.lower()].__dict__.items(): if hasattr(self, attr) and not getattr(self, attr): -- 2.43.0 From atishp at rivosinc.com Thu Mar 27 12:36:01 2025 From: atishp at rivosinc.com (Atish Patra) Date: Thu, 27 Mar 2025 12:36:01 -0700 Subject: [PATCH v5 20/21] tools/perf: Pass the Counter constraint values in the pmu events In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> Message-ID: <20250327-counter_delegation-v5-20-1ee538468d1b@rivosinc.com> RISC-V doesn't have any standard event to counter mapping discovery mechanism in the ISA. The ISA defines 29 programmable counters and platforms can choose to implement any number of them and map any events to any counters. Thus, the perf tool needs to inform the driver about the counter mapping of each event. The current perf infrastructure only parses the 'Counter' constraints in metrics. This patch extends that to pass them in the pmu events so that any driver can retrieve those values via perf attributes if defined accordingly.
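[Illustration, not part of the posted patch.] A minimal Python sketch of the idea described above, assuming the json 'Counter' property is a comma-separated list of counter ids such as "3,4,5" and that the result is appended to the event string as a counterid_mask term. The helper append_counterid_mask is a hypothetical name used only for this sketch; it is not a jevents.py function.

```python
def counter_list_to_bitmask(counter_list: str) -> int:
    """Fold a comma-separated list of counter ids into a bitmask.

    Bit N of the result is set when counter N may monitor the event.
    """
    mask = 0
    for token in counter_list.split(','):
        mask |= 1 << int(token.strip())
    return mask


def append_counterid_mask(event: str, counter_list: str) -> str:
    """Hypothetical helper: add a counterid_mask term to an event string.

    An empty counter list leaves the event untouched, which the driver
    would treat as "any available hpmcounter".
    """
    if not counter_list:
        return event
    mask = counter_list_to_bitmask(counter_list)
    return f"{event},counterid_mask={mask:#x}"


if __name__ == "__main__":
    # Counters 3, 4 and 5 set bits 3..5 of the mask.
    print(append_counterid_mask("event=0x10019", "3,4,5"))
```

With a term of this shape, a driver that exposes a counterid_mask format attribute mapped to config2 would presumably receive the mask in event->attr.config2.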
Signed-off-by: Atish Patra --- tools/perf/pmu-events/jevents.py | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py index fdb7ddf093d2..f9f274678a32 100755 --- a/tools/perf/pmu-events/jevents.py +++ b/tools/perf/pmu-events/jevents.py @@ -274,6 +274,11 @@ class JsonEvent: return fixed[name.lower()] return event + def counter_list_to_bitmask(counterlist): + counter_ids = list(map(int, counterlist.split(','))) + bitmask = sum(1 << pos for pos in counter_ids) + return bitmask + def unit_to_pmu(unit: str) -> Optional[str]: """Convert a JSON Unit to Linux PMU name.""" if not unit or unit == "core": @@ -427,6 +432,10 @@ class JsonEvent: else: raise argparse.ArgumentTypeError('Cannot find arch std event:', arch_std) + if self.counters['list']: + bitmask = counter_list_to_bitmask(self.counters['list']) + event += f',counterid_mask={bitmask:#x}' + self.event = real_event(self.name, event) def __repr__(self) -> str: -- 2.43.0 From atishp at rivosinc.com Thu Mar 27 12:36:02 2025 From: atishp at rivosinc.com (Atish Patra) Date: Thu, 27 Mar 2025 12:36:02 -0700 Subject: [PATCH v5 21/21] Sync empty-pmu-events.c with autogenerated one In-Reply-To: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> Message-ID: <20250327-counter_delegation-v5-21-1ee538468d1b@rivosinc.com> Signed-off-by: Atish Patra --- tools/perf/pmu-events/empty-pmu-events.c | 144 +++++++++++++++---------------- 1 file changed, 72 insertions(+), 72 deletions(-) diff --git a/tools/perf/pmu-events/empty-pmu-events.c b/tools/perf/pmu-events/empty-pmu-events.c index 3a7ec31576f5..22f0463dc522 100644 --- a/tools/perf/pmu-events/empty-pmu-events.c +++ b/tools/perf/pmu-events/empty-pmu-events.c @@ -36,42 +36,42 @@ static const char *const big_c_string = /* offset=1127 */ "bp_l1_btb_correct\000branch\000L1 BTB Correction\000event=0x8a\000\00000\000\000\000" /* 
offset=1187 */ "bp_l2_btb_correct\000branch\000L2 BTB Correction\000event=0x8b\000\00000\000\000\000" /* offset=1247 */ "l3_cache_rd\000cache\000L3 cache access, read\000event=0x40\000\00000\000Attributable Level 3 cache access, read\000\000" -/* offset=1343 */ "segment_reg_loads.any\000other\000Number of segment register loads\000event=6,period=200000,umask=0x80\000\00000\000\0000,1\000" -/* offset=1446 */ "dispatch_blocked.any\000other\000Memory cluster signals to block micro-op dispatch for any reason\000event=9,period=200000,umask=0x20\000\00000\000\0000,1\000" -/* offset=1580 */ "eist_trans\000other\000Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions\000event=0x3a,period=200000\000\00000\000\0000,1\000" -/* offset=1699 */ "hisi_sccl,ddrc\000" -/* offset=1714 */ "uncore_hisi_ddrc.flux_wcmd\000uncore\000DDRC write commands\000event=2\000\00000\000DDRC write commands\000\000" -/* offset=1801 */ "uncore_cbox\000" -/* offset=1813 */ "unc_cbo_xsnp_response.miss_eviction\000uncore\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000event=0x22,umask=0x81\000\00000\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\0000,1\000" -/* offset=2048 */ "event-hyphen\000uncore\000UNC_CBO_HYPHEN\000event=0xe0\000\00000\000UNC_CBO_HYPHEN\000\000" -/* offset=2114 */ "event-two-hyph\000uncore\000UNC_CBO_TWO_HYPH\000event=0xc0\000\00000\000UNC_CBO_TWO_HYPH\000\000" -/* offset=2186 */ "hisi_sccl,l3c\000" -/* offset=2200 */ "uncore_hisi_l3c.rd_hit_cpipe\000uncore\000Total read hits\000event=7\000\00000\000Total read hits\000\000" -/* offset=2281 */ "uncore_imc_free_running\000" -/* offset=2305 */ "uncore_imc_free_running.cache_miss\000uncore\000Total cache misses\000event=0x12\000\00000\000Total cache misses\000\000" -/* offset=2401 */ "uncore_imc\000" -/* offset=2412 */ "uncore_imc.cache_hits\000uncore\000Total cache hits\000event=0x34\000\00000\000Total cache hits\000\000" -/* offset=2491 */ 
"uncore_sys_ddr_pmu\000" -/* offset=2510 */ "sys_ddr_pmu.write_cycles\000uncore\000ddr write-cycles event\000event=0x2b\000v8\00000\000\000\000" -/* offset=2584 */ "uncore_sys_ccn_pmu\000" -/* offset=2603 */ "sys_ccn_pmu.read_cycles\000uncore\000ccn read-cycles event\000config=0x2c\0000x01\00000\000\000\000" -/* offset=2678 */ "uncore_sys_cmn_pmu\000" -/* offset=2697 */ "sys_cmn_pmu.hnf_cache_miss\000uncore\000Counts total cache misses in first lookup result (high priority)\000eventid=1,type=5\000(434|436|43c|43a).*\00000\000\000\000" -/* offset=2838 */ "CPI\000\0001 / IPC\000\000\000\000\000\000\000\00000" -/* offset=2860 */ "IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\00000" -/* offset=2923 */ "Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\00000" -/* offset=3089 */ "dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000" -/* offset=3153 */ "icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000" -/* offset=3220 */ "cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\00000" -/* offset=3291 */ "DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\00000" -/* offset=3385 */ "DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\00000" -/* offset=3519 */ "DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\00000" -/* offset=3583 */ "DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\00000" -/* offset=3651 */ "DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\00000" 
-/* offset=3721 */ "M1\000\000ipc + M2\000\000\000\000\000\000\000\00000" -/* offset=3743 */ "M2\000\000ipc + M1\000\000\000\000\000\000\000\00000" -/* offset=3765 */ "M3\000\0001 / M3\000\000\000\000\000\000\000\00000" -/* offset=3785 */ "L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\00000" +/* offset=1343 */ "segment_reg_loads.any\000other\000Number of segment register loads\000event=6,period=200000,umask=0x80,counterid_mask=0x3\000\00000\000\0000,1\000" +/* offset=1465 */ "dispatch_blocked.any\000other\000Memory cluster signals to block micro-op dispatch for any reason\000event=9,period=200000,umask=0x20,counterid_mask=0x3\000\00000\000\0000,1\000" +/* offset=1618 */ "eist_trans\000other\000Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions\000event=0x3a,period=200000,counterid_mask=0x3\000\00000\000\0000,1\000" +/* offset=1756 */ "hisi_sccl,ddrc\000" +/* offset=1771 */ "uncore_hisi_ddrc.flux_wcmd\000uncore\000DDRC write commands\000event=2\000\00000\000DDRC write commands\000\000" +/* offset=1858 */ "uncore_cbox\000" +/* offset=1870 */ "unc_cbo_xsnp_response.miss_eviction\000uncore\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000event=0x22,umask=0x81,counterid_mask=0x3\000\00000\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\0000,1\000" +/* offset=2124 */ "event-hyphen\000uncore\000UNC_CBO_HYPHEN\000event=0xe0\000\00000\000UNC_CBO_HYPHEN\000\000" +/* offset=2190 */ "event-two-hyph\000uncore\000UNC_CBO_TWO_HYPH\000event=0xc0\000\00000\000UNC_CBO_TWO_HYPH\000\000" +/* offset=2262 */ "hisi_sccl,l3c\000" +/* offset=2276 */ "uncore_hisi_l3c.rd_hit_cpipe\000uncore\000Total read hits\000event=7\000\00000\000Total read hits\000\000" +/* offset=2357 */ "uncore_imc_free_running\000" +/* offset=2381 */ "uncore_imc_free_running.cache_miss\000uncore\000Total cache misses\000event=0x12\000\00000\000Total cache misses\000\000" +/* 
offset=2477 */ "uncore_imc\000" +/* offset=2488 */ "uncore_imc.cache_hits\000uncore\000Total cache hits\000event=0x34\000\00000\000Total cache hits\000\000" +/* offset=2567 */ "uncore_sys_ddr_pmu\000" +/* offset=2586 */ "sys_ddr_pmu.write_cycles\000uncore\000ddr write-cycles event\000event=0x2b\000v8\00000\000\000\000" +/* offset=2660 */ "uncore_sys_ccn_pmu\000" +/* offset=2679 */ "sys_ccn_pmu.read_cycles\000uncore\000ccn read-cycles event\000config=0x2c\0000x01\00000\000\000\000" +/* offset=2754 */ "uncore_sys_cmn_pmu\000" +/* offset=2773 */ "sys_cmn_pmu.hnf_cache_miss\000uncore\000Counts total cache misses in first lookup result (high priority)\000eventid=1,type=5\000(434|436|43c|43a).*\00000\000\000\000" +/* offset=2914 */ "CPI\000\0001 / IPC\000\000\000\000\000\000\000\00000" +/* offset=2936 */ "IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\00000" +/* offset=2999 */ "Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\00000" +/* offset=3165 */ "dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000" +/* offset=3229 */ "icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000" +/* offset=3296 */ "cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\00000" +/* offset=3367 */ "DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\00000" +/* offset=3461 */ "DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\00000" +/* offset=3595 */ "DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\00000" +/* offset=3659 */ 
"DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\00000" +/* offset=3727 */ "DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\00000" +/* offset=3797 */ "M1\000\000ipc + M2\000\000\000\000\000\000\000\00000" +/* offset=3819 */ "M2\000\000ipc + M1\000\000\000\000\000\000\000\00000" +/* offset=3841 */ "M3\000\0001 / M3\000\000\000\000\000\000\000\00000" +/* offset=3861 */ "L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\00000" ; static const struct compact_pmu_event pmu_events__common_tool[] = { @@ -101,27 +101,27 @@ const struct pmu_table_entry pmu_events__common[] = { static const struct compact_pmu_event pmu_events__test_soc_cpu_default_core[] = { { 1127 }, /* bp_l1_btb_correct\000branch\000L1 BTB Correction\000event=0x8a\000\00000\000\000\000 */ { 1187 }, /* bp_l2_btb_correct\000branch\000L2 BTB Correction\000event=0x8b\000\00000\000\000\000 */ -{ 1446 }, /* dispatch_blocked.any\000other\000Memory cluster signals to block micro-op dispatch for any reason\000event=9,period=200000,umask=0x20\000\00000\000\0000,1\000 */ -{ 1580 }, /* eist_trans\000other\000Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions\000event=0x3a,period=200000\000\00000\000\0000,1\000 */ +{ 1465 }, /* dispatch_blocked.any\000other\000Memory cluster signals to block micro-op dispatch for any reason\000event=9,period=200000,umask=0x20,counterid_mask=0x3\000\00000\000\0000,1\000 */ +{ 1618 }, /* eist_trans\000other\000Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions\000event=0x3a,period=200000,counterid_mask=0x3\000\00000\000\0000,1\000 */ { 1247 }, /* l3_cache_rd\000cache\000L3 cache access, read\000event=0x40\000\00000\000Attributable Level 3 cache access, read\000\000 */ -{ 1343 }, /* segment_reg_loads.any\000other\000Number of segment register loads\000event=6,period=200000,umask=0x80\000\00000\000\0000,1\000 */ +{ 1343 
}, /* segment_reg_loads.any\000other\000Number of segment register loads\000event=6,period=200000,umask=0x80,counterid_mask=0x3\000\00000\000\0000,1\000 */ }; static const struct compact_pmu_event pmu_events__test_soc_cpu_hisi_sccl_ddrc[] = { -{ 1714 }, /* uncore_hisi_ddrc.flux_wcmd\000uncore\000DDRC write commands\000event=2\000\00000\000DDRC write commands\000\000 */ +{ 1771 }, /* uncore_hisi_ddrc.flux_wcmd\000uncore\000DDRC write commands\000event=2\000\00000\000DDRC write commands\000\000 */ }; static const struct compact_pmu_event pmu_events__test_soc_cpu_hisi_sccl_l3c[] = { -{ 2200 }, /* uncore_hisi_l3c.rd_hit_cpipe\000uncore\000Total read hits\000event=7\000\00000\000Total read hits\000\000 */ +{ 2276 }, /* uncore_hisi_l3c.rd_hit_cpipe\000uncore\000Total read hits\000event=7\000\00000\000Total read hits\000\000 */ }; static const struct compact_pmu_event pmu_events__test_soc_cpu_uncore_cbox[] = { -{ 2048 }, /* event-hyphen\000uncore\000UNC_CBO_HYPHEN\000event=0xe0\000\00000\000UNC_CBO_HYPHEN\000\000 */ -{ 2114 }, /* event-two-hyph\000uncore\000UNC_CBO_TWO_HYPH\000event=0xc0\000\00000\000UNC_CBO_TWO_HYPH\000\000 */ -{ 1813 }, /* unc_cbo_xsnp_response.miss_eviction\000uncore\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000event=0x22,umask=0x81\000\00000\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\0000,1\000 */ +{ 2124 }, /* event-hyphen\000uncore\000UNC_CBO_HYPHEN\000event=0xe0\000\00000\000UNC_CBO_HYPHEN\000\000 */ +{ 2190 }, /* event-two-hyph\000uncore\000UNC_CBO_TWO_HYPH\000event=0xc0\000\00000\000UNC_CBO_TWO_HYPH\000\000 */ +{ 1870 }, /* unc_cbo_xsnp_response.miss_eviction\000uncore\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000event=0x22,umask=0x81,counterid_mask=0x3\000\00000\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\0000,1\000 */ }; static const struct compact_pmu_event 
pmu_events__test_soc_cpu_uncore_imc[] = { -{ 2412 }, /* uncore_imc.cache_hits\000uncore\000Total cache hits\000event=0x34\000\00000\000Total cache hits\000\000 */ +{ 2488 }, /* uncore_imc.cache_hits\000uncore\000Total cache hits\000event=0x34\000\00000\000Total cache hits\000\000 */ }; static const struct compact_pmu_event pmu_events__test_soc_cpu_uncore_imc_free_running[] = { -{ 2305 }, /* uncore_imc_free_running.cache_miss\000uncore\000Total cache misses\000event=0x12\000\00000\000Total cache misses\000\000 */ +{ 2381 }, /* uncore_imc_free_running.cache_miss\000uncore\000Total cache misses\000event=0x12\000\00000\000Total cache misses\000\000 */ }; @@ -134,46 +134,46 @@ const struct pmu_table_entry pmu_events__test_soc_cpu[] = { { .entries = pmu_events__test_soc_cpu_hisi_sccl_ddrc, .num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_hisi_sccl_ddrc), - .pmu_name = { 1699 /* hisi_sccl,ddrc\000 */ }, + .pmu_name = { 1756 /* hisi_sccl,ddrc\000 */ }, }, { .entries = pmu_events__test_soc_cpu_hisi_sccl_l3c, .num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_hisi_sccl_l3c), - .pmu_name = { 2186 /* hisi_sccl,l3c\000 */ }, + .pmu_name = { 2262 /* hisi_sccl,l3c\000 */ }, }, { .entries = pmu_events__test_soc_cpu_uncore_cbox, .num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_uncore_cbox), - .pmu_name = { 1801 /* uncore_cbox\000 */ }, + .pmu_name = { 1858 /* uncore_cbox\000 */ }, }, { .entries = pmu_events__test_soc_cpu_uncore_imc, .num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_uncore_imc), - .pmu_name = { 2401 /* uncore_imc\000 */ }, + .pmu_name = { 2477 /* uncore_imc\000 */ }, }, { .entries = pmu_events__test_soc_cpu_uncore_imc_free_running, .num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_uncore_imc_free_running), - .pmu_name = { 2281 /* uncore_imc_free_running\000 */ }, + .pmu_name = { 2357 /* uncore_imc_free_running\000 */ }, }, }; static const struct compact_pmu_event pmu_metrics__test_soc_cpu_default_core[] = { -{ 2838 }, /* CPI\000\0001 / 
IPC\000\000\000\000\000\000\000\00000 */ -{ 3519 }, /* DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\00000 */ -{ 3291 }, /* DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\00000 */ -{ 3385 }, /* DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\00000 */ -{ 3583 }, /* DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\00000 */ -{ 3651 }, /* DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\00000 */ -{ 2923 }, /* Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\00000 */ -{ 2860 }, /* IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\00000 */ -{ 3785 }, /* L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\00000 */ -{ 3721 }, /* M1\000\000ipc + M2\000\000\000\000\000\000\000\00000 */ -{ 3743 }, /* M2\000\000ipc + M1\000\000\000\000\000\000\000\00000 */ -{ 3765 }, /* M3\000\0001 / M3\000\000\000\000\000\000\000\00000 */ -{ 3220 }, /* cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\00000 */ -{ 3089 }, /* dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000 */ -{ 3153 }, /* icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000 */ +{ 2914 }, /* CPI\000\0001 / IPC\000\000\000\000\000\000\000\00000 */ +{ 3595 }, /* DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\00000 */ +{ 3367 }, /* DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + 
l2_rqsts.rfo_hit\000\000\000\000\000\000\000\00000 */ +{ 3461 }, /* DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\00000 */ +{ 3659 }, /* DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\00000 */ +{ 3727 }, /* DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\00000 */ +{ 2999 }, /* Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\00000 */ +{ 2936 }, /* IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\00000 */ +{ 3861 }, /* L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\00000 */ +{ 3797 }, /* M1\000\000ipc + M2\000\000\000\000\000\000\000\00000 */ +{ 3819 }, /* M2\000\000ipc + M1\000\000\000\000\000\000\000\00000 */ +{ 3841 }, /* M3\000\0001 / M3\000\000\000\000\000\000\000\00000 */ +{ 3296 }, /* cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\00000 */ +{ 3165 }, /* dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000 */ +{ 3229 }, /* icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000 */ }; @@ -186,13 +186,13 @@ const struct pmu_table_entry pmu_metrics__test_soc_cpu[] = { }; static const struct compact_pmu_event pmu_events__test_soc_sys_uncore_sys_ccn_pmu[] = { -{ 2603 }, /* sys_ccn_pmu.read_cycles\000uncore\000ccn read-cycles event\000config=0x2c\0000x01\00000\000\000\000 */ +{ 2679 }, /* sys_ccn_pmu.read_cycles\000uncore\000ccn read-cycles event\000config=0x2c\0000x01\00000\000\000\000 */ }; static const struct compact_pmu_event pmu_events__test_soc_sys_uncore_sys_cmn_pmu[] = { -{ 2697 }, /* 
sys_cmn_pmu.hnf_cache_miss\000uncore\000Counts total cache misses in first lookup result (high priority)\000eventid=1,type=5\000(434|436|43c|43a).*\00000\000\000\000 */ +{ 2773 }, /* sys_cmn_pmu.hnf_cache_miss\000uncore\000Counts total cache misses in first lookup result (high priority)\000eventid=1,type=5\000(434|436|43c|43a).*\00000\000\000\000 */ }; static const struct compact_pmu_event pmu_events__test_soc_sys_uncore_sys_ddr_pmu[] = { -{ 2510 }, /* sys_ddr_pmu.write_cycles\000uncore\000ddr write-cycles event\000event=0x2b\000v8\00000\000\000\000 */ +{ 2586 }, /* sys_ddr_pmu.write_cycles\000uncore\000ddr write-cycles event\000event=0x2b\000v8\00000\000\000\000 */ }; @@ -200,17 +200,17 @@ const struct pmu_table_entry pmu_events__test_soc_sys[] = { { .entries = pmu_events__test_soc_sys_uncore_sys_ccn_pmu, .num_entries = ARRAY_SIZE(pmu_events__test_soc_sys_uncore_sys_ccn_pmu), - .pmu_name = { 2584 /* uncore_sys_ccn_pmu\000 */ }, + .pmu_name = { 2660 /* uncore_sys_ccn_pmu\000 */ }, }, { .entries = pmu_events__test_soc_sys_uncore_sys_cmn_pmu, .num_entries = ARRAY_SIZE(pmu_events__test_soc_sys_uncore_sys_cmn_pmu), - .pmu_name = { 2678 /* uncore_sys_cmn_pmu\000 */ }, + .pmu_name = { 2754 /* uncore_sys_cmn_pmu\000 */ }, }, { .entries = pmu_events__test_soc_sys_uncore_sys_ddr_pmu, .num_entries = ARRAY_SIZE(pmu_events__test_soc_sys_uncore_sys_ddr_pmu), - .pmu_name = { 2491 /* uncore_sys_ddr_pmu\000 */ }, + .pmu_name = { 2567 /* uncore_sys_ddr_pmu\000 */ }, }, }; -- 2.43.0 From david.laight.linux at gmail.com Thu Mar 27 14:06:25 2025 From: david.laight.linux at gmail.com (David Laight) Date: Thu, 27 Mar 2025 21:06:25 +0000 Subject: [RFC PATCH V3 00/43] rv64ilp32_abi: Build CONFIG_64BIT kernel-self with ILP32 ABI In-Reply-To: <20250325121624.523258-1-guoren@kernel.org> References: <20250325121624.523258-1-guoren@kernel.org> Message-ID: <20250327210625.7a3021d0@pumpkin> On Tue, 25 Mar 2025 08:15:41 -0400 guoren at kernel.org wrote: > From: "Guo Ren (Alibaba DAMO Academy)" > 
> Since 2001, the CONFIG_64BIT kernel has been built with the LP64 ABI, > but this patchset allows the CONFIG_64BIT kernel to use an ILP32 ABI > for construction to reduce cache & memory footprint (Compared to > kernel-lp64-abi, kernel-rv64ilp32-abi decreased the used memory by > about 20%, as shown in "free -h" in the following demo.) ... Why on earth would you want to run a 64bit application on a 32bit kernel? IIRC the main justification for 64bit was to get a larger address space. Now you might want to compile a 32bit (ILP32) system that actually runs in 64bit mode (cf. x32) so that 64bit maths (long long) is more efficient - but that is a different issue. (I suspect you'd need to change the process switch code to save all 64bits of the registers - but maybe not much else??) David From andrew.jones at linux.dev Mon Mar 31 06:54:40 2025 From: andrew.jones at linux.dev (Andrew Jones) Date: Mon, 31 Mar 2025 15:54:40 +0200 Subject: [kvm-unit-tests PATCH v3 4/5] configure: Add --qemu-cpu option In-Reply-To: References: <20250325160031.2390504-3-jean-philippe@linaro.org> <20250325160031.2390504-7-jean-philippe@linaro.org> Message-ID: <20250331-11789f26d058d11c1aeeaa51@orel> On Thu, Mar 27, 2025 at 05:14:39PM +0000, Alexandru Elisei wrote: > Hi Jean-Philippe, > > On Tue, Mar 25, 2025 at 04:00:32PM +0000, Jean-Philippe Brucker wrote: > > Add the --qemu-cpu option to let users set the CPU type to run on. > > At the moment --processor allows to set both GCC -mcpu flag and QEMU > > -cpu. On Arm we'd like to pass `-cpu max` to QEMU in order to enable all > > the TCG features by default, and it could also be nice to let users > > modify the CPU capabilities by setting extra -cpu options. Since GCC > > -mcpu doesn't accept "max" or "host", separate the compiler and QEMU > > arguments. > > > > `--processor` is now exclusively for compiler options, as indicated by > > its documentation ("processor to compile for"). So use $QEMU_CPU on > > RISC-V as well. 
> > > > Suggested-by: Andrew Jones > > Signed-off-by: Jean-Philippe Brucker > > --- > > scripts/mkstandalone.sh | 3 ++- > > arm/run | 15 +++++++++------ > > riscv/run | 8 ++++---- > > configure | 24 ++++++++++++++++++++++++ > > 4 files changed, 39 insertions(+), 11 deletions(-) > > > > diff --git a/scripts/mkstandalone.sh b/scripts/mkstandalone.sh > > index 2318a85f..9b4f983d 100755 > > --- a/scripts/mkstandalone.sh > > +++ b/scripts/mkstandalone.sh > > @@ -42,7 +42,8 @@ generate_test () > > > > config_export ARCH > > config_export ARCH_NAME > > - config_export PROCESSOR > > + config_export QEMU_CPU > > + config_export DEFAULT_QEMU_CPU > > > > echo "echo BUILD_HEAD=$(cat build-head)" > > > > diff --git a/arm/run b/arm/run > > index efdd44ce..4675398f 100755 > > --- a/arm/run > > +++ b/arm/run > > @@ -8,7 +8,7 @@ if [ -z "$KUT_STANDALONE" ]; then > > source config.mak > > source scripts/arch-run.bash > > fi > > -processor="$PROCESSOR" > > +qemu_cpu="$QEMU_CPU" > > > > if [ "$QEMU" ] && [ -z "$ACCEL" ] && > > [ "$HOST" = "aarch64" ] && [ "$ARCH" = "arm" ] && > > @@ -37,12 +37,15 @@ if [ "$ACCEL" = "kvm" ]; then > > fi > > fi > > > > -if [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ]; then > > - if [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ]; then > > - processor="host" > > +if [ -z "$qemu_cpu" ]; then > > + if ( [ "$ACCEL" = "kvm" ] || [ "$ACCEL" = "hvf" ] ) && > > + ( [ "$HOST" = "aarch64" ] || [ "$HOST" = "arm" ] ); then > > + qemu_cpu="host" > > if [ "$ARCH" = "arm" ] && [ "$HOST" = "aarch64" ]; then > > - processor+=",aarch64=off" > > + qemu_cpu+=",aarch64=off" > > fi > > + else > > + qemu_cpu="$DEFAULT_QEMU_CPU" > > fi > > fi > > > > @@ -71,7 +74,7 @@ if $qemu $M -device '?' 
| grep -q pci-testdev; then > > fi > > > > A="-accel $ACCEL$ACCEL_PROPS" > > -command="$qemu -nodefaults $M $A -cpu $processor $chr_testdev $pci_testdev" > > +command="$qemu -nodefaults $M $A -cpu $qemu_cpu $chr_testdev $pci_testdev" > > command+=" -display none -serial stdio" > > command="$(migration_cmd) $(timeout_cmd) $command" > > > > diff --git a/riscv/run b/riscv/run > > index e2f5a922..02fcf0c0 100755 > > --- a/riscv/run > > +++ b/riscv/run > > @@ -11,12 +11,12 @@ fi > > > > # Allow user overrides of some config.mak variables > > mach=$MACHINE_OVERRIDE > > -processor=$PROCESSOR_OVERRIDE > > +qemu_cpu=$QEMU_CPU_OVERRIDE > > firmware=$FIRMWARE_OVERRIDE > > > > -[ "$PROCESSOR" = "$ARCH" ] && PROCESSOR="max" > > +[ -z "$QEMU_CPU" ] && QEMU_CPU="max" > > If you make DEFAULT_QEMU_CPU=max for riscv, you can use it instead of "max", > just like arm/arm64 does. Not sure if it's worth another respin. Yup, we'll also need diff --git a/configure b/configure index b86ccc0c338b..541b9577a718 100755 --- a/configure +++ b/configure @@ -32,7 +32,7 @@ function get_default_qemu_cpu() "arm") echo "cortex-a15" ;; - "arm64") + "arm64" | "riscv32" | "riscv64") echo "max" ;; esac Thanks, drew > > Otherwise the patch looks good to me. 
> > Thanks, > Alex > > > : "${mach:=virt}" > > -: "${processor:=$PROCESSOR}" > > +: "${qemu_cpu:=$QEMU_CPU}" > > : "${firmware:=$FIRMWARE}" > > [ "$firmware" ] && firmware="-bios $firmware" > > > > @@ -32,7 +32,7 @@ fi > > mach="-machine $mach" > > > > command="$qemu -nodefaults -nographic -serial mon:stdio" > > -command+=" $mach $acc $firmware -cpu $processor " > > +command+=" $mach $acc $firmware -cpu $qemu_cpu " > > command="$(migration_cmd) $(timeout_cmd) $command" > > > > if [ "$UEFI_SHELL_RUN" = "y" ]; then > > diff --git a/configure b/configure > > index b4875ef3..b79145a5 100755 > > --- a/configure > > +++ b/configure > > @@ -23,6 +23,21 @@ function get_default_processor() > > esac > > } > > > > +# Return the default CPU type to run on > > +function get_default_qemu_cpu() > > +{ > > + local arch="$1" > > + > > + case "$arch" in > > + "arm") > > + echo "cortex-a15" > > + ;; > > + "arm64") > > + echo "cortex-a57" > > + ;; > > + esac > > +} > > + > > srcdir=$(cd "$(dirname "$0")"; pwd) > > prefix=/usr/local > > cc=gcc > > @@ -52,6 +67,7 @@ earlycon= > > console= > > efi= > > efi_direct= > > +qemu_cpu= > > > > # Enable -Werror by default for git repositories only (i.e. developer builds) > > if [ -e "$srcdir"/.git ]; then > > @@ -70,6 +86,9 @@ usage() { > > --arch=ARCH architecture to compile for ($arch). ARCH can be one of: > > arm, arm64, i386, ppc64, riscv32, riscv64, s390x, x86_64 > > --processor=PROCESSOR processor to compile for ($processor) > > + --qemu-cpu=CPU the CPU model to run on. If left unset, the run script > > + selects the best value based on the host system and the > > + test configuration. 
> > --target=TARGET target platform that the tests will be running on (qemu or > > kvmtool, default is qemu) (arm/arm64 only) > > --cross-prefix=PREFIX cross compiler prefix > > @@ -146,6 +165,9 @@ while [[ $optno -le $argc ]]; do > > --processor) > > processor="$arg" > > ;; > > + --qemu-cpu) > > + qemu_cpu="$arg" > > + ;; > > --target) > > target="$arg" > > ;; > > @@ -471,6 +493,8 @@ ARCH=$arch > > ARCH_NAME=$arch_name > > ARCH_LIBDIR=$arch_libdir > > PROCESSOR=$processor > > +QEMU_CPU=$qemu_cpu > > +DEFAULT_QEMU_CPU=$(get_default_qemu_cpu $arch) > > CC=$cc > > CFLAGS=$cflags > > LD=$cross_prefix$ld > > -- > > 2.49.0 > > > > -- > kvm-riscv mailing list > kvm-riscv at lists.infradead.org > http://lists.infradead.org/mailman/listinfo/kvm-riscv From conor at kernel.org Mon Mar 31 08:38:41 2025 From: conor at kernel.org (Conor Dooley) Date: Mon, 31 Mar 2025 16:38:41 +0100 Subject: [PATCH v5 10/21] dt-bindings: riscv: add Counter delegation ISA extensions description In-Reply-To: <20250327-counter_delegation-v5-10-1ee538468d1b@rivosinc.com> References: <20250327-counter_delegation-v5-0-1ee538468d1b@rivosinc.com> <20250327-counter_delegation-v5-10-1ee538468d1b@rivosinc.com> Message-ID: <20250331-cardigan-canary-cdec5038faf9@spud> On Thu, Mar 27, 2025 at 12:35:51PM -0700, Atish Patra wrote: > Add description for the Smcdeleg/Ssccfg extension. > > Signed-off-by: Atish Patra Acked-by: Conor Dooley