[PATCH v12 00/25] Linux RISC-V AIA Support

Anup Patel apatel at ventanamicro.com
Tue Jan 30 07:48:56 PST 2024


On Tue, Jan 30, 2024 at 8:18 PM Björn Töpel <bjorn at kernel.org> wrote:
>
> Björn Töpel <bjorn at kernel.org> writes:
>
> > Anup Patel <apatel at ventanamicro.com> writes:
> >
> >> On Tue, Jan 30, 2024 at 1:22 PM Björn Töpel <bjorn at kernel.org> wrote:
> >>>
> >>> Björn Töpel <bjorn at kernel.org> writes:
> >>>
> >>> > Anup Patel <apatel at ventanamicro.com> writes:
> >>> >
> >>> >> The RISC-V AIA specification is ratified as-per the RISC-V international
> >>> >> process. The latest ratified AIA specifcation can be found at:
> >>> >> https://github.com/riscv/riscv-aia/releases/download/1.0/riscv-interrupts-1.0.pdf
> >>> >>
> >>> >> At a high-level, the AIA specification adds three things:
> >>> >> 1) AIA CSRs
> >>> >>    - Improved local interrupt support
> >>> >> 2) Incoming Message Signaled Interrupt Controller (IMSIC)
> >>> >>    - Per-HART MSI controller
> >>> >>    - Support MSI virtualization
> >>> >>    - Support IPI along with virtualization
> >>> >> 3) Advanced Platform-Level Interrupt Controller (APLIC)
> >>> >>    - Wired interrupt controller
> >>> >>    - In MSI-mode, converts wired interrupt into MSIs (i.e. MSI generator)
> >>> >>    - In Direct-mode, injects external interrupts directly into HARTs
> >>> >>
> >>> >> For an overview of the AIA specification, refer the AIA virtualization
> >>> >> talk at KVM Forum 2022:
> >>> >> https://static.sched.com/hosted_files/kvmforum2022/a1/AIA_Virtualization_in_KVM_RISCV_final.pdf
> >>> >> https://www.youtube.com/watch?v=r071dL8Z0yo
> >>> >>
> >>> >> To test this series, use QEMU v7.2 (or higher) and OpenSBI v1.2 (or higher).
> >>> >>
> >>> >> These patches can also be found in the riscv_aia_v12 branch at:
> >>> >> https://github.com/avpatel/linux.git
> >>> >>
> >>> >> Changes since v11:
> >>> >>  - Rebased on Linux-6.8-rc1
> >>> >>  - Included kernel/irq related patches from "genirq, irqchip: Convert ARM
> >>> >>    MSI handling to per device MSI domains" series by Thomas.
> >>> >>    (PATCH7, PATCH8, PATCH9, PATCH14, PATCH16, PATCH17, PATCH18, PATCH19,
> >>> >>     PATCH20, PATCH21, PATCH22, PATCH23, and PATCH32 of
> >>> >>     https://lore.kernel.org/linux-arm-kernel/20221121135653.208611233@linutronix.de/)
> >>> >>  - Updated APLIC MSI-mode driver to use the new WIRED_TO_MSI mechanism.
> >>> >>  - Updated IMSIC driver to support per-device MSI domains for PCI and
> >>> >>    platform devices.
> >>> >
> >>> > Thanks for working on this, Anup! I'm still reviewing the patches.
> >>> >
> >>> > I'm hitting a boot hang in text patching, with this series applied on
> >>> > 6.8-rc2. IPI issues?
> >>>
> >>> Not text patching! One cpu spinning in smp_call_function_many_cond() and
> >>> the others are in cpu_relax(). Smells like IPI...
> >>
> >> I tried bootefi from U-Boot multiple times but can't reproduce the
> >> issue you are seeing.
> >
> > Thanks! I can reproduce without EFI, and simpler command-line:
> >
> > qemu-system-riscv64 \
> >   -bios /path/to/fw_dynamic.bin \
> >   -kernel /path/to/Image \
> >   -append 'earlycon console=tty0 console=ttyS0' \
> >   -machine virt,aia=aplic-imsic \
> >   -no-reboot -nodefaults -nographic \
> >   -smp 4 \
> >   -object rng-random,filename=/dev/urandom,id=rng0 \
> >   -device virtio-rng-device,rng=rng0 \
> >   -m 4G -chardev stdio,id=char0 -serial chardev:char0
> >
> > I can reproduce with your upstream riscv_aia_v12 plus the config in the
> > gist [1], and all latest QEMU/OpenSBI:
> >
> > QEMU: 11be70677c70 ("Merge tag 'pull-vfio-20240129' of https://github.com/legoater/qemu into staging")
> > OpenSBI: bb90a9ebf6d9 ("lib: sbi: Print number of debug triggers found")
> > Linux: d9b9d6eb987f ("MAINTAINERS: Add entry for RISC-V AIA drivers")
> >
> > Removing ",aia=aplic-imsic" from the CLI above completes the boot (i.e.
> > panicking about missing root mount ;-))
>
> More context; The hang is during a late initcall, where an ftrace direct
> (register_ftrace_direct()) modification is done.
>
> Stop machine is used to call into __ftrace_modify_call(). Then into the
> arch specific patch_text_nosync(), where flush_icache_range() hangs in
> flush_icache_all(). From "on_each_cpu(ipi_remote_fence_i, NULL, 1);" to
> on_each_cpu_cond_mask() "smp_call_function_many_cond(mask, func, info,
> scf_flags, cond_func);" which never returns from "csd_lock_wait(csd)"
> right before the end of the function.
>
> Any ideas? Disabling CONFIG_HID_BPF, that does the early ftrace code
> patching fixes the boot hang, but it does seem related to IPI...
>
Looks like flush_icache_all() does not use the IPIs (on_each_cpu()
and friends) correctly.

On other hand, the flush_icache_mm() does the right thing by
doing local flush on the current CPU and IPI based flush on other
CPUs.

Can you try the following patch ?

diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
index 55a34f2020a8..a3dfbe4de832 100644
--- a/arch/riscv/mm/cacheflush.c
+++ b/arch/riscv/mm/cacheflush.c
@@ -19,12 +19,18 @@ static void ipi_remote_fence_i(void *info)

 void flush_icache_all(void)
 {
+    cpumask_t others;
+
     local_flush_icache_all();

+    cpumask_andnot(&others, cpu_online_mask, cpumask_of(smp_processor_id()));
+    if (cpumask_empty(&others))
+        return;
+
     if (IS_ENABLED(CONFIG_RISCV_SBI) && !riscv_use_ipi_for_rfence())
-        sbi_remote_fence_i(NULL);
+        sbi_remote_fence_i(&others);
     else
-        on_each_cpu(ipi_remote_fence_i, NULL, 1);
+        on_each_cpu_mask(&others, ipi_remote_fence_i, NULL, 1);
 }
 EXPORT_SYMBOL(flush_icache_all);


Regards,
Anup



More information about the linux-riscv mailing list