[PATCH v14 3/8] genirq: Add mechanism to multiplex a single HW IPI

Thu Dec 1 10:00:20 PST 2022

On Thu, Dec 1, 2022 at 10:50 PM Thomas Gleixner <tglx at linutronix.de> wrote:
>
> On Thu, Dec 01 2022 at 18:31, Anup Patel wrote:
> > All RISC-V platforms have a single HW IPI provided by the INTC local
> > interrupt controller. The HW method to trigger INTC IPI can be through
> > external irqchip (e.g. RISC-V AIA), through platform specific device
> > (e.g. SiFive CLINT timer), or through firmware (e.g. SBI IPI call).
> >
> > To support multiple IPIs on RISC-V, we add a generic IPI multiplexing
>
> s/we//

Okay, I will update.

>
> > mechanism which help us create multiple virtual IPIs using a single
> > HW IPI. This generic IPI multiplexing is inspired from the Apple AIC
>
> s/from/by/

Okay, I will update.

>
> > irqchip driver and it is shared by various RISC-V irqchip drivers.
>
> Sure, but now we have two copies of this. One in the Apple AIC and one
> here. The obvious thing to do is:
>
>    1) Provide generic infrastructure
>
>    2) Convert AIC to use it

Marc Z already has a converted version of the AIC driver.
https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=irq/ipi-mux

>
>    3) Add RISCV users

PATCH4 of this series converts the two existing
RISC-V users (SBI IPI and CLINT).

We also have a RISC-V AIA series (posted recently) which
uses the IPI muxing added by this series.
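
For readers new to the scheme, the muxing idea can be modeled in user
space roughly as below. This is only a sketch using C11 atomics with
default (seq_cst) ordering, not the kernel code from this patch: NR_CPUS,
struct mux_cpu, parent_send() and the mux_*() helpers are hypothetical
names made up for the model, and parent_send() just counts how often the
single HW IPI would have been raised.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS   4	/* hypothetical, model only */
#define NR_VIPIS  32	/* one bit per virtual IPI in a 32-bit mask */

struct mux_cpu {
	atomic_uint enable;		/* virtual IPIs unmasked on this CPU */
	atomic_uint bits;		/* virtual IPIs pending on this CPU */
	atomic_uint parent_ipis;	/* model only: HW IPIs raised so far */
};

static struct mux_cpu mux[NR_CPUS];

/* Stand-in for triggering the single HW IPI on the target CPU. */
static void parent_send(int cpu)
{
	atomic_fetch_add(&mux[cpu].parent_ipis, 1);
}

static void mux_mask(int cpu, unsigned int vipi)
{
	assert(vipi < NR_VIPIS);
	atomic_fetch_and(&mux[cpu].enable, ~(1u << vipi));
}

static void mux_unmask(int cpu, unsigned int vipi)
{
	unsigned int ibit;

	assert(vipi < NR_VIPIS);
	ibit = 1u << vipi;

	/* Publish the enable bit first, then look for a pending vIPI. */
	atomic_fetch_or(&mux[cpu].enable, ibit);
	if (atomic_load(&mux[cpu].bits) & ibit)
		parent_send(cpu);
}

static void mux_send(int cpu, unsigned int vipi)
{
	unsigned int ibit, pending;

	assert(vipi < NR_VIPIS);
	ibit = 1u << vipi;

	/* Mark the vIPI pending; raise the HW IPI only on a 0 -> 1 edge
	 * and only if the target currently has the vIPI enabled. */
	pending = atomic_fetch_or(&mux[cpu].bits, ibit);
	if (!(pending & ibit) && (atomic_load(&mux[cpu].enable) & ibit))
		parent_send(cpu);
}

/* Returns the mask of enabled virtual IPIs that were pending and are
 * now cleared; the caller would dispatch one handler per set bit. */
static unsigned int mux_process(int cpu)
{
	unsigned int en = atomic_load(&mux[cpu].enable);

	return atomic_fetch_and(&mux[cpu].bits, ~en) & en;
}
```

In this model a second mux_send() for an already pending vIPI does not
raise another HW IPI, and a vIPI sent while masked stays in "bits" until
mux_unmask() notices it and raises the HW IPI itself.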

>
> No?
>
> > +static void ipi_mux_mask(struct irq_data *d)
> > +{
> > +     struct ipi_mux_cpu *icpu = this_cpu_ptr(ipi_mux_pcpu);
> > +
> > +     atomic_andnot(BIT(irqd_to_hwirq(d)), &icpu->enable);
> > +}
> > +
> > +static void ipi_mux_unmask(struct irq_data *d)
> > +{
> > +     u32 ibit = BIT(irqd_to_hwirq(d));
> > +     struct ipi_mux_cpu *icpu = this_cpu_ptr(ipi_mux_pcpu);
>
> The AIC code got the variable ordering correct ...
>
> https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#variable-declarations

Okay, I will update.

>
> > +     atomic_or(ibit, &icpu->enable);
> > +
> > +     /*
> > +      * The atomic_or() above must complete before the atomic_read()
> > +      * below to avoid racing ipi_mux_send_mask().
> > +      */
> > +     smp_mb__after_atomic();
> > +
> > +     /* If a pending IPI was unmasked, raise a parent IPI immediately. */
> > +     if (atomic_read(&icpu->bits) & ibit)
> > +             ipi_mux_send(smp_processor_id());
> > +}
> > +
> > +static void ipi_mux_send_mask(struct irq_data *d, const struct cpumask *mask)
> > +{
> > +     u32 ibit = BIT(irqd_to_hwirq(d));
> > +     struct ipi_mux_cpu *icpu = this_cpu_ptr(ipi_mux_pcpu);
> > +     unsigned long pending;
> > +     int cpu;
> > +
> > +     for_each_cpu(cpu, mask) {
> > +             icpu = per_cpu_ptr(ipi_mux_pcpu, cpu);
> > +             pending = atomic_fetch_or_release(ibit, &icpu->bits);
> > +
> > +             /*
> > +              * The atomic_fetch_or_release() above must complete
> > +              * before the atomic_read() below to avoid racing with
> > +              * ipi_mux_unmask().
> > +              */
> > +             smp_mb__after_atomic();
> > +
> > +             /*
> > +              * The flag writes must complete before the physical IPI is
> > +              * issued to another CPU. This is implied by the control
> > +              * dependency on the result of atomic_read() below, which is
> > +              * itself already ordered after the vIPI flag write.
> > +              */
> > +             if (!(pending & ibit) && (atomic_read(&icpu->enable) & ibit))
> > +                     ipi_mux_send(cpu);
> > +     }
> > +}
> > +
> > +static const struct irq_chip ipi_mux_chip = {
> > +     .name           = "IPI Mux",
> > +     .irq_mask       = ipi_mux_mask,
> > +     .irq_unmask     = ipi_mux_unmask,
> > +     .ipi_send_mask  = ipi_mux_send_mask,
> > +};
> > +
> > +static int ipi_mux_domain_alloc(struct irq_domain *d, unsigned int virq,
> > +                             unsigned int nr_irqs, void *arg)
> > +{
> > +     int i;
> > +
> > +     for (i = 0; i < nr_irqs; i++) {
> > +             irq_set_percpu_devid(virq + i);
> > +             irq_domain_set_info(d, virq + i, i, &ipi_mux_chip, NULL,
> > +                                 handle_percpu_devid_irq, NULL, NULL);
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +static const struct irq_domain_ops ipi_mux_domain_ops = {
> > +     .alloc          = ipi_mux_domain_alloc,
> > +     .free           = irq_domain_free_irqs_top,
> > +};
> > +
> > +/**
> > + * ipi_mux_process - Process multiplexed virtual IPIs
> > + */
> > +void ipi_mux_process(void)
> > +{
> > +     struct ipi_mux_cpu *icpu = this_cpu_ptr(ipi_mux_pcpu);
> > +     irq_hw_number_t hwirq;
> > +     unsigned long ipis;
> > +     unsigned int en;
> > +
> > +     /*
> > +      * Reading enable mask does not need to be ordered as long as
> > +      * this function called from interrupt handler because only
> > +      * the CPU itself can change it's own enable mask.
> > +      */
> > +     en = atomic_read(&icpu->enable);
> > +
> > +     /*
> > +      * Clear the IPIs we are about to handle. This pairs with the
> > +      * atomic_fetch_or_release() in ipi_mux_send_mask().
>
> The comments in the AIC code where you copied from are definitely
> better...
>
> Thanks,
>
>         tglx

Regards,
Anup