[PATCH 1/4] irqchip/riscv-imsic: Handle non-atomic MSI updates for device
Anup Patel
apatel at ventanamicro.com
Mon Dec 9 09:09:16 PST 2024
On Mon, Dec 9, 2024 at 9:23 PM Thomas Gleixner <tglx at linutronix.de> wrote:
>
> Anup!
>
> On Mon, Dec 09 2024 at 17:38, Anup Patel wrote:
> > On Mon, Dec 9, 2024 at 1:44 AM Thomas Gleixner <tglx at linutronix.de> wrote:
> >> There is no guarantee that set_affinity() runs on the original target
> >> CPU (CPU 1). Your scheme only works, when CPU1 vector 0x20 is not used
> >> by some other device. If it's used, you lost as CPU1 will consume the
> >> vector and your pending check is not seeing anything.
> >>
> >> x86 ensures CPU locality by deferring the affinity move to the next
> >> device interrupt on the original target CPU (CPU1 in the above
> >> example). See CONFIG_GENERIC_IRQ_PENDING.
> >
> > I agree with you.
> >
> > The IMSIC driver must do the affinity move upon the next device
> > interrupt on the old CPU. I will update this patch in the next revision.
> >
> > BTW, I did not find CONFIG_GENERIC_IRQ_PENDING. Is the
> > name correct ?
>
> CONFIG_GENERIC_PENDING_IRQ is close enough :)
>
> >> The interrupt domains which are not affected (remap) set the
> >> IRQ_MOVE_PCNTXT flag to avoid that dance and don't use that affinity
> >> setter code path at all.
> >
> > Yes, setting the IRQ_MOVE_PCNTXT flag in the remap domain
> > makes perfect sense.
> >
> > I suggest adding IRQ_MOVE_PCNTXT usage as part of Drew's
> > irqbypass series which adds a remap domain in the IOMMU
> > driver. Unless you insist on having it as part of this series ?
>
> You need to look at the other RISC-V controllers. Those which do not
> need this should set it. That's historically backwards.
I will update the RISC-V APLIC MSI-mode driver in the next revision.
This driver is a good candidate to use IRQ_MOVE_PCNTXT and
IRQCHIP_MOVE_DEFERRED.
>
> I think we can reverse the logic here. As this needs backporting, I
> can't make a full cleanup of this, but for your problem the patch below
> should just work.
>
> Select GENERIC_PENDING_IRQ and GENERIC_PENDING_IRQ_CHIPFLAGS and set the
> IRQCHIP_MOVE_DEFERRED flag on your interrrupt chip and the core logic
> takes care of the PCNTXT bits.
Sure, I will update.
Thanks,
Anup
>
> I'll convert x86 in a seperate step and remove the PCNTXT leftovers and
> the new config knob once the dust has settled.
>
> Thanks,
>
> tglx
> ---
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -567,6 +567,7 @@ struct irq_chip {
> * in the suspend path if they are in disabled state
> * IRQCHIP_AFFINITY_PRE_STARTUP: Default affinity update before startup
> * IRQCHIP_IMMUTABLE: Don't ever change anything in this chip
> + * IRQCHIP_MOVE_DEFERRED: Move the interrupt in actual interrupt context
> */
> enum {
> IRQCHIP_SET_TYPE_MASKED = (1 << 0),
> @@ -581,6 +582,7 @@ enum {
> IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND = (1 << 9),
> IRQCHIP_AFFINITY_PRE_STARTUP = (1 << 10),
> IRQCHIP_IMMUTABLE = (1 << 11),
> + IRQCHIP_MOVE_DEFERRED = (1 << 12),
> };
>
> #include <linux/irqdesc.h>
> --- a/kernel/irq/Kconfig
> +++ b/kernel/irq/Kconfig
> @@ -31,6 +31,10 @@ config GENERIC_IRQ_EFFECTIVE_AFF_MASK
> config GENERIC_PENDING_IRQ
> bool
>
> +# Deduce delayed migration from top-level interrupt chip flags
> +config GENERIC_PENDING_IRQ_CHIPFLAGS
> + bool
> +
> # Support for generic irq migrating off cpu before the cpu is offline.
> config GENERIC_IRQ_MIGRATION
> bool
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -47,6 +47,13 @@ int irq_set_chip(unsigned int irq, const
> return -EINVAL;
>
> desc->irq_data.chip = (struct irq_chip *)(chip ?: &no_irq_chip);
> +
> + if (IS_ENABLED(CONFIG_GENERIC_PENDING_IRQ_CHIPFLAGS) && chip) {
> + if (chip->flags & IRQCHIP_MOVE_DEFERRED)
> + irqd_clear(&desc->irq_data, IRQD_MOVE_PCNTXT);
> + else
> + irqd_set(&desc->irq_data, IRQD_MOVE_PCNTXT);
> + }
> irq_put_desc_unlock(desc, flags);
> /*
> * For !CONFIG_SPARSE_IRQ make the irq show up in
> @@ -1114,16 +1121,21 @@ void irq_modify_status(unsigned int irq,
> trigger = irqd_get_trigger_type(&desc->irq_data);
>
> irqd_clear(&desc->irq_data, IRQD_NO_BALANCING | IRQD_PER_CPU |
> - IRQD_TRIGGER_MASK | IRQD_LEVEL | IRQD_MOVE_PCNTXT);
> + IRQD_TRIGGER_MASK | IRQD_LEVEL);
> if (irq_settings_has_no_balance_set(desc))
> irqd_set(&desc->irq_data, IRQD_NO_BALANCING);
> if (irq_settings_is_per_cpu(desc))
> irqd_set(&desc->irq_data, IRQD_PER_CPU);
> - if (irq_settings_can_move_pcntxt(desc))
> - irqd_set(&desc->irq_data, IRQD_MOVE_PCNTXT);
> if (irq_settings_is_level(desc))
> irqd_set(&desc->irq_data, IRQD_LEVEL);
>
> + /* Keep this around until x86 is converted over */
> + if (!IS_ENABLED(CONFIG_GENERIC_PENDING_IRQ_CHIPFLAGS)) {
> + irqd_clear(&desc->irq_data, IRQD_MOVE_PCNTXT);
> + if (irq_settings_can_move_pcntxt(desc))
> + irqd_set(&desc->irq_data, IRQD_MOVE_PCNTXT);
> + }
> +
> tmp = irq_settings_get_trigger_mask(desc);
> if (tmp != IRQ_TYPE_NONE)
> trigger = tmp;
>
>
>
More information about the linux-riscv
mailing list