[PATCH 1/4] irqchip/riscv-imsic: Handle non-atomic MSI updates for device

Thomas Gleixner tglx at linutronix.de
Mon Dec 9 07:53:29 PST 2024


Anup!

On Mon, Dec 09 2024 at 17:38, Anup Patel wrote:
> On Mon, Dec 9, 2024 at 1:44 AM Thomas Gleixner <tglx at linutronix.de> wrote:
>> There is no guarantee that set_affinity() runs on the original target
>> CPU (CPU 1). Your scheme only works, when CPU1 vector 0x20 is not used
>> by some other device. If it's used, you lost as CPU1 will consume the
>> vector and your pending check is not seeing anything.
>>
>> x86 ensures CPU locality by deferring the affinity move to the next
>> device interrupt on the original target CPU (CPU1 in the above
>> example). See CONFIG_GENERIC_IRQ_PENDING.
>
> I agree with you.
>
> The IMSIC driver must do the affinity move upon the next device
> interrupt on the old CPU. I will update this patch in the next revision.
>
> BTW, I did not find CONFIG_GENERIC_IRQ_PENDING. Is the
> name correct ?

CONFIG_GENERIC_PENDING_IRQ is close enough :)

>> The interrupt domains which are not affected (remap) set the
>> IRQ_MOVE_PCNTXT flag to avoid that dance and don't use that affinity
>> setter code path at all.
>
> Yes, setting the IRQ_MOVE_PCNTXT flag in the remap domain
> makes perfect sense.
>
> I suggest adding IRQ_MOVE_PCNTXT usage as part of Drew's
> irqbypass series which adds a remap domain in the IOMMU
> driver. Unless you insist on having it as part of this series ?

You need to look at the other RISC-V controllers. Those which do not
need this should set it. That's historically backwards.

I think we can reverse the logic here. As this needs backporting, I
can't make a full cleanup of this, but for your problem the patch below
should just work.

Select GENERIC_PENDING_IRQ and GENERIC_PENDING_IRQ_CHIPFLAGS and set the
IRQCHIP_MOVE_DEFERRED flag on your interrrupt chip and the core logic
takes care of the PCNTXT bits.

I'll convert x86 in a seperate step and remove the PCNTXT leftovers and
the new config knob once the dust has settled.

Thanks,

        tglx
---
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -567,6 +567,7 @@ struct irq_chip {
  *                                    in the suspend path if they are in disabled state
  * IRQCHIP_AFFINITY_PRE_STARTUP:      Default affinity update before startup
  * IRQCHIP_IMMUTABLE:		      Don't ever change anything in this chip
+ * IRQCHIP_MOVE_DEFERRED:	      Move the interrupt in actual interrupt context
  */
 enum {
 	IRQCHIP_SET_TYPE_MASKED			= (1 <<  0),
@@ -581,6 +582,7 @@ enum {
 	IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND	= (1 <<  9),
 	IRQCHIP_AFFINITY_PRE_STARTUP		= (1 << 10),
 	IRQCHIP_IMMUTABLE			= (1 << 11),
+	IRQCHIP_MOVE_DEFERRED			= (1 << 12),
 };
 
 #include <linux/irqdesc.h>
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -31,6 +31,10 @@ config GENERIC_IRQ_EFFECTIVE_AFF_MASK
 config GENERIC_PENDING_IRQ
 	bool
 
+# Deduce delayed migration from top-level interrupt chip flags
+config GENERIC_PENDING_IRQ_CHIPFLAGS
+	bool
+
 # Support for generic irq migrating off cpu before the cpu is offline.
 config GENERIC_IRQ_MIGRATION
 	bool
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -47,6 +47,13 @@ int irq_set_chip(unsigned int irq, const
 		return -EINVAL;
 
 	desc->irq_data.chip = (struct irq_chip *)(chip ?: &no_irq_chip);
+
+	if (IS_ENABLED(CONFIG_GENERIC_PENDING_IRQ_CHIPFLAGS) && chip) {
+		if (chip->flags & IRQCHIP_MOVE_DEFERRED)
+			irqd_clear(&desc->irq_data, IRQD_MOVE_PCNTXT);
+		else
+			irqd_set(&desc->irq_data, IRQD_MOVE_PCNTXT);
+	}
 	irq_put_desc_unlock(desc, flags);
 	/*
 	 * For !CONFIG_SPARSE_IRQ make the irq show up in
@@ -1114,16 +1121,21 @@ void irq_modify_status(unsigned int irq,
 	trigger = irqd_get_trigger_type(&desc->irq_data);
 
 	irqd_clear(&desc->irq_data, IRQD_NO_BALANCING | IRQD_PER_CPU |
-		   IRQD_TRIGGER_MASK | IRQD_LEVEL | IRQD_MOVE_PCNTXT);
+		   IRQD_TRIGGER_MASK | IRQD_LEVEL);
 	if (irq_settings_has_no_balance_set(desc))
 		irqd_set(&desc->irq_data, IRQD_NO_BALANCING);
 	if (irq_settings_is_per_cpu(desc))
 		irqd_set(&desc->irq_data, IRQD_PER_CPU);
-	if (irq_settings_can_move_pcntxt(desc))
-		irqd_set(&desc->irq_data, IRQD_MOVE_PCNTXT);
 	if (irq_settings_is_level(desc))
 		irqd_set(&desc->irq_data, IRQD_LEVEL);
 
+	/* Keep this around until x86 is converted over */
+	if (!IS_ENABLED(CONFIG_GENERIC_PENDING_IRQ_CHIPFLAGS)) {
+		irqd_clear(&desc->irq_data, IRQD_MOVE_PCNTXT);
+		if (irq_settings_can_move_pcntxt(desc))
+			irqd_set(&desc->irq_data, IRQD_MOVE_PCNTXT);
+	}
+
 	tmp = irq_settings_get_trigger_mask(desc);
 	if (tmp != IRQ_TYPE_NONE)
 		trigger = tmp;






More information about the linux-riscv mailing list