[RFC PATCH 1/3] irqchip/gic-v3: Make use of ICC_SGI1R IRM bit

Jingyi Wang wangjingyi11 at huawei.com
Mon Mar 29 11:38:04 BST 2021



On 3/29/2021 5:55 PM, Marc Zyngier wrote:
> On Mon, 29 Mar 2021 09:52:08 +0100,
> Jingyi Wang <wangjingyi11 at huawei.com> wrote:
>>
>> IRM, bit[40] in ICC_SGI1R, determines how the generated SGIs
>> are distributed to PEs. If the bit is set, interrupts are routed
>> to all PEs in the system excluding "self". We use cpumask to
>> determine if this bit should be set and make use of that.
>>
>> This will reduce vm trap when broadcast IPIs are sent.
> 
> I remember writing similar code about 4 years ago, only to realise
> what:
> 
> - the cost of computing the resulting mask is pretty high for large
> machines
> - Linux almost never sends broadcast IPIs, so the complexity was all
> in vain
> 
> What changed? Please provide supporting data showing how many IPIs we
> actually save, and for which workload.
Maybe we can implement send_IPI_allbutself hooks as other some other 
archs instead of computing cpumask here?

>>
>> Signed-off-by: Jingyi Wang <wangjingyi11 at huawei.com>
>> ---
>>   drivers/irqchip/irq-gic-v3.c | 12 ++++++++++++
>>   1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
>> index eb0ee356a629..8ecc1b274ea8 100644
>> --- a/drivers/irqchip/irq-gic-v3.c
>> +++ b/drivers/irqchip/irq-gic-v3.c
>> @@ -1127,6 +1127,7 @@ static void gic_send_sgi(u64 cluster_id, u16 tlist, unsigned int irq)
>>   static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
>>   {
>>   	int cpu;
>> +	cpumask_t tmp;
>>   
>>   	if (WARN_ON(d->hwirq >= 16))
>>   		return;
>> @@ -1137,6 +1138,17 @@ static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
>>   	 */
>>   	wmb();
>>   
>> +	if (!cpumask_and(&tmp, mask, cpumask_of(smp_processor_id()))) {
> 
> Are you sure this does the right thing? This is checking that the
> current CPU is not part of the mask. But it not checking that the mask
> is actually "all but self".
> 
> This means you are potentially sending IPIs to CPUs that are not part
> of the mask, making performance potentially worse.
> 
> Thanks,
> 
> 	M.
> 
I will fix that,thanks.

>> +		/* Set Interrupt Routing Mode bit */
>> +		u64 val;
>> +		val = (d->hwirq) << ICC_SGI1R_SGI_ID_SHIFT;
>> +		val |= BIT_ULL(ICC_SGI1R_IRQ_ROUTING_MODE_BIT);
>> +		gic_write_sgi1r(val);
>> +
>> +		isb();
>> +		return;
>> +	}
>> +
>>   	for_each_cpu(cpu, mask) {
>>   		u64 cluster_id = MPIDR_TO_SGI_CLUSTER_ID(cpu_logical_map(cpu));
>>   		u16 tlist;
>> -- 
>> 2.19.1
>>
>>
> 



More information about the linux-arm-kernel mailing list