[PATCH 2/6] irqchip/apple-aic: Add Fast IPI support

Hector Martin marcan at marcan.st
Fri Dec 17 21:31:28 PST 2021


On 12/12/2021 21.21, Marc Zyngier wrote:
>> +/* MPIDR fields */
>> +#define MPIDR_CPU			GENMASK(7, 0)
>> +#define MPIDR_CLUSTER			GENMASK(15, 8)
> 
> This should be defined in terms of MPIDR_AFFINITY_LEVEL() and co.

Yeah, I found out about that macro from your PMU driver... :)
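
For illustration, here is a rough userspace sketch of extracting the fields by affinity level. The helper names below are made up for this example and only mimic the arm64 MPIDR_AFFINITY_LEVEL() helper; they are not the kernel definitions. It assumes Aff0 is the core index and Aff1 the cluster, as in the masks quoted above:

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins modeled on MPIDR_AFFINITY_LEVEL(); not kernel code. */
#define AFF_LEVEL_MASK	0xffu

/* Aff0 sits at bit 0, Aff1 at bit 8, Aff2 at bit 16, Aff3 at bit 32. */
static inline unsigned int aff_level_shift(unsigned int level)
{
	assert(level <= 3);
	return ((1u << level) >> 1) << 3;
}

static inline unsigned int mpidr_affinity_level(uint64_t mpidr, unsigned int level)
{
	return (unsigned int)(mpidr >> aff_level_shift(level)) & AFF_LEVEL_MASK;
}

/* Assumption for this sketch: Aff0 = core index, Aff1 = cluster. */
static inline unsigned int mpidr_cpu(uint64_t mpidr)
{
	return mpidr_affinity_level(mpidr, 0);
}

static inline unsigned int mpidr_cluster(uint64_t mpidr)
{
	return mpidr_affinity_level(mpidr, 1);
}
```

With this, mpidr_cpu() and mpidr_cluster() would replace the two FIELD_GET() uses on the open-coded masks.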

>> +static const struct aic_info aic1_fipi_info = {
>> +	.version	= 1,
>> +
>> +	.fast_ipi	= true,
> 
> Do you anticipate multiple feature flags like this? If so, maybe we
> should consider biting the bullet and making this an unsigned long
> populated with discrete flags.
> 
> Not something we need to decide now though.

Probably not, but who knows! It's easy to change it later, though.

>>  	if (read_sysreg_s(SYS_IMP_APL_IPI_SR_EL1) & IPI_SR_PENDING) {
>> -		pr_err_ratelimited("Fast IPI fired. Acking.\n");
>> -		write_sysreg_s(IPI_SR_PENDING, SYS_IMP_APL_IPI_SR_EL1);
>> +		if (aic_irqc->info.fast_ipi) {
> 
> On the other hand, this is likely to hit on the fast path. Given that
> we know at probe time whether we support SR-based IPIs, we can turn
> this into a static key and save a few fetches on every IPI. It applies
> everywhere you look at this flag at runtime.

Good point, I'll see about refactoring this to use static keys.

>> +static void aic_ipi_send_fast(int cpu)
>> +{
>> +	u64 mpidr = cpu_logical_map(cpu);
>> +	u64 my_mpidr = cpu_logical_map(smp_processor_id());
> 
> This is the equivalent of reading MPIDR_EL1. My gut feeling is that it
> is a bit faster to access the sysreg than a percpu lookup, a function
> call and another memory access.

Yeah, I saw other IRQ drivers doing this, but I wasn't sure it made
sense over just reading MPIDR_EL1... I'll switch to that.

>> +	u64 idx = FIELD_GET(MPIDR_CPU, mpidr);
>> +
>> +	if (FIELD_GET(MPIDR_CLUSTER, my_mpidr) == cluster)
>> +		write_sysreg_s(FIELD_PREP(IPI_RR_CPU, idx),
>> +			       SYS_IMP_APL_IPI_RR_LOCAL_EL1);
>> +	else
>> +		write_sysreg_s(FIELD_PREP(IPI_RR_CPU, idx) | FIELD_PREP(IPI_RR_CLUSTER, cluster),
>> +			       SYS_IMP_APL_IPI_RR_GLOBAL_EL1);
> 
> Don't you need an ISB, either here or in the two callers? At the
> moment, I don't see what will force the execution of these writes, and
> they could be arbitrarily delayed.

Is there any timeliness requirement for sending IPIs? They're going to
another CPU after all, so they could be arbitrarily delayed anyway if it
has FIQs masked.
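
For reference, the target selection in the hunk above can be modeled in plain C roughly as follows. This is only a sketch: the struct, the function name and the request-field bit positions are assumptions made for the example, and it returns the value that would be written instead of touching any system register:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed field positions, for illustration only. */
#define RR_CPU_SHIFT		0
#define RR_CLUSTER_SHIFT	16

struct ipi_request {
	bool     global;	/* true: would be written to the GLOBAL register */
	uint64_t value;		/* value that would be written */
};

/* Pick the local or global request form based on the cluster of each MPIDR. */
static struct ipi_request ipi_encode(uint64_t my_mpidr, uint64_t target_mpidr)
{
	uint64_t idx = target_mpidr & 0xff;
	uint64_t cluster = (target_mpidr >> 8) & 0xff;
	uint64_t my_cluster = (my_mpidr >> 8) & 0xff;
	struct ipi_request req;

	if (my_cluster == cluster) {
		/* Same cluster: only the core index is needed. */
		req.global = false;
		req.value = idx << RR_CPU_SHIFT;
	} else {
		/* Different cluster: encode both core index and cluster. */
		req.global = true;
		req.value = (idx << RR_CPU_SHIFT) | (cluster << RR_CLUSTER_SHIFT);
	}
	return req;
}
```

Whether an ISB is needed after the actual sysreg write is a separate question that this model does not capture.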

>> -	if (atomic_read(this_cpu_ptr(&aic_vipi_flag)) & irq_bit)
>> -		aic_ic_write(ic, AIC_IPI_SEND, AIC_IPI_SEND_CPU(smp_processor_id()));
>> +	if (atomic_read(this_cpu_ptr(&aic_vipi_flag)) & irq_bit) {
>> +		if (ic->info.fast_ipi)
>> +			aic_ipi_send_fast(smp_processor_id());
> 
> nit: if this is common enough, maybe having an aic_ipi_send_self_fast
> could be better. Needs evaluation though.

I'll add some debug prints to see how common self-IPIs are under typical
workloads. If they're common enough, it's easy to add.

>> +	irqc->info = *(struct aic_info *)match->data;
> 
> Why the copy? All the data is const, and isn't going away.

... for now, but later patches then start computing register offsets and
putting them into this structure :)
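
Roughly this pattern, sketched in plain C. All names and the offset field here are hypothetical and only illustrate why a writable per-instance copy of the const match data is useful:

```c
#include <stdint.h>

struct aic_info_sketch {
	int      version;
	uint32_t event_off;	/* hypothetical field, computed at probe time */
};

/* Const template, standing in for the per-compatible match data. */
static const struct aic_info_sketch template_v1 = {
	.version = 1,
};

struct irqc_sketch {
	struct aic_info_sketch info;	/* private, writable copy */
};

static void probe_sketch(struct irqc_sketch *irqc,
			 const struct aic_info_sketch *match,
			 uint32_t base_off)
{
	/* Struct copy: the const template itself is never modified. */
	irqc->info = *match;
	/* Later patches could fill in computed values like this one. */
	irqc->info.event_off = base_off + 0x4;
}
```

The template stays const and shared, while each controller instance gets its own copy to fill in.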

-- 
Hector Martin (marcan at marcan.st)
Public Key: https://mrcn.st/pub


