[PATCH 2/6] irqchip/apple-aic: Add Fast IPI support
Hector Martin
marcan at marcan.st
Fri Dec 17 21:31:28 PST 2021
On 12/12/2021 21.21, Marc Zyngier wrote:
>> +/* MPIDR fields */
>> +#define MPIDR_CPU GENMASK(7, 0)
>> +#define MPIDR_CLUSTER GENMASK(15, 8)
>
> This should be defined in terms of MPIDR_AFFINITY_LEVEL() and co.
Yeah, I found out about that macro from your PMU driver... :)
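
For reference, a minimal userspace sketch of what that could look like. The MPIDR_LEVEL_* helpers are copied from arm64's <asm/cputype.h> as I remember them, so check them against the tree. The function-style MPIDR_CPU()/MPIDR_CLUSTER() wrappers and aic_same_cluster() are hypothetical names, not part of this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace copies of the arm64 helpers from <asm/cputype.h>, reproduced
 * here only so the sketch builds outside the kernel.
 */
#define MPIDR_LEVEL_BITS_SHIFT	3
#define MPIDR_LEVEL_BITS	(1 << MPIDR_LEVEL_BITS_SHIFT)
#define MPIDR_LEVEL_MASK	((1 << MPIDR_LEVEL_BITS) - 1)
#define MPIDR_LEVEL_SHIFT(level) \
	(((1 << (level)) >> 1) << MPIDR_LEVEL_BITS_SHIFT)
#define MPIDR_AFFINITY_LEVEL(mpidr, level) \
	(((mpidr) >> MPIDR_LEVEL_SHIFT(level)) & MPIDR_LEVEL_MASK)

/* Hypothetical replacements for the open-coded GENMASK() fields. */
#define MPIDR_CPU(x)		MPIDR_AFFINITY_LEVEL(x, 0)	/* Aff0, bits 7:0 */
#define MPIDR_CLUSTER(x)	MPIDR_AFFINITY_LEVEL(x, 1)	/* Aff1, bits 15:8 */

/* True when the target core sits in the sender's cluster (local IPI). */
static bool aic_same_cluster(uint64_t my_mpidr, uint64_t mpidr)
{
	return MPIDR_CLUSTER(my_mpidr) == MPIDR_CLUSTER(mpidr);
}
```

With this, the FIELD_GET(MPIDR_CLUSTER, ...) comparison in aic_ipi_send_fast() would become a plain MPIDR_CLUSTER(...) comparison.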
>> +static const struct aic_info aic1_fipi_info = {
>> + .version = 1,
>> +
>> + .fast_ipi = true,
>
> Do you anticipate multiple feature flags like this? If so, maybe we
> should consider biting the bullet and making this an unsigned long
> populated with discrete flags.
>
> Not something we need to decide now though.
Probably not, but who knows! It's easy to change it later, though.
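
If it ever comes to that, the change might look roughly like the sketch below. Everything here is hypothetical: the AIC_FEAT_FAST_IPI name, the features field and the aic_has_feature() helper are made up for illustration, and struct aic_info is cut down to the two members relevant here.

```c
#include <stdbool.h>

/* Hypothetical flag bit; fast IPI is the only feature in this patch. */
#define AIC_FEAT_FAST_IPI	(1UL << 0)

/* Cut-down stand-in for the driver's struct aic_info. */
struct aic_info {
	int version;
	unsigned long features;	/* bitmask of AIC_FEAT_* */
};

static const struct aic_info aic1_fipi_info = {
	.version  = 1,
	.features = AIC_FEAT_FAST_IPI,
};

/* True only if every bit in feat is set for this variant. */
static bool aic_has_feature(const struct aic_info *info, unsigned long feat)
{
	return (info->features & feat) == feat;
}
```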
>> if (read_sysreg_s(SYS_IMP_APL_IPI_SR_EL1) & IPI_SR_PENDING) {
>> - pr_err_ratelimited("Fast IPI fired. Acking.\n");
>> - write_sysreg_s(IPI_SR_PENDING, SYS_IMP_APL_IPI_SR_EL1);
>> + if (aic_irqc->info.fast_ipi) {
>
> On the other hand, this is likely to hit on the fast path. Given that
> we know at probe time whether we support SR-based IPIs, we can turn
> this into a static key and save a few fetches on every IPI. It applies
> everywhere you look at this flag at runtime.
Good point, I'll see about refactoring this to use static keys.
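
A rough sketch of the static key pattern, not the final code. DEFINE_STATIC_KEY_FALSE(), static_branch_enable() and static_branch_likely() are the real <linux/jump_label.h> interfaces; the userspace stand-ins only let the sketch build outside the kernel and do no code patching. The key name use_fast_ipi, the two helper functions and the ipi_path enum are assumptions.

```c
#ifdef __KERNEL__
#include <linux/jump_label.h>
#else
#include <stdbool.h>
/* Userspace stand-ins so the sketch builds; no code patching happens here. */
struct static_key_false { bool enabled; };
#define DEFINE_STATIC_KEY_FALSE(name)	struct static_key_false name = { false }
#define static_branch_enable(key)	((key)->enabled = true)
#define static_branch_likely(key)	((key)->enabled)
#endif

/* Hypothetical key name; enabled once at probe time, never cleared. */
static DEFINE_STATIC_KEY_FALSE(use_fast_ipi);

enum ipi_path { IPI_PATH_MMIO, IPI_PATH_SYSREG };

/* Probe-time setup: flip the key if the matched variant has fast IPIs. */
static void aic_init_ipi_path(bool fast_ipi)
{
	if (fast_ipi)
		static_branch_enable(&use_fast_ipi);
}

/* Hot path: in the kernel this test is a patched branch, not a memory load. */
static enum ipi_path aic_pick_ipi_path(void)
{
	if (static_branch_likely(&use_fast_ipi))
		return IPI_PATH_SYSREG;
	return IPI_PATH_MMIO;
}
```

The same key would replace each runtime read of info.fast_ipi, including the one in the FIQ handler.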
>> +static void aic_ipi_send_fast(int cpu)
>> +{
>> + u64 mpidr = cpu_logical_map(cpu);
>> + u64 my_mpidr = cpu_logical_map(smp_processor_id());
>
> This is the equivalent of reading MPIDR_EL1. My gut feeling is that it
> is a bit faster to access the sysreg than a percpu lookup, a function
> call and another memory access.
Yeah, I saw other IRQ drivers doing this, but I wasn't sure it made
sense over just reading MPIDR_EL1... I'll switch to that.
>> + u64 idx = FIELD_GET(MPIDR_CPU, mpidr);
>> +
>> + if (FIELD_GET(MPIDR_CLUSTER, my_mpidr) == cluster)
>> + write_sysreg_s(FIELD_PREP(IPI_RR_CPU, idx),
>> + SYS_IMP_APL_IPI_RR_LOCAL_EL1);
>> + else
>> + write_sysreg_s(FIELD_PREP(IPI_RR_CPU, idx) | FIELD_PREP(IPI_RR_CLUSTER, cluster),
>> + SYS_IMP_APL_IPI_RR_GLOBAL_EL1);
>
> Don't you need an ISB, either here or in the two callers? At the
> moment, I don't see what will force the execution of these writes, and
> they could be arbitrarily delayed.
Is there any timeliness requirement for sending IPIs? They're going to
another CPU after all, and they could be arbitrarily delayed anyway if
that CPU has FIQs masked.
>> - if (atomic_read(this_cpu_ptr(&aic_vipi_flag)) & irq_bit)
>> - aic_ic_write(ic, AIC_IPI_SEND, AIC_IPI_SEND_CPU(smp_processor_id()));
>> + if (atomic_read(this_cpu_ptr(&aic_vipi_flag)) & irq_bit) {
>> + if (ic->info.fast_ipi)
>> + aic_ipi_send_fast(smp_processor_id());
>
> nit: if this is common enough, maybe having an aic_ipi_send_self_fast
> could be better. Needs evaluation though.
I'll add some debug prints to see how common self-IPIs are under typical
workloads. If they turn out to be common enough, it's easy to add.
>> + irqc->info = *(struct aic_info *)match->data;
>
> Why the copy? All the data is const, and isn't going away.
... for now, but later patches start computing register offsets and
storing them in this structure :)
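
To illustrate why the copy matters, here is a minimal sketch under assumptions: the event field, the cut-down struct layouts and aic_setup_info() are hypothetical stand-ins for whatever the later patches actually add. The const match data stays untouched, and the per-instance copy receives values computed at probe time.

```c
#include <stdint.h>

/* Cut-down stand-in for the driver's struct aic_info. */
struct aic_info {
	int version;
	uint32_t event;		/* hypothetical register offset, filled in later */
};

struct aic_irq_chip {
	struct aic_info info;	/* private, writable copy of the match data */
};

static const struct aic_info aic1_info = { .version = 1 };

/* Copy the const match data, then patch in a value computed at probe time. */
static void aic_setup_info(struct aic_irq_chip *irqc,
			   const struct aic_info *match_data,
			   uint32_t computed_event_off)
{
	irqc->info = *match_data;
	irqc->info.event = computed_event_off;
}
```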
--
Hector Martin (marcan at marcan.st)
Public Key: https://mrcn.st/pub