BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
Marc Kleine-Budde
mkl at pengutronix.de
Thu Jun 20 06:08:58 EDT 2013
On 06/20/2013 11:57 AM, Catalin Marinas wrote:
> Hi Marc,
>
> On Thu, Jun 20, 2013 at 09:43:33AM +0100, Marc Kleine-Budde wrote:
>> on current linus/master on armv5 we observed stack trashing[1] on high
>> load. I bisected this problem down to commit:
>>
>>> b9d4d42ad901cc848ac87f1cb8923fded3645568 is the first bad commit
>>> commit b9d4d42ad901cc848ac87f1cb8923fded3645568
>>> Author: Catalin Marinas <catalin.marinas at arm.com>
>>> Date: Mon Nov 28 21:57:24 2011 +0000
>>>
>>> ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs
>>>
>>> This patch removes the __ARCH_WANT_INTERRUPTS_ON_CTXSW definition for
>>> ARMv5 and earlier processors. On such processors, the context switch
>>> requires a full cache flush. To avoid high interrupt latencies, this
>>> patch defers the mm switching to the post-lock switch hook if the
>>> interrupts are disabled.
>>>
>>> Reviewed-by: Will Deacon <will.deacon at arm.com>
>>> Tested-by: Will Deacon <will.deacon at arm.com>
>>> Reviewed-by: Frank Rowand <frank.rowand at am.sony.com>
>>> Tested-by: Marc Zyngier <Marc.Zyngier at arm.com>
>>> Signed-off-by: Catalin Marinas <catalin.marinas at arm.com>
>>>
>>> :040000 040000 034899bdcbc9aa59b5455a85a9d78b646b4cf784 ecc23e33a4ca807d4153f87fbea85a9437ff2928 M arch
>>
>> The problem can be reproduced on several mx28 and an at91sam9263 and
>> only occurs if CONFIG_PREEMPT (Preemptible Kernel (Low-Latency Desktop))
>> is enabled.
>>
>> I have the gut feeling that the "if (irqs_disabled())" check in the
>> above patch is not correct for CONFIG_PREEMPT.
>
> The check is there to avoid long interrupt latencies (flushing the whole
> cache with interrupts disabled during context switch). You can drop the
> check and always call cpu_switch_mm() to confirm that it fixes the
> faults.
I have disabled the check:
diff --git a/arch/arm/include/asm/mmu_context.h b/arch/arm/include/asm/mmu_context.h
index a7b85e0..7b3f67f 100644
--- a/arch/arm/include/asm/mmu_context.h
+++ b/arch/arm/include/asm/mmu_context.h
@@ -39,7 +39,7 @@ static inline void check_and_switch_context(struct mm_struct *mm,
if (unlikely(mm->context.vmalloc_seq != init_mm.context.vmalloc_seq))
__check_vmalloc_seq(mm);
- if (irqs_disabled())
+ if (0 && irqs_disabled())
/*
* cpu_switch_mm() needs to flush the VIVT caches. To avoid
* high interrupt latencies, defer the call and continue
...and the test has been running without problems while I'm writing this
email. No more segfaults so far.
> finish_task_switch() calls finish_arch_post_lock_switch() after the
> interrupts have been enabled so that the CPU can actually switch the mm.
> I wonder whether we could actually be preempted after
> finish_lock_switch() but before we actually switched the MMU.
>
> Here's an untested patch (trying to keep it in the arch/arm code):
>
>
> diff --git a/arch/arm/include/asm/mmu_context.h b/arch/arm/include/asm/mmu_context.h
> index a7b85e0..ded85e9 100644
> --- a/arch/arm/include/asm/mmu_context.h
> +++ b/arch/arm/include/asm/mmu_context.h
> @@ -39,17 +39,20 @@ static inline void check_and_switch_context(struct mm_struct *mm,
> if (unlikely(mm->context.vmalloc_seq != init_mm.context.vmalloc_seq))
> __check_vmalloc_seq(mm);
>
> - if (irqs_disabled())
> + if (irqs_disabled()) {
> /*
> * cpu_switch_mm() needs to flush the VIVT caches. To avoid
> * high interrupt latencies, defer the call and continue
> * running with the old mm. Since we only support UP systems
> * on non-ASID CPUs, the old mm will remain valid until the
> - * finish_arch_post_lock_switch() call.
> + * finish_arch_post_lock_switch() call. Preemption needs to be
> + * disabled until the MMU is switched.
> */
> set_ti_thread_flag(task_thread_info(tsk), TIF_SWITCH_MM);
> - else
> + preempt_disable();
> + } else {
> cpu_switch_mm(mm->pgd, mm);
> + }
> }
>
> #define finish_arch_post_lock_switch \
> @@ -59,6 +62,10 @@ static inline void finish_arch_post_lock_switch(void)
> if (test_and_clear_thread_flag(TIF_SWITCH_MM)) {
> struct mm_struct *mm = current->mm;
> cpu_switch_mm(mm->pgd, mm);
> + /*
> + * Preemption disabled in check_and_switch_context().
> + */
> + preempt_enable();
> }
> }
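The deferred-switch logic in the patch above can be modelled in user space. The following is a minimal sketch, not the real kernel code: irqs_off, preempt_count, tif_switch_mm and mm_switches are illustrative stand-ins for the IRQ state, the preemption counter, TIF_SWITCH_MM and the actual cpu_switch_mm() call.

```c
/* User-space sketch of the deferred mm-switch pattern: when
 * "interrupts" are disabled at context-switch time, the expensive
 * cpu_switch_mm() is deferred via a thread flag and performed later
 * in finish_arch_post_lock_switch(), with preemption held off in
 * between so nothing can run on the stale mm. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static bool irqs_off;      /* simulated IRQ state              */
static int  preempt_count; /* simulated preemption counter     */
static bool tif_switch_mm; /* stand-in for TIF_SWITCH_MM       */
static int  mm_switches;   /* how often we "switched the MMU"  */

static void cpu_switch_mm(void) { mm_switches++; }

static void check_and_switch_context(void)
{
    if (irqs_off) {
        /* Defer the cache flush; keep preemption disabled
         * until the MMU has really been switched. */
        tif_switch_mm = true;
        preempt_count++;
    } else {
        cpu_switch_mm();
    }
}

static void finish_arch_post_lock_switch(void)
{
    if (tif_switch_mm) {
        tif_switch_mm = false;
        cpu_switch_mm();
        preempt_count--; /* re-enable preemption */
    }
}

int main(void)
{
    /* Context switch with interrupts disabled: switch is deferred. */
    irqs_off = true;
    check_and_switch_context();
    assert(mm_switches == 0 && preempt_count == 1);

    /* Interrupts back on; the post-lock-switch hook does the work. */
    irqs_off = false;
    finish_arch_post_lock_switch();
    assert(mm_switches == 1 && preempt_count == 0);

    printf("deferred switch ok\n");
    return 0;
}
```

The point the sketch makes explicit is the window Catalin describes: between setting the flag and running finish_arch_post_lock_switch(), preemption must stay off, otherwise another task could run with the old mm still live.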
Will now reboot to test your patch.
thanks,
Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Industrial Linux Solutions | Phone: +49-231-2826-924 |
Vertretung West/Dortmund | Fax: +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de |