[PATCH] generic-ipi: Initialize call_single_queue before enabling interrupt

Milton Miller miltonm at bga.com
Fri Mar 25 13:48:21 EDT 2011


On Fri, 25 Mar 2011 about 11:52:20 -0400, Takao Indoh wrote:
> 
> Hi all,
> 
> Actually this is a v2 patch but I post with new subject.
> The first post is here.
> 
> [PATCH][KDUMP] Ignore spurious IPI
> http://www.gossamer-threads.com/lists/linux/kernel/1356389
> 
> The problem I found is that kdump(2nd kernel) sometimes hangs up due to
> pending IPI from 1st kernel. Kernel panic occurs because IPI comes
> before call_single_queue is initialized, so this patch moves
> init_call_single_data() so that it can be called before
> local_irq_enable().
> 

This is good background, but I generally prefer changelogs to start
by describing what is being changed and then proceed on to why, and
then the history of the patch and discussion.

> The details of the problem is below.
> 
> (1)
> 2nd kernel boot up
> 
> (2)
> A pending IPI from 1st kernel comes after unmasking interrupts at the
> following point.
> 
> asmlinkage void __init start_kernel(void)
> {
> (snip)
>     time_init();
>     profile_init();
>     if (!irqs_disabled())
>             printk(KERN_CRIT "start_kernel(): bug: interrupts were "
>                              "enabled early\n");
>     early_boot_irqs_disabled = false;
>     local_irq_enable(); <=======================================HERE
> 

This could be described more concisely as "when irqs are first enabled
in start_kernel()".

> (3)
> Kernel tries to handle the interrupt, but call_single_queue is not
> initialized yet at this point. As a result, in the
> generic_smp_call_function_single_interrupt(), NULL pointer dereference
> occurs when list_replace_init() tries to access &q->list.next.
> 
> Any comments would be appreciated.
> 
> Signed-off-by: Takao Indoh <indou.takao at jp.fujitsu.com>
> ---
>  include/linux/smp.h |    1 +
>  init/main.c         |    1 +
>  kernel/smp.c        |   18 +++++++++++-------
>  3 files changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/smp.h b/include/linux/smp.h
> index 6dc95ca..0b81bba 100644
> --- a/include/linux/smp.h
> +++ b/include/linux/smp.h
> @@ -114,6 +114,7 @@ int on_each_cpu(smp_call_func_t func, void *info, int wait);
>  void smp_prepare_boot_cpu(void);
>  
>  extern unsigned int setup_max_cpus;
> +void init_call_single_data(void);
>  

This no longer applys (although it did to 2.6.38).

>  #else /* !SMP */
>  

I was going to point out that this will fail on up, but it looks
like ba8dd03ac09f51a69c154b8cb508b701d713a2cd (Full conversion to
early_initcall() interface, remove old interface) only found 2 of the
previous 3 occurrances.

However, this does point out that not all CONFIG_SMP compiles are
CONFIG_USE_GENERIC_SMP_HELPERS, and you need to adjust accordingly.

[ for other reviewers no this is not a simple revert of a hunk of that
2.6.27 change, this is moving the code yet earlier ].

> diff --git a/init/main.c b/init/main.c
> index 33c37c3..02a7617 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -631,6 +631,7 @@ asmlinkage void __init start_kernel(void)
>  		printk(KERN_CRIT "start_kernel(): bug: interrupts were "
>  				 "enabled early\n");
>  	early_boot_irqs_disabled = false;
> +	init_call_single_data();
>  	local_irq_enable();
>  

1) this is too late, the point of the printk and the state variable is
to make sure interrupts are enabled exactly when we expect for the
first time.

2) please follow the naming conventions and name this xxx_init.  Actually
I prefer xxx be the name of the subsystem (call_function) instead of
the name of the data structure (call_single_data).

>  	/* Interrupts are enabled now so all GFP allocations are safe. */
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 7cbd0f2..f119610 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -74,9 +74,19 @@ static struct notifier_block __cpuinitdata hotplug_cfd_notifier = {
>  	.notifier_call		= hotplug_cfd,
>  };
>  
> -static int __cpuinit init_call_single_data(void)
> +static int __cpuinit init_hotplug_cfd(void)
>  {
>  	void *cpu = (void *)(long)smp_processor_id();
> +
> +	hotplug_cfd(&hotplug_cfd_notifier, CPU_UP_PREPARE, cpu);
> +	register_cpu_notifier(&hotplug_cfd_notifier);
> +
> +	return 0;
> +}
> +early_initcall(init_hotplug_cfd);
> +

Why did you seperate this from the rest of the function?  I haven't
tested the patch but don't why this needs to be delayed.  If there
is some reason I don't see, please describe it in the change log.

> +void __init init_call_single_data(void)
> +{
>  	int i;
>  
>  	for_each_possible_cpu(i) {
> @@ -85,13 +95,7 @@ static int __cpuinit init_call_single_data(void)
>  		raw_spin_lock_init(&q->lock);
>  		INIT_LIST_HEAD(&q->list);
>  	}
> -
> -	hotplug_cfd(&hotplug_cfd_notifier, CPU_UP_PREPARE, cpu);
> -	register_cpu_notifier(&hotplug_cfd_notifier);
> -
> -	return 0;
>  }
> -early_initcall(init_call_single_data);
>  
>  /*
>   * csd_lock/csd_unlock used to serialize access to per-cpu csd resources
> 

milton



More information about the kexec mailing list