[PATCH] generic-ipi: Initialize call_single_queue before enabling interrupt

Takao Indoh indou.takao at jp.fujitsu.com
Fri Mar 25 18:07:52 EDT 2011


On Fri, 25 Mar 2011 11:48:21 -0600, Milton Miller wrote:

>On Fri, 25 Mar 2011 about 11:52:20 -0400, Takao Indoh wrote:
>> 
>> Hi all,
>> 
>> Actually this is a v2 patch but I post with new subject.
>> The first post is here.
>> 
>> [PATCH][KDUMP] Ignore spurious IPI
>> http://www.gossamer-threads.com/lists/linux/kernel/1356389
>> 
>> The problem I found is that kdump(2nd kernel) sometimes hangs up due to
>> pending IPI from 1st kernel. Kernel panic occurs because IPI comes
>> before call_single_queue is initialized, so this patch moves
>> init_call_single_data() so that it can be called before
>> local_irq_enable().
>> 
>
>This is good background, but I generally prefer changelogs to start
>by describing what is being changed and then proceed on to why, and
>then the history of the patch and discussion.
>
>> The details of the problem is below.
>> 
>> (1)
>> 2nd kernel boot up
>> 
>> (2)
>> A pending IPI from 1st kernel comes after unmasking interrupts at the
>> following point.
>> 
>> asmlinkage void __init start_kernel(void)
>> {
>> (snip)
>>     time_init();
>>     profile_init();
>>     if (!irqs_disabled())
>>             printk(KERN_CRIT "start_kernel(): bug: interrupts were "
>>                              "enabled early\n");
>>     early_boot_irqs_disabled = false;
>>     local_irq_enable(); <=======================================HERE
>> 
>
>This could be described more concisely as "when irqs are first enabled
>in start_kernel()".
>
>> (3)
>> Kernel tries to handle the interrupt, but call_single_queue is not
>> initialized yet at this point. As a result, in the
>> generic_smp_call_function_single_interrupt(), NULL pointer dereference
>> occurs when list_replace_init() tries to access &q->list.next.
>> 
>> Any comments would be appreciated.
>> 
>> Signed-off-by: Takao Indoh <indou.takao at jp.fujitsu.com>
>> ---
>>  include/linux/smp.h |    1 +
>>  init/main.c         |    1 +
>>  kernel/smp.c        |   18 +++++++++++-------
>>  3 files changed, 13 insertions(+), 7 deletions(-)
>> 
>> diff --git a/include/linux/smp.h b/include/linux/smp.h
>> index 6dc95ca..0b81bba 100644
>> --- a/include/linux/smp.h
>> +++ b/include/linux/smp.h
>> @@ -114,6 +114,7 @@ int on_each_cpu(smp_call_func_t func, void *info, int 
>> wait);
>>  void smp_prepare_boot_cpu(void);
>>  
>>  extern unsigned int setup_max_cpus;
>> +void init_call_single_data(void);
>>  
>
>This no longer applys (although it did to 2.6.38).
>
>>  #else /* !SMP */
>>  
>
>I was going to point out that this will fail on up, but it looks
>like ba8dd03ac09f51a69c154b8cb508b701d713a2cd (Full conversion to
>early_initcall() interface, remove old interface) only found 2 of the
>previous 3 occurrances.
>
>However, this does point out that not all CONFIG_SMP compiles are
>CONFIG_USE_GENERIC_SMP_HELPERS, and you need to adjust accordingly.
>
>[ for other reviewers no this is not a simple revert of a hunk of that
>2.6.27 change, this is moving the code yet earlier ].
>
>> diff --git a/init/main.c b/init/main.c
>> index 33c37c3..02a7617 100644
>> --- a/init/main.c
>> +++ b/init/main.c
>> @@ -631,6 +631,7 @@ asmlinkage void __init start_kernel(void)
>>  		printk(KERN_CRIT "start_kernel(): bug: interrupts were "
>>  				 "enabled early\n");
>>  	early_boot_irqs_disabled = false;
>> +	init_call_single_data();
>>  	local_irq_enable();
>>  
>
>1) this is too late, the point of the printk and the state variable is
>to make sure interrupts are enabled exactly when we expect for the
>first time.

Ok, so the following place is correct?

 	profile_init();
+	call_function_init();
 	if (!irqs_disabled())
 		printk(KERN_CRIT "start_kernel(): bug: interrupts were "
 				"enabled early\n");
 	early_boot_irqs_disabled = false;
 	local_irq_enable();


>
>2) please follow the naming conventions and name this xxx_init.  Actually
>I prefer xxx be the name of the subsystem (call_function) instead of
>the name of the data structure (call_single_data).
>
>>  	/* Interrupts are enabled now so all GFP allocations are safe. */
>> diff --git a/kernel/smp.c b/kernel/smp.c
>> index 7cbd0f2..f119610 100644
>> --- a/kernel/smp.c
>> +++ b/kernel/smp.c
>> @@ -74,9 +74,19 @@ static struct notifier_block __cpuinitdata 
>> hotplug_cfd_notifier = {
>>  	.notifier_call		= hotplug_cfd,
>>  };
>>  
>> -static int __cpuinit init_call_single_data(void)
>> +static int __cpuinit init_hotplug_cfd(void)
>>  {
>>  	void *cpu = (void *)(long)smp_processor_id();
>> +
>> +	hotplug_cfd(&hotplug_cfd_notifier, CPU_UP_PREPARE, cpu);
>> +	register_cpu_notifier(&hotplug_cfd_notifier);
>> +
>> +	return 0;
>> +}
>> +early_initcall(init_hotplug_cfd);
>> +
>
>Why did you seperate this from the rest of the function?  I haven't
>tested the patch but don't why this needs to be delayed.  If there
>is some reason I don't see, please describe it in the change log.

There is not special reason. I just thought I should minimize the change
to prevent regression because I cannot do hotplug test. Ok, next time
I'll just change the name from init_call_single_data to
call_function_init instead of separating it.

It seems that Peter Zijlstra added this hotplug code at 8969a5ed, so I
added him to Cc.

Thanks for your review.

Thanks,
Takao Indoh


>
>> +void __init init_call_single_data(void)
>> +{
>>  	int i;
>>  
>>  	for_each_possible_cpu(i) {
>> @@ -85,13 +95,7 @@ static int __cpuinit init_call_single_data(void)
>>  		raw_spin_lock_init(&q->lock);
>>  		INIT_LIST_HEAD(&q->list);
>>  	}
>> -
>> -	hotplug_cfd(&hotplug_cfd_notifier, CPU_UP_PREPARE, cpu);
>> -	register_cpu_notifier(&hotplug_cfd_notifier);
>> -
>> -	return 0;
>>  }
>> -early_initcall(init_call_single_data);
>>  
>>  /*
>>   * csd_lock/csd_unlock used to serialize access to per-cpu csd resources
>> 
>
>milton



More information about the kexec mailing list