arm64 lockdep splat

Mark Rutland mark.rutland at arm.com
Wed Jun 28 08:11:05 PDT 2017


On Wed, Jun 28, 2017 at 10:49:57AM -0400, Mark Salter wrote:
> Hi Mark.

Hi Mark,

> I'm seeing this with lock debugging turned on and booting with ACPI:
> 
> [    0.137762] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags)) 
> [    0.137773] ------------[ cut here ]------------ 
> [    0.137785] WARNING: CPU: 0 PID: 12 at kernel/locking/lockdep.c:2881 lockdep_trace_alloc+0xb4/0xbc 
> [    0.137788] Modules linked in: 
> [    0.137793]  
> [    0.137797] CPU: 0 PID: 12 Comm: cpuhp/0 Not tainted 4.11.0-10.el7a.aarch64.debug #1 
> [    0.137800] Hardware name: HPE ProLiant m400 Server/ProLiant m400 Server, BIOS U02 08/19/2016 
> [    0.137803] task: ffff800fc656d000 task.stack: ffff800fc65c8000 
> [    0.137807] PC is at lockdep_trace_alloc+0xb4/0xbc 
> [    0.137810] LR is at lockdep_trace_alloc+0xb4/0xbc 
> ...
> [    0.137939] [<ffff00000814559c>] lockdep_trace_alloc+0xb4/0xbc 
> [    0.137944] [<ffff0000082b4fa0>] kmem_cache_alloc_trace+0x48/0x400 
> [    0.137949] [<ffff000008737ac8>] armpmu_alloc+0x38/0x1e4 
> [    0.137954] [<ffff000008738588>] arm_pmu_acpi_cpu_starting+0x170/0x1c4 
> [    0.137958] [<ffff0000080d5f6c>] cpuhp_invoke_callback+0x100/0xcc0 
> [    0.137961] [<ffff0000080d758c>] cpuhp_thread_fun+0xd8/0x12c 
> [    0.137966] [<ffff000008104670>] smpboot_thread_fn+0x170/0x27c 
> [    0.137970] [<ffff0000080fe910>] kthread+0x114/0x140 
> [    0.137975] [<ffff0000080833d0>] ret_from_fork+0x10/0x40 

Sorry about this; I have a partial fix, but nothing complete yet.

> Specifically, warning about possible __GFP_FS reclaim with interrupts off.
> Interrupts are disabled for cpuhp startup threads before CPUHP_AP_ONLINE, Is
> there any reason why CPUHP_AP_PERF_ARM_ACPI_STARTING can't be moved after
> CPUHP_AP_ONLINE? 

I'll need to go digging into this. I can't immediately recall why
CPUHP_AP_PERF_ARM_ACPI_STARTING and CPUHP_AP_PERF_ARM_STARTING need to
be prior to CPUHP_AP_ONLINE.

I'm confused by the relationship with CPUHP_AP_PERF_ONLINE, and I think
we might have other subtle breakage here in other perf drivers.

Thanks for pointing this out -- this isn't an avenue I'd considered for
fixing this.

> Or we could enabled irqs in arm_pmu_acpi_cpu_starting()?

I don't believe that this is safe, given the CPU isn't fully up yet.
Interrupts are presumably disabled with good reason.

> Or change the alloc flags?

Doing that's a first step, but we'll subsequently hit similar issues
when fiddling with the irqs, and I haven't yet found a way to make that
work.
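
To illustrate why the flags matter, here is a rough userspace model of the
condition behind the warning above. It assumes the check only fires for
allocations that may enter direct reclaim and permit __GFP_FS while IRQs
are off. The MODEL_* flag values and the function name are hypothetical
and are not the kernel's definitions:

```c
#include <stdbool.h>

/* Hypothetical flag bits for this model only; not the kernel's values. */
#define MODEL_GFP_DIRECT_RECLAIM 0x1u
#define MODEL_GFP_FS             0x2u

/* A sleeping, FS-capable mask versus a non-sleeping one (model only). */
#define MODEL_GFP_KERNEL (MODEL_GFP_DIRECT_RECLAIM | MODEL_GFP_FS)
#define MODEL_GFP_ATOMIC 0x0u

/*
 * Returns true if this simplified model would warn: the allocation may
 * enter direct reclaim, allows FS reclaim, and IRQs are disabled.
 */
bool model_alloc_would_warn(unsigned int gfp_mask, bool irqs_disabled)
{
    /* No direct reclaim means nothing to complain about. */
    if (!(gfp_mask & MODEL_GFP_DIRECT_RECLAIM))
        return false;

    /* Only FS-capable allocations are of interest. */
    if (!(gfp_mask & MODEL_GFP_FS))
        return false;

    return irqs_disabled;
}
```

In this model a MODEL_GFP_KERNEL allocation with IRQs off warns, while a
MODEL_GFP_ATOMIC one does not. That only covers the allocation; it says
nothing about the later IRQ setup.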

Thanks,
Mark.


