Possible issue with firmware crash reporting.

Mon Sep 29 04:04:27 PDT 2014

Ben Greear <greearb at candelatech.com> writes:

> This kernel is basically linux-ath from a few days ago
> plus a bunch of my patches, including my versions of the firmware
> BSS and stack dump patches.
> Problem could be mine alone, but likely the patches Kalle
> is working on would be susceptible to the same sort of problem.
>
> I produced this by purposefully crashing the firmware during
> station registration while debugging some firmware issues.
>
> This is just FYI, but if someone cares to do similar
> testing, I can build a special firmware that crashes
> in the same way and make it available.
>
>
> =================================
> [ INFO: inconsistent lock state ]
> 3.17.0-rc6+ #3 Not tainted
> ---------------------------------
> inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> swapper/2/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
>  (uevent_sock_mutex){+.?.+.}, at: [<ffffffff8133d402>]
> kobject_uevent_env+0x2b8/0x5d7

[...]

> {SOFTIRQ-ON-W} state was registered at:
>   [<ffffffff81111f34>] __lock_acquire+0x352/0xe48
>   [<ffffffff81112ef6>] lock_acquire+0xd2/0x120
>   [<ffffffff8165c77c>] mutex_lock_nested+0x4f/0x3c7
>   [<ffffffff8133d402>] kobject_uevent_env+0x2b8/0x5d7
>   [<ffffffff8133d72c>] kobject_uevent+0xb/0xd
>   [<ffffffff8133c970>] kset_register+0x30/0x3e
>   [<ffffffff81431a7a>] bus_register+0xae/0x292
>   [<ffffffff81d69174>] platform_bus_init+0x29/0x41
>   [<ffffffff81d69202>] driver_init+0x27/0x33
>   [<ffffffff81d1e0d9>] kernel_init_freeable+0x155/0x263
>   [<ffffffff8164e95a>] kernel_init+0x9/0xda
>   [<ffffffff8165f0bc>] ret_from_fork+0x7c/0xb0

[...]

>  <IRQ>  [<ffffffff81657366>] dump_stack+0x4e/0x71
>  [<ffffffff81653c50>] print_usage_bug+0x1ec/0x1fd
>  [<ffffffff8101bcae>] ? save_stack_trace+0x27/0x44
>  [<ffffffff81111457>] ? check_usage_backwards+0xa0/0xa0
>  [<ffffffff81111aeb>] mark_lock+0x11b/0x212
>  [<ffffffff81111ebe>] __lock_acquire+0x2dc/0xe48
>  [<ffffffff81113215>] ? mark_held_locks+0x54/0x76
>  [<ffffffff811904f3>] ? __free_pages_ok+0xb3/0xca
>  [<ffffffff811133c9>] ? trace_hardirqs_on_caller+0x192/0x1a1
>  [<ffffffff81112ef6>] lock_acquire+0xd2/0x120
>  [<ffffffff8133d402>] ? kobject_uevent_env+0x2b8/0x5d7
>  [<ffffffff8165c77c>] mutex_lock_nested+0x4f/0x3c7
>  [<ffffffff8133d402>] ? kobject_uevent_env+0x2b8/0x5d7
>  [<ffffffff8133d402>] ? kobject_uevent_env+0x2b8/0x5d7
>  [<ffffffff81430f16>] ? dev_uevent+0x1d4/0x274
>  [<ffffffff8133c147>] ? kobject_get_path+0x8c/0xdb
>  [<ffffffff8133d402>] kobject_uevent_env+0x2b8/0x5d7
>  [<ffffffff811133c9>] ? trace_hardirqs_on_caller+0x192/0x1a1
>  [<ffffffffa069c70f>] ath10k_pci_fw_crashed_dump+0x456/0x535 [ath10k_pci]
>  [<ffffffff81006432>] ? xen_set_domain_pte+0x37/0xe1
>  [<ffffffffa069c854>] ath10k_pci_tasklet+0x27/0x5a [ath10k_pci]
>  [<ffffffff810dcd4d>] tasklet_action+0xcb/0xdd

If I'm reading this right, uevent_sock_mutex is taken by both
platform_bus_init() and the ath10k tasklet, the latter from softirq
context: ath10k_pci_fw_crashed_dump() tries to acquire the same lock
via kobject_uevent_env(). What I don't understand is how
ath10k_pci_fw_crashed_dump() ends up calling kobject_uevent_env(). I
just can't find a code path that does that.

Are you sure you don't have some custom patches which cause this, like
sending a uevent whenever firmware crashes?
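
To spell out how I read the lockdep report, here is a toy Python model
of the rule it seems to be complaining about. This is only a sketch and
not how lockdep is actually implemented: ToyLockdep and acquire() are
made-up names, and it tracks nothing but the two softirq usage states
named in the log above.

```python
class ToyLockdep:
    """Toy model of one lockdep rule: the same lock should not be taken
    both with softirqs enabled and from softirq context."""

    def __init__(self):
        # lock name -> usage states seen so far, in order of first use
        self.usage = {}

    def acquire(self, lock, in_softirq):
        """Record an acquisition; return a report string on a conflict,
        or None if the usage is consistent so far."""
        state = "IN-SOFTIRQ-W" if in_softirq else "SOFTIRQ-ON-W"
        seen = self.usage.setdefault(lock, [])
        # Any previously recorded state that differs from this one
        conflict = [s for s in seen if s != state]
        if state not in seen:
            seen.append(state)
        if conflict:
            return "inconsistent {%s} -> {%s} usage." % (conflict[0], state)
        return None


ld = ToyLockdep()
# Boot-time path (platform_bus_init): process context, softirqs enabled
boot = ld.acquire("uevent_sock_mutex", in_softirq=False)
# Crash path (ath10k tasklet): running in softirq context
crash = ld.acquire("uevent_sock_mutex", in_softirq=True)
print(crash)
```

The first acquisition is recorded without complaint. The second one is
flagged because the same lock has now been seen in both states.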

-- 
Kalle Valo