Possible issue with firmware crash reporting.
Ben Greear
greearb at candelatech.com
Mon Sep 29 09:10:53 PDT 2014
On 09/29/2014 04:04 AM, Kalle Valo wrote:
> Ben Greear <greearb at candelatech.com> writes:
>
>> This kernel is basically linux-ath from a few days ago
>> plus a bunch of my patches, including my versions of the firmware
>> BSS and stack dump patches.
>> Problem could be mine alone, but likely the patches Kalle
>> is working on would be susceptible to the same sort of problem.
>>
>> I produced this by purposefully crashing the firmware during
>> station registration while debugging some firmware issues.
>>
>> This is just FYI, but if someone cares to do similar
>> testing, I can build a special firmware that crashes
>> in the same way and make it available.
>>
>>
>> =================================
>> [ INFO: inconsistent lock state ]
>> 3.17.0-rc6+ #3 Not tainted
>> ---------------------------------
>> inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
>> swapper/2/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
>> (uevent_sock_mutex){+.?.+.}, at: [<ffffffff8133d402>] kobject_uevent_env+0x2b8/0x5d7
>
> [...]
>
>> {SOFTIRQ-ON-W} state was registered at:
>> [<ffffffff81111f34>] __lock_acquire+0x352/0xe48
>> [<ffffffff81112ef6>] lock_acquire+0xd2/0x120
>> [<ffffffff8165c77c>] mutex_lock_nested+0x4f/0x3c7
>> [<ffffffff8133d402>] kobject_uevent_env+0x2b8/0x5d7
>> [<ffffffff8133d72c>] kobject_uevent+0xb/0xd
>> [<ffffffff8133c970>] kset_register+0x30/0x3e
>> [<ffffffff81431a7a>] bus_register+0xae/0x292
>> [<ffffffff81d69174>] platform_bus_init+0x29/0x41
>> [<ffffffff81d69202>] driver_init+0x27/0x33
>> [<ffffffff81d1e0d9>] kernel_init_freeable+0x155/0x263
>> [<ffffffff8164e95a>] kernel_init+0x9/0xda
>> [<ffffffff8165f0bc>] ret_from_fork+0x7c/0xb0
>
> [...]
>
>> <IRQ> [<ffffffff81657366>] dump_stack+0x4e/0x71
>> [<ffffffff81653c50>] print_usage_bug+0x1ec/0x1fd
>> [<ffffffff8101bcae>] ? save_stack_trace+0x27/0x44
>> [<ffffffff81111457>] ? check_usage_backwards+0xa0/0xa0
>> [<ffffffff81111aeb>] mark_lock+0x11b/0x212
>> [<ffffffff81111ebe>] __lock_acquire+0x2dc/0xe48
>> [<ffffffff81113215>] ? mark_held_locks+0x54/0x76
>> [<ffffffff811904f3>] ? __free_pages_ok+0xb3/0xca
>> [<ffffffff811133c9>] ? trace_hardirqs_on_caller+0x192/0x1a1
>> [<ffffffff81112ef6>] lock_acquire+0xd2/0x120
>> [<ffffffff8133d402>] ? kobject_uevent_env+0x2b8/0x5d7
>> [<ffffffff8165c77c>] mutex_lock_nested+0x4f/0x3c7
>> [<ffffffff8133d402>] ? kobject_uevent_env+0x2b8/0x5d7
>> [<ffffffff8133d402>] ? kobject_uevent_env+0x2b8/0x5d7
>> [<ffffffff81430f16>] ? dev_uevent+0x1d4/0x274
>> [<ffffffff8133c147>] ? kobject_get_path+0x8c/0xdb
>> [<ffffffff8133d402>] kobject_uevent_env+0x2b8/0x5d7
>> [<ffffffff811133c9>] ? trace_hardirqs_on_caller+0x192/0x1a1
>> [<ffffffffa069c70f>] ath10k_pci_fw_crashed_dump+0x456/0x535 [ath10k_pci]
>> [<ffffffff81006432>] ? xen_set_domain_pte+0x37/0xe1
>> [<ffffffffa069c854>] ath10k_pci_tasklet+0x27/0x5a [ath10k_pci]
>> [<ffffffff810dcd4d>] tasklet_action+0xcb/0xdd
>
> If I'm reading this right, uevent_sock_mutex is taken by both
> platform_bus_init() and the ath10k tasklet: ath10k_pci_fw_crashed_dump()
> tries to acquire the same lock via kobject_uevent_env(). What I don't
> understand is how ath10k_pci_fw_crashed_dump() ends up calling
> kobject_uevent_env(); I just can't find a code path that does that.
>
> Are you sure you don't have some custom patches which cause this, like
> sending a uevent whenever firmware crashes?
Well yes, I think I do have that patch in this kernel.
I'll remove it; I can key off the ethtool stats for
firmware crash counts instead.
Thanks,
Ben
--
Ben Greear <greearb at candelatech.com>
Candela Technologies Inc http://www.candelatech.com
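
For anyone wanting to try the ethtool-based approach mentioned above, here is a rough user-space sketch in Python. It polls the output of "ethtool -S <iface>" and compares a crash counter between reads, so no uevent has to be sent from the tasklet. The counter name "d_fw_crash_count", the sample output format, and the helper names are assumptions made for illustration; check what your driver build actually reports before relying on them.

```python
import subprocess


def parse_ethtool_stats(text):
    """Parse 'name: value' lines from `ethtool -S` output into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        # Split on the last colon so the value is whatever follows it.
        name, sep, value = line.rpartition(":")
        if not sep:
            continue
        try:
            stats[name.strip()] = int(value.strip())
        except ValueError:
            # Header lines such as "NIC statistics:" carry no integer value.
            continue
    return stats


def read_crash_count(iface, key="d_fw_crash_count"):
    """Return the crash counter for iface, or None if the key is absent.

    The default key is an assumed stat name, not confirmed by this thread.
    """
    out = subprocess.run(
        ["ethtool", "-S", iface],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ethtool_stats(out).get(key)


def crashed_since(prev, curr):
    """True if the counter increased between two successful reads."""
    return prev is not None and curr is not None and curr > prev
```

A monitoring loop would call read_crash_count() periodically, keep the previous value, and act when crashed_since(prev, curr) returns True. Polling is slower to notice a crash than a uevent, but it keeps sleeping locks out of softirq context.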