Failed to wake device (9984)
Ben Greear
greearb at candelatech.com
Fri Sep 15 13:19:31 PDT 2017
On 09/15/2017 12:38 PM, Adrian Chadd wrote:
> On 15 September 2017 at 09:59, Ben Greear <greearb at candelatech.com> wrote:
>> On 09/14/2017 07:33 PM, Adrian Chadd wrote:
>>>
>>> On 14 September 2017 at 17:13, Ben Greear <greearb at candelatech.com> wrote:
>>>
>>>>>
>>>>> There were always weird cold reset races that necessitated a PCI bus
>>>>> reset of the device. :( can you even see the device? do any of the registers
>>>>> work?
>>>>
>>>>
>>>>
>>>> Can the cold reset be done on generic x86-64 hardware?
>>>
>>>
>>> I'll have to go check. You /should/ be able to. Are there power
>>> and reset files in /sys/bus/pci for those devices?
>>>
>>>>
>>>> And, it shows up enough that the system probes it, at least. I guess no
>>>> infrastructure to speak of set up for this thing, so not sure how to
>>>> probe any registers.
>>>
>>>
>>> Well, that could be cached BAR information. There are some cold / warm
>>> reset registers in the RTC block that are used during initial wakeup;
>>> print what they're saying to see if it's coming back 0xffffffff or
>>> 0xdeadc0de or something?
>>
>>
>> One thing I notice: if I simply run
>>   rmmod ath10k_pci ath10k_core; modprobe ath10k_pci
>> then it recovered (1 of 1 so far).
>
> See if that's reliable. For QCA9880 I know it needed a full
> reacharound sometimes (ie, the reference driver has hooks to reach
> back into the PCIe nexus to toggle reset.)

It is not that reliable. I'm now trying a hack to re-probe the bus up
to 3 times if we fail, hoping that will help.

We just hit a case where the first 2 attempts failed, but it booted on
the third.

My patch looks like this:

diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c
index e0a7b338..711b3f0 100644
--- a/drivers/net/wireless/ath/ath10k/pci.c
+++ b/drivers/net/wireless/ath/ath10k/pci.c
@@ -3492,8 +3492,8 @@ static const struct ath10k_bus_ops ath10k_pci_bus_ops = {
 	.get_num_banks = ath10k_pci_get_num_banks,
 };

-static int ath10k_pci_probe(struct pci_dev *pdev,
-			    const struct pci_device_id *pci_dev)
+static int __ath10k_pci_probe(struct pci_dev *pdev,
+			      const struct pci_device_id *pci_dev)
 {
 	int ret = 0;
 	struct ath10k *ar;
@@ -3668,6 +3668,22 @@ static int ath10k_pci_probe(struct pci_dev *pdev,
 	return ret;
 }

+static int ath10k_pci_probe(struct pci_dev *pdev,
+			    const struct pci_device_id *pci_dev)
+{
+	int cnt = 0;
+	int rv;
+	do {
+		rv = __ath10k_pci_probe(pdev, pci_dev);
+		if (rv == 0)
+			return rv;
+		pr_err("ath10k: failed to probe PCI : %d, retry-count: %d\n", rv, cnt);
+		udelay(10000); /* let the ath10k firmware gerbil take a small break */
+	} while (cnt++ < 3);
+	return rv;
+}
+
+
 static void ath10k_pci_remove(struct pci_dev *pdev)
 {
 	struct ath10k *ar = pci_get_drvdata(pdev);
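
For anyone who wants to experiment with the retry idea outside the kernel,
here is a minimal userspace sketch of the same bounded-retry pattern. It is
not the driver code: probe_with_retry(), sleep_ms(), struct flaky and
flaky_probe() are hypothetical names made up for illustration, and the probe
callback only stands in for __ath10k_pci_probe(). Two things differ from the
patch as posted: the loop counts attempts explicitly (the do/while above
tests cnt++ < 3 after each try, so it can make up to four attempts), and it
sleeps between attempts rather than busy-waiting. In the kernel, msleep()
would probably be the closer analogue than udelay() for a 10 ms pause, on
the assumption that probe runs in a context that may sleep.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L	/* for nanosleep() under strict -std= modes */
#endif

#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for __ath10k_pci_probe(): returns 0 on success or a
 * negative errno value on failure. */
typedef int (*probe_fn)(void *ctx);

/* Sleep for roughly ms milliseconds (userspace analogue of msleep()). */
static void sleep_ms(long ms)
{
	struct timespec ts;

	ts.tv_sec = ms / 1000;
	ts.tv_nsec = (ms % 1000) * 1000000L;
	nanosleep(&ts, NULL);
}

/* Call probe() at most max_attempts times, pausing delay_ms between failed
 * attempts. Returns 0 on the first success, otherwise the last error. */
static int probe_with_retry(probe_fn probe, void *ctx,
			    int max_attempts, long delay_ms)
{
	int rv = -EINVAL;
	int attempt;

	assert(max_attempts > 0);

	for (attempt = 1; attempt <= max_attempts; attempt++) {
		rv = probe(ctx);
		if (rv == 0)
			return 0;
		fprintf(stderr, "probe failed: %d (attempt %d of %d)\n",
			rv, attempt, max_attempts);
		if (attempt < max_attempts)
			sleep_ms(delay_ms);	/* no point sleeping after the last try */
	}
	return rv;
}

/* Illustrative flaky device: fails until the succeed_on-th call. */
struct flaky {
	int calls;
	int succeed_on;
};

static int flaky_probe(void *ctx)
{
	struct flaky *f = ctx;

	return (++f->calls >= f->succeed_on) ? 0 : -ETIMEDOUT;
}
```

With a struct flaky whose succeed_on is 3, probe_with_retry(flaky_probe, &f, 3, 10)
fails twice and succeeds on the third call, which matches the case
described above. With succeed_on larger than max_attempts it gives up and
returns the last error.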
Thanks,
Ben
>
>> We'll see if that is a reliable way to recover from this problem. And we will
>> see if we can also find a nicer way to go about it... maybe there is just a
>> timer that is not long enough somewhere?
>
> It's possible. I am just always wary about their host glue in the chip
> :-) If reloading the driver helps then great. But all that /should/ be
> doing is a cold reset / wakeup.
>
>
>
> -adrian
--
Ben Greear <greearb at candelatech.com>
Candela Technologies Inc http://www.candelatech.com