usb: dwc2: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 146s
Doug Anderson
dianders at chromium.org
Thu Apr 20 09:19:18 PDT 2017
Hi,
On Thu, Apr 20, 2017 at 12:46 AM, Stefan Wahren <stefan.wahren at i2se.com> wrote:
> Am 19.04.2017 um 23:47 schrieb Doug Anderson:
>> Hi,
>>
>> On Wed, Apr 19, 2017 at 1:25 PM, Stefan Wahren <stefan.wahren at i2se.com> wrote:
>>> Hi,
>>>
>>>> Doug Anderson <dianders at chromium.org> hat am 18. April 2017 um 22:41 geschrieben:
>>>>
>>>>
>>>> It's hard to know for sure that all of this time is really in
>>>> urb_enqueue(). Possible we could have task switched out and been
>>>> blocked elsewhere. Using ftrace to get more fine-grained timings
>>>> would be useful. ktime_get(), ktime_sub(), and ktime_to_us() are your
>>>> friends here if you want to use trace_printk.
>>> i'm a newbie to ftrace, so i hope this would be helpful.
>>>
>>> # connect PL2303 to the onboard hub
>>> # echo 0 > options/sleep-time
>>> # echo 0 > function_profile_enabled
>>> # echo 1 > function_profile_enabled
>>> # ./usb_test
>>> # Waiting for at least 20 seconds and then disconnect PL2303
>>> # echo 0 > function_profile_enabled
>>> # cat trace_stat/function0
>>>
>>> Function Hit Time Avg s^2
>>> -------- --- ---- --- ---
>>> bcm2835_handle_irq 361347 219567633 us 607.636 us 1485199 us
>>> __handle_domain_irq 1082482 212639551 us 196.437 us 3642030 us
>>> generic_handle_irq 1082482 100592051 us 92.927 us 50511334 us
>>> irq_exit 1082482 98197771 us 90.715 us 29649040 us
>>> handle_level_irq 1082482 95812379 us 88.511 us 51910093 us
>>> do_sys_open 1806 87612983 us 48512.17 us 2198507 us
>>> SyS_open 1601 87372331 us 54573.59 us 1898996 us
>>> do_filp_open 1862 87368058 us 46921.62 us 1634982 us
>>> path_openat 1862 87314553 us 46892.88 us 3357817 us
>>> __do_softirq 3035 86266050 us 28423.73 us 6449768 us
>>> vfs_open 1515 85877012 us 56684.49 us 101673.5 us
>>> do_dentry_open 1515 85861429 us 56674.21 us 812420.7 us
>>> usb_submit_urb 136 85760172 us 630589.5 us 59532024 us
>>> usb_hcd_submit_urb 134 85756518 us 639974.0 us 726298102 us
>>> _dwc2_hcd_urb_enqueue 134 85738333 us 639838.3 us 874104371 us
>> The 134 calls to this are taking a ton of time. It would be nice to
>> know where all the time actually was in here. Are you on a
>> single-core device, or multi-core?
>
> Raspberry Pi B is a single core device which runs with 700 MHz cpu freq.
>
>> Oh, probably this so slow because we're just getting interrupted
>> constantly. You can see that during your trace 80686112 us was in
>> handle_irq_event(). Presumably all of that time could have been
>> counted towards whatever code you were in when the interrupt went off.
>> Of that 80 seconds, 60 seconds or so was usb_hcd_irq, and of that 57
>> seconds was in _dwc2_hcd_irq().
>
> Since the FTDI (full speed device) and the PL2303 (also full speed
> device) behaves differently, i will try to compare those ftraces.
The most interesting to know is what the endpoints looked like for
these two devices. Since they are full speed devices, I'd expect that
any interrupt end points should fire once per "ms" at most.
>> Presumably the big part of your problem is that dwc2_handle_hcd_intr()
>> is just too slow. From glancing at the numbers below, it seems likely
>> that nothing in particular is slow, it's just running a lot of code
>> and (in total) all of that is slow. ...but someone with more time on
>> their hands would need to really debug.
>>
>> One thing that would be interesting would be to see if you can
>> increase the bug clock for talking to the dwc2 controller.
>
> Unfortunately the bcm2835 datasheet [1] doesn't provide many information
> about the USB IP core and the Synopsys documents aren't public
> available. The Raspberry Pi Foundation decided to use a different driver
> which uses FIQ in the downstream tree [2], but this won't be an option here.
>
> Maybe i could start with comparing the register settings between dwc2
> and dwc_otg.
I was suggesting looking at clock speeds related to usb that might be
in /sys/kernel/debug/clk/clk_summary
...but at 700 MHz single core (and I think it's also an older ARM core
so 700 MHz there might be a lot slower than 700 MHz on newer cores),
the theory that you're just running too many instructions is a sane
one.
Probably someone just needs to see if there's any way to fast path
some of the common cases in the IRQ handler.
-Doug
More information about the linux-rpi-kernel
mailing list