[PATCH v2 00/18] i.MX8MM GPC improvements and BLK_CTRL driver

Frieder Schrempf frieder.schrempf at kontron.de
Mon Sep 6 00:49:22 PDT 2021


On 02.09.21 12:25, Lucas Stach wrote:
> Hi Frieder,
> 
> On Wednesday, 01.09.2021 at 12:03 +0200, Frieder Schrempf wrote:
> [...]
>>>>
>>>>>
>>>>> And I would appreciate if someone else could try to reproduce this problem on his/her side. I use this simple script for testing:
>>>>>
>>>>> #!/bin/sh
>>>>>
>>>>> glmark2-es2-drm &
>>>>>
>>>>> while true;
>>>>> do
>>>>>     echo +10 > /sys/class/rtc/rtc0/wakealarm
>>>>>     echo mem > /sys/power/state
>>>>>     sleep 5
>>>>> done;
>>>>
>>>> Hm, that's unfortunate.
>>>>
>>>> I'm back from a two week vacation, but it looks like I won't have much
>>>> time available to look into this issue soon. It would be very helpful
>>>> if you could try to pinpoint the hang a bit more.  If you can reproduce
>>>> the hang with no_console_suspend you might be able to extract a bit
>>>> more info in which stage the hang happens (suspend, resume, TF-A, etc.)
>>>> If the hang is in the kernel you might be able to add some prints to
>>>> the suspend/resume paths to be able to track down the exact point of
>>>> the hang.
>>>>
>>>> I'm happy to look into the issue once it's better known where to look,
>>>> but I fear that I won't have time to do the above investigation myself
>>>> short term. Frieder, is this something you could help with over the
>>>> next few days?
>>>
>>> I will see if I can find some time to track down the issue at least a little bit more. But I imagine it could get quite tedious if it takes up to several hours to reproduce the issue and I don't have much time to spare.
>>>
>>> @Peng, @Adam and everyone else: Any chance you could set up a similar test and try to reproduce this?
>>>
>>> On the other hand, reboot cycle testing didn't show any lockup problems over more than 24 hours, so it seems like the issue is limited to suspend/resume.
>>
>> I ran a few more suspend/resume cycles and watched the log. For the
>> first 2.5 hours nothing noteworthy happened, except that glmark2
>> crashed again at some point.
>>
>> Then suddenly the following lines were printed while suspending:
>>
>>   imx-pgc imx-pgc-domain.6: failed to command PGC
>>   PM: dpm_run_callback(): platform_pm_suspend+0x0/0x78 returns -110
>>   imx8m-blk-ctrl 38330000.blk-ctrl: PM: failed to suspend: error -110
>>   PM: Some devices failed to suspend, or early wake event detected
>>
>> After that, suspend continues to fail with the following on each try:
>>
>>   PM: dpm_run_callback(): platform_pm_suspend+0x0/0x78 returns -22
>>   imx8m-blk-ctrl 38330000.blk-ctrl: PM: failed to suspend: error -22
>>   PM: Some devices failed to suspend, or early wake event detected
>>
>> So far I didn't run into a lockup again with this test, but I will
>> continue trying to reproduce it and retrieve more information.
> 
> If you run into this "failed to command PGC" state again, I would be
> very interested in the GPC state there. You should be able to dump the
> full register state from the GPC regmap in debugfs.

I have been trying to reproduce this with the same setup for several
days now, but I haven't run into this error again so far. It seems to
occur only very rarely.
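
Since the error is this rare, it may help to capture the register state automatically the moment a suspend attempt fails. The following is only a minimal sketch, not a tested procedure. It assumes that debugfs is mounted at /sys/kernel/debug, that the kernel exposes regmaps under /sys/kernel/debug/regmap/, and that the GPC regmap directory name contains "gpc". The function name dump_regmaps and the output location are hypothetical.

```shell
#!/bin/sh
# Sketch: snapshot regmap register dumps for later inspection.
# Assumptions (not confirmed in this thread): debugfs is mounted and the
# GPC regmap directory name matches the glob pattern passed in.

# dump_regmaps ROOT PATTERN OUTDIR
# Copies ROOT/<dir matching PATTERN>/registers to OUTDIR/<dir>.regs.
# Returns 0 if at least one dump was written, 1 otherwise.
dump_regmaps() {
    root=$1
    pattern=$2
    outdir=$3
    found=1

    mkdir -p "$outdir" || return 1
    # $pattern is left unquoted on purpose so the shell expands the glob.
    for dir in "$root"/$pattern; do
        [ -r "$dir/registers" ] || continue
        cat "$dir/registers" > "$outdir/$(basename "$dir").regs" && found=0
    done
    return $found
}
```

In the test loop, the write to /sys/power/state returns an error when the kernel aborts the suspend, so the loop could call something like `dump_regmaps /sys/kernel/debug/regmap '*gpc*' /root/gpc-dump` and then `break` whenever `echo mem > /sys/power/state` fails. That keeps the board in the failed state for further inspection. Note that an early wake event also makes the write fail, so not every dump would correspond to a PGC problem.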

I also got only a single lockup with this board in roughly 40 h of
testing in total. On the other hand, I have a different board (same
design) that shows the lockups much more often.
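
To put numbers on how often each failure shows up on the two boards, the suspend errors in a saved console log could be tallied per error code. This is a rough sketch that assumes the log contains "PM: failed to suspend: error -N" lines in the format quoted above. The function name count_suspend_errors and the log file are hypothetical. For reference, -110 is -ETIMEDOUT and -22 is -EINVAL.

```shell
#!/bin/sh
# count_suspend_errors LOGFILE
# Prints one "<count> <errno>" line per distinct error code found in
# "PM: failed to suspend: error -N" lines of LOGFILE.
count_suspend_errors() {
    sed -n 's/.*PM: failed to suspend: error \(-[0-9][0-9]*\).*/\1/p' "$1" |
        sort | uniq -c | awk '{ print $1, $2 }'
}
```

Running it on the console log of each board after a test run would show whether the timeout (-110) precedes the repeated -22 failures on both boards or only on one of them.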

I hope I can provide more data soon, but I can't promise anything.



More information about the linux-arm-kernel mailing list