[Bug 215679] New: NVMe/writeback wb_workfn/blocked for more than 30 seconds

Thorsten Leemhuis regressions at leemhuis.info
Thu Apr 21 04:27:41 PDT 2022


Hi, this is your Linux kernel regression tracker. Top-posting for once,
to make this easily accessible to everyone.

Jens, this regression is now a month old and the culprit was already
identified when the issue was reported; nevertheless, it seems no real
progress was made to address this, at least afaics. What's up there?
Should the culprit be reverted?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.

#regzbot poke

On 14.04.22 21:08, Keith Busch wrote:
> On Sun, Mar 13, 2022 at 08:13:53PM +0000, bugzilla-daemon at kernel.org wrote:
>> https://bugzilla.kernel.org/show_bug.cgi?id=215679
>>
>>             Bug ID: 215679
>>            Summary: NVMe/writeback wb_workfn/blocked for more than 30
>>                     seconds
>>            Product: IO/Storage
>>            Version: 2.5
>>     Kernel Version: 5.17.0-rc7
>>           Hardware: x86-64
>>                 OS: Linux
>>               Tree: Mainline
>>             Status: NEW
>>           Severity: high
>>           Priority: P1
>>          Component: NVMe
>>           Assignee: io_nvme at kernel-bugs.kernel.org
>>           Reporter: imre.deak at intel.com
>>         Regression: No
>>
>> Created attachment 300564
>>   --> https://bugzilla.kernel.org/attachment.cgi?id=300564&action=edit
>> dmesg log after suspend resume, io stuck
>>
>> After system suspend/resume, filesystem IO will stall, producing a 'kworker
>> blocked for more than x sec' message in dmesg, recovering after a long delay. See the
>> attached dmesg-suspend-resume-nvme-stuck.txt. I also noticed the same issue
>> happening right after booting or after runtime suspend transitions.
>>
>> The same issue also happens on multiple SKL systems in the i915 team's CI farm,
>> see:
>>
>> https://gitlab.freedesktop.org/drm/intel/-/issues/4547
>>
>> I bisected the problem to
>> commit 4f5022453acd0f7b28012e20b7d048470f129894
>> Author: Jens Axboe <axboe at kernel.dk>
>> Date:   Mon Oct 18 08:45:39 2021 -0600
>>
>>     nvme: wire up completion batching for the IRQ path
>>
>> By reverting it on top of 5.17.0-rc7, I can't reproduce the problem. Attached
>> dmesg-suspend-resume-nvme-ok.txt with the revert, captured after a few
>> suspend/resume cycles.
> 
> Forwarding to linux-nvme for higher visibility.
> 



