[bug report] BUG: workqueue leaked atomic, lock or RCU: kworker/u67:3[365] observed from v6.15-rc7
Yi Zhang
yi.zhang at redhat.com
Tue Jun 10 19:06:40 PDT 2025
On Tue, Jun 3, 2025 at 12:25 AM Yi Zhang <yi.zhang at redhat.com> wrote:
>
>
>
> On Mon, Jun 2, 2025 at 3:52 PM John Garry <john.g.garry at oracle.com> wrote:
>>
>> On 02/06/2025 06:52, Christoph Hellwig wrote:
>>
>> +
>>
>> > On Thu, May 29, 2025 at 08:41:36PM +0800, Yi Zhang wrote:
>> >> Hi
>> >>
>> >> My regression test found this issue from v6.15-rc7; please help check
>> >> it and let me know if you need any info/tests for it, thanks.
>> >
>> > Hi Yi,
>> >
>> > The new code seems to be missing a queue_limits_cancel_update;
>> > the patch below fixes it. But what kind of device is this?
>
>
> Yeah, the patch fixed the "BUG: workqueue leaked" issue.
> It's one Micron_9300_MTFDHAL3T8TDP NVMe disk installed on one DELL R6515 server with AMD EPYC 7232P CPU.
>
>>
>> > PCIe multi-controller subsystems aren't that common, and this
>> > looks like a grave bug, combined with the I/O page fault that
>> > looks really odd.
>> >
>
>
> Here are the full steps and logs I used to reproduce it:
> # nvme format -l1 -f /dev/nvme3n1
> Success formatting namespace:1
> # nvme reset /dev/nvme3
> # nvme list
> Node Generic SN Model Namespace Usage Format FW Rev
> --------------------- --------------------- -------------------- ---------------------------------------- ---------- -------------------------- ---------------- --------
> /dev/nvme0n1 /dev/ng0n1 S795NC0X201793 SAMSUNG MZWLO1T9HCJR-00A07 0x1 0.00 B / 1.92 TB 512 B + 0 B OPPA4B5Q
> /dev/nvme1n1 /dev/ng1n1 S39WNA0K201139 Dell Express Flash PM1725a 1.6TB AIC 0x1 1.60 TB / 1.60 TB 512 B + 0 B 1.2.1
> /dev/nvme2n1 /dev/ng2n1 3F50A00H0LR3 KIOXIA KCMYDRUG1T92 0x1 0.00 B / 1.92 TB 512 B + 0 B 1UET7104
> /dev/nvme3n1 /dev/ng3n1 2135312ADFD1 Micron_9300_MTFDHAL3T8TDP 0x1 480.09 GB / 3.84 TB 512 B + 0 B 11300DY0
> /dev/nvme4n1 /dev/ng4n1 S64FNE0R802879 SAMSUNG MZQL2960HCJR-00A07 0x1 960.20 GB / 960.20 GB 512 B + 0 B GDC5302Q
> /dev/nvme5n1 /dev/ng5n1 CVFT6011001V1P6DGN INTEL SSDPEDMD016T4 0x1 1.60 TB / 1.60 TB 512 B + 0 B 8DV10171
> # nvme format -l0 -f /dev/nvme3n1
> Success formatting namespace:1
> # dmesg
> [ 2390.248457] nvme nvme3: nvme3n1: Inconsistent Atomic Write Size, Namespace will not be added: Subsystem=512 bytes, Controller/Namespace=4096 bytes
> [ 2398.087894] nvme nvme3: resetting controller
> [ 2401.937274] nvme nvme3: D3 entry latency set to 10 seconds
> [ 2402.008719] nvme nvme3: 16/0/0 default/read/poll queues
> [ 2402.029564] nvme nvme3: nvme3n1: Inconsistent Atomic Write Size, Namespace will not be added: Subsystem=512 bytes, Controller/Namespace=4096 bytes
> [ 2446.470477] amd_iommu_report_page_fault: 3 callbacks suppressed
> [ 2446.470489] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0033 address=0x0 flags=0x0020]
> [ 2446.486293] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0033 address=0x0 flags=0x0020]
> [ 2446.496099] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0033 address=0x0 flags=0x0020]
> [ 2446.505896] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0033 address=0x500 flags=0x0020]
> [ 2446.515863] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0033 address=0x0 flags=0x0020]
> [ 2446.525657] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0033 address=0x500 flags=0x0020]
> [ 2446.535631] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0033 address=0x0 flags=0x0020]
> [ 2446.545431] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0033 address=0x600 flags=0x0020]
> [ 2446.555400] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0033 address=0x0 flags=0x0020]
> [ 2446.565224] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0033 address=0x600 flags=0x0020]
>
>>
>>
>> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> > index f69a232a000a..4bb3c68b3451 100644
>> > --- a/drivers/nvme/host/core.c
>> > +++ b/drivers/nvme/host/core.c
>> > @@ -2388,6 +2388,7 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
>> > * atomic write capabilities.
>> > */
>> > if (lim.atomic_write_hw_max > ns->ctrl->subsys->atomic_bs) {
>> > + queue_limits_cancel_update(ns->disk->queue);
>>
>> For that:
>>
>> Reviewed-by: John Garry <john.g.garry at oracle.com>
>
>
> Tested-by: Yi Zhang <yi.zhang at redhat.com>
>
Hi Christoph
Would you mind sending a formal patch for this issue? Thanks.
>>
>>
>> > blk_mq_unfreeze_queue(ns->disk->queue, memflags);
>> > ret = -ENXIO;
>> > goto out;
>> >
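For readers following the thread, the resulting error path in nvme_update_ns_info_block() would look roughly like this. This is only a paraphrase stitched together from the hunk and its quoted context above, not the exact upstream code:

```c
/* Sketch of the fixed error path (paraphrased from the hunk above). */
if (lim.atomic_write_hw_max > ns->ctrl->subsys->atomic_bs) {
	/*
	 * A limits update was opened earlier with
	 * queue_limits_start_update(); every exit path must either
	 * commit or cancel it, otherwise the limits lock is leaked
	 * (which is what the workqueue debug check reported).
	 */
	queue_limits_cancel_update(ns->disk->queue);
	blk_mq_unfreeze_queue(ns->disk->queue, memflags);
	ret = -ENXIO;
	goto out;
}
```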
>>
>> But for the scenario which triggers this:
>>
>>
>> [ 2313.264089] nvme nvme2: resetting controller
>> [ 2317.125038] nvme nvme2: D3 entry latency set to 10 seconds
>> [ 2317.201142] nvme nvme2: 16/0/0 default/read/poll queues
>> [ 2319.450561] nvme nvme2: nvme2n1: Inconsistent Atomic Write Size,
>> Namespace will not be added: Subsystem=4096 bytes,
>> Controller/Namespace=512 bytes
>>
>> ....
>>
>> [ 2319.881675] nvme nvme3: rescanning namespaces.
>> [ 2320.163354] nvme nvme3: nvme3n1: Inconsistent Atomic Write Size,
>> Namespace will not be added: Subsystem=4096 bytes,
>> Controller/Namespace=32768 bytes
>> [ 2320.177588] BUG: workqueue leaked atomic, lock or RCU: kworker/u67:3[365]
>> preempt=0x00000000 lock=0->1 RCU=0->0
>> workfn=async_run_entry_fn
>>
>> Is it overkill to just not add the namespace? I was under the impression
>> that this would be a highly unlikely scenario (of inconsistent atomic
>> write sizes).
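As an aside, the "lock=0->1" in the BUG line above means the work function returned holding one more lock than it entered with. A toy Python sketch of that bug class follows; the names and lock semantics here are purely illustrative stand-ins for the queue_limits start/commit/cancel pattern, not the real block-layer API:

```python
import threading

class Limits:
    """Toy stand-in: start_update() takes a lock that must be
    released by exactly one of commit_update()/cancel_update()."""
    def __init__(self):
        self.lock = threading.Lock()

    def start_update(self):
        self.lock.acquire()

    def commit_update(self):
        self.lock.release()

    def cancel_update(self):
        self.lock.release()

def update_ns_info_buggy(limits, inconsistent):
    limits.start_update()
    if inconsistent:
        # Error path bails out without commit *or* cancel: the lock
        # stays held -- the "leaked ... lock" the workqueue reports.
        return -1
    limits.commit_update()
    return 0

def update_ns_info_fixed(limits, inconsistent):
    limits.start_update()
    if inconsistent:
        limits.cancel_update()  # release before bailing out
        return -1
    limits.commit_update()
    return 0

l = Limits()
update_ns_info_buggy(l, inconsistent=True)
print(l.lock.locked())   # lock leaked on the error path

l2 = Limits()
update_ns_info_fixed(l2, inconsistent=True)
print(l2.lock.locked())  # lock released on the error path
```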
>>
> Yeah, although the kernel shows "Namespace will not be added:", the ns can still be seen after the format operation:
> # nvme format -l1 -f /dev/nvme3n1
> Success formatting namespace:1
> # lsblk
> nvme4n1 259:0 0 894.3G 0 disk
> nvme2n1 259:1 0 1.7T 0 disk
> nvme0n1 259:2 0 1.7T 0 disk
> nvme1n1 259:3 0 1.5T 0 disk
> nvme3n1 259:4 0 3.5T 0 disk
> nvme5n1 259:5 0 1.5T 0 disk
> # dmesg
> [ 3324.943476] nvme nvme3: nvme3n1: Inconsistent Atomic Write Size, Namespace will not be added: Subsystem=512 bytes, Controller/Namespace=4096 bytes
>
>
> --
> Best Regards,
> Yi Zhang
--
Best Regards,
Yi Zhang