[bug report]BUG: workqueue leaked atomic, lock or RCU: kworker/u67:3[365] observed from v6.15-rc7
John Garry
john.g.garry at oracle.com
Mon Jun 2 00:51:45 PDT 2025
On 02/06/2025 06:52, Christoph Hellwig wrote:
> On Thu, May 29, 2025 at 08:41:36PM +0800, Yi Zhang wrote:
>> Hi
>>
>> My regression test found this issue starting with v6.15-rc7, please help
>> check it and let me know if you need any info/test for it, thanks.
>
> Hi Zi,
>
> The new code seems to be missing a queue_limits_cancel_update,
> the patch below fixes it. But what kind of devices is this?
> PCIe multi-controller subsystems aren't that common, and this
> looks like a grave bug, combined with the I/O page fault that
> looks really odd.
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index f69a232a000a..4bb3c68b3451 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2388,6 +2388,7 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
>  	 * atomic write capabilities.
>  	 */
>  	if (lim.atomic_write_hw_max > ns->ctrl->subsys->atomic_bs) {
> +		queue_limits_cancel_update(ns->disk->queue);

For that:
Reviewed-by: John Garry <john.g.garry at oracle.com>

>  		blk_mq_unfreeze_queue(ns->disk->queue, memflags);
>  		ret = -ENXIO;
>  		goto out;
>
But for the scenario which triggers this:
[ 2313.264089] nvme nvme2: resetting controller
[ 2317.125038] nvme nvme2: D3 entry latency set to 10 seconds
[ 2317.201142] nvme nvme2: 16/0/0 default/read/poll queues
[ 2319.450561] nvme nvme2: nvme2n1: Inconsistent Atomic Write Size,
Namespace will not be added: Subsystem=4096 bytes,
Controller/Namespace=512 bytes
....
[ 2319.881675] nvme nvme3: rescanning namespaces.
[ 2320.163354] nvme nvme3: nvme3n1: Inconsistent Atomic Write Size,
Namespace will not be added: Subsystem=4096 bytes,
Controller/Namespace=32768 bytes
[ 2320.177588] BUG: workqueue leaked atomic, lock or RCU: kworker/u67:3[365]
preempt=0x00000000 lock=0->1 RCU=0->0
workfn=async_run_entry_fn
Is it overkill to just not add the namespace? I was under the impression
that this (inconsistent atomic write sizes) would be a highly unlikely
scenario.
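
For reference, the "lock=0->1" in the splat means the work function returned
holding one more lock than it started with. That matches the early exit in the
hunk above, which returns after the limits update was started but before it is
committed or cancelled. Below is a minimal userspace sketch of that pairing
rule. It is not the kernel code: the struct and function names
(queue_model, limits_model, model_*) and the with_fix switch are hypothetical
stand-ins, and the lock is modelled as a plain depth counter.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical userspace stand-ins; these are not the kernel's structures. */
struct limits_model {
	unsigned int atomic_write_hw_max;
};

struct queue_model {
	struct limits_model limits;
	int lock_depth;	/* 1 while the limits lock is held, 0 otherwise */
};

/* Models queue_limits_start_update(): take the lock, hand out a copy. */
static struct limits_model model_start_update(struct queue_model *q)
{
	assert(q->lock_depth == 0);
	q->lock_depth++;
	return q->limits;
}

/* Models queue_limits_commit_update(): apply the copy, drop the lock. */
static void model_commit_update(struct queue_model *q,
				const struct limits_model *lim)
{
	assert(q->lock_depth == 1);
	q->limits = *lim;
	q->lock_depth--;
}

/* Models queue_limits_cancel_update(): discard the copy, drop the lock. */
static void model_cancel_update(struct queue_model *q)
{
	assert(q->lock_depth == 1);
	q->lock_depth--;
}

/*
 * Models the error path in the hunk above. with_fix == 0 is the code
 * before the patch, with_fix == 1 is the code after it.
 */
static int model_update_ns_info(struct queue_model *q,
				unsigned int ns_atomic_bs,
				unsigned int subsys_atomic_bs,
				int with_fix)
{
	struct limits_model lim = model_start_update(q);

	lim.atomic_write_hw_max = ns_atomic_bs;
	if (lim.atomic_write_hw_max > subsys_atomic_bs) {
		if (with_fix)
			model_cancel_update(q);	/* the added line */
		return -ENXIO;
	}

	model_commit_update(q, &lim);
	return 0;
}
```

Calling model_update_ns_info() with the nvme3n1 values from the log
(namespace 32768, subsystem 4096) and with_fix == 0 returns -ENXIO and leaves
lock_depth at 1, which is the model's equivalent of "lock=0->1". With
with_fix == 1 the same call returns -ENXIO with lock_depth back at 0.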