[bug report] BUG: workqueue leaked atomic, lock or RCU: kworker/u67:3[365] observed from v6.15-rc7

John Garry john.g.garry at oracle.com
Mon Jun 2 00:51:45 PDT 2025


On 02/06/2025 06:52, Christoph Hellwig wrote:

+

> On Thu, May 29, 2025 at 08:41:36PM +0800, Yi Zhang wrote:
>> Hi
>>
>> My regression test found this issue from v6.15-rc7, please help check
>> it and let me know if you need any info/test for it, thanks.
> 
> Hi Yi,
> 
> The new code seems to be missing a queue_limits_cancel_update,
> the patch below fixes it.  But what kind of devices is this?
> PCIe multi-controller subsystems aren't that common, and this
> looks like a grave bug, combined with the I/O page fault that
> looks really odd.
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index f69a232a000a..4bb3c68b3451 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2388,6 +2388,7 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
>   	 * atomic write capabilities.
>   	 */
>   	if (lim.atomic_write_hw_max > ns->ctrl->subsys->atomic_bs) {
> +		queue_limits_cancel_update(ns->disk->queue);

For that:

Reviewed-by: John Garry <john.g.garry at oracle.com>

>   		blk_mq_unfreeze_queue(ns->disk->queue, memflags);
>   		ret = -ENXIO;
>   		goto out;
> 

But for the scenario which triggers this:


[ 2313.264089] nvme nvme2: resetting controller
[ 2317.125038] nvme nvme2: D3 entry latency set to 10 seconds
[ 2317.201142] nvme nvme2: 16/0/0 default/read/poll queues
[ 2319.450561] nvme nvme2: nvme2n1: Inconsistent Atomic Write Size,
Namespace will not be added: Subsystem=4096 bytes,
Controller/Namespace=512 bytes

....

[ 2319.881675] nvme nvme3: rescanning namespaces.
[ 2320.163354] nvme nvme3: nvme3n1: Inconsistent Atomic Write Size,
Namespace will not be added: Subsystem=4096 bytes,
Controller/Namespace=32768 bytes
[ 2320.177588] BUG: workqueue leaked atomic, lock or RCU: kworker/u67:3[365]
                     preempt=0x00000000 lock=0->1 RCU=0->0
workfn=async_run_entry_fn

Is it overkill to just not add the namespace? I was under the impression
that this would be a highly unlikely scenario (of inconsistent atomic
write sizes).
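
For anyone following along, here is a rough userspace sketch of why the
log reports lock=0->1. It is not the kernel code: struct toy_queue and the
toy_* helpers are made up for illustration, and the "held" counter only
stands in for the queue limits lock that queue_limits_start_update() takes
and that queue_limits_commit_update() or queue_limits_cancel_update()
releases. The sizes in the usage note are the ones from the nvme3n1 message
above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a request queue. */
struct toy_queue {
	int held;                /* outstanding acquisitions of the limits lock */
	unsigned int atomic_max; /* committed atomic write limit, in bytes */
};

/* Models queue_limits_start_update(): takes the limits lock. */
static void toy_start_update(struct toy_queue *q)
{
	q->held++;
}

/* Models queue_limits_commit_update(): applies the limits, drops the lock. */
static void toy_commit_update(struct toy_queue *q, unsigned int lim)
{
	q->atomic_max = lim;
	q->held--;
}

/* Models queue_limits_cancel_update(): drops the lock, applies nothing. */
static void toy_cancel_update(struct toy_queue *q)
{
	q->held--;
}

/*
 * Models the error path touched by the patch. With cancel_on_error == false
 * the early return leaves the lock held, as before the fix.
 */
static int toy_update_ns(struct toy_queue *q, unsigned int ns_atomic,
			 unsigned int subsys_atomic, bool cancel_on_error)
{
	toy_start_update(q);

	if (ns_atomic > subsys_atomic) {
		if (cancel_on_error)
			toy_cancel_update(q);
		return -ENXIO;
	}

	toy_commit_update(q, ns_atomic);
	return 0;
}
```

Calling toy_update_ns(&q, 32768, 4096, false) returns -ENXIO with q.held
left at 1, which is the model's equivalent of the leaked lock; passing true
returns the same error with q.held back at 0.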




