[PATCH nvme-cli] nvme: check MUD support before firmware commit
顾泽兵
guzebing at bytedance.com
Tue May 19 23:22:10 PDT 2026
> From: "Tokunori Ikegami"<ikegami.t at gmail.com>
> Date: Wed, May 20, 2026, 06:50
> Subject: Re: [PATCH nvme-cli] nvme: check MUD support before firmware commit
> To: "guzebing"<guzebing1612 at gmail.com>, <linux-nvme at lists.infradead.org>
> Cc: "Daniel Wagner"<wagi at kernel.org>, "Guzebing"<guzebing at bytedance.com>
> On 2026/05/19 22:45, guzebing wrote:
> > From: Guzebing <guzebing at bytedance.com>
> >
> > Firmware Commit returns the Multiple Update Detected value in the command
> > completion. nvme-cli currently identifies the controller after the
> > command completes to decide whether to print that value.
> >
> > That post-command Identify is fragile for immediate activation. A possible
> > Linux race looks like:
> >
> >   nvme-cli thread                      nvme driver AEN work
> >   ---------------                      --------------------
> >   libnvme_exec_admin_passthru()
> >     -> Firmware Commit succeeds
> >                                        nvme_handle_aen_notice()
> >                                          -> FW_ACT_STARTING
> >                                          -> nvme_change_ctrl_state(RESETTING)
> >                                          -> nvme_fw_act_work()
> >                                            -> nvme_quiesce_io_queues()
> >                                            -> wait for activation
> >   fw_commit_print_mud()
> >     -> fw_commit_support_mud()
> >       -> nvme_identify_ctrl()
> >         -> admin ioctl passthru
> >           -> nvme_user_cmd*()
> >             -> blk_mq_alloc_request()
> >             -> __nvme_check_ready()
> >                rejects user admin command while resetting
> >
> > nvme-cli then prints "identify-ctrl: ..." after the successful
> > fw-commit output. The extra error makes it unclear whether Firmware
> > Commit itself failed, even though the command completion was already
> > successful.
> The changes seem good, but to make sure, let me confirm the following if
> possible.
> 1. Does this failure really occur?
Yes, we observed this in our production environment last week. For the
SOLIDIGM SB5PH27X076T Gen5 NVMe device, when upgrading the firmware from
G70YG472 to G70YG473, nvme-cli printed the following error after Firmware
Commit:
identify-ctrl: No such device or address
It reproduced every time in our environment for this device and this firmware
update path. I suspect the firmware activation takes long enough that the
post-command Identify Controller command lands in the controller reset window.
After applying this patch, the extra identify-ctrl error after the successful
Firmware Commit disappeared.
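
To illustrate why the ordering matters, here is a minimal, self-contained
sketch in C. It does not use libnvme or the real nvme-cli code: mock_ctrl,
mock_identify_frmw(), mock_fw_commit(), and MOCK_FRMW_SMUD are hypothetical
names made up for this example. It assumes that MUD support is reported in
bit 5 of the Identify Controller FRMW field, and that a rejected admin command
surfaces as ENXIO (whose strerror text is "No such device or address").
Returning an error when the pre-check fails is a choice of this sketch, not a
statement about what the patch does.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a controller; not a libnvme type. */
struct mock_ctrl {
	uint8_t frmw;   /* Identify Controller FRMW byte */
	bool resetting; /* true while firmware activation is in progress */
};

/* Assumption: FRMW bit 5 indicates Multiple Update Detection support. */
#define MOCK_FRMW_SMUD (1u << 5)

/* Mock Identify Controller: rejected while the controller is resetting. */
static int mock_identify_frmw(const struct mock_ctrl *c, uint8_t *frmw)
{
	if (c->resetting)
		return -ENXIO;
	*frmw = c->frmw;
	return 0;
}

/*
 * Mock Firmware Commit with immediate activation: the command succeeds,
 * returns a completion value, and the controller enters the reset window.
 */
static int mock_fw_commit(struct mock_ctrl *c, uint32_t *result)
{
	*result = 0x1; /* arbitrary completion value, for illustration only */
	c->resetting = true;
	return 0;
}

/* Old ordering: commit first, then Identify to decide whether to print MUD. */
static int commit_then_check(struct mock_ctrl *c, bool *mud, uint32_t *result)
{
	uint8_t frmw;
	int err = mock_fw_commit(c, result);

	if (err)
		return err;
	err = mock_identify_frmw(c, &frmw); /* races with the reset window */
	if (err)
		return err;
	*mud = (frmw & MOCK_FRMW_SMUD) != 0;
	return 0;
}

/* New ordering: Identify before the commit, so no admin command follows it. */
static int check_then_commit(struct mock_ctrl *c, bool *mud, uint32_t *result)
{
	uint8_t frmw;
	int err = mock_identify_frmw(c, &frmw);

	if (err)
		return err;
	*mud = (frmw & MOCK_FRMW_SMUD) != 0;
	return mock_fw_commit(c, result);
}
```

With a mock controller that advertises MUD support, commit_then_check()
returns -ENXIO even though the commit itself succeeded, while
check_then_commit() returns 0 and already knows whether to print the MUD value
from the completion.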
Environment:
- kernel: Linux 6.12
- OS: Debian 13.2
- nvme-cli: 2.13
> 2. Is there any log or issue report information?
There is no public issue report yet.
The user-visible nvme-cli output was:
Success committing firmware action:3 slot:1
identify-ctrl: No such device or address
We did not find any kernel error messages in dmesg around the failure. The only
user-visible error was printed by nvme-cli.
> 3. Is it unnecessary to handle the case where the old firmware does not
> support MUD and only the new firmware does? (I think so, but let me
> double-check.)
>