[PATCH nvme-cli] nvme: check MUD support before firmware commit

顾泽兵 guzebing at bytedance.com
Tue May 19 23:22:10 PDT 2026


> From: "Tokunori Ikegami"<ikegami.t at gmail.com>
> Date:  Wed, May 20, 2026, 06:50
> Subject:  Re: [PATCH nvme-cli] nvme: check MUD support before firmware commit
> To: "guzebing"<guzebing1612 at gmail.com>, <linux-nvme at lists.infradead.org>
> Cc: "Daniel Wagner"<wagi at kernel.org>, "Guzebing"<guzebing at bytedance.com>
> On 2026/05/19 22:45, guzebing wrote:
> > From: Guzebing <guzebing at bytedance.com>
> >
> > Firmware Commit returns the Multiple Update Detected value in the command
> > completion. nvme-cli currently identifies the controller after the
> > command completes to decide whether to print that value.
> >
> > That post-command Identify is fragile for immediate activation. A possible
> > Linux race looks like:
> >
> >    nvme-cli thread              nvme driver AEN work
> >    ---------------              --------------------
> >    libnvme_exec_admin_passthru()
> >      -> Firmware Commit succeeds
> >                                 nvme_handle_aen_notice()
> >                                   -> FW_ACT_STARTING
> >                                   -> nvme_change_ctrl_state(RESETTING)
> >                                   -> nvme_fw_act_work()
> >                                      -> nvme_quiesce_io_queues()
> >                                      -> wait for activation
> >    fw_commit_print_mud()
> >      -> fw_commit_support_mud()
> >         -> nvme_identify_ctrl()
> >            -> admin ioctl passthru
> >            -> nvme_user_cmd*()
> >            -> blk_mq_alloc_request()
> >            -> __nvme_check_ready()
> >               rejects user admin command while resetting
> >
> > nvme-cli then prints "identify-ctrl: ..." after the successful
> > fw-commit output. The extra error makes it unclear whether Firmware
> > Commit itself failed, even though the command completion was already
> > successful.
> Seems the changes good but to make sure let me confirm below if possible.
>    1. Is this failure really caused?
Yes, we observed this in our production environment last week. For the
SOLIDIGM SB5PH27X076T Gen5 NVMe device, when upgrading the firmware from
G70YG472 to G70YG473, nvme-cli printed the following error after Firmware
Commit:

    identify-ctrl: No such device or address

It reproduced on every attempt in our environment for this device and this
firmware update path. I suspect the firmware activation takes long enough that
the post-command Identify Controller command lands in the controller reset
window.

After applying this patch, the extra identify-ctrl error after the successful
Firmware Commit disappeared.
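
To illustrate the ordering change, here is a minimal, self-contained sketch.
It is not the patch itself and does not use the libnvme API: struct fake_ctrl
and all of the fake_* helpers are hypothetical stand-ins, and returning -ENXIO
during reset is an assumption chosen only because it matches the "No such
device or address" text we saw.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a controller; not a libnvme or kernel type. */
struct fake_ctrl {
	bool resetting;    /* set once firmware activation begins */
	bool supports_mud; /* what Identify Controller would report */
};

/* Simulated Identify Controller: rejected while the controller resets. */
int fake_identify_mud(const struct fake_ctrl *c, bool *mud)
{
	if (c->resetting)
		return -ENXIO; /* strerror: "No such device or address" */
	*mud = c->supports_mud;
	return 0;
}

/* Simulated Firmware Commit with immediate activation. */
int fake_fw_commit(struct fake_ctrl *c, uint32_t *result)
{
	*result = 0;         /* completion value; contents are illustrative */
	c->resetting = true; /* activation starts right after completion */
	return 0;
}

/* Old order: commit, then Identify. The Identify hits the reset window. */
int commit_then_identify(struct fake_ctrl *c, bool *mud)
{
	uint32_t result;
	int err = fake_fw_commit(c, &result);

	if (err)
		return err;
	return fake_identify_mud(c, mud);
}

/* Patched order: Identify first, remember the answer, then commit. */
int identify_then_commit(struct fake_ctrl *c, bool *mud)
{
	uint32_t result;
	int err = fake_identify_mud(c, mud);

	if (err)
		return err;
	return fake_fw_commit(c, &result);
}
```

In this model the old order returns the Identify error even though the commit
succeeded, while the patched order already knows whether MUD is supported by
the time the controller enters reset.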

Environment:
  - kernel: Linux 6.12
  - OS: Debian 13.2
  - nvme-cli: 2.13

>    2. Is there any log or issue report information?
There is no public issue report yet.

The user-visible nvme-cli output was:

    Success committing firmware action:3 slot:1
    identify-ctrl: No such device or address

We did not find any kernel error messages in dmesg around the failure. The only
user-visible error was printed by nvme-cli.

>    3. Not necessary to care the case as the old firmware not supported 
> MUD but only the new firmware supported MUD? (I think so but let me 
> double check.)
> 


