[PATCH] nvme-pci: fix sleeping function called from interrupt context

Maurizio Lombardi mlombard at redhat.com
Mon Dec 18 07:11:38 PST 2023


On Fri, Dec 15, 2023 at 18:25, Keith Busch <kbusch at kernel.org> wrote:
> What async event occurred to cause this? It looks like the only AEN
> handling that cancels anything is the FW activation one, but that only
> applies to fabrics, not pci. Still, that AEN handler sets the state to
> "RESETTING", but we only cancel work when transitioning to "LIVE" from a
> CONNECTING state.

It looks like the stack trace is a bit imprecise.
It's not the call to nvme_change_ctrl_state() that triggers the error
because, as you correctly noticed, moving the controller to the
RESETTING state doesn't block.

The problem is the call to nvme_auth_stop(), which calls
cancel_work_sync() and can therefore sleep:

        case NVME_AER_NOTICE_FW_ACT_STARTING:
                /*
                 * We are (ab)using the RESETTING state to prevent subsequent
                 * recovery actions from interfering with the controller's
                 * firmware activation.
                 */
                if (nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING)) {
                        nvme_auth_stop(ctrl);     <----------------
                        requeue = false;
                        queue_work(nvme_wq, &ctrl->fw_act_work);
                }

void nvme_auth_stop(struct nvme_ctrl *ctrl)
{
        cancel_work_sync(&ctrl->dhchap_auth_work);
}
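
To illustrate the pattern, here is a minimal userspace toy model, not
kernel code and not the patch itself. Every toy_* name is hypothetical
and exists only for this sketch. It assumes the AEN handler runs in
interrupt context and that the queued work runs later in process
context. It compares stopping auth inline with leaving that step to
the deferred work handler.

```c
/* Userspace toy model only; all toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

static bool toy_in_irq;            /* stands in for "in interrupt context" */
static int  toy_sleep_violations;  /* counts sleeping calls made in irq */
static bool toy_auth_stopped;

/* Stands in for might_sleep(): record a violation instead of warning. */
static void toy_might_sleep(void)
{
        if (toy_in_irq)
                toy_sleep_violations++;
}

/* Models nvme_auth_stop(): cancel_work_sync() may block, so may sleep. */
static void toy_auth_stop(void)
{
        toy_might_sleep();
        toy_auth_stopped = true;
}

/* Models a deferred work handler; work items run in process context. */
static void toy_fw_act_work(bool stop_auth)
{
        assert(!toy_in_irq);
        if (stop_auth)
                toy_auth_stop();
}

/* Models the AEN notice handler, assumed to run in interrupt context. */
static void toy_handle_aen(bool stop_auth_inline)
{
        toy_in_irq = true;
        if (stop_auth_inline)
                toy_auth_stop();        /* the pattern shown above */
        toy_in_irq = false;

        /* queue_work() stand-in: the work runs after the irq returns. */
        toy_fw_act_work(!stop_auth_inline);
}
```

In this model, toy_handle_aen(true) records one violation, while
toy_handle_aen(false) records none and still stops auth.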


Maurizio



