[PATCH] nvme: fix APST error for power latency tolerance

Christoph Hellwig hch at infradead.org
Wed Mar 24 07:58:24 GMT 2021


On Wed, Mar 24, 2021 at 10:38:12AM +0800, Peng Liu wrote:
> On Tue, Mar 23, 2021 at 04:23:21PM +0000, Christoph Hellwig wrote:
> > On Tue, Mar 23, 2021 at 03:31:33PM +0800, pngliu at hotmail.com wrote:
> > > From: Peng Liu <liupeng17 at lenovo.com>
> > > 
> > > Clear apsta so that nvme_configure_apst() does not execute
> > > nvme_set_features(), which will fail because admin_q is either not set up
> > > yet or no longer available at the time of nvme_uninit_ctrl() being called,
> > > and this leads to the error message "nvme nvme0: failed to set APST feature
> > > (-19)".
> > > 
> > > Fixes: 510a405d945b ("nvme: fix memory leak for power latency tolerance")
> > 
> > How did you get into this situation?  For PCIe nvme_uninit_ctrl is
> > only called at the end of ->remove and ->delete_ctrl, so how do we end
> > up in nvme_configure_apst after that?
> 
> I got into it with nvme surprise and non-surprise hot-removal tests.
> Below is the stack ftrace result for nvme_configure_apst under the
> surprise hot-removal, and it is similar for the non-surprise hot-removal.

Ok, looks like dev_pm_qos_hide_latency_tolerance() calls back into
nvme_set_latency_tolerance(), which is a little ... unexpected.

Does this patch work for you?

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 0896e21642beba..d5d7e0cdd78d80 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2681,7 +2681,8 @@ static void nvme_set_latency_tolerance(struct device *dev, s32 val)
 
 	if (ctrl->ps_max_latency_us != latency) {
 		ctrl->ps_max_latency_us = latency;
-		nvme_configure_apst(ctrl);
+		if (ctrl->state == NVME_CTRL_LIVE)
+			nvme_configure_apst(ctrl);
 	}
 }
 

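To illustrate why the state check avoids the error, here is a rough
user-space model of the callback path, not kernel code.  All model_*
names are hypothetical stand-ins, the latency values are arbitrary, and
it assumes the controller has already left the LIVE state by the time the
PM QoS hide path invokes the callback during removal.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the controller states; not the kernel enum. */
enum model_ctrl_state { MODEL_CTRL_NEW, MODEL_CTRL_LIVE, MODEL_CTRL_DELETING };

struct model_ctrl {
	enum model_ctrl_state state;
	uint64_t ps_max_latency_us;
	int apst_calls;		/* times the APST command was attempted */
	int apst_errors;	/* times it failed for lack of an admin queue */
};

/*
 * Stand-in for nvme_configure_apst(): assume the admin queue is only
 * usable while the controller is LIVE, otherwise fail with -ENODEV,
 * which is the -19 seen in the reported error message.
 */
static int model_configure_apst(struct model_ctrl *ctrl)
{
	ctrl->apst_calls++;
	if (ctrl->state != MODEL_CTRL_LIVE) {
		ctrl->apst_errors++;
		return -ENODEV;
	}
	return 0;
}

/*
 * Stand-in for nvme_set_latency_tolerance().  With guarded != 0 it
 * behaves like the patched code: the new latency is always recorded,
 * but APST is only reconfigured while the controller is LIVE.
 */
static void model_set_latency_tolerance(struct model_ctrl *ctrl,
					uint64_t latency, int guarded)
{
	if (ctrl->ps_max_latency_us != latency) {
		ctrl->ps_max_latency_us = latency;
		if (!guarded || ctrl->state == MODEL_CTRL_LIVE)
			model_configure_apst(ctrl);
	}
}

/*
 * Simulate one removal: a latency change while LIVE, then the callback
 * fired again after the state has moved to DELETING.  Returns the
 * number of failed APST attempts.
 */
static int model_removal_errors(int guarded)
{
	struct model_ctrl ctrl = {
		.state = MODEL_CTRL_LIVE,
		.ps_max_latency_us = 100000,
	};

	model_set_latency_tolerance(&ctrl, 25000, guarded);
	ctrl.state = MODEL_CTRL_DELETING;
	model_set_latency_tolerance(&ctrl, UINT64_MAX, guarded);
	return ctrl.apst_errors;
}
```

In this model model_removal_errors(0) counts one failed attempt and
model_removal_errors(1) counts none, while ps_max_latency_us is updated
in both cases.
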

