Intermittent storage (dm-crypt?) freeze - regression 6.4->6.5
Keith Busch
kbusch at kernel.org
Thu Nov 2 07:02:34 PDT 2023
On Wed, Nov 01, 2023 at 07:23:05PM +0800, Ming Lei wrote:
> On Wed, Nov 01, 2023 at 11:15:02AM +0100, Hannes Reinecke wrote:
> > > nvme_queue_rq() on the above request.
> > >
> > And that is something I've been wondering (for quite some time now):
> > What _is_ the appropriate error handling for -ENOMEM?
>
> It is just my guess.
>
> Actually it shouldn't fail since the sgl allocation is backed with
> memory pool, but there is also dma pool allocation and dma mapping.
>
> > At this time, we assume it to be a retryable error and re-run the queue
> > in the hope that things will sort itself out.
>
> It should not be hard to figure out why nvme_queue_rq() can't move on.

There are only a few reasons nvme_queue_rq() would return
BLK_STS_RESOURCE for a typical read/write command:

  - DMA mapping error
  - Can't allocate SGL from mempool
  - Can't allocate PRP from dma_pool
  - Controller stuck in resetting state

We should always be able to get at least one allocation from the memory
pools, so I think the only case where the driver has no way to guarantee
eventual forward progress is the DMA mapping error condition. Is there
some other limit that the driver needs to consider when configuring its
largest supported transfers?
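
To make the forward-progress argument concrete, here is a rough
userspace sketch. It is not driver code: every toy_* name is made up
for illustration, and the pool is only a simplified stand-in for a
pool with reserved elements, assuming that an element returns to the
pool when the request holding it completes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace toy model, NOT nvme driver code. All toy_* names are hypothetical. */

/* The four BLK_STS_RESOURCE causes listed in this mail. */
enum toy_fail_reason {
	TOY_FAIL_DMA_MAP,        /* DMA mapping error */
	TOY_FAIL_SGL_MEMPOOL,    /* can't allocate SGL from mempool */
	TOY_FAIL_PRP_DMA_POOL,   /* can't allocate PRP from dma_pool */
	TOY_FAIL_CTRL_RESETTING, /* controller stuck in resetting state */
};

/*
 * Encodes the claim made in this mail: only the DMA mapping failure
 * lacks a driver-side guarantee that a retry eventually succeeds.
 */
static bool toy_retry_is_guaranteed(enum toy_fail_reason why)
{
	return why != TOY_FAIL_DMA_MAP;
}

/* A pool with a fixed number of reserved elements (assumed model). */
struct toy_pool {
	size_t reserved; /* elements set aside when the pool was created */
	size_t in_use;   /* elements currently held by in-flight requests */
};

/* Take one element; fails only while every reserved element is in flight. */
static bool toy_pool_alloc(struct toy_pool *p)
{
	if (p->in_use >= p->reserved)
		return false;
	p->in_use++;
	return true;
}

/* Return one element, as a completing request would. */
static void toy_pool_free(struct toy_pool *p)
{
	assert(p->in_use > 0); /* freeing without a prior alloc is a bug */
	p->in_use--;
}
```

With reserved = 1, a second toy_pool_alloc() fails while the first
element is in flight, and succeeds again after toy_pool_free(). That
is the sense in which a retry after a pool failure can always move
on eventually. Nothing in the sketch models the DMA mapping case,
because nothing the driver holds is known to free up mapping
resources.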