[PATCH RFC 2/2] nvme: set integrity metadata size for EXT_LBAS non-PI namespace

Thu May 7 01:05:54 PDT 2026

On Thu, May 07, 2026 at 07:49:44AM +0200, Christoph Hellwig wrote:
> On Sun, Apr 26, 2026 at 08:34:57PM -0400, Chao Shi wrote:
> > +		/*
> > +		 * For PCIe EXT_LBAS non-PI namespaces the block layer sets
> > +		 * capacity to 0 (we return false) to prevent block I/O, but a
> > +		 * cached-rq bio may bypass bio_queue_enter freeze serialisation
> > +		 * and reach nvme_setup_rw() with head->ms != 0 and no
> > +		 * REQ_INTEGRITY set.  Populate bi->metadata_size so that
> > +		 * bio_integrity_action() returns non-zero and bio_integrity_prep()
> > +		 * sets REQ_INTEGRITY on any such bio, preventing the WARN_ON_ONCE
> > +		 * at nvme_setup_rw() (addressed by patch 1/2).
> 
> This sounds like the Bug Keith is trying to fix in the block layer
> ("blk-mq: check for stale cached request in blk_mq_submit_bio") ?

Both issues deal with cached request corner cases, but they're not
really related. My fix is specifically for the case where those cached
requests are freed, while this one is just racing with them.

For this patch's issue, how is the host being triggered to rescan? If
you're sending a "Format NVM" command, the driver would have frozen the
queues first, which would have waited for any cached requests to flush
out and prevented new ones from being allocated until after the format
has been updated in the driver.

It's possible to format the namespace from a different controller the
host doesn't know about, so we've always had a race where the actual
format is different than what the host knows about. The rescan would
have to be triggered some other way in that case (either through AEN or
manual sysfs/ioctl trigger).

We always ensure the queue is frozen when we update the queue limits in
this path too, so the driver and block layer should always be in sync
even if it's not in sync with the device. That in itself can be pretty
nasty, but we'd need a new NVMe TP to define a way to fix that. But
specifically on the driver warning here, I'm not sure how it is getting
triggered, given the existing freeze semantics around the queue limits
update.
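
To make the freeze argument above concrete, here is a toy userspace
model of what I mean. It is not kernel code: every toy_* name is made up
for illustration, and the real freeze sleeps until the queue drains
instead of returning false. It only shows the invariant I'm relying on,
namely that the driver's and the block layer's view of the metadata size
change together, and only while nothing is in flight.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a request queue; not a kernel structure. */
struct toy_queue {
	int in_flight;              /* requests holding a queue reference */
	bool frozen;                /* set while a freeze is in effect */
	unsigned int ms;            /* driver's view of the metadata size */
	unsigned int metadata_size; /* block layer's view of the same value */
};

/* Submission side: a request may only enter an unfrozen queue. */
static bool toy_queue_enter(struct toy_queue *q)
{
	if (q->frozen)
		return false;
	q->in_flight++;
	return true;
}

/* Completion side: drop the reference taken by toy_queue_enter(). */
static void toy_queue_exit(struct toy_queue *q)
{
	q->in_flight--;
}

/*
 * Freeze: refuse new entries and report whether the queue has drained.
 * The real thing waits here; this model returns false instead.
 */
static bool toy_freeze(struct toy_queue *q)
{
	q->frozen = true;
	return q->in_flight == 0;
}

static void toy_unfreeze(struct toy_queue *q)
{
	q->frozen = false;
}

/*
 * Format update: both views change together, and only once the queue
 * is frozen and drained. On failure the queue stays frozen, so the
 * caller can drain the outstanding requests and retry.
 */
static bool toy_update_format(struct toy_queue *q, unsigned int ms)
{
	if (!toy_freeze(q))
		return false;
	q->ms = ms;
	q->metadata_size = ms;
	toy_unfreeze(q);
	return true;
}
```

In this model a request that entered before the update blocks it, and
nothing new can enter until the update is done, so no request should
ever see ms and metadata_size disagree. That is why I'd like to
understand which path gets a request past the freeze in the reported
case.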