re attach-ns causing IO errors

Keith Busch kbusch at kernel.org
Mon Feb 19 17:28:37 PST 2024


On Mon, Feb 19, 2024 at 04:49:06PM -0600, Wen Xiong wrote:
> Hi All,
> 
> We discussed this at the beginning of November last year.
> With an extra ns-rescan, the test team got the new namespace back last year.
> But they are still asking whether there is a way to avoid the extra ns-rescan.
> 
> Consider this scenario with nguid changes:
> 
> # nvme list-subsys
> nvme-subsys0 - NQN=nqn.1994-11.com.samsung:nvme:PM1743:2.5-inch:S7DDNG0X100093
> \
> +- nvme0 pcie 052a:58:00.0 live
> +- nvme1 pcie 058a:58:00.0 live
> 
> After system boots up:
> Nvme-subsys0  -> ns_head(NSID1/NGUID1)
> /dev/nvme0n1 -> ns_head(NSID1/NGUID1) ->ns(NSID1/NGUID1)
> /dev/nvme1     -> ns_head(NSID1/NGUID1) ->ns(NSID1/NGUID1)
> 
> After delete-ns /dev/nvme0 -n 1 -c 0x82:
> Nvme-subsys0 -> no more ns_head(EMPTY)
> /dev/nvme0  -> no ns(EMPTY)
> /dev/nvme1 -> no ns(EMPTY)
> 
> create-ns /dev/nvme0 -s 0x5000000 -c 0x5000000 -f 0 -d 0  -m 1:
> 
> After attach-ns /dev/nvme0 -n 1 -c 0x82: I saw nvme_scan_ns_list() called
> twice.
> 
> 1st scan: nvme_queue_scan() -> nvme_scan_ns_list(), we got:
> Nvme-subsys0 -> ns_head(NSID1/NGUID1)   -------> Note: this is old NGUID1
> /dev/nvme0 -> ns_head(NSID1/NGUID1) ->ns(NSID1/NGUID1)  ----> Note: this is old NGUID1
> 
> 2nd scan: scan_work()  -> nvme_scan_ns_list(), we got:
> Nvme-subsys0 ->ns_head(NSID1/NGUID1) ----> created this ns_head in 1st scan.
> /dev/nvme0n1  -> ns_head(NSID1/NGUID1) -> ns(NSID1/NGUID2)
> Then saw: nvme nvme0: identifiers changed for nsid 1  ---> because NGUID changed to NGUID2
>                    block nvme0n1: no available path - failing I/O
>                    block nvme0n1: no available path - failing I/O
> 
> My questions are:
> - It looks like the scan runs only once during system boot.
> - Why does it scan twice after the delete-ns/create-ns/attach-ns operations?
> Is the 1st scan (nvme_queue_scan()) triggered from user space or udev?
> - In the 1st scan the NGUID is the old one (NGUID1), and in the 2nd scan it
> is the new one (NGUID2)?

The driver will automatically rescan based on command effects. It's also
possible the device completes an AEN for the Namespace List change
event, which would trigger a second namespace scan. We use effects in the
driver on admin passthrough commands because a user can always mask the
AEN off with a 'set-feature' command, so we can't rely on it.
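
As a rough illustration of the two triggers (a toy model, not the driver
code): the bit positions below follow the NVMe Commands Supported and
Effects log and the Asynchronous Event Configuration feature, while the
helper name rescan_triggers() and its return values are made up for this
sketch. Whether a given device reports these effects for attach-ns, or
actually sends the AEN, is device-dependent and is an assumption here.

```python
# Toy model: which mechanisms could queue a namespace scan after an
# admin passthrough command. Not kernel code.

# Bits in a Commands Supported and Effects log entry.
CSUPP = 1 << 0  # command supported
LBCC = 1 << 1   # logical block content change
NCC = 1 << 2    # namespace capability change
NIC = 1 << 3    # namespace inventory change
CCC = 1 << 4    # controller capability change

# Namespace Attribute Notices enable bit in the Asynchronous Event
# Configuration feature (Feature ID 0x0B).
AEN_CFG_NS_ATTR = 1 << 8


def rescan_triggers(effects, aen_config):
    """Return the list of possible scan triggers (hypothetical helper).

    effects    -- effects bits reported for the passthrough command
    aen_config -- current Asynchronous Event Configuration value
    """
    triggers = []
    # Effects-based rescan: independent of any AEN masking.
    if effects & (NCC | NIC):
        triggers.append("effects")
    # AEN-based rescan: only possible while the notice is not masked,
    # and only if the device actually completes the AEN.
    if aen_config & AEN_CFG_NS_ATTR:
        triggers.append("aen")
    return triggers
```

With both paths active, rescan_triggers(CSUPP | NIC, AEN_CFG_NS_ATTR)
lists two triggers, which matches seeing two scans after attach-ns. With
the notice masked off, only the effects-based trigger remains, which is
why the driver cannot rely on the AEN alone.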
