[PATCH 1/1] nvme: don't ignore tagset allocation failures

Sagi Grimberg sagi at grimberg.me
Wed Mar 29 22:47:49 PDT 2017


>>> Not having a tagset doesn't mean we can't go live; it just means we can't
>>> do IO, but the admin handle is still up for device management.
>>
>> So don't queue ns scanning...
>
> And what about the sysfs rescan or namespace notify async event? We have
> to fence those off too, so doing it in one place sounds better than
> three.

True, but it still feels wrong that the pci driver fails tagset
allocation and continues to scan namespaces as if nothing happened, and
the namespace scan then has to check whether a tagset exists... It looks
awkward...

Why is the controller moving to LIVE anyway? Obviously something went
wrong, which is either a SW BUG (EINVAL), or a temporary failure that
needs to be retried (ENOMEM). So I think the proper solution would
be to retry later for transient errors, and for non-transient errors
simply log the error and remove the controller (it's not a device issue,
so there is no point in keeping the controller around for error log
query).
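
Roughly the policy I have in mind, as a standalone sketch only. The enum
and function names below are made up for illustration and are not
existing nvme-pci code; the assumption is just that the tagset
allocation hands back a negative errno on failure:

```c
#include <errno.h>

/* Hypothetical outcome of a tagset allocation attempt (illustration only). */
enum tagset_action {
	TAGSET_OK,          /* allocation succeeded, go LIVE and scan */
	TAGSET_RETRY_LATER, /* transient failure, reschedule the reset */
	TAGSET_REMOVE_CTRL, /* SW bug, log it and remove the controller */
};

/*
 * Map the (negative errno) result of a tagset allocation to an action.
 * -ENOMEM is treated as transient; anything else (e.g. -EINVAL) is
 * treated as a non-transient software error.
 */
static enum tagset_action classify_tagset_error(int err)
{
	switch (err) {
	case 0:
		return TAGSET_OK;
	case -ENOMEM:
		return TAGSET_RETRY_LATER;
	default:
		return TAGSET_REMOVE_CTRL;
	}
}
```

With something like this the controller would only move to LIVE on
TAGSET_OK, so the scan, sysfs rescan and async event paths would not
each need their own tagset check.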

Thoughts?


