[PATCH v7 0/6] nvme-fc: FPIN link integrity handling
John Meneghini
jmeneghi at redhat.com
Fri Jul 11 09:49:22 PDT 2025
Adding Howard Johnson.
On 7/11/25 4:54 AM, Muneendra Kumar M wrote:
> >>But that's precisely it, isn't it?
> >>If it's a straight error the path/controller is being reset, and
> >>really there's nothing for us to be done.
> >>If it's an FPIN LI _without_ any performance impact, why shouldn't
> >>we continue to use that path? Would there be any impact if we do?
> >>And if it's an FPIN LI with _any_ sort of performance impact
> >>(or a performance impact which might happen eventually) the
> >>current approach of steering away I/O should be fine.
> [Muneendra]
>
> With FPIN Link Integrity (LI) there is still connectivity, but the FPIN identifies the link to the target (possibly multiple remote ports, if the target is doing NPIV) that had some error. It is *not* indicating that I/O won't complete. True, some I/O may not complete due to the error that affected it, and it is true, though not likely, that all I/O hits the same problem. What we have seen with flaky links is that most I/O does complete, but a few I/Os don't.
> It's actually a rather funky condition, a kind of sick-but-not-dead scenario.
> FPIN LI indicates that the path is "flaky", and continuing to use it will have a performance impact.
> So the current approach of steering I/O away is fine for FPIN LI.
OK, then can we all agree that the current patch series, including the patch for the queue-depth handler, does the correct thing for FPIN LI?
Right now this patch series "disables" a controller, removing it from active use by the multipath scheduler, once that controller/path receives an FPIN LI event.
This is true for all three multipath schedulers: round-robin, queue-depth, and numa. Once a controller/path has received an LI event it reports a state of "marginal" in the controller state field (e.g. /sys/devices/virtual/nvme-subsystem/nvme-subsys6/nvme4/state). While in the marginal state the controller can still be used; it is only the path selection policy in the nvme multipath scheduling code that avoids it.
These patches also prefer non-marginal paths over marginal paths, and optimized paths over non-optimized paths. If all optimized paths are marginal, the non-optimized paths are used. If all paths are marginal, the first available marginal optimized path is used; failing that, the first available marginal non-optimized path.
To clear the marginal state a controller must be disconnected, allowing the /dev node to be removed, and then reconnected. (We might want to change this, but that can be discussed in a future patch.)
Bryan and I have tested these patches in all of the above configurations and, at this point, they are working as described while using an LPFC adapter.
The test plan Bryan and I used is at https://bugzilla.kernel.org/show_bug.cgi?id=220329#c1
We observed a problem while testing this with a QLA adapter.
So I am hoping one more update to this patch series, to fix the QLA problem, will complete this work.
/John