[PATCH v8 0/8] nvme-fc: FPIN link integrity handling
John Meneghini
jmeneghi at redhat.com
Wed Jul 9 15:05:57 PDT 2025
I've opened an upstream bugzilla to track this enhancement.
https://bugzilla.kernel.org/show_bug.cgi?id=220329
I've asked Bryan to record all information about the unit tests we are developing for FPIN there.
John A. Meneghini
Senior Principal Platform Storage Engineer
RHEL SST - Platform Storage Group
jmeneghi at redhat.com
On 7/9/25 5:19 PM, Bryan Gurney wrote:
> FPIN LI (link integrity) messages are received when the attached
> fabric detects hardware errors. In response to these messages, I/O
> should be directed away from the affected ports, and those ports
> should only be used if no 'optimized' paths are available.
> Upon port reset, the paths should be put back in service, as the
> affected hardware might have been replaced.
> This patch set adds a new controller flag 'NVME_CTRL_MARGINAL'
> which is checked during multipath path selection, causing marginal
> paths to be skipped when checking for 'optimized' paths. If no
> optimized paths are available, the 'marginal' paths are considered
> for path selection alongside the 'non-optimized' paths.
> It also introduces a new nvme-fc callback 'nvme_fc_fpin_rcv()' to
> evaluate the FPIN LI TLV payload and set the 'marginal' state on
> all affected rports.
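>
> To illustrate the intended selection policy, a minimal sketch (not
> the code from this series; nvme_ctrl_is_marginal() and the flag
> handling below are assumptions for illustration only) could look
> like this:
>
>   /* Assumed helper: NVME_CTRL_MARGINAL as a ctrl->flags bit. */
>   static inline bool nvme_ctrl_is_marginal(struct nvme_ctrl *ctrl)
>   {
>           return test_bit(NVME_CTRL_MARGINAL, &ctrl->flags);
>   }
>
>   /* Simplified two-pass policy, not the actual implementation:
>    * prefer non-marginal optimized paths; otherwise fall back to
>    * marginal and non-optimized paths alike.
>    */
>   static struct nvme_ns *demo_find_path(struct nvme_ns_head *head)
>   {
>           struct nvme_ns *ns, *fallback = NULL;
>
>           list_for_each_entry_rcu(ns, &head->list, siblings) {
>                   if (nvme_path_is_disabled(ns))
>                           continue;
>                   if (ns->ana_state == NVME_ANA_OPTIMIZED &&
>                       !nvme_ctrl_is_marginal(ns->ctrl))
>                           return ns;      /* healthy optimized path wins */
>                   if (!fallback)
>                           fallback = ns;  /* marginal/non-optimized last resort */
>           }
>           return fallback;
>   }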
>
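> On the receive side, nvme_fc_fpin_rcv() has to walk the FPIN
> descriptor list and mark every affected remote port. Roughly (again
> only a sketch, modelled on the existing fc_host_fpin_rcv() handling
> in scsi_transport_fc.c; demo_mark_rport_marginal() is a hypothetical
> helper standing in for the rport lookup done in fc.c):
>
>   /* Sketch: mark the ports named in one FPIN LI descriptor.
>    * struct fc_fn_li_desc comes from include/uapi/scsi/fc/fc_els.h.
>    */
>   static void demo_fpin_li_handle(struct nvme_fc_lport *lport,
>                                   struct fc_fn_li_desc *li)
>   {
>           u32 i, cnt = be32_to_cpu(li->pname_count);
>
>           /* the port attached to the detecting port is affected ... */
>           demo_mark_rport_marginal(lport, be64_to_cpu(li->attached_wwpn));
>
>           /* ... as are any additional ports listed in the descriptor */
>           for (i = 0; i < cnt; i++)
>                   demo_mark_rport_marginal(lport,
>                                            be64_to_cpu(li->pname_list[i]));
>   }
>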
> The testing for this patch set was performed by Bryan Gurney, using
> the process outlined in John Meneghini's presentation at LSFMM 2024,
> in which the Fibre Channel switch sends an FPIN notification on a
> specific switch port and the following is checked on the initiator:
>
> 1. The controllers corresponding to the paths on the port that
> received the notification show the NVME_CTRL_MARGINAL flag set.
>
> \
> +- nvme4 fc traddr=c,host_traddr=e live optimized
> +- nvme5 fc traddr=8,host_traddr=e live non-optimized
> +- nvme8 fc traddr=e,host_traddr=f marginal optimized
> +- nvme9 fc traddr=a,host_traddr=f marginal non-optimized
>
> 2. The I/O statistics of the test namespace show no I/O activity on the
> controllers with NVME_CTRL_MARGINAL set.
>
> Device           tps   MB_read/s   MB_wrtn/s   MB_dscd/s
> nvme4c4n1       0.00        0.00        0.00        0.00
> nvme4c5n1   25001.00        0.00       97.66        0.00
> nvme4c9n1   25000.00        0.00       97.66        0.00
> nvme4n1     50011.00        0.00      195.36        0.00
>
>
> Device           tps   MB_read/s   MB_wrtn/s   MB_dscd/s
> nvme4c4n1       0.00        0.00        0.00        0.00
> nvme4c5n1   48360.00        0.00      188.91        0.00
> nvme4c9n1    1642.00        0.00        6.41        0.00
> nvme4n1     49981.00        0.00      195.24        0.00
>
>
> Device           tps   MB_read/s   MB_wrtn/s   MB_dscd/s
> nvme4c4n1       0.00        0.00        0.00        0.00
> nvme4c5n1   50001.00        0.00      195.32        0.00
> nvme4c9n1       0.00        0.00        0.00        0.00
> nvme4n1     50016.00        0.00      195.38        0.00
>
> Link: https://people.redhat.com/jmeneghi/LSFMM_2024/LSFMM_2024_NVMe_Cancel_and_FPIN.pdf
>
> More rigorous testing was also performed to ensure proper path
> migration on each of the eight different FPIN link integrity events,
> particularly in a scenario where only non-optimized paths are
> available and all paths are marginal. In a configuration with the
> round-robin iopolicy, when all paths on the host show as marginal,
> I/O continues on the optimized path that was most recently
> non-marginal. From this point, if both of the optimized paths go
> down, I/O properly continues on the remaining paths.
>
> The testing so far has been done with an Emulex host bus adapter using
> lpfc. When tested on a QLogic host bus adapter, a warning was found
> when the first FPIN link integrity event was received by the host:
>
> kernel: memcpy: detected field-spanning write (size 60) of single field
> "((uint8_t *)fpin_pkt + buffer_copy_offset)"
> at drivers/scsi/qla2xxx/qla_isr.c:1221 (size 44)
>
> Line 1221 of qla_isr.c is in the function qla27xx_copy_fpin_pkt().
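>
> For reference, that warning comes from FORTIFY_SOURCE sizing a
> memcpy() against a single struct member; a toy example of the
> pattern (unrelated to the actual qla2xxx structures) is:
>
>   struct demo_pkt {
>           u8 fixed[44];   /* fortify sizes the copy against this member */
>           u8 rest[];      /* data the copy is meant to spill into */
>   };
>
>   static void demo_copy(struct demo_pkt *dst, const void *src, size_t size)
>   {
>           /* warns "field-spanning write" whenever size > 44 */
>           memcpy(dst->fixed, src, size);
>   }
>
> The usual remedies are copying into a destination that covers the
> whole payload (e.g. a flexible-array member sized for it) or
> explicitly annotating an intentional cross-member copy; the right
> fix for qla2xxx is left to the driver maintainers.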
>
>
> Changes to the original submission:
> - Changed flag name to 'marginal'
> - Do not block marginal path; influence path selection instead
> to de-prioritize marginal paths
>
> Changes to v2:
> - Split off driver-specific modifications
> - Introduce 'union fc_tlv_desc' to avoid casts
>
> Changes to v3:
> - Include reviews from Justin Tee
> - Split marginal path handling patch
>
> Changes to v4:
> - Change 'u8' to '__u8' on fc_tlv_desc to fix a failure to build
> - Print 'marginal' instead of 'live' in the state of controllers
> when they are marginal
>
> Changes to v5:
> - Minor spelling corrections to patch descriptions
>
> Changes to v6:
> - No code changes; added note about additional testing
>
> Changes to v7:
> - Split nvme core marginal flag addition into its own patch
> - Add patch for queue_depth marginal path support
>
> Bryan Gurney (2):
> nvme: add NVME_CTRL_MARGINAL flag
> nvme: sysfs: emit the marginal path state in show_state()
>
> Hannes Reinecke (5):
> fc_els: use 'union fc_tlv_desc'
> nvme-fc: marginal path handling
> nvme-fc: nvme_fc_fpin_rcv() callback
> lpfc: enable FPIN notification for NVMe
> qla2xxx: enable FPIN notification for NVMe
>
> John Meneghini (1):
> nvme-multipath: queue-depth support for marginal paths
>
> drivers/nvme/host/core.c | 1 +
> drivers/nvme/host/fc.c | 99 +++++++++++++++++++
> drivers/nvme/host/multipath.c | 24 +++--
> drivers/nvme/host/nvme.h | 6 ++
> drivers/nvme/host/sysfs.c | 4 +-
> drivers/scsi/lpfc/lpfc_els.c | 84 ++++++++--------
> drivers/scsi/qla2xxx/qla_isr.c | 3 +
> drivers/scsi/scsi_transport_fc.c | 27 +++--
> include/linux/nvme-fc-driver.h | 3 +
> include/uapi/scsi/fc/fc_els.h | 165 +++++++++++++++++--------------
> 10 files changed, 275 insertions(+), 141 deletions(-)
>