[RFC PATCH 5/5] nvme-vfio: Add a document for the NVMe device

Jonathan Derrick jonathan.derrick at linux.dev
Wed Dec 7 14:42:25 PST 2022



On 12/5/2022 10:58 PM, Lei Rao wrote:
> The documentation describes the details of the NVMe hardware
> extension to support VFIO live migration.
> 
> Signed-off-by: Lei Rao <lei.rao at intel.com>
> Signed-off-by: Yadong Li <yadong.li at intel.com>
> Signed-off-by: Chaitanya Kulkarni <kch at nvidia.com>
> Reviewed-by: Eddie Dong <eddie.dong at intel.com>
> Reviewed-by: Hang Yuan <hang.yuan at intel.com>
> ---
>  drivers/vfio/pci/nvme/nvme.txt | 278 +++++++++++++++++++++++++++++++++
>  1 file changed, 278 insertions(+)
>  create mode 100644 drivers/vfio/pci/nvme/nvme.txt
> 
> diff --git a/drivers/vfio/pci/nvme/nvme.txt b/drivers/vfio/pci/nvme/nvme.txt
> new file mode 100644
> index 000000000000..eadcf2082eed
> --- /dev/null
> +++ b/drivers/vfio/pci/nvme/nvme.txt
> @@ -0,0 +1,278 @@
> +===========================
> +NVMe Live Migration Support
> +===========================
> +
> +Introduction
> +------------
> +To support live migration, the NVMe device defines its own extension:
> +five new vendor-specific admin commands and a capability flag in the
> +vendor-specific field of the identify controller data structure. Software
> +can use these live migration admin commands to query the size of the device
> +migration state data, save and load that data, and suspend and resume a
> +given VF device. The commands are submitted by software to the NVMe PF
> +device's admin queue and are ignored if placed in the VF device's admin
> +queue. This is because the NVMe VF device is passed through to the virtual
> +machine in the virtualization scenario, so the VF device's admin queue is
> +not available for the hypervisor to submit VF live migration commands.
> +The capability flag in the identify controller data structure can be used by
> +software to detect whether the NVMe device supports live migration. The
> +following sections describe the format of the commands and the capability flag.
> +
> +Definition of opcode for live migration commands
> +------------------------------------------------
> +
> ++---------------------------+-----------+-----------+------------+
> +|                           |           |           |            |
> +|     Opcode by Field       |           |           |            |
> +|                           |           |           |            |
> ++--------+---------+--------+           |           |            |
> +|        |         |        | Combined  | Namespace |            |
> +|    07  |  06:02  | 01:00  |  Opcode   | Identifier|  Command   |
> +|        |         |        |           |    used   |            |
> ++--------+---------+--------+           |           |            |
> +|Generic | Function|  Data  |           |           |            |
> +|command |         |Transfer|           |           |            |
> ++--------+---------+--------+-----------+-----------+------------+
> +|                                                                |
> +|                     Vendor Specific Opcode                     |
> ++--------+---------+--------+-----------+-----------+------------+
> +|        |         |        |           |           | Query the  |
> +|   1b   |  10001  |  00    |   0xC4    |           | data size  |
> ++--------+---------+--------+-----------+-----------+------------+
> +|        |         |        |           |           | Suspend the|
> +|   1b   |  10010  |  00    |   0xC8    |           |    VF      |
> ++--------+---------+--------+-----------+-----------+------------+
> +|        |         |        |           |           | Resume the |
> +|   1b   |  10011  |  00    |   0xCC    |           |    VF      |
> ++--------+---------+--------+-----------+-----------+------------+
> +|        |         |        |           |           | Save the   |
> +|   1b   |  10100  |  10    |   0xD2    |           |device data |
> ++--------+---------+--------+-----------+-----------+------------+
> +|        |         |        |           |           | Load the   |
> +|   1b   |  10101  |  01    |   0xD5    |           |device data |
> ++--------+---------+--------+-----------+-----------+------------+
> +
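
For anyone cross-checking the table: each combined opcode is just the three
bit fields packed together. A minimal sketch in C, assuming the field layout
shown in the table header (bit 7, bits 6:2, bits 1:0); the enum and function
names are made up for illustration and are not taken from the patch.

```c
#include <stdint.h>

/* Data Transfer field (bits 1:0), following the table in the quoted patch. */
enum nvme_lm_xfer {			/* hypothetical names */
	NVME_LM_XFER_NONE = 0x0,	/* 00b: no data transfer */
	NVME_LM_XFER_H2C  = 0x1,	/* 01b: host to controller (load) */
	NVME_LM_XFER_C2H  = 0x2,	/* 10b: controller to host (save) */
};

/*
 * Pack the three opcode fields into the combined 8-bit opcode:
 * bit 7 = generic/vendor flag, bits 6:2 = function, bits 1:0 = transfer.
 */
static inline uint8_t nvme_lm_opcode(uint8_t vendor, uint8_t function,
				     uint8_t xfer)
{
	return (uint8_t)(((vendor & 0x1) << 7) |
			 ((function & 0x1f) << 2) |
			 (xfer & 0x3));
}
```

With these inputs, function 10001b with no data transfer packs to 0xC4, and
function 10100b with a controller-to-host transfer packs to 0xD2, which agrees
with the table.
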
Since this relies on vendor-specific opcodes and the vendor-unique space in
id-ctrl, I'd expect this code to be gated by some kind of vendor/device-specific
flag, or preferably something standard.

There's no way of knowing what 0xC4 will do on a non-LM drive, for instance.
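
To illustrate the kind of gating meant here, a rough untested sketch in plain
C. Everything in it is an assumption for illustration: the vendor/device IDs,
the offset and bit of the capability flag within the id-ctrl vendor-specific
area, and all struct and function names are placeholders, since the quoted
part of the patch does not specify them.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values only; the real IDs and flag location are not given. */
#define LM_EXAMPLE_VENDOR_ID	0x1234
#define LM_EXAMPLE_DEVICE_ID	0x5678
#define LM_EXAMPLE_VS_OFFSET	0u	/* byte in id-ctrl vendor-specific area */
#define LM_EXAMPLE_CAP_BIT	0x01

/* Hypothetical summary of what the driver knows about the PF. */
struct lm_dev_info {
	uint16_t vendor_id;
	uint16_t device_id;
	const uint8_t *id_ctrl_vs;	/* id-ctrl vendor-specific bytes */
	size_t id_ctrl_vs_len;
};

/*
 * Only report live migration support when the device is a known model AND
 * it advertises the capability flag. The vendor opcodes (0xC4 and friends)
 * would never be issued unless this returns true.
 */
static inline bool lm_supported(const struct lm_dev_info *info)
{
	if (info->vendor_id != LM_EXAMPLE_VENDOR_ID ||
	    info->device_id != LM_EXAMPLE_DEVICE_ID)
		return false;
	if (!info->id_ctrl_vs || info->id_ctrl_vs_len <= LM_EXAMPLE_VS_OFFSET)
		return false;
	return (info->id_ctrl_vs[LM_EXAMPLE_VS_OFFSET] & LM_EXAMPLE_CAP_BIT) != 0;
}
```

The ID check comes first on purpose: the vendor-specific bytes in id-ctrl only
have a defined meaning once the device is known to be one that implements this
extension.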