[PATCH v2] nvme/pci: Log PCI_STATUS when the controller dies
Andy Lutomirski
luto at amacapital.net
Fri Dec 2 08:57:46 PST 2016
On Fri, Dec 2, 2016 at 5:26 AM, Christoph Hellwig <hch at infradead.org> wrote:
> On Thu, Dec 01, 2016 at 04:42:41PM -0800, Andy Lutomirski wrote:
>> When debugging nvme controller crashes, it's nice to know whether
>> the controller died cleanly so that the failure is just reflected in
>> CSTS, it died and put an error in PCI_STATUS, or whether it died so
>> badly that it stopped responding to PCI configuration space reads.
>
> Just curious: what controller did this happen with?

I've seen a failure that gives 0xffff in PCI_STATUS on a Samsung
"SM951 NVMe SAMSUNG 256GB" with firmware "BXW75D0Q".

I'll add that to the v3 changelog.

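For illustration, here is a minimal userspace sketch of how the three cases from the changelog could be told apart once CSTS and PCI_STATUS have been read. It is not part of the patch. The enum and the function classify_death() are hypothetical names, using CSTS.CFS alone as the "died cleanly" signal is a simplification, and treating 0xffff as "no response to config reads" assumes the usual all-ones result of a read from a device that has stopped answering. The bit values match the NVMe CSTS layout and the PCI_STATUS_* definitions in Linux's pci_regs.h.

```c
#include <stdint.h>

/* NVMe CSTS bit 1: Controller Fatal Status. */
#define CSTS_CFS                    (1u << 1)

/* PCI_STATUS error bits, values as in Linux's pci_regs.h. */
#define PCI_STATUS_PARITY           0x0100
#define PCI_STATUS_SIG_TARGET_ABORT 0x0800
#define PCI_STATUS_REC_TARGET_ABORT 0x1000
#define PCI_STATUS_REC_MASTER_ABORT 0x2000
#define PCI_STATUS_SIG_SYSTEM_ERROR 0x4000
#define PCI_STATUS_DETECTED_PARITY  0x8000

#define PCI_STATUS_ERROR_BITS                                   \
	(PCI_STATUS_PARITY | PCI_STATUS_SIG_TARGET_ABORT |      \
	 PCI_STATUS_REC_TARGET_ABORT | PCI_STATUS_REC_MASTER_ABORT | \
	 PCI_STATUS_SIG_SYSTEM_ERROR | PCI_STATUS_DETECTED_PARITY)

/* Hypothetical classification, for illustration only. */
enum death_kind {
	DEATH_NONE,               /* nothing in either register looks wrong */
	DEATH_CSTS_ONLY,          /* failure is reflected only in CSTS */
	DEATH_PCI_ERROR,          /* an error bit is set in PCI_STATUS */
	DEATH_NO_CONFIG_RESPONSE  /* config read came back as all-ones */
};

static enum death_kind classify_death(uint32_t csts, uint16_t pci_status)
{
	/* Assumption: all-ones means the device no longer answers. */
	if (pci_status == 0xffff)
		return DEATH_NO_CONFIG_RESPONSE;
	if (pci_status & PCI_STATUS_ERROR_BITS)
		return DEATH_PCI_ERROR;
	if (csts & CSTS_CFS)
		return DEATH_CSTS_ONLY;
	return DEATH_NONE;
}
```

The all-ones check has to come first, because 0xffff also has every error bit set and would otherwise be reported as a PCI error.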
>
>> +		/* Read a config register to help see what died. */
>> +		u16 pci_status;
>> +		int result;
>> +
>> +		result = pci_read_config_word(to_pci_dev(dev->dev),
>> +					      PCI_STATUS, &pci_status);
>> +		if (result == PCIBIOS_SUCCESSFUL)
>> +			dev_warn(dev->dev,
>> +				 "controller is down; will reset: CSTS=0x%x, PCI_STATUS=0x%hx\n",
>> +				 csts, pci_status);
>> +		else
>> +			dev_warn(dev->dev,
>> +				 "controller is down; will reset: CSTS=0x%x, PCI_STATUS read failed (%d)\n",
>> +				 csts, result);
>> +	}
>
> Can you factor all this debug code into a separate function to keep
> the main flow easier to read?
>
> Except for that this patch looks fine to me:
>
> Reviewed-by: Christoph Hellwig <hch at lst.de>

Done. v3 coming.

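As a rough idea of the shape such a helper could take, the sketch below pulls the two message variants into one function. It is a userspace stand-in, not the v3 code: format_reset_warning() is a hypothetical name, snprintf() stands in for dev_warn(), and the config-space read is replaced by passing its result and value in as parameters. PCIBIOS_SUCCESSFUL is defined locally with the value the kernel uses (0x00).

```c
#include <stdint.h>
#include <stdio.h>

/* Same value as the kernel's PCIBIOS_SUCCESSFUL. */
#define PCIBIOS_SUCCESSFUL 0x00

/*
 * Hypothetical helper: build the warning text from values the caller
 * has already read. 'result' is the return code of the config read and
 * 'pci_status' is only meaningful when that read succeeded.
 * Returns what snprintf() returns.
 */
static int format_reset_warning(char *buf, size_t len, uint32_t csts,
				int result, uint16_t pci_status)
{
	if (result == PCIBIOS_SUCCESSFUL)
		return snprintf(buf, len,
				"controller is down; will reset: CSTS=0x%x, PCI_STATUS=0x%hx",
				(unsigned int)csts, pci_status);

	return snprintf(buf, len,
			"controller is down; will reset: CSTS=0x%x, PCI_STATUS read failed (%d)",
			(unsigned int)csts, result);
}
```

With the formatting in one place, the call site in the main flow shrinks to a single line inside the reset check.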
--
Andy Lutomirski
AMA Capital Management, LLC