Problems writing to intel P3700 NVMe drive

Bjorn Helgaas bhelgaas at google.com
Wed Jun 3 13:20:51 PDT 2015


On Wed, Jun 3, 2015 at 3:10 PM, Keith Busch <keith.busch at intel.com> wrote:
> +cc the pci list
>
> Hi Bjorn,
>
> A while back, there were a few proposals on changing the pci driver's
> default MPS tuning from the existing do-nothing policy to something
> safe so end-users don't need to remember kernel parameters as described
> below. Is this still active, or can we kick that back to life if not?

I'd love to see some activity there.  Somebody else asked me about
this a few weeks ago, and I sent him this list of some of the things I
think are wrong with the current situation:

- MPS configuration is done in pcie_bus_configure_settings().  This is
  not arch-specific, but it is done by arch code, and only arm,
  powerpc, tile, and x86 call it.

- MPS configuration should be done before a driver claims a device.
  This is currently wrong on arm.

- I would rather have MPS configuration done somewhere in the
  pci_scan_single_device() path.  That way it would be done on every
  arch and in every hotplug path.

- I know MPS depends on more than just the individual device and its
  upstream bridge, so it's not as simple as some other properties.
  But if we have a correctly-configured tree, we should be able to
  figure out what needs to change if we add one more device to it.

- I do not believe it is safe for drivers to manipulate MPS behind the
  back of the PCI core, so pcie_set_mps() should probably not be
  exported.
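
To make the "add one more device" point concrete, here is a rough
sketch in Python (not kernel code).  The Dev class and the function
names are hypothetical, and the policy shown is only the "safe" one,
where every device gets the largest MPS that all devices in the
hierarchy support.  It ignores the hard part, which is lowering MPS on
devices that already have drivers bound.

```python
from dataclasses import dataclass, field


def mps_bytes(encoded):
    """PCIe encodes payload sizes as 128 << n bytes."""
    return 128 << encoded


@dataclass
class Dev:  # hypothetical model, not a kernel structure
    name: str
    mpss: int          # encoded Max_Payload_Size Supported (DevCap)
    mps: int = 0       # encoded Max_Payload_Size programmed (DevCtl)
    children: list = field(default_factory=list)


def walk(dev):
    """Yield dev and everything below it."""
    yield dev
    for child in dev.children:
        yield from walk(child)


def configure_safe(root):
    """Program the smallest MPSS in the hierarchy into every device."""
    common = min(d.mpss for d in walk(root))
    for d in walk(root):
        d.mps = common
    return common


def hot_add(root, parent, new):
    """Add one device to a tree that is already consistent.

    Returns True if only the new device had to be programmed, False if
    the whole tree had to be lowered to accommodate it.
    """
    common = root.mps
    parent.children.append(new)
    if new.mpss >= common:
        new.mps = common       # new device simply adopts the tree's MPS
        return True
    configure_safe(root)       # new device is the limiting one
    return False
```

With a correctly-configured tree, the common case is the first branch:
the new device just takes the value the tree already uses.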

5f39e6705faa ("PCI: Disable MPS configuration by default") turned off
most of the tuning we used to do.  I expect that if we turn it back on,
we'll trip over a few issues.  That in itself is not so much of a
problem, but before we turn things back on, I want to have better
infrastructure in place that makes it easier to diagnose and fix those
issues.  For example, I'd like to see just enough clues in dmesg to
enable us to see what the kernel's doing and verify that it's correct.
And I want it to be done cleanly on every arch for every device, with
hot-added ones being handled the same as those present at boot.

> On Thu, 28 May 2015, Keith Busch wrote:
>>
>> Perhaps not a desirable/permanent solution, but for everyone's benefit,
>> we can work around the problem with a kernel parameter:
>>
>>  pci=pcie_bus_safe
>>
>> Using 'pcie_bus_perf' instead may also work, though your original
>> solution of just using x86 sounded perfectly reasonable to me. :)
>>
>> On Thu, 28 May 2015, Pavilion Storage wrote:
>>>
>>> Thanks to Keith Busch for providing the clues.
>>> The problem was that the PCIe MPSS on the drive and the CPU root
>>> complex did not match and that caused a problem. Setting it to a
>>> common value fixed the problem.
>>> Kishore
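
For anyone checking for this kind of mismatch by hand, the sketch
below decodes the relevant fields from raw register values.  The bit
positions are the standard ones (MPSS in Device Capabilities bits 2:0,
MPS in Device Control bits 7:5); the function names and the register
values in the example are made up for illustration.

```python
PCI_EXP_DEVCAP_PAYLOAD = 0x0007   # Device Capabilities, bits 2:0
PCI_EXP_DEVCTL_PAYLOAD = 0x00E0   # Device Control, bits 7:5


def devcap_mpss_bytes(devcap):
    """Largest payload the device supports, in bytes."""
    return 128 << (devcap & PCI_EXP_DEVCAP_PAYLOAD)


def devctl_mps_bytes(devctl):
    """Payload size the device is currently programmed to use, in bytes."""
    return 128 << ((devctl & PCI_EXP_DEVCTL_PAYLOAD) >> 5)


def mps_mismatch(upstream_devctl, downstream_devctl):
    """True if the two ends of a link are programmed with different MPS."""
    return devctl_mps_bytes(upstream_devctl) != devctl_mps_bytes(downstream_devctl)


# Hypothetical values: root port programmed for 256 bytes, drive for 128.
print(devctl_mps_bytes(0x2830), devctl_mps_bytes(0x2810))
print(mps_mismatch(0x2830, 0x2810))
```

The raw values can be read with a tool such as lspci or setpci; the
decoding is the same either way.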


