[PATCH v15,RESEND 22/23] PCI: starfive: Offload the NVMe timeout workaround to host drivers.
Keith Busch
kbusch at kernel.org
Wed Mar 13 19:51:29 PDT 2024
On Thu, Mar 14, 2024 at 02:18:38AM +0000, Kevin Xie wrote:
> > Re: [PATCH v15,RESEND 22/23] PCI: starfive: Offload the NVMe timeout
> > workaround to host drivers.
> >
> > On Mon, Mar 04, 2024 at 10:08:06AM -0800, Palmer Dabbelt wrote:
> > > On Thu, 29 Feb 2024 07:08:43 PST (-0800), lpieralisi at kernel.org wrote:
> > > > On Tue, Feb 27, 2024 at 06:35:21PM +0800, Minda Chen wrote:
> > > > > From: Kevin Xie <kevin.xie at starfivetech.com>
> > > > >
> > > > > As the Starfive JH7110 hardware can't keep two inbound post write
> > > > > in order all the time, such as MSI messages and NVMe completions.
> > > > > If the NVMe completion update later than the MSI, an NVMe IRQ handle
> > will miss.
> > > >
> > > > Please explain what the problem is and what "NVMe completions" means
> > > > given that you are talking about posted writes.
>
> Sorry, we made a casual conclusion here.
> Not any two of inbound post requests can`t be kept in order in JH7110 SoC,
> the only one case we found is NVMe completions with MSI interrupts.
> To be more precise, they are the pending status in nvme_completion struct and
> nvme_irq handler in nvme/host/pci.c.
>
> We have shown the original workaround patch before:
> https://lore.kernel.org/lkml/CAJM55Z9HtBSyCq7rDEDFdw644pOWCKJfPqhmi3SD1x6p3g2SLQ@mail.gmail.com/
> We put it in our github branch and works fine for a long time.
> Looking forward to better advices from someone familiar with NVMe drivers.
So this platform treats strictly ordered writes the same as if relaxed
ordering was enabled? I am not sure if we could reasonably work around
such behavior. An arbitrary delay is likely too long for most cases, and
too short for the worst case.
I suppose we could quirk a non-posted transaction in the interrupt
handler to force flush pending memory updates, but that will noticeably
harm your nvme performance. Maybe if you constrain such behavior to the
spurious IRQ_NONE condition, then it might be okay? I don't know.
More information about the linux-riscv
mailing list