Re: Re: [PATCH v13 0/21] Refactoring Microchip PCIe driver and add StarFive PCIe

Mon Jan 8 02:48:10 PST 2024

> Kevin Xie <kevin.xie at starfivetech.com> writes:
> 
> >> Minda Chen <minda.chen at starfivetech.com> writes:
> >>
> >> > The final purpose of this patchset is to add a PCIe driver for the
> >> > StarFive JH7110 SoC. The JH7110 uses the PLDA XpressRICH PCIe IP.
> >> > Microchip PolarFire uses the same IP and has already committed its code,
> >> > which mixes PLDA controller code with Microchip platform code.
> >>
> >> Thank you for this series.
> >>
> >> I tested this on a VisionFive v2 board, and it seems to probe and find my
> >> M.2 NVMe SSD, but then gets timeouts when trying to use the NVMe (e.g.
> >> 'blkid' command)
> >>
> >
> > Hi, Kevin:
> > Could you please provide the manufacturer and model of the M.2 NVMe
> > SSD you tested?
> 
> I have a 256 GB Silicon Power P34A60 M.2 NVMe SSD (part number:
> sp256gbp34a60m28)
> 
Thanks, Kevin, we will buy one to test.

Before this refactoring, we encountered the same bug with a Kingston M.2 SSD,
and we worked around the problem with the patch below. Please give it a try:

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 507bc149046d..5be37f1ee150 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1059,6 +1059,16 @@ static inline int nvme_poll_cq(struct nvme_queue *nvmeq,
 {
        int found = 0;

+       /*
+        * In some cases, such as the JH7110 SoC working with a Kingston SSD,
+        * the CQE status may be updated slightly later than the MSI arrives,
+        * which causes the IRQ handler to miss the completion.
+        * As a workaround, check the status first and wait 1us if nothing
+        * is pending yet.
+        */
+       if (!nvme_cqe_pending(nvmeq))
+               udelay(1);
+
        while (nvme_cqe_pending(nvmeq)) {
                found++;
                /*

> Also for reference, I tested the same SSD on another arm platform (Khadas
> VIM3) and it works fine.
> 
> Kevin

Hi, Bjorn:
Do you have any idea what could cause the late CQE phase update described
in the patch comment above?
The issue occurs with low probability, and only with individual devices on our
platform, so I suggest that the refactoring patch set go forward.
We will look for a more formal solution later and send it as a separate patch.
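
For reference, the race described in the patch comment can be modeled in user
space. The sketch below is only an illustration under stated assumptions: the
names fake_cqe, fake_queue, cqe_pending, advance_head, poll_cq_with_grace and
late_write are hypothetical and are not the kernel's. The pending check follows
the same idea as nvme_cqe_pending() (compare the phase tag of the entry at the
head with the expected phase), and a callback stands in for udelay(1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified completion queue entry: only the status word. */
struct fake_cqe {
	uint16_t status;	/* bit 0 is the phase tag */
};

/* Hypothetical queue model, not the kernel's struct nvme_queue. */
struct fake_queue {
	struct fake_cqe *cqes;
	uint16_t depth;
	uint16_t head;
	uint8_t phase;		/* phase value that marks a new entry */
};

/* An entry is pending when its phase tag matches the expected phase. */
static bool cqe_pending(const struct fake_queue *q)
{
	return (q->cqes[q->head].status & 1) == q->phase;
}

/* Consume one entry; flip the expected phase when the head wraps. */
static void advance_head(struct fake_queue *q)
{
	if (++q->head == q->depth) {
		q->head = 0;
		q->phase ^= 1;
	}
}

/*
 * Check once, give the device one short grace period if nothing is
 * visible yet, then drain the queue. 'delay' stands in for udelay(1).
 */
static int poll_cq_with_grace(struct fake_queue *q,
			      void (*delay)(void *), void *ctx)
{
	int found = 0;

	assert(q && q->cqes && q->depth > 0);

	if (!cqe_pending(q))
		delay(ctx);

	while (cqe_pending(q)) {
		found++;
		advance_head(q);
	}
	return found;
}

/* Stand-in for the device: the "late" CQE write lands during the delay. */
static void late_write(void *ctx)
{
	struct fake_queue *q = ctx;

	q->cqes[q->head].status = q->phase;
}
```

With an empty queue whose expected phase is 1, poll_cq_with_grace(&q,
late_write, &q) returns 1 in this model, because the stand-in callback writes
the entry during the grace period. Without the grace period the same call would
find nothing, which corresponds to the missed completion we observed.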