[Bug 217100] New: Bifurcation between pcie3x1 & pcie3x2 doesn't work in RK3568J.
Serge Semin
Sergey.Semin at baikalelectronics.ru
Tue Feb 28 08:35:01 PST 2023
* Cc += Linux ARM mailing list
Hi Anton
On Tue, Feb 28, 2023 at 06:44:27AM -0600, Bjorn Helgaas wrote:
> On Tue, Feb 28, 2023 at 08:53:50AM +0000, bugzilla-daemon at kernel.org wrote:
> > https://bugzilla.kernel.org/show_bug.cgi?id=217100
> >
> > Summary: Bifurcation between pcie3x1 & pcie3x2 doesn't work in
> > RK3568J.
> > ...
>
> > Hello.
> >
> > First, I want to say that pcie3x1 crashes if started before pcie3x2 . Driver
> > > pcie-designware.c
> >
> > in function
> >
> > > void dw_pcie_version_detect(struct dw_pcie *pci)
> >
> > tries to read parameter from dbi register (PCIE_VERSION_NUMBER) and fails on
> > it.
Could you give more details about what crashes? A log/stack trace would
be nice to see.
Do you know the DW PCIe IP-core version? Could you try to manually get the
content of the PCIE_VERSION_NUMBER and PCIE_VERSION_TYPE registers after
the system successfully boots up?
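In case it helps with the manual read, below is a minimal userspace sketch, not a tested tool. The register offsets are the ones I believe pcie-designware.h defines; the helper names, the use of /dev/mem, and the idea of printing the value as ASCII are assumptions for illustration. The ASCII view is based on the kernel's DW_PCIE_VER_* constants, which look like packed characters (e.g. 0x3438302a for v4.80a). The physical DBI base is not hard-coded: take it from the "dbi" reg entry of the DT node.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Offsets assumed to match drivers/pci/controller/dwc/pcie-designware.h */
#define PCIE_VERSION_NUMBER 0x8F8
#define PCIE_VERSION_TYPE   0x8FC

/* Render a raw 32-bit register value as four characters, MSB first.
 * Non-printable bytes are shown as '?'. */
void dw_reg_to_ascii(uint32_t raw, char out[5])
{
    for (int i = 0; i < 4; i++) {
        unsigned char c = (raw >> (24 - 8 * i)) & 0xff;
        out[i] = (c >= 0x20 && c < 0x7f) ? (char)c : '?';
    }
    out[4] = '\0';
}

/* Hypothetical helper: read one 32-bit register at physical address
 * `phys` through /dev/mem. Returns 0 on success, -1 on failure. */
int read_phys_reg32(uint64_t phys, uint32_t *val)
{
    long page = sysconf(_SC_PAGESIZE);
    uint64_t page_base = phys & ~((uint64_t)page - 1);
    int fd = open("/dev/mem", O_RDONLY | O_SYNC);

    if (fd < 0)
        return -1;

    void *map = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED,
                     fd, (off_t)page_base);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED)
        return -1;

    *val = *(volatile uint32_t *)((char *)map + (phys - page_base));
    munmap(map, (size_t)page);
    return 0;
}
```

Calling read_phys_reg32(dbi_base + PCIE_VERSION_NUMBER, &v) and then dw_reg_to_ascii(v, buf) would give both the raw and the readable form. It needs root and a kernel that permits /dev/mem access to that region, and it should only be run after the controller has probed successfully, since reading the DBI space with its clock gated may trigger the very bus error discussed here.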
At first glance, what you see here might be caused by a DBI
reference clock race condition/malfunction. The
dw_pcie_version_detect() function doesn't do much. It just reads the
content of the IP-core version/type registers. So if the reference clock
isn't enabled by the time the function is called, the system bus may
return an error, which could be caught by the system error handler
module, whose driver can crash the kernel. That said, the LLDD
(low-level device driver) enables all the supplied clocks before
calling dw_pcie_host_init(). So what you've discovered seems unexpected
and is most likely caused by something elsewhere. Hence, here are
several questions:
1. Are you sure your DT-file has all the required reference clocks
specified for all the Rockchip PCIe DT-nodes?
2. Are you sure there were no updates in the platform-clock
driver(s) which could cause a possible clock malfunction?
3. Could you have a look whether the pcie3x1 and pcie3x2 reference/DBI
clock sources are unrelated? That might give a clue to the solution
and could confirm the race.
4. Are you sure that the problem is in the dw_pcie_version_detect()
function? Could you try to replace all the dw_pcie_readl_dbi() calls
with the actual content of the PCIE_VERSION_NUMBER and PCIE_VERSION_TYPE
registers and then have a look whether the crash still happens?
Anyway, a log/stack trace would give a better understanding of the problem.
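To illustrate the experiment from question 4, here is a simplified userspace model, not the kernel code itself. The struct, the stub, and the function-pointer hook are hypothetical names made up for this sketch, and the two SAVED_* values are placeholders: substitute whatever the registers contain on a boot that works. The point is only that once the register read is swapped for constants, the detection step no longer touches the DBI space, so a crash that still happens must come from somewhere else.

```c
#include <stdint.h>

/* Offsets assumed to match drivers/pci/controller/dwc/pcie-designware.h */
#define PCIE_VERSION_NUMBER 0x8F8
#define PCIE_VERSION_TYPE   0x8FC

/* Placeholders only, NOT the real RK3568 values. */
#define SAVED_VERSION_NUMBER 0x3438302aU
#define SAVED_VERSION_TYPE   0x11111111U

/* Hypothetical stand-in for the two fields the detection step fills in. */
struct fake_pci {
    uint32_t version;
    uint32_t type;
};

/* Replaces the MMIO read: returns constants and never touches hardware. */
uint32_t stub_readl_dbi(uint32_t reg)
{
    switch (reg) {
    case PCIE_VERSION_NUMBER:
        return SAVED_VERSION_NUMBER;
    case PCIE_VERSION_TYPE:
        return SAVED_VERSION_TYPE;
    default:
        return 0;
    }
}

/* Simplified model of the detection flow: read both registers through
 * the supplied hook and record them. A zero version is treated as
 * "nothing to record" (an assumption of this model). */
void version_detect_model(struct fake_pci *pci,
                          uint32_t (*readl)(uint32_t reg))
{
    uint32_t ver = readl(PCIE_VERSION_NUMBER);

    if (!ver)
        return;

    pci->version = ver;
    pci->type = readl(PCIE_VERSION_TYPE);
}
```

In the real driver the equivalent change would be to assign the constants directly in place of the dw_pcie_readl_dbi() calls, then boot with pcie3x1 probing first again.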
> > So I changed sequence of declaration PCIE in rk3568.dtsi: first - pcie3x2 next
> > pcie3x1. Now Linux first starts pcie3x2, then successfully starts pcie3x1.
> >
> > But main problem is that bifurcation in phy driver
> > > phy-rockchip-snps-pcie3.c
> >
> > doesn't work. I tried add next lines in function
It seems to me that this problem and the problem above might be weakly
related. Let's dig deeper into the first problem. Then we can move on
to the second problem if it is still relevant by that time.
> >
> > > static int rockchip_p3phy_probe(struct platform_device *pdev)
> >
> > right after block check
> >
> > > if (priv->num_lanes == -EINVAL) {
> > > }
> >
> > > priv->num_lanes = 2;
> > > priv->lanes[0] = 1;
> > > priv->lanes[1] = 2;
> >
> > And driver writes during Linux boot process that bifurcation is enabled, but
> >
> > lspci
> >
> > doesn't show second device.
> >
> > Best regards,
> > Anton.
>
> Thanks much for your report, Anton. People don't really pay attention
> to the bugzilla, so I'm forwarding this to the mailing list and to
> some folks who've worked on that driver in the past.
>
> MAINTAINERS doesn't list an entry for pcie-dw-rockchip.c; I'm not sure
> if that's an oversight or if nobody cares enough to actually maintain
> it.
Hi Bjorn
It seems that the driver is currently supported by the Rockchip SoC maintainer:
ARM/Rockchip SoC support
F: drivers/*/*/*rockchip*
F: drivers/*/*rockchip*
So @Heiko might give more experienced advice about the bug.
-Serge(y)
>
> Bjorn
>