[PATCH] PCI: dw-rockchip: Configure max payload size on host init
Hans Zhang
18255117159 at 163.com
Wed Apr 16 23:47:23 PDT 2025
On 2025/4/17 14:01, Niklas Cassel wrote:
> On Thu, Apr 17, 2025 at 10:19:10AM +0800, Hans Zhang wrote:
>> On 2025/4/17 04:40, Bjorn Helgaas wrote:
>>> On Wed, Apr 16, 2025 at 11:19:26PM +0800, Hans Zhang wrote:
>>>> The RK3588's PCIe controller defaults to a 128-byte max payload size,
>>>> but its hardware capability actually supports 256 bytes. This results
>>>> in suboptimal performance with devices that support larger payloads.
>>>>
>>>> Signed-off-by: Hans Zhang <18255117159 at 163.com>
>>>> ---
>>>> drivers/pci/controller/dwc/pcie-dw-rockchip.c | 18 ++++++++++++++++++
>>>> 1 file changed, 18 insertions(+)
>>>>
>>>> diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
>>>> index c624b7ebd118..5bbb536a2576 100644
>>>> --- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
>>>> +++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
>>>> @@ -477,6 +477,22 @@ static irqreturn_t rockchip_pcie_ep_sys_irq_thread(int irq, void *arg)
>>>> return IRQ_HANDLED;
>>>> }
>>>> +static void rockchip_pcie_set_max_payload(struct rockchip_pcie *rockchip)
>>>> +{
>>>> + struct dw_pcie *pci = &rockchip->pci;
>>>> + u32 dev_cap, dev_ctrl;
>>>> + u16 offset;
>>>> +
>>>> + offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
>>>> + dev_cap = dw_pcie_readl_dbi(pci, offset + PCI_EXP_DEVCAP);
>>>> + dev_cap &= PCI_EXP_DEVCAP_PAYLOAD;
>>>> +
>>>> + dev_ctrl = dw_pcie_readl_dbi(pci, offset + PCI_EXP_DEVCTL);
>>>> + dev_ctrl &= ~PCI_EXP_DEVCTL_PAYLOAD;
>>>> + dev_ctrl |= dev_cap << 5;
>>>> + dw_pcie_writel_dbi(pci, offset + PCI_EXP_DEVCTL, dev_ctrl);
>>>> +}
>>>
>>> I can't really complain too much about this since meson does basically
>>> the same thing, but there are some things I don't like about this:
>>>
>>> - I don't think it's safe to set MPS higher in all cases. If we set
>>> the Root Port MPS=256, and an Endpoint only supports MPS=128, the
>>> Endpoint may do a 256-byte DMA read (assuming its MRRS>=256). In
>>> that case the RP may respond with a 256-byte payload the Endpoint
>>> can't handle. The generic code in pci_configure_mps() might be
>>> smart enough to avoid that situation, but I'm not confident about
>>> it. Maybe I could be convinced.
>>>
>>
>> Dear Bjorn,
>>
>> Thank you very much for your reply. If we set the Root Port MPS=256, and an
>> Endpoint only supports MPS=128. Finally, Root Port is also set to MPS=128 in
>> pci_configure_mps.
>
> In you example below, the Endpoint has:
> DevCap: MaxPayload 512 bytes
>
> So at least your example can't be used to prove this specific point.
> But perhaps you just wanted to show that your Max Payload Size increase
> actually works?
>
Dear Niklas,
Do you have an Endpoint with MPS=128? If so, you can also help verify
the logic of the pci_configure_mps function. I don't have an Endpoint
with MPS=128 here.
The processing logic of the pci_configure_mps function has been verified
on our own SOC platform. Please refer to the following log.
Our Root Port will set MPS=512.
0002:30:00.0 PCI bridge: Device 1f6c:0001 (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 167
Bus: primary=30, secondary=31, subordinate=5f, sec-latency=0
I/O behind bridge: 300000-300fff [size=4K] [16-bit]
Memory behind bridge: 38300000-383fffff [size=1M] [32-bit]
Prefetchable memory behind bridge:
00000000fff00000-00000000000fffff [disabled] [64-bit]
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- <SERR- <PERR-
Expansion ROM at 38200000 [virtual] [disabled] [size=1M]
BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16- MAbort- >Reset-
FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [80] Power Management version 3
Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA
PME(D0+,D1+,D2-,D3hot+,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [90] MSI: Enable+ Count=1/32 Maskable+ 64bit+
Address: 000000000e060040 Data: 0000
Masking: fffffffe Pending: 00000000
Capabilities: [b0] MSI-X: Enable- Count=2 Masked-
Vector table: BAR=0 offset=00000040
PBA: BAR=0 offset=00000040
Capabilities: [c0] Express (v2) Root Port (Slot-), MSI 00
DevCap: MaxPayload 512 bytes, PhantFunc 0
ExtTag- RBE+
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 256 bytes, MaxReadReq 1024 bytes
0002:31:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL8125 2.5GbE Controller (rev 05)
Subsystem: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE
Controller
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 166
Region 0: I/O ports at 300000 [size=256]
Region 2: Memory at 38300000 (64-bit, non-prefetchable) [size=64K]
Region 4: Memory at 38310000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA
PME(D0+,D1+,D2+,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Address: 0000000000000000 Data: 0000
Masking: 00000000 Pending: 00000000
Capabilities: [70] Express (v2) Endpoint, MSI 01
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
<512ns, L1 <64us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
SlotPowerLimit 0W
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 256 bytes, MaxReadReq 2048 bytes
hans at hans:~$ iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from ethernet_ip, port 47114
[ 5] local ubuntu_host_ip port 5201 connected to ethernet_ip port 47122
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 108 MBytes 902 Mbits/sec
[ 5] 1.00-2.00 sec 112 MBytes 941 Mbits/sec
[ 5] 2.00-3.00 sec 112 MBytes 941 Mbits/sec
[ 5] 3.00-4.00 sec 112 MBytes 941 Mbits/sec
[ 5] 4.00-5.00 sec 112 MBytes 941 Mbits/sec
[ 5] 5.00-6.00 sec 112 MBytes 941 Mbits/sec
[ 5] 6.00-7.00 sec 112 MBytes 941 Mbits/sec
[ 5] 7.00-8.00 sec 112 MBytes 941 Mbits/sec
[ 5] 8.00-9.00 sec 112 MBytes 941 Mbits/sec
[ 5] 9.00-10.00 sec 112 MBytes 941 Mbits/sec
[ 5] 10.00-10.04 sec 4.92 MBytes 941 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.04 sec 1.10 GBytes 938 Mbits/sec
receiver
-----------------------------------------------------------
root at cix-localhost:~# iperf3 -c ubuntu_host_ip
Connecting to host ubuntu_host_ip, port 5201
[ 5] local ethernet_ip port 47122 connected to ubuntu_host_ip port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 114 MBytes 958 Mbits/sec 0 484 KBytes
[ 5] 1.00-2.00 sec 113 MBytes 946 Mbits/sec 0 535 KBytes
[ 5] 2.00-3.00 sec 112 MBytes 936 Mbits/sec 0 559 KBytes
[ 5] 3.00-4.00 sec 113 MBytes 946 Mbits/sec 0 587 KBytes
[ 5] 4.00-5.00 sec 112 MBytes 939 Mbits/sec 0 587 KBytes
[ 5] 5.00-6.00 sec 113 MBytes 948 Mbits/sec 0 587 KBytes
[ 5] 6.00-7.00 sec 112 MBytes 936 Mbits/sec 0 587 KBytes
[ 5] 7.00-8.00 sec 112 MBytes 939 Mbits/sec 0 587 KBytes
[ 5] 8.00-9.00 sec 112 MBytes 942 Mbits/sec 0 619 KBytes
[ 5] 9.00-10.00 sec 113 MBytes 945 Mbits/sec 0 677 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 1.10 GBytes 944 Mbits/sec 0 sender
[ 5] 0.00-10.04 sec 1.10 GBytes 938 Mbits/sec
receiver
Best regards,
Hans
More information about the linux-arm-kernel
mailing list