[PATCH net-next] net: stmmac: fix dwmac4 transmit performance regression

Georg Gottleuber g.gottleuber at tuxedocomputers.com
Mon Mar 16 03:51:46 PDT 2026



On 13.03.26 at 19:39, Russell King (Oracle) wrote:
> On Fri, Mar 13, 2026 at 04:35:02PM +0000, Russell King (Oracle) wrote:
>> On Fri, Mar 13, 2026 at 04:03:16PM +0100, Georg Gottleuber wrote:
>>> On 16.01.26 at 01:49, Russell King (Oracle) wrote:
>>>> dwmac4's transmit performance dropped by a factor of four due to an
>>>> incorrect assumption about which definitions are for what. This
>>>> highlights the need for sane register macros.
>>>>
>>>> Commit 8409495bf6c9 ("net: stmmac: cores: remove many xxx_SHIFT
>>>> definitions") changed the way the txpbl value is merged into the
>>>> register:
>>>>
>>>>         value = readl(ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan));
>>>> -       value = value | (txpbl << DMA_BUS_MODE_PBL_SHIFT);
>>>> +       value = value | FIELD_PREP(DMA_BUS_MODE_PBL, txpbl);
>>>>
>>>> With the following in the header file:
>>>>
>>>>  #define DMA_BUS_MODE_PBL               BIT(16)
>>>> -#define DMA_BUS_MODE_PBL_SHIFT         16
>>>>
>>>> The assumption here was that DMA_BUS_MODE_PBL was the mask for
>>>> DMA_BUS_MODE_PBL_SHIFT, but this turns out not to be the case.
>>>>
>>>> The field is actually six bits wide, bits 21:16, and is called
>>>> TXPBL.
>>>>
>>>> What's even more confusing is, there turns out to be a PBLX8
>>>> single bit in the DMA_CHAN_CONTROL register (0x1100 for channel 0),
>>>> and DMA_BUS_MODE_PBL seems to be used for that. However, this bit
>>>> et al. was listed under a comment "/* DMA SYS Bus Mode bitmap */"
>>>> which is for register 0x1004.
>>>>
>>>> Fix this up by adding an appropriately named field definition under
>>>> the DMA_CHAN_TX_CONTROL() register address definition.
>>>>
>>>> Move the RPBL mask definition under DMA_CHAN_RX_CONTROL(), correctly
>>>> renaming it as well.
>>>>
>>>> Also move the PBL bit definition under DMA_CHAN_CONTROL(), correctly
>>>> renaming it.
>>>>
>>>> This removes confusion over the PBL fields.
>>>>
>>>> Signed-off-by: Russell King (Oracle) <rmk+kernel at armlinux.org.uk>
>>>
>>> Thank you for this patch, which significantly speeds up the transmit (by
>>> more than a factor of nine on our devices with Motorcomm yt6801).
>>>
>>> Unfortunately, this patch also causes DMA errors on two of our devices;
>>> logs from an iperf3 test are attached.
>>>
>>> Strangely enough, a third device with the same Motorcomm yt6801 does not
>>> appear to be affected by the DMA errors. However, further testing is needed.
>>>
>>> Do you have any ideas for further tests?
>>
>> I would suggest dumping the contents of these control registers prior
>> to commit 8409495bf6c9, and after this commit, comparing their values
>> to identify what has changed. I'm sorry, I don't have the bandwidth to
>> inspect the patches to see what may have been inadvertently changed.
> 
> Note that dwmac-motorcomm was merged between the broken commit
> 8409495bf6c9 and the fix 5ccde4c81e84. However, the broken commit was
> merged on 12 Jan, but there were postings of dwmac-motorcomm before
> this:
> 
> https://lore.kernel.org/netdev/20251014164746.50696-5-ziyao@disroot.org/
> 
> which uses the same parameters.
> 
> So, I wonder whether what you're running into is that with the
> breakage, the PCIe errors are masked. However, what I will note is that
> setting TxPBL to zero (as will happen with the breakage, since 32 & 1
> is 0) is documented as having undefined behaviour - so it's definitely
> wrong. Even if zero works there, you're operating the IP in territory
> documented as undefined. If you really want to test that, with commit
> 5ccde4c81e84 applied, set txpbl to 64, as
> 
> 	FIELD_PREP(DMA_CHAN_TX_CTRL_TXPBL_MASK, 64)
> 
> will be zero because it overflows the 21:16 bitmask.
> 
> I suspect if you wind the kernel tree back to before 8409495bf6c9,
> and then apply the motorcomm support patches, you may well see these
> PCIe errors - and that would rule out these changes. It would suggest
> that this is a pre-existing problem, or maybe a hardware issue.

Thank you very much for the explanation. You're right. I saw the DMA
errors in this case as well.
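
For reference, the arithmetic behind both zero cases (32 with the
single-bit definition, 64 with the six-bit mask) can be sketched in
plain userspace C. The macros and helper names below are simplified
stand-ins I made up for illustration, not the kernel's BIT(), GENMASK()
or FIELD_PREP(); in particular, the kernel's FIELD_PREP() has
compile-time checks that this sketch omits.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins (assumption: not the kernel header definitions). */
#define BIT32(n)         (UINT32_C(1) << (n))
#define GENMASK32(h, l)  ((UINT32_MAX >> (31 - (h))) & (UINT32_MAX << (l)))

/* Shift val up to the lowest set bit of mask, then drop bits outside mask. */
static uint32_t field_prep(uint32_t mask, uint32_t val)
{
	assert(mask != 0);
	uint32_t lsb = mask & (~mask + 1);	/* lowest set bit of mask */

	return (val * lsb) & mask;
}

#define OLD_PBL_BIT  BIT32(16)		/* single bit, as used by the broken commit */
#define TXPBL_MASK   GENMASK32(21, 16)	/* six-bit field, bits 21:16 */

/* Field value written with the single-bit definition. */
static uint32_t txpbl_broken(uint32_t txpbl)
{
	return field_prep(OLD_PBL_BIT, txpbl);
}

/* Field value written with the six-bit mask. */
static uint32_t txpbl_fixed(uint32_t txpbl)
{
	return field_prep(TXPBL_MASK, txpbl);
}
```

With these helpers, txpbl_broken(32) and txpbl_fixed(64) both evaluate
to 0, while txpbl_fixed(32) gives 0x200000, i.e. 32 placed in bits
21:16.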

> Another idea would be to try reducing txpbl to see if there's a value
> where things stabilise.
> 

I'll give it a try.
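
A rough sketch of how I would enumerate the values to try, again in
plain userspace C. Halving from the current value down to 1 is only my
assumption about a sensible sweep, and the helper names are
hypothetical. Note that set_txpbl() clears the field before writing,
unlike the OR-only merge in the quoted diff, so successive values do
not accumulate in the register image.

```c
#include <stddef.h>
#include <stdint.h>

#define TXPBL_SHIFT  16
#define TXPBL_MASK   (UINT32_C(0x3F) << TXPBL_SHIFT)	/* bits 21:16 */

/* Replace the TXPBL field in a raw register value, leaving other bits alone. */
static uint32_t set_txpbl(uint32_t reg, uint32_t txpbl)
{
	return (reg & ~TXPBL_MASK) | ((txpbl << TXPBL_SHIFT) & TXPBL_MASK);
}

/* Extract the TXPBL field from a raw register value. */
static uint32_t get_txpbl(uint32_t reg)
{
	return (reg & TXPBL_MASK) >> TXPBL_SHIFT;
}

/* Fill out[] with start, start/2, ..., 1 and return how many were written. */
static size_t txpbl_candidates(uint32_t start, uint32_t *out, size_t max)
{
	size_t n = 0;

	for (uint32_t v = start; v >= 1 && n < max; v /= 2)
		out[n++] = v;

	return n;
}
```

Starting from 32 this yields six candidates (32, 16, 8, 4, 2, 1), and
get_txpbl() can be used on a register dump to confirm which value
actually ended up in the hardware.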

Regards,
Georg



