[PATCH net-next] stmmac: align RX buffers

Joakim Zhang qiangqing.zhang at nxp.com
Wed Aug 11 03:56:26 PDT 2021


> -----Original Message-----
> From: Thierry Reding <thierry.reding at gmail.com>
> Sent: August 11, 2021 18:42
> To: Marc Zyngier <maz at kernel.org>
> Cc: Matteo Croce <mcroce at linux.microsoft.com>; netdev at vger.kernel.org;
> linux-kernel at vger.kernel.org; linux-riscv at lists.infradead.org; Giuseppe
> Cavallaro <peppe.cavallaro at st.com>; Alexandre Torgue
> <alexandre.torgue at foss.st.com>; David S. Miller <davem at davemloft.net>;
> Jakub Kicinski <kuba at kernel.org>; Palmer Dabbelt <palmer at dabbelt.com>;
> Paul Walmsley <paul.walmsley at sifive.com>; Drew Fustini
> <drew at beagleboard.org>; Emil Renner Berthing <kernel at esmil.dk>; Jon
> Hunter <jonathanh at nvidia.com>; Will Deacon <will at kernel.org>
> Subject: Re: [PATCH net-next] stmmac: align RX buffers
> 
> On Tue, Aug 10, 2021 at 08:07:47PM +0100, Marc Zyngier wrote:
> > Hi all,
> >
> > [adding Thierry, Jon and Will to the fun]
> >
> > On Mon, 14 Jun 2021 03:25:04 +0100,
> > Matteo Croce <mcroce at linux.microsoft.com> wrote:
> > >
> > > From: Matteo Croce <mcroce at microsoft.com>
> > >
> > > On RX an SKB is allocated and the received buffer is copied into it.
> > > But on some architectures, the memcpy() needs the source and
> > > destination buffers to have the same alignment to be efficient.
> > >
> > > This is not our case, because SKB data pointer is misaligned by two
> > > > bytes to compensate for the Ethernet header.
> > >
> > > Align the RX buffer the same way as the SKB one, so the copy is faster.
> > > An iperf3 RX test gives a decent improvement on a RISC-V machine:
> > >
> > > before:
> > > [ ID] Interval           Transfer     Bitrate         Retr
> > > > [  5]   0.00-10.00  sec   733 MBytes   615 Mbits/sec   88             sender
> > > > [  5]   0.00-10.01  sec   730 MBytes   612 Mbits/sec                  receiver
> > >
> > > after:
> > > [ ID] Interval           Transfer     Bitrate         Retr
> > > > [  5]   0.00-10.00  sec  1.10 GBytes   942 Mbits/sec    0             sender
> > > > [  5]   0.00-10.00  sec  1.09 GBytes   940 Mbits/sec                  receiver
> > >
> > > And the memcpy() overhead during the RX drops dramatically.
> > >
> > > before:
> > > Overhead  Shared O  Symbol
> > >   43.35%  [kernel]  [k] memcpy
> > >   33.77%  [kernel]  [k] __asm_copy_to_user
> > >    3.64%  [kernel]  [k] sifive_l2_flush64_range
> > >
> > > after:
> > > Overhead  Shared O  Symbol
> > >   45.40%  [kernel]  [k] __asm_copy_to_user
> > >   28.09%  [kernel]  [k] memcpy
> > >    4.27%  [kernel]  [k] sifive_l2_flush64_range
> > >
> > > Signed-off-by: Matteo Croce <mcroce at microsoft.com>
> >
> > This patch completely breaks my Jetson TX2 system, composed of 2
> > Nvidia Denver and 4 Cortex-A57, in a very "funny" way.
> >
> > > Any significant amount of traffic results in all sorts of corruption
> > (ssh connections get dropped, Debian packages downloaded have the
> > wrong checksums) if any Denver core is involved in any significant way
> > (packet processing, interrupt handling). And it is all triggered by
> > this very change.
> >
> > The only way I have to make it work on a Denver core is to route the
> > interrupt to that particular core and taskset the workload to it. Any
> > other configuration involving a Denver CPU results in some sort of
> > corruption. On their own, the A57s are fine.
> >
> > This smells of memory ordering going really wrong, which this change
> > would expose. I haven't had a chance to dig into the driver yet (it
> > took me long enough to bisect it), but if someone points me at what is
> > supposed to synchronise the DMA when receiving an interrupt, I'll have
> > a look.
> 
> One other thing that kind of rings a bell when reading DMA and interrupts is a
> recent report (and attempt to fix this) where upon resume from system
> suspend, the DMA descriptors would get corrupted.
> 
> I don't think we ever figured out what exactly the problem was, but
> interestingly the fix for the issue immediately caused things to go haywire on...
> Jetson TX2.
> 
> I recall looking at this a bit and couldn't find where exactly the DMA was being
> synchronized on suspend/resume, or what the mechanism was to ensure that
> (in transit) packets were not received after the suspension of the Ethernet
> device. Some information about this can be found here:
> 
> 	https://lore.kernel.org/netdev/708edb92-a5df-ecc4-3126-5ab36707e275@nvidia.com/
> 
> It's interesting that this happens only on Jetson TX2. Apparently on the newer
> Jetson AGX Xavier this problem does not occur. I think Jon also narrowed this
> down to being related to the IOMMU being enabled on Jetson TX2, whereas
> Jetson AGX Xavier didn't have it enabled. I wasn't able to find any notes on
> whether disabling the IOMMU on Jetson TX2 did anything to improve on this,
> so perhaps that's something worth trying.
> 
> We have since enabled the IOMMU on Jetson AGX Xavier, and I haven't seen
> any test reports indicating that this is causing issues. So I don't think this has
> anything directly to do with the IOMMU support.
> 
> That said, if these problems are all exclusive to Jetson TX2, or rather Tegra186,
> that could indicate that we're missing something at a more fundamental level
> (maybe some cache maintenance quirk?).


Hey Thierry,

Please also let me know if you find the root cause; that would be appreciated!
I have not upstreamed the fix you mentioned yet because of your continued NACK.

Thanks in advance 😊

Best Regards,
Joakim Zhang
> Thierry
> 
> > > ---
> > >  drivers/net/ethernet/stmicro/stmmac/stmmac.h | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> > > > index b6cd43eda7ac..04bdb3950d63 100644
> > > > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> > > > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> > > > @@ -338,9 +338,9 @@ static inline bool stmmac_xdp_is_enabled(struct stmmac_priv *priv)
> > > >  static inline unsigned int stmmac_rx_offset(struct stmmac_priv *priv)
> > > >  {
> > >  	if (stmmac_xdp_is_enabled(priv))
> > > -		return XDP_PACKET_HEADROOM;
> > > +		return XDP_PACKET_HEADROOM + NET_IP_ALIGN;
> > >
> > > -	return 0;
> > > +	return NET_SKB_PAD + NET_IP_ALIGN;
> > >  }
> > >
> > >  void stmmac_disable_rx_queue(struct stmmac_priv *priv, u32 queue);
> > > --
> > > 2.31.1
> > >
> > >
> >
> > --
> > Without deviation from the norm, progress is not possible.
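
For reference, the alignment argument behind the quoted patch can be sketched in plain userspace C. This is only an illustration, not driver code: the constants are the kernel's generic defaults and are assumptions here (NET_IP_ALIGN and NET_SKB_PAD vary by architecture), and the helper names rx_offset_old(), rx_offset_new(), skb_data_offset() and same_alignment() are hypothetical. It also assumes that the RX buffer and the SKB head both start word-aligned, and that the SKB data pointer sits NET_SKB_PAD + NET_IP_ALIGN bytes into its buffer.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only (assumptions): generic kernel defaults.
 * A real build takes NET_IP_ALIGN and NET_SKB_PAD from the architecture. */
#define NET_IP_ALIGN        2u
#define NET_SKB_PAD         64u
#define XDP_PACKET_HEADROOM 256u

/* RX buffer offset as returned before the patch. */
unsigned int rx_offset_old(bool xdp_enabled)
{
	return xdp_enabled ? XDP_PACKET_HEADROOM : 0u;
}

/* RX buffer offset as returned after the patch. */
unsigned int rx_offset_new(bool xdp_enabled)
{
	if (xdp_enabled)
		return XDP_PACKET_HEADROOM + NET_IP_ALIGN;

	return NET_SKB_PAD + NET_IP_ALIGN;
}

/* Offset of the SKB data pointer within its buffer, assuming the SKB
 * was allocated with NET_SKB_PAD + NET_IP_ALIGN bytes of headroom. */
unsigned int skb_data_offset(void)
{
	return NET_SKB_PAD + NET_IP_ALIGN;
}

/* True when source and destination sit at the same position within a
 * machine word, so a word-at-a-time copy can stay aligned on both sides. */
bool same_alignment(unsigned int src_off, unsigned int dst_off,
		    unsigned int word_size)
{
	assert(word_size != 0);
	return (src_off % word_size) == (dst_off % word_size);
}
```

With these assumed values and an 8-byte word, the old offsets (0 and 256) are two bytes out of step with the SKB data pointer, while the new offsets (66 and 258) match it. That is the property the commit message relies on for the faster memcpy(). The sketch says nothing about the corruption seen on Jetson TX2.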

