[PATCH net v3 3/4] net: ethernet: cortina: Handle large frames

Vladimir Oltean olteanv at gmail.com
Wed Nov 8 07:27:36 PST 2023


On Tue, Nov 07, 2023 at 10:54:28AM +0100, Linus Walleij wrote:
> The Gemini ethernet controller provides hardware checksumming
> for frames up to 1514 bytes including ethernet headers but not
> FCS.
> 
> If we start sending bigger frames (after first bumping up the MTU
> on both interfaces sending and receiveing the frames), truncated

s/receiveing/receiving/

> packets start to appear on the target such as in this tcpdump
> resulting from ping -s 1474:
> 
> 23:34:17.241983 14:d6:4d:a8:3c:4f (oui Unknown) > bc:ae:c5:6b:a8:3d (oui Unknown),
> ethertype IPv4 (0x0800), length 1514: truncated-ip - 2 bytes missing!
> (tos 0x0, ttl 64, id 32653, offset 0, flags [DF], proto ICMP (1), length 1502)
> OpenWrt.lan > Fecusia: ICMP echo request, id 1672, seq 50, length 1482
> 
> If we bypass the hardware checksumming and provide a software
> fallback, everything starts working fine up to the max TX MTU
> of 2047 bytes, for example ping -s2000 192.168.1.2:
> 
> 00:44:29.587598 bc:ae:c5:6b:a8:3d (oui Unknown) > 14:d6:4d:a8:3c:4f (oui Unknown),
> ethertype IPv4 (0x0800), length 2042:
> (tos 0x0, ttl 64, id 51828, offset 0, flags [none], proto ICMP (1), length 2028)
> Fecusia > OpenWrt.lan: ICMP echo reply, id 1683, seq 4, length 2008
> 
> The bit enabling to bypass hardware checksum (or any of the
> "TSS" bits) are undocumented in the hardware reference manual.
> The entire hardware checksum unit appears undocumented. The
> conclusion that we need to use the "bypass" bit was found by
> trial-and-error.
> 
> Since no hardware checksum will happen, we slot in a software
> checksum fallback.
> 
> Check for the condition where we need to compute checksum on the
> skb with either hardware or software using == CHECKSUM_PARTIAL instead
> of != CHECKSUM_NONE which is an incomplete check according to
> <linux/skbuff.h>.
> 
> We delete the code disabling the hardware checksum for large
> MTU:s: this is suboptimal because it will disable hardware

"MTUs" maybe?

> checksumming also on small packets which the checksumming
> engine can handle just fine, which is a waste of resources.
> 
> On the D-Link DIR-685 router this fixes a bug on the conduit
> interface to the RTL8366RB DSA switch: as the switch needs to add
> space for its tag it increases the MTU on the conduit interface
> to 1504 and that means that when the router sends packages
> of 1500 bytes these get an extra 4 bytes of DSA tag and the
> transfer fails because of the erroneous hardware checksumming,
> affecting such basic functionality as the LuCI web interface.
> 
> Suggested-by: Vladimir Oltean <olteanv at gmail.com>
> Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet")
> Signed-off-by: Linus Walleij <linus.walleij at linaro.org>
> ---
>  drivers/net/ethernet/cortina/gemini.c | 34 +++++++++++++++++++++++-----------
>  1 file changed, 23 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c
> index b21a94b4ab5c..78287cfcbf63 100644
> --- a/drivers/net/ethernet/cortina/gemini.c
> +++ b/drivers/net/ethernet/cortina/gemini.c
> @@ -1145,6 +1145,7 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
>  	dma_addr_t mapping;
>  	unsigned short mtu;
>  	void *buffer;
> +	int ret;
>  
>  	mtu  = ETH_HLEN;
>  	mtu += netdev->mtu;
> @@ -1159,9 +1160,30 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
>  		word3 |= mtu;
>  	}
>  
> -	if (skb->ip_summed != CHECKSUM_NONE) {
> +	if (skb->len >= ETH_FRAME_LEN) {
> +		/* Hardware offloaded checksumming isn't working on frames
> +		 * bigger than 1514 bytes. A hypothesis about this is that the
> +		 * checksum buffer is only 1518 bytes, so when the frames get
> +		 * bigger they get truncated, or the last few bytes get
> +		 * overwritten by the FCS.
> +		 *
> +		 * Just use software checksumming and bypass on bigger frames.
> +		 */
> +		if (skb->ip_summed == CHECKSUM_PARTIAL) {
> +			ret = skb_checksum_help(skb);
> +			if (ret)
> +				return ret;
> +		}
> +		word1 |= TSS_BYPASS_BIT;
> +	} else if (skb->ip_summed == CHECKSUM_PARTIAL) {
>  		int tcp = 0;
>  
> +		/* We do not switch off the checksumming on non TCP/UDP
> +		 * frames: as is shown from tests, the checksumming engine
> +		 * is smart enough to see that a frame is not actually TCP
> +		 * or UDP and then just pass it through without any changes
> +		 * to the frame.
> +		 */
>  		if (skb->protocol == htons(ETH_P_IP)) {
>  			word1 |= TSS_IP_CHKSUM_BIT;
>  			tcp = ip_hdr(skb)->protocol == IPPROTO_TCP;
> @@ -1978,15 +2000,6 @@ static int gmac_change_mtu(struct net_device *netdev, int new_mtu)
>  	return 0;
>  }
>  
> -static netdev_features_t gmac_fix_features(struct net_device *netdev,
> -					   netdev_features_t features)
> -{
> -	if (netdev->mtu + ETH_HLEN + VLAN_HLEN > MTU_SIZE_BIT_MASK)
> -		features &= ~GMAC_OFFLOAD_FEATURES;
> -
> -	return features;
> -}
> -

I think this entire ndo_fix_features() can be indeed removed, but your
justification was not immediately convincing. I'd point out that after
your patch 1/4 "net: ethernet: cortina: Fix MTU max setting", you
actually made this dead code, because netdev->mtu can't be larger than
netdev->max_mtu.

If you reverse the patch order a bit, such that "net: ethernet: cortina:
Handle large frames" comes first, I think it would be much more logical
for the removal of gmac_fix_features() to be part of the commit
"net: ethernet: cortina: Fix MTU max setting", with the simple
justification: the new MTU makes the code stop having any role.

>  static int gmac_set_features(struct net_device *netdev,
>  			     netdev_features_t features)
>  {
> @@ -2212,7 +2225,6 @@ static const struct net_device_ops gmac_351x_ops = {
>  	.ndo_set_mac_address	= gmac_set_mac_address,
>  	.ndo_get_stats64	= gmac_get_stats64,
>  	.ndo_change_mtu		= gmac_change_mtu,
> -	.ndo_fix_features	= gmac_fix_features,
>  	.ndo_set_features	= gmac_set_features,
>  };
>  
> 
> -- 
> 2.34.1
> 



More information about the linux-arm-kernel mailing list