FEC ethernet issues [Was: PL310 errata workarounds]
Jaccon Bastiaansen
jaccon.bastiaansen at gmail.com
Thu May 8 02:23:15 PDT 2014
Hello all,
2014-05-02 13:41 GMT+02:00 Russell King - ARM Linux <linux at arm.linux.org.uk>:
> On Tue, Apr 29, 2014 at 11:05:04AM +0200, Jaccon Bastiaansen wrote:
>> Hello all,
>>
>> I tried the FEC patches from Russell
>> (http://ftp.arm.linux.org.uk/cgit/linux-arm.git/log/?h=fec-testing),
>> but with the following test the FEC completely stops receiving frames:
>>
>>
>> Linux PC <-----> Gigabit Ethernet switch <------> SabreSD board
>>
>> Both ethernet links run at 1 Gigabit, full duplex.
>>
>> The SabreSD board runs the fec-testing kernel from Russell.
>>
>> On the SabreSD board we run "iperf -s -u"
>>
>> On the Linux PC we
>> - run the iperf client (command used: "while [ 1 ]; do iperf -c 'IP
>> address of SabreSD' -u -b 100m -t 300 -l 256; sleep 1; done")
>> - ping the SabreSD board every second
>>
>> After a while (sometimes 10 seconds, sometimes a couple of minutes),
>> we see that the SabreSD board stops replying to the pings from the
>> Linux PC. Closer inspection shows that the FEC doesn't generate
>> receive frame interrupts anymore. The attached debugger screenshots
>> show the FEC registers when it is in this state. The RXF interrupt is
>> enabled and we know that the FEC is still receiving ethernet frames
>> (because the IEEE_R_FRAME_OK event counter is increasing), but no
>> receive frame interrupts are occurring. We only see MII interrupts
>> occurring.
>
> If the RXF interrupt is enabled, it means that we must have received
> less than NAPI_POLL_WEIGHT (64) frames from the ring during the previous
> NAPI poll - this is the only circumstance when we will re-enable
> interrupts.
>
> There are only two conditions where we stop receiving frames from the
> receive ring:
> 1. if we reach the NAPI poll weight number of frames received.
> 2. if we encounter a packet descriptor marked empty.
>
> It can't be (1) because we would have left the TXF/RXF interrupts
> disabled and waited for the next NAPI poll. So, it can only be (2).
>
> (2) implies that there is a ring descriptor which is marked as being
> owned by the FEC, which means that there are descriptors free. Whether
> it's the one which the FEC is expecting to be free or not is something
> that's impossible to tell (the FEC hardware doesn't tell us where it
> is in its ring, which is a big minus point against debugging.)
>
> It would be useful to see the state of the receive ring, as well as
> the rx_next index. That may be difficult to get though.
>
I added some code to show the RX ring when the reception of frames has
stopped. See the patch below. Reading the fec_rx_ring file gives the
following output (I stripped the bit decoding here to reduce the
number of lines):
0: c0a0d000 0x0800 (== rx_bd_base)
1: c0a0d020 0x0800
2: c0a0d040 0x0800
3: c0a0d060 0x0800
4: c0a0d080 0x0800
5: c0a0d0a0 0x0800
6: c0a0d0c0 0x0800 (== rx_next)
7: c0a0d0e0 0x0800
8: c0a0d100 0x0800
9: c0a0d120 0x0800
10: c0a0d140 0x0800
11: c0a0d160 0x0800
12: c0a0d180 0x0800
13: c0a0d1a0 0x0800
14: c0a0d1c0 0x0800
15: c0a0d1e0 0x0800
16: c0a0d200 0x0800
17: c0a0d220 0x0800
18: c0a0d240 0x0800
19: c0a0d260 0x0800
20: c0a0d280 0x0800
21: c0a0d2a0 0x0800
22: c0a0d2c0 0x0800
23: c0a0d2e0 0x0800
24: c0a0d300 0x0800
25: c0a0d320 0x0800
26: c0a0d340 0x0800
27: c0a0d360 0x0800
28: c0a0d380 0x0800
29: c0a0d3a0 0x0800
30: c0a0d3c0 0x0800
31: c0a0d3e0 0x0800
32: c0a0d400 0x0800
33: c0a0d420 0x0800
34: c0a0d440 0x0800
35: c0a0d460 0x0800
36: c0a0d480 0x0800
37: c0a0d4a0 0x0800
38: c0a0d4c0 0x0800
39: c0a0d4e0 0x0800
40: c0a0d500 0x0800
41: c0a0d520 0x0800
42: c0a0d540 0x0800
43: c0a0d560 0x0800
44: c0a0d580 0x0800
45: c0a0d5a0 0x0800
46: c0a0d5c0 0x0800
47: c0a0d5e0 0x0800
48: c0a0d600 0x0800
49: c0a0d620 0x0800
50: c0a0d640 0x0800
51: c0a0d660 0x0800
52: c0a0d680 0x0800
53: c0a0d6a0 0x0800
54: c0a0d6c0 0x0800
55: c0a0d6e0 0x0800
56: c0a0d700 0x0800
57: c0a0d720 0x0800
58: c0a0d740 0x0800
59: c0a0d760 0x0800
60: c0a0d780 0x0800
61: c0a0d7a0 0x0800
62: c0a0d7c0 0x0800
63: c0a0d7e0 0x0800
64: c0a0d800 0x0800
65: c0a0d820 0x0800
66: c0a0d840 0x0800
67: c0a0d860 0x0800
68: c0a0d880 0x0800
69: c0a0d8a0 0x0800
70: c0a0d8c0 0x0800
71: c0a0d8e0 0x0800
72: c0a0d900 0x0800
73: c0a0d920 0x0800
74: c0a0d940 0x0800
75: c0a0d960 0x0800
76: c0a0d980 0x0800
77: c0a0d9a0 0x0800
78: c0a0d9c0 0x0800
79: c0a0d9e0 0x0800
80: c0a0da00 0x0800
81: c0a0da20 0x0800
82: c0a0da40 0x0800
83: c0a0da60 0x0800
84: c0a0da80 0x0800
85: c0a0daa0 0x0800
86: c0a0dac0 0x0800
87: c0a0dae0 0x0800
88: c0a0db00 0x0800
89: c0a0db20 0x0800
90: c0a0db40 0x0800
91: c0a0db60 0x0800
92: c0a0db80 0x0800
93: c0a0dba0 0x0800
94: c0a0dbc0 0x0800
95: c0a0dbe0 0x0800
96: c0a0dc00 0x0800
97: c0a0dc20 0x0800
98: c0a0dc40 0x0800
99: c0a0dc60 0x0800
100: c0a0dc80 0x0800
101: c0a0dca0 0x0800
102: c0a0dcc0 0x0800
103: c0a0dce0 0x0800
104: c0a0dd00 0x0800
105: c0a0dd20 0x0800
106: c0a0dd40 0x0800
107: c0a0dd60 0x0800
108: c0a0dd80 0x0800
109: c0a0dda0 0x0800
110: c0a0ddc0 0x0800
111: c0a0dde0 0x0800
112: c0a0de00 0x0800
113: c0a0de20 0x0800
114: c0a0de40 0x0800
115: c0a0de60 0x0800
116: c0a0de80 0x0800
117: c0a0dea0 0x0800
118: c0a0dec0 0x0800
119: c0a0dee0 0x0800
120: c0a0df00 0x0800
121: c0a0df20 0x0800
122: c0a0df40 0x0800
123: c0a0df60 0x0800
124: c0a0df80 0x0800
125: c0a0dfa0 0x0800
126: c0a0dfc0 0x0800
127: c0a0dfe0 0x2800
None of the 128 descriptors has the EMPTY bit (0x8000) set, so the
complete receive ring is filled with received frames. This matches the
value 0 in the RDAR register (FEC_R_DES_ACTIVE).
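
To make the reading of the dump explicit, here is a minimal sketch that
checks the status words for the EMPTY bit. The bit values are copied
from the BDRX_* definitions in the patch; the helper names
count_empty() and ring_is_full() are hypothetical and are not part of
the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values copied from the BDRX_* definitions in the patch. */
#define BDRX_EMPTY (1u << 15) /* descriptor is owned by the FEC, no frame yet */
#define BDRX_WRAP  (1u << 13) /* last descriptor of the ring */
#define BDRX_LAST  (1u << 11) /* buffer holds the end of a frame */

/* Hypothetical helper: count the descriptors the FEC can still write to. */
static size_t count_empty(const uint16_t *sc, size_t n)
{
	size_t i, empty = 0;

	assert(sc != NULL);
	for (i = 0; i < n; i++)
		if (sc[i] & BDRX_EMPTY)
			empty++;
	return empty;
}

/* Hypothetical helper: the ring is full when no descriptor is marked empty. */
static int ring_is_full(const uint16_t *sc, size_t n)
{
	return count_empty(sc, n) == 0;
}
```

With the values from the dump (0x0800 for entries 0 to 126, which is
LAST only, and 0x2800 for entry 127, which is WRAP plus LAST),
ring_is_full() returns 1: the FEC has no descriptor left to receive
into.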

The patched FEC driver also logs "FEC: receive ring full!!!" when the
receive ring is full (RDAR reads 0) after the napi_complete() call and
before the RXF interrupt is enabled. The kernel log shows this message
a couple of times, so the RX ring was filled completely between the
fec_enet_rx() call and the napi_complete() call. In this test case
(100 Mbit/s of 256-byte frames) it takes less than 3 milliseconds to
fill the RX ring completely. So a preemption between the fec_enet_rx()
call and the napi_complete() call, or a long execution time of the
fec_enet_tx() function (which is called between fec_enet_rx() and
napi_complete()), results in the RXF interrupt being enabled while the
RX ring is completely full. It seems that in that case the RXF
interrupt is never generated again.
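
As a rough check of the 3 millisecond figure, the fill time can be
estimated as below. This is only a sketch under assumptions: the
function name ring_fill_time_us() is hypothetical, the iperf -b rate
is taken to count UDP payload only, every frame is assumed to use
exactly one descriptor, and the traffic is assumed to be evenly paced.

```c
#include <assert.h>

/*
 * Time in microseconds needed to receive ring_size datagrams when the
 * sender paces payload_bytes of UDP payload per datagram at rate_bps
 * bits per second. Header and framing overhead are ignored
 * (assumption: the iperf -b rate refers to payload).
 */
static double ring_fill_time_us(unsigned int ring_size,
				unsigned int payload_bytes,
				double rate_bps)
{
	assert(rate_bps > 0.0);
	return (double)ring_size * payload_bytes * 8.0 / rate_bps * 1e6;
}
```

For the 128 descriptors in the dump, -l 256 and -b 100m,
ring_fill_time_us(128, 256, 100e6) gives about 2621 microseconds, so a
delay of roughly 2.6 ms between fec_enet_rx() and the interrupt enable
would be enough to find the ring full. Bursts arriving at the 1 Gbit/s
link rate could shorten this further.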
@@ -23,17 +23,20 @@
 #include <linux/module.h>
 #include <linux/kernel.h>
 #include <linux/string.h>
 #include <linux/ptrace.h>
+#include <linux/debugfs.h>
 #include <linux/errno.h>
 #include <linux/ioport.h>
 #include <linux/slab.h>
 #include <linux/interrupt.h>
 #include <linux/delay.h>
 #include <linux/netdevice.h>
 #include <linux/etherdevice.h>
+#include <linux/sched.h>
+#include <linux/seq_file.h>
 #include <linux/skbuff.h>
 #include <linux/spinlock.h>
 #include <linux/workqueue.h>
 #include <linux/bitops.h>
 #include <linux/io.h>
@@ -1164,11 +1167,17 @@ static int fec_enet_rx_napi(struct napi_struct *napi, int budget)
 	fec_enet_tx(ndev);
 	if (pkts < budget) {
 		napi_complete(napi);
-		writel(FEC_DEFAULT_IMASK, fep->hwp + FEC_IMASK);
+		if (readl(fep->hwp + FEC_R_DES_ACTIVE) == 0) {
+			writel(FEC_DEFAULT_IMASK, fep->hwp + FEC_IMASK);
+			printk(KERN_ERR "%llu FEC: receive ring full!!!\n",
+			       sched_clock());
+		} else {
+			writel(FEC_DEFAULT_IMASK, fep->hwp + FEC_IMASK);
+		}
 	}
 	return pkts;
 }
/* ------------------------------------------------------------------------- */
@@ -2404,10 +2413,79 @@ static void fec_reset_phy(struct platform_device *pdev)
 	 * by machine code.
 	 */
 }
 #endif /* CONFIG_OF */
+#define BDRX_EMPTY	(1 << 15)
+#define BDRX_RO1	(1 << 14)
+#define BDRX_WRAP	(1 << 13)
+#define BDRX_RO2	(1 << 12)
+#define BDRX_LAST	(1 << 11)
+#define BDRX_UNDEF10	(1 << 10)
+#define BDRX_UNDEF9	(1 << 9)
+#define BDRX_MISS	(1 << 8)
+#define BDRX_BCAST	(1 << 7)
+#define BDRX_MCAST	(1 << 6)
+#define BDRX_LENERR	(1 << 5)
+#define BDRX_ALGNERR	(1 << 4)
+#define BDRX_UNDEF3	(1 << 3)
+#define BDRX_CRCERR	(1 << 2)
+#define BDRX_OVERRUN	(1 << 1)
+#define BDRX_TRUNC	(1 << 0)
+
+
+static int fec_rx_ring_show(struct seq_file *s, void *p)
+{
+	struct fec_enet_private *fep = s->private;
+	int nbds;
+	union bufdesc_u *bd;
+
+	bd = fep->rx_bd_base;
+	for (nbds = 0; nbds < fep->rx_ring_size; nbds++) {
+		seq_printf(s, "%d: %p 0x%.4x", nbds, bd, bd->ebd.desc.cbd_sc);
+		if (bd == fep->rx_bd_base)
+			seq_printf(s, " (== rx_bd_base)");
+		if (nbds == fep->rx_next)
+			seq_printf(s, " (== rx_next)");
+		seq_printf(s, "\n");
+		seq_printf(s,
+			" [15]%s [14]%s [13]%s [12]%s [11]%s [10]%s [ 9]%s [ 8]%s"
+			" [ 7]%s [ 6]%s [ 5]%s [ 4]%s [ 3]%s [ 2]%s [ 1]%s [ 0]%s\n",
+			bd->ebd.desc.cbd_sc & BDRX_EMPTY ? "EMP" : "emp",
+			bd->ebd.desc.cbd_sc & BDRX_RO1 ? "RO1" : "ro1",
+			bd->ebd.desc.cbd_sc & BDRX_WRAP ? "WRP" : "wrp",
+			bd->ebd.desc.cbd_sc & BDRX_RO2 ? "RO2" : "ro2",
+			bd->ebd.desc.cbd_sc & BDRX_LAST ? "LST" : "lst",
+			bd->ebd.desc.cbd_sc & BDRX_UNDEF10 ? "UNA" : "una",
+			bd->ebd.desc.cbd_sc & BDRX_UNDEF9 ? "UN9" : "un9",
+			bd->ebd.desc.cbd_sc & BDRX_MISS ? "MIS" : "mis",
+			bd->ebd.desc.cbd_sc & BDRX_BCAST ? "BCS" : "bcs",
+			bd->ebd.desc.cbd_sc & BDRX_MCAST ? "MCS" : "mcs",
+			bd->ebd.desc.cbd_sc & BDRX_LENERR ? "LEN" : "len",
+			bd->ebd.desc.cbd_sc & BDRX_ALGNERR ? "ALN" : "aln",
+			bd->ebd.desc.cbd_sc & BDRX_UNDEF3 ? "UN3" : "un3",
+			bd->ebd.desc.cbd_sc & BDRX_CRCERR ? "CRC" : "crc",
+			bd->ebd.desc.cbd_sc & BDRX_OVERRUN ? "OVR" : "ovr",
+			bd->ebd.desc.cbd_sc & BDRX_TRUNC ? "TRC" : "trc");
+		bd += 1;
+	}
+	return 0;
+}
+
+static int fec_rx_ring_open(struct inode *inode, struct file *file)
+{
+	struct fec_enet_private *fep = inode->i_private;
+	return single_open(file, fec_rx_ring_show, fep);
+}
+
+static const struct file_operations fec_rx_ring_fops = {
+	.open = fec_rx_ring_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = single_release,
+};
+
 static int
 fec_probe(struct platform_device *pdev)
 {
 	struct fec_enet_private *fep;
 	struct fec_platform_data *pdata;
@@ -2558,10 +2636,18 @@ fec_probe(struct platform_device *pdev)
 	if (fep->flags & FEC_FLAG_BUFDESC_EX && fep->ptp_clock)
 		netdev_info(ndev, "registered PHC device %d\n", fep->dev_id);
 	INIT_WORK(&fep->tx_timeout_work, fec_enet_timeout_work);
+
+	if (!debugfs_create_file("fec_rx_ring",
+				 S_IFREG | S_IRUGO,
+				 NULL,
+				 fep,
+				 &fec_rx_ring_fops))
+		goto failed_register;
+
 	return 0;
 failed_register:
 	fec_enet_mii_remove(fep);
 failed_mii_init:
Regards,
Jaccon