Random reboots on ODROID-N2+
Stefan Agner
stefan at agner.ch
Tue May 18 03:15:51 PDT 2021
On 2021-05-18 03:33, Andrew Lunn wrote:
> On Mon, May 17, 2021 at 11:14:18AM +0200, Stefan Agner wrote:
>> Hi,
>>
>> We are currently testing a new release using Linux 5.10.33. Since then
>> I've received several reports of random reboots every couple of days.
>> Unfortunately the log (journald) doesn't show anything, just a hard cut
>> at some point.
>>
>> After running serial console on several instances, I was able to catch
>> this stack trace:
>>
>> [202983.988153] SError Interrupt on CPU3, code 0xbf000000 -- SError
>> [202983.988155] CPU: 3 PID: 3463 Comm: mdns-repeater Not tainted 5.10.33 #1
>> [202983.988156] Hardware name: Hardkernel ODROID-N2Plus (DT)
>> [202983.988157] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
>> [202983.988158] pc : udp_send_skb.isra.0+0x178/0x390
>> [202983.988159] lr : udp_send_skb.isra.0+0x130/0x390
>
> Hi Stefan
Hi Andrew,
>
> Could you generate net/ipv4/udp.lst so we can see what
> udp_send_skb.isra.0+0x178/0x390 is trying to do, and what bit of C
> code it maps to.
OK, I built net/ipv4/udp.lst using the same build environment (buildroot)
that built the kernel which generated the stack trace, so I think the
addresses should match:
ffff800010c1bb60 <udp_send_skb.isra.0>:
static int udp_send_skb(struct sk_buff *skb, struct flowi4 *fl4,
...
        udp4_hwcsum(skb, fl4->saddr, fl4->daddr);
ffff800010c1bc78:   29450ae1    ldp   w1, w2, [x23, #40]
ffff800010c1bc7c:   aa1303e0    mov   x0, x19
ffff800010c1bc80:   94000000    bl    ffff800010c184b0 <udp4_hwcsum>
            ffff800010c1bc80: R_AARCH64_CALL26  udp4_hwcsum
    err = ip_send_skb(sock_net(sk), skb);
ffff800010c1bc84:   f9401ac0    ldr   x0, [x22, #48]
ffff800010c1bc88:   aa1303e1    mov   x1, x19
ffff800010c1bc8c:   94000000    bl    0 <ip_send_skb>
            ffff800010c1bc8c: R_AARCH64_CALL26  ip_send_skb
    if (err) {
ffff800010c1bc90:   350008e0    cbnz  w0, ffff800010c1bdac <udp_send_skb.isra.0+0x24c>
...
    u64 pc = READ_ONCE(ti->preempt_count);
ffff800010c1bcd4:   f9400820    ldr   x0, [x1, #16]
    WRITE_ONCE(ti->preempt.count, --pc);
ffff800010c1bcd8:   d1000400    sub   x0, x0, #0x1
ffff800010c1bcdc:   b9001020    str   w0, [x1, #16]
    return !pc || !READ_ONCE(ti->preempt_count);
...
The full udp.lst file:
https://drive.google.com/file/d/1j0RKOfuMXmCRWILpkG3uk_beohWrr-ho/view?usp=sharing
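
To cross-check that the trace and the listing line up, the pc and lr
offsets can be added to the function's start address
(ffff800010c1bb60, from the listing above). This is only a minimal
sketch: the helper name resolve() and the constant FUNC_BASE are made up
for illustration and are not part of any kernel tooling.

```python
import re

# Start address of udp_send_skb.isra.0, copied from the udp.lst excerpt.
FUNC_BASE = 0xFFFF800010C1BB60


def resolve(frame, base):
    """Split a 'symbol+0xoff/0xsize' frame; return (symbol, absolute address)."""
    m = re.fullmatch(r"(\S+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", frame)
    if m is None:
        raise ValueError(f"unrecognised frame: {frame!r}")
    symbol = m.group(1)
    offset = int(m.group(2), 16)
    size = int(m.group(3), 16)
    if offset >= size:
        raise ValueError("offset lies outside the function")
    return symbol, base + offset


# The two frames reported in the SError trace.
for reg, frame in (("pc", "udp_send_skb.isra.0+0x178/0x390"),
                   ("lr", "udp_send_skb.isra.0+0x130/0x390")):
    symbol, addr = resolve(frame, FUNC_BASE)
    print(f"{reg}: {symbol} -> {addr:016x}")
```

This gives ffff800010c1bcd8 for pc (the "sub x0, x0, #0x1" of the
preempt count decrement) and ffff800010c1bc90 for lr (the cbnz right
after the call to ip_send_skb, which itself directly follows the call to
udp4_hwcsum).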
Since I only have this one trace, I am not 100% sure whether this trace
is just a random one or whether it always fails at this spot.
But things seem to add up to me: mdns-repeater deals with UDP packets,
and it seems (from lr) that the code tries to make use of HW
checksumming? This would explain why only this platform shows the problem.
--
Stefan