Reset on Beaglebone Black has become unreliable/broken
Konstantin Kletschke
konstantin.kletschke at inside-m2m.de
Thu Nov 28 01:46:10 PST 2024
On Thu, Nov 28, 2024 at 10:23:10AM +0100, Ahmad Fatoum wrote:
> I assume this should be v2022.04? -dirty means you have local patches
> on top. Do any of them touch SoC-specific, board-specific parts
> like clock or power?
Yes, it is "barebox 2022.04.0-dirty #1 Tue Sep 10 08:45:54 UTC 2024".
The patches we apply do not touch any clock or power code; they touch:
Environment, kernel cmdline, watchdog settings, bootchooser config,
autoabortkey. Config stuff.
> What changed over the last week on the software side? I understand barebox
> stayed the same? Is the kernel still the same?
We changed nothing. I have been shipping this barebox version together
with the kernel for a couple of months. Last week we only ramped up
quantity, but the failure rate is so high that it should have happened a
couple of times before.
> On affected hardware: Does this happen always or only some times?
Always. Easily reproducible.
Meanwhile I realized on affected BBBs it can be reproduced this way:
Boot, hit Ctrl-C to stop barebox at prompt.
Hit the S1 button, which is wired to NRESET_INOUT ball A10 (it's not S2
as I initially wrote, but S1).
System is stuck/frozen/dead.
> This sounds very similar to the issue fixed in commit 9c1a78f959dd
> ("Revert "ARM: beaglebone: init MPU speed to 800Mhz""), but that's already
> included in v2022.04.0, hence the question if you have patches that
> do anything similar.
Sounds interesting, I will take a look. As said, we do not patch clocks,
voltages or anything like that.
> Yes, but it sounds strange that only now these problems pop up?
Yes. Last week we started to experience this problem in production; we
have ~200 working BBBs and ~20 with this problem. The batch worked
flawlessly, but suddenly one day a couple of broken BBBs piled up, and
now it happens from time to time.
I am not even sure whether software is to blame or whether the hardware
is, or has become, glitchy, but with stock U-Boot flashed, reset/restart
still works on these devices.
> Besides checking what changed, you should check if Linux is playing
> around with the voltages powering the SoC and if it does, disable that
> to see if it improves the situation.
Sadly (or gladly?) Linux is not involved on affected BBBs. Boot, stop in
bootloader, hit S1, system freezes.
> Your barebox restart handler is probably am33xx_restart_soc (named
> "soc" in reset -l output).
I will poke around; I have never in my life dealt with reset code :-)
Regards
Konsti