On Thu, Nov 28, 2024 at 10:23:10AM +0100, Ahmad Fatoum wrote:
> I assume this should be v2022.04? -dirty means you have local patches
> on top. Do any of them touch SoC-specific, board-specific parts
> like clock or power?

Yes, it is "barebox 2022.04.0-dirty #1 Tue Sep 10 08:45:54 UTC 2024".
The patches we apply do not touch any clock or power code. We only touch
configuration: environment, kernel cmdline, watchdog settings,
bootchooser config, autoabortkey.

> What changed over the last week on the software side? I understand barebox
> stayed the same? Is the kernel still the same?

We changed nothing. I have been shipping this barebox version with this
kernel for a couple of months. Last week we only ramped up the quantity,
but the failure rate is so high that it should have happened a couple of
times before.

> On affected hardware: Does this happen always or only some times?

Always. It is easily reproducible. Meanwhile I have realized that on
affected BBBs it can be reproduced this way: boot, hit Ctrl-C to stop
barebox at the prompt, then press the S1 button, which is wired to the
NRESET_INOUT ball A10 (it is S1, not S2 as I initially wrote). The system
is then stuck/frozen/dead.

> This sounds very similar to the issue fixed in commit 9c1a78f959dd
> ("Revert "ARM: beaglebone: init MPU speed to 800Mhz""), but that's already
> included in v2022.04.0, hence the question if you have patches that
> do anything similar.

Sounds interesting, I will take a look. As said, we do not patch clocks,
voltages or anything like that.

> Yes, but it sounds strange that only now these problems pop up?

Yes. Last week we started to experience this problem in production: we
have ~200 working BBBs, and ~20 have this problem. The batch worked
flawlessly, but suddenly a number of broken BBBs piled up one day, and
now it happens from time to time. I am not even sure whether the software
is to blame or whether the hardware is, or has become, glitchy. However,
after flashing stock U-Boot, these devices are still able to
reset/restart on their own.

> Besides checking what changed, you should check if Linux is playing
> around with the voltages powering the SoC and if it does, disable that
> to see if it improves the situation.

Sadly (or gladly?) Linux is not involved on affected BBBs: boot, stop in
the bootloader, hit S1, and the system freezes.

> Your barebox restart handler is probably am33xx_restart_soc (named
> "soc" in reset -l output).

I will poke around. I have never dealt with reset code in my life :-)

Regards
Konsti

-- 
INSIDE M2M GmbH
Konstantin Kletschke
Berenbosteler Straße 76 B
30823 Garbsen

Telefon: +49 (0) 5137 90950136
Mobil: +49 (0) 151 15256238
Fax: +49 (0) 5137 9095010

konstantin.kletschke@xxxxxxxxxxxxx
http://www.inside-m2m.de

Geschäftsführung: Michael Emmert, Derek Uhlig
HRB: 111204, AG Hannover