[LEDE-DEV] WRT1900ACS - Kernel 4.4.12 boot failure

Claudio Leite leitec at gmail.com
Tue Jun 7 06:33:06 PDT 2016


Hello Dheeran,

* Dheeran Senthilvel (dheeranmech at gmail.com) wrote:
> Hi,
> 	Yes even after I run resetenv followed by a reset command, the fw_printenv produces the same output every time. 
> 
> > On 07-Jun-2016, at 3:54 PM, Claudio Leite <leitec at gmail.com> wrote:
> > 
> > Hi,
> > 
> > * Dheeran Senthilvel (dheeranmech at gmail.com) wrote:
> >> Hi,
> >>       Thanks for the reply. But I already documented it in my previous
> >> mails in May 2016.
> >> This is temporary, once again when reboot is done the same error occurs.
> > 
> > I see, sorry for the noise.
> > 
> >> 
> >>       I haven't been able to solve this issue. Also if altnandboot is
> >> performed and used having lede in nandboot there seems to be no problem.
> >> But once lede boots the error is imminent upon reboot.
> > 
> > I'm not sure I understand this part. Is the same image, with no custom
> > settings, stored in both partitions?
> Sorry for that! What I mean is that if I boot 'altnandboot' (the official
> recovery firmware) with LEDE in the primary_image partition, the problem
> doesn't occur. Once I invoke nandboot, let LEDE boot, and issue a reboot
> command, the error occurs. I mention this to clarify that the problem is
> in LEDE and not in the hardware or u-boot.

OK, I understand now: you have the factory firmware on the secondary
partition.

> > 
> > What does fw_printenv look like after a "resetenv; reset" and boot into
> > a recent LEDE? At that point, the linksys-recovery stuff will already
> > have run.
> But I really don't understand the problem. This OpenWrt wiki page,
> https://wiki.openwrt.org/doc/techref/bootloader/uboot.config , shows that
> this kind of error occurs only when the partition map is incorrect (that
> is what I understood), and the following command shows that the flash
> mapping recognised by LEDE is incorrect.

It's not incorrect; it's just how LEDE/OpenWrt defines it. The partition
map is baked into the kernel image via the dtb. The relevant partition
is mtd1, u_env (and s_env, for the 'resetbc' stuff) which is correctly
defined and matches up with the bootloader's (and factory firmware's)
expectations.

From your own output:

lede:

[    0.973995] 0x000000000000-0x000000200000 : "u-boot"
[    0.979273] 0x000000200000-0x000000240000 : "u_env"
[    0.984435] 0x000000240000-0x000000280000 : "s_env"

factory:

0x000000000000-0x000000200000 : "uboot"
0x000000200000-0x000000240000 : "u_env"
0x000000240000-0x000000280000 : "s_env"
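
The match between the two listings can also be checked mechanically. The following is only a minimal sketch: `part_ranges` is a hypothetical helper name, and it assumes the partition lines have the exact `start-end : "name"` shape shown above, with or without the dmesg timestamp.

```shell
# part_ranges (hypothetical helper): read MTD partition lines on stdin and
# print "name start size_in_bytes" for each one. A leading dmesg timestamp,
# if present, is ignored.
part_ranges() {
    sed -n 's/^.*\(0x[0-9a-f]*\)-\(0x[0-9a-f]*\) : "\(.*\)"$/\3 \1 \2/p' |
    while read -r name start end; do
        # Shell arithmetic accepts the 0x-prefixed hex values directly.
        printf '%s %s %d\n' "$name" "$start" "$(( $end - $start ))"
    done
}
```

Fed the lede and factory lines quoted above, this should report the same start offset and the same size (0x40000, i.e. 262144 bytes) for u_env and s_env in both; only the name of the first partition differs ("u-boot" vs "uboot").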

Given the output you posted from fw_printenv earlier,

root@WRT1900ACS:/# fw_printenv
bootcmd=bootp; setenv bootargs root=/dev/nfs
nfsroot=${serverip}:${rootpath}
ip=${ipaddr}:${serverip}:${gatewayip}:${netmask}:${hostname}::off; bootm
bootdelay=5
baudrate=115200
auto_recovery=yes 

there's some very strange corruption going on. It would help to
determine whether this happens during the auto_recovery/resetbc stuff
at boot, when it writes to u_env. If something is still off with the
NAND driver, that write could cause the corruption you see.

Could you try this:

1. reboot and run "resetenv; reset"
2. boot into LEDE, but drop into failsafe mode (when prompted, type
f<enter>)

3. from failsafe mode, run
	strings /dev/mtd1

   This should print the complete, long set of default variables.

4. still from failsafe, run 'reboot'

During this time nothing new is written to u_env, so after the reboot it
should still be intact. The 'printenv' command in u-boot should list the
full set of variables, and the router should boot normally.

Then let LEDE boot normally. If the fw_setenv call during the boot
sequence is what causes the corruption, fw_printenv or "strings /dev/mtd1"
will now show corrupted output.
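
To make the before/after comparison less dependent on eyeballing a long dump, the output could be reduced to a short summary that is easy to note down in failsafe mode and compare after the normal boot. This is only a sketch: `env_summary` is a hypothetical helper name, and it assumes `grep`, `sort`, `md5sum` and `cut` are available on the image.

```shell
# env_summary (hypothetical helper): read a variable dump on stdin, e.g. from
#   strings /dev/mtd1 | env_summary
# and print how many name=value lines it contains plus an MD5 of those lines.
# The body runs in a subshell so its variables do not leak into the caller.
env_summary() (
    # Keep only lines that look like name=value; sort so order does not matter.
    vars=$(grep -E '^[A-Za-z_][A-Za-z0-9_]*=' | sort)
    count=$(printf '%s\n' "$vars" | grep -c '=')
    sum=$(printf '%s\n' "$vars" | md5sum | cut -d' ' -f1)
    printf 'vars=%s md5=%s\n' "$count" "$sum"
)
```

A changed checksum on its own does not prove corruption, since the boot sequence may legitimately update a variable. A variable count that drops from the long default set to just a handful of entries is the clearer sign.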

-Claudio


