Build failure with verify size in next-20171102

Russell King - ARM Linux linux at armlinux.org.uk
Fri Nov 3 02:12:14 PDT 2017


On Fri, Nov 03, 2017 at 08:22:51AM +0000, Ard Biesheuvel wrote:
> On 3 November 2017 at 00:14, Russell King - ARM Linux
> <linux at armlinux.org.uk> wrote:
> > On Fri, Nov 03, 2017 at 12:04:37AM +0000, Russell King - ARM Linux wrote:
> >> On Thu, Nov 02, 2017 at 04:43:09PM -0700, Tony Lindgren wrote:
> >> > * Russell King - ARM Linux <linux at armlinux.org.uk> [171102 23:22]:
> >> > > On Thu, Nov 02, 2017 at 12:30:27PM -0700, Tony Lindgren wrote:
> >> > > > Hi Russell,
> >> > > >
> >> > > > I think you're already aware of the build failure caused by commit
> >> > > > 078c0927485e ("ARM: verify size of zImage"), but here's info just
> >> > > > in case.
> >> > > >
> >> > > > arm-unknown-linux-musleabi-ld.bfd: error: zImage file size is incorrect
> >> > > > make[2]: *** [arch/arm/boot/compressed/Makefile:185: arch/arm/boot/compressed/vmlinux] Error 1
> >> > > > make[1]: *** [arch/arm/boot/Makefile:64: arch/arm/boot/compressed/vmlinux] Error 2
> >> > > > make: *** [arch/arm/Makefile:335: zImage] Error 2
> >> > > >
> >> > > > Other than that I was surprised that next actually booted for me
> >> > > > after a few weeks' break from linux-next! :)
> >> > >
> >> > > It would be nice if people can investigate why that happens - I'm
> >> > > completely unable to reproduce it locally, even if I link using
> >> > > the same vmlinux.lds file and the objects from someone who sees the
> >> > > failure.
> >> > >
> >> > > There's some binutils version specific stuff that's going on here.
> >> > >
> >> > > What I have in my current for-next, which I'm intending to push,
> >> > > is all the same patches except the patch that introduces the above
> >> > > check is substituted by a patch that produces an extra _edata_real
> >> > > symbol.  This _should_ match _edata.  So, if you hit this failure,
> >> > > try either my current for-next branch or tomorrow's linux-next, and
> >> > > run arm-linux-nm on arch/arm/boot/compressed/vmlinux and check the
> >> > > addresses given for _edata and _edata_real.
> >> >
> >> > OK thanks, reverting 078c0927485e and applying dad4675388fc ("ARM:
> >> > add debug ".edata_real" symbol") from your for-next branch builds
> >> > and boots for me.
> >> >
> >> > > Theory says they should be identical, but the failure of that assert
> >> > > could only happen if "." inside the output section was different from
> >> > > _edata assigned outside.  _edata_real is now the address of "." inside
> >> > > the output section.
> >> >
> >> > With 078c0927485e reverted and dad4675388fc applied they are
> >> > identical for me:
> >> >
> >> > $ ${armcompiler}nm arch/arm/boot/compressed/vmlinux | grep _edata
> >> > 00421960 D _edata
> >> > 00421960 D _edata_real
> >> >
> >> > Let me know if you want me to run some test with the failing
> >> > commit also.
> >>
> >> So:
> >>
> >>   .image_end (NOLOAD) : {
> >>      ASSERT(. == _edata, "...");
> >>   }
> >>
> >> fails, because . != _edata, but:
> >>
> >>   .image_end (NOLOAD) : {
> >>      _edata_real = .;
> >>   }
> >>
> >> produces an image where _edata_real == _edata.
> >>
> >> That's completely insane and illogical.  *Shrug*.  God knows what's
> >> going on there.
> >
> > Reading through the "ld" documentation for ".":
> >
> >      Note: '.' actually refers to the byte offset from the start of the
> >   current containing object.  Normally this is the 'SECTIONS' statement,
> >   whose start address is 0, hence '.' can be used as an absolute address.
> >   If '.' is used inside a section description however, it refers to the
> >   byte offset from the start of that section, not an absolute address.
> >
> > If that _were_ true, then _edata_real should be zero, because it's
> > used inside a section description and it should be the "byte offset
> > from the start of that section, not an absolute address".  It isn't.
> > So, this documentation is blatantly wrong and misleading.  In fact,
> > we use this fact that "." refers to the virtual address in the generic
> > parts of the linker script.
> >
> > I don't think we can trust anything the ld documentation says about "."
> > and by extension, I don't think we can trust ld's behaviour in how "."
> > behaves.
> >
> > Quite where we go from here, I've no idea, because I don't see how we
> > can trust the linker's behaviour with linker scripts.
> >
> 
> Could we try moving the ASSERT after the section, and using
> 
> ASSERT(ADDR(.image_end) == _edata, "...");
> 
> instead?

You want to trust what value ADDR(.image_end) comes out with when we
know that the linker already elides empty output sections?

Only if someone reads the linker's code and confirms that it will work,
and gets assurances from the binutils people that they will maintain
the behaviour into the future.

I'm not inclined to trust anything that the linker appears to do,
whether or not the documentation says anything about it.  As I've
said above, the entire thing is a complete mess of untrustworthy
documentation and behaviour.
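
For anyone who wants to repeat the _edata / _edata_real comparison from
earlier in this thread on their own toolchain, something along these
lines might do.  It is only a rough sketch: compare_syms is a made-up
helper name, not anything in the kernel tree, and it assumes the usual
three-column "address type name" output from nm.  The addresses are
compared as strings on purpose, so that awk does not try to interpret
something like 004e1960 as a floating point number.

```shell
#!/bin/sh
# compare_syms (hypothetical helper): read `nm` output on stdin and
# report whether the two named symbols resolve to the same address.
# Exit status: 0 = match, 1 = mismatch, 2 = a symbol was not found.
compare_syms() {
    awk -v a="$1" -v b="$2" '
        $3 == a { va = $1 }          # remember address of first symbol
        $3 == b { vb = $1 }          # remember address of second symbol
        END {
            if (va == "" || vb == "") { print "missing"; exit 2 }
            # Append "" to force a string comparison of the hex digits.
            if ((va "") == (vb ""))   { print "match " va; exit 0 }
            print "MISMATCH " a "=" va " " b "=" vb
            exit 1
        }'
}
```

Usage would be roughly the following, with whatever prefix your cross
toolchain uses in place of ${armcompiler}:

  ${armcompiler}nm arch/arm/boot/compressed/vmlinux | compare_syms _edata _edata_real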

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


