Random stack corruption on v5.13 with dra76

Tony Lindgren tony at atomide.com
Fri May 21 02:14:19 PDT 2021


* Tomi Valkeinen <tomi.valkeinen at ideasonboard.com> [210521 08:45]:
> On 21/05/2021 10:39, Tony Lindgren wrote:
> > * Tomi Valkeinen <tomi.valkeinen at ideasonboard.com> [210521 07:05]:
> > > On 21/05/2021 08:36, Tony Lindgren wrote:
> > > > * Tomi Valkeinen <tomi.valkeinen at ideasonboard.com> [210520 08:27]:
> > > > > Hi,
> > > > > 
> > > > > I've noticed that the v5.13 rcs crash randomly (but quite often) on dra76 evm
> > > > > (I haven't tested other boards). Anyone else seen this problem?
> > > > 
> > > > I have not seen this so far and beagle-x15 is behaving for me.
> > > > 
> > > > Does it always happen on boot?
> > > 
> > > No, but quite often. I can't really say how often, as it's annoyingly random.
> > > I tried to bisect, but that proved to be difficult as sometimes I get multiple (5+)
> > > successful boots before the crash.
> > > 
> > > I tested with x15, same issue (below). So... Something in my kernel config? Or compiler?
> > > Looks like the crash happens always very soon after (or during) probing palmas.
> > 
> > After about 10 reboots with your .config I'm seeing it now too on
> > beagle-x15. So far no luck reproducing it with omap2plus_defconfig.
> 
> I think I have an easy way to see if a kernel is good or bad, by printing
> stack_not_used(current) in the first call to omap_i2c_xfer_irq(). There's a
> huge drop between v5.12 and v5.13-rc1.
> 
> And interestingly, sometimes a simple printk seems to use hundreds of bytes
> of stack (i.e. compare stack usage before and after the print). But not
> always. So maybe the issue is somehow related to printk.
> 
> I'm bisecting.

OK sounds good to me.

Thanks,

Tony
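
For readers following along: the check quoted above relies on a high-water
mark, not on the live stack pointer. As far as the mainline sources go,
stack_not_used() is only available with CONFIG_DEBUG_STACK_USAGE and works by
scanning from the far end of the task stack for the first non-zero word.
Below is a minimal userspace sketch of that idea, assuming a
downward-growing stack. It is not the kernel code, and every name in it
(model_stack, model_stack_touch, model_stack_not_used, MODEL_STACK_WORDS) is
hypothetical and made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical size of the simulated stack, in machine words. */
#define MODEL_STACK_WORDS 2048

/* Simulated task stack: index 0 is the far end, the last index is the top. */
struct model_stack {
    unsigned long words[MODEL_STACK_WORDS];
};

/* Zero-fill the stack so that untouched words can be recognised later. */
static void model_stack_init(struct model_stack *s)
{
    memset(s->words, 0, sizeof(s->words));
}

/*
 * Simulate a call chain that dirties `bytes` bytes, starting at the top of
 * the stack and growing downwards. Requests larger than the stack are
 * clamped to the stack size.
 */
static void model_stack_touch(struct model_stack *s, size_t bytes)
{
    size_t n = bytes / sizeof(unsigned long);

    if (n > MODEL_STACK_WORDS)
        n = MODEL_STACK_WORDS;
    for (size_t i = 0; i < n; i++)
        s->words[MODEL_STACK_WORDS - 1 - i] = 0xdeadbeefUL;
}

/*
 * Scan from the far end towards the top and return how many bytes were
 * never written with a non-zero value. This is a high-water mark: it only
 * shrinks when some call chain goes deeper than any earlier one.
 */
static size_t model_stack_not_used(const struct model_stack *s)
{
    size_t i = 0;

    while (i < MODEL_STACK_WORDS && s->words[i] == 0)
        i++;
    return i * sizeof(unsigned long);
}
```

In this model, a shallower call made after a deeper one leaves the reported
value unchanged, so a before/after comparison around a single call only shows
a difference when that call reaches deeper than anything that ran earlier.
Words that were used but happen to hold zero are also invisible to the scan.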


