Re: ❌ FAIL: Test report for kernel 5.11.0-rc7 (arm-next)

Veronika Kabatova vkabatov at redhat.com
Thu Feb 11 07:25:34 EST 2021



----- Original Message -----
> From: "Will Deacon" <will at kernel.org>
> To: "Veronika Kabatova" <vkabatov at redhat.com>
> Cc: "catalin marinas" <catalin.marinas at arm.com>, linux-arm-kernel at lists.infradead.org, "CKI Project"
> <cki-project at redhat.com>
> Sent: Thursday, February 11, 2021 12:50:50 PM
> Subject: Re: ❌ FAIL: Test report for kernel 5.11.0-rc7 (arm-next)
> 
> On Thu, Feb 11, 2021 at 05:46:02AM -0500, Veronika Kabatova wrote:
> > > On Wed, Feb 10, 2021 at 08:17:53PM +0000, Will Deacon wrote:
> > > > On Wed, Feb 10, 2021 at 02:31:45PM -0500, Veronika Kabatova wrote:
> > > > > > > > > > The machine in question can of course be somewhat flaky (hard to
> > > > > > > > > > eliminate that possibility completely), but I checked our
> > > > > > > > > > historical data and it didn't fail to boot a single time other
> > > > > > > > > > than with these two new kernels.
> > > > > > > > 
> > > > > > > > > So the first thing we should probably try is whether vanilla -rc7
> > > > > > > > > fails on the machine causing us problems. If it does, then the
> > > > > > > > > arm64 queue for 5.12 is out of the equation; if not, then we can
> > > > > > > > > try a targeted bisection.
> > > > > > > > 
> > > > > > > > Would you be able to try v5.11-rc7 please?
> > > > > > > 
> > > > > > > Can do.
> > > > > > 
> > > > > > Brill, thanks.
> > > > > > 
> > > > > > > > Someone snatched up the machine in the meantime and appears to have
> > > > > > > > it reserved till next Monday :/ I sincerely hope that they'll
> > > > > > > > release it sooner, and I have queued up the test job with high
> > > > > > > > priority. Will let you know as soon as I get the results.
> > > > > > 
> > > > > > > Just tell them the machine is broken and they really don't want to
> > > > > > > use it for anything important ;)
> > > > > > 
> > > > > 
> > > > > Our wishes were granted by the lab fairy and the machine was returned
> > > > > rather quickly :)
> > > > > 
> > > > > The 5.11-rc7 kernel boots.
> > > > 
> > > > > Fantastic! Then it's something in the arm64 for-next/core tree that was
> > > > > added since then. The diff isn't huge and one change stands out for me,
> > > > > so let me try reverting that and I'll update the branch...
> > > 
> > > Ok, I updated for-kernelci so that it contains the arm64 for-next/core
> > > branch merged into -rc7, but with a couple of patches reverted on top.
> > > 
> > > > HEAD is e56137cc7606 ("Revert "arm64/mm: Fix pfn_valid() for ZONE_DEVICE
> > > based memory""). Please can you try that?
> > > 
> > 
> > I didn't actually have to, the autopick picked the machine for the testing
> > and it happily booted :) \o/
> 
> Phew, so I'll drop those from linux-next as well. Do you know if "earlycon"
> works on the problematic machine? If possible, it would be helpful to try
> booting the bad kernel with earlycon on the cmdline to see if it manages to
> say anything in its dying breath.
> 

No idea; we don't have the option enabled and I can't find any information
about it. I submitted a new job with the option, so let's see whether it
works. The machine is reserved again, so there may be some delays.

> > Apparently IT cut off our email sending access because we're sending too
> > many emails so that's why you don't have the report yet. When we work
> > around it I'll make sure to retrigger the sending so you have it.
> 
> That's nice of them...
> 

Update for the drama-loving folks: IT is actually innocent in this.
Someone deployed a misconfigured application in the same cluster as our
reporting system, and this application is sending out emails every
3 seconds. We're collateral damage. Working on a resolution, but
right now it seems that the situation is under control.


Veronika

> Will
> 
> 



