[RFA] makedumpfile: fix access to os_info for /proc/kcore

Philipp Rudo prudo at redhat.com
Fri Sep 6 08:12:14 PDT 2024


Hi Alex,

On Fri, 6 Sep 2024 16:35:50 +0200
Alexander Gordeev <agordeev at linux.ibm.com> wrote:

> On Wed, Sep 04, 2024 at 06:12:59PM +0200, Philipp Rudo wrote:
> 
> Hi Philipp,
> 
> > Hi Alex,
> > 
> > our QE found a problem when trying to run makedumpfile with /proc/kcore
> > on s390. For example
> > 
> > 	# makedumpfile --mem-usage /proc/kcore
> > 	s390x_init_vm: Can't get s390x os_info ptr.
> > 
> > The exact options passed to makedumpfile don't matter. The error is
> > always the same. Trying the same on a dump file created from
> > /proc/vmcore works fine. As the function in question was introduced
> > with your commit 6f8325d ("[PATCH v2 2/2] s390x: uncouple virtual and
> > physical address spaces") I'm reaching out to you.
> > 
> > Looking at /proc/kcore with crash I noticed that
> > abs_lowcore->os_info (aka. address 0xe18) is zero. Hence the check
> > 
> > 	if (!readmem(PADDR, S390X_LC_OS_INFO, &addr,
> > 			sizeof(addr)) || !addr) {
> > 		ERRMSG("Can't get s390x os_info ptr.\n");
> > 		return FALSE;
> > 	}
> > 
> > at the beginning of s390x_init_vm fails. My theory is that when trying
> > to access the absolute lowcore via /proc/kcore the read gets prefixed
> > and thus ends up in the per-cpu lowcore. As the os_info field isn't set
> > in the per-cpu lowcore the read returns 0, triggering the error.  
> 
> Yes, I think your analysis is correct.

\o/ I haven't lost all my s390 skills, yet.
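
To make the prefixing theory concrete, here is a rough sketch of how a
real address is turned into an absolute address on z/Architecture. It is
only an illustration, not code from makedumpfile or the kernel: the names
real_to_abs and LC_AREA_SIZE are made up, and the prefix value used in the
usage note is hypothetical. The only value taken from this thread is the
os_info offset 0xe18.

```c
#include <stdint.h>

/* Size of the area swapped by prefixing on z/Architecture: 8 KiB. */
#define LC_AREA_SIZE 0x2000ULL

/*
 * Hypothetical helper: apply prefixing to a real address.
 * 'prefix' stands for the CPU's prefix register value and is
 * assumed to be aligned to LC_AREA_SIZE.
 */
static uint64_t real_to_abs(uint64_t real, uint64_t prefix)
{
	/* Accesses to the first 8 KiB are redirected to the prefix area. */
	if (real < LC_AREA_SIZE)
		return real + prefix;

	/* Accesses to the prefix area are redirected to the first 8 KiB. */
	if (real >= prefix && real < prefix + LC_AREA_SIZE)
		return real - prefix;

	/* Everything else is left unchanged. */
	return real;
}
```

With a hypothetical prefix of 0x40000, real_to_abs(0xe18, 0x40000) gives
0x40e18, i.e. a location in the per-cpu lowcore rather than absolute
address 0xe18. That would match a read of os_info returning zero if the
access through /proc/kcore is indeed subject to prefixing.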

> > I played around with crash trying to access the absolute lowcore via
> > __abs_lowcore and lowcore_ptr but failed. I always ended up in the
> > per-cpu lowcore. I also tried to get the address of os_info from the
> > DWARF information but that only returns a virtual address which cannot
> > be used in the function that sets up vm...
> > 
> > Any idea how this problem could be fixed?  
> 
> I will take a deeper look at it.

Thanks!

> 
> > Thanks
> > Philipp  
> 
> Thanks for reporting!
> 
> > P.S. While looking at the function I found one nit. Right after the
> > check mentioned above there's another check for
> > 
> > 	if (addr == 0)
> > 		return TRUE;
> > 
> > which can never be true as the !addr from above already handles this
> > case.  
> 
> It will be TRUE when readmem() succeeded and read out zero.
> In fact, || !addr condition is redundant. Do you want to send a patch?

Could you take over the patch? I'm not really sure when addr == 0 is
expected. You are much more qualified to describe that.
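
For reference, the difference between the two variants can be sketched
in isolation as below. This is a simplified model and not the actual
s390x_init_vm code: readmem() is replaced by a plain read_ok flag, and
the enum and function names are invented for illustration. What happens
after the two checks is not shown in this thread, so it is represented
only by OUT_CONTINUE.

```c
#include <stdint.h>

/* Hypothetical outcomes of the two checks at the start of the function. */
enum outcome {
	OUT_FALSE,	/* error path: ERRMSG + return FALSE */
	OUT_TRUE_EARLY,	/* early "return TRUE" */
	OUT_CONTINUE	/* falls through to the rest of the function */
};

/* Checks as quoted above: "|| !addr" swallows the addr == 0 case. */
static enum outcome check_as_quoted(int read_ok, uint64_t addr)
{
	if (!read_ok || !addr)
		return OUT_FALSE;
	if (addr == 0)
		return OUT_TRUE_EARLY;	/* never reached */
	return OUT_CONTINUE;
}

/* Same checks with the "|| !addr" condition dropped. */
static enum outcome check_without_addr_test(int read_ok, uint64_t addr)
{
	if (!read_ok)
		return OUT_FALSE;
	if (addr == 0)
		return OUT_TRUE_EARLY;	/* read succeeded and returned zero */
	return OUT_CONTINUE;
}
```

The two variants only differ when the read succeeds and returns zero:
the first one reports an error, the second one returns TRUE early.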

Thanks
Philipp