[RFA] makedumpfile: fix access to os_info for /proc/kcore
Philipp Rudo
prudo at redhat.com
Fri Sep 6 08:12:14 PDT 2024
Hi Alex,
On Fri, 6 Sep 2024 16:35:50 +0200
Alexander Gordeev <agordeev at linux.ibm.com> wrote:
> On Wed, Sep 04, 2024 at 06:12:59PM +0200, Philipp Rudo wrote:
>
> Hi Philipp,
>
> > Hi Alex,
> >
> > our QE found a problem when trying to run makedumpfile with /proc/kcore
> > on s390. For example
> >
> > # makedumpfile --mem-usage /proc/kcore
> > s390x_init_vm: Can't get s390x os_info ptr.
> >
> > The exact options passed to makedumpfile don't matter. The error is
> > always the same. Trying the same on a dump file created from
> > /proc/vmcore works fine. As the function in question was introduced
> > with your commit 6f8325d ("[PATCH v2 2/2] s390x: uncouple virtual and
> > physical address spaces") I'm reaching out to you.
> >
> > Looking at /proc/kcore with crash I noticed that
> > abs_lowcore->os_info (aka. address 0xe18) is zero. Hence the check
> >
> >     if (!readmem(PADDR, S390X_LC_OS_INFO, &addr,
> >                  sizeof(addr)) || !addr) {
> >         ERRMSG("Can't get s390x os_info ptr.\n");
> >         return FALSE;
> >     }
> >
> > at the beginning of s390x_init_vm fails. My theory is that when trying
> > to access the absolute lowcore via /proc/kcore the read gets prefixed
> > and thus ends up in the per-cpu lowcore. As the os_info field isn't set
> > in the per-cpu lowcore the read returns 0, triggering the error.
>
> Yes, I think your analysis is correct.
\o/ I haven't lost all my s390 skills, yet.
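To make the theory a bit more concrete, here is a toy model of s390 prefixing. It is only a sketch and not kernel or makedumpfile code: real_to_absolute(), LOWCORE_SIZE, LC_OS_INFO and the prefix value 0x4000 used below are names and numbers made up for illustration. Only the 0xe18 offset comes from this thread.

```c
#include <stdint.h>

#define LOWCORE_SIZE 0x2000UL /* 8 KiB prefix area on z/Architecture */
#define LC_OS_INFO   0xe18UL  /* offset of os_info, as quoted in this thread */

/*
 * Toy model of prefixing: translate a real address to an absolute
 * address for a CPU whose prefix register holds `prefix`
 * (assumed to be 8 KiB aligned).
 */
static uint64_t real_to_absolute(uint64_t real, uint64_t prefix)
{
	if (real < LOWCORE_SIZE)
		return real + prefix; /* lands in the per-cpu lowcore */
	if (real >= prefix && real < prefix + LOWCORE_SIZE)
		return real - prefix; /* reverse prefixing reaches absolute 0..8K */
	return real;                  /* everything else is unchanged */
}
```

With a made-up prefix of 0x4000, real_to_absolute(LC_OS_INFO, 0x4000) gives 0x4e18, i.e. the os_info slot of the per-cpu lowcore rather than the one in the absolute lowcore. If the /proc/kcore read really is prefixed like this, that would explain the zero.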
> > I played around with crash trying to access the absolute lowcore via
> > __abs_lowcore and lowcore_ptr but failed. I always ended up in the
> > per-cpu lowcore. I also tried to get the address of os_info from the
> > dwarf information but that only returns a virtual address which cannot
> > be used in the function that sets up vm...
> >
> > Any idea how this problem could be fixed?
>
> I will take a deeper look at it.
Thanks!
>
> > Thanks
> > Philipp
>
> Thanks for reporting!
>
> > P.S. While looking at the function I found one nit. Right after the
> > check mentioned above there's another check for
> >
> >     if (addr == 0)
> >         return TRUE;
> >
> > which can never be true as the !addr from above already handles this
> > case.
>
> It will be TRUE when readmem() succeeds and reads zero.
> In fact, the || !addr condition is redundant. Do you want to send a patch?
Could you take over the patch? I'm not really sure when addr == 0 is
expected. You are much more qualified to describe that.
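
For reference, a minimal sketch of the two shapes of the check being discussed. This is not the makedumpfile source: stub_readmem() is a hypothetical stand-in for readmem() that always succeeds, and both function names are invented. Whether a zero os_info pointer is a legitimate case is exactly the open question, so the sketch only shows the control flow.

```c
#include <stdint.h>

#define TRUE  1
#define FALSE 0

/* Hypothetical stand-in for readmem(): always succeeds, stores `value`. */
static int stub_readmem(uint64_t value, uint64_t *addr)
{
	*addr = value;
	return TRUE;
}

/* Shape of the current code: a zero pointer is reported as an error. */
static int init_vm_current(uint64_t value)
{
	uint64_t addr;

	if (!stub_readmem(value, &addr) || !addr)
		return FALSE; /* "Can't get s390x os_info ptr." */
	if (addr == 0)
		return TRUE;  /* unreachable: addr is non-zero here */
	return TRUE;          /* the real code would go on to set up vm */
}

/* Shape without "|| !addr": a successful read of zero returns early. */
static int init_vm_without_addr_check(uint64_t value)
{
	uint64_t addr;

	if (!stub_readmem(value, &addr))
		return FALSE; /* only a failed read is an error */
	if (addr == 0)
		return TRUE;  /* now reachable when the read returns zero */
	return TRUE;          /* the real code would go on to set up vm */
}
```

With a read that returns zero, init_vm_current(0) fails while init_vm_without_addr_check(0) returns TRUE, which is the difference the patch description would need to justify.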
Thanks
Philipp