[Crash-utility] crash: struct command can read irrelevant pages.
Dave Anderson
anderson at redhat.com
Thu Feb 20 15:45:50 EST 2014
Hello Atsushi,
I've committed a SLAB/SLUB kmem_cache-specific fix for this issue:
https://github.com/crash-utility/crash/commit/c0b7a74fc13121203810d06d163550436b2d5476
which is queued for crash-7.0.6.
Thanks,
Dave
----- Original Message -----
>
>
> ----- Original Message -----
> > Hello,
> >
> > I've finally found the cause of the issue I mentioned (quoted below)
> > when makedumpfile v1.5.5 was released:
> >
> > > 2. At first, the supported kernel will be updated to 3.12, but I
> > > found an issue while testing for v1.5.5, which seems that the page
> > > filtering works wrongly on kernel 3.12. I couldn't investigate this
> > > yet and it will take some time to finish it.
> > > Therefore, the latest supported kernel version is 3.11 in v1.5.5.
> >
> > This is neither a kernel issue nor a makedumpfile issue; it's a bug in crash.
> > It can happen when a slab cache is stored near the end of a page.
> >
> > == Description ==
> >
> > I first noticed the error message below when running crash on
> > a dumpfile generated by makedumpfile -d2:
> >
> > please wait... (gathering kmem slab cache data)
> > crash: page excluded: kernel virtual address: f4e87000 type: "kmem_cache buffer"
> >
> > crash: unable to initialize kmem slab cache subsystem
> >
> > This message indicates that crash failed to read a slab cache during
> > kmem_cache_init(), and according to the output below, the failing read
> > was for the slab cache stored at f4e86f40:
> >
> > crash> p kmem_cache
> > kmem_cache = $1 = (struct kmem_cache *) 0xc0b1cbc0 <kmem_cache_boot>
> > crash>
> > crash> list kmem_cache.list -s kmem_cache.name -h 0xc0b1cbc0
> > ...
> > f4d37840
> > name = 0xf4edf540 "uid_cache"
> > f4e86f40
> > list: page excluded: kernel virtual address: f4e87000 type: "gdb_readmem_callback"
> >
> > It seems that the slab cache covered two pages, [f4e86000-f4e87000] and
> > [f4e87000-f4e88000]. Well, let's confirm the *real* size of it.
> >
> > Since slab caches except kmem_cache_boot are allocated as slab objects,
> > we can confirm the size like below:
> >
> > crash> p kmem_cache
> > kmem_cache = $2 = (struct kmem_cache *) 0xc0b1cbc0 <kmem_cache_boot>
> > crash> struct kmem_cache.object_size 0xc0b1cbc0
> > object_size = 104
> > crash>
> >
> > In my environment, the size was 104 bytes. Therefore, the slab cache
> > stored at f4e86f40 fits in a single page ([f4e86000-f4e87000]), and
> > the excluded page ([f4e87000-f4e88000]) is unrelated to it.
> >
> > On the other hand, crash gets the size from vmlinux via gdb, where
> > it is 216 bytes:
> >
> > crash> struct kmem_cache
> > struct kmem_cache {
> > unsigned int batchcount;
> > unsigned int limit;
> > ...
> > struct kmem_cache_node **node;
> > struct array_cache *array[33];
> > }
> > SIZE: 216
> > crash>
> >
> > So crash assumed that the slab cache spanned both [f4e86000-f4e87000]
> > and [f4e87000-f4e88000], even though the latter is an irrelevant page.
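The page arithmetic behind this can be checked with a small sketch. It assumes 4 KiB pages (as on 32-bit x86); PAGE_SIZE and pages_spanned() are illustrative names, not part of crash.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumption: 4 KiB pages, as on 32-bit x86 */

/* Number of pages touched by a read of `size` bytes starting at `addr`. */
unsigned pages_spanned(uint32_t addr, uint32_t size)
{
    assert(size > 0);
    uint32_t first = addr / PAGE_SIZE;              /* page of the first byte */
    uint32_t last  = (addr + size - 1) / PAGE_SIZE; /* page of the last byte  */
    return last - first + 1;
}
```

With the values from this report, pages_spanned(0xf4e86f40, 104) is 1, while pages_spanned(0xf4e86f40, 216) is 2: a read sized from the debuginfo runs 0x18 bytes into the excluded page at f4e87000.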
> >
> > This discrepancy arises because the size of a slab cache is variable.
> >
> > struct kmem_cache {
> > ...
> > struct kmem_cache_node **node;
> > struct array_cache *array[NR_CPUS + MAX_NUMNODES];
> > /*
> > * Do not add fields after array[]
> > */
> > };
> >
> > The size of "array" is the variable part of kmem_cache.
> > When vmlinux is built, the size of kmem_cache is calculated from
> > NR_CPUS and MAX_NUMNODES and recorded in vmlinux as debug information.
> > (Sorry, I don't know gcc well. I may have misunderstood this.)
> > However, the actual size is typically smaller than the defined size
> > because it is determined by the actual number of CPUs and nodes.
> >
> > void __init kmem_cache_init(void)::
> > ...
> > /*
> > * struct kmem_cache size depends on nr_node_ids & nr_cpu_ids
> > */
> > create_boot_cache(kmem_cache, "kmem_cache",
> > offsetof(struct kmem_cache, array[nr_cpu_ids]) +
> > nr_node_ids * sizeof(struct kmem_cache_node *), // object_size
> > SLAB_HWCACHE_ALIGN);
> > list_add(&kmem_cache->list, &slab_caches);
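The difference between the build-time and run-time sizes can be reproduced outside the kernel. The sketch below uses a hypothetical stand-in struct (demo_cache) and assumed build-time limits; it is not the real kmem_cache layout, only the same sizing pattern.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS      32  /* assumed build-time maximum, for illustration */
#define MAX_NUMNODES 1   /* assumed build-time maximum, for illustration */

/* Hypothetical stand-in for struct kmem_cache; only the shape matters. */
struct demo_cache {
    unsigned int batchcount;
    unsigned int limit;
    void **node;
    void *array[NR_CPUS + MAX_NUMNODES];  /* must remain the last member */
};

/* Bytes actually needed when only some array slots are in use. */
size_t runtime_size(unsigned nr_cpu_ids, unsigned nr_node_ids)
{
    assert(nr_cpu_ids <= NR_CPUS && nr_node_ids <= MAX_NUMNODES);
    return offsetof(struct demo_cache, array)
         + (size_t)(nr_cpu_ids + nr_node_ids) * sizeof(void *);
}
```

sizeof(struct demo_cache) is what the debug information records, while runtime_size(4, 1) is what a machine with four CPUs and one node would allocate: 28 pointer slots fewer.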
> >
> >
> > As for kmem_cache, we can get its actual size from kmem_cache_boot,
> > but I suppose kmem_cache is not the only struct in the kernel whose size
> > is variable. So I think we should discuss how to address issues like
> > this.
> >
> > By the way, I described the *SLAB* case in this mail,
> > but SLUB seems to have the same issue.
> >
> >
> > Thanks
> > Atsushi Kumagai
>
>
> This is a "known" issue that has been discussed on the crash-utility list
> in the past, at least with respect to the kmem_cache data structure. But
> for any random data structure that has such a construct, I'm not sure what
> can be done.
>
> In the case of the CONFIG_SLAB kmem_cache data structure, there is a function
> that is supposed to "downsize" the size value of the kmem_cache data
> structure that is returned by gdb. It is called here in kmem_cache_init(),
> just prior to cycling through all of the kmem_cache structures, where the
> page excluded error shown above occurred:
>
>     if (!(pc->flags & RUNTIME))
>         kmem_cache_downsize();
>
>     cache_buf = GETBUF(SIZE(kmem_cache_s));
>     hq_open();
>
>     do {
>         cache_count++;
>
>         if (!readmem(cache, KVADDR, cache_buf, SIZE(kmem_cache_s),
>             "kmem_cache buffer", RETURN_ON_ERROR)) {
>             FREEBUF(cache_buf);
>             vt->flags |= KMEM_CACHE_UNAVAIL;
>             error(INFO,
>                 "%sunable to initialize kmem slab cache subsystem\n\n",
>                 DUMPFILE() ? "\n" : "");
>             hq_close();
>             return;
>         }
>
> The SIZE(kmem_cache_s) value should have been downsized by that function,
> but presumably it did not work. If CRASHDEBUG(1) had been turned on during
> initialization, you would have seen one of these two messages from
> kmem_cache_downsize():
>
>     if (CRASHDEBUG(1))
>         fprintf(fp, "kmem_cache_downsize: %ld to %ld\n",
>             STRUCT_SIZE("kmem_cache"),
>             SIZE(kmem_cache_s));
>
> or:
>
>     if (CRASHDEBUG(1)) {
>         fprintf(fp,
>             "\nkmem_cache_downsize: SIZE(kmem_cache_s): %ld "
>             "cache_cache.buffer_size: %d\n",
>             STRUCT_SIZE("kmem_cache"), buffer_size);
>         fprintf(fp,
>             "kmem_cache_downsize: nr_node_ids: %ld\n",
>             vt->kmem_cache_len_nodes);
>     }
>
> The function probably failed due to some kernel change. In fact,
> I just checked a 3.13 CONFIG_SLAB kernel, and I see that
> kmem_cache_downsize() no longer works for that kernel.
>
> I see that kmem_cache_boot would be a good alternative for determining
> the size on CONFIG_SLAB kernels, at least on 3.7 and later kernels where
> it was introduced. And for CONFIG_SLUB, which doesn't currently have a
> "downsize" function, it looks like its "kmem_cache" cache also has size
> fields that could be used.
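The idea of preferring a size field read from the dump over the debuginfo size can be sketched as follows. This is only an illustration of the approach described above, not the committed fix; effective_cache_size() and its parameters are hypothetical names, and the fallback rule is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Choose how many bytes to read for each kmem_cache structure.
 * debuginfo_size: size reported by gdb from vmlinux (build-time maximum).
 * object_size:    size field read from the dump, or 0 if unavailable.
 * The runtime value is used only when it is plausible, i.e. non-zero
 * and smaller than the build-time size.
 */
size_t effective_cache_size(size_t debuginfo_size, size_t object_size)
{
    assert(debuginfo_size > 0);
    if (object_size > 0 && object_size < debuginfo_size)
        return object_size;
    return debuginfo_size;  /* fall back to the debuginfo size */
}
```

With the values from this report, effective_cache_size(216, 104) yields 104, so a read starting at f4e86f40 would stay within its own page.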
>
> By any chance can you make the 32-bit vmlinux/vmcore pair available for
> me to download? Reply to me off-list if you can.
>
> Thanks,
> Dave