the existing makedumpfile is almost there... :)

Jay Lan jlan at sgi.com
Wed Sep 24 14:30:16 EDT 2008


Ken'ichi Ohmichi wrote:
> Hi Dave, Jay,
> 
> Dave Anderson wrote:
>> We just ran into a similar problem using an older version of makedumpfile,
>> but looking at the latest makedumpfile code, it seems that you could
>> run into the same problem.
>>
>> In exclude_unnecessary_pages(), if a physical page is in a memory
>> hole, then it skips the page and continues.  In our case, that happened,
>> but when it started up again, the next legitimate pfn was well beyond
>> the previously-read cache of 512 pages.  But since the new legit page
>> wasn't modulo-512, it didn't refresh the page cache, and it ended up
>> using stale page data (page->flags) and ended up excluding legitimate
>> pages:
>>
>>                 for (; pfn < mmd->pfn_end;
>>                     pfn++, mem_map += SIZE(page),
>>                     paddr += info->page_size) {
>>
>>                         /*
>>                          * Exclude the memory hole.
>>                          */
>>                         if (!is_in_segs(paddr))
>>                                 continue;
>>
>>                         if ((pfn % PGMM_CACHED) == 0) {
>>                                 if (pfn + PGMM_CACHED < mmd->pfn_end)
>>                                         pfn_mm = PGMM_CACHED;
>>                                 else
>>                                         pfn_mm = mmd->pfn_end - pfn;
>>                                 if (!readmem(VADDR, mem_map, page_cache,
>>                                     SIZE(page) * pfn_mm))
>>                                         goto out;
>>                         }
>>
>> We fixed it by doing something like this:
>>
>>          if (!is_in_segs(paddr)) {
>>                  reset_cache = 1;
>>                  continue;
>>          }
>>
>>          if (((pfn % PGMM_CACHED) == 0) || reset_cache) {
>>                  reset_cache = 0;
>>                  ...
> 
> Great, you are right.
> Thank you for fixing it  :-) 
> 
> Jay, could you try Dave's fix, as in the attached patch?

Yes. I applied your version of Dave's patch and tried again.
It failed at a different pfn f600315:


a4700rac:/mnt/sda9/diskdump # rm dump.cd31; /var/tmp/jlan/makedumpfile
-cd31 -e 0xe0000f60031502f0 -x vmlinux.3 vmcore-cp.3 dump.cd31
Excluding unnecessary pages        : [ 45 %]
pfn=f600315 flags=3c000000001026c

PAGE(vaddr:e0000f60031502f0, pfn:f600315) is excluded as CACHE PAGE.

Copying data                       : [100 %]

The dumpfile is saved to dump.cd31.

makedumpfile Completed.


Note the flags reported for pfn f600315. I checked the same pfn with
crash against vmcore-cp.3, and it shows different flags:

crash> kmem -p f60031502f0
      PAGE         PHYSICAL      MAPPING       INDEX CNT FLAGS
a07ffffc45d00498 f6003150000                0        0  1 3c0000000000400
crash>


> 
> 
> Thanks
> Ken'ichi Ohmichi
> 



