the exiting makedumpfile is almost there... :)

Jay Lan jlan at sgi.com
Wed Sep 24 17:56:44 EDT 2008


Jay Lan wrote:
> Ken'ichi Ohmichi wrote:
>> Hi Dave, Jay,
>>
>> Dave Anderson wrote:
>>> We just ran into a similar problem using an older version of makedumpfile,
>>> but looking at the latest makedumpfile code, it seems that you could
>>> run into the same problem.
>>>
>>> In exclude_unnecessary_pages(), if a physical page is in a memory
>>> hole, then it skips the page and continues.  In our case, that happened,
>>> but when it started up again, the next legitimate pfn was well beyond
>>> the previously-read cache of 512 pages.  But since the new legit page
>>> wasn't modulo-512, it didn't refresh the page cache, and it ended up
>>> using stale page data (page->flags) and ended up excluding legitimate
>>> pages:
>>>
>>>                 for (; pfn < mmd->pfn_end;
>>>                     pfn++, mem_map += SIZE(page),
>>>                     paddr += info->page_size) {
>>>
>>>                         /*
>>>                          * Exclude the memory hole.
>>>                          */
>>>                         if (!is_in_segs(paddr))
>>>                                 continue;
>>>
>>>                         if ((pfn % PGMM_CACHED) == 0) {
>>>                                 if (pfn + PGMM_CACHED < mmd->pfn_end)
>>>                                         pfn_mm = PGMM_CACHED;
>>>                                 else
>>>                                         pfn_mm = mmd->pfn_end - pfn;
>>>                                 if (!readmem(VADDR, mem_map, page_cache,
>>>                                     SIZE(page) * pfn_mm))
>>>                                         goto out;
>>>                         }
>>>
>>> We fixed it by doing something like this:
>>>
>>>          if (!is_in_segs(paddr)) {
>>>                  reset_cache = 1;
>>>                  continue;
>>>          }
>>>
>>>          if (((pfn % PGMM_CACHED) == 0) || reset_cache) {
>>>                  reset_cache = 0;
>>>                  ...
>> Great, you are right.
>> Thank you for fixing it  :-) 
>>
>> Jay, could you try Dave's fix, as in the attached patch?
> 
> Yes. I applied your version of Dave's patch and tried again.
> It failed at a different pfn f600315:
> 
> 
> a4700rac:/mnt/sda9/diskdump # rm dump.cd31; /var/tmp/jlan/makedumpfile
> -cd31 -e 0xe0000f60031502f0 -x vmlinux.3 vmcore-cp.3 dump.cd31
> Excluding unnecessary pages        : [ 45 %]
> pfn=f600315 flags=3c000000001026c
> 
> PAGE(vaddr:e0000f60031502f0, pfn:f600315) is excluded as CACHE PAGE.
> 
> Copying data                       : [100 %]
> 
> The dumpfile is saved to dump.cd31.
> 
> makedumpfile Completed.
> 
> 
> Note the flags of pfn f600315. Checking that pfn with crash against
> vmcore-cp.3 shows different flags:
> 
> crash> kmem -p f60031502f0
>       PAGE         PHYSICAL      MAPPING       INDEX CNT FLAGS
> a07ffffc45d00498 f6003150000                0        0  1 3c0000000000400
> crash>
> 

I ran a test on a 2-CPU machine. The legitimate page that got excluded
is:
   PAGE(vaddr:e00000300313fb70, pfn:300313) is excluded as CACHE PAGE
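
As a sanity check on the addresses, here is a small C sketch of the
vaddr-to-pfn arithmetic for the two excluded pages reported in this
thread. It assumes an ia64 identity-mapped kernel region starting at
0xe000000000000000 and 64 KiB pages; neither value is stated in this
thread, both are inferred from the reported numbers. The names
ASSUMED_PAGE_OFFSET, ASSUMED_PAGE_SHIFT, vaddr_to_pfn() and
pfn_to_page_base() are made up for illustration and are not
makedumpfile functions.

```c
#include <stdint.h>

/* Assumptions, inferred from the numbers in this thread only. */
#define ASSUMED_PAGE_OFFSET 0xe000000000000000ULL /* identity-map base */
#define ASSUMED_PAGE_SHIFT  16                    /* 64 KiB pages */

/* Strip the identity-map base, then drop the in-page offset bits. */
static uint64_t vaddr_to_pfn(uint64_t vaddr)
{
        uint64_t paddr = vaddr - ASSUMED_PAGE_OFFSET;

        return paddr >> ASSUMED_PAGE_SHIFT;
}

/* Physical address of the first byte of the page frame. */
static uint64_t pfn_to_page_base(uint64_t pfn)
{
        return pfn << ASSUMED_PAGE_SHIFT;
}
```

Under those assumptions, vaddr_to_pfn(0xe0000f60031502f0) gives
0xf600315 and vaddr_to_pfn(0xe00000300313fb70) gives 0x300313, which
match the pfns makedumpfile printed, and pfn_to_page_base(0xf600315)
gives 0xf6003150000, the PHYSICAL value crash printed. So the page
lookup itself looks consistent; only the flags differ.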

The values of some variables in the routine while processing that page are:
  pfn=300313 flags=3026c, page.flags=0
  page_cache=0x6000000000033f60, pcache=0x6000000000037b88
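
For reference, the stale-cache scenario Dave described can be modelled
with a short C sketch. This is a toy model, not makedumpfile code: it
only tracks which pfn window the cache was last filled for, and counts
lookups that fall outside that window. count_stale_lookups() and its
parameters are hypothetical; PGMM_CACHED is set to 512 as quoted in the
thread. The sketch does not model how the real code indexes into
page_cache, so it says nothing about why the patched version still
fails here.

```c
#include <stdbool.h>

#define PGMM_CACHED 512UL /* cache window size quoted in the thread */

/*
 * Walk pfn_start..pfn_end, skipping the hole [hole_start, hole_end).
 * Returns how many valid pfns are looked up while the cache still
 * covers a window that does not contain them.
 */
static unsigned long count_stale_lookups(unsigned long pfn_start,
                                         unsigned long pfn_end,
                                         unsigned long hole_start,
                                         unsigned long hole_end,
                                         bool use_reset_flag)
{
        unsigned long pfn, cache_base = 0, stale = 0;
        bool cache_valid = false, reset_cache = false;

        for (pfn = pfn_start; pfn < pfn_end; pfn++) {
                /* Exclude the memory hole. */
                if (pfn >= hole_start && pfn < hole_end) {
                        if (use_reset_flag)
                                reset_cache = true;
                        continue;
                }

                /* Refill on a window boundary, or after a hole. */
                if ((pfn % PGMM_CACHED) == 0 || reset_cache) {
                        reset_cache = false;
                        cache_base = pfn;
                        cache_valid = true;
                }

                /* Lookup outside the cached window means stale data. */
                if (!cache_valid || pfn - cache_base >= PGMM_CACHED)
                        stale++;
        }
        return stale;
}
```

With pfns 0..2047 and a hole covering pfns 100..699, the refill at
pfn 512 is skipped because 512 lies inside the hole. Without the
reset flag, count_stale_lookups(0, 2048, 100, 700, false) returns 324
(pfns 700..1023 are read from the window filled at pfn 0). With the
flag, count_stale_lookups(0, 2048, 100, 700, true) returns 0.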

Hope these data help.

Thanks,
 - jay


>    
>>
>> Thanks
>> Ken'ichi Ohmichi
>>
> 



