32TB kdump
Vivek Goyal
vgoyal at redhat.com
Mon Jul 1 12:12:06 EDT 2013
On Fri, Jun 28, 2013 at 04:56:31PM -0500, Cliff Wickman wrote:
[..]
> > > page scanning 570 sec.
> > > copying data 5795 sec. (72G)
> > > (The data copy ran out of disk space at 23%, so the time and size above are
> > > extrapolated.)
> >
> > That's almost 110 mins, approximately 2 hrs to dump. I think it is still
> > a lot. How many people can afford to keep a machine dumping for 2 hrs? They
> > would rather bring the services back online.
>
> It is a long time, agreed. But it is a vast improvement over the hours and
> hours (maybe 12 or more) it would have taken just to scan pages before the
> fix for the per-page ioremap() issue.
Which ioremap() fix are you referring to? I thought using mmap() was the
fix for the per-page ioremap() issue, and that's not showing significant
improvements. It looks like you are referring to some other makedumpfile
changes which I am not aware of.
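For context, the per-page cost here is in the kernel's /proc/vmcore read path,
which historically remapped each page of old-kernel memory on every read()
that makedumpfile issued; mmap() instead lets the dump tool map a large window
once and walk it in user space. A minimal sketch of the two access patterns,
assuming a kernel whose /proc/vmcore supports mmap() (the 4 MiB window size is
arbitrary and this is illustrative, not makedumpfile's actual code):

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define PAGE_SZ   4096UL
#define WINDOW_SZ (4UL << 20)           /* illustrative 4 MiB window */

/* One pread() per page: every call drops into the kernel's vmcore
 * read path (historically an ioremap/iounmap per chunk). */
static void scan_by_read(int fd, off_t start, size_t len, char *buf)
{
        off_t off;

        for (off = start; off < start + (off_t)len; off += PAGE_SZ)
                if (pread(fd, buf, PAGE_SZ, off) != (ssize_t)PAGE_SZ) {
                        perror("pread");
                        exit(1);
                }
}

/* Map a larger window once and touch it in user space: far fewer
 * transitions into the kernel for the same amount of data. */
static void scan_by_mmap(int fd, off_t start, size_t len)
{
        char *map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, start);
        volatile char sink;
        size_t off;

        if (map == MAP_FAILED) {
                perror("mmap");  /* fails on kernels without vmcore mmap support */
                exit(1);
        }
        for (off = 0; off < len; off += PAGE_SZ)
                sink = map[off];        /* touch one byte per page */
        munmap(map, len);
}

int main(void)
{
        int fd = open("/proc/vmcore", O_RDONLY);  /* only exists in the kdump kernel */
        char *buf = malloc(PAGE_SZ);

        if (fd < 0 || !buf) {
                perror("setup");
                return 1;
        }
        scan_by_read(fd, 0, WINDOW_SZ, buf);
        scan_by_mmap(fd, 0, WINDOW_SZ);
        free(buf);
        close(fd);
        return 0;
}

On kernels whose /proc/vmcore does not support mmap(), the mmap() call above
simply fails and the read() path is the only option, which is why the two can
show such different page-scan times.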
> A 32T machine is probably a research engine rather than a server, and 2hrs
> might be pretty acceptable to track down a system bug that's blocking some
> important application.
>
> > So more work is needed in the scalability area. And page scanning seems to
> > have been not too bad. Copying data has taken the majority of the time. Is it
> > because of a slow disk?
>
> I think compression is the bottleneck.
>
> On an idle 2TB machine: (times in seconds)
>                                  copy time
>   uncompressed, to /dev/null          61
>   uncompressed, to file              336   (probably 37G, I extrapolate; disk full)
>   compressed,   to /dev/null         387
>   compressed,   to file              402   (file 3.7G)
>
>   uncompressed disk time    336 - 61  = 275
>   compressed disk time      402 - 387 =  15
>   compress time             387 - 61  = 326
>
Ok, so now compression is the biggest bottleneck on large machines.
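For reference, makedumpfile's -c option compresses the dump one page at a time
with zlib, so the cost above scales with the number of pages rather than with
disk speed. A rough sketch of that kind of per-page loop (the zero-filled
input, buffer sizes, compression level and single-threaded loop are
illustrative assumptions, not makedumpfile's actual code):

#include <stdio.h>
#include <stdlib.h>
#include <zlib.h>

#define PAGE_SZ 4096UL

/* Compress a single page; fall back to storing it uncompressed when
 * zlib cannot shrink it, as kdump-compressed dump formats do. */
static unsigned long compress_page(const unsigned char *page,
                                   unsigned char *out, unsigned long outcap)
{
        unsigned long outlen = outcap;

        if (compress2(out, &outlen, page, PAGE_SZ, Z_BEST_SPEED) != Z_OK ||
            outlen >= PAGE_SZ)
                return PAGE_SZ;
        return outlen;
}

int main(void)
{
        /* 1 GiB of zero pages stands in for dump data. */
        const unsigned long npages = (1UL << 30) / PAGE_SZ;
        unsigned long outcap = compressBound(PAGE_SZ);
        unsigned char *page = calloc(1, PAGE_SZ);
        unsigned char *out = malloc(outcap);
        unsigned long long written = 0;
        unsigned long i;

        if (!page || !out)
                return 1;
        /* The serial page-at-a-time loop is the hot path: a 2TB dump means
         * roughly half a billion iterations on one CPU of the crash kernel. */
        for (i = 0; i < npages; i++)
                written += compress_page(page, out, outcap);

        printf("%llu bytes out for %lu pages in\n", written, npages);
        free(page);
        free(out);
        return 0;
}

(Build with -lz.) If the makedumpfile build supports them, the faster per-page
compressors (-l for LZO, -p for snappy) or splitting the compression work
across CPUs should buy far more than a faster disk, given the 275 s disk vs
326 s compress split above.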
[..]
> > > - Use of crashkernel=1G,high was usually problematic. I assume there is some
> > > conflict with something else using high memory. I always use a form
> > > like crashkernel=1G@5G, finding the memory by examining /proc/iomem.
> >
> > Hmm..., do you think you need to reserve some low mem too for swiotlb (in
> > case you are not using an iommu)?
>
> It is reserving 72M in low mem for swiotlb + 8M. But this does not seem to
> be enough.
> I did not realize that I could specify crashkernel=xxx,high and
> crashkernel=xxx,low together, until you mentioned it below. This seems
> to solve my crashkernel=1G,high problem. I need to specify
> crashkernel=128M,low on some systems, or else my crash kernel panics because
> it does not find enough low memory.
Is it possible to dive deeper and figure out why you need more low
memory? We might need some fixing in the upstream kernel. Otherwise, how would
a user know how much low memory to reserve?
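As an aside, the /proc/iomem examination Cliff describes for picking a
crashkernel=1G@5G style reservation can be scripted. A rough sketch that just
lists System RAM ranges above 4G that are big enough for a 1G reservation (the
cutoff and the size are assumptions for illustration; the operator still has
to pick an offset inside such a range that does not conflict with anything
else):

#include <stdio.h>
#include <string.h>

#define HIGH_LIMIT (1ULL << 32)   /* only report System RAM above 4G */
#define WANT       (1ULL << 30)   /* room for a 1G crashkernel reservation */

int main(void)
{
        FILE *f = fopen("/proc/iomem", "r");  /* real addresses show up only for root */
        char line[256];

        if (!f) {
                perror("/proc/iomem");
                return 1;
        }
        while (fgets(line, sizeof(line), f)) {
                unsigned long long start, end;

                /* Entries look like: "100000000-87fffffff : System RAM" */
                if (sscanf(line, "%llx-%llx", &start, &end) != 2)
                        continue;
                if (!strstr(line, "System RAM"))
                        continue;
                if (start >= HIGH_LIMIT && end - start + 1 >= WANT)
                        printf("candidate range: 0x%llx-0x%llx (%llu MiB)\n",
                               start, end, (end - start + 1) >> 20);
        }
        fclose(f);
        return 0;
}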
Thanks
Vivek