[PATCH RFC 00/11] makedumpfile: parallel processing
Chao Fan
cfan at redhat.com
Fri Dec 4 00:56:41 PST 2015
Hi Zhou Wenjian and Kumagai,
I have followed Zhou Wenjian's suggestion and run some tests; with
"-c", makedumpfile 1.5.9 does perform better than with "-l".
I have run more tests on a machine with 128G of memory, with both "-d 0"
and "-d 3", and makedumpfile 1.5.9 performs well. But with
"--num-threads 1" it does need more time than without "--num-threads".
Here are my results (makedumpfile -c):

"-d 0" (the size of vmcore is 2.6G):
  --num-threads    time(seconds)
        0               556
        1              1186
        4               307
        8               186
       12               131
       16               123

"-d 3" (the size of vmcore is 1.3G):
  --num-threads    time(seconds)
        0               141
        1               262
        2               137
        4                91
        8               121
       16               137
So, I think makedumpfile 1.5.9 can save time with "-c", as long as
"-d 31" and "--num-threads 1" are avoided.
----- Original Message -----
> From: "Wenjian Zhou/周文剑" <zhouwj-fnst at cn.fujitsu.com>
> To: "Atsushi Kumagai" <ats-kumagai at wm.jp.nec.com>
> Cc: kexec at lists.infradead.org
> Sent: Friday, December 4, 2015 11:33:36 AM
> Subject: Re: [PATCH RFC 00/11] makedumpfile: parallel processing
>
> Hello Kumagai,
>
> On 12/04/2015 10:30 AM, Atsushi Kumagai wrote:
> > Hello, Zhou
> >
> >> On 12/02/2015 03:24 PM, Dave Young wrote:
> >>> Hi,
> >>>
> >>> On 12/02/15 at 01:29pm, "Zhou, Wenjian/周文剑" wrote:
> >>>> I think there is no problem if other test results are as expected.
> >>>>
> >>>> --num-threads mainly reduces the time spent on compression.
> >>>> So for lzo, it can't help much most of the time.
> >>>
> >>> It seems the help text for --num-threads does not say so explicitly:
> >>>
> >>>     [--num-threads THREADNUM]:
> >>>         Using multiple threads to read and compress data of each page
> >>>         in parallel, which will reduce the time for saving DUMPFILE.
> >>>         This feature only supports creating DUMPFILE in kdump-compressed
> >>>         format from VMCORE in kdump-compressed format or elf format.
> >>>
> >>> lzo is also a compression method; it should be mentioned that
> >>> --num-threads only supports zlib-compressed vmcores.
> >>>
> >>
> >> Sorry, it seems that what I said was not so clear.
> >> lzo is also supported. But since lzo compresses data at high speed,
> >> the performance improvement is usually not so obvious.
> >>
> >>> It is also worth mentioning the recommended -d value for this feature.
> >>>
> >>
> >> Yes, I think it's worth mentioning. I forgot it.
> >
> > I saw your patch, but I think I should confirm what the problem is first.
> >
> >> However, when "-d 31" is specified, it will be worse.
> >> Fewer than 50 buffers are used to cache the compressed pages,
> >> and even a page that has been filtered out takes a buffer.
> >> So if "-d 31" is specified, the filtered pages will occupy most
> >> of the buffers, and the pages which actually need to be compressed
> >> can't be compressed in parallel.
> >
> > Could you explain in more detail why compression will not be parallel?
> > The buffers being used even for filtered pages does sound inefficient,
> > but I don't understand why it prevents parallel compression.
> >
>
> Think about this: in a huge memory, most of the pages will be filtered,
> and we have 5 buffers.
>
> page1       page2     page3     page4     page5     page6       page7 ...
> [buffer1]   [2]       [3]       [4]       [5]
> unfiltered  filtered  filtered  filtered  filtered  unfiltered  filtered
>
> Since a filtered page also takes a buffer, page6 can't be compressed
> at the same time as page1.
> That's why it prevents parallel compression.
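Zhou's buffer diagram can be expressed as a small model. The following is a
hypothetical sketch, not makedumpfile's actual code: the buffer count and the
slot-assignment rule are assumptions made only to illustrate why filtered
pages occupying slots stall parallel compression.

```python
NUM_BUFFERS = 5  # assumed in-flight buffer count, as in the diagram above

def parallel_compress_count(pages, filtered_take_slot):
    """pages: list of booleans, True = unfiltered (needs compression).
    Count how many compressible pages fit in one buffer window, i.e.
    how many compressions could run in parallel."""
    in_window = 0
    compressible = 0
    for needs_compress in pages:
        # In the current scheme every page takes a slot; in the proposed
        # scheme only unfiltered pages do.
        if needs_compress or filtered_take_slot:
            in_window += 1
        if needs_compress:
            compressible += 1
        if in_window == NUM_BUFFERS:
            break  # buffer window is full
    return compressible

# The example from the thread: page1 unfiltered, pages 2-5 filtered,
# page6 unfiltered.
pages = [True, False, False, False, False, True, False]

# Current scheme: filtered pages occupy buffers, so only page1 is in
# the window and page6 cannot be compressed concurrently.
print(parallel_compress_count(pages, filtered_take_slot=True))   # 1

# If filtered pages took no slot, page1 and page6 would overlap.
print(parallel_compress_count(pages, filtered_take_slot=False))  # 2
```

Under "-d 31" almost every page is filtered, so in this model the window
almost always holds a single compressible page, matching Zhou's description.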
>
> > Further, according to Chao's benchmark, there is a big performance
> > degradation even when the number of threads is 1 (58s vs 240s).
> > The current implementation seems to have some problems; we should
> > solve them.
> >
>
> If "-d 31" is specified, on the one hand we can't save time by compressing
> in parallel, and on the other hand "--num-threads" introduces some extra
> work. So it is no surprise that there is a performance degradation.
>
> I'm not so sure whether the size of the degradation is itself a problem.
> But if it works as expected in the other cases, I don't think this is a
> problem (or a problem that needs to be fixed), since the performance
> degradation exists in theory.
>
> Or the current implementation could be replaced by a new algorithm.
> For example:
> We can add an array to record whether each page is filtered or not,
> so that only unfiltered pages take buffers.
>
> But I'm not sure whether it is worth it.
> Since "-l -d 31" is already fast enough, the new algorithm can't help
> much there either.
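The proposal above (record each page's filtered status in an array, and hand
buffer slots only to unfiltered pages) could be sketched roughly as follows.
This is a hypothetical illustration, not makedumpfile code; the function name
and data layout are invented for the example.

```python
def assign_buffers(page_filtered, num_buffers):
    """page_filtered: per-page flag array, True = page is filtered out.
    Return the indices of the pages occupying the buffer window, skipping
    filtered pages so they consume no buffer."""
    window = []
    for idx, filtered in enumerate(page_filtered):
        if filtered:
            continue  # recorded in the flag array, but takes no buffer
        window.append(idx)
        if len(window) == num_buffers:
            break  # all buffers assigned to compressible pages
    return window

# Mostly-filtered memory, as with "-d 31": with 5 buffers the window now
# reaches distant unfiltered pages, so they can be compressed in parallel.
flags = [False, True, True, True, True, False, True, False]
print(assign_buffers(flags, 5))  # [0, 5, 7]
```

The trade-off Zhou notes still holds: when "-l -d 31" is already I/O-bound
rather than compression-bound, widening the window buys little.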
>
> --
> Thanks
> Zhou
>
>
>
> _______________________________________________
> kexec mailing list
> kexec at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
>