[PATCH v3 00/10] makedumpfile: parallel processing

"Zhou, Wenjian/周文剑" zhouwj-fnst at cn.fujitsu.com
Wed Jul 22 23:39:25 PDT 2015


On 07/23/2015 02:20 PM, Atsushi Kumagai wrote:
>> Hello Kumagai,
>>
>> PATCH v3 has improved the performance.
>> The performance degradation in PATCH v2 was mainly caused by the page faults
>> produced by the function compress2().
>>
>> I wrote some code to test the performance of compress2(). It takes almost
>> the same time and produces the same number of page faults as executing
>> compress2() in a thread.
>>
>> To reduce page faults, I have to do the following in kdump_thread_function_cyclic().
>>
>> +	/*
>> +	 * lock memory to reduce page_faults by compress2()
>> +	 */
>> +	void *temp = malloc(1);
>> +	memset(temp, 0, 1);
>> +	mlockall(MCL_CURRENT);
>> +	free(temp);
>> +
>>
>> With this, performance is almost the same whether or not a thread is used.
>
> Hmm... I can't get good results with this patch, many page faults still
> occur. I guess mlock will change when page faults occur, but will not
> change the total number of page faults.
> Could you explain why compress2() causes many page faults only in thread,
> then I may understand why this patch is meaningful.
>

Actually, compress2() also causes many page faults outside a thread if
info->bitmap2 is not freed in makedumpfile.

I wrote some code to test the performance of compress2().

<cut>
buf = malloc(PAGE_SIZE);
bufout = malloc(SIZE_OUT);
memset(buf, 1, PAGE_SIZE / 2);
while (1)
     compress2(bufout, &size_out, buf, PAGE_SIZE, Z_BEST_SPEED);
<cut>

The code is roughly like this.
It causes many page faults.

But if the code is changed to the following, the result is much better.

<cut>
temp = malloc(TEMP_SIZE);
memset(temp, 0, TEMP_SIZE);
free(temp);

buf = malloc(PAGE_SIZE);
bufout = malloc(SIZE_OUT);
memset(buf, 1, PAGE_SIZE / 2);
while (1)
     compress2(bufout, &size_out, buf, PAGE_SIZE, Z_BEST_SPEED);
<cut>

TEMP_SIZE must be large enough
(values larger than 135097 work on my machine).
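
One way to put numbers on the variants above is to read the process's minor
page-fault counter before and after the workload. The following is only a
sketch, not code from the patch set: minor_faults_now(), count_minor_faults()
and touch_fresh_buffer() are hypothetical helper names, and
touch_fresh_buffer() is a stand-in workload (allocate, touch, free), not a
real compress2() call. It assumes a system where getrusage() fills in
ru_minflt, as Linux does.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Minor page faults taken by this process so far, or -1 on error. */
static long minor_faults_now(void)
{
	struct rusage ru;

	if (getrusage(RUSAGE_SELF, &ru) != 0)
		return -1;
	return ru.ru_minflt;
}

/*
 * Run work(arg) once and return how many minor page faults occurred
 * in the process meanwhile, or -1 if the counter could not be read.
 */
static long count_minor_faults(void (*work)(void *), void *arg)
{
	long before, after;

	before = minor_faults_now();
	if (before < 0)
		return -1;
	work(arg);
	after = minor_faults_now();
	if (after < 0)
		return -1;
	return after - before;
}

/*
 * Hypothetical stand-in workload: allocate *(size_t *)arg bytes, write
 * one byte every 4096 bytes (which touches every page when pages are at
 * least 4096 bytes), then free the buffer. The volatile pointer keeps
 * the compiler from dropping the writes.
 */
static void touch_fresh_buffer(void *arg)
{
	size_t len = *(size_t *)arg;
	size_t i;
	volatile char *p = malloc(len);

	if (p == NULL)
		return;
	for (i = 0; i < len; i += 4096)
		p[i] = 1;
	free((void *)p);
}
```

Passing a function that runs the compress2() loop body a fixed number of
times as the work callback, with and without the preceding
malloc/memset/free of TEMP_SIZE, would give directly comparable counts.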


In a thread, the following code reduces the page faults.

<cut>
temp = malloc(1);
memset(temp, 0, 1);
mlockall(MCL_CURRENT);
free(temp);

buf = malloc(PAGE_SIZE);
bufout = malloc(SIZE_OUT);
memset(buf, 1, PAGE_SIZE / 2);
while (1)
     compress2(bufout, &size_out, buf, PAGE_SIZE, Z_BEST_SPEED);
<cut>
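
The snippet above ignores the return value of mlockall(). On Linux the call
can fail, for example with ENOMEM or EPERM when RLIMIT_MEMLOCK is too small
for an unprivileged process, and in that case nothing is locked. Whether
that plays any role in the differing results here is not established; the
sketch below only shows how such a failure could be made visible.
lock_current_memory() is a hypothetical helper name, and RLIMIT_MEMLOCK is
assumed to be available (it is on Linux).

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>

/*
 * Lock all currently mapped pages of the process.
 * Returns 0 on success. On failure, prints the reason and the current
 * RLIMIT_MEMLOCK soft limit to stderr and returns -1.
 */
static int lock_current_memory(void)
{
	struct rlimit rl;

	if (mlockall(MCL_CURRENT) == 0)
		return 0;

	fprintf(stderr, "mlockall(MCL_CURRENT) failed: %s\n", strerror(errno));
	if (getrlimit(RLIMIT_MEMLOCK, &rl) == 0)
		fprintf(stderr, "RLIMIT_MEMLOCK soft limit: %llu bytes\n",
			(unsigned long long)rl.rlim_cur);
	return -1;
}
```

Calling this in place of the bare mlockall(MCL_CURRENT) keeps the behaviour
the same on success and reports the reason on failure.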

I don't know why yet.

-- 
Thanks
Zhou Wenjian

>
> Thanks
> Atsushi Kumagai
>
>> On our machine, I get the same results as the following with PATCH v2.
>>> Test2-1:
>>>    | threads | compress time | exec time |
>>>    |    1    |     76.12     |   82.13   |
>>>
>>> Test2-2:
>>>    | threads | compress time | exec time |
>>>    |    1    |     41.97     |   51.46   |
>>
>> I tested the new patch set on the machine; the results are below.
>>
>> PATCH V2:
>> ###################################
>> - System: PRIMEQUEST 1800E
>> - CPU: Intel(R) Xeon(R) CPU E7540
>> - memory: 32GB
>> ###################################
>> ************ makedumpfile -d 0 ******************
>>                  core-data               0       256     512     768     1024    1280    1536    1792
>>          threads-num
>> -c
>>          0                               158     1505    2119    2129    1707    1483    1440    1273
>>          4                               207     589     672     673     636     564     536     514
>>          8                               176     327     377     387     367     336     314     291
>>          12                              191     272     295     306     288     259     257     240
>>
>> ************ makedumpfile -d 7 ******************
>>                  core-data               0       256     512     768     1024    1280    1536    1792
>>          threads-num
>> -c
>>          0                               154     1508    2089    2133    1792    1660    1462    1312
>>          4                               203     594     684     701     627     592     535     503
>>          8                               172     326     377     393     366     334     313     286
>>          12                              182     273     295     308     283     258     249     237
>>
>>
>>
>> PATCH v3:
>> ###################################
>> - System: PRIMEQUEST 1800E
>> - CPU: Intel(R) Xeon(R) CPU E7540
>> - memory: 32GB
>> ###################################
>> ************ makedumpfile -d 0 ******************
>>                  core-data               0       256     512     768     1024    1280    1536    1792
>>          threads-num
>> -c
>>          0                               192     1488    1830
>>          4                               62      393     477
>>          8                               78      211     258
>>
>> ************ makedumpfile -d 7 ******************
>>                  core-data               0       256     512     768     1024    1280    1536    1792
>>          threads-num
>> -c
>>          0                               197     1475    1815
>>          4                               62      396     482
>>          8                               78      209     252
>>
>>
>> --
>> Thanks
>> Zhou Wenjian
>>
>> On 07/21/2015 02:29 PM, Zhou Wenjian wrote:
>>> This patch set implements parallel processing by means of multiple threads.
>>> With this patch set, multiple threads can be used to read
>>> and compress pages. This parallel processing saves time.
>>> This feature only supports creating a dumpfile in kdump-compressed format from
>>> a vmcore in kdump-compressed format or ELF format. Currently, sadump and
>>> xen kdump are not supported.
>>>
>>> Qiao Nuohan (10):
>>>     Add readpage_kdump_compressed_parallel
>>>     Add mappage_elf_parallel
>>>     Add readpage_elf_parallel
>>>     Add read_pfn_parallel
>>>     Add function to initial bitmap for parallel use
>>>     Add filter_data_buffer_parallel
>>>     Add write_kdump_pages_parallel to allow parallel process
>>>     Initial and free data used for parallel process
>>>     Make makedumpfile available to read and compress pages parallelly
>>>     Add usage and manual about multiple threads process
>>>
>>>    Makefile       |    2 +
>>>    erase_info.c   |   29 ++-
>>>    erase_info.h   |    2 +
>>>    makedumpfile.8 |   24 ++
>>>    makedumpfile.c | 1095 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>    makedumpfile.h |   80 ++++
>>>    print_info.c   |   16 +
>>>    7 files changed, 1245 insertions(+), 3 deletions(-)
>>>
>>>




