[PATCH v3 00/10] makedumpfile: parallel processing
ats-kumagai at wm.jp.nec.com
Wed Aug 5 19:46:35 PDT 2015
>On 07/31/2015 05:35 PM, "Zhou, Wenjian/周文剑" wrote:
>> On 07/31/2015 04:27 PM, Atsushi Kumagai wrote:
>>>> On 07/23/2015 02:20 PM, Atsushi Kumagai wrote:
>>>>>> Hello Kumagai,
>>> I assume that we are facing a known glibc issue:
>>> According to the thread above, a per-thread arena is grown and
>>> trimmed more readily than the main arena.
>>> In fact, compress2() calls malloc() and free() each time it is called,
>>> so every compression can cause page faults.
>>> Moreover, I confirmed that many madvise(MADV_DONTNEED) calls are issued only
>>> when compress2() is called in a thread.
>>> OTOH, in the lzo case, the temporary work buffer is allocated on the caller
>>> side, which reduces the number of malloc()/free() pairs.
>>> (I'm not sure why snappy doesn't hit this issue, though. Its compression
>>> buffer may be smaller than the trim threshold.)
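The caller-side work buffer pattern described for lzo can be sketched in plain C. This is only an illustration under stated assumptions: `struct work_buf`, `work_buf_init()`, `work_buf_free()` and `rle_compress()` are hypothetical names, and the toy run-length encoder merely stands in for a real compressor. The point is that the scratch memory is allocated once by the caller and reused for every page, so the per-page path does not call malloc()/free().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical scratch area owned by the caller (one per thread). */
struct work_buf {
    unsigned char *mem;
    size_t size;
};

/* Allocate the scratch area once, outside the per-page loop. */
static int work_buf_init(struct work_buf *w, size_t size)
{
    assert(w != NULL);
    w->mem = malloc(size);
    w->size = w->mem ? size : 0;
    return w->mem ? 0 : -1;
}

static void work_buf_free(struct work_buf *w)
{
    assert(w != NULL);
    free(w->mem);
    w->mem = NULL;
    w->size = 0;
}

/*
 * Toy run-length encoder (a stand-in, not lzo or zlib): emits
 * (count, byte) pairs with count <= 255. Output is staged in the
 * caller's scratch buffer, so this function never touches the heap.
 * Returns the encoded length, or 0 if a buffer is too small.
 */
static size_t rle_compress(struct work_buf *w,
                           const unsigned char *in, size_t in_len,
                           unsigned char *out, size_t out_cap)
{
    size_t i = 0, n = 0;

    while (i < in_len) {
        unsigned char b = in[i];
        size_t run = 1;

        while (i + run < in_len && in[i + run] == b && run < 255)
            run++;
        if (n + 2 > w->size)
            return 0;
        w->mem[n++] = (unsigned char)run;
        w->mem[n++] = b;
        i += run;
    }
    if (n > out_cap)
        return 0;
    memcpy(out, w->mem, n);
    return n;
}
```

A worker thread would call work_buf_init() once at startup, call rle_compress() for each page, and call work_buf_free() on exit.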
>>> Anyway, it's hard to avoid this issue for zlib on the application side,
>>> so it seems that we have to accept the performance degradation it causes.
>>> Unfortunately, the main target of this multi-thread feature is zlib, as you
>>> measured, so we should resolve this issue somehow.
>>> Nevertheless, even now we can get some benefit from parallel processing,
>>> so let's start discussing the implementation of the parallel processing
>>> feature so that this patch can be accepted. I have some comments:
>>> - read_pfn_parallel() doesn't use the cache feature (cache.c). Is that
>>> intentional?
>> Yes, since the data are read once a page here, cache feature seems not
OK, I see.
>>> - Now --num-buffers is tunable, but the man description and your benchmark
>>> don't mention what the benefit of this parameter is.
>> The default value of num-buffers is 50. Originally the value had a great influence
>> on the performance. But since we changed the logic in the 2nd version of the
>> patch set, more buffers bring little improvement (1000 buffers may give a 1% improvement).
>> I'm considering whether the option should be removed. What do you think about it?
I think this option should be removed; most users wouldn't use it.
>> BTW, the code (mlockall) added in the 3rd version works well on several machines here.
>> Should I keep it?
>> With that code, madvise(MADV_DONTNEED) fails in compress2() and the performance
>> is as expected on these machines.
That kludge isn't reasonable; it just changes the memory allocation pattern.
If you can't explain in theory why it works well, you should get rid of it.
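For reference, the failing madvise() can be reproduced in isolation. On Linux, madvise(MADV_DONTNEED) is documented to return EINVAL when the range includes locked pages. The sketch below assumes Linux, uses mlock() on a single page instead of mlockall(), and `try_dontneed()` is a hypothetical helper name. It only shows why the call fails, not why the performance changes.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map one anonymous page, optionally lock it,
 * then call madvise(MADV_DONTNEED) on it.
 * Returns 0 if madvise() succeeded, the errno value if it failed,
 * or -1 if the mapping or the lock could not be set up
 * (for example because of RLIMIT_MEMLOCK).
 */
static int try_dontneed(int lock)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t len;
    void *p;
    int ret = -1;

    assert(lock == 0 || lock == 1);
    if (page <= 0)
        return -1;
    len = (size_t)page;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    if (!lock || mlock(p, len) == 0) {
        /* Save errno before munmap() can overwrite it. */
        ret = (madvise(p, len, MADV_DONTNEED) == 0) ? 0 : errno;
    }

    munmap(p, len);
    return ret;
}
```

try_dontneed(0) is expected to return 0, and try_dontneed(1) is expected to return EINVAL when the page could be locked.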