[PATCH] makedumpfile: print spinner in progress information
Atsushi Kumagai
kumagai-atsushi at mxc.nes.nec.co.jp
Tue Oct 29 00:50:30 EDT 2013
(2013/10/29 11:43), HATAYAMA Daisuke wrote:
> (2013/10/29 10:26), HATAYAMA Daisuke wrote:
>> (2013/10/25 13:07), Atsushi Kumagai wrote:
>>> Hello HATAYAMA-san,
>>>
>>> (2013/10/25 9:55), HATAYAMA Daisuke wrote:
>>>> On systems with huge memory, the percentage in the progress information
>>>> is updated at a very slow interval, because 1 percent of 1 TiB of memory
>>>> is about 10 GiB, which makes it look as if the system has frozen.
>>>> Confused users might then be tempted to push the reset button to recover
>>>> the system. We want to avoid such a situation as much as possible.
>>>>
>>>> To address the issue, this patch adds a spinner that rotates in the
>>>> order of /, |, \ and - next to the percentage progress indicator,
>>>> which helps users notice that the system is still active and the crash
>>>> dump process is still in progress.
>>>>
>>>> This code is borrowed from diskdump code.
>>>>
>>>> The example is like this:
>>>>
>>>> Copying data : [ 0 %] /
>>>> Copying data : [ 8 %] |
>>>> Copying data : [ 11 %] \
>>>> Copying data : [ 14 %] -
>>>> Copying data : [ 16 %] /
>>>> ...
>>>> Copying data : [ 99 %] /
>>>> Copying data : [100 %] |
>>>
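For reference, the spinner described above could be sketched roughly as
follows. This is only an illustration and not the code from the patch:
spinner_char() and format_progress() are hypothetical helpers, and the
format string is an assumption modelled on the example output.

```c
#include <stdio.h>

/* Hypothetical helper: spinner character for the n-th progress update.
 * The characters rotate in the order /, |, \ and -. */
static char spinner_char(unsigned long n)
{
	static const char spinner[] = "/|\\-";

	return spinner[n % 4];
}

/* Hypothetical helper: format one progress line into buf.
 * The leading '\r' lets each update overwrite the previous line. */
static int format_progress(char *buf, size_t len, const char *msg,
			   unsigned long long current,
			   unsigned long long end,
			   unsigned long n_calls)
{
	/* Treat an empty range as already complete to avoid dividing by 0. */
	int percent = end ? (int)(current * 100 / end) : 100;

	return snprintf(buf, len, "\r%-16s: [%3d %%] %c",
			msg, percent, spinner_char(n_calls));
}
```

A caller would keep a counter of how many times it has printed, pass it
as n_calls, and write buf to stderr or stdout followed by fflush(), so
the spinner advances even when the percentage itself does not change.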
>>> I like it, but have a comment.
>>>
>>> int
>>> write_kdump_pages_cyclic(struct cache_data *cd_header, struct cache_data *cd_page,
>>>                          struct page_desc *pd_zero, off_t *offset_data)
>>> {
>>> ...
>>>         per = info->num_dumpable / 100;
>>> ...
>>>         for (pfn = start_pfn; pfn < end_pfn; pfn++) {
>>>
>>>                 if ((num_dumped % per) == 0)
>>>                         print_progress(PROGRESS_COPY, num_dumped, info->num_dumpable);
>>>
>>> The interval between calls to print_progress() still looks long if
>>> num_dumpable is huge.
>>> So how about fixing this, e.g., by making the interval time-based?
>>>
>>
>> I wrote a simple benchmark for the time-based interval, shown below, which
>> measures the total time consumed by calling the time() system call with
>> and without vDSO. It seems to me that both results are acceptable.
>> I'll reflect this change in the next version.
>>
>> $ ./bench
>> total: 21.059131
>> average: 0.000000
>> total: 65.558263
>> average: 0.000000
>>
>
> This conclusion was wrong. Sorry. For example, on our FJ 12 TiB system we collected a crash
> dump of about 300 GiB in about 40 minutes. If we remove "if ((num_dumped % per) == 0)" and call
> time() in print_progress() on every loop iteration, the total time for invoking the time()
> system call is about 65 * 12 = 780 sec = 13 min. This is about 20 % of the whole crash dump
> time. Obviously problematic.
>
> Instead, I think it is better to increase the number of calls to print_progress(), like:
>
> per = info->num_dumpable / 10000
I agree with you, let's go with this idea.
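
One detail to watch with a smaller divisor: if num_dumpable is less than
10000, per becomes 0 and "num_dumped % per" divides by zero. A minimal
sketch of a guarded interval follows. The helper names progress_interval()
and count_updates() are hypothetical, used here only to illustrate the
idea; they are not taken from makedumpfile.

```c
#include <stdio.h>

/* Hypothetical helper: number of pages between two progress updates.
 * steps is the divisor discussed above (100 today, 10000 proposed). */
static unsigned long long progress_interval(unsigned long long num_dumpable,
					    unsigned long long steps)
{
	unsigned long long per = num_dumpable / steps;

	/* Small dumps would give per == 0, and "x % 0" is undefined. */
	return per ? per : 1;
}

/* Hypothetical helper: count how often the copy loop would reach
 * print_progress() for a given number of dumpable pages. */
static unsigned long long count_updates(unsigned long long num_dumpable,
					unsigned long long steps)
{
	unsigned long long per = progress_interval(num_dumpable, steps);
	unsigned long long num_dumped, calls = 0;

	for (num_dumped = 0; num_dumped < num_dumpable; num_dumped++)
		if ((num_dumped % per) == 0)
			calls++;	/* print_progress() would run here */

	return calls;
}
```

With 1,000,000 dumpable pages this gives 100 updates for a divisor of 100
and 10,000 updates for a divisor of 10000. By the rough arithmetic of the
original post, that shortens the step from about 10 GiB to about 0.1 GiB
per TiB, without calling time() in every loop iteration.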
Thanks
Atsushi Kumagai