makedumpfile: benchmark on mmap() with /proc/vmcore on 2TB memory system

HATAYAMA Daisuke d.hatayama at jp.fujitsu.com
Tue Mar 26 23:30:19 EDT 2013


Hello,

I finally benchmarked makedumpfile with mmap() on /proc/vmcore on a
*2TB memory system*.

In summary, it took about 35 seconds to filter 2TB of memory. This can be
compared with the two kernel-space filtering efforts:

- Cliff Wickman's 4 minutes on an 8TB memory system:
  http://lists.infradead.org/pipermail/kexec/2012-November/007177.html

- Jingbai Ma's 17.50 seconds on a 1TB memory system:
  https://lkml.org/lkml/2013/3/7/275

= Machine spec

- System: PRIMEQUEST 1800E2
- CPU: Intel(R) Xeon(R) CPU E7- 8870  @ 2.40GHz (8 sockets, 10 cores, 2 threads)
  (*) only 1 logical CPU is used in the 2nd kernel now.
- memory: 2TB
- kernel: 3.9-rc3 with the patch set in: https://lkml.org/lkml/2013/3/18/878
- kexec tools: v2.0.4
- makedumpfile
  - v1.5.2-map: git map branch
  - git://git.code.sf.net/p/makedumpfile/code
  - To use mmap, specify the --map-size <size in kilobytes> option.

= Performance of filtering processing

== How to measure

I measured the performance of filtering processing by reading the times
reported in makedumpfile's report messages. For example:

$ makedumpfile --message-level 31 -p -d 31 /proc/vmcore vmcore-pd31
...
STEP [Checking for memory holes  ] : 0.163673 seconds
STEP [Excluding unnecessary pages] : 1.321702 seconds
STEP [Excluding free pages       ] : 0.489022 seconds
STEP [Copying data               ] : 26.221380 seconds

The messages starting with "STEP [Excluding" are the ones that report
filtering processing.

- STEP [Excluding unnecessary pages] corresponds to the time for the
  mem_map array logic.

- STEP [Excluding free pages] corresponds to the time for the free list
  logic.

In cyclic mode these messages are displayed once per cycle, that is,
exactly as many times as there are cycles.
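For reference, the per-step times can be pulled out of a saved report with a small filter like the one below. This is only a sketch: the function name sum_excluding_steps is made up for illustration, and it assumes the report lines keep the "STEP [name] : N seconds" shape shown above. It adds up the seconds per step name, which is what is needed in cyclic mode, where each step is reported once per cycle.

```shell
# Sum the seconds of every "STEP [Excluding ...]" line read from stdin and
# print one total per step name, in order of first appearance.
# (sum_excluding_steps is a hypothetical helper, not part of makedumpfile.)
sum_excluding_steps() {
    awk '
        /^STEP \[Excluding/ {
            name = $0
            sub(/^STEP \[/, "", name)   # drop the leading "STEP ["
            sub(/ *\].*$/, "", name)    # drop padding, "]" and the rest
            if (!(name in seen)) { seen[name] = 1; order[++n] = name }
            total[name] += $(NF - 1)    # the field just before "seconds"
        }
        END {
            for (i = 1; i <= n; i++)
                printf "%s: %.6f\n", order[i], total[order[i]]
        }
    '
}

# Example with the report lines quoted above.
sum_excluding_steps <<'EOF'
STEP [Checking for memory holes  ] : 0.163673 seconds
STEP [Excluding unnecessary pages] : 1.321702 seconds
STEP [Excluding free pages       ] : 0.489022 seconds
STEP [Copying data               ] : 26.221380 seconds
EOF
```

With a cyclic-mode report saved to a file, the same function can be fed with "sum_excluding_steps < report.txt" (the file name is only an example).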

== Result

mmap (all times in seconds)

| map_size | unnecessary | unnecessary |  free list |
|     [KB] |      cyclic |  non-cyclic | non-cyclic |
|----------+-------------+-------------+------------|
|        4 |   66.212    |   59.087    |  75.165    |
|        8 |   51.594    |   44.863    |  75.657    |
|       16 |   43.761    |   36.338    |  75.508    |
|       32 |   39.235    |   32.911    |  76.061    |
|       64 |   37.201    |   30.201    |  76.116    |
|      128 |   35.901    |   29.238    |  76.261    |
|      256 |   35.152    |   28.506    |  76.700    |
|      512 |   34.711    |   27.956    |  77.660    |
|     1024 |   34.432    |   27.746    |  79.319    |
|     2048 |   34.361    |   27.594    |  84.331    |
|     4096 |   34.236    |   27.474    |  91.517    |
|     8192 |   34.173    |   27.450    | 105.648    |
|    16384 |   34.240    |   27.448    | 133.099    |
|    32768 |   34.291    |   27.479    | 184.488    |

read (all times in seconds)

| unnecessary | unnecessary | free list  |
| cyclic      | non-cyclic  | non-cyclic |
|-------------+-------------+------------|
| 100.859588  | 93.881849   | 80.367015  |

== Discussion

- In the best case (about 34 seconds in cyclic mode for map sizes of
  1024 KB and above), performance is close to that of the kernel-space
  work by Cliff and Ma mentioned at the top.

- The time consumed for filtering unnecessary pages differs between
  cyclic mode and non-cyclic mode because the former also filters free
  pages in that step while the latter does not; in the latter, free
  pages are filtered separately by the free list logic.
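To put the three figures from the summary on a common scale, the sketch below divides each time by the memory size. The helper name secs_per_tb is made up for illustration, "4 minutes" is taken as 240 seconds, and "about 35 seconds" is used for this benchmark. The machines and kernels differ, so this is only a rough comparison.

```shell
# Print seconds per TB of memory, given total seconds and memory size in TB.
# (secs_per_tb is a hypothetical helper used only for this comparison.)
secs_per_tb() {
    awk -v secs="$1" -v tb="$2" 'BEGIN { printf "%.2f\n", secs / tb }'
}

echo "makedumpfile + mmap, 2TB (this post): $(secs_per_tb 35 2) s/TB"
echo "Cliff Wickman, 8TB:                   $(secs_per_tb 240 8) s/TB"
echo "Jingbai Ma, 1TB:                      $(secs_per_tb 17.50 1) s/TB"
```

This gives 17.50, 30.00 and 17.50 seconds per TB respectively.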

= Performance degradation in cyclic mode

The next benchmark measures how performance changes in cyclic mode as
the number of cycles increases.

== How to measure

This is measured in the same way as above, except that in this
benchmark I also varied --cyclic-buffer as a parameter.

The command I executed was like:

  for buf_size in 4 8 16 ... 32768 ; do
    time makedumpfile --cyclic-buffer ${buf_size} /proc/vmcore vmcore
    rm -f ./vmcore
  done

I chose the buffer sizes so that the number of cycles ranged from 1 to 8,
because the largest existing systems have up to 16TB of memory and, with
crashkernel=512MB, the number of cycles would be at most 8.
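As a rough sanity check of that bound, the sketch below estimates the size of a bitmap that covers all of memory. It assumes 4KB pages and one bit per page in the bitmap; both are assumptions and are not stated in this post, and bitmap_kb is a made-up helper name. How crashkernel=512MB translates into an actual buffer size is not covered here.

```shell
# Size in KB of a bitmap covering mem_kb of memory, assuming one bit per
# page (assumption) and pages of page_kb KB (default 4, also an assumption).
# (bitmap_kb is a hypothetical helper, not a makedumpfile function.)
bitmap_kb() {
    mem_kb=$1
    page_kb=${2:-4}
    pages=$(( mem_kb / page_kb ))
    echo $(( pages / 8 / 1024 ))    # bits -> bytes -> KB
}

tib_kb=$(( 1024 * 1024 * 1024 ))    # number of KB in 1TB (2^40 bytes)

echo "2TB bitmap:  $(bitmap_kb $(( 2 * tib_kb ))) KB"
echo "16TB bitmap: $(bitmap_kb $(( 16 * tib_kb ))) KB"
echo "16TB in 8 cycles: $(( $(bitmap_kb $(( 16 * tib_kb ))) / 8 )) KB per cycle"
```

Under these assumptions 16TB needs a 524288 KB (512MB) bitmap, so 8 cycles correspond to about 64MB of bitmap per cycle. 2TB gives 65536 KB, which is close to the 65600 KB buffer that yields a single cycle in the results below.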

== Result

mmap (per-cycle and total times in seconds)

| buf size | nr cycles |      1 |      2 |      3 |     4 |     5 | 6     | 7     | 8     |  total |
|     [KB] |           |        |        |        |       |       |       |       |       |        |
|----------+-----------+--------+--------+--------+-------+-------+-------+-------+-------+--------|
|     8747 |         8 |  4.695 |  4.470 |  4.582 | 4.512 | 4.935 | 4.790 | 4.824 | 2.345 | 35.153 |
|     9371 |         8 |  5.010 |  4.782 |  4.891 | 4.996 | 5.280 | 5.108 | 4.986 | 0.007 | 35.059 |
|    10092 |         7 |  5.371 |  5.145 |  5.001 | 5.316 | 5.500 | 5.405 | 2.593 | -     | 34.330 |
|    10933 |         7 |  5.816 |  5.581 |  5.533 | 6.169 | 6.163 | 5.882 | 0.007 | -     | 35.152 |
|    11927 |         6 |  6.308 |  6.078 |  6.174 | 6.734 | 6.667 | 3.049 | -     | -     | 35.010 |
|    13120 |         5 |  6.967 |  6.641 |  6.973 | 7.427 | 6.899 | -     | -     | -     | 34.907 |
|    14578 |         5 |  7.678 |  7.536 |  7.948 | 8.161 | 3.845 | -     | -     | -     | 35.167 |
|    16400 |         4 |  8.942 |  8.697 |  9.529 | 9.276 |     - | -     | -     | -     | 36.445 |
|    18743 |         4 |  9.822 |  9.718 | 10.452 | 5.013 |     - | -     | -     | -     | 35.005 |
|    21867 |         3 | 11.413 | 11.550 | 11.923 |     - |     - | -     | -     | -     | 34.886 |
|    26240 |         3 | 13.554 | 14.104 |  7.114 |     - |     - | -     | -     | -     | 34.772 |
|    32800 |         2 | 16.693 | 17.809 |      - |     - |     - | -     | -     | -     | 34.502 |
|    43733 |         2 | 22.633 | 11.863 |      - |     - |     - | -     | -     | -     | 34.497 |
|    65600 |         1 | 34.245 |      - |      - |     - |     - | -     | -     | -     | 34.245 |
|   131200 |         1 | 34.291 |      - |      - |     - |     - | -     | -     | -     | 34.291 |

read (per-cycle and total times in seconds)

| buf size | nr cycles |       1 |      2 |      3 |      4 |      5 | 6      | 7      | 8     |   total |
|     [KB] |           |         |        |        |        |        |        |        |       |         |
|----------+-----------+---------+--------+--------+--------+--------+--------+--------+-------+---------|
|     8747 |         8 |  13.514 | 13.351 | 13.294 | 13.488 | 13.981 | 13.678 | 13.848 | 6.953 | 102.106 |
|     9371 |         8 |  14.429 | 14.279 | 14.484 | 14.624 | 14.929 | 14.649 | 14.620 | 0.001 | 102.017 |
|    10092 |         7 |  15.560 | 15.375 | 15.164 | 15.559 | 15.720 | 15.626 |  8.033 | -     | 101.036 |
|    10933 |         7 |  16.906 | 16.724 | 16.650 | 17.474 | 17.440 | 17.127 |  0.002 | -     | 102.319 |
|    11927 |         6 |  18.456 | 18.254 | 18.339 | 19.037 | 18.943 |  9.477 | -      | -     | 102.505 |
|    13120 |         5 |  20.162 | 20.222 | 20.287 | 20.779 | 20.149 | -      | -      | -     | 101.599 |
|    14578 |         5 |  22.646 | 22.535 | 23.006 | 23.237 | 11.519 | -      | -      | -     | 102.942 |
|    16400 |         4 |  25.228 | 25.033 | 26.016 | 25.660 |      - | -      | -      | -     | 101.936 |
|    18743 |         4 |  28.849 | 28.761 | 29.648 | 14.677 |      - | -      | -      | -     | 101.935 |
|    21867 |         3 |  33.720 | 33.877 | 34.344 |      - |      - | -      | -      | -     | 101.941 |
|    26240 |         3 |  40.403 | 41.042 | 20.642 |      - |      - | -      | -      | -     | 102.087 |
|    32800 |         2 |  50.393 | 51.895 |      - |      - |      - | -      | -      | -     | 102.288 |
|    43733 |         2 |  66.658 | 34.056 |      - |      - |      - | -      | -      | -     | 100.714 |
|    65600 |         1 | 100.975 |      - |      - |      - |      - | -      | -      | -     | 100.975 |
|   131200 |         1 | 100.699 |      - |      - |      - |      - | -      | -      | -     | 100.699 |

- As the results show, the degradation is very small: only about one
  second between 1 cycle and 8 cycles. Also, this small degradation
  depends on the number of cycles, not on the I/O size, so it should not
  get worse even if system memory becomes larger.
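The "nr cycles" column above can be reproduced with a ceiling division, assuming the whole bitmap is about 65600 KB. That figure is inferred from the tables (it is the smallest listed buffer size that finishes in one cycle) and is not stated in this post; nr_cycles is a made-up helper name.

```shell
# Number of cycles needed to walk a bitmap of total_kb KB with a cyclic
# buffer of buf_kb KB: ceil(total_kb / buf_kb) in integer arithmetic.
# (nr_cycles is a hypothetical helper; 65600 KB below is inferred, not given.)
nr_cycles() {
    total_kb=$1
    buf_kb=$2
    echo $(( (total_kb + buf_kb - 1) / buf_kb ))
}

for buf in 8747 9371 10092 10933 11927 13120 14578 16400 \
           18743 21867 26240 32800 43733 65600 131200; do
    echo "$buf KB -> $(nr_cycles 65600 "$buf") cycles"
done
```

This also suggests why the last cycle takes almost no time for the 9371 KB and 10933 KB buffers: 7 x 9371 = 65597 and 6 x 10933 = 65598, so only a few KB of bitmap are left for the final cycle.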

Thanks.
HATAYAMA, Daisuke



