[PATCH v2 0/8] Handle mmaped regions in cache [more analysis]

Michael Holzheu holzheu at linux.vnet.ibm.com
Thu Mar 12 06:56:16 PDT 2015

On Mon, 9 Mar 2015 17:08:58 +0100
Michael Holzheu <holzheu at linux.vnet.ibm.com> wrote:

> Hello Petr,


> As a conclusion, we could think of mapping larger chunks
> also for the fragmented case of -d 31 to reduce the amount
> of mmap/munmap calls.

FYI: I did some more tests and I am no longer sure if the above
conclusion was correct.

A simple "copy" program that reads or maps/unmaps every page
from /proc/vmcore and then writes it to /dev/null is faster
with mmap()/munmap() than with read():

# time ./copy /dev/null read

real    0m1.072s
user    0m0.010s
sys     0m1.054s

# perf stat -e syscalls:sys_enter_old_mmap,syscalls:sys_enter_munmap,syscalls:sys_enter_read ./copy /dev/null read
                 8      syscalls:sys_enter_old_mmap
                 1      syscalls:sys_enter_munmap
            458753      syscalls:sys_enter_read
       1.405457536 seconds time elapsed

# time ./copy /dev/null mmap

real    0m0.947s
user    0m0.314s
sys     0m0.631s

# perf stat -e syscalls:sys_enter_old_mmap,syscalls:sys_enter_munmap,syscalls:sys_enter_read ./copy /dev/null mmap
            458760      syscalls:sys_enter_old_mmap
            458753      syscalls:sys_enter_munmap
                 1      syscalls:sys_enter_read
       1.175956735 seconds time elapsed
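
For reference, the two copy modes measured above could look roughly
like the sketch below. This is not the actual "copy" program, only an
illustration of the technique: one read() per page versus one
mmap()/munmap() pair per page, each followed by a write() to the
output. The function names copy_read() and copy_mmap(), the explicit
source path argument, and the use of fstat() to find the size of the
source are all assumptions made for this example.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Write exactly len bytes to fd; returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
	while (len > 0) {
		ssize_t n = write(fd, buf, len);

		if (n <= 0)
			return -1;
		buf += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Hypothetical helper: copy src to dst using one read() per page.
 * Returns the number of page-sized chunks copied, or -1 on error. */
long copy_read(const char *src, const char *dst)
{
	long page_size = sysconf(_SC_PAGESIZE);
	long pages = 0;
	ssize_t n = -1;
	int in = open(src, O_RDONLY);
	int out = open(dst, O_WRONLY);
	char *buf = malloc((size_t)page_size);

	if (in >= 0 && out >= 0 && buf != NULL) {
		while ((n = read(in, buf, (size_t)page_size)) > 0) {
			if (write_all(out, buf, (size_t)n) < 0) {
				n = -1;
				break;
			}
			pages++;
		}
	}
	free(buf);
	if (in >= 0)
		close(in);
	if (out >= 0)
		close(out);
	return n < 0 ? -1 : pages;
}

/* Hypothetical helper: copy src to dst using one mmap()/munmap() pair
 * per page. Assumes fstat() reports the size of src.
 * Returns the number of pages copied, or -1 on error. */
long copy_mmap(const char *src, const char *dst)
{
	long page_size = sysconf(_SC_PAGESIZE);
	long pages = 0;
	int rc = -1;
	struct stat st;
	int in = open(src, O_RDONLY);
	int out = open(dst, O_WRONLY);

	if (in >= 0 && out >= 0 && fstat(in, &st) == 0) {
		off_t off;

		rc = 0;
		for (off = 0; off < st.st_size; off += page_size) {
			/* The last page may be only partially backed by the file. */
			size_t len = (size_t)page_size;
			void *p;

			if (st.st_size - off < (off_t)page_size)
				len = (size_t)(st.st_size - off);
			p = mmap(NULL, (size_t)page_size, PROT_READ,
				 MAP_PRIVATE, in, off);
			if (p == MAP_FAILED) {
				rc = -1;
				break;
			}
			if (write_all(out, p, len) < 0)
				rc = -1;
			munmap(p, (size_t)page_size);
			if (rc < 0)
				break;
			pages++;
		}
	}
	if (in >= 0)
		close(in);
	if (out >= 0)
		close(out);
	return rc < 0 ? -1 : pages;
}
```

A small main() would pick copy_read() or copy_mmap() based on the
second command line argument and pass /proc/vmcore as the source, to
match the "./copy /dev/null read" and "./copy /dev/null mmap"
invocations above.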


