Recent 3.x kernels: Memory leak causing OOMs

David Rientjes rientjes at google.com
Sun Feb 16 18:42:46 EST 2014


On Sun, 16 Feb 2014, Russell King - ARM Linux wrote:

> However, that doesn't negate the point which I brought up in my other
> mail - I have been chasing a memory leak elsewhere, and I so far have
> two dumps off a different machine - both of these logs are from the
> same machine, which took 41 days to OOM.
> 
> http://www.home.arm.linux.org.uk/~rmk/misc/log-20131228.txt
> http://www.home.arm.linux.org.uk/~rmk/misc/log-20140208.txt
> 

You actually have free memory in both of these; the problem is 
fragmentation: the first log shows oom kills where order=2 and the second 
log shows oom kills where order=3.
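To make the fragmentation point concrete, here is a minimal sketch that reads one line in the /proc/buddyinfo format and checks whether any free block of a given order exists. The sample line is made up for illustration and is not taken from either log, and the 4 kB page size is an assumption. The real allocator also honours watermarks and may reclaim or compact before failing, so this only shows why "free" does not imply "allocatable at order=2".

```python
PAGE_KB = 4  # assumption: 4 kB pages


def parse_buddyinfo_line(line):
    """Split one /proc/buddyinfo line into (zone name, free block counts per order)."""
    fields = line.split()
    # Layout: "Node", "<n>,", "zone", "<name>", then one count per order 0..N
    return fields[3], [int(n) for n in fields[4:]]


def free_kb(counts):
    """Total free memory in kB; a block of a given order spans 2**order pages."""
    return sum(n * (PAGE_KB << order) for order, n in enumerate(counts))


def has_block(counts, order):
    """True if at least one free block of the requested order or larger exists."""
    return any(n > 0 for n in counts[order:])


# Hypothetical line: plenty of free memory, but only in order-0 and order-1 blocks.
zone, counts = parse_buddyinfo_line(
    "Node 0, zone   Normal   7000  300  0  0  0  0  0  0  0  0  0"
)
print(zone, free_kb(counts), "kB free")
for order in (1, 2, 3):
    print("order", order, "block available:", has_block(counts, order))
```

With these made-up counts the zone has tens of megabytes free, yet nothing at order 2 or above, which is the situation an order=2 or order=3 oom kill suggests.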

If I look at an example from the second log:

Normal free:35052kB min:1416kB low:1768kB high:2124kB active_anon:28kB 
inactive_anon:60kB active_file:140kB inactive_file:140kB unevictable:0kB 
isolated(anon):0kB isolated(file):0kB present:131072kB managed:125848kB 
mlocked:0kB dirty:0kB writeback:40kB mapped:0kB shmem:0kB 
slab_reclaimable:3024kB slab_unreclaimable:9036kB kernel_stack:1248kB 
pagetables:1696kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB 
pages_scanned:574 all_unreclaimable? yes

you definitely are missing memory somewhere, but I'm not sure it's going 
to be detected by kmemleak since the slab stats aren't very high.  The 
system has ~123MB of memory, ~34.5MB is user or free memory, ~12MB is 
slab, and ~3MB for stack and pagetables means you're missing over half of 
your memory somewhere.  There are types of memory that aren't shown here, 
such as vmalloc(), things that call alloc_pages() directly, hugepages, 
etc.
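The accounting above can be reproduced as a short sketch. The zone line is the one quoted from the second log; which fields count as "user or free" is my assumption (free plus the anon and file LRU lists), so treat the grouping as illustrative rather than as an exact breakdown.

```python
import re

# Zone line copied from the second OOM log quoted above.
ZONE = (
    "Normal free:35052kB min:1416kB low:1768kB high:2124kB active_anon:28kB "
    "inactive_anon:60kB active_file:140kB inactive_file:140kB unevictable:0kB "
    "isolated(anon):0kB isolated(file):0kB present:131072kB managed:125848kB "
    "mlocked:0kB dirty:0kB writeback:40kB mapped:0kB shmem:0kB "
    "slab_reclaimable:3024kB slab_unreclaimable:9036kB kernel_stack:1248kB "
    "pagetables:1696kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB "
    "pages_scanned:574 all_unreclaimable? yes"
)


def parse_kb(line):
    """Return {field: kB} for every 'name:<n>kB' pair in a zone line."""
    return {k: int(v) for k, v in re.findall(r"([\w()]+):(\d+)kB", line)}


z = parse_kb(ZONE)

# Assumed grouping: free pages plus everything on the anon/file LRU lists.
user_or_free = sum(
    z[k]
    for k in ("free", "active_anon", "inactive_anon", "active_file", "inactive_file")
)
slab = z["slab_reclaimable"] + z["slab_unreclaimable"]
stack_pt = z["kernel_stack"] + z["pagetables"]

unaccounted = z["managed"] - (user_or_free + slab + stack_pt)
print(f"user/free {user_or_free} kB, slab {slab} kB, stack+pagetables {stack_pt} kB")
print(f"unaccounted {unaccounted} kB of {z['managed']} kB managed")
```

The unaccounted figure comes out at well over half of the managed memory, which is the gap that vmalloc(), direct alloc_pages() callers and similar would have to explain.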

You also have a lot of swap available:

Free swap  = 1011476kB
Total swap = 1049256kB

These ooms are coming from the high-order allocation in 
sk_page_frag_refill(), which has been changed recently to fall back 
without calling the oom killer.  You'll need commit ed98df3361f0 ("net: 
use __GFP_NORETRY for high order allocations") that Linus merged about 
1.5 weeks ago.

So I'd recommend forgetting about kmemleak here, try a kernel with that 
commit to avoid the oom killing, and then capture /proc/meminfo at regular 
intervals to see if something continuously grows that isn't captured in 
the oom log.
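One way to do the periodic capture is sketched below. The function names, the hourly interval, the sample count and the meminfo.log path are all placeholders I made up; the only real interface used is the "Name: value kB" text layout of /proc/meminfo.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo text into {field: value}; most values are in kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def read_meminfo(path="/proc/meminfo"):
    """Take one sample from the running kernel."""
    with open(path) as f:
        return parse_meminfo(f.read())


def growth(first, last):
    """Fields that grew between two samples, largest increase first."""
    deltas = {k: last[k] - first[k] for k in first if k in last}
    grown = [(k, d) for k, d in deltas.items() if d > 0]
    return sorted(grown, key=lambda kd: kd[1], reverse=True)


def monitor(interval_s=3600, samples=24, log_path="meminfo.log"):
    """Append timestamped samples to log_path, then report what grew overall."""
    first = last = read_meminfo()
    with open(log_path, "a") as log:
        for _ in range(samples):
            last = read_meminfo()
            log.write(f"{time.strftime('%Y-%m-%d %H:%M:%S')} {last}\n")
            log.flush()  # keep the log usable if the box dies mid-run
            time.sleep(interval_s)
    return growth(first, last)
```

Calling monitor() and printing its result after a day or more should show which counters keep climbing. Given a 41-day time to OOM, a long run with a coarse interval is probably more useful than frequent sampling.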


