[RFC] Removing FIXMEs in mtdchar.c: mtd_{read,write} (was Re: Recommendations on System Tuning for JFFS2/MTD 128 KiB Block Page Allocation Failures)

Grant Erickson marathon96 at gmail.com
Fri Apr 1 11:26:41 EDT 2011


On 3/16/11 10:28 AM, Grant Erickson wrote:
> I've an OMAP3 ARM-based embedded system with 256 MiB of NAND flash and 64 MiB
> of RAM on Linux 2.6.32 in which both sys_mount (via mount) and sys_read (via
> fw_setenv) occasionally fail with "page allocation failure. order:5,
> mode:0xd0".
> 
> In the analysis I've done so far, sys_mount funnels down to jffs2_scan_medium
> which eventually calls kmalloc with a size of 128 KiB and flag GFP_KERNEL:
> 
>     sw/tps/linux/linux/fs/jffs2/scan.c:
>     ...
>         /* Respect kmalloc limitations */
>         if (buf_size > 128*1024)
>             buf_size = 128*1024;
> 
>         D1(printk(KERN_DEBUG "Allocating readbuf of %d bytes\n",
>                   buf_size));
>         flashbuf = kmalloc(buf_size, GFP_KERNEL);
>         if (!flashbuf)
>             return -ENOMEM;
>     }
>     ...
> 
> The sys_read case winds down to mtd_read which eventually calls kmalloc with a
> size of 128 KiB (to cover a single NAND erase block) and flag GFP_KERNEL:
> 
>     sw/tps/linux/linux/drivers/mtd/mtdchar:
>     ...
>     if (count > MAX_KMALLOC_SIZE)
>         kbuf=kmalloc(MAX_KMALLOC_SIZE, GFP_KERNEL);
>     else
>         kbuf=kmalloc(count, GFP_KERNEL);
>     ...
> 
> Ostensibly this occurs because of memory fragmentation, where any lower-order
> blocks that are available must be non-contiguous.
> 
> ...
>
> The system is currently configured with the SLAB allocator. Has anyone found
> better fragmentation and low-memory performance with the default SLUB or
> embedded SLOB allocators? How about tweaking:
> 
>     vm.min_free_kbytes
> 
> Anyone met with success there?

For anyone following this thread, FWIW, I was able to statistically reduce,
but not eliminate, the likelihood of this issue occurring with the following
set in sysctl.conf:

    vm.min_free_kbytes = 2048

However, this problem is all about fragmentation, so a solution such as this
will never really guarantee that this problem goes away entirely.

In the meantime, I've been contemplating two more permanent solutions in
mtdchar.c:mtd_{read,write} and, before pressing ahead with either, wanted to
get some feedback on which might have upstream integration support.

1) Simpler but Uses More Memory

This approach keeps the code, more or less, as is; however, rather than
failing outright, it continues to attempt to allocate smaller and smaller
blocks (dividing by two each time) until either it succeeds or hits the
minimum (either count or PAGE_SIZE).
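
A minimal userspace sketch of that fallback loop follows. It is not the
mtdchar.c code: SKETCH_PAGE_SIZE, SKETCH_MAX_KMALLOC, sketch_alloc_up_to and
sketch_fragmented_alloc are hypothetical names, the 4 KiB page size is an
assumption, and malloc stands in for kmalloc(..., GFP_KERNEL) so the logic can
be exercised outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's PAGE_SIZE and MAX_KMALLOC_SIZE. */
#define SKETCH_PAGE_SIZE   ((size_t)4096)
#define SKETCH_MAX_KMALLOC ((size_t)128 * 1024)

/* Allocator hook so the fallback can be exercised without a kernel. */
typedef void *(*sketch_alloc_fn)(size_t size);

/*
 * Try to allocate a bounce buffer for a request of `count` bytes, halving
 * the attempted size after each failure. The smallest size tried is the
 * smaller of `count` and one page. On success, *size_out holds the size
 * actually obtained; on failure it is set to 0 and NULL is returned.
 */
static void *sketch_alloc_up_to(size_t count, size_t *size_out,
                                sketch_alloc_fn alloc)
{
    size_t size = count < SKETCH_MAX_KMALLOC ? count : SKETCH_MAX_KMALLOC;
    size_t min_size = count < SKETCH_PAGE_SIZE ? count : SKETCH_PAGE_SIZE;

    assert(size_out != NULL && alloc != NULL);

    for (;;) {
        void *buf = alloc(size);

        if (buf != NULL) {
            *size_out = size;
            return buf;
        }
        if (size <= min_size)
            break;              /* even the minimum failed: give up */
        size /= 2;
        if (size < min_size)
            size = min_size;
    }
    *size_out = 0;
    return NULL;
}

/* Demo allocator mimicking fragmentation: nothing above 16 KiB succeeds. */
static void *sketch_fragmented_alloc(size_t size)
{
    return size > (size_t)16 * 1024 ? NULL : malloc(size);
}
```

With the demo allocator, a 1 MiB request is attempted at 128, 64 and 32 KiB
before succeeding at 16 KiB; the caller would then move the data in chunks of
at most the size returned in *size_out.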

2) More Complex but Uses Less Memory

This approach seems to be what was intimated in the "FIXME" comments from
2005 found in this code. It maps in and pins the user pages for the read or
write request using get_user_pages. Where possible, adjacent pages are
grouped together into iovec extents, and mtd_{read,write} is then called in
a loop, once for each iovec covering an extent of mapped pages, in a manner
similar to {read,write}v having been called from user space.
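
The grouping step alone can be sketched in userspace C as below. This is only
an illustration under stated assumptions: it does not call get_user_pages, it
treats plain address adjacency as a stand-in for whatever adjacency test the
kernel code would use, and SKETCH_PAGE_SIZE and sketch_group_pages are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical stand-in for the kernel's PAGE_SIZE (4 KiB assumed). */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/*
 * Coalesce an array of page-sized buffers into iovec extents: a page that
 * starts exactly where the previous extent ends is merged into it,
 * otherwise a new extent is started. `iov` must have room for `npages`
 * entries (the worst case is one extent per page). Returns the number of
 * extents written.
 */
static size_t sketch_group_pages(void *const *pages, size_t npages,
                                 struct iovec *iov)
{
    size_t n = 0;

    assert(pages != NULL && iov != NULL);

    for (size_t i = 0; i < npages; i++) {
        char *page = pages[i];

        if (n > 0 &&
            (char *)iov[n - 1].iov_base + iov[n - 1].iov_len == page) {
            iov[n - 1].iov_len += SKETCH_PAGE_SIZE;   /* extend extent */
            continue;
        }
        iov[n].iov_base = page;                       /* start new extent */
        iov[n].iov_len = SKETCH_PAGE_SIZE;
        n++;
    }
    return n;
}
```

Each resulting extent would then be handed to one read or write call, so a
fully contiguous request collapses to a single transfer while a scattered one
degrades to one transfer per page.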

Comments welcomed.

Regards,

Grant
