JFFS2 and big file = very slow jffs2_garbage_collect_pass
Jörn Engel
joern at logfs.org
Fri Jan 18 21:38:39 EST 2008
On Sat, 19 January 2008 00:23:02 +0000, Jamie Lokier wrote:
> Jörn Engel wrote:
> >
> > There are two ways to solve this problem:
> > 1. Reserve some amount of free space for GC performance.
>
> The real difficulty is that it's not clear how much to reserve for
> _reliable_ performance. We're left guessing based on experience, and
> that gives only limited confidence. The 5 blocks suggested in JFFS2
> docs seemed promising, but didn't work out. Perhaps it does work with
> 5 blocks, but you have to count all potential metadata overhead and
> misalignment overhead when working out how much free "file" data that
> translates to?
The five blocks work well enough if your goal is that GC will return
_eventually_. Now you come along and even want it to return within a
reasonable amount of time. That is a different problem. ;)
The math is fairly simple. The worst case is when the write pattern is
completely random and every block contains the same amount of data. Let
us pick a 99% full filesystem for starters.

With every block 99% full, collecting one block copies 99% of a block
of live data and gains only 1% in free space. So in order to write one
block worth of data, GC needs to move 99 blocks worth of old data
around before it has freed a full block. On average 99% of all writes
handle GC data and only 1% handle the data you - the user - care
about. If your filesystem is 80% full, 80% of all writes are GC data
and 20% are user data. Very simple.
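
To make that concrete, here is the same arithmetic as a standalone toy
program (plain C, nothing JFFS2-specific):

#include <stdio.h>

/* Toy model: the medium is uniformly `u' full and the write
 * pattern is completely random.  Collecting one block copies
 * u of a block of live data and gains (1-u) in free space, so
 * freeing a whole block means copying u/(1-u) blocks. */
int main(void)
{
	static const double fill[] = { 0.50, 0.80, 0.90, 0.99 };
	unsigned int i;

	for (i = 0; i < sizeof(fill) / sizeof(fill[0]); i++) {
		double u = fill[i];

		printf("%2.0f%% full: copy %4.1f blocks per block freed, "
		       "%2.0f%% of all writes are GC\n",
		       100.0 * u, u / (1.0 - u), 100.0 * u);
	}
	return 0;
}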
Latency is a different problem. Depending on your design, those 80% or
99% GC writes can happen continuously or in huge batches.
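
If the huge batches are the problem, the usual cure is to amortize: do
a bounded chunk of GC work ahead of every user write. A sketch of the
idea - free_blocks(), gc_one_block() and do_write() are made-up
helpers, not existing JFFS2 or logfs functions:

#include <stddef.h>

struct fs;

/* Hypothetical helpers assumed to exist somewhere: */
extern unsigned int free_blocks(struct fs *fs);
extern int gc_one_block(struct fs *fs);
extern int do_write(struct fs *fs, const void *buf, size_t len);

#define RESERVE_BLOCKS	5

/* Keep at least RESERVE_BLOCKS erase blocks free before every
 * user write.  In steady state each call reclaims roughly as
 * much as the previous write consumed, so the 80% or 99% of GC
 * writes are spread evenly instead of arriving in one batch. */
static int write_with_incremental_gc(struct fs *fs,
				     const void *buf, size_t len)
{
	while (free_blocks(fs) < RESERVE_BLOCKS) {
		int err = gc_one_block(fs);

		if (err)
			return err;
	}
	return do_write(fs, buf, len);
}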
> Really, some of us just want JFFS2 to return -ENOSPC
> at _some_ sensible deterministic point before the GC might behave
> peculiarly, rather than trying to squeeze as much as possible onto the
> partition.
Logfs has a field defined for GC reserve space. I know the problem and
I care about it, although I have to admit that mkfs doesn't allow
setting this field yet.
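
The check itself is trivial; something like this, with field and
helper names invented for illustration (not the actual logfs code):

#include <errno.h>

struct super_info {
	unsigned long free_blocks;	/* free erase blocks right now */
	unsigned long gc_reserve;	/* blocks held back for GC */
};

/* Refuse the write once it would eat into the GC reserve, so
 * userspace sees -ENOSPC at a deterministic point instead of
 * watching GC grind to a crawl. */
static int check_reserve(struct super_info *si, unsigned long blocks)
{
	if (si->free_blocks < si->gc_reserve + blocks)
		return -ENOSPC;
	return 0;
}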
> > 2. Write in some non-random fashion.
> >
> > Solution 2 works even better if the filesystem actually sorts data
> > very roughly by life expectency. That requires writing to several
> > blocks in parallel, i.e. one for long-lived data, one for short-lived
> > data. Made an impressive difference in logfs when I implemented that.
>
> Ah, a bit like generational GC :-)
Actually, no. The different levels of the tree, which JFFS2 doesn't
store on the medium, also happen to have vastly different lifetimes.
Generational GC is the logical next step, which I haven't done yet.
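
In case the mechanism isn't obvious, it amounts to keeping one open
erase block per tree level. A sketch with invented helpers, not the
real logfs write path:

#include <stddef.h>
#include <stdint.h>

#define MAX_LEVELS	4		/* leaves are level 0 */
#define BLOCK_SIZE	(128 * 1024)	/* erase block size, as an example */

/* Index nodes near the root are rewritten far more often than
 * leaf data, so giving each level its own open block keeps
 * short-lived and long-lived data apart; GC then rarely has to
 * copy live leaves just to reclaim dead index nodes. */
struct open_block {
	int valid;		/* block already allocated? */
	uint32_t block;		/* erase block being filled */
	uint32_t used;		/* bytes written into it so far */
};

static struct open_block open_blocks[MAX_LEVELS];

/* Hypothetical low-level helpers: */
extern uint32_t alloc_free_block(void);
extern void flash_write(uint32_t block, uint32_t ofs,
			const void *buf, size_t len);

static void write_node(unsigned int level, const void *node, size_t len)
{
	struct open_block *ob = &open_blocks[level];

	if (!ob->valid || ob->used + len > BLOCK_SIZE) {
		ob->block = alloc_free_block();
		ob->used = 0;
		ob->valid = 1;
	}
	flash_write(ob->block, ob->used, node, len);
	ob->used += len;
}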
Jörn
--
Science is like sex: sometimes something useful comes out,
but that is not the reason we are doing it.
-- Richard Feynman