JFFS2 and big file = very slow jffs2_garbage_collect_pass
Jörn Engel
joern at logfs.org
Fri Jan 18 21:38:39 EST 2008
On Sat, 19 January 2008 00:23:02 +0000, Jamie Lokier wrote:
> Jörn Engel wrote:
> >
> > There are two ways to solve this problem:
> > 1. Reserve some amount of free space for GC performance.
>
> The real difficulty is that it's not clear how much to reserve for
> _reliable_ performance. We're left guessing based on experience, and
> that gives only limited confidence. The 5 blocks suggested in JFFS2
> docs seemed promising, but didn't work out. Perhaps it does work with
> 5 blocks, but you have to count all potential metadata overhead and
> misalignment overhead when working out how much free "file" data that
> translates to?
The five blocks work well enough if your goal is that GC will return
_eventually_. Now you come along and even want it to return within a
reasonable amount of time. That is a different problem. ;)
The math is fairly simple. The worst case is when the write pattern is
completely random and every block contains the same amount of data. Let
us pick a 99% full filesystem for starters.

With every block 99% full, collecting one block copies 99% of a block
of live data and gains only 1% in free space. So in order to write one
block worth of data, GC needs to move 99 blocks worth of old data
around before it has freed a full block. On average 99% of all writes
handle GC data and only 1% handle the data you - the user - care
about. If your filesystem is 80% full, 80% of all writes are GC data
and 20% are user data. Very simple.
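
To make that concrete, here is the same arithmetic as a standalone toy
program (plain C, nothing JFFS2-specific):

#include <stdio.h>

/* Toy model: the medium is uniformly `u' full and the write
 * pattern is completely random.  Collecting one block copies
 * u of a block of live data and gains (1-u) in free space, so
 * freeing a whole block means copying u/(1-u) blocks. */
int main(void)
{
	static const double fill[] = { 0.50, 0.80, 0.90, 0.99 };
	unsigned int i;

	for (i = 0; i < sizeof(fill) / sizeof(fill[0]); i++) {
		double u = fill[i];

		printf("%2.0f%% full: copy %4.1f blocks per block freed, "
		       "%2.0f%% of all writes are GC\n",
		       100.0 * u, u / (1.0 - u), 100.0 * u);
	}
	return 0;
}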
Latency is a different problem. Depending on your design, those 80% or
99% GC writes can happen continuously or in huge batches.
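
If the huge batches are the problem, the usual cure is to amortize: do
a bounded chunk of GC work ahead of every user write. A sketch of the
idea - free_blocks(), gc_one_block() and do_write() are made-up
helpers, not existing JFFS2 or logfs functions:

#include <stddef.h>

struct fs;

/* Hypothetical helpers assumed to exist somewhere: */
extern unsigned int free_blocks(struct fs *fs);
extern int gc_one_block(struct fs *fs);
extern int do_write(struct fs *fs, const void *buf, size_t len);

#define RESERVE_BLOCKS	5

/* Keep at least RESERVE_BLOCKS erase blocks free before every
 * user write.  In steady state each call reclaims roughly as
 * much as the previous write consumed, so the 80% or 99% of GC
 * writes are spread evenly instead of arriving in one batch. */
static int write_with_incremental_gc(struct fs *fs,
				     const void *buf, size_t len)
{
	while (free_blocks(fs) < RESERVE_BLOCKS) {
		int err = gc_one_block(fs);

		if (err)
			return err;
	}
	return do_write(fs, buf, len);
}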
> Really, some of us just want JFFS2 to return -ENOSPC
> at _some_ sensible deterministic point before the GC might behave
> peculiarly, rather than trying to squeeze as much as possible onto the
> partition.
Logfs has a field defined for GC reserve space. I know the problem and
I care about it, although I have to admit that mkfs doesn't allow
setting this field yet.
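
The check itself is trivial; something like this, with field and
helper names invented for illustration (not the actual logfs code):

#include <errno.h>

struct super_info {
	unsigned long free_blocks;	/* free erase blocks right now */
	unsigned long gc_reserve;	/* blocks held back for GC */
};

/* Refuse the write once it would eat into the GC reserve, so
 * userspace sees -ENOSPC at a deterministic point instead of
 * watching GC grind to a crawl. */
static int check_reserve(struct super_info *si, unsigned long blocks)
{
	if (si->free_blocks < si->gc_reserve + blocks)
		return -ENOSPC;
	return 0;
}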
> > 2. Write in some non-random fashion.
> >
> > Solution 2 works even better if the filesystem actually sorts data
> > very roughly by life expectency. That requires writing to several
> > blocks in parallel, i.e. one for long-lived data, one for short-lived
> > data. Made an impressive difference in logfs when I implemented that.
>
> Ah, a bit like generational GC :-)
Actually, no. The different levels of the tree, which JFFS2 doesn't
store on the medium, also happen to have vastly different lifetimes.
Generational GC is the logical next step, which I haven't done yet.
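
In case the mechanism isn't obvious, it amounts to keeping one open
erase block per tree level. A sketch with invented helpers, not the
real logfs write path:

#include <stddef.h>
#include <stdint.h>

#define MAX_LEVELS	4		/* leaves are level 0 */
#define BLOCK_SIZE	(128 * 1024)	/* erase block size, as an example */

/* Index nodes near the root are rewritten far more often than
 * leaf data, so giving each level its own open block keeps
 * short-lived and long-lived data apart; GC then rarely has to
 * copy live leaves just to reclaim dead index nodes. */
struct open_block {
	int valid;		/* block already allocated? */
	uint32_t block;		/* erase block being filled */
	uint32_t used;		/* bytes written into it so far */
};

static struct open_block open_blocks[MAX_LEVELS];

/* Hypothetical low-level helpers: */
extern uint32_t alloc_free_block(void);
extern void flash_write(uint32_t block, uint32_t ofs,
			const void *buf, size_t len);

static void write_node(unsigned int level, const void *node, size_t len)
{
	struct open_block *ob = &open_blocks[level];

	if (!ob->valid || ob->used + len > BLOCK_SIZE) {
		ob->block = alloc_free_block();
		ob->used = 0;
		ob->valid = 1;
	}
	flash_write(ob->block, ob->used, node, len);
	ob->used += len;
}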
Jörn
--
Science is like sex: sometimes something useful comes out,
but that is not the reason we are doing it.
-- Richard Feynman