Q: Filesystem choice.
Eric W. Biederman
ebiederman at lnxi.com
Mon Jan 26 04:23:23 EST 2004
David Woodhouse <dwmw2 at infradead.org> writes:
> On Mon, 2004-01-26 at 00:09 -0700, Eric W. Biederman wrote:
> > Has anyone gotten as far as a proof? Or are there some informal
> > things that almost make up a proof, so I could get a feel? Reserving
> > more than a single erase block is going to be hard to swallow for such
> > a small filesystem.
>
> You need to have enough space to let garbage collection make progress.
> Which means it has to be able to GC a whole erase block into space
> elsewhere, then erase it. That's basically one block you require.
>
> Except you have to account for write errors or power cycles during a GC
> write, wasting some of your free space. You have to account for the
> possibility that what started off as a single 4KiB node in the original
> block now hits the end of the new erase block and is split between that
> and the start of another, so effectively it grew because it has an extra
> node header now. And of course when you do that you get worse
> compression ratios too, since 2KiB blocks compress less effectively than
> 4KiB blocks do.
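To make sure I follow the overhead you're describing, here is my own
back-of-envelope; the ~70-byte header figure is just my guess, not a
number taken from the code:

    before GC:  1 node,  4096 data bytes, 1 header  ~= 4096 +  70 bytes
    after GC:   2 nodes, 4096 data bytes, 2 headers ~= 4096 + 140 bytes
                (plus whatever is lost because two ~2KiB pieces compress
                 worse than one 4KiB piece did)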
Compression is an interesting question. Do you encode the uncompressed
size of a block in bytes? If so, I don't think it would be too difficult
to get your uncompressed block size above the page size. With the page
cache there is no real reason to require a block size <= page size; you
just need what amounts to scatter/gather support.
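For concreteness, this is roughly how I picture the relevant size fields
on a data node -- a simplified sketch of my own, not the real on-disk
layout:

    #include <stdint.h>

    /*
     * Simplified sketch, not the real jffs2 node format.  The point is
     * only that a node records both its size on flash and its
     * uncompressed size, so nothing fundamental ties the uncompressed
     * size to PAGE_SIZE if the read path can scatter the decompressed
     * data across several page-cache pages.
     */
    struct data_node_sketch {
            uint32_t offset;   /* file offset this node covers       */
            uint32_t csize;    /* bytes as stored on flash           */
            uint32_t dsize;    /* bytes after decompression          */
            uint8_t  compr;    /* compression type (none, zlib, ...) */
            /* version, CRCs and the payload would follow */
    };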
My real question here is how difficult it is to disable compression.
Or can compression be deliberately disabled on a per-file basis?
Of the two primary files I am thinking of using, neither one would
need compression. A file of my BIOS settings would be dense
and quite small (128 bytes on a big day). A kernel is already
compressed and carries its own decompressor, and whole-file compression
is more effective than compressing small blocks.
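On the write side I have something like this in mind; the per-inode
"nocompress" flag is purely hypothetical, I don't know whether jffs2
offers anything of the sort:

    #include <stddef.h>

    /*
     * Hypothetical sketch: skip the compressor for files marked
     * "don't compress", or for nodes too small for compression to pay
     * off, and tag the node as stored uncompressed.
     */
    enum compr_type { COMPR_NONE, COMPR_ZLIB };

    static enum compr_type pick_compression(int inode_nocompress, size_t len)
    {
            if (inode_nocompress || len < 64)  /* 64 is an arbitrary cut-off */
                    return COMPR_NONE;
            return COMPR_ZLIB;
    }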
> When you get down to the kind of sizes you're talking about, I suspect
> we need to be thinking in bytes rather than blocks -- because there
> isn't just one threshold; there's many, of which three are particularly
> relevant:
That makes sense. This at least looks like a viable alternative for
the 1MB case.
[snip actual formulas]
> You want resv_blocks_write to be larger than resv_blocks_deletion, and I
> suspect you could get away with values of 2 and 1.5 respectively, if we
> were counting bytes rather than whole eraseblocks.
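Just to sanity-check those numbers against the sizes I care about (my
own arithmetic, counting bytes as you suggest):

    write reserve    = 2   * erase block size
    deletion reserve = 1.5 * erase block size

    e.g. with the 8KiB erase blocks I ask about below:
    write reserve    = 2   * 8192 = 16384 bytes
    deletion reserve = 1.5 * 8192 = 12288 bytes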
I have a truly perverse case I would like to ask your opinion about:
a filesystem composed of two 8KiB erase blocks. That is one of the
weird special cases that flash chips often support. I could only
store my parameter file in there, but it would be interesting.
If I counted bytes very carefully and never got above half of a
block full, I suspect it would work, and be useful. I'd just
have to make certain the degenerate case matched the original jffs.
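Roughly, by my own back-of-envelope (ignoring everything except the raw
block sizes, which is optimistic):

    total flash                         = 2 * 8192 = 16384 bytes
    one whole block kept free for GC    =            8192 bytes
    live data held below half a block   <=           4096 bytes

which leaves the remaining ~4KiB of the active block as slack for
obsolete nodes between GC passes.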
And a last question: jffs2 rounds all erase blocks up to a common size,
doesn't it?
Eric