JFFS3 memory consumption

Artem B. Bityuckiy dedekind at infradead.org
Sun Jan 30 13:58:34 EST 2005


Another interpretation of my idea.

We may think of the summary nodes as the file system's superblock.
Indeed,
1. The summary nodes taken together (let us call this the "summary super
block") give us the full file system map;
2. The summary nodes have a "fixed position", or at least we can find
their locations quickly, without scanning the whole file system.

Unlike "classic" super blocks, the "summary super block" is distributed
entity. It is scattered over different JFFS3 blocks. But this property
will make JFFS3 unclean reboot-tolerant, since only small part of our
"summary super block" may be lost due to unclean reboot. And we always
may scan the summary-less block and build the block's nodes map.

Since we consider all summary nodes as one single, united entity, they
should be tightly coupled. But they are not: summaries are independent
of each other. This is the problem.

For example, imagine we want to build the list ("track") of an inode's
nodes. To do that, we must scan all the summary nodes, which is an
unacceptably long operation. For this I see two possible solutions:

1. For each inode we keep in-core the list of block numbers where this
inode's nodes are present. Thus, to build the inode's node track, we only
need to read those blocks' summary nodes (see the sketch after this list).

2. Use ICPs.
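
A rough sketch of option 1 in C, just to make the memory argument
concrete. All names here (struct inode_blocks, read_summary(),
build_inode_track()) are made up for illustration and are not real
JFFS2/JFFS3 code; the point is only that the in-core cost becomes a few
words per (inode, block) pair instead of one node_ref per node:

#include <stdint.h>
#include <stdio.h>

/* Illustrative only: per-inode list of erase blocks holding its nodes. */
struct inode_blocks {
        uint32_t ino;            /* inode number                        */
        uint16_t nr_blocks;      /* how many blocks carry nodes of ino  */
        uint16_t blocks[8];      /* block numbers (fixed size for demo) */
};

/* Stub: in real code this would read the block's summary from flash. */
static void read_summary(uint16_t block)
{
        printf("reading summary of block %u\n", block);
}

/* Build the inode's node "track" by reading only the listed summaries,
 * keeping the entries whose inode number matches ib->ino. */
static void build_inode_track(const struct inode_blocks *ib)
{
        for (uint16_t i = 0; i < ib->nr_blocks; i++)
                read_summary(ib->blocks[i]);
}

int main(void)
{
        struct inode_blocks ib = { .ino = 42, .nr_blocks = 2,
                                   .blocks = { 3, 17 } };

        build_inode_track(&ib);
        return 0;
}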

Comments?

On Fri, 2005-01-28 at 18:57 +0000, Artem B. Bityuckiy wrote:
> Hi,
> 
> I have recently described the JFFS2 memory consumption problem. David has
> suggested several ideas.
> 
> The best idea is the write-back cache, I think. But this touches memory
> consumption only indirectly; it is really about I/O speed improvement.
> So I think we may discuss how to implement this WB cache in another
> thread. I think it would be a *very* good JFFS3 feature.
> 
> David's other idea, to increase the maximum inode node size, also looks
> pretty good to me. The idea to implement zisofs-like reads is very good,
> I think. I've noted both of these and hope they will make it into the
> JFFS3 design.
> 
> But I have another, more radical idea. It decreases memory consumption
> more in most cases. While we currently have a linear dependency on the
> number of nodes, my idea changes this to a dependency on the number of
> inodes.
> 
> Most often the number of nodes is much larger than the number of inodes,
> but there are exceptions, and in those cases my proposal will not help
> much.
> 
> Before describing the idea, some definitions:
> 
> Summary node
> ~~~~~~~~~~~~
> A node which describes all the other nodes in one block. Summaries are
> placed at the end of each block. For each other node in the block, the
> summary contains the following information:
> 
> node offset within the block
> node length
> node type
> for direntry nodes:
>         direntry name
>         direntry target/parent inode
> for inode nodes:
>         inode number
>         data offset
> ...... maybe something else.
> 
> Thus, if we want to know the block's layout, we may just read its
> summary node.
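>
> A rough sketch, in C, of what one summary entry could look like. The
> struct and field names here are mine for illustration only; they are
> not the actual JFFS2 summary format:
>
> #include <stdint.h>
>
> /*
>  * Illustrative per-node entry of a block summary (not a real on-flash
>  * layout).  One such entry exists for every node in the block.
>  */
> struct sum_entry {
>         uint32_t offset;          /* node offset within the block    */
>         uint32_t totlen;          /* node length                     */
>         uint16_t nodetype;        /* inode node, direntry node, ...  */
>         union {
>                 struct {                    /* for direntry nodes    */
>                         uint32_t pino;      /* parent inode          */
>                         uint32_t ino;       /* target inode          */
>                         char     name[64];  /* direntry name         */
>                 } dirent;
>                 struct {                    /* for inode (data) nodes */
>                         uint32_t ino;       /* inode number          */
>                         uint32_t doffset;   /* data offset           */
>                 } inode;
>         } u;
> };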
> 
> Inode checkpoint node (ICP)
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
> A checkpoint node describes all the nodes related to one inode.
> Checkpoints may be placed anywhere on the flash, like any other node.
> For each node related to the inode, the ICP contains:
> 
> 
> node offset within the flash
> node length
> node type
> for direntry nodes:
>         direntry name
>         direntry target/parent inode
> for inode nodes:
>         inode number
>         data offset
> ...... maybe something else.
> 
> Almost the same as the summary, but with per-inode information.
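>
> A minimal sketch of an ICP entry, again with made-up names. The main
> difference from the summary entry above is that the offset is absolute
> within the flash, since the inode's nodes are scattered over many blocks:
>
> #include <stdint.h>
>
> /* Illustrative ICP entry (not a real on-flash format). */
> struct icp_entry {
>         uint32_t flash_offset;  /* node offset within the whole flash */
>         uint32_t totlen;        /* node length                        */
>         uint16_t nodetype;      /* inode node, direntry node, ...     */
>         /* ... per-type fields as in struct sum_entry above ...       */
> };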
> 
> 
> The Idea
> ~~~~~~~~
> Imagine we have a file system where each inode has an ICP and each block
> has a Summary. Then we may keep in-core only the position of each inode's
> ICP.
> 
> In this case:
> 1. when we need the per-inode node list, we read the ICP and we have it!
> 2. when we need the per-block node list, we read the Summary and we have it!
> 
> No need to keep the per-block and per-inode node_ref lists in-core
> anymore!
> 
> The per-inode list is only needed on the iget() call. After iget() we
> will have the per-inode list in memory (say, in the form of a fragtree
> or a dirent list).
> 
> The per-block list is only needed for the Garbage Collection.
> 
> When a file is opened and changed, we build the ICP of this file in RAM
> and flush it on the iput() call. Summaries are also formed in RAM and
> flushed when the block is almost full.
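>
> A rough sketch, in C, of what the in-core state could shrink to under
> this scheme, and of the iget()/iput() flow. All names (struct icp_ref,
> icp_iget(), icp_iput()) are mine for illustration only, not real
> JFFS2/JFFS3 code:
>
> #include <stdint.h>
>
> /*
>  * Illustrative only: with ICPs, the only thing we must keep in RAM per
>  * inode (while the file is not opened) is where its latest ICP lives.
>  */
> struct icp_ref {
>         uint32_t ino;           /* inode number                   */
>         uint32_t icp_offset;    /* flash offset of the latest ICP */
> };
>
> /* On iget(): read the ICP from ref->icp_offset and build the fragtree
>  * or dirent list from its entries (flash I/O omitted here). */
> static void icp_iget(const struct icp_ref *ref)
> {
>         (void)ref;
> }
>
> /* On iput() of a changed inode: flush the ICP built in RAM to flash
>  * and remember its new location in the in-core table. */
> static void icp_iput(struct icp_ref *ref, uint32_t new_icp_offset)
> {
>         ref->icp_offset = new_icp_offset;
> }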
> 
> So far so good. But there are a number of problems we must solve.
> Here are some of them:
> 
> 1. Unclean reboots. In this case we will have no up-to-date ICPs and
> Summaries. I hope we will still be able to fall back to the old JFFS2
> algorithms. I suspect this case also covers situations where we have no
> flash space left to flush the ICPs, or we are out of memory while forming
> them: we just do not write them, which is similar to an unclean reboot.
> 
> 2. GC tends to merge non-pristine nodes, so it changes ICPs, and the GC
> process will be slower.
> 
> 3. Since we form the ICPs of opened files in memory, we will probably
> need to reserve some flash space so we can flush these ICPs on unmount.
> This requires very accurate flash space accounting.
> 
> I believe these are not all of the problems, but I think they are the
> biggest ones. At first glance all of them are solvable.
> 
> Thanks for your comments.
> 
> --
> Best Regards,
> Artem B. Bityuckiy,
> St.-Petersburg, Russia.
> 
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/
