mtd/Documentation/jffs3 JFFS3design.tex,1.17,1.18

Tue Sep 27 13:03:06 EDT 2005

Update of /home/cvs/mtd/Documentation/jffs3
In directory phoenix.infradead.org:/tmp/cvs-serv12025

Modified Files:
	JFFS3design.tex 
Log Message:
Minor fixes, refinements


Index: JFFS3design.tex
===================================================================
RCS file: /home/cvs/mtd/Documentation/jffs3/JFFS3design.tex,v
retrieving revision 1.17
retrieving revision 1.18
diff -u -r1.17 -r1.18

--- JFFS3design.tex	18 Sep 2005 14:30:08 -0000	1.17
+++ JFFS3design.tex	27 Sep 2005 17:03:03 -0000	1.18
@@ -55,9 +55,9 @@
 \large{Artem B. Bityuckiy\\
 dedekind at infradead.org}\\
 \vspace{13cm}
-\large{Version 0.23 (draft)}\\
+\large{Version 0.24 (draft)}\\
 \vspace{0.5cm}
-Aug 15, 2005
+Sept 27, 2005
 \end{center}
 \end{titlepage}
 %\maketitle
@@ -69,8 +69,14 @@
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \pagestyle{empty}
 \begin{abstract}
-This document discusses various JFFS3 high-level design aspects.
-Additionally, it defines standard JFFS3 dictionary and terms.
+JFFS2, the Journalling Flash File System version 2, is widely used in the
+embedded systems world. JFFS2 was designed for small flash chips and has
+serious problems when running on large flash devices. Unfortunately, these
+scalability problems are rooted deep in the design of the file system and
+cannot be solved without a full redesign.
+
+This document describes JFFS3, a new flash file system which is designed to be
+scalable.
 \end{abstract}
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -86,54 +92,62 @@
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 %
-% MOTIVATION AND GOALS
+% JFFS2 OVERVIEW
 %
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\section{Motivation and goals}
+\section{JFFS2 overview}
 JFFS2, the Journalling Flash File System version 2 [\ref{ref_JFFSdwmw2}]
 is widely used in the embedded systems world. JFFS2 was originally designed for
-small NOR flashes ($<$ 32MB), and NAND support was added later. The first NAND
-flashes were also small enough, but grew in size very quickly and are currently
-much larger then 32MB JFFS2 was originally designed for (e.g., Samsung produces 2GB NAND
-flashes [\ref{ref_SamsungNANDlist}]). Unfortunately, owing to its design,
-JFFS2 has serious problems on large flash chips:
-
-\begin{itemize}
-\item the \emph{mount time} becomes too high;
-\item the \emph{memory consumption} becomes enormous;
-\item the \emph{access time} to large files (the \texttt{open()} call) becomes too
-high.
-\end{itemize}
-
-Due to the design of JFFS2, the three above parameters (mount time, memory
-consumption and the file access time) depend linearly on the size of the flash
-partition and files.
+small NOR flashes ($<$ 32MB), and the first device to use the JFFS2 file system
+was a small bar-code scanner. Later, when NAND flashes became widely used,
+NAND support was added to JFFS2. The first NAND flashes were also small,
+but they grew in size very quickly and are currently much larger than 32MB
+(e.g., Samsung produces 2GB NAND flashes [\ref{ref_SamsungNANDlist}]).
+Owing to its design, JFFS2 cannot be used on large flash chips as it has
+serious scalability problems.
+
+JFFS2 has a \emph{log-structured} design, which basically means that the whole
+file system is one large log. All modifications (e.g., file changes, directory
+creation, and the like) are appended to the log. There is no superblock, and
+the log is the only data structure on the flash media. Modifications are
+encapsulated in small data structures called \emph{nodes}. Thus, the log
+consists of nodes, and each node contains a file system modification.
+
+There is no indexing information stored on the flash media. Each node contains
+full information about itself, but there is no central index. \emph{The index}
+is a crucial part of any file system as it is used to quickly locate
+different information (e.g., to find a file kept in a directory, or the physical
+address where the file's data is stored). In JFFS2, the index is
+maintained in RAM and takes a significant amount of it. Roughly speaking, there
+is an in-RAM data structure for each on-flash node.
+
+To build the index in RAM, JFFS2 must scan the whole file system and read
+information about all nodes on each mount. This is why JFFS2 mounts a file
+system so slowly.
+
+Thus, the log-structured design of JFFS2 leads to the following two fundamental
+problems (the size of the JFFS2 partition is denoted as $N$):
+\begin{itemize}
+\item mount time scales linearly ($O(N)$);
+\item memory consumption scales linearly ($O(N)$).
+\end{itemize}
+
 To put it differently, JFFS2 \emph{does not scale}. But in spite of the scalability
 problems, JFFS2 has many advantages:
 
 \begin{itemize}
 \item very economical flash usage -- data usually take as much flash
-space as they actually need, without wasting much space as in case of
+space as is actually needed, without wasting much space as in the case of
 traditional file systems for block devices;
-\item admitting of very efficient utilization of "on-flight" compression which
-allows to fit a big deal of data into the flash;
-\item very quick read and write operations;
-\item natural unclean reboot robustness;
+\item support for ``on-the-fly'' compression, which
+allows a great deal of data to fit into the flash; note that very few
+file systems support compression;
+\item very quick read and write operations (there is no need to update any
+on-flash index as it simply does not exist);
+\item robustness against unclean reboots;
 \item good wear-leveling.
 \end{itemize}
 
-The \textbf{goal of the JFFS3 project} is to develop a scalable flash file system
-which may be used on large scale flashes. JFFS3 must meet the following
-requirements:
-
-\begin{enumerate}
-\item provide fast mounting;
-\item consume few memory, although it isn't forbidden to consume much RAM providing
-the RAM is used \emph{for caching} and may be freed on any demand;
-\item be tolerant to unclean reboots;
-\item all the JFFS3 characteristics must scale well up to 1TB flash chips.
-\end{enumerate}
-
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 %
 % INTRODUCTION
@@ -511,7 +525,7 @@
 
 \begin{enumerate}
 \item \emph{erase count} -- the number of erase cycles of the eraseblock
-(needed fo wear-levelling);
+(needed for wear-levelling);
 \item good or bad;
 \item the eraseblock type (journal eraseblock, etc);
 \end{enumerate}
@@ -520,9 +534,9 @@
 the garbage collector and the space allocator) but is not seen outside. One way
 to organize the map is just to reserve an inode number for it and treat this
 file as any other file in the tree. But for optimization purposes, as the map
-is being changed nearly everytime, it is better to organize the map as a
+is being changed nearly every time, it is better to organize the map as a
 distinct $B^+$-tree with its own root, key format, etc. The root node of the
-map (the \emph{map root}) must be refered by the superblock, just like the root
+map (the \emph{map root}) must be referred to by the superblock, just like the root
 of the tree.
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -584,11 +598,11 @@
 (\emph{CEB}). Chain eraseblocks may either refer other chain eraseblocks or
 contain the superblock. The last chain eraseblock (which contains the
 superblock) is called the \emph{chain eraseblock level 0} (\emph{CEB0}).
-The CEB which precedes the CEB0 (hence, referes CEB0) is CEB1 and so on.
+The CEB which precedes the CEB0 (hence, refers to CEB0) is CEB1 and so on.
 
 The JFFS3 superblock management mechanism works as follows. Suppose there are
 $n$ levels of chain eraseblocks. Superblock updates
-are written to consequtive sectors of the CEB0. When CEB0 has no empty
+are written to consecutive sectors of the CEB0. When CEB0 has no empty
 sectors, new CEB0 is picked, the SB update is written to the new CEB0, and the
 CEB0 reference is written to CEB1. Similarly, when there is no space in CEB1,
 new CEB1 is picked and the corresponding reference is written to CEB2, and so
@@ -767,7 +781,7 @@
 \end{equation}
 
 Table \ref{ref_TableNANDLevels} describes the number of required chain
-eraseblockss for NAND flashes of different size.
+eraseblocks for NAND flashes of different sizes.
 
 \begin{table}[h]
 \begin{center}
@@ -840,7 +854,7 @@
 \item \textbf{ACL} -- a modern mechanism to control accesses to files, see
 [\ref{ref_ACL}] for more details.
 
-\item \textbf{Anchor erasblock, anchor area}
+\item \textbf{Anchor eraseblock, anchor area}
 -- two \emph{good} erasable blocks at the beginning of the JFFS3 partition
 which are reserved for the superblock management.
 
@@ -848,8 +862,8 @@
 is called branch node.
 
 \item \textbf{Chain eraseblock} -- an eraseblock containing references to
-other chain eraseblock of lower lavel or the superblock. 
-Chain eraseblocks facilitate quich SB searching and are part of the JFFS3
+other chain eraseblocks of a lower level or to the superblock.
+Chain eraseblocks facilitate quick SB searching and are part of the JFFS3
 superblock management scheme. 
 
 \item \textbf{Directory entry} -- basically associates a