mtd/Documentation/jffs3 JFFS3design.tex,1.23,1.24
Artem Bityutskiy
dedekind at infradead.org
Thu Nov 3 09:25:43 EST 2005
Update of /home/cvs/mtd/Documentation/jffs3
In directory phoenix.infradead.org:/tmp/cvs-serv28774
Modified Files:
JFFS3design.tex
Log Message:
Minor updates and fixes.
Switch version to 0.27
Index: JFFS3design.tex
===================================================================
RCS file: /home/cvs/mtd/Documentation/jffs3/JFFS3design.tex,v
retrieving revision 1.23
retrieving revision 1.24
diff -u -r1.23 -r1.24
--- JFFS3design.tex 2 Nov 2005 17:24:54 -0000 1.23
+++ JFFS3design.tex 3 Nov 2005 14:25:39 -0000 1.24
@@ -57,9 +57,9 @@
\large{Artem B. Bityuckiy\\
dedekind at infradead.org}\\
\vspace{13cm}
-\large{Version 0.26 (draft)}\\
+\large{Version 0.27 (draft)}\\
\vspace{0.5cm}
-October 30, 2005
+November 3, 2005
\end{center}
\end{titlepage}
%\maketitle
@@ -101,11 +101,11 @@
JFFS2, the Journalling Flash File System version 2 [\ref{ref_JFFSdwmw2}] is
widely used in the embedded systems world. JFFS2 was originally designed for
-small NOR flashes ($<$ 32MB) and the first device with JFFS2 file system was a
-small \mbox{bar-code} scanner. Later, when NAND flashes became widely used,
-NAND support was added to JFFS2. The first NAND flashes were also small enough,
-but grew in size very quickly and are currently much larger then 32MB (e.g.,
-Samsung produces 2GB NAND flashes [\ref{ref_SamsungNANDlist}]).
+small NOR flashes (less than about 32MB) and the first device with JFFS2 file
+system was a small \mbox{bar-code} scanner. Later, when NAND flashes became
+widely used, NAND support was added to JFFS2. The first NAND flashes were also
+small enough, but grew in size very quickly and are currently much larger than
+32MB (e.g., Samsung produces 2GB NAND flashes [\ref{ref_SamsungNANDlist}]).
JFFS2 has a \emph{\mbox{log-structured}} design, which basically means that the
whole file system may be regarded as one large log. Any file system
@@ -120,9 +120,9 @@
information about the design of JFFS2 and about the \mbox{log-structured}
design, look at [\ref{ref_JFFSdwmw2}], [\ref{ref_LFS}], and [\ref{ref_LaFS}].
-It is not the goal of this chapter to delve into details of JFFS2 but still
-but it is still wanted to provide enough information to make it clear why JFFS2
-has scalability problems and why JFFS3 is needed. To keep this chapter simple,
+It is not the goal of this chapter to delve into details of JFFS2, but it
+should still provide enough information to make it clear why JFFS2 has
+scalability problems and why JFFS3 is needed. To keep this chapter simple,
terms \emph{index} or \emph{indexing information} are used.
\emph{The index} is a crucial part of any file system as it is used to keep
@@ -141,23 +141,24 @@
In JFFS2, \emph{the index is maintained in RAM}, not on the flash media.
And this is the root of all the JFFS2 scalability problems.
-Of course, having the index in RAM JFFS2 achieves extremely high file system
+Of course, as the index is kept in RAM, JFFS2 achieves extremely high
throughput, just because it does not need to update the index on flash after
something has been changed in the file system. And this works very well for
relatively small flashes, for which JFFS2 was originally designed. But as soon
as one tries to use JFFS2 on large flashes (starting from about 128MB), many
problems come up.
-At first, it is obvious, that JFFS2 needs to build the index in RAM when it
-mounts the file system. For this reason, it needs to scan the whole partition
-in order to locate all the nodes which are present there. So, the larger is
-JFFS2 partition, the more nodes it has, the longer it takes to mount it.
+First, it is obvious that JFFS2 needs to build the index in RAM when it
+mounts the file system. For this reason, it needs to scan the entire flash
+partition in order to locate all the nodes which are present there. So, the
+larger the JFFS2 partition is, the more nodes it has and the longer it takes
+to mount it.
Second, it is evident that the index consumes some RAM. And the larger
the JFFS2 file system is, the more nodes it has and the more memory is consumed.
To put it differently, if $S$ is the size of the JFFS3 flash partition
-\footnote{Note, all the symbols used in this document are referred in the
+\footnote{Note, all the symbols used in this document are summarized in
section \ref{ref_SectionSymbols}},
\begin{itemize}
@@ -165,31 +166,33 @@
\item JFFS2 mount time scales as $O(S)$ (linearly);
\item JFFS2 memory consumption scales as $O(S)$ (linearly).
+
\end{itemize}
So, it may be stated that JFFS2 \emph{does not scale}. But in spite of the
-scalability problems, JFFS2 has many advantages:
+scalability problems, JFFS2 has many advantages, for example:
\begin{itemize}
+
\item very economical flash usage~-- data usually takes as much flash
space as it actually needs, without wasting a lot of space as in case of
traditional file systems for block devices;
-\item admitting of "\mbox{on-flight}" compression which allows to fit a big
+\item admitting of "\mbox{on-flight}" compression which allows to fit a great
deal of data to the flash; note, there are few file systems which support
-compression;
+compression;
-\item very file system write throughput (no need to update any \mbox{on-flash}
-indexing information as it simply does not exist there);
+\item very good file system write throughput (no need to update any
+\mbox{on-flash} indexing information as it simply does not exist there);
-\item unclean reboot robustness;
+\item robustness against unclean reboots;
\item good enough \mbox{wear-leveling}.
\end{itemize}
-It is also worth noting here, that there is a patch which is usually referred
-to as the "\emph{summary patch}", that was implemented by Ferenc Havasi and was
+It is also worth noting here that there is a patch which is usually referred to
+as the "\emph{summary patch}", that was implemented by Ferenc Havasi and was
recently committed to the JFFS2 CVS. This patch speeds up the JFFS2 mount
greatly, especially in case of NAND flashes. What the patch basically does is
that it puts a small "\emph{summary}" node at the end of each flash erasable
@@ -197,9 +200,9 @@
nodes in this eraseblock. So, when JFFS2 mounts the file system, it needs to
glance to the end of each eraseblock and read the summary node. As a result,
JFFS2 only needs to read one or a few NAND pages from the end of each
-eraseblock. Instead, when there is no summary, JFFS2 reads almost all the NAND
-pages of the eraseblock, because the node headers are spread more or less
-evenly over the eraseblock.
+eraseblock. Instead, when there is no summary, JFFS2 reads almost every NAND
+page of the eraseblock, because node headers are spread more or less evenly
+over the eraseblock.
Although the patch helps a lot, it is still not a scalable solution and it only
relaxes the coefficient of the JFFS2 mount time linear dependency. Let alone
@@ -212,12 +215,12 @@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{JFFS3 Requirements} \label{ref_SectionJFFS3Req}
-The following are the main user level requirements JFFS3 have to meet.
+The following are the main \mbox{user-level} requirements JFFS3 has to meet.
\begin{enumerate}
\item[\textbf{R01}]
-JFFS3 memory consumption must not depend on the size of the JFFS3
+JFFS3 memory consumption must not depend on the size of JFFS3
partition, the number of inodes in the file system, size of files,
directories, and the like. Of course, JFFS3 must be able to take advantage
of the available RAM, but only for different kinds of \emph{caches}
@@ -242,23 +245,22 @@
performance.
\item[\textbf{R07}]
-JFFS3 must gracefully deal with different kinds of data corruptions, flash bits
-flipping, bad blocks which may appear dynamically, etc.
+JFFS3 must gracefully deal with different kinds of data corruptions (flash
+\mbox{bit-flips}, bad blocks which may appear dynamically, etc).
\item[\textbf{R08}]
-In case of serious corruptions it should be possible to reconstruct the any
+In case of serious corruptions it should be possible to reconstruct all the
data which were not damaged by means of external tools like \texttt{ckfs.jffs3}.
-\item[\textbf{R09}]
-All the JFFS3 characteristics ought to vary not faster then $log(S)$,
-where $S$ is the size of the JFFS3 partition. \mbox{JFFS2-like} linear
-dependencies are not acceptable.
+\item[\textbf{R09}] All the JFFS3 characteristics ought to scale no faster than
+a logarithmic function. \mbox{JFFS2-like} linear dependencies are not
+acceptable.
\item[\textbf{R10}]
JFFS3 must support extended attributes.
\item[\textbf{R11}]
-JFFS3 must support Access Control Lists feature (ACLs).
+JFFS3 must support the Access Control Lists feature (ACL).
\item[\textbf{R12}]
JFFS3 has to support \mbox{on-flight} compression.
@@ -291,8 +293,8 @@
Obviously, it is unacceptable to erase the whole eraseblock each time a sector
is updated. Instead, \mbox{so-called} "\emph{\mbox{out-of-place} updates}"
-technique is usually used. This simply means, that no attempts to update the
-sector \mbox{in-place} is made, but instead, the update is written to some
+technique is usually used. This simply means that no attempts to update
+sectors \mbox{in-place} are made but instead, updates are written to some
other sector and the contents of the previous sector are afterwards regarded as
garbage.
@@ -304,8 +306,8 @@
It is interesting to notice that in \mbox{log-structured} file systems for
block devices (like the one described in [\ref{ref_LFS}]) not every update is
"\mbox{out-of-place}". There are always some \mbox{fixed-position} sectors
-present. These sectors usually refer the file system index admitting of quick
-file system mount and they are updated \mbox{in-place} in these file systems.
+present. These sectors usually refer to the file system index, allow a quick
+file system mount, and are updated \mbox{in-place}.
But flash devices have a limited number of erase cycles for each eraseblock and
it is impossible to guarantee good \mbox{wear-levelling} if some eraseblocks are
@@ -339,9 +341,10 @@
Suppose $D$ should be updated. Since it is updated \mbox{out-of-place}, the
newer version $D_1$ is written to some other place. But there are $B$ and $C$
-which still refer $D$ and they ought to be updated as well. And when they are
-updated \mbox{out-of-place}, $A$ still refers the old $B$ and $C$, and so on.
-Thus, it is not that trivial to store indexing information on flash.
+which still refer to $D$, not $D_1$, and they ought to be updated as well. And
+when they are updated \mbox{out-of-place}, $A$ will still refer to the old $B$ and
+$C$, and so on. Thus, it is not that trivial to store and maintain indexing
+information on the flash media.
%
% WANDERING TREES
@@ -368,6 +371,7 @@
\end{figure}
\begin{enumerate}
+
\item Suppose that the index is a tree and it is stored and maintained
on the flash media. The tree consists of nodes
$A$, $B$, $C$, $D$, $E$, $F$, $G$, and $H$.
@@ -376,14 +380,15 @@
\item At first, the updated version $H_1$ is written. Obviously, $F$ still
refers to $H$.
-\item Now the corresponding links in node $F$ are changed and node $F_1$ is
+\item Now the corresponding link in node $F$ is changed and node $F_1$ is
written to flash. $F_1$ refers to $H_1$. But as $F_1$ is also written
\mbox{out-of-place}, $A$ still refers to the old node $F$.
\item Finally, the new root node $A_1$ is written and it refers to $F_1$.
-\item Nodes $A$, $F$, $H$ are now treated as garbage and the updated tree
-contains nodes $A_1$, $B$, $C$, $D$, $E$, $F_1$, $G$, and $H_1$.
+\item Nodes $A$, $F$, $H$ are now treated as garbage and the updated tree is
+composed of nodes $A_1$, $B$, $C$, $D$, $E$, $F_1$, $G$, and $H_1$.
+
\end{enumerate}
So, wandering trees are the basic idea of how the indexing information is going
@@ -395,7 +400,7 @@
%
% B+ TREES
%
-\subsection{B-trees}
+\subsection{B-trees} \label{ref_SectionBTrees}
JFFS3 uses \mbox{$B^+$-trees} and this subsection gives a short introduction to
\mbox{$B^+$-trees}. There are plenty of books where one may find more
@@ -443,7 +448,7 @@
\includegraphics{pics/node-01.png}
\end{htmlonly}
%begin{latexonly}
-\includegraphics[width=90mm,height=30mm]{pics/node-01.pdf}
+\includegraphics[width=160mm,height=60mm]{pics/node-01.pdf}
%end{latexonly}
\end{center}
\caption{The structure of a non-leaf node in $B^+$-tree.}
@@ -487,22 +492,26 @@
to just as "\emph{the tree}".
Every object which is stored in the tree has a \emph{key}, and the object is
-found in the tree by this key. To make it clearer what object keys are, the
+found in the tree by this key. To make it clearer what object keys are, the
following is an example of how they may look:
\begin{itemize}
+
\item file data key: \{\texttt{inode number}, \texttt{offset}\};
\item directory entry key: \{\texttt{parent directory inode number},
\texttt{direntry name hash}\};
\item extended attribute key: \{\texttt{target inode number}, \texttt{xattr
-name hash}\} and the like. \end{itemize}
+name hash}\} and the like.
+
+\end{itemize}
The following are terms which are used in JFFS3 to refer to nodes of different
levels in the tree:
\begin{itemize}
+
\item nodes of level 0 are \emph{leaf nodes};
\item nodes of level 1 are \emph{twig nodes};
@@ -512,6 +521,7 @@
\item \mbox{non-leaf} nodes (i.e., the root, branch and twig) are
\emph{indexing nodes}.
+
\end{itemize}
Note, the same terminology (except indexing nodes) is used in the Reiser4 file
@@ -523,16 +533,16 @@
\emph{sector} size.
It is important to note that somewhat unusual terminology is used in this
-document. The smallest input/output unit of the flash chip is called
-\emph{sector}. Since JFFS3 mainly orients to NAND flashes, the sector is
-mostly the NAND page and is either 512 bytes or 2 Kilobytes. For other flash
-types the sector may be different. If flash's minimal input/output unit is very
-small (like one bit in case of NOR flash) there should be a layer which
-emulates larger sectors (say, 512 bytes).
+document. The smallest input/output unit of the flash chip is called a
+\emph{sector}. Since JFFS3 is mainly oriented to NAND flashes, the sector is mostly
+the NAND page and is either 512 bytes or 2 Kilobytes. For other flash types the
+sector may be different. If flash's minimal input/output unit is very small
+(say, one bit as in case of NOR flash), there should be a layer which emulates
+larger sectors (say, 512 bytes).
In contrast to indexing nodes, leaf nodes have flexible size, just like nodes
in JFFS2. So, roughly speaking, JFFS3 file system may be considered as JFFS2
-file system (leaf nodes) plus indexing information (figure
+file system (leaf nodes) plus indexing information (indexing nodes) (see figure
\ref{ref_FigureBTree_02}).
%
@@ -554,7 +564,7 @@
Similarly to JFFS2, leaf nodes consist of \emph{header} and \emph{data}. The
header describes the node data and contains information like the key of the
node, the length, and the like. Node data contains some file system data, for
-example directory entry, file's contents, etc.
+example a directory entry, file's contents, etc.
Leaf and indexing nodes are physically separated, which means that there are
eraseblocks with only indexing nodes and with only leaf nodes. But of course,
@@ -574,7 +584,7 @@
\includegraphics[width=159mm,height=30mm]{pics/flash-01.pdf}
%end{latexonly}
\end{center}
-\caption{Leaf and indexing nodes separation illustration.}
+\caption{Illustration of leaf and indexing nodes separation.}
\label{ref_FigureFlash_01}
\end{figure}
@@ -587,7 +597,7 @@
growing number of file system objects and the tree lookup scales as
$O(\log_n{S})$ (logarithmically).
-The following are the advantages of the JFFS3 indexing approach.
+The following are advantages of the JFFS3 indexing approach.
\begin{itemize}
@@ -595,7 +605,7 @@
flexibility in how objects are sorted in the tree. Thus, one may optimize JFFS3
for specific workloads by means of changing the format of the keys.
-\item The leaf nodes may be compressed, so JFFS3 admits of \mbox{on-flight}
+\item Leaf nodes may be compressed, so JFFS3 admits of the \mbox{on-flight}
compression.
\item In case of corruptions of the indexing information it is possible to
@@ -604,7 +614,7 @@
\item There is a clear separation between data and indexing information. This
implies that the indexing information and data may be cached separately,
without overlapping in the same cache lines. This leads to better cache usage
-and is described very well in the Reiser4 paper~[\ref{ref_Reiser4}].
+as described in the Reiser4 paper~[\ref{ref_Reiser4}].
\end{itemize}
@@ -635,7 +645,7 @@
\includegraphics{pics/journal-01.png}
\end{htmlonly}
%begin{latexonly}
-\includegraphics[width=159mm,height=60mm]{pics/journal-01.pdf}
+\includegraphics[width=159mm,height=65mm]{pics/journal-01.pdf}
%end{latexonly}
\end{center}
\caption{The JFFS3 journal.}
@@ -669,29 +679,31 @@
\label{ref_FigureJournal_02}
\end{figure}
-The journal is \emph{checkpointed} when it is full or in some other appropriate
+The journal is \emph{committed} when it is full or at some other time
appropriate for JFFS3. This means that the indexing nodes corresponding to the
-journal changes are updated and written to the flash. The checkpointed
+journal changes are updated and written to the flash. The committed
journal eraseblocks are then treated as leaf eraseblocks and new journal
eraseblocks are picked by JFFS3 using the common JFFS3
\mbox{wear-levelling} algorithm.
The journal makes it possible to postpone indexing information updates to a later
and potentially more appropriate time. It also makes it possible to merge many indexing
-nodes updates and lessen the amount of flash write operations.
+node updates and reduce the number of flash write operations.
When JFFS3 file system is being mounted, the journal should be read, "replayed"
-and the journal tree should be built. So, the larger is the journal the longer
+and the journal tree should be built. So, the larger the journal is, the longer
it may take to mount JFFS3. On the other hand, the larger the journal is, the
more writes may be deferred and the better performance may be achieved. In
other words, there is a \mbox{trade-off} between the mount time and the
-performance and one may vary them by means of changing the size of the journal.
+performance and one may vary these characteristics by means of changing the
+size of the journal.
\subsection{The superblock}
-The JFFS3 \emph{superblock} is the data structure that describes
-the file system as a whole (i.e., the offset of the root node, the journal
-eraseblocks, etc). When the file system is being mounted, it first finds and
-reads the JFFS3 superblock.
+
+The JFFS3 \emph{superblock} is a data structure that describes the file system
+as a whole and contains important information like the offset of the root node,
+the journal eraseblocks, etc. When the file system is being mounted, it first
+finds and reads the JFFS3 superblock.
In case of traditional file systems the superblock usually resides at a fixed
position on the disk and may be found very quickly. Conversely, due to the
@@ -702,14 +714,17 @@
these eraseblocks will not be worn out earlier than the other eraseblocks.
We have the following two requirements that ought to be met in JFFS3:
+
\begin{itemize}
+
\item JFFS3 must be able to quickly find the superblock;
\item the superblock management techniques must not spoil the overall flash
wear levelling.
+
\end{itemize}
-In the classical file systems the superblock usually contains mostly static
+In the classical file systems the superblock usually contains a lot of static
data which is rarely updated and the superblock may have any size. In JFFS3,
the superblock must be updated quite often (e.g., each time the journal is
committed). This means that to lessen the amount of I/O, the JFFS3 superblock
@@ -718,27 +733,27 @@
system, its version, etc). For static data, JFFS3 reserves the first eraseblock
of the JFFS3 partition.
-Thus, the following terms are used:
+Thus, the following terms are used in this document:
\begin{itemize}
-\item \emph{the static superblock}~-- contains only static data which are never
+\item \emph{static superblock}~-- contains only static data which are never
changed by JFFS3; the static superblock resides at the \emph{static
eraseblock}; the static eraseblock is the first \mbox{non-bad} eraseblock of
the JFFS3 partition; it is supposed that the contents of the static eraseblock
may only be changed by external \mbox{user-level} tools;
-\item \emph{the superblock}~-- contains only dynamic data, is changed quite
+\item \emph{superblock}~-- contains only dynamic data, is changed quite
often and requires special methods to deal with.
\end{itemize}
-JFFS3 has rather complicated superblock management scheme which makes it
-possible to quickly find the superblock without any flash scanning when the
+JFFS3 has a rather complicated superblock management scheme which makes it
+possible to quickly find the superblock without full flash scanning when the
file system is being mounted. This scheme provides good flash
-\mbox{wear-levelling}. Theoretically, the superblock lookup should take few
-milliseconds and scale as $O(log_2(S))$. For more detailed information about
-the superblock management scheme see section \ref{ref_SectionSBAlg}.
+\mbox{wear-levelling}. The superblock lookup should take a few milliseconds and
+scale as $O(\log_2(S))$. For more detailed information about the superblock
+management scheme see section \ref{ref_SectionSBAlg}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
@@ -957,7 +972,7 @@
$$
where $D$ is the maximum number of flash eraseblock erase cycles, and $M$ is
-the number of non-bad eraseblock on the JFFS3 partition. We substracted 3 from
+the number of non-bad eraseblocks on the JFFS3 partition. We subtracted 3 from
$M$ to get the number of eraseblocks in the data area.
\begin{equation}
@@ -1148,7 +1163,6 @@
\end{enumerate}
-
The following is the list of topics which should be highlighted in this document
as well.
@@ -1206,13 +1220,8 @@
\begin{enumerate}
-\item Review the "definitions section". Add more terms there, e.g.,
-checkpointing, quota, branching factor, indexing, leaf, journal, anchor, chain,
-super eraseblocks.
-
-\item Change the Credits section a bit.
-
\item Re-calculate digits for SB search time and $m$.
+
\end{enumerate}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -1224,86 +1233,149 @@
\begin{enumerate}
-\item \textbf{Access Control List, ACL}~-- a modern mechanism to control
-accesses to files, see [\ref{ref_ACL}] for more details.
+\item \textbf{Access Control Lists, ACL}~-- a modern mechanism to control
+accesses to files which provides much more flexibility than the standard Unix
+mechanism of owner/group/others permissions, see [\ref{ref_ACL}] for more
+details.
\item \textbf{Anchor eraseblock, anchor area}~-- the second and the third
\emph{good} eraseblocks of the JFFS3 partition which are reserved for the
-superblock management.
+superblock management. See section \ref{ref_SectionSBAlg} for more details.
+
+\item \textbf{$B$-tree}~-- a balanced search tree where each node has many
+children. See section \ref{ref_SectionBTrees}.
+
+\item \textbf{$B^+$-tree}~-- a \mbox{$B$-tree} where no data is stored in
+\mbox{non-leaf} nodes but instead, is stored only in leaf nodes.
\item \textbf{Branch node}~-- any node that is not leaf, not twig and not root.
+\item \textbf{Branching factor}~-- the branching factor of the $B$-tree is the
+number of children of a node.
+
\item \textbf{Chain eraseblock}~-- an eraseblock containing references to other
chain eraseblocks or to the super eraseblock. Chain eraseblocks facilitate
-quick SB searching and are part of the JFFS3 superblock management scheme (see
-section \ref{ref_SectionSBAlg}).
-
-\item \textbf{Directory entry}~-- basically associates a name with an inode
-number. Directories may be considered as a list of directory entries.
+quick SB searching and are part of the JFFS3 superblock management scheme
+(see section \ref{ref_SectionSBAlg}). The main reason why chain eraseblocks are
+needed is the need to provide good flash \mbox{wear-levelling}.
+
+\item \textbf{Directory entry, direntry}~-- basically an association between
+the name and the inode number. Like any other object, direntries are stored at the
+leaf level of the tree in direntry nodes.
\item \textbf{Erasable block, eraseblock}~-- the minimal erasable unit of the
flash chip from the JFFS3's viewpoint.
+\item \textbf{Extended attributes, xattr}~-- an association between names and
+data for files and directories. See attr(5) Linux manual pages for more
+information.
+
\item \textbf{Data area}~-- the whole JFFS3 partition excluding the static
-superblock and the anchor eraseblocks.
+superblock and anchor eraseblocks.
-\item \textbf{Dirty sector}~-- a sector with data which is not valid any longer,
-recycled by Garbage Collector.
+\item \textbf{Dirt, dirty space}~-- information on flash which is not valid due
+to \mbox{out-of-place} updates or object deletion. It is the aim of the
+Garbage Collector to reclaim the space occupied by dirt.
-\item \textbf{Garbage Collector,}~-- a part of any Flash File System which
+\item \textbf{Dirty sector}~-- a sector which contains dirt.
+
+\item \textbf{Fanout}~-- the same as \textbf{branching factor}.
+
+\item \textbf{Garbage}~-- the same as \textbf{dirt}.
+
+\item \textbf{Garbage Collector}~-- a part of any Flash File System which
is responsible for recycling dirty space and producing free eraseblocks.
\item \textbf{Indexing information, index}~-- data structures which do not
-contain files, directories, extended attributes or whatever is seen by user,
-but instead, keep track of this data. For example, indexing information allows
-to quickly find all the directory entries for any specified directory. I case
-of the FAT file system, the File Allocation Table is the index, in case of ext2
-the inode table, the bitmap and the set of direct, indirect, doubly indirect
-and triply indirect pointers may be considered as the index. In JFFS3, the
-indexing nodes may be referred to as the index.
-
-\item \textbf{Journal}~-- contains all the recent JFFS3 changes. Its purpose is
-to accumulate a bunch of JFFS3 file system changes and and to postpone updating
-the index. See section \ref{ref_SectionJournalIntro} for more information.
+contain any file system data (files, directories, extended attributes, etc) but
+instead, keep track of this data. For example, indexing information allows to
+quickly find all the directory entries for any specified directory. In case of
+the FAT file system, the File Allocation Table may be treated as the index,
+in case of the ext2 file system the inode table, the bitmap and the set of
+direct, indirect, doubly indirect and triply indirect pointers may be
+considered as the index. In JFFS3, the index is composed of the indexing
+nodes. See section \ref{ref_SectionIndexing} for more information.
+
+\item \textbf{Indexing eraseblock}~-- an eraseblock which contains indexing
+nodes.
+
+\item \textbf{Indexing node}~-- a \mbox{non-leaf} node. Indexing nodes have
+fixed size (one sector) and contain only keys and links.
+
+\item \textbf{In-place updates, in-place writes}~-- a method of updating
+\mbox{on-media} data when the update is written to the physical position where
+the data resides (as opposed to \mbox{out-of-place} updates).
+
+\item \textbf{Journal}~-- contains recent JFFS3 changes and all the file system
+updates first go to the journal. The purpose of the Journal is to accumulate a
+bunch of JFFS3 file system changes and to postpone updating the index. See
+section \ref{ref_SectionJournalIntro} for more information.
+
+\item \textbf{Journal commit}~-- the process of \mbox{re-building} the indexing
+information for the data which is in the journal. After the journal has been
+committed the journal eraseblocks become just leaf eraseblocks.
\item \textbf{Journal eraseblock}~-- an eraseblock containing the journal data.
\item \textbf{Journal tree}~-- an \mbox{in-memory} tree referring to Journal nodes
-which were not committed so far. For more information see section
-\ref{ref_SectionJournalIntro}.
+which were not committed so far. When JFFS3 reads, it first looks up the
+journal tree to find out whether the searched information is there. See
+section \ref{ref_SectionJournalIntro} for more details.
+
+\item \textbf{Key}~-- an identifier of objects in the tree.
+
+\item \textbf{Leaf eraseblock}~-- an eraseblock containing leaf nodes.
\item \textbf{Leaf node}~-- any node from the leaf level of the tree (level 0).
Leaf nodes contain only data and do not further refer to other nodes. For more
information see section \ref{ref_SectionIndexing}.
-\item \textbf{Node}~-- a pile of the tree (the tree consists of nodes). There
-are different types of nodes in JFFS3. For more information see section
-\ref{ref_SectionIndexing}.
+\item \textbf{Node}~-- a piece of the tree (the tree consists of nodes) as well
+as the container for file system data. There are different types of nodes in
+JFFS3. For more information see section \ref{ref_SectionIndexing}.
-\item \textbf{Out-of-place updates, out-of-place writes}~-- a sort of data
-update when the updated version is not written to the same physical position,
-but instead, written to some other place and the previous contents is treated
-as garbage afterwords. Opposite to \mbox{in-place} updates.
+\item \textbf{Obsolete nodes/data/sectors}~-- the same as \textbf{dirty} nodes,
+data or sectors.
-\item \textbf{Sector}~-- the smallest writable unit of the \emph{flash chip},
-from the JFFS3's viewpoint. E.e. the NAND page in case of NAND.
+\item \textbf{Out-of-place updates, out-of-place writes}~-- a sort of data
+update when the update is not written to the same physical position, but
+instead, is written to some other place and the previous contents are treated as
+garbage afterwards. Opposite to \mbox{in-place} updates.
+
+\item \textbf{Sector}~-- the smallest writable unit of the \emph{flash chip}
+from JFFS3's viewpoint. May be equivalent to the minimal physical input/output
+unit (like in case of NAND flashes) or larger (like in case of NOR flashes).
\item \textbf{Static eraseblock}~-- the first good erasable block of the JFFS3
-partition where the \mbox{per-file} system static data is stored. JFFS3 may
+partition where the file system static data is stored. JFFS3 may
only read it and it is created/changed by external formatting tools.
-\item \textbf{Superblock}~-- a data structure describes the whole
+\item \textbf{Superblock}~-- a data structure which describes the whole
JFFS3 file system. Only dynamic data is stored in the superblock, all the
-static data is kept in the static superblock.
+static data is kept in the static superblock. There is a comprehensive
+superblock management scheme in JFFS3, see section \ref{ref_SectionSBAlg}.
-\item \textbf{Tree}~-- the main object JFFS3 design revolves about. The JFFS3
-tree is the wandering \mbox{$B^+$-tree} where all the file system stuff (files,
+\item \textbf{Super eraseblock}~-- an eraseblock where the superblock is kept.
+See section \ref{ref_SectionSBAlg} for details.
+
+\item \textbf{Quota}~-- a mechanism which allows assigning different limits on
+the file system (e.g., restrict users in the number of files they may create or
+in the amount of space they may consume, etc). See [\ref{ref_Quota}] for more
+details about quota support in Linux.
+
+\item \textbf{Tree}~-- the main entity the JFFS3 design revolves around. The
+JFFS3 tree is a wandering \mbox{$B^+$-tree} where all file system data (files,
directories, extended attributes, etc.) is stored.
-\item \textbf{Twig nodes}~-- reside one level upper then leaf nodes (level 1).
+\item \textbf{Twig nodes}~-- nodes which reside one level above the leaf nodes
+(level 1).
+
+\item \textbf{Wandering tree}~-- a method of updating trees when
+\mbox{in-place} updates are not possible. The JFFS3 tree is a wandering
+\mbox{$B^+$-tree}. See section \ref{ref_SectionWandTrees} for more information.
-\item \textbf{xattr}~-- extended attributes, associate name/value pairs with
-files and directories, see attr(5) Linux manual pages for more information.
+\item \textbf{Xattr}~-- a widely used contracted form for \textbf{extended
+attributes}.
\end{enumerate}
@@ -1313,10 +1385,12 @@
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Symbols} \label{ref_SectionSymbols}
+
The following is the list of symbols which are used to denote different things
throughout this document.
\begin{itemize}
+
\item $D$~-- number of guaranteed erases of flash eraseblocks (typically
$\sim 10^5$ for NAND flashes);
@@ -1334,6 +1408,7 @@
\item $S$~-- the size of the JFFS3 flash partition (assuming there are no bad
blocks).
+
\end{itemize}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -1359,29 +1434,30 @@
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Credits}
-The following are the people I am grateful for help (alphabetical order):
+The following are the people I am very grateful to for their help (in alphabetical order):
\begin{itemize}
\item \textbf{David Woodhouse} \texttt{<dwmw2 at infradead.org>}~-- the author of
-JFFS2, answered a great deal of my questions about MTD, JFFS2 and JFFS3 design
-approaches, English (e.g., "how to express this correctly in English"), etc.
+JFFS2, answered a great many of my questions about MTD and JFFS2 and suggested
+some interesting ideas for JFFS3.
-\item \textbf{Joern Engel} \texttt{<joern at wohnheim.fh-wedel.de>}~-- discussed a
-lot of JFFS3 design aspects with me. Some ideas described in this document were
-pointed by Joern.
+\item \textbf{Joern Engel} \texttt{<joern at wohnheim.fh-wedel.de>}~-- discussed
+many aspects of a new scalable flash file system with me. Joern is developing
+his own flash file system \emph{LogFS}.
\item \textbf{Nikita Danilov} \texttt{<nikita at clusterfs.com>}~-- used to work
-in Namesys and implemented ReiserFS and Reiser4 file systems.
+at \emph{Namesys} and implemented the ReiserFS and Reiser4 file systems.
Nikita answered many of my questions about Reiser4 FS internals.
\item \textbf{Thomas Gleixner} \texttt{<tglx at linutronix.de>}~-- helped me with
many MTD-related things, especially concerning flash hardware and low-level
-flash software. Proposed some ideas which I'm exploiting in the JFFS3 design.
+flash software.
\item \textbf{Victor V. Vengerov} \texttt{<vvv at oktetlabs.ru>}~-- my colleague
from OKTET~Labs who spent a lot of time discussing the JFFS3 design approaches
-with me and suggested many interesting ideas. He also reviewed some of my
-writings. \end{itemize}
+with me and suggested many interesting ideas.
+
+\end{itemize}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
@@ -1389,7 +1465,9 @@
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{References}
+
\begin{enumerate}
+
\item \raggedright \label{ref_JFFSdwmw2}
JFFS : The Journalling Flash File System,\\
\url{http://sources.redhat.com/jffs2/jffs2-html/}
@@ -1413,6 +1491,10 @@
POSIX Access Control Lists on Linux
\url{http://www.suse.de/~agruen/acl/linux-acls/}
+\item \raggedright \label{ref_Quota}
+Quota mini-HOWTO
+\url{http://www.tldp.org/HOWTO/Quota.html}
+
\end{enumerate}
\end{document}