mtd/Documentation/jffs3 JFFS3design.tex, 1.28, 1.29 definit.tex, 1.2,
1.3 intro.tex, 1.2, 1.3 jffs2.tex, 1.2, 1.3 jffs3req.tex, 1.2,
1.3 ref.tex, 1.2, 1.3 super.tex, 1.2, 1.3 tree.tex, 1.2, 1.3
Artem Bityutskiy
dedekind at infradead.org
Wed Nov 16 12:13:09 EST 2005
Update of /home/cvs/mtd/Documentation/jffs3
In directory phoenix.infradead.org:/tmp/cvs-serv15492
Modified Files:
JFFS3design.tex definit.tex intro.tex jffs2.tex jffs3req.tex
ref.tex super.tex tree.tex
Log Message:
Add keys compression chapter.
Various minor fixes across the document.
Fixes in pictures.
Switch to version 0.30.
Index: JFFS3design.tex
===================================================================
RCS file: /home/cvs/mtd/Documentation/jffs3/JFFS3design.tex,v
retrieving revision 1.28
retrieving revision 1.29
diff -u -r1.28 -r1.29
--- JFFS3design.tex 15 Nov 2005 12:41:43 -0000 1.28
+++ JFFS3design.tex 16 Nov 2005 17:13:05 -0000 1.29
@@ -57,9 +57,9 @@
\large{Artem B. Bityutskiy\\
dedekind at infradead.org}\\
\vspace{13cm}
-\large{Version 0.29 (draft)}\\
+\large{Version 0.30 (draft)}\\
\vspace{0.5cm}
-November 15, 2005
+November 16, 2005
\end{center}
\end{titlepage}
%\maketitle
@@ -248,6 +248,9 @@
\item Re-calculate digits for SB search time and $m$.
+\item For now only the idea of keys compression methods is provides. Would be
+nice to describe algorithms more strictly.
+
\end{enumerate}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Index: intro.tex
===================================================================
RCS file: /home/cvs/mtd/Documentation/jffs3/intro.tex,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -r1.2 -r1.3
--- intro.tex 15 Nov 2005 12:41:43 -0000 1.2
+++ intro.tex 16 Nov 2005 17:13:05 -0000 1.3
@@ -224,7 +224,7 @@
to just as "\emph{the tree}".
Every object which is stored in the tree has a \emph{key}, and objects are
-found in the tree by their key. To make it clearer what are object keys, the
+found in the tree by their keys. To make it clearer what are object keys, the
following is an example of how they may look like:
\begin{itemize}
Index: tree.tex
===================================================================
RCS file: /home/cvs/mtd/Documentation/jffs3/tree.tex,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -r1.2 -r1.3
--- tree.tex 15 Nov 2005 12:41:43 -0000 1.2
+++ tree.tex 16 Nov 2005 17:13:05 -0000 1.3
@@ -111,15 +111,15 @@
Lets start discussing JFFS3 keys with an example of a simple key layout which
is further referred to as the \emph{trivial key scheme}. All keys in this
-scheme have the same \mbox{56-bit} length (see figure~\ref{ref_FigureTrivKey}).
+scheme have the same \mbox{47-bit} length (see figure~\ref{ref_FigureTrivKey}).
\begin{itemize}
\item Data keys consist of the \mbox{32-bit} inode number the data belongs to,
-the unique \mbox{3-bit} key type identifier, and the \mbox{19-bit} data offset.
+the unique \mbox{3-bit} key type identifier, and the \mbox{20-bit} data offset.
\item Direntry keys consist of the \mbox{32-bit} parent directory inode number,
-the unique \mbox{3-bit} key type identifier, and the \mbox{19-bit} direntry
+the unique \mbox{3-bit} key type identifier, and the \mbox{20-bit} direntry
name hash value.
\item Attr-data keys consist of the \mbox{32-bit} inode number the attributes
@@ -127,10 +127,10 @@
\item Xentry keys consist of the \mbox{32-bit} inode number the extended
attribute belongs to, the unique \mbox{3-bit} key type identifier, and the
-\mbox{19-bit} extended attribute name hash value.
+\mbox{20-bit} extended attribute name hash value.
\item Xattr-data keys consist of the \mbox{32-bit} xattr ID, the unique
-\mbox{3-bit} key type identifier, and the \mbox{19-bit} extended attribute data
+\mbox{3-bit} key type identifier, and the \mbox{20-bit} extended attribute data
offset.
\item Acl keys consist of the \mbox{32-bit} inode number the acl object belongs
@@ -138,6 +138,19 @@
\end{itemize}
+The following is the list of key type identifiers.
+
+\begin{enumerate}
+
+\item Data keys~-- \texttt{0} (\texttt{000} bin);
+\item Direntry keys~-- \texttt{1} (\texttt{001} bin);
+\item Attr-data keys~-- \texttt{2} (\texttt{010} bin);
+\item Xentry keys~-- \texttt{3} (\texttt{011} bin);
+\item Xattr-data keys~-- \texttt{4} (\texttt{100} bin);
+\item Acl keys~-- \texttt{4} (\texttt{101} bin);
+
+\end{enumerate}
+
%
% The trivial key scheme
%
@@ -156,7 +169,7 @@
Since data objects may only contain multiple RAM~pages of data (excluding small
files and files' tails) offsets in keys are always RAM~page
-\mbox{size-aligned}. Assuming RAM~page is 4K (13~bits), 19~bits is enough to
+\mbox{size-aligned}. Assuming RAM~page is 4K (12~bits), 20~bits is enough to
refer up to 4GB of data. To put it differently, the trivial key scheme limits
files' length to 4GB providing system's RAM~page size is 4K.
@@ -184,7 +197,7 @@
where $p$ is the number of components in keys of this type.
Keys $K_1$ and $K_2$ are considered to be \emph{equivalent} if and only if all
-their fields are equivalent, i.e. $k^i_1 = k^i_2i = 1, 2, ..., p$.
+their fields are equivalent, i.e. $k^i_1 = k^i_2, i = 1, 2, ..., p$.
Keys are compared \mbox{field-by-field} starting from the first field. If on
$i$'th step $k^i_1 > k^i_2$, then $K_1$ is considered to be \emph{greater} then
@@ -210,7 +223,7 @@
insufficient or, conversely, too much and one may want to use less bits to
encode inode numbers (say, only 24 bits), just for optimization.
-\item Similarly, offsets are encoded as \mbox{19-bit} integers in the trivial
+\item Similarly, offsets are encoded as \mbox{20-bit} integers in the trivial
key scheme which may be insufficient when huge files (larger then 4G) should be
supported. So one may want to use more bits in certain cases.
@@ -254,4 +267,109 @@
So it is obvious why JFFS3 is not attached to a fixed key scheme but instead,
admits of many different key schemes (one at a time of course) with a
-possibility to choose the best suited key scheme.
+possibility to choose the best suited key scheme.
+
+%
+% Keys compression
+%
+\subsubsection{Keys compression}
+
+The index is the most frequently \mbox{re-written} part of JFFS3. Indeed, every
+single change at the leaf level of the tree requires \mbox{re-writing} $L-1$
+indexing nodes. So, it is really, really important for JFFS3 to keep the tree
+as shallow as it is possible. This means that it makes sense to apply a sort of
+compression to indexing nodes.
+
+There are several ways to compress keys and the following are the examples of
+possible compression techniques.
+
+\begin{description}
+
+\item[Offset compression.] The compression of offsets in keys may be based on
+the observation that the overwhelming majority of files in many file systems
+are small. This means that it might makes sense to code smaller offsets by
+fewer bits.
+
+Table \ref{ref_TableOffsCodes} contains an example of how offsets may be
+encoded. For offsets in range \texttt{0KB-8KB} only 3~bits are enough, so
+bit sequence "\texttt{000}" will encode offset \texttt{0}, and bit sequence
+"\texttt{001}" will encode offset \texttt{4K} (remember, offsets in keys are
+\mbox{RAM page-aligned} and the RAM~page size is assumed to be 4K). Offsets in
+range \texttt{8KB-64KB} are encoded by 6~bits and so on.
+
+\begin{table}[h]
+\begin{center}
+\begin{tabular}{llll}
+\textbf{Offset range} & \textbf{Bits in range} & \textbf{Code prefix} & \textbf{Code length}\\
+\hline
+\texttt{0KB-8KB} & 13 bits & \texttt{00} & 3 bits\\
+\texttt{8KB-64KB} & 16 bits & \texttt{01} & 6 bits\\
+\texttt{64KB-1MB} & 20 bits & \texttt{10} & 10 bits\\
+\texttt{1MB-128MB} & 27 bits & \texttt{110} & 18 bits\\
+\texttt{128MB-1GB} & 32 bits & \texttt{111} & 23 bits\\
+\end{tabular}
+\caption{An example of offset compression.}
+\label{ref_TableOffsCodes}
+\end{center}
+\end{table}
+
+Note, table \ref{ref_TableOffsCodes} contains just an example and does not
+pretend to work well in all cases.
+
+\item[Inode number compression.] If the approximate number of inodes on the
+file system is known in advance, similar coding scheme may be exploited for
+inode numbers providing JFFS3 may reuse deleted files' inode numbers.
+
+\item[Prefix compression.] In case of the trivial key scheme the first part of
+any key is the inode number and for sequences of keys with the same inode
+number the inode number may be specified only once as a common prefix.
+
+%
+% Prefix compression example
+%
+\begin{figure}[h]
+\begin{center}
+\begin{htmlonly}
+\includegraphics{pics/comprex-01.png}
+\end{htmlonly}
+%begin{latexonly}
+\includegraphics[width=75mm,height=60mm]{pics/comprex-01.pdf}
+%end{latexonly}
+\end{center}
+\caption{An example of prefix compression.}
+\label{ref_FigureKeyComprEx_01}
+\end{figure}
+
+Figure \ref{ref_FigureKeyComprEx_01} illustrates how prefix compression works.
+Suppose there are 3 consecutive keys with the same inode number. Compressed
+keys have the following format.
+
+\begin{itemize}
+
+\item The first field is the common inode number.
+
+\item The second field is the \mbox{3-bit} key type identifier. Prefix
+compression makes use of reserved key type identifier 7 (\texttt{type~=~7}).
+
+\item The next field is the compression descriptor (\texttt{CD}) which takes
+few bits and describes the compression. In our example, the compression
+descriptor says that the next two fields are \texttt{count} and the common
+\texttt{type~=~0} (remember, the keys could have had \mbox{non-equivalent}
+types). The \texttt{count} field defines to how many of following keys the
+common \{\texttt{inode~\#~=~17,~type~=~0}\} prefix ought to be applied.
+
+\end{itemize}
+
+\end{description}
+
+Thus, because of compression, JFFS3 keys have variable size which means that it
+is impossible to directly apply the binary search algorithm to the contents of
+indexing nodes. In JFFS3, indexing nodes are decompressed when read and are
+cached in decompressed form. After the indexing node have beed decompressed,
+the binary search algorithm is applicable.
+
+We hope that keys compression will considerably reduce the amount of
+\mbox{on-flash} indexing information and increase the overall performance (just
+because the amount of Input/Output will lessen). But only actual
+\mbox{fixed-size} keys~vs.~\mbox{variable-size} keys tests will show if there
+is some real performance gain.
More information about the linux-mtd-cvs
mailing list