mtd/Documentation/jffs3 definit.tex, NONE, 1.1 intro.tex, NONE, 1.1 jffs2.tex, NONE, 1.1 jffs3req.tex, NONE, 1.1 ref.tex, NONE, 1.1 super.tex, NONE, 1.1 tree.tex, NONE, 1.1 JFFS3design.tex, 1.26, 1.27

Sun Nov 13 12:53:44 EST 2005

Update of /home/cvs/mtd/Documentation/jffs3
In directory phoenix.infradead.org:/tmp/cvs-serv5292

Modified Files:
	JFFS3design.tex 
Added Files:
	definit.tex intro.tex jffs2.tex jffs3req.tex ref.tex super.tex 
	tree.tex 
Log Message:
Add keys description (not complete).
Split the file as it is getting too large.



***** Error reading new file: [Errno 2] No such file or directory: 'definit.tex'
***** Error reading new file: [Errno 2] No such file or directory: 'intro.tex'
***** Error reading new file: [Errno 2] No such file or directory: 'jffs2.tex'
***** Error reading new file: [Errno 2] No such file or directory: 'jffs3req.tex'
***** Error reading new file: [Errno 2] No such file or directory: 'ref.tex'
***** Error reading new file: [Errno 2] No such file or directory: 'super.tex'

--- NEW FILE tree.tex ---
%
% OBJECTS
%
\subsection{Objects} \label{ref_SectionObjects}

JFFS3 keeps file system objects in the leaf level of the tree (in leaf nodes).
The supported objects are listed below.

\begin{enumerate}

\item \emph{Data} objects contain files' data and are kept in \emph{data
nodes}. Each data node holds one \mbox{RAM page} of data (i.e.,
\texttt{PAGE\_SIZE} bytes, which is 4K on most \mbox{32-bit} architectures).
For small files (less than one RAM page) and for files' tails a data node
holds less data.

%
% Figure illustrating data objects
%
\begin{figure}[h]
\begin{center}
\begin{htmlonly}
\includegraphics{pics/dataobj-01.png}
\end{htmlonly}
%begin{latexonly}
\includegraphics[width=130mm,height=90mm]{pics/dataobj-01.pdf}
%end{latexonly}
\end{center}
\caption{An illustration of files' data representation.}
\label{ref_FigureDataObj-01}
\end{figure}

Figure~\ref{ref_FigureDataObj-01} illustrates the correspondence between files'
contents and data objects. Each \mbox{RAM~page-size} piece of a 13K file
corresponds to a data node in the JFFS3 tree. The 1K tail of the file also
corresponds to a data node. Because of compression, the actual size of a
data node is less than that of the corresponding file fragment.

The division into \mbox{RAM~page-sized} fragments relates to the Linux Virtual
Memory Management architecture. Namely, the Linux \emph{Page Cache} works in
terms of RAM pages, which means that JFFS3 is always asked to read and write
files in units of the RAM page size.

It is worth noting that in order to optimize flash utilization, JFFS3 may store
a multiple of \mbox{RAM page} bytes in one data node for static files. This
allows better compression and leads to several other benefits.

\item \emph{Direntry} objects contain the correspondence between directory
entry names and inode numbers. Direntry objects are stored in \emph{direntry
nodes}. Every directory entry in the file system has a corresponding direntry
object.

\item \emph{Attr-data} objects contain attributes of inodes~-- both standard
Unix attributes like user~ID, last modification time, inode
length, etc., and \mbox{JFFS3-specific} attributes like the type of
compression, etc. Each inode has only one corresponding \mbox{attr-data} object.

\item \emph{Xentry} objects contain the correspondence between names of
extended attributes and \emph{xattr~IDs}. Every extended attribute in the file
system has a corresponding xentry object. This is analogous to direntry
objects, but xentries contain the
\{\texttt{xattr~name}$\Rightarrow$\texttt{xattr~ID}\} mapping instead of the
\{\texttt{direntry~name}$\Rightarrow$\texttt{inode~number}\} mapping in
direntries.

Each extended attribute in JFFS3 has its own unique number~-- xattr~ID, just
like every inode has its own unique inode number. In fact, JFFS3 uses the same
number space to enumerate inodes and extended attributes.

Xentry objects are stored in \emph{xentry nodes}.

\item \emph{Xattr-data} objects contain the data of extended attributes.
\mbox{Xattr-data} objects are kept in the tree in the same way as data
objects. \emph{Xattr-data} objects are stored in \emph{xattr-data nodes}.

\item \emph{Acl} objects contain Access Control Lists (ACL) of inodes
(information about ACLs may be found in~[\ref{ref_ACL}]). Acl objects are
stored in \emph{acl nodes}.

In \mbox{real-world} systems many files have identical ACLs while only a few
files have unique ACLs. For the former group of files (or more strictly~--
inodes) JFFS3 makes use of \emph{shared~acl} objects. This means that there is
only one acl object instance for all of these inodes. Shared acls are referred
to from \mbox{attr-data} objects of these inodes. If a shared acl is written
to, a new acl object is created (\mbox{copy-on-write} mechanism). Conversely, for
the latter group there is a distinct acl object for each inode.

\end{enumerate}
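
The fragmentation of a file into data objects described in the Data item above
can be sketched as follows. This is a minimal illustration in Python rather
than JFFS3 code: the function name \texttt{split\_into\_fragments} and the
constant \texttt{PAGE\_SIZE = 4096} are assumptions made for the example, and
compression is ignored, so the lengths are those of the file fragments, not of
the data nodes on flash.

```python
PAGE_SIZE = 4096  # assumed RAM page size (4K, as on most 32-bit architectures)


def split_into_fragments(file_size, page_size=PAGE_SIZE):
    """Return (offset, length) pairs, one per data node, for a file.

    Every fragment except possibly the last one is exactly one page long;
    the last one is the file's tail and may be shorter.
    """
    fragments = []
    offset = 0
    while offset < file_size:
        length = min(page_size, file_size - offset)
        fragments.append((offset, length))
        offset += length
    return fragments


# A 13K file: list the fragment each data node would correspond to.
for offset, length in split_into_fragments(13 * 1024):
    print(f"offset={offset:6d} length={length}")
```

For the 13K file of Figure~\ref{ref_FigureDataObj-01} this gives three full 4K
fragments followed by a 1K tail.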

%
% KEYS
%
\subsection{Keys} \label{ref_SectionKeys}

Each object has its own key and may be quickly found in the tree by its
key. As there are 6 object types in JFFS3, there are also 6 key types:

\begin{enumerate}
\item \emph{data keys}~-- index data objects;
\item \emph{direntry keys}~-- index direntry objects;
\item \emph{attr-data keys}~-- index attr-data objects;
\item \emph{xentry keys}~-- index xentry objects;
\item \emph{xattr-data keys}~-- index xattr-data objects;
\item \emph{acl keys}~-- index acl objects.
\end{enumerate}

%
% Trivial key scheme
%
\subsubsection{Trivial key scheme}

Let us start discussing JFFS3 keys with an example of a simple key layout which
is further referred to as the \emph{trivial key scheme}. All keys in this
scheme have the same \mbox{67-bit} length (see
Figure~\ref{ref_FigureTrivKey}).

\begin{itemize}

\item Data keys consist of the \mbox{32-bit} inode number the data belongs to,
the unique \mbox{3-bit} key type identifier, and the \mbox{32-bit} data offset.

\item Direntry keys consist of the \mbox{32-bit} parent directory inode number,
the unique \mbox{3-bit} key type identifier, and the \mbox{32-bit} direntry
name hash value.

\item Attr-data keys consist of the \mbox{32-bit} inode number the attributes
belong to, and the unique \mbox{3-bit} key type identifier.

\item Xentry keys consist of the \mbox{32-bit} inode number the extended
attribute belongs to, the unique \mbox{3-bit} key type identifier, and the
\mbox{32-bit} extended attribute name hash value.

\item Xattr-data keys consist of the \mbox{32-bit} xattr ID, the unique
\mbox{3-bit} key type identifier, and the \mbox{32-bit} extended attribute data
offset.

\item Acl keys consist of the \mbox{32-bit} inode number the acl object belongs
to, and the unique \mbox{3-bit} key type identifier.

\end{itemize}

%
% The trivial key scheme
%
\begin{figure}[h]
\begin{center}
\begin{htmlonly}
\includegraphics{pics/trivkey-01.png}
\end{htmlonly}
%begin{latexonly}
\includegraphics[width=110mm,height=75mm]{pics/trivkey-01.pdf}
%end{latexonly}
\end{center}
\caption{The trivial key scheme.}
\label{ref_FigureTrivKey}
\end{figure}

The trivial key scheme is not actually used in JFFS3 and is only needed to
ease the further discussion.
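
To make the layout concrete, the sketch below packs the three fields of a
trivial key into a single \mbox{67-bit} integer. It is only an illustration
under stated assumptions: the numeric type identifiers are hypothetical, the
bit order (number field most significant, then type, then offset or hash) is
assumed from the order in which the fields are listed above, and keys without
a third field (attr-data and acl keys) are assumed to leave it zero.

```python
INODE_BITS = 32   # inode number or xattr ID
TYPE_BITS = 3     # key type identifier
OFFSET_BITS = 32  # data offset or name hash
KEY_BITS = INODE_BITS + TYPE_BITS + OFFSET_BITS  # 67

# Hypothetical identifiers: the text does not assign numeric values.
KEY_TYPES = {
    "data": 0,
    "direntry": 1,
    "attr-data": 2,
    "xentry": 3,
    "xattr-data": 4,
    "acl": 5,
}


def make_key(number, key_type, low=0):
    """Pack (number, key type, offset or hash) into one integer key.

    Assumed bit order: number | type | offset-or-hash, most significant first.
    """
    if not 0 <= number < (1 << INODE_BITS):
        raise ValueError("number does not fit into 32 bits")
    if not 0 <= low < (1 << OFFSET_BITS):
        raise ValueError("offset or hash does not fit into 32 bits")
    type_id = KEY_TYPES[key_type]
    return (number << (TYPE_BITS + OFFSET_BITS)) | (type_id << OFFSET_BITS) | low


def split_key(key):
    """Inverse of make_key: return (number, type identifier, offset or hash)."""
    low = key & ((1 << OFFSET_BITS) - 1)
    type_id = (key >> OFFSET_BITS) & ((1 << TYPE_BITS) - 1)
    number = key >> (TYPE_BITS + OFFSET_BITS)
    return number, type_id, low
```

Under the assumed bit order, comparing keys as integers orders them first by
the number field, then by type, then by offset or hash, so the data keys of
one file are sorted by offset.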

\subsubsection{Key schemes}

The trivial key scheme makes use of \mbox{32-bit} inode numbers and
\mbox{32-bit} file offsets. But if a system needs to use huge files (larger
than 4GB), 32~bits are insufficient for the offset and more bits should be
used. Similarly, the file system may need to hold a huge number of files, so
32~bits may not be sufficient to store the inode number.

On the other hand, some systems may have only a few inodes and, say,
24~bits may be enough. Similarly, some systems may have a small flash
partition and not use large files, so 30~bits for the file offset may be
enough for them.
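
The trade-off between key size and file system limits can be made explicit
with a small calculation. The \texttt{KeyScheme} class below is a hypothetical
helper, not part of JFFS3; it assumes a \mbox{3-bit} type identifier as in the
trivial key scheme and treats the offset field as a byte offset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyScheme:
    """Field widths of a key layout (hypothetical helper for illustration)."""

    inode_bits: int
    type_bits: int
    offset_bits: int

    @property
    def key_bits(self):
        """Total key length in bits."""
        return self.inode_bits + self.type_bits + self.offset_bits

    @property
    def max_inodes(self):
        """Number of distinct inode numbers the key can express."""
        return 1 << self.inode_bits

    @property
    def max_file_size(self):
        """Largest file size in bytes addressable by the offset field."""
        return 1 << self.offset_bits


trivial = KeyScheme(inode_bits=32, type_bits=3, offset_bits=32)
compact = KeyScheme(inode_bits=24, type_bits=3, offset_bits=30)

for name, scheme in (("trivial", trivial), ("compact", compact)):
    print(name, scheme.key_bits, scheme.max_inodes, scheme.max_file_size)
```

With 24~bits for the inode number and 30~bits for the offset, a key shrinks
from 67 to 57~bits at the cost of limiting the file system to $2^{24}$ inodes
and 1GB files.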

It is also possible that one may want to use more elaborate key layouts to
achieve different kinds of optimizations. For example, direntry keys may
include the first 8~bytes (64~bits) of the direntry name (see
Figure~\ref{ref_FigureDirentKeyEx_01}). In this case the
\texttt{getdents}\footnote{See the \texttt{getdents(2)} Linux manual page.}
Linux system call will return direntries in mostly alphabetical order, and
\mbox{user-space} programs will not spend much time sorting them. In fact, this
technique is used in Reiser4, and it is claimed that slow sorting is a
bottleneck in certain file system workloads.

%
% Direntry key layout example
%
\begin{figure}[h]
\begin{center}
\begin{htmlonly}
\includegraphics{pics/keyex-01.png}
\end{htmlonly}
%begin{latexonly}
\includegraphics[width=110mm,height=8mm]{pics/keyex-01.pdf}
%end{latexonly}
\end{center}
\caption{Direntry key layout example.}
\label{ref_FigureDirentKeyEx_01}
\end{figure}
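
The effect of embedding a name prefix in the key can be seen in a small
experiment. The sketch below is an assumption-laden illustration, not the
layout of Figure~\ref{ref_FigureDirentKeyEx_01}: it models a direntry key as
the tuple (parent inode number, type identifier, first 8~bytes of the name),
with a hypothetical type identifier of 1 and short names padded with zero
bytes.

```python
def direntry_key(parent_ino, name, type_id=1):
    """Build a sortable key from the parent inode and an 8-byte name prefix.

    Names shorter than 8 bytes are padded with zero bytes; longer names are
    truncated, so names sharing their first 8 bytes get identical keys.
    """
    prefix = name.encode("utf-8")[:8].ljust(8, b"\0")
    return (parent_ino, type_id, prefix)


names = [
    "makefile",
    "README",
    "documentation.txt",
    "docs",
    "a.out",
    "documentation.old",
]

# Order the entries of directory inode 2 by key, as a tree walk would.
ordered = sorted(names, key=lambda n: direntry_key(2, n))
print(ordered)
```

Entries come out in byte-wise alphabetical order except where names share
their first 8~bytes: \texttt{documentation.txt} and
\texttt{documentation.old} have equal keys here, so their relative order is
not determined by the key. This is why the resulting order is only mostly
sorted.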

\texttt{[To be continued.]}

Index: JFFS3design.tex
===================================================================
RCS file: /home/cvs/mtd/Documentation/jffs3/JFFS3design.tex,v
retrieving revision 1.26
retrieving revision 1.27
diff -u -r1.26 -r1.27
--- JFFS3design.tex	13 Nov 2005 12:47:32 -0000	1.26
+++ JFFS3design.tex	13 Nov 2005 17:53:41 -0000	1.27
@@ -1,4 +1,8 @@
-
+%
+% JFFS3 design issues.
+%
+% Copyright (C), 2005, Artem B. Bityutskiy, <dedekind at infradead.org>
+%
 % $Id$
 %
 
@@ -44,13 +48,13 @@
 %
[...1437 lines suppressed...]
-\item \raggedright \label{ref_TOSHIBA_TC58DVM92A1FT}
-Toshiba TC58DVM92A1FT NAND flash chip,
-\url{http://www.toshiba.com/taec/components/Datasheet/TC58DVM92A1FT_030110.pdf}
-
-\item \raggedright \label{ref_TOSHIBA_TH58NVG1S3AFT05}
-Toshiba TH58NVG1S3AFT05 NAND flash chip,
-\url{http://www.toshiba.com/taec/components/Datasheet/TH58NVG1S3AFT05_030519A.pdf}
-
-\item \raggedright \label{ref_STMICRO_NAND08GB}
-ST-micro NAND08G-B NAND flash chip,
-\url{http://www.st.com/stonline/products/literature/ds/11241.pdf}
-
-\item \raggedright \label{ref_SMSUNG_K9K1G08X0B}
-Samsung K9K1G08X0B NAND flash chip,
-\url{http://www.samsung.com/Products/Semiconductor/NANDFlash/SLC_SmallBlock/1Gbit/K9K1G08U0B/ds_k9k1g08x0b_rev02.pdf}
-
-\end{enumerate}
+\input{ref.tex}
 
 \end{document}