mtd/Documentation/jffs3 JFFS3design.tex, 1.27, 1.28 definit.tex, 1.1, 1.2 intro.tex, 1.1, 1.2 jffs2.tex, 1.1, 1.2 jffs3req.tex, 1.1, 1.2 ref.tex, 1.1, 1.2 super.tex, 1.1, 1.2 tree.tex, 1.1, 1.2

Artem Bityutskiy dedekind at infradead.org
Tue Nov 15 07:41:47 EST 2005


Update of /home/cvs/mtd/Documentation/jffs3
In directory phoenix.infradead.org:/tmp/cvs-serv11030

Modified Files:
	JFFS3design.tex definit.tex intro.tex jffs2.tex jffs3req.tex 
	ref.tex super.tex tree.tex 
Log Message:
Various fixes in various chapters.
Add 2 sub-sections to the "The tree" section.
Switch version to 0.29.



Index: JFFS3design.tex
===================================================================
RCS file: /home/cvs/mtd/Documentation/jffs3/JFFS3design.tex,v
retrieving revision 1.27
retrieving revision 1.28
diff -u -r1.27 -r1.28
--- JFFS3design.tex	13 Nov 2005 17:53:41 -0000	1.27
+++ JFFS3design.tex	15 Nov 2005 12:41:43 -0000	1.28
@@ -57,9 +57,9 @@
 \large{Artem B. Bityutskiy\\
 dedekind at infradead.org}\\
 \vspace{13cm}
-\large{Version 0.28 (draft)}\\
+\large{Version 0.29 (draft)}\\
 \vspace{0.5cm}
-November 12, 2005
+November 15, 2005
 \end{center}
 \end{titlepage}
 %\maketitle
@@ -181,7 +181,9 @@
 
 \item Direct I/O.
 
-\item When/how to update stat-data ?
+\item When/how to update stat-data?
+
+\item How t ochose/add a key scheme?
 
 \end{enumerate}
 
@@ -215,6 +217,8 @@
 \item The minimal amount of file's data in a node is \texttt{PAGE\_SIZE}. No
 way to create smaller nodes as it it possible in JFFS2.
 
+\item Portability.
+
 \end{enumerate}
 
 The following is the list of ideas which were thought about but are not yet in
@@ -274,6 +278,10 @@
 
 \item $I$~-- inode number.
 
+\item $K$, $K_x$~-- tree's keys.
+
+\item $k$, $k_x$~-- keys' fields.
+
 \item $L$~-- the number of levels in the tree.
 
 \item $m$~-- the number of eraseblocks used in the superblock management

Index: definit.tex
===================================================================
RCS file: /home/cvs/mtd/Documentation/jffs3/definit.tex,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -r1.1 -r1.2
--- definit.tex	13 Nov 2005 17:53:41 -0000	1.1
+++ definit.tex	15 Nov 2005 12:41:43 -0000	1.2
@@ -16,6 +16,8 @@
 \item \textbf{Acl}~-- an object in the tree containing inode's ACL. Refer to
 section~\ref{ref_SectionObjects} for more information.
 
+\item \textbf{Acl key}~-- acl object's key.
+
 \item \textbf{Acl node}~-- a leaf node containing an acl object.
 
 \item \textbf{Anchor eraseblock, anchor area}~-- the second and the third
@@ -27,6 +29,8 @@
 attributes like the type of compression, etc). Refer to
 section~\ref{ref_SectionObjects} for more information.
 
+\item \textbf{Attr-data key}~-- attr-data object's key.
+
 \item \textbf{Attr-data node}~-- a leaf node containing an \mbox{attr-data}
 object.
 
@@ -53,11 +57,15 @@
 \item \textbf{Data area}~-- the whole JFFS3 partition excluding the static
 superblock and anchor eraseblocks.
 
+\item \textbf{Data key}~-- data object's key.
+
 \item \textbf{Data node}~-- a leaf node with file's data.
 
 \item \textbf{Directory entry, direntry}~-- basically an association between
 the name and the inode number.
 
+\item \textbf{Direntry key}~-- direntry object's key.
+
 \item \textbf{Direntry node}~-- a leaf node containing a direntry object.
 
 \item \textbf{Dirt, dirty space}~-- information on flash which is not valid due
@@ -199,19 +207,23 @@
 \item \textbf{Xattr}~-- a widely used contracted form for \textbf{extended
 attributes}.
 
-\item \textbf{Xattr ID}~-- an unique identifier of an extended attribute. Refer to
-section~\ref{ref_SectionObjects} for more information.
-
 \item \textbf{Xattr-data}~-- an object in the tree containing the data of an
 extended attribute.
 
+\item \textbf{xattr-data key}~-- xattr-data object's key.
+
 \item \textbf{Xattr-data node}~-- a leaf node containing an \mbox{attr-data}
 object.
 
+\item \textbf{Xattr ID}~-- an unique identifier of an extended attribute. Refer to
+section~\ref{ref_SectionObjects} for more information.
+
 \item \textbf{Xentry}~-- an object in the tree which stores the association
 between the name of an extended attribute and its xattr~ID. Refer to
 section~\ref{ref_SectionObjects} for more information.
 
+\item \textbf{Xentry key}~-- xentry object's key.
+
 \item \textbf{Xentry node}~-- a leaf node containing a xentry object.
 
 \end{enumerate}

Index: intro.tex
===================================================================
RCS file: /home/cvs/mtd/Documentation/jffs3/intro.tex,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -r1.1 -r1.2
--- intro.tex	13 Nov 2005 17:53:41 -0000	1.1
+++ intro.tex	15 Nov 2005 12:41:43 -0000	1.2
@@ -354,19 +354,18 @@
 
 This section illustrates how does JFFS3 indexing work by means of a simple
 example. The example is very rough but it shows JFFS3 indexing in action. It is
-assumed that keys of direntries and data objects have layout that
-mentioned in an example in section~\ref{ref_SectionIndexing}.
+assumed that keys of direntries and data objects have the same layout that
+is mentioned in section~\ref{ref_SectionIndexing}.
 
-Suppose that an user does the following:
+Suppose that user does the following:
 
 \begin{enumerate}
 
 \item mounts JFFS3 file system to "\texttt{/mnt/jffs3}" directory;
 
-\item issues "\texttt{ls /mnt/jffs3}" command in order find out if there is
-a "\texttt{my\_file}" present there;
+\item issues "\texttt{ls /mnt/jffs3}" command;
 
-\item reads the contents of the "\texttt{my\_file}" file.
+\item reads the contents of "\texttt{/mnt/jffs3/my\_file}" file.
 
 \end{enumerate}
 
@@ -377,23 +376,31 @@
 
 \item During mount JFFS3 locates the position of the root node. This is done
 with help of the JFFS3 \emph{superblock} which will be described later
-(section~\ref{ref_SectionSB}).
+(see section~\ref{ref_SectionSB}).
 
-\item to find out all directory entries in the root directory, JFFS3 searches
-all objects matching \{\texttt{1},~$*$\} pattern. Indeed, direntry keys have
-\{\texttt{parent~inode~\#},~\texttt{name~hash}\} format, the root directory inode
-number is 1 (a predefined constant). So, JFFS3 searches for any direntry in the
-root directory ("$*$" means wildcard).
-
-\item JFFS3 reads the direntry object which corresponds to the
-"\texttt{my\_file}" file using \{\texttt{1},~$H($\texttt{"my\_file"$)$}\} key,
-where $H()$ is the hash function. The direntry object contains
-"\texttt{my\_file}"'s inode number ($I$). Then JFFS3 searches for data objects
-of the "\texttt{my\_file}" file using \{$I$,~\texttt{offset}\} keys. Depending
-on which part of file should be read, \texttt{offset} takes different values.
+\item To get the list of directory entries in the root directory, JFFS3
+looks~up all objects matching the \{\texttt{2},~$*$\} key pattern. Indeed,
+direntry keys have \{\texttt{parent~inode~\#},~\texttt{name~hash}\} format, the
+root directory inode number is 2 (or another predefined constant).  $*$" means
+wildcard. Thus, JFFS3, \{\texttt{2},~$*$\} will match to any direntry in the
+root directory.
+
+\item To read the "\texttt{my\_file}" file, JFFS3 first needs to find out its
+inode number. The inode number is stored in the directory entry object. Hence,
+JFFS3 reads \texttt{my\_file}'s direntry using
+\{\texttt{1},~$H($\texttt{"my\_file"$)$}\} key ($H()$ is the hash function).
+
+Then JFFS3 searches for \texttt{my\_file}'s data objects using
+\{$I$,~\texttt{offset}\} keys. Depending on which part of file should be read,
+\texttt{offset} may take different values.
 
 \end{enumerate}
 
+The above description is somewhat simplified e.g., JFFS3 also needs to read
+\texttt{my\_file}'s attr-data object to fetch the inode length from there (see
+section \ref{ref_SectionObjects}), etc. But the aim of the section is just to
+provide an idea of how JFFS3 indexing works.
+
 %
 % THE JOURNAL
 %




Index: super.tex
===================================================================
RCS file: /home/cvs/mtd/Documentation/jffs3/super.tex,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -r1.1 -r1.2
--- super.tex	13 Nov 2005 17:53:41 -0000	1.1
+++ super.tex	15 Nov 2005 12:41:43 -0000	1.2
@@ -194,7 +194,6 @@
 \frac{T_A}{T_D} \geqslant 1,
 \label{ref_EquationSBIneq}
 \end{equation}
-
 where $T_A$ is the period of time of the total anchor area wear and $T_D$ is
 the period of time of the total data area wear. Note, the whole JFFS3 partition
 excluding the static superblock and the anchor area is referred to as the
@@ -214,7 +213,6 @@
 $$
 T_D = \frac{(M-3) \cdot D \cdot N}{R_D},
 $$
-
 where $D$ is the maximum number of flash eraseblock erase cycles, and $M$ is
 the number of non-bad eraseblock on the JFFS3 partition. We subtracted 3 from
 $M$ to get the number of eraseblocks in the data area.
@@ -261,7 +259,6 @@
 $$
 \frac{T_A}{T_D} = 2 \cdot \frac{2N^3 + N^2 + N}{M-3},
 $$
-
 and for $m = 0,1,2,\ldots$
 
 $$
@@ -274,13 +271,11 @@
 $$
 2 \cdot \frac{2N^m + N^{m-1} + \ldots + N}{M-3} \geqslant 1,
 $$
-
 or neglecting the minor components,
 
 $$
 \frac{4N^m}{M-3} \geqslant 1,
 $$
-
 or
 
 \begin{equation}
@@ -337,7 +332,6 @@
 $$
 S = 2m + log_2(2N) + (m - 1) \cdot log_2(N)
 $$
-
 sectors.
 
 Table~\ref{ref_TableNANDTimes} contains the approximate superblock search time

Index: tree.tex
===================================================================
RCS file: /home/cvs/mtd/Documentation/jffs3/tree.tex,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -r1.1 -r1.2
--- tree.tex	13 Nov 2005 17:53:41 -0000	1.1
+++ tree.tex	15 Nov 2005 12:41:43 -0000	1.2
@@ -59,9 +59,8 @@
 extended attributes and \emph{xattr~IDs}. Every extended attribute in the file
 system has a corresponding xattr entry object. This is analogous to direntry
 objects, but direntries contain
-\{\texttt{xattr~name}$\Rightarrow$\texttt{xattr~ID}\} mapping instead of
-\{\texttt{direntry~name}$\Rightarrow$\texttt{inode~number}\} mapping in
-xentries.
+\{\texttt{direntry~name}$\Rightarrow$\texttt{inode~number}\} mapping, instead
+of \{\texttt{xattr~name}$\Rightarrow$\texttt{xattr~ID}\} mapping in xentries.
 
 Each extended attribute in JFFS3 has its own unique number~-- xattr~ID, just
 like every inode has its own unique inode number. And in fact, JFFS3 utilizes
@@ -78,13 +77,13 @@
 (information about ACLs may be found out at~[\ref{ref_ACL}]). Acl objects are
 stored in \emph{acl nodes}.
 
-In \mbox{real-world} systems a lot of files have equivalent ACL while only few
-files have unique ACL. For the former group of files (or more strictly~--
-inodes) JFFS3 makes use of \emph{shared~acl} objects. This means, that there is
-only one acl object instance for all of these inodes. Shared acls are referred
-to from \mbox{attr-data} objects of these inodes. If a shared acl is written
-to, a new acl object is created (\mbox{copy-on-write} mechanism). Conversely, for
-the latter group there is a distinct acl object per each inode.
+In \mbox{real-world} systems a vast number of files have equivalent ACL while
+only few files have unique ACL. For the former group of files (or more
+strictly~-- inodes) JFFS3 makes use of \emph{shared~acl} objects. This means,
+that there is only one acl object instance for all of these inodes. Shared acls
+are referred to from \mbox{attr-data} objects of these inodes. If a shared acl
+is written to, a new acl object is created (\mbox{copy-on-write} mechanism).
+Conversely, for the latter group there is a distinct acl object per each inode.
 
 \end{enumerate}
 
@@ -93,16 +92,16 @@
 %
 \subsection{Keys} \label{ref_SectionKeys}
 
-Each object has its own key and may be quickly found out in the tree by its
+Each object has its own key and may be quickly looked~up in the tree by its
 key. As there are 6 object types in JFFS3, there are also 6 key types:
 
 \begin{enumerate}
 \item \emph{data keys}~-- index data objects;
 \item \emph{direntry keys}~-- index direntry objects;
-\item \emph{attr-data key}~-- index attr-data objects;
-\item \emph{xentry key}~-- index xentry objects;
-\item \emph{xattr-data key}~-- index xattr-data objects;
-\item \emph{acl key}~-- index acl objects.
+\item \emph{attr-data keys}~-- index attr-data objects;
+\item \emph{xentry keys}~-- index xentry objects;
+\item \emph{xattr-data keys}~-- index xattr-data objects;
+\item \emph{acl keys}~-- index acl objects.
 \end{enumerate}
 
 %
@@ -112,16 +111,15 @@
 
 Lets start discussing JFFS3 keys with an example of a simple key layout which
 is further referred to as the \emph{trivial key scheme}. All keys in this
-scheme have the same \mbox{67-bits} length (see
-figure~\ref{ref_FigureTrivKey}). 
+scheme have the same \mbox{56-bit} length (see figure~\ref{ref_FigureTrivKey}). 
 
 \begin{itemize}
 
 \item Data keys consist of the \mbox{32-bit} inode number the data belongs to,
-the unique \mbox{3-bit} key type identifier, and the \mbox{32-bit} data offset.
+the unique \mbox{3-bit} key type identifier, and the \mbox{19-bit} data offset.
 
 \item Direntry keys consist of the \mbox{32-bit} parent directory inode number,
-the unique \mbox{3-bit} key type identifier, and the \mbox{32-bit} direntry
+the unique \mbox{3-bit} key type identifier, and the \mbox{19-bit} direntry
 name hash value.
 
 \item Attr-data keys consist of the \mbox{32-bit} inode number the attributes
@@ -129,10 +127,10 @@
 
 \item Xentry keys consist of the \mbox{32-bit} inode number the extended
 attribute belongs to, the unique \mbox{3-bit} key type identifier, and the
-\mbox{32-bit} extended attribute name hash value.
+\mbox{19-bit} extended attribute name hash value.
 
 \item Xattr-data keys consist of the \mbox{32-bit} xattr ID, the unique
-\mbox{3-bit} key type identifier, and the \mbox{32-bit} extended attribute data
+\mbox{3-bit} key type identifier, and the \mbox{19-bit} extended attribute data
 offset.
 
 \item Acl keys consist of the \mbox{32-bit} inode number the acl object belongs
@@ -156,31 +154,87 @@
 \label{ref_FigureTrivKey}
 \end{figure}
 
-The trivial key scheme is not actually used in JFFS3 and is only needed to
-ease the further discussion.
+Since data objects may only contain multiple RAM~pages of data (excluding small
+files and files' tails) offsets in keys are always RAM~page
+\mbox{size-aligned}. Assuming RAM~page is 4K (13~bits), 19~bits is enough to
+refer up to 4GB of data. To put it differently, the trivial key scheme limits
+files' length to 4GB providing system's RAM~page size is 4K.
 
+It is also worth noting that several objects of the same type may have the same
+key. For example, in case of hash collision, two direntries may have equivalent
+keys. In this case objects may be distinguished by means of reading the
+corresponding leaf node headers.
+
+%
+% Keys comparison
+%
+\subsubsection{Keys comparison}
+
+The important topic is how keys are compared as it defines the relative order
+of objects in the tree and is crucial for searching. Note, it only makes sense
+to compare keys of the same type.
+
+JFFS3 keys are usually comprised of one or more fields, i.e., keys $K_1$ and
+$K_2$ may be represented as
+
+$$
+K_1 = \{k^1_1, k^2_1, ..., k^p_1\},
+K_2 = \{k^1_2, k^2_2, ..., k^p_2\},
+$$
+where $p$ is the number of components in keys of this type.
+
+Keys $K_1$ and $K_2$ are considered to be \emph{equivalent} if and only if all
+their fields are equivalent, i.e. $k^i_1 = k^i_2i = 1, 2, ..., p$.
+
+Keys are compared \mbox{field-by-field} starting from the first field. If on
+$i$'th step $k^i_1 > k^i_2$, then $K_1$ is considered to be \emph{greater} then
+$K_2$.  Similarly, if on $i$'th step $k^i_1 < k^i_2$, then $K_1$ is considered
+to be \emph{less} then $K_2$.
+
+%
+% Key schemes
+%
 \subsubsection{Key schemes}
 
-The trivial key scheme makes use of \mbox{32-bit} inode numbers and
-\mbox{32-bit} file offsets. But if a system needs to use huge files (larger
-then 4GB), 32 bits may also be insufficient and more bits should be used.
-Similarly, the file system may want to utilize huge number of files so
-\mbox{32-bits} may be not sufficient to store the inode number.
-
-From the other hand, some systems may have only few inodes and, say,
-24~bits may be enough. Similarly, some systems may have not very large flash
-partition and do not utilize large files, so 30~bits for the file offset may be
-enough for them.
-
-It also possible that one may want to use some tricky key layouts to achieve
-different kinds of optimizations. For example, direntry keys may include the
-first 8~bytes (64~bits) of the direntry name (see figure
+\emph{Key schemes} define layout of keys for all the 6 object types.
+Apparently, it would be too inflexible to hardcode JFFS3 to support only one
+fixed key scheme. Indeed, there may be a great deal of reasons why users may
+want to use different key schemes in different situations~-- some examples go
+bellow.
+
+\begin{itemize}
+
+\item The inode number is encoded by a \mbox{32-bit} integer in the trivial key
+scheme which means that about 4~million inodes and extended attributes may
+exist on the file system simultaneously. But in some cases this may be
+insufficient or, conversely, too much and one may want to use less bits to
+encode inode numbers (say, only 24 bits), just for optimization.
+
+\item Similarly, offsets are encoded as \mbox{19-bit} integers in the trivial
+key scheme which may be insufficient when huge files (larger then 4G) should be
+supported. So one may want to use more bits in certain cases.
+
+\item Depending on the concrete JFFS3 usage, different hash functions may be
+used in direntry keys. The length of hash values may also vary depending on how
+many directory entries are kept in directories. If there are huge directories
+with millions of files there, long hash values should be used to avoid massive
+hash collisions (say, \mbox{64-bit} hash values). But if it is known in advance
+that there will be no too large directory entries, \footnote{Note, when talking
+about directories, words "\emph{large}" and "\emph{small}" describe how many
+direntries are kept in these directories. The more direntries a directory
+contains, the larger is it.} the length of hash values may be shorter.
+
+\item It also possible that one may want to use some tricky key layouts to
+achieve different kinds of optimization. For example, direntry keys may
+include the first 8~bytes (64~bits) of the direntry name (see figure
 \ref{ref_FigureDirentKeyEx_01}). In this case the \texttt{getdents}
-\footnote{See \texttt{getdents(2)} Linux manual pages} Linux system call will
-return direntries in mostly alphabetically sorted order and \mbox{user-space}
+\footnote{See \texttt{getdents (2)} Linux manual pages} Linux system call will
+return direntries in "mostly" alphabetically sorted order and \mbox{user-space}
 programs will not spend much time to sort them. In fact this technique is used
-in Reiser4 and it is claimed that slow sorting is a bottleneck in certain file
-system workloads.
+in the Reiser4 file system and it is claimed that slow sorting is a bottleneck
+in certain \mbox{real-life} workloads. And the like.
+
+\end{itemize}
 
 %
 % Direntry key layout example
@@ -198,4 +252,6 @@
 \label{ref_FigureDirentKeyEx_01}
 \end{figure}
 
-\texttt{[To be continued.]}
+So it is obvious why JFFS3 is not attached to a fixed key scheme but instead,
+admits of many different key schemes (one at a time of course) with a
+possibility to choose the best suited key scheme. 





More information about the linux-mtd-cvs mailing list