mtd/Documentation/jffs3 gc.tex, NONE, 1.1 JFFS3design.tex, 1.30, 1.31 Makefile, 1.5, 1.6 definit.tex, 1.4, 1.5 intro.tex, 1.4, 1.5 jffs2.tex, 1.4, 1.5 jffs3req.tex, 1.4, 1.5 ref.tex, 1.4, 1.5 super.tex, 1.4, 1.5 tree.tex, 1.4, 1.5

Artem Bityutskiy dedekind at infradead.org
Sun Nov 27 09:37:53 EST 2005


Update of /home/cvs/mtd/Documentation/jffs3
In directory phoenix.infradead.org:/tmp/cvs-serv6755

Modified Files:
	JFFS3design.tex Makefile definit.tex intro.tex jffs2.tex 
	jffs3req.tex ref.tex super.tex tree.tex 
Added Files:
	gc.tex 
Log Message:
Minor fixes in various chapters.
Add GC introduction.
Add a subchapter about links.
Switch to version 0.32.


***** Error reading new file: [Errno 2] No such file or directory: 'gc.tex'
Index: JFFS3design.tex
===================================================================
RCS file: /home/cvs/mtd/Documentation/jffs3/JFFS3design.tex,v
retrieving revision 1.30
retrieving revision 1.31
diff -u -r1.30 -r1.31
--- JFFS3design.tex	22 Nov 2005 14:48:40 -0000	1.30
+++ JFFS3design.tex	27 Nov 2005 14:37:49 -0000	1.31
@@ -57,9 +57,9 @@
 \large{Artem B. Bityutskiy\\
 dedekind at infradead.org}\\
 \vspace{13cm}
-\large{Version 0.31 (draft)}\\
+\large{Version 0.32 (draft)}\\
 \vspace{0.5cm}
-November 22, 2005
+November 27, 2005
 \end{center}
 \end{titlepage}
 %\maketitle
@@ -133,6 +133,15 @@
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 %
+% GARBAGE COLLECTION
+%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\section{Garbage Collection} \label{ref_SectionGC}
+
+\input{gc.tex}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%
 % THE SUPERBLOCK
 %
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -163,28 +172,19 @@
 semantics? Reiser4 pretends to support this via special \texttt{sys\_reiser4()}
 syscall. Would be nice.
 
-\item Does it make sense to "compress" keys. For example, if there are many
-consecutive keys with the same prefix, we may write this prefix only once. Will
-there be some problems when these compressed keys are changed and cannot be
-compressed any longer? Will this need some horrid tree and re-balancing?
-The idea of "extents" is similar.
-
 \item How can one select the compression mode on the per-inode basis? Xattrs
 with some reserved name?
 
-\item ACLs... Will they be implemented via xattrs or it will be something
-tricker/better?
-
 \item Orphaned files.
 
 \item Holes.
 
 \item Direct I/O.
 
-\item When/how to update stat-data?
-
 \item How to chose/add a key scheme?
 
+\item Extents.
+
 \end{enumerate}
 
 The following is the list of topics which should be highlighted in this document
@@ -196,6 +196,8 @@
 
 \item Tree balancing.
 
+\item Tree locking.
+
 \item Caching, write-behind cache.
 
 \item An assumed flash model and the model of interactions between JFFS3 and
@@ -222,6 +224,10 @@
 \item Portability (e.g., move FS between machines with different RAM~page size,
 etc).
 
+\item Errors handling.
+
+\item Bad blocks handling.
+
 \end{enumerate}
 
 The following is the list of ideas which were thought about but are not yet in
@@ -303,6 +309,10 @@
 \item $S$~-- the size of the JFFS3 flash partition (assuming there are no bad
 block).
 
+\item $s$~-- the size of sector.
+
+\item $w$~-- the \mbox{bit-width} of links.
+
 \end{itemize}
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Index: Makefile
===================================================================
RCS file: /home/cvs/mtd/Documentation/jffs3/Makefile,v
retrieving revision 1.5
retrieving revision 1.6
diff -u -r1.5 -r1.6
--- Makefile	22 Nov 2005 14:48:40 -0000	1.5
+++ Makefile	27 Nov 2005 14:37:49 -0000	1.6
@@ -4,13 +4,15 @@
 	   pics/comprex-02.pdf pics/dataobj-01.pdf pics/flash-01.pdf \
 	   pics/idxprobl.pdf pics/journal-01.pdf pics/journal-02.pdf \
 	   pics/keyex-01.pdf pics/node-01.pdf pics/sb-01.pdf \
-	   pics/sb-02.pdf pics/trivkey-01.pdf pics/wandtree.pdf
+	   pics/sb-02.pdf pics/trivkey-01.pdf pics/wandtree.pdf \
+	   pics/gcprobl-01.pdf
 	   
 png_files: pics/btree-01.png pics/btree-02.png pics/comprex-01.png \
 	   pics/comprex-02.png pics/dataobj-01.png pics/flash-01.png \
 	   pics/idxprobl.png pics/journal-01.png pics/journal-02.png \
 	   pics/keyex-01.png pics/node-01.png pics/sb-01.png \
-	   pics/sb-02.png pics/trivkey-01.png pics/wandtree.png
+	   pics/sb-02.png pics/trivkey-01.png pics/wandtree.png \
+	   pics/gcprobl-01.png
 	   
 
 tex_files: JFFS3design.tex


Index: intro.tex
===================================================================
RCS file: /home/cvs/mtd/Documentation/jffs3/intro.tex,v
retrieving revision 1.4
retrieving revision 1.5
diff -u -r1.4 -r1.5
--- intro.tex	22 Nov 2005 14:48:40 -0000	1.4
+++ intro.tex	27 Nov 2005 14:37:50 -0000	1.5
@@ -136,7 +136,8 @@
 
 JFFS3 uses \mbox{$B^+$-trees} and this subsection makes a short introduction to
 \mbox{$B^+$-trees}. There is a plenty of books where one may find more
-information.
+information. There are also many \mbox{on-line} resources available, e.g.
+[\ref{ref_BTrees}].
 
 The inexact definition of \mbox{$B$-tree} may be formulated as a balanced
 search tree where each node may have many children. The \emph{branching factor}
@@ -420,7 +421,7 @@
 Any flash eraseblock may be used as an journal eraseblock.
 
 %
-% The figure with journal illustration
+% Figure with journal illustration
 %
 \begin{figure}[h]
 \begin{center}
@@ -481,6 +482,90 @@
 performance and one may vary these characteristics by means of changing the
 size of the journal.
 
+%
+% GARBAGE COLLECTION
+%
+\subsection{Garbage collection}
+
+Garbage collection is a vital part of any \mbox{log-structured} file system.
+Over time, JFFS3 uses up all the flash free space and it needs to reclaim flash
+space occupied by garbage. And the goal of Garbage Collector is to recycle
+garbage and reclame flash space which it occupies. Since the only way to
+reclaim it is to erase the whole eraseblock, Garbage Collector works in terms
+of eraseblocks.
+
+JFFS2 Garbage Collector is quite simple and works in several steps.
+
+\begin{enumerate}
+
+\item To reclaim dirt from an eraseblock, JFFS2 moves all valid nodes from this
+eraseblock to another eraseblock.
+
+\item As nodes have changed their positions, the JFFS2 \mbox{in-RAM} index is
+adjusted.
+
+\item The first eraseblock may be erased and \mbox{re-used}.
+
+\end{enumerate}
+
+Note, JFFS2 (and JFFS3) aways reserves several eraseblocks in order to
+guarantee that there are always some free eraseblocks available to perform
+garbage collection.
+
+JFFS3 Garbage Collector is more complex. When valid data has been moved from an
+eraseblock, the corresponding indexing nodes must be updated as well. Depending
+on how many space Garbage Collector has reclaimed and how many space it has
+spent to update indexing nodes, it might be that Garbage Collector produces
+more garbage that it reclaims. This problem is demostrated in figure
+\ref{ref_FigureGCProbl_01}.
+
+%
+% Figure wit hthe garbage collection problem
+%
+\begin{figure}[h]
+\begin{center}
+\begin{htmlonly}
+\includegraphics{pics/gcprobl-01.png}
+\end{htmlonly}
+%begin{latexonly}
+\includegraphics[width=120mm,height=96mm]{pics/gcprobl-01.pdf}
+%end{latexonly}
+\end{center}
+\caption{JFFS3 garbage collection problem illustration.}
+\label{ref_FigureGCProbl_01}
+\end{figure}
+
+There is a subtree with root in node $A$ depicted in the figure. At the
+beginning (snapshot~1), leaf nodes $D$, $E$, $F$, and $G$ are situated in the
+eraseblock number~1. Indexing nodes $A$, $B$, and $C$ are situated in the
+eraseblock number~2. There are also two reserved eraseblocks. Suppose all nodes
+have the same size equivlent to the size of sector which is 512 bytes in
+this example.
+
+At snapshot~2 JFFS3 has decided to reclaim 512 bytes of dirty space form
+the eraseblock number~1. Garbage Collector moves all the valid nodes from the
+eraseblock number~1 to one of the reserved eraseblocks. But as indexing nodes
+$B$ and $C$ still refer old copies of the moved nodes in the eraseblock
+number~1, this eraseblock cannot be erased so far. Indexing nodes $A$, $B$, and
+$C$ have to be updated first.
+
+At snapshot~3 Garbage Collector has updated indexing nodes $A$, $B$ and $C$,
+putting them to one of the reserved eraseblocks. From now on, old copies of
+nodes $A$, $B$, $C$, $D$, $E$, $F$, and $G$ at eraseblocks~1 and~2 comprise
+garbage. The eraseblock number~1 was erased and is now free.
+
+But unfortunatelly, the result is that Garbage Collector made more dirt that it
+reclaimed space. Indeed, GC reclaimed 512 bytes while produced three times
+greater amount of garbage (see the first three sectors at eraseblock~2,
+snapshot~3). Compare snapshots~1 and~2.
+
+Hence, it is obvious that garbage collection in JFFS3 is must be more complex
+that in JFFS2. Chapter \ref{ref_SectionGC} discusses JFFS3 Garbage Collector in
+details.
+
+%
+% THE SUPERBLOCK
+%
 \subsection{The superblock}
 
 The JFFS3 \emph{superblock} is a data structure that describes the file system

Index: jffs2.tex
===================================================================
RCS file: /home/cvs/mtd/Documentation/jffs3/jffs2.tex,v
retrieving revision 1.4
retrieving revision 1.5
diff -u -r1.4 -r1.5
--- jffs2.tex	22 Nov 2005 14:48:40 -0000	1.4
+++ jffs2.tex	27 Nov 2005 14:37:50 -0000	1.5
@@ -100,7 +100,7 @@
 
 It is also worth noting here that there is a patch which is usually referred to
 as the "\emph{summary patch}", that was implemented by Ferenc Havasi and was
-recently committed to the JFFS2 CVS. This patch speed up the JFFS2 mount
+recently committed to the JFFS2 CVS. This patch speeds up the JFFS2 mount
 greatly, especially in case of NAND flashes. What the patch basically does is
 that it puts a small "\emph{summary}" node at the end of each flash erasable
 block. This node, roughly speaking, contains the copy of headers of all the
@@ -108,10 +108,9 @@
 glance to the end of each eraseblock and read the summary node. This results in
 that JFFS2 only needs to read one or few NAND pages from the end of each
 eraseblock. Instead, when there is no summary, JFFS2 reads almost every NAND
-page of the eraseblock, because node headers are spread more or less evenly
-over the eraseblock.
+page in each eraseblock, because node headers are spread more or less evenly
+over eraseblocks.
 
 Although the patch helps a lot, it is still a not scalable solution and it only
 relaxes the coefficient of the JFFS2 mount time liner dependency. Let alone
-that it does not lessen the memory consumption.
-
+that it does not lessen JFFS2 memory consumption.


Index: ref.tex
===================================================================
RCS file: /home/cvs/mtd/Documentation/jffs3/ref.tex,v
retrieving revision 1.4
retrieving revision 1.5
diff -u -r1.4 -r1.5
--- ref.tex	22 Nov 2005 14:48:40 -0000	1.4
+++ ref.tex	27 Nov 2005 14:37:50 -0000	1.5
@@ -28,6 +28,10 @@
 Reiser4 File System,
 \url{http://www.namesys.com/}
 
+\item \raggedright \label{ref_BTrees}
+B-Trees,
+\url{http://www.bluerwhite.org/btree/}
+
 \item \raggedright \label{ref_ACL}
 POSIX Access Control Lists on Linux,
 \url{http://www.suse.de/~agruen/acl/linux-acls/}


Index: tree.tex
===================================================================
RCS file: /home/cvs/mtd/Documentation/jffs3/tree.tex,v
retrieving revision 1.4
retrieving revision 1.5
diff -u -r1.4 -r1.5
--- tree.tex	22 Nov 2005 14:48:40 -0000	1.4
+++ tree.tex	27 Nov 2005 14:37:50 -0000	1.5
@@ -363,6 +363,7 @@
 For sequences like this it is possible to only specify the starting offset, the
 ending offset and the number of keys in the sequence, instead of wasting space
 storing the offset in each key of the sequence.
+
 %
 % Offsets sequence compression example
 %
@@ -390,7 +391,7 @@
 Note, the above compression methods may be combined to achieve better
 compression.
 
-Because of compression, JFFS3 keys have variable size which means that it is
+Because of compression, JFFS3 keys have variable size which means, that it is
 impossible to directly apply the binary search algorithm to the contents of
 indexing nodes. In JFFS3, indexing nodes are decompressed when read and are
 cached in decompressed form. And after the indexing node has been decompressed,
@@ -401,3 +402,28 @@
 because the amount of Input/Output will lessen. But only actual
 \mbox{fixed-size} keys~vs.~\mbox{variable-size} keys tests will show if there
 is some real performance gain present.
+
+%
+% LINKS
+%
+\subsection{Links} \label{ref_SectionKeys}
+
+Links in JFFS3 have fixed length and are not compressed. The link width depends
+on the size of JFFS3 partition~-- the larger is JFFS3 partition, the wider are
+links. Instead of choosing a huge link width to suit the largest possible file
+systems (e.g. 64~bits), JFFS3 admits of flexible links width, depending on
+JFFS3 partition size.
+
+As indexind nodes have fixed size equivalent to one sector, the width of links
+stored in branch nodes and in the root nodes is
+
+$$
+w = log_2{S} - s.
+$$
+Twig nodes refer variable-size leaf nodes so the width of links stored twih
+nodes is
+$$
+w = log_2{S},
+$$
+where $S$ is the size of the JFFS3 partition and $s$ is the size of sector.
+





More information about the linux-mtd-cvs mailing list