mtd/fs/jffs3 JFFS3design.tex,1.29,1.30

Artem Bityuckiy dedekind at infradead.org
Sat Mar 26 08:14:07 EST 2005


Update of /home/cvs/mtd/fs/jffs3
In directory phoenix.infradead.org:/tmp/cvs-serv27005

Modified Files:
	JFFS3design.tex 
Log Message:
Add 2 more tglx1's chapters.


Index: JFFS3design.tex
===================================================================
RCS file: /home/cvs/mtd/fs/jffs3/JFFS3design.tex,v
retrieving revision 1.29
retrieving revision 1.30
diff -u -r1.29 -r1.30
--- JFFS3design.tex	26 Mar 2005 11:55:01 -0000	1.29
+++ JFFS3design.tex	26 Mar 2005 13:14:03 -0000	1.30
@@ -198,6 +198,156 @@
 concatenation settings in case of removable media.
 
 %
+% WEAR LEVELLING
+%
+\section{Wear Leveling}
+JFFS2's wear levelling is essentially random. On file systems with
+mostly static files, the wear almost never reaches blocks which contain
+solely unchanged data. This should be addressed in JFFS3.
+
+When a block has been erased a clean marker is written. The clean
+markers carry block erase count information. In case of powerloss the
+erase count information is lost and set to the average erasecount of
+the partition on remount. Userspace tools (e.g. \texttt{flash\_eraseall})
+should be made aware of this, if the wear information has to be preserved.
+
+The wear algorithm enqueues blocks into hash arrays keyed by the erase
+count, so blocks of a given wear level can be picked easily. We have
+two arrays:
+
+\begin{itemize}
+\item \texttt{struct list\_head *free\_blocks[HASH\_SIZE];}
+\item \texttt{struct list\_head *used\_blocks[HASH\_SIZE];}
+\end{itemize}
+
+For the current maximum chip sizes we have at most 32768 blocks per
+chip. We assume a \texttt{HASH\_SIZE} of 256 for now. Wear levelling
+makes it necessary to enqueue the blocks into the hash array in order.
+This sounds complicated and time consuming, but it is not.
+
+The maximum number of guaranteed erase cycles for NAND and NOR is
+currently about 100K. We assume a best-case maximum of 256K and build
+the hash index by shifting the erase count right by 10 bits, putting
+the block at the end of the hash entry list. The variance of the erase
+counts within one entry is thus at most 1024, which is reasonable. This
+enqueueing lets us deliberately pick used blocks holding static,
+unchanged data for garbage collection, in order to bring those blocks
+back into use. Garbage collection in such a case simply copies the
+whole block to a block taken from the free list. Newer NAND flash
+supports on-chip page copying. This feature should be made available in
+the MTD kernel API to improve the performance of such operations.
+
+%
+% BLOCK REFERENCE
+%
+\section{Block reference}
+
+Most of the JFFS2 block accounting variables (see
+[\ref{ref_JFFS2_blk_ref}]) can go away. The information can be held in
+a single variable whose interpretation depends on the block state.
+Accounting via the distinct variables could be kept for debugging
+purposes, but not for production use.
+
+It is not necessary to hold the block currently being garbage-collected
+(JFFS2's \texttt{gc\_node} field) in the block reference. The garbage
+collector recycles only one block at a time, so this information can be
+kept in the garbage collector itself, or in the global file system data
+if other parts need access to it.
+
+The first/last node pointers should be reduced to one. This depends on
+the final node reference design, checkpointing, summary nodes etc.
+
+\begin{verbatim}
+struct jffs2_eraseblock
+{
+  struct list_head list;
+  struct node_ref *node;
+  /* Status bits and accounting, dependent on state */
+  uint32_t status;
+};
+\end{verbatim}
+
+The status variable should be sufficient for holding all required
+information.\\
+
+\begin{tabular}{ll}
+Bit 0--23  & Size information, depending on state\\
+Bit 24--31 & Status information\\
+\end{tabular}\\
+
+Reserving 24 bits for accounting should be sufficient for quite some
+time ($2^{23} = 8388608$). This is equivalent to 128 blocks of 64KiB
+physical size. On a 32-bit machine this results in 16 bytes per block.\\
+
+\begin{tabular}{llllll}
+Block size (KiB) & 4    & 16   & 64  & 128 & 256\\
+RAM (bytes/MiB)  & 4096 & 1024 & 256 & 128 &  64\\
+RAM (KiB/GiB)    & 4096 & 1024 & 256 & 128 &  64\\
+\end{tabular}\\[8pt]
+
+\begin{tabular}{ll}
+AG-AND 128MB 4k block size & 512 KiB\\
+NAND 512MB 16k block size  & 512 KiB\\
+NAND 2GiB 64k block size   & 512 KiB\\
+\end{tabular}\\
+
+This is 4 times the current limit. It can be cut down below 128K by
+concatenating 4 physical blocks into one virtual block. The
+above-mentioned disadvantages of virtual blocks still apply, but are
+reduced by a factor of 4. The concatenation should be made selectable
+by the user to scale the system according to the requirements. See the
+paragraph about virtual erase blocks.
+
+Using JFFS2-style concatenation of 16 blocks would reduce the RAM
+requirement to 32KiB for the largest devices.
+
+\subsection{JFFS2 block reference} \label{ref_JFFS2_blk_ref}
+The virtual block data structure is way too big in JFFS2.
+JFFS2 uses 48 bytes per block:\\
+
+\begin{tabular}{llllll}
+Block size (KiB) & 4	 & 16   & 64  & 128 & 256 \\
+RAM (bytes/MiB)  & 12288 & 3072 & 768 & 384 & 192 \\
+RAM (KiB/GiB)    & 12288 & 3072 & 768 & 384 & 192 \\
+\end{tabular}\\
+
+This results in impressive numbers for the largest chips:\\
+
+\begin{tabular}{ll}
+AG-AND 128MB 4k block & 1.5 MiB \\
+NAND 512MB 16k block  & 1.5 MiB \\
+NAND 2GiB 64k block   & 1.5 MiB \\
+\end{tabular}\\
+
+JFFS2 limits the memory consumption to 128K by building virtual
+blocks. This results in 12 physical blocks per virtual block, rounded
+up to 16 to allow $2^n$ arithmetic.
+
+Virtual erase blocks have disadvantages: their bigger size affects
+garbage collection, as larger entities have to be handled, which
+degrades GC efficiency and performance.
+
+The current JFFS2 erase block structure is defined as follows:
+
+\begin{verbatim}
+struct jffs2_eraseblock
+{
+  struct list_head list;
+  int bad_count;
+  uint32_t offset; /* of this block in the MTD */
+  uint32_t unchecked_size;
+  uint32_t used_size;
+  uint32_t dirty_size;
+  uint32_t wasted_size;
+  uint32_t free_size;
+  struct jffs2_raw_node_ref *first_node;
+  struct jffs2_raw_node_ref *last_node;
+  /* Next node to be garbage collected */
+  struct jffs2_raw_node_ref *gc_node; 
+};
+\end{verbatim}
+
+%
 % MAXIMAL SIZE OF THE INODE NODE DATA
 %
 \section{Maximal size of the inode node data}




