UBIFS and hardware ECC of all FF pages of MLC NAND

Darwin Rambo drambo at broadcom.com
Tue Sep 29 09:26:27 EDT 2009


Artem,

One thing you might add is a paranoid check that the OOB is still all 0xFF before 
programming a page. If someone programs the trailing 0xFF pages of a block by mistake, 
and so puts a non-0xFF ECC in the OOB, then the UBIFS code will later write to an already 
written ECC, which I have found corrupts other blocks' ECCs on my part. It also gives 
strange error messages, and the file system refuses to mount on reboot. The messages do 
not look related to the original ECC write problem, which makes it harder to debug. 
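
A minimal sketch of such a check, in Python for brevity rather than kernel C. The function names and the exception class are hypothetical, and the sketch assumes the caller has just read back the page data and the whole OOB area; a real driver would probably look only at the ECC bytes given by its OOB layout.

```python
ERASED = 0xFF


class PageAlreadyProgrammedError(Exception):
    """Raised when a page that should still be erased has a written OOB."""


def is_all_ff(buf: bytes) -> bool:
    """Return True if every byte in buf holds the erased value 0xFF."""
    return buf.count(ERASED) == len(buf)


def check_before_program(existing_data: bytes, existing_oob: bytes) -> None:
    """Refuse to program a page whose OOB has already been written.

    existing_data and existing_oob are the bytes read back from flash
    immediately before the intended write (an assumption of this sketch).
    """
    if is_all_ff(existing_oob):
        return  # OOB untouched, safe to program
    if is_all_ff(existing_data):
        # The case described above: data looks empty but ECC was written.
        raise PageAlreadyProgrammedError(
            "Data page incorrectly programmed to all 0xFFs with non-0xFF ECC"
        )
    raise PageAlreadyProgrammedError("Page already programmed")
```

The point of separating the two failure cases is that the first one can report the real cause directly, instead of surfacing later as an unrelated-looking ECC error.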

With this particular error, you can see messages like below:

UBIFS error (pid 245): ubifs_read_node: bad node type (255 but expected 2)
UBIFS error (pid 245): ubifs_read_node: bad node at LEB 73:456392
UBI error: ubi_io_read: error -74 while reading 64 bytes from PEB 3:0, read 64 bytes
UBI warning: ubi_eba_init_scan: cannot reserve enough PEBs for bad PEB handling, reserved 17, need 19
UBI warning: ubi_eba_copy_leb: error -74 while reading data from PEB 3
UBI error: wear_leveling_worker: error -74 while moving PEB 3 to PEB 2
UBI warning: ubi_ro_mode: switch to read-only mode
UBI error: do_work: work failed with error code -74
UBI error: ubi_thread: ubi_bgt0d: work failed with error code -74
UBI error: ubi_io_read: error -74 while reading 516096 bytes from PEB 3:8192, read 516096 bytes
UBIFS error (pid 1): ubifs_scan: corrupt empty space at LEB 1:8192
UBIFS error (pid 1): ubifs_scanned_corruption: corrupted data at LEB 1:8192
UBIFS error (pid 1): ubifs_scan: LEB 1 scanning failed
UBI error: ubi_io_read: error -74 while reading 516096 bytes from PEB 3:8192, read 516096 bytes
UBIFS error (pid 1): ubifs_recover_master_node: failed to recover master node
List of all partitions:
1f00             512 mtdblock0 (driver?)
1f01            2048 mtdblock1 (driver?)
1f02            2048 mtdblock2 (driver?)
1f03            8192 mtdblock3 (driver?)
1f04            2048 mtdblock4 (driver?)
1f05            2048 mtdblock5 (driver?)
1f06         1007616 mtdblock6 (driver?)
1f07         1006592 mtdblock7 (driver?)
1f08           32768 mtdblock8 (driver?)
1f09            1024 mtdblock9 (driver?)
1f0a          980280 mtdblock10 (driver?)
No filesystem could mount root, tried:  ubifs


A better error message would say something like:
"UBI error: Data page incorrectly programmed to all 0xFFs with non-0xFF ECC."


Another suggestion: rather than creating large files in which the end of some of the 
blocks is padded with 0xFF, have a ubinize option that puts a download header 
in front of each block, giving the block length and the valid data length. Then the 0xFFs 
would not have to be carried around and the user would be less likely to program 
0xFFs by mistake. They would typically program only the useful data that is in 
the file, and since they erased the block before programming it, the trailing 0xFFs
would be taken care of automatically. Of course, this would require custom flasher
changes to accommodate. Thanks.
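
To make the header idea concrete, here is a rough Python sketch. The header layout (two little-endian 32-bit fields: block length, valid data length) and the names pack_blocks and unpack_blocks are my own assumptions for illustration; ubinize has no such option today. The valid length is rounded up to a whole page, since NAND is programmed page by page.

```python
import struct

# Hypothetical per-block header: block length, then valid data length.
HEADER = struct.Struct("<II")


def pack_blocks(image: bytes, block_size: int, page_size: int) -> bytes:
    """Prefix each block with a header and drop its trailing 0xFF padding."""
    if block_size % page_size or len(image) % block_size:
        raise ValueError("image and block size must be aligned")
    out = bytearray()
    for off in range(0, len(image), block_size):
        block = image[off:off + block_size]
        used = len(block.rstrip(b"\xff"))
        # Round up to a whole number of pages.
        valid = -(-used // page_size) * page_size
        out += HEADER.pack(block_size, valid)
        out += block[:valid]
    return bytes(out)


def unpack_blocks(packed: bytes):
    """Yield (block_length, valid_data) pairs for a flasher to program."""
    pos = 0
    while pos < len(packed):
        block_len, valid = HEADER.unpack_from(packed, pos)
        pos += HEADER.size
        yield block_len, packed[pos:pos + valid]
        pos += valid
```

A flasher reading this format would erase each block and program only the valid_data bytes, so the rest of the block simply stays erased.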

Regards,
Darwin

> -----Original Message-----
> From: Artem Bityutskiy [mailto:dedekind at infradead.org] 
> Sent: Friday, September 25, 2009 12:05 AM
> To: Matthieu CASTET
> Cc: Adrian Hunter; Darwin Rambo; linux-mtd at lists.infradead.org
> Subject: Re: UBIFS and hardware ECC of all FF pages of MLC NAND
> 
> On Thu, 2009-09-24 at 17:36 +0200, Matthieu CASTET wrote:
> > Adrian Hunter wrote:
> > > Darwin Rambo wrote:
> > > 
> > >> 2. for initial downloading, should an ECC be programmed on all FF data pages?
> > >> Is there any correction advantage?
> > > 
> > > In your case, as you have discovered, you must not program ECC for FF pages at
> > > the end of eraseblocks.
> > > 
> > The tricky part is when you read FF pages with ecc in mtd. You will get
> > an ecc error.
> > 
> > If the ecc writing is done on software you can always xor the ecc code
> > to make it "FF for FF data".
> > But if everything is done by hardware...
> 
> Right, which means the UBI/UBIFS flasher should be smart and skip
> 0xFF-ed NAND pages at the end of eraseblocks. This adds some complexity
> to the flasher, though. And here:
> 
> http://www.linux-mtd.infradead.org/doc/ubi.html#L_format_det
> 
> I even described in details the flashing algorithm with my limited
> English vocabulary :-)
> 
> -- 
> Best Regards,
> Artem Bityutskiy (Артём Битюцкий)
> 
> 
> 
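
The flasher behaviour described in the quoted reply (skip the 0xFF-only pages at the end of each eraseblock) can be sketched roughly as below. This is Python for illustration only; pages_to_program is a hypothetical helper, not part of any MTD tool. It assumes the eraseblock has already been erased, so pages left unwritten stay at 0xFF with no ECC in the OOB.

```python
from typing import List, Tuple


def pages_to_program(block: bytes, page_size: int) -> List[Tuple[int, bytes]]:
    """Return (page_index, data) pairs, omitting trailing all-0xFF pages.

    All-0xFF pages in the middle of the block are kept; only the run of
    empty pages at the end of the eraseblock is dropped.
    """
    if page_size <= 0 or len(block) % page_size:
        raise ValueError("block length must be a multiple of page_size")
    pages = [block[i:i + page_size] for i in range(0, len(block), page_size)]
    last = len(pages)
    # Walk back from the end while pages contain nothing but 0xFF.
    while last > 0 and pages[last - 1].count(0xFF) == page_size:
        last -= 1
    return list(enumerate(pages[:last]))
```

An eraseblock that is entirely 0xFF yields an empty list, so the flasher only erases it and writes nothing.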

