UBI(FS) issues: how to debug?

Fri Mar 29 03:22:57 EDT 2013

> My questions are:
>    - Where should we go from here?  Are there tools we're missing that
>      might help us stress UBIFS/UBI/the NAND controller driver/the NAND

[Pekon]: Here is my feedback.
First, you should identify whether it's a device issue or a buggy driver.
You should not see too many uncorrectable errors unless your device
has really worn out, because UBI (the layer underneath UBIFS) does
scrubbing in the background to prevent accumulation of bit-flips
(see "scrubbing" in linux-mtd.infradead.org/doc/ubidesign/ubidesign.pdf).

(1) Please mount UBIFS using the 'chk_data_crc' option:
http://www.linux-mtd.infradead.org/doc/ubifs.html#L_checksumming
This flags an error as soon as a bit-flip is observed in the data portion
of a LEB, at the time that data is read.

(2) Run the 'nand bad' command at the U-Boot prompt to check for bad blocks.

(3) However, the best way to debug is to add prints to the generic
NAND driver. I generally use the following sample code:
--------------

diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index 5138b6b..2be9eb9 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -1578,6 +1578,8 @@ static int nand_read(struct mtd_info *mtd, loff_t from, size_t len,
 {
        struct mtd_oob_ops ops;
        int ret;
+       int corrected   = mtd->ecc_stats.corrected;
+       int failed      = mtd->ecc_stats.failed;

        nand_get_device(mtd, FL_READING);
        ops.len = len;
@@ -1587,6 +1589,15 @@ static int nand_read(struct mtd_info *mtd, loff_t from, size_t len,
        ret = nand_do_read_ops(mtd, from, &ops);
        *retlen = ops.retlen;
        nand_release_device(mtd);
+
+       if (corrected != mtd->ecc_stats.corrected)
+               printk(KERN_EMERG "%s: bit-flip corrected %d\n", __func__,
+                                               mtd->ecc_stats.corrected);
+
+       if (failed != mtd->ecc_stats.failed)
+               printk(KERN_EMERG "%s: bit-flip failed %d\n", __func__,
+                                               mtd->ecc_stats.failed);
+
        return ret;
 }
--------------

with regards, pekon