ubifs : corruption after power cut test

Artem Bityutskiy dedekind1 at gmail.com
Tue Jul 13 10:24:24 EDT 2010


On Tue, 2010-07-13 at 11:24 +0200, Matthieu CASTET wrote:
> Matthieu CASTET a écrit :
> > Matthieu CASTET a écrit :
> >> Hi,
> >>
> >> we found some bug in our driver. Now there no more ubifs error when
> >> there is uncorrectable ecc error (they should happen in the last
> >> (interrupted) written page).
> >>
> >> But now we got "validate_master: bad master node at offset 69632 error
> >> 7" [1].
> > notice that gc_lnum==-1 in this case.
> > Also this didn't happen on power cut.
> > The senario was :
> > - power cut
> > - mount fs [1]
> > - do some fs operation
> > - umount fs quickly (9 second after mount in this case) [2]
> > - mount fs [3]
> > 
> > The the problem seems that gc_lnum==-1 is not handled in mount or
> > shouldn't happen in umount.
> > 
> The attached patch try to support mount with gc_lnum == -1.
> 
> Does it look sane ?

I did not give it much thought, but I do not see how master node can end
up with gc_lnum = -1 in it, and it seems we assumed this cannot happen.
Could you please add this hack to your kernel? It should catch the
situations when we write gc_lnum == -1 to the master node and print the
stack dump, which should give some idea about the code-path which causes
it.

diff --git a/fs/ubifs/master.c b/fs/ubifs/master.c
index 28beaee..8277f64 100644
--- a/fs/ubifs/master.c
+++ b/fs/ubifs/master.c
@@ -378,6 +378,15 @@ int ubifs_write_master(struct ubifs_info *c)
 	c->mst_offs = offs;
 	c->mst_node->highest_inum = cpu_to_le64(c->highest_inum);
 
+	{
+		/* Temporary hack for Matthieu */
+		int gc_lnum = le32_to_cpu(c->mst_node->gc_lnum);
+		if (gc_lnum < 0) {
+			printk(KERN_CRIT "%s: gc_lnum is %d!\n", __func__, gc_lnum);
+			dump_stack();
+		}
+	}
+
 	err = ubifs_write_node(c, c->mst_node, len, lnum, offs, UBI_SHORTTERM);
 	if (err)
 		return err;

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)




More information about the linux-mtd mailing list