ubifs : corruption after power cut test

Matthieu CASTET matthieu.castet at parrot.com
Tue Jul 13 11:10:27 EDT 2010


Artem Bityutskiy wrote:
> On Tue, 2010-07-13 at 11:24 +0200, Matthieu CASTET wrote:
>> Matthieu CASTET wrote:
>>> Matthieu CASTET wrote:
>>>> Hi,
>>>>
>>>> We found a bug in our driver. Now there are no more UBIFS errors when
>>>> there is an uncorrectable ECC error (these should only happen in the
>>>> last (interrupted) written page).
>>>>
>>>> But now we get "validate_master: bad master node at offset 69632 error
>>>> 7" [1].
>>> Note that gc_lnum == -1 in this case.
>>> Also, this did not happen on the power cut itself.
>>> The scenario was:
>>> - power cut
>>> - mount fs [1]
>>> - do some fs operation
>>> - umount fs quickly (9 seconds after mount in this case) [2]
>>> - mount fs [3]
>>>
>>> The problem seems to be that gc_lnum == -1 is either not handled at
>>> mount or should not happen at umount.
>>>
>> The attached patch tries to support mounting with gc_lnum == -1.
>>
>> Does it look sane?
> 
> I did not give it much thought, but I do not see how master node can end
> up with gc_lnum = -1 in it, and it seems we assumed this cannot happen.
> Could you please add this hack to your kernel? It should catch the
> situations when we write gc_lnum == -1 to the master node and print the
> stack dump, which should give some idea about the code-path which causes
> it.
OK, thanks, I will run it.
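
To be clear about what kind of check is meant, here is a rough userspace
sketch. It is not the actual hack from the previous mail: the struct, the
counter and write_master_checked() are made-up names for illustration, and
only the gc_lnum field is modelled. In the kernel, the marked spot is where
a stack dump such as dump_stack() would go.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the master node; only gc_lnum is modelled. */
struct mst_node_model {
    int gc_lnum;
};

/* Counts how many times the check below fired (illustration only). */
static int bad_gc_lnum_writes;

/*
 * Models the point where the master node is about to be written.
 * Returns 0 if gc_lnum looks sane, -1 if gc_lnum == -1 would be written.
 */
static int write_master_checked(const struct mst_node_model *mst)
{
    assert(mst != NULL);

    if (mst->gc_lnum == -1) {
        bad_gc_lnum_writes++;
        /* In the kernel, a stack dump here would show the code path. */
        fprintf(stderr, "writing master node with gc_lnum == -1\n");
        return -1;
    }
    return 0;
}
```

Calling write_master_checked() on a node with gc_lnum == -1 returns -1 and
bumps the counter; any other value passes through untouched.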

When checking the code, I saw that switch_gc_head can set c->gc_lnum to -1.

In ubifs_put_super, we set c->mst_node->gc_lnum to c->gc_lnum and write
the master node.
Couldn't ubifs_put_super run while switch_gc_head has gc_lnum set to -1?

Matthieu


