UBIFS question

Martin Townsend mtownsend1973 at gmail.com
Thu Mar 17 04:16:43 PDT 2016


Hi Richard,

On Thu, Mar 17, 2016 at 8:56 AM, Richard Weinberger <richard at nod.at> wrote:
> Martin,
>
> Am 17.03.2016 um 09:33 schrieb Martin Townsend:
>> Hi Richard,
>>
>> Thanks for the reply.  rsync is the backup plan, I just wanted to rule
>> other options out first. The flash devices are going to be subjected to a
>> fairly harsh environment and the idea of being able to fail over to a
>> backup docker container was appealing.
>>
>> Which leads me to a couple of questions:
>> 1) I need to simulate flash devices reading corrupted
>> pages/blocks/LEBs. Is there currently a way of doing this? If not,
>> would it be possible to write something, say a kernel module that sits
>> above the NAND driver, to do this?  I just want to see what effect
>> corruption has on a live system and how these errors manifest
>> themselves.
>
> Check out nandsim and UBI's debugfs. We have a lot of knobs to
> simulate such stuff.
>
I will take a look at nandsim and UBI debugfs.
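
For reference, a minimal sketch of how the UBI debugfs knobs could be
toggled from a test script. The knob names (tst_emulate_bitflips,
tst_emulate_io_failures) and the /sys/kernel/debug/ubi/ubiN layout are
as recalled from the kernel's UBI debugfs support and should be checked
against the kernel in use; the helper names are hypothetical. nandsim
additionally takes module parameters for bad blocks and bit flips at
load time, which are not covered here.

```python
import os

# Knob names as recalled from the kernel's UBI debugfs directory
# (assumption: verify them on the target kernel before relying on this).
KNOBS = ("tst_emulate_bitflips", "tst_emulate_io_failures")


def knob_path(ubi_num, knob, debugfs_root="/sys/kernel/debug"):
    """Build the path of one debugfs knob for device ubi<ubi_num>."""
    return os.path.join(debugfs_root, "ubi", "ubi%d" % ubi_num, knob)


def set_ubi_knob(ubi_num, knob, enabled, debugfs_root="/sys/kernel/debug"):
    """Write "1" or "0" to a UBI debugfs knob and return the path written.

    debugfs must already be mounted and the caller needs root privileges.
    """
    if knob not in KNOBS:
        raise ValueError("unknown UBI debugfs knob: %s" % knob)
    path = knob_path(ubi_num, knob, debugfs_root)
    with open(path, "w") as f:
        f.write("1" if enabled else "0")
    return path
```

On a target this would be called as
set_ubi_knob(0, "tst_emulate_bitflips", True) before exercising the
filesystem, and again with False afterwards.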

>> 2) One thing I'm going to have to do is write a background thread to
>> monitor the status of the filesystem and try and detect corruption
>> before the system becomes unstable. Is there any way to check the
>> validity of the LEBs, i.e. by verifying their checksums?
>
> So, what exactly is the error scenario you have in mind?
> If the SLC NAND behaves correctly UBIFS can deal with all kinds
> of errors.
> Of course UBI (and UBIFS) is not a magic bullet, if a NAND block
> turns bad all of a sudden there is nothing it can do for you.
> But this NAND would also not be within the spec...
>
> It is not clear to me what this background thread should achieve.

We expect the flash devices to fail sooner than they normally would
because of the environment they will be operating in, so NAND blocks
suddenly turning bad will eventually happen, and we would like to
catch this as early as possible.
The boards will be in very remote locations and are not physically
accessible, so detecting these failures before the system locks up
would let us report home with the information and fail over to the
other filesystem (provided that hasn't also been corrupted).
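
As a rough illustration of what such a monitor could poll, the sketch
below reads a few per-device counters that UBI exposes in sysfs
(bad_peb_count, max_ec, reserved_for_bad under /sys/class/ubi/ubiN) and
flags a fail-over condition. The function names and the thresholds are
hypothetical, and this only reports what UBI has already noticed; it
does not verify LEB checksums itself.

```python
import os

# Per-device UBI sysfs attributes polled by the monitor.
COUNTERS = ("bad_peb_count", "max_ec", "reserved_for_bad")


def read_ubi_counters(ubi_num, sysfs_root="/sys/class/ubi"):
    """Return the selected UBI counters for ubi<ubi_num> as integers."""
    dev_dir = os.path.join(sysfs_root, "ubi%d" % ubi_num)
    values = {}
    for name in COUNTERS:
        with open(os.path.join(dev_dir, name)) as f:
            values[name] = int(f.read().strip())
    return values


def new_bad_blocks(previous, current):
    """Number of eraseblocks that turned bad between two samples."""
    return max(0, current["bad_peb_count"] - previous["bad_peb_count"])


def should_fail_over(previous, current, max_new_bad=0, min_reserve=1):
    """Hypothetical policy: fail over when more than max_new_bad blocks
    went bad since the last sample, or when the pool reserved for
    bad-block handling has dropped below min_reserve."""
    return (new_bad_blocks(previous, current) > max_new_bad
            or current["reserved_for_bad"] < min_reserve)
```

A background thread would call read_ubi_counters() periodically, compare
the result with the previous sample via should_fail_over(), and report
home before switching to the backup filesystem.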

Many Thanks,
Martin.

>
> Thanks,
> //richard


