JFFS2 loss of power expectations
Ivan Djelic
ivan.djelic at parrot.com
Tue May 3 18:03:42 EDT 2011
On Tue, May 03, 2011 at 09:08:26PM +0100, Cliff Brake wrote:
> >> 2) any suggestions for debugging this?
> >
> > Some kind of device which may cut power is needed. Then you may write a
> > test program or script, cut power at random point, boot up, make sure
> > the FS look ok.
>
> Yes, we have a programmable PS set up to cut power during boot, and we
> can reproduce JFFS2 file system corruption with a day or so of
> testing. We are using a fairly old CPU board with a small SLC flash
> (128MB).
>
> Now, the question is how do we prevent it?
>
> We are looking into mounting the root file system in RO and sync
> modes, etc, but don't have test results yet.
>
> So, just looking for general ideas how to improve this situation.
Hi Cliff,
Just a few debugging ideas that helped me a lot in the past:
1. Try to focus your random power cuts so that they happen precisely during a
NAND write/erase operation; this will help reproduce bugs much faster.
Ideally, you could use a hardware timer or watchdog to trigger a software
reset with µs precision.
2. Using instrumentation and targeted power cuts as described above, you
should be able to isolate the last interrupted NAND operation that caused the
corruption: was it an interrupted page program, or a partially erased block?
3. During reboot after a power cut, look for NAND read failures. Are they
located, as expected, in the last page/block that was programmed/erased? Do
they appear in unrelated locations? Or do they not appear at all?
4. If the above steps do not lead to an obvious explanation, they may still
give you a way to dump the NAND contents (before and after corruption) and
systematically reproduce the bug on a Linux PC running nandsim. This makes
debugging much easier.
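
To illustrate steps 3 and 4, here is a rough sketch (not something taken from
our actual tooling) of how two raw dumps taken before and after a power cut
could be compared to see which pages changed, and whether the changes sit in
the block you expected. The function name changed_pages, the page size and the
pages-per-block values are all assumptions for the example; it also assumes
the dumps contain the main area only, without OOB data.

```python
# Sketch only: geometry values are assumptions, adjust them to your chip.
PAGE_SIZE = 2048        # bytes per page, main area only (assumed)
PAGES_PER_BLOCK = 64    # pages per erase block (assumed)


def changed_pages(before, after, page_size=PAGE_SIZE,
                  pages_per_block=PAGES_PER_BLOCK):
    """Return a list of (block, page_in_block, kind) for pages that differ.

    kind is:
      "erased"       - the page is all 0xFF in the 'after' dump
      "bits_cleared" - only 1 -> 0 transitions, consistent with programming
      "other"        - some bits went 0 -> 1 without the page being fully
                       erased (e.g. a partial erase or a read disturbance)
    """
    if len(before) != len(after):
        raise ValueError("dumps must have the same length")
    if len(before) % page_size:
        raise ValueError("dump length is not a multiple of the page size")

    erased_page = b"\xff" * page_size
    result = []
    for offset in range(0, len(before), page_size):
        old = before[offset:offset + page_size]
        new = after[offset:offset + page_size]
        if old == new:
            continue
        if new == erased_page:
            kind = "erased"
        elif all((n & ~o & 0xFF) == 0 for o, n in zip(old, new)):
            # No bit is set in 'new' that was clear in 'old'.
            kind = "bits_cleared"
        else:
            kind = "other"
        page = offset // page_size
        result.append((page // pages_per_block, page % pages_per_block, kind))
    return result
```

You would read both dump files in binary mode and pass the two byte strings
to changed_pages(). Any entry outside the block that was being written or
erased when power was cut is worth a closer look.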
On the improvement side, I was going to suggest mounting as much as possible
read-only, but you mentioned that already.
Hope that helps,
Regards,
Ivan