[question]MTD:unstable bit issues?

Wed Oct 24 05:53:06 EDT 2012

Thomas.Betker at rohde-schwarz.com wrote:
> Hello Jia:
> We have also seen this "4k zeros" issue for some time. I never found out
> what was happening because the issue was suddenly no longer reproducible.
> :-(
> 
> In our case, though, we didn't have NAND flash, but JFFS2 with serial NOR
> flash. So I would guess that this is not a NAND problem.

I've seen it a number of times with JFFS2 with parallel NOR, on
ancient kernels (2.4.26-uc0).  Typically the symptom would be an
executable crashing, and a 4k hole would be causing that.
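
A minimal sketch of how one might look for such holes in a file copied
off the device, assuming the hole is a full, 4 KiB-aligned run of zero
bytes.  The names find_zero_blocks and scan_file are made up for this
example.  Legitimately zero-filled regions will also be reported, so the
result is only a hint and is best compared against a known-good copy.

```python
BLOCK_SIZE = 4096  # assumed hole size, matching the "4k zeros" symptom


def find_zero_blocks(data, block_size=BLOCK_SIZE):
    """Return offsets of aligned, full-size blocks that are entirely zero."""
    zero = bytes(block_size)
    return [
        off
        for off in range(0, len(data) - block_size + 1, block_size)
        if data[off:off + block_size] == zero
    ]


def scan_file(path, block_size=BLOCK_SIZE):
    """Read a whole file and return the offsets of its all-zero blocks."""
    with open(path, "rb") as f:
        return find_zero_blocks(f.read(), block_size)
```

Calling scan_file() on a suspect executable returns a list of byte
offsets; an empty list means no aligned all-zero 4 KiB block was found.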

We always put it down to a faulty board with poor signal and/or timings
to the NOR, as it "seemed" to happen on particular boards more, and
simply removed those boards from service.  However testing wasn't very
systematic.

If JFFS2 sees a corrupt block (detected by CRC), it simply discards
that block from the file data as if it had never been written, making a
hole.  An I/O error would be much nicer than corrupt data, but a hole
is what we get.
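
The effect can be illustrated with a toy model.  This is not JFFS2 code:
the node layout, the 4 KiB payload size, the use of zlib.crc32 and the
helper names make_node and read_file are all assumptions made for the
example.  It only shows how dropping a node that fails its CRC turns
into a run of zeros on read instead of an error.

```python
import zlib

PAGE = 4096  # assumed payload size of one data node


def make_node(offset, payload):
    """Build a toy data node: file offset, payload, and CRC of the payload."""
    return {"offset": offset, "data": payload, "crc": zlib.crc32(payload)}


def read_file(nodes, size):
    """Assemble file contents, silently skipping nodes whose CRC fails."""
    out = bytearray(size)  # ranges with no valid node read back as zeros
    for node in nodes:
        if zlib.crc32(node["data"]) != node["crc"]:
            continue  # node dropped: no I/O error, just a hole
        start = node["offset"]
        out[start:start + len(node["data"])] = node["data"]
    return bytes(out)


# Three nodes covering a 12 KiB file, filled with 0x01, 0x02 and 0x03.
nodes = [make_node(i * PAGE, bytes([i + 1]) * PAGE) for i in range(3)]

# Flip one bit in the middle node's payload, as a flash or DRAM error might.
bad = bytearray(nodes[1]["data"])
bad[0] ^= 0x01
nodes[1]["data"] = bytes(bad)

image = read_file(nodes, 3 * PAGE)
hole_is_zero = image[PAGE:2 * PAGE] == bytes(PAGE)
```

In this model the first and last 4 KiB come back intact, while the
middle 4 KiB reads as zeros even though only a single bit was wrong.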

It might have been caused by faulty boards but in a different way:
Some of our boards crashed every few weeks because the manufacturer's
tolerances between CPU and DRAM were too tight.  After a very long
time in the field (years), we learned to underclock those slightly.
Maybe occasional DRAM corruption was also causing occasional NOR
corruption as a side effect (e.g. writing/reading data that didn't
match the CRC during JFFS2 GC), leading to 4k holes in previously good
files (bad but understandable); or maybe occasional crashes caused it
(not acceptable and shouldn't happen).  But I am not sure the two
issues were correlated anyway.

-- Jamie