Testing a device using mtd_stresstest

David Peverley pev at sketchymonkey.com
Fri Feb 11 11:42:47 EST 2011


Hi Artem,

Thanks for the useful feedback!

> I do not know YAFFS, ask Charles, but I _think_ YAFFS does not use
> sub-pages, so you can have that option enabled.
Yep, I posted to the YAFFS list; interestingly, the write verify
failure exposed a case in YAFFS where a write failure isn't checked
and (I believe) a checkpoint write erroneously completes as a result...!
Having said that from

> For sure, if you do not use sub-pages and it catches problems - have it
> enabled and nail the problems down.
Sure, that makes sense. Although I'm having fun trying to
differentiate the issues; I get both the write verify failures and the
"uncorrectable errors", and there's not necessarily a direct
correlation between occurrences, so my gut tells me they're two separate
issues...!

> Yeah, I think the tests should not do this, they should just test and
> report you issues.
Ok, so to summarise my understanding so far, a failure during the
stresstest is likely to be one of two things: either a failure due to
a bad block developing, which is not unexpected and is not an actual
failure of the test case per se, or a failure due to Something Else Bad,
which is not expected and IS a test failure. The only way I can see to
differentiate between these two situations is via statistics, i.e. if
a block is repeatedly failing it's likely bad; if random blocks are
failing separately it's probably something else warranting
investigation. Is that correct?

If so, would it be more useful to adapt the (kernel) stresstest so that
it doesn't abort the test run on a failure, but instead keeps a tally
of the blocks in which failures have occurred and runs to completion?
Does that sound like a beneficial change? I'm not sure what strategy
is used for discerning whether a block is bad or not, but nandtest.c from
mtd-utils simply marks a block if an erase or write fails at all, so this
would hopefully give more useful feedback from the stresstest. Aborting
the test on what could be a normal bad block seems a little misleading,
although I'm admittedly an unusually ardent fan of tests being
unambiguous... :-)
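
To make the tally idea concrete, here is a rough user-space sketch in C of
the bookkeeping I have in mind. It is not code from mtd_stresstest or
nandtest.c; the struct fail_tally type, the tally_* function names and the
"threshold" parameter are all hypothetical names invented for illustration,
and the threshold that separates "repeatedly failing" from "failed once or
twice" is an assumption that would need tuning.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical per-eraseblock failure tally (not an existing MTD API). */
struct fail_tally {
    unsigned int nblocks;  /* number of eraseblocks being tested */
    unsigned int *count;   /* count[i] = failures seen in eraseblock i */
};

/* Allocate a zeroed tally; returns 0 on success, -1 on allocation failure. */
int tally_init(struct fail_tally *t, unsigned int nblocks)
{
    t->count = calloc(nblocks, sizeof(*t->count));
    if (!t->count)
        return -1;
    t->nblocks = nblocks;
    return 0;
}

void tally_free(struct fail_tally *t)
{
    free(t->count);
    t->count = NULL;
    t->nblocks = 0;
}

/* Record one failure against a block instead of aborting the run. */
void tally_record(struct fail_tally *t, unsigned int block)
{
    assert(block < t->nblocks);
    t->count[block]++;
}

/* Blocks with at least 'threshold' failures: candidates for "gone bad". */
unsigned int tally_repeat_offenders(const struct fail_tally *t,
                                    unsigned int threshold)
{
    unsigned int i, n = 0;

    for (i = 0; i < t->nblocks; i++)
        if (t->count[i] >= threshold)
            n++;
    return n;
}

/* Blocks that failed, but fewer than 'threshold' times: the scattered
 * failures that would warrant further investigation. */
unsigned int tally_sporadic(const struct fail_tally *t,
                            unsigned int threshold)
{
    unsigned int i, n = 0;

    for (i = 0; i < t->nblocks; i++)
        if (t->count[i] > 0 && t->count[i] < threshold)
            n++;
    return n;
}

/* Print a per-block summary once the run has completed. */
void tally_report(const struct fail_tally *t, unsigned int threshold)
{
    unsigned int i;

    for (i = 0; i < t->nblocks; i++)
        if (t->count[i] > 0)
            printf("block %u: %u failure(s)%s\n", i, t->count[i],
                   t->count[i] >= threshold ? " (repeat offender)" : "");
}
```

The test loop would call tally_record() wherever it currently bails out on
an erase, write or verify error, and tally_report() at the end. A handful of
repeat offenders would point at bad blocks; many sporadic failures spread
across the device would point at something else.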

Thanks!

~Pev


