Increased frequency of fastmap failure due to CRC mismatch

Ronak Desai ronak.desai at rockwellcollins.com
Thu May 17 08:47:13 PDT 2018


On one of our units we noticed an increase in fastmap failures caused
by fastmap CRC mismatches. On this unit, UBI reports fixable bit flips
on one specific PEB at every power-up. We use software ECC because the
processor's NAND controller does not support the required ECC
strength. We have also implemented read retry in the NAND controller
driver.

When UBI reads the fastmap data through the NAND-MTD framework, the
NAND-MTD subsystem returns EUCLEAN, meaning the number of corrected
bit flips was greater than or equal to the ECC strength. The data
should nevertheless be correct, as the read call does not return any
other error.

In this failure scenario, even though the NAND-MTD subsystem has fixed
the corruption with software ECC, UBI still finds a CRC mismatch in
the fastmap data. Successful reads with read retries have already been
tested at temperature as well, so we have no doubt about the
reliability of read retries. UBI should therefore never receive
corrupted data together with a fixable bit-flip return code.
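
To make that expectation concrete, here is a minimal user-space sketch
in C. It is not UBI or MTD code: the names classify_read,
crc32_le_sketch and check_fastmap_buf are hypothetical and exist only
for this illustration. It assumes a read result of 0 or -EUCLEAN means
the buffer holds corrected data, and that the CRC is a reflected CRC32
seeded with 0xFFFFFFFF and without a final inversion.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#ifndef EUCLEAN
#define EUCLEAN 117 /* Linux value; defined here only if the host lacks it */
#endif

/* Hypothetical verdicts for a read return code (illustration only). */
enum read_verdict { DATA_OK, DATA_OK_SCRUB, DATA_BAD };

/* 0: clean read. -EUCLEAN: bit flips were corrected, data usable,
 * block should be scrubbed. Anything else: data cannot be trusted. */
static enum read_verdict classify_read(int err)
{
    if (err == 0)
        return DATA_OK;
    if (err == -EUCLEAN)
        return DATA_OK_SCRUB;
    return DATA_BAD;
}

/* Bitwise reflected CRC32 (polynomial 0xEDB88320), no final inversion. */
static uint32_t crc32_le_sketch(uint32_t crc, const unsigned char *p,
                                size_t len)
{
    assert(p != NULL || len == 0);
    while (len--) {
        crc ^= *p++;
        for (int i = 0; i < 8; i++)
            crc = (crc >> 1) ^ ((crc & 1u) ? 0xEDB88320u : 0u);
    }
    return crc;
}

/* Returns 0 if the buffer passes both checks, -1 otherwise. */
static int check_fastmap_buf(int read_err, const unsigned char *buf,
                             size_t len, uint32_t stored_crc)
{
    if (classify_read(read_err) == DATA_BAD)
        return -1;
    /* Seed value is an assumption stated in the text above. */
    return crc32_le_sketch(0xFFFFFFFFu, buf, len) == stored_crc ? 0 : -1;
}
```

In terms of this sketch, the failure we see corresponds to
check_fastmap_buf() returning -1 even though the read result was
-EUCLEAN, which should only happen if the buffer differs from what was
originally written.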

So we would like to understand what is causing the fastmap data
corruption that leads to the CRC mismatch. Interestingly, we see the
fixable bit-flip error message for that specific PEB on every
power-up, but we do not see a fastmap CRC failure on every power-up.
All reboots are graceful (the UBI partition is detached and
unmounted), and there are no abrupt power cuts.

We are on kernel 4.1.8.

-- 
Ronak A Desai
Sr. Software Engineer
Airborne Information Solutions / RC Linux Platform Software
MS 131-100, C Ave NE, Cedar Rapids, IA, 52498, USA
Ronak.Desai at rockwellcollins.com
https://www.rockwellcollins.com/


