UBI bitflip maintenance

Jeff Harris jefftharris at gmail.com
Tue Jun 7 06:51:30 PDT 2016


I am looking to add a maintenance activity to our product to detect
correctable NAND bitflips and process them appropriately before they
become errors.  In looking at the MTD and Linux mailing lists, there
was work done by Richard Weinberger on this feature in March and
November of 2015.  Was the work completed?

I have been using Richard's work to develop a method which seems to be
working, though testing is in progress.  I've added an ioctl to check
bitflips on an individual PEB index.  An application will periodically
call the ioctl for each of the PEBs in a partition.

The ioctl performs a read on the whole PEB and looks for correctable
or uncorrectable ECC conditions.  If there is a bitflip, the
ubi_wl_entry is retrieved from the ubi->lookuptbl at the PEB index.
Following Richard's work, the ubi_wl_entry is examined to determine
its state and how to repair it.
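
A minimal sketch of the classification step described above, assuming
the whole-PEB read reports ECC events the way mtd_read() does
(-EUCLEAN when bitflips were corrected, -EBADMSG when the data could
not be corrected).  The enum and the function name classify_read are
hypothetical and are not part of UBI or of Richard's patches.

```c
#include <errno.h>

/* EUCLEAN is Linux-specific; fall back to the generic Linux value. */
#ifndef EUCLEAN
#define EUCLEAN 117
#endif

/* Hypothetical result codes for the per-PEB check (not from UBI). */
enum peb_read_result {
	PEB_READ_CLEAN,          /* read completed with no ECC event */
	PEB_READ_BITFLIPS,       /* ECC corrected one or more bitflips */
	PEB_READ_UNCORRECTABLE,  /* ECC could not recover the data */
	PEB_READ_IO_ERROR        /* any other failure */
};

/* Map an mtd_read()-style return value onto a result code. */
enum peb_read_result classify_read(int err)
{
	if (err == 0)
		return PEB_READ_CLEAN;
	if (err == -EUCLEAN)
		return PEB_READ_BITFLIPS;
	if (err == -EBADMSG)
		return PEB_READ_UNCORRECTABLE;
	return PEB_READ_IO_ERROR;
}
```

Only the two ECC results would lead on to the lookuptbl check; a clean
read or a plain I/O error would return without touching the entry.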

If there is no entry, the PEB is assumed to be corrupt or otherwise bad
and is erased so that it may become valid on the next attach.  If the
PEB is in the free list, it is removed from the list and an erase is
scheduled.  Otherwise, the logic is similar to
that in ubi_wl_scrub_peb.  We do not use fastmap, so I haven't looked
at that case.
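
The three cases above can be sketched as a small decision function.
This is only a model of the logic, not kernel code: the enums and
choose_repair are hypothetical names, and the "already queued" cases
are an assumption based on ubi_wl_scrub_peb, which appears to return
early when the entry is already in the scrub or erroneous tree.

```c
/* Hypothetical model of what the lookup of lookuptbl[pnum] found. */
enum peb_state {
	PEB_NO_ENTRY,   /* lookuptbl[pnum] is NULL */
	PEB_FREE,       /* entry is on the free list */
	PEB_USED,       /* entry is in use (used tree or protection queue) */
	PEB_SCRUBBING,  /* assumption: already queued for scrubbing */
	PEB_ERRONEOUS   /* assumption: already marked erroneous */
};

/* Hypothetical repair actions matching the cases in the text. */
enum peb_action {
	ACT_NONE,                       /* nothing to do, already handled */
	ACT_ERASE_NOW,                  /* erase so next attach may reuse it */
	ACT_UNFREE_AND_SCHEDULE_ERASE,  /* take off free list, queue erase */
	ACT_SCHEDULE_SCRUB              /* same path as ubi_wl_scrub_peb */
};

/* Pick the repair action for a PEB that showed an ECC event. */
enum peb_action choose_repair(enum peb_state state)
{
	switch (state) {
	case PEB_NO_ENTRY:
		return ACT_ERASE_NOW;
	case PEB_FREE:
		return ACT_UNFREE_AND_SCHEDULE_ERASE;
	case PEB_USED:
		return ACT_SCHEDULE_SCRUB;
	case PEB_SCRUBBING:
	case PEB_ERRONEOUS:
		return ACT_NONE;
	}
	return ACT_NONE;
}
```

In the driver the state lookup and the removal from the free list
would presumably have to happen under ubi->wl_lock, so that the entry
cannot be handed out between the check and the scheduled erase.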

Some status will need to be passed back to the user application to
prevent it from repeatedly trying to repair entries which have truly
gone bad and cannot be repaired.
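
One way the application side could use such a status, as a rough
sketch: keep a per-PEB flag and skip any PEB the driver has reported
as unrepairable.  Everything here is hypothetical, including the
status values and the callback type.  demo_check merely stands in for
the new ioctl, which does not exist upstream, so no ioctl number or
argument layout is assumed.

```c
#include <stdbool.h>

/* Hypothetical status values returned by the per-PEB check. */
enum peb_status {
	PEB_OK,                /* no bitflips found */
	PEB_REPAIR_SCHEDULED,  /* bitflips found, repair queued */
	PEB_UNREPAIRABLE       /* PEB has truly gone bad */
};

/* Callback that would wrap the ioctl in a real application. */
typedef enum peb_status (*peb_check_fn)(int pnum, void *ctx);

/*
 * Run one maintenance pass over PEBs 0..peb_count-1.  PEBs already
 * flagged in given_up[] are skipped.  Returns how many PEBs were
 * newly flagged as unrepairable during this pass.
 */
int scan_pass(int peb_count, bool *given_up, peb_check_fn check, void *ctx)
{
	int newly_bad = 0;

	for (int pnum = 0; pnum < peb_count; pnum++) {
		if (given_up[pnum])
			continue;
		if (check(pnum, ctx) == PEB_UNREPAIRABLE) {
			given_up[pnum] = true;
			newly_bad++;
		}
	}
	return newly_bad;
}

/* Stand-in for the ioctl: counts calls, reports PEB 2 as unrepairable. */
enum peb_status demo_check(int pnum, void *ctx)
{
	int *calls = ctx;

	(*calls)++;
	return pnum == 2 ? PEB_UNREPAIRABLE : PEB_OK;
}
```

With demo_check, a first pass over four PEBs checks all four and flags
PEB 2; a second pass checks only the remaining three.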

Does this approach of using the PEB index alleviate some of the
concerns Richard noted regarding race conditions with nodes becoming
bad while work is being scheduled?  An erase may be scheduled, but
with the node no longer in the free list, it shouldn't be used by any
intermediate operation, right?

Thanks,
Jeff


