RFC: detect and manage power cut on MLC NAND
Ricard Wanderlof
ricard.wanderlof at axis.com
Wed Mar 25 01:33:05 PDT 2015
On Wed, 25 Mar 2015, Iwo Mergler wrote:
> > From a simplified point of view you're right. In reality the
> > program/erase recipes are actually quite advanced in order to get
> > very tight distributions on a full page. The lower/upper page
> > sequence is designed to provide the most reliable results and
> > optimally we would like the lower and upper page programmed 100% of
> > the time. There's been a lot of work done over the years to improve
> > power loss and it's much better than in the past, but it's still
> > something to be avoided on NAND. It's always best to check the
> > integrity of the page after a power loss event.
>
> Is there any way to check the page integrity beyond ECC?
>
> I'm concerned that the power loss could yield an OK looking
> page, but with not so tight charge distribution.
>
> Maybe the hardware that can achieve tight distributions during
> programming, can be accessed to measure distribution of a
> programmed page?
What would be interesting from a software perspective is a special mode in
which one could read the memory cells and get an analog value with several
bits of resolution, allowing the software to assess how "good" the bits
are. This would be in contrast to the normal, high-speed read mode. But
perhaps matters are not that simple. Either there is no such value to be
had, or there are other factors governing the read thresholds which are
not known outside the chip (or rather, outside the manufacturer's lab!).
That said, as I understand it, certain MLC flashes allow the read
thresholds to be shifted, so one could accomplish this by successive
approximation. That would mean it could be done entirely in software on
existing devices, though it is a rather cumbersome process.
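
To make the successive approximation idea concrete, here is a rough sketch
in Python. It is a simulation only: read_page() is a hypothetical hook
standing in for whatever vendor-specific read-threshold shift command a
given chip might offer, and the 4-bit resolution and the rule "a cell reads
1 when its level is below the threshold" are assumptions of mine, not
anything taken from a data sheet.

```python
from typing import Callable, Dict, List, Tuple


def estimate_levels(read_page: Callable[[int], List[int]],
                    n_cells: int, bits: int = 4) -> Tuple[List[int], int]:
    """Estimate each cell's level to `bits` bits by successive approximation.

    read_page(threshold) is a hypothetical hook returning one bit per cell:
    1 if the cell's level is below `threshold`, otherwise 0.
    Returns the estimated levels and the number of page reads issued.
    """
    cache: Dict[int, List[int]] = {}  # each threshold is read at most once

    def read(threshold: int) -> List[int]:
        if threshold not in cache:
            cache[threshold] = read_page(threshold)
        return cache[threshold]

    levels = [0] * n_cells
    for bit in reversed(range(bits)):       # most significant bit first
        for i in range(n_cells):
            trial = levels[i] + (1 << bit)
            if read(trial)[i] == 0:         # cell level is at or above trial
                levels[i] = trial
    return levels, len(cache)


def worst_margin(levels: List[int], nominal: int) -> int:
    """Smallest number of threshold steps that would flip some cell."""
    return min(nominal - v if v < nominal else v - nominal + 1
               for v in levels)


# Simulated page: the "true" cell levels are known only to the simulation.
cells = [3, 12, 7, 0, 15]


def simulated_read(threshold: int) -> List[int]:
    return [1 if v < threshold else 0 for v in cells]


levels, reads = estimate_levels(simulated_read, len(cells))
margin = worst_margin(levels, nominal=8)
```

Since a page read applies one threshold to all cells, the sketch caches
each read. A page with tightly grouped cells needs few distinct thresholds,
while a smeared-out page needs more, which is part of why the process is
cumbersome on real hardware.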
> > I have to be careful here because it's very dependent on the design
> > and I really need to know the specifics to make a definitive
> > statement, but a few ms should be enough time to protect the NAND.
> > WP# is your friend here.
>
> The design is somewhat hypothetical - let's assume that we can
> guarantee the NAND supply for 10ms after system reset asserts.
>
> At reset time, the NAND controller will abort any command sequence in
> progress, so the final "program page" command will be sent either before
> the reset, or not at all. The command byte may be cut short on the bus.
It would seem to me that the only thing really needed to guarantee that
writes (or erase operations) are not cut short by power loss is, as Iwo
says, a system design such that when power loss occurs, there is enough
power to maintain valid supply voltage levels for the NAND to complete
its operations in the worst case, after system reset is asserted.
Admittedly we don't always have the luxury of well-designed hardware, but
having clear design rules for the hardware guys would go a long way in
future designs.
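
As an illustration of what such a design rule could look like, here is a
small sketch in Python. It assumes a constant load current during hold-up,
so that the required capacitance is C = I * t / (V_start - V_min). The
function names are made up for this example, and apart from the 10 ms
hold-up time mentioned above, all numbers are placeholders; real values
have to come from the data sheet and the board design.

```python
def min_holdup_capacitance(load_current_a: float, holdup_time_s: float,
                           v_start: float, v_min: float) -> float:
    """Capacitance (farads) needed to keep the supply above v_min for
    holdup_time_s, assuming a constant load current."""
    if v_start <= v_min:
        raise ValueError("v_start must be above v_min")
    return load_current_a * holdup_time_s / (v_start - v_min)


def holdup_is_sufficient(holdup_time_s: float, worst_case_op_s: float,
                         detect_latency_s: float = 0.0) -> bool:
    """True if the supply outlasts the slowest NAND operation, plus any
    delay between power loss and reset being asserted."""
    return holdup_time_s >= worst_case_op_s + detect_latency_s


# Placeholder figures: 30 mA load, supply allowed to sag from 3.3 V to 2.7 V.
cap = min_holdup_capacitance(0.030, 0.010, 3.3, 2.7)

# Placeholder worst-case operation time of 5 ms against the 10 ms hold-up.
ok = holdup_is_sufficient(0.010, 0.005)
```

The point is only that the rule reduces to two numbers the hardware guys
can check: the worst-case operation time from the data sheet and the
hold-up time the supply actually delivers.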
> I'm very happy to talk to someone at the coal face of modern NAND
> manufacturing. :-)
Agreed, I think there are many of us who appreciate Jeff's contributions
on the list, me included. NAND data sheets are often not so forthcoming,
and there ends up being a lot of speculation about how things actually
work, so it's really nice to have someone with real knowledge to discuss
this with.
/Ricard
--
Ricard Wolf Wanderlöf ricardw(at)axis.com
Axis Communications AB, Lund, Sweden www.axis.com
Phone +46 46 272 2016 Fax +46 46 13 61 30