Numonyx NOR and chip->mutex bug?
joakim.tjernlund at transmode.se
Thu Feb 10 09:53:59 EST 2011
> On Feb 9, 2011, at 3:13 PM, Joakim Tjernlund wrote:
> > Hmm, this sounds similar (from http://www.numonyx.com/Documents/Specification%20Updates/SU-309045_P30.pdf):
> > 5. W602: erase suspend resume operation
> > Problem: P30 product may fail to erase in intensive erase/suspend/resume environments. This is
> > due to an internal firmware issue that is exhibited in certain applications that require at
> > least 3000 to 4000 erase/suspend/resume cycles during the erase of a single block.
> > Implication: Customer may see erase failure (SR reports “A0”) during a background erase. This
> > does not damage the device in any way, but data in the block may be disturbed from its
> > original state.
> > Workaround: If such an erase failure occurs, customer should retry the erase with an increase to the
> > Tres (W602) spec from 500uS to 1mS. If the device still fails, continue to increase Tres
> > in increments of 500uS up to a maximum Tres of 2.5mS. Once the failing block passes,
> > subsequent blocks should revert back to original erase algorithm and timing for Tres
> > (500uS typical).
> > Status:
> > June 2008
> > 309045-09
> > This erratum does not apply to material marked with an FPO code dated x806xxxx or
> > later.
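
For reference, the workaround quoted above could be sketched roughly as below. This is only an illustration of the retry schedule in the erratum (500 us, then 1 ms, stepping by 500 us up to 2.5 ms, and back to 500 us for the next block). The erase_fn callback, the fake_flash struct and fake_erase() are hypothetical stand-ins made up for this sketch, not anything in the MTD code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Tres schedule taken from the erratum text: start at the typical
 * 500 us, step by 500 us, give up after 2.5 ms. */
#define TRES_DEFAULT_US 500u
#define TRES_STEP_US    500u
#define TRES_MAX_US     2500u

/* Hypothetical callback: erase one block using the given Tres.
 * Returns true if the erase completed without an error status. */
typedef bool (*erase_fn)(unsigned long block, unsigned int tres_us, void *ctx);

/* Try the erase with increasing Tres. Returns the Tres value (in us)
 * that worked, or 0 if every attempt failed. Each call starts again
 * from TRES_DEFAULT_US, so later blocks revert to the original timing. */
static unsigned int erase_with_tres_backoff(erase_fn erase,
                                            unsigned long block, void *ctx)
{
    unsigned int tres;

    for (tres = TRES_DEFAULT_US; tres <= TRES_MAX_US; tres += TRES_STEP_US) {
        if (erase(block, tres, ctx))
            return tres;
    }
    return 0;
}

/* Stand-in "flash" for trying the loop out: the erase only succeeds
 * once Tres reaches min_tres_us. Purely illustrative. */
struct fake_flash {
    unsigned int min_tres_us;
    unsigned int attempts;
};

static bool fake_erase(unsigned long block, unsigned int tres_us, void *ctx)
{
    struct fake_flash *f = ctx;

    (void)block;
    f->attempts++;
    return tres_us >= f->min_tres_us;
}
```

In a real driver the callback would have to apply Tres as the delay the erratum specifies around erase suspend/resume; where exactly that delay goes is the open question in this thread.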
> Interesting. Thanks for the URL. I'll explore it. I have a 2009 errata sheet but had not seen the 2008 one. (I'm wondering if that's because yours is for a batch of the older 130nm parts which are no longer made. Just guessing.)
> It doesn't match exactly, since I'm not seeing erase failures. I'm seeing buffered write failures where, after the 0xd0 Write Confirm command I get back nonsense status. (See the 0xffff reported in my log. I've also seen many values other than 0xffff.)
> So the program operation (0xe8) begins OK, the count and data are written, the confirm command is written, but then the status read returns junk. I'm thinking about adding status-read logs to my buffer if I can do it without disturbing the timing.
> The reason that status is nonsense is that the status register is only in the lower 8 bits. If the part is in read-status mode (which it should be) we should never see any set bits in the upper byte. Because I do see such bits, I've suspected from the start that the part is somehow exiting status-read mode and returning to array-read mode when it should not. (My very first post mentions this suspicion.)
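
The upper-byte test described above could be written as a small check like the one below. It is a sketch that assumes a single x16 part whose status register occupies only the low byte of the data bus, as described in the quoted text. The helper names are made up for illustration. SR.7 as the "ready" bit is the usual Intel command set convention.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: one x16 chip, status register in the low 8 bits only. */
#define SR_UPPER_MASK 0xff00u
#define SR_READY      0x0080u  /* SR.7: write state machine ready */

/* A word read in status-read mode should have a clear upper byte.
 * Any set bit there suggests the part is returning array data. */
static bool status_word_plausible(uint16_t word)
{
    return (word & SR_UPPER_MASK) == 0;
}

/* Only trust the ready bit when the word looks like status at all. */
static bool status_ready(uint16_t word)
{
    return status_word_plausible(word) && (word & SR_READY) != 0;
}
```

With this check the 0xffff from the log is rejected as implausible, while a plain 0x0080 counts as ready. An array word that happens to have a clear upper byte would still slip through, so it is only a hint for logging.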
> One more log point I think I may check is to count and report the number of suspend/resumes for each erase. It would be interesting to see if some sort of spike in that count correlates with my failures.
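
A per-erase counter for that log point could look something like the sketch below. The struct, the function names and the threshold are all hypothetical. The 3000 figure is just the lower bound mentioned in the erratum, used here as a flag level.

```c
#include <stdbool.h>

/* Lower bound of the 3000 to 4000 cycle range quoted in the erratum;
 * used only to flag suspicious erases in a log. */
#define SUSPEND_WARN_THRESHOLD 3000u

struct erase_stats {
    unsigned int suspends;      /* suspend/resume cycles in current erase */
    unsigned int max_suspends;  /* worst case seen over all erases */
};

/* Call when a block erase is started. */
static void erase_begin(struct erase_stats *s)
{
    s->suspends = 0;
}

/* Call each time the running erase is suspended. */
static void erase_suspended(struct erase_stats *s)
{
    s->suspends++;
}

/* Call when the erase finishes. Updates the running maximum and
 * returns true if this erase reached the warning threshold. */
static bool erase_end(struct erase_stats *s)
{
    if (s->suspends > s->max_suspends)
        s->max_suspends = s->suspends;
    return s->suspends >= SUSPEND_WARN_THRESHOLD;
}
```

Logging the count at erase_end() alongside any bad status reads would show whether the failures line up with a spike.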
> More as I find it.
Mike, what if you move the udelay to just before doing erase suspend? Perhaps
you need to increase it a bit.