Make NAND_BBT_NO_OOB_BBM configurable or let the gpmi driver decide?

Lothar Waßmann LW at KARO-electronics.de
Tue Mar 15 00:06:02 PDT 2022


Miquel Raynal <miquel.raynal at bootlin.com> wrote:

> Hi Daniel,
> 
> Sorry for the delay.
> 
> dg at emlix.com wrote on Thu, 24 Feb 2022 19:17:43 +0100:
> 
> > Hi Miquel,
> > 
> > On 24.02.22 at 17:03, Miquel Raynal wrote:
> > > dg at emlix.com wrote on Thu, 24 Feb 2022 16:55:27 +0100:
> > >> On 24.02.22 at 16:29, Miquel Raynal wrote:
> > >>> dg at emlix.com wrote on Wed, 23 Feb 2022 11:59:02 +0100:
> > >>>> On 22.02.22 at 23:02, Han Xu wrote:
> > >>>>> Could you please describe in more detail what kind of error
> > >>>>> occurs, how to reproduce it, and on which kernel version?
> > >>>>
> > >>>> You need a flash that has one bad block where programming the
> > >>>> BBM sets NAND_STATUS_FAIL in its status register. The latest
> > >>>> kernels should still have problems when this happens in a
> > >>>> UBI.      
> > >>>
> > >>> I believe we should try to tackle "why" this happens rather
> > >>> than try to work around its consequences. Can you give more
> > >>> details about why we get this status?
> > >>
> > >> Uhm, the block is bad, broken. It shows the same behavior even
> > >> after power cycling. The other blocks are ok. I don't think it
> > >> is our fault that it died so early.    
> > > 
> > > But why are we trying to write the BBM after a power cycle?
> > 
> > I did not want to imply that Linux tries to write the block after
> > every power cycle. UBI notices that the block is broken once and
> > manages to mark it as bad in the BBT, so after power cycle it will
> > not try to write to that block again. What I wanted to say is that
> > manual testing of the block after power cycling shows that the
> > block remains unusable.
> > 
> > The problem is that UBI switches to read-only mode after it has
> > marked the block as bad in the BBT, because the redundant BBM in
> > the OOB of the block could not be written.
> 
> I think I understand your situation better now.
> 
> So here is our problem: why can't we write the OOB? If there is a
> good reason this cannot happen, then we can provide the
> NAND_BBT_NO_OOB_BBM flag. Otherwise we should find the root cause.
> 
> > And we don't want to get into a situation
> > where we have to reboot the system, especially if it is because of
> > something we don't need.
> > 
> > We could change nand_block_markbad_lowlevel to return success as
> > long as updating the BBT succeeds, if you think that this is the
> > correct approach.  
> 
> That is not a correct approach if we did not explicitly ask to bypass
> writing BBMs.
>
The BBM in the OOB area is a "Factory Bad Block Marker" where the
manufacturer marks initially bad blocks. There is no guarantee that the
BBM can be written on a block that turned bad later on.
If a block turned bad during use, it is completely useless to try
writing anything to it. Depending on the nature of the NAND error that
turned the block bad, trying to write that block may also affect random
other blocks.
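
To make the two policies discussed in this thread concrete, here is a
minimal user-space sketch in C. It is only a model and not the kernel's
nand_block_markbad_lowlevel(): struct sim_chip, SIM_NO_OOB_BBM,
SIM_EIO and the sim_*() helpers are hypothetical names made up for this
illustration. It assumes, as described above, that a failed BBM write
is reported to the caller even when the BBT update succeeded, and that
a flag in the spirit of NAND_BBT_NO_OOB_BBM skips the OOB write
entirely.

```c
#include <stdbool.h>

/* Hypothetical names for this sketch only, not kernel definitions. */
#define SIM_NO_OOB_BBM 0x1u  /* stands in for NAND_BBT_NO_OOB_BBM */
#define SIM_EIO        5     /* stands in for an I/O error code */

struct sim_chip {
    unsigned int bbt_options; /* option flags */
    bool has_bbt;             /* a bad block table is in use */
    bool oob_write_fails;     /* simulate a failing BBM program */
    bool bbt_write_fails;     /* simulate a failing BBT update */
    int oob_write_attempts;   /* writes issued to the dead block */
};

/* Simulated attempt to program the BBM into the OOB of the bad block. */
static int sim_write_bbm(struct sim_chip *chip)
{
    chip->oob_write_attempts++;
    return chip->oob_write_fails ? -SIM_EIO : 0;
}

/* Simulated update of the bad block table. */
static int sim_update_bbt(struct sim_chip *chip)
{
    return chip->bbt_write_fails ? -SIM_EIO : 0;
}

/*
 * Mark a block bad. Without SIM_NO_OOB_BBM the BBM write is attempted
 * first, and its error is returned even if the BBT update succeeds.
 * With SIM_NO_OOB_BBM the bad block is never written to, and only the
 * BBT result decides the return value.
 */
int sim_markbad(struct sim_chip *chip)
{
    int ret = 0;

    if (!(chip->bbt_options & SIM_NO_OOB_BBM))
        ret = sim_write_bbm(chip);

    if (chip->has_bbt) {
        int res = sim_update_bbt(chip);

        if (!ret)
            ret = res;
    }

    return ret;
}
```

In this model a block whose OOB cannot be programmed makes
sim_markbad() fail unless SIM_NO_OOB_BBM is set, although the BBT
already records the block as bad. With the flag set, a failing BBT
update is still reported, so the caller does not lose a real error.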


Lothar Waßmann
-- 
___________________________________________________________

Ka-Ro electronics GmbH | Pascalstraße 22 | D - 52076 Aachen
Phone: +49 2408 1402-0 | Fax: +49 2408 1402-10
Managing Director: Matthias Kaussen
Commercial register: Amtsgericht Aachen, HRB 4996

www.karo-electronics.de | info at karo-electronics.de
___________________________________________________________


