simulate a bad NAND block cause kernel hang

ahgu ahgu at ahgu.homeunix.com
Thu Aug 25 11:41:59 EDT 2005


I am using the 2.4.18 kernel.

jffs2_erase_failed(c, jeb); is the last function called before the fault
condition is triggered.
Where can I find the diff between 2.4.18 and 2.4.20?

void jffs2_erase_block(struct jffs2_sb_info *c, struct jffs2_eraseblock *jeb)
{
 int ret;
#ifdef __ECOS
       ret = jffs2_flash_erase(c, jeb);
       if (!ret) {
               jffs2_erase_succeeded(c, jeb);
               return;
       }
#else /* Linux */
 struct erase_info *instr;

 instr = kmalloc(sizeof(struct erase_info) + sizeof(struct erase_priv_struct), GFP_KERNEL);
 if (!instr) {
  printk(KERN_WARNING "kmalloc for struct erase_info in jffs2_erase_block failed. Refiling block for later\n");
  spin_lock(&c->erase_completion_lock);
  list_del(&jeb->list);
  list_add(&jeb->list, &c->erase_pending_list);
  c->erasing_size -= c->sector_size;
  c->dirty_size += c->sector_size;
  jeb->dirty_size = c->sector_size;
  spin_unlock(&c->erase_completion_lock);
  return;
 }

 memset(instr, 0, sizeof(*instr));

 instr->mtd = c->mtd;
 instr->addr = jeb->offset;
 instr->len = c->sector_size;
 instr->callback = jffs2_erase_callback;
 instr->priv = (unsigned long)(&instr[1]);

 ((struct erase_priv_struct *)instr->priv)->jeb = jeb;
 ((struct erase_priv_struct *)instr->priv)->c = c;

 /* NAND , read out the fail counter, if possible */
 if (!jffs2_can_mark_obsolete(c))
  jffs2_nand_read_failcnt(c,jeb);

 ret = c->mtd->erase(c->mtd, instr);
 if (!ret)
  return;

 kfree(instr);
#endif /* __ECOS */

 if (ret == -ENOMEM || ret == -EAGAIN) {
  /* Erase failed immediately. Refile it on the list */
  D1(printk(KERN_DEBUG "Erase at 0x%08x failed: %d. Refiling on erase_pending_list\n", jeb->offset, ret));
  spin_lock(&c->erase_completion_lock);
  list_del(&jeb->list);
  list_add(&jeb->list, &c->erase_pending_list);
  c->erasing_size -= c->sector_size;
  c->dirty_size += c->sector_size;
  jeb->dirty_size = c->sector_size;
  spin_unlock(&c->erase_completion_lock);
  return;
 }
 if (ret == -EROFS)
  printk(KERN_WARNING "Erase at 0x%08x failed immediately: -EROFS. Is the sector locked?\n", jeb->offset);
 else
  printk(KERN_WARNING "Erase at 0x%08x failed immediately: errno %d\n", jeb->offset, ret);

 jffs2_erase_failed(c, jeb);
}
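
To make the tail of the function easier to follow, here is a minimal userspace
sketch of how the return value of c->mtd->erase() is sorted into the three
paths above. The names erase_outcome, classify_erase_result and
fake_erase_always_fails are hypothetical and exist only for this illustration;
they are not in the kernel source. The choice of -EIO for the forced failure
is also an assumption.

```c
#include <errno.h>

/* Hypothetical outcome names, used only for this illustration. */
enum erase_outcome {
    ERASE_STARTED, /* erase() returned 0; jffs2_erase_block() returns at once */
    ERASE_REFILE,  /* -ENOMEM or -EAGAIN; block goes back on erase_pending_list */
    ERASE_FAILED   /* any other error; jffs2_erase_failed() is called */
};

/* Mirrors the branch order at the end of jffs2_erase_block(). */
static enum erase_outcome classify_erase_result(int ret)
{
    if (ret == 0)
        return ERASE_STARTED;
    if (ret == -ENOMEM || ret == -EAGAIN)
        return ERASE_REFILE;
    /* -EROFS only changes the warning text; it still ends in the failed path. */
    return ERASE_FAILED;
}

/*
 * Hypothetical stand-in for c->mtd->erase() that always fails.
 * Assumption: the forced failure returns -EIO.
 */
static int fake_erase_always_fails(void)
{
    return -EIO;
}
```

Under that assumption, classify_erase_result(fake_erase_always_fails()) gives
ERASE_FAILED, so a forced erase failure is not refiled and goes straight to
jffs2_erase_failed(), which matches what I see.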



----- Original Message ----- 
From: "Thomas Gleixner" <tglx at linutronix.de>
To: "ahgu" <ahgu at ahgu.homeunix.com>
Cc: <linux-mtd at lists.infradead.org>
Sent: Thursday, August 25, 2005 9:44 AM
Subject: Re: simulate a bad NAND block cause kernel hang


> On Thu, 2005-08-25 at 09:27 -0400, ahgu wrote:
>> I forced the flash_erase function to fail. I expect the jffs2 will pick up
>> the return error and mark the block bad and put the bad block in a
>> bad_block list. But what I get is kernel failure:
>> I get similar error when I simulate a write error.
>> Am I doing the bad block simulation correctly? Is this a correct response?
>> What is supposed to happen when the NAND flash grow a bad block?
>
> JFFS2 should handle this.
>
> The oops trace is worthless, as it does not show the stack trace in
> human readable form (function names decoded)
>
> Make sure that CONFIG_KALLSYMS is set in your kernel .config file.
>
> Also information about kernel version and possibly applied MTD/JFFS2
> patches is missing.
>
>
> tglx
>
> 




