UBI wl_tree_add problems after PEB scrubbed

Artem Bityutskiy dedekind1 at gmail.com
Mon Dec 3 09:46:00 EST 2012


/me looks at Richard...

OK, I confirm that this is fastmap to blame. Namely, this commit:

Author: Richard Weinberger <richard at nod.at>
Date:   Wed Sep 26 17:51:48 2012 +0200

    UBI: Add fastmap support to the WL sub-system
    
    To make fastmap possible the WL sub-system needs some
    changes.
    Mostly to support fastmaps pools.
    
    Signed-off-by: Richard Weinberger <richard at nod.at>
    Signed-off-by: Artem Bityutskiy <artem.bityutskiy at linux.intel.com>

If you revert it, bit-flips are handled properly.

Use the following for a time being:

$ git revert 76ac66e dbb7d2a 8199b90

On Fri, 2012-11-30 at 09:05 -0600, Zach Sadecki wrote:
> Every time I see UBI scrub a PEB with fixable bit-flips (on my custom 
> Freescale i.MX28 board) the background thread has problems shortly 
> thereafter.  I'm not exactly sure where to start debugging this and I'm 
> hoping someone can help point me in the right direction.  Below are 
> kernel messages showing the problem from 2 different runs (in which both 
> ended up with a hung CPU).  This is using kernel 3.7-rc7.
> 
> Also worth noting is that I had to modify the gpmi-nand driver to 
> actually report max_bitflips back to the MTD layer to even get to this 
> point (before that everything would just run along happily until it hit 
> an uncorrectable ECC error).  I will submit a patch for this once 
> everything seems OK...
> 
> Kernel messages:
> [18444.860000] UBI: fixable bit-flip detected at PEB 2789
> [18444.870000] UBI: schedule PEB 2789 for scrubbing
> [18444.960000] UBI: fixable bit-flip detected at PEB 2789
> [18445.120000] UBI: scrubbed PEB 2789 (LEB 3:544), data moved to PEB 1574
> [18457.040000] UBI assert failed in wl_tree_add at 204 (pid 31)
> [18457.040000] [<c000d8fc>] (unwind_backtrace+0x0/0xf4) from 
> [<c0279758>] (wl_tree_add+0xb8/0xd0)
> [18457.050000] [<c0279758>] (wl_tree_add+0xb8/0xd0) from [<c027a008>] 
> (erase_worker+0x3bc/0x6c0)
> [18457.060000] [<c027a008>] (erase_worker+0x3bc/0x6c0) from [<c027961c>] 
> (do_work+0x8c/0x110)
> [18457.070000] [<c027961c>] (do_work+0x8c/0x110) from [<c027adf4>] 
> (ubi_thread+0xc4/0x148)
> [18457.080000] [<c027adf4>] (ubi_thread+0xc4/0x148) from [<c00370c8>] 
> (kthread+0xa0/0xac)
> [18457.090000] [<c00370c8>] (kthread+0xa0/0xac) from [<c00092e8>] 
> (ret_from_fork+0x14/0x2c)
> [21386.860000] UBI: fixable bit-flip detected at PEB 2467
> [21386.870000] UBI: schedule PEB 2467 for scrubbing
> [21387.000000] UBI: fixable bit-flip detected at PEB 2467
> [21387.190000] UBI: scrubbed PEB 2467 (LEB 3:490), data moved to PEB 1574
> [21400.920000] UBI assert failed in wl_tree_add at 204 (pid 31)
> [21400.920000] [<c000d8fc>] (unwind_backtrace+0x0/0xf4) from 
> [<c0279758>] (wl_tree_add+0xb8/0xd0)
> [21400.930000] [<c0279758>] (wl_tree_add+0xb8/0xd0) from [<c027a008>] 
> (erase_worker+0x3bc/0x6c0)
> [21400.940000] [<c027a008>] (erase_worker+0x3bc/0x6c0) from [<c027961c>] 
> (do_work+0x8c/0x110)
> [21400.950000] [<c027961c>] (do_work+0x8c/0x110) from [<c027adf4>] 
> (ubi_thread+0xc4/0x148)
> [21400.960000] [<c027adf4>] (ubi_thread+0xc4/0x148) from [<c00370c8>] 
> (kthread+0xa0/0xac)
> [21400.970000] [<c00370c8>] (kthread+0xa0/0xac) from [<c00092e8>] 
> (ret_from_fork+0x14/0x2c)
> [26168.070000] BUG: soft lockup - CPU#0 stuck for 22s! [ubi_bgt0d:31]
> [26168.070000] irq event stamp: 27382734
> [26168.070000] hardirqs last  enabled at (27382733): [<c0008e88>] 
> __irq_svc+0x48/0x54
> [26168.070000] hardirqs last disabled at (27382734): [<c0008e74>] 
> __irq_svc+0x34/0x54
> [26168.070000] softirqs last  enabled at (27382732): [<c001ea48>] 
> __do_softirq+0x1ac/0x22c
> [26168.070000] softirqs last disabled at (27382725): [<c001ebb0>] 
> irq_exit+0x84/0x94
> [26168.070000]
> [26168.070000] Pid: 31, comm:            ubi_bgt0d
> [26168.070000] CPU: 0    Not tainted  (3.7.0-rc7-itw #15)
> [26168.070000] PC is at wl_tree_add+0x44/0xd0
> [26168.070000] LR is at erase_worker+0x3bc/0x6c0
> [26168.070000] pc : [<c02796e4>]    lr : [<c027a008>] psr: 80000013
> [26168.070000] sp : c76d9e90  ip : c08ea20c  fp : c763add8
> [26168.070000] r10: 00000000  r9 : 00000043  r8 : c763ad78
> [26168.070000] r7 : c76d8000  r6 : c77073c0  r5 : c7707410  r4 : c7707408
> [26168.070000] r3 : 00000046  r2 : 00000044  r1 : c7707348  r0 : c77073c0
> [26168.070000] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM 
> Segment kernel
> [26168.070000] Control: 0005317f  Table: 460a0000  DAC: 00000017
> [26168.070000] [<c000d8fc>] (unwind_backtrace+0x0/0xf4) from 
> [<c005fc94>] (watchdog_timer_fn+0x15c/0x1b0)
> [26168.070000] [<c005fc94>] (watchdog_timer_fn+0x15c/0x1b0) from 
> [<c003b24c>] (__run_hrtimer+0xd8/0x1e8)
> [26168.070000] [<c003b24c>] (__run_hrtimer+0xd8/0x1e8) from [<c003b724>] 
> (hrtimer_interrupt+0x104/0x348)
> [26168.070000] [<c003b724>] (hrtimer_interrupt+0x104/0x348) from 
> [<c0013390>] (mxs_timer_interrupt+0x20/0x30)
> [26168.070000] [<c0013390>] (mxs_timer_interrupt+0x20/0x30) from 
> [<c00603e4>] (handle_irq_event_percpu+0x70/0x24c)
> [26168.070000] [<c00603e4>] (handle_irq_event_percpu+0x70/0x24c) from 
> [<c00605fc>] (handle_irq_event+0x3c/0x5c)
> [26168.070000] [<c00605fc>] (handle_irq_event+0x3c/0x5c) from 
> [<c0062d70>] (handle_level_irq+0x90/0x120)
> [26168.070000] [<c0062d70>] (handle_level_irq+0x90/0x120) from 
> [<c0060360>] (generic_handle_irq+0x30/0x3c)
> [26168.070000] [<c0060360>] (generic_handle_irq+0x30/0x3c) from 
> [<c0009abc>] (handle_IRQ+0x30/0x84)
> [26168.070000] [<c0009abc>] (handle_IRQ+0x30/0x84) from [<c0008760>] 
> (icoll_handle_irq+0x34/0x48)
> [26168.070000] [<c0008760>] (icoll_handle_irq+0x34/0x48) from 
> [<c0008e84>] (__irq_svc+0x44/0x54)
> [26168.070000] Exception stack(0xc76d9e48 to 0xc76d9e90)
> [26168.070000] 9e40:                   c77073c0 c7707348 00000044 
> 00000046 c7707408 c7707410
> [26168.070000] 9e60: c77073c0 c76d8000 c763ad78 00000043 00000000 
> c763add8 c08ea20c c76d9e90
> [26168.070000] 9e80: c027a008 c02796e4 80000013 ffffffff
> [26168.070000] [<c0008e84>] (__irq_svc+0x44/0x54) from [<c02796e4>] 
> (wl_tree_add+0x44/0xd0)
> [26168.070000] [<c02796e4>] (wl_tree_add+0x44/0xd0) from [<c027a008>] 
> (erase_worker+0x3bc/0x6c0)
> [26168.070000] [<c027a008>] (erase_worker+0x3bc/0x6c0) from [<c027961c>] 
> (do_work+0x8c/0x110)
> [26168.070000] [<c027961c>] (do_work+0x8c/0x110) from [<c027adf4>] 
> (ubi_thread+0xc4/0x148)
> [26168.070000] [<c027adf4>] (ubi_thread+0xc4/0x148) from [<c00370c8>] 
> (kthread+0xa0/0xac)
> [26168.070000] [<c00370c8>] (kthread+0xa0/0xac) from [<c00092e8>] 
> (ret_from_fork+0x14/0x2c)
> 
> 
> 
> [13111.940000] UBI: fixable bit-flip detected at PEB 2822
> [13111.940000] UBI: schedule PEB 2822 for scrubbing
> [13112.030000] UBI: fixable bit-flip detected at PEB 2822
> [13112.200000] UBI: scrubbed PEB 2822 (LEB 3:634), data moved to PEB 2766
> [13124.260000] UBI assert failed in wl_tree_add at 204 (pid 31)
> [13124.270000] [<c000d8fc>] (unwind_backtrace+0x0/0xf4) from 
> [<c027974c>] (wl_tree_add+0xb8/0xd0)
> [13124.280000] [<c027974c>] (wl_tree_add+0xb8/0xd0) from [<c0279ffc>] 
> (erase_worker+0x3bc/0x6c0)
> [13124.290000] [<c0279ffc>] (erase_worker+0x3bc/0x6c0) from [<c0279610>] 
> (do_work+0x8c/0x110)
> [13124.300000] [<c0279610>] (do_work+0x8c/0x110) from [<c027ade8>] 
> (ubi_thread+0xc4/0x148)
> [13124.300000] [<c027ade8>] (ubi_thread+0xc4/0x148) from [<c00370c8>] 
> (kthread+0xa0/0xac)
> [13124.310000] [<c00370c8>] (kthread+0xa0/0xac) from [<c00092e8>] 
> (ret_from_fork+0x14/0x2c)
> [13361.320000] Unable to handle kernel NULL pointer dereference at 
> virtual address 00000000
> [13361.330000] pgd = c5c64000
> [13361.330000] [00000000] *pgd=45c4f831, *pte=00000000, *ppte=00000000
> [13361.340000] Internal error: Oops: 817 [#1] ARM
> [13361.340000] CPU: 0    Not tainted  (3.7.0-rc7-itw #12)
> [13361.340000] PC is at prot_queue_del+0x7c/0xcc
> [13361.340000] LR is at ubi_wl_put_peb+0x1d8/0x2d8
> [13361.340000] pc : [<c0279070>]    lr : [<c027ac24>] psr: 60000013
> [13361.340000] sp : c6f07cd0  ip : c7720228  fp : 00000265
> [13361.340000] r10: 00000f75  r9 : 00000003  r8 : c763add8
> [13361.340000] r7 : c763ad70  r6 : c763a000  r5 : c7720228  r4 : c763ad74
> [13361.340000] r3 : 00200200  r2 : c77202b9  r1 : 00000000  r0 : c763a000
> [13361.340000] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM 
> Segment user
> [13361.340000] Control: 0005317f  Table: 45c64000  DAC: 00000015
> [13361.340000] Process sqlite3 (pid: 6567, stack limit = 0xc6f061b8)
> [13361.340000] Stack: (0xc6f07cd0 to 0xc6f08000)
> [13361.340000] 7cc0:                                     c763ad74 
> c027ac24 00000000 c0056088
> [13361.340000] 7ce0: 00000000 c763acc4 c76cfc00 00000000 c763a000 
> 00000003 00000f75 00000265
> [13361.340000] 7d00: 00000000 c0275a94 00000000 c03cf0c8 c6f07dc0 
> c6e65000 00000265 c6e65000
> [13361.340000] 7d20: c6cd6000 c6f07dc0 00000800 c019b45c c6f07dc0 
> 00000000 c6f07dc0 00000000
> [13361.340000] 7d40: c6e65000 c01a779c 00000000 00000000 00000000 
> 00000000 c6e65b20 00000001
> [13361.340000] 7d60: c6cd6000 00000265 c6e652f4 c6e652f4 c6e65000 
> c6cd6108 c6cd6000 00000000
> [13361.340000] 7d80: 000001b5 c6e65000 00009000 c6cd6000 c6f07dc0 
> 00000800 00000000 c01a7e30
> [13361.340000] 7da0: c01ab440 00000007 00000000 c6f06000 c6c76180 
> c6f06000 c6cd6024 c6e65a80
> [13361.340000] 7dc0: 00000000 0001f000 00000010 00000265 c5c09ca8 
> c6e65bd0 c6cd612c ffffffe4
> [13361.340000] 7de0: 000001b5 00000000 c6cd6108 00000000 c6cd612c 
> c6e65000 00000001 c018df00
> [13361.340000] 7e00: c725bb20 00000003 00000060 00000004 00000080 
> c6f06000 00000188 c6e652b4
> [13361.340000] 7e20: 00000000 00000000 00000108 00000001 c725d020 
> c00bc1d4 00000002 00000000
> [13361.340000] 7e40: c725d020 c6d82400 c6e65000 c72f3b20 c72f40c0 
> c725d020 000000a0 c018ffd8
> [13361.340000] 7e60: c6c76180 60000013 c72f3d38 c6f06000 00000188 
> 00000047 000000a0 00000048
> [13361.340000] 7e80: 00000000 00000001 22222222 c72f3d5c 22222222 
> 22222222 22222222 22222222
> [13361.340000] 7ea0: c725d020 00000000 c725d020 c72f3b20 c72f40a8 
> c6e65000 c6f07edc 00000048
> [13361.340000] 7ec0: 00000000 c0193ffc 00000001 00000000 00000001 
> 00000001 c00ccad0 00200020
> [13361.340000] 7ee0: 00000000 00000000 00000000 00000278 22222222 
> 22222222 00000000 c725d020
> [13361.340000] 7f00: 00000000 c72f40a8 c6f07f38 c72f3b20 c6f06000 
> 00000000 00000001 c00ccb04
> [13361.340000] 7f20: c6f07f38 000093c4 00000000 c6f3a000 c72f40a8 
> c00ce958 c740ee70 c715cc78
> [13361.340000] 7f40: 0febbc64 0000000e c6f3a01d 00000000 00000000 
> c7138d20 c725d020 00000000
> [13361.340000] 7f60: 00000002 00000000 00000000 60000010 00000000 
> 00000000 00000001 c0056088
> [13361.340000] 7f80: c6f06000 0051da2c 00000000 0051d418 0051da2c 
> 00000000 0051d418 0000000a
> [13361.340000] 7fa0: c00093c4 c0009240 0051da2c 00000000 0051da2c 
> 0051da2c 00000000 b6f2576c
> [13361.340000] 7fc0: 0051da2c 00000000 0051d418 0000000a b6f24b24 
> 00000000 00000000 00000001
> [13361.340000] 7fe0: 00000000 bec32f28 b6ed9064 b6deb62c 60000010 
> 0051da2c 00000000 00000000
> [13361.340000] [<c0279070>] (prot_queue_del+0x7c/0xcc) from [<c027ac24>] 
> (ubi_wl_put_peb+0x1d8/0x2d8)
> [13361.340000] [<c027ac24>] (ubi_wl_put_peb+0x1d8/0x2d8) from 
> [<c0275a94>] (ubi_eba_unmap_leb+0x90/0xa8)
> [13361.340000] [<c0275a94>] (ubi_eba_unmap_leb+0x90/0xa8) from 
> [<c019b45c>] (ubifs_leb_unmap+0x78/0xf4)
> [13361.340000] [<c019b45c>] (ubifs_leb_unmap+0x78/0xf4) from 
> [<c01a779c>] (ubifs_garbage_collect_leb+0x23c/0x7b0)
> [13361.340000] [<c01a779c>] (ubifs_garbage_collect_leb+0x23c/0x7b0) from 
> [<c01a7e30>] (ubifs_garbage_collect+0x120/0x3b8)
> [13361.340000] [<c01a7e30>] (ubifs_garbage_collect+0x120/0x3b8) from 
> [<c018df00>] (make_reservation+0x110/0x428)
> [13361.340000] [<c018df00>] (make_reservation+0x110/0x428) from 
> [<c018ffd8>] (ubifs_jnl_update+0x120/0x640)
> [13361.340000] [<c018ffd8>] (ubifs_jnl_update+0x120/0x640) from 
> [<c0193ffc>] (ubifs_unlink+0x168/0x260)
> [13361.340000] [<c0193ffc>] (ubifs_unlink+0x168/0x260) from [<c00ccb04>] 
> (vfs_unlink+0x84/0xfc)
> [13361.340000] [<c00ccb04>] (vfs_unlink+0x84/0xfc) from [<c00ce958>] 
> (do_unlinkat+0xd4/0x160)
> [13361.340000] [<c00ce958>] (do_unlinkat+0xd4/0x160) from [<c0009240>] 
> (ret_fast_syscall+0x0/0x38)
> [13361.340000] Code: e59c2000 e59c1004 e59f3040 e5821004 (e5812000)
> [13361.740000] ---[ end trace 8fb6e4d7822cc150 ]---
> [57160.070000] BUG: soft lockup - CPU#0 stuck for 22s! [init:1]
> [57160.070000] irq event stamp: 1311500
> [57160.070000] hardirqs last  enabled at (1311499): [<c00654d8>] 
> rcu_sched_qs+0x8c/0xc8
> [57160.070000] hardirqs last disabled at (1311500): [<c03d15ec>] 
> _raw_spin_lock_irq+0x24/0x54
> [57160.070000] softirqs last  enabled at (1311474): [<c001ea48>] 
> __do_softirq+0x1ac/0x22c
> [57160.070000] softirqs last disabled at (1311465): [<c001ebb0>] 
> irq_exit+0x84/0x94
> [57160.070000]
> [57160.070000] Pid: 1, comm:                 init
> [57160.070000] CPU: 0    Tainted: G      D       (3.7.0-rc7-itw #12)
> [57160.070000] PC is at wait_consider_task+0x4/0xb14
> [57160.070000] LR is at do_wait+0x104/0x200
> [57160.070000] pc : [<c001ac78>]    lr : [<c001b88c>] psr: 20000013
> [57160.070000] sp : c7439f0c  ip : 00000005  fp : c7439f78
> [57160.070000] r10: c7438000  r9 : c7439f34  r8 : c7430000
> [57160.070000] r7 : c7439f60  r6 : c74300dc  r5 : 00000000  r4 : c6c728a0
> [57160.070000] r3 : c6c72984  r2 : c6c728a0  r1 : 00000000  r0 : c7439f60
> [57160.070000] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM 
> Segment user
> [57160.070000] Control: 0005317f  Table: 4779c000  DAC: 00000015
> [57160.070000] [<c000d8fc>] (unwind_backtrace+0x0/0xf4) from 
> [<c005fc94>] (watchdog_timer_fn+0x15c/0x1b0)
> [57160.070000] [<c005fc94>] (watchdog_timer_fn+0x15c/0x1b0) from 
> [<c003b24c>] (__run_hrtimer+0xd8/0x1e8)
> [57160.070000] [<c003b24c>] (__run_hrtimer+0xd8/0x1e8) from [<c003b724>] 
> (hrtimer_interrupt+0x104/0x348)
> [57160.070000] [<c003b724>] (hrtimer_interrupt+0x104/0x348) from 
> [<c0013390>] (mxs_timer_interrupt+0x20/0x30)
> [57160.070000] [<c0013390>] (mxs_timer_interrupt+0x20/0x30) from 
> [<c00603e4>] (handle_irq_event_percpu+0x70/0x24c)
> [57160.070000] [<c00603e4>] (handle_irq_event_percpu+0x70/0x24c) from 
> [<c00605fc>] (handle_irq_event+0x3c/0x5c)
> [57160.070000] [<c00605fc>] (handle_irq_event+0x3c/0x5c) from 
> [<c0062d70>] (handle_level_irq+0x90/0x120)
> [57160.070000] [<c0062d70>] (handle_level_irq+0x90/0x120) from 
> [<c0060360>] (generic_handle_irq+0x30/0x3c)
> [57160.070000] [<c0060360>] (generic_handle_irq+0x30/0x3c) from 
> [<c0009abc>] (handle_IRQ+0x30/0x84)
> [57160.070000] [<c0009abc>] (handle_IRQ+0x30/0x84) from [<c0008760>] 
> (icoll_handle_irq+0x34/0x48)
> [57160.070000] [<c0008760>] (icoll_handle_irq+0x34/0x48) from 
> [<c0008e84>] (__irq_svc+0x44/0x54)
> [57160.070000] Exception stack(0xc7439ec0 to 0xc7439f08)
> [57160.070000] 9ec0: c7439f60 00000000 c6c728a0 c6c72984 c6c728a0 
> 00000000 c74300dc c7439f60
> [57160.070000] 9ee0: c7430000 c7439f34 c7438000 c7439f78 00000005 
> c7439f0c c001b88c c001ac78
> [57160.070000] 9f00: 20000013 ffffffff
> [57160.070000] [<c0008e84>] (__irq_svc+0x44/0x54) from [<c001ac78>] 
> (wait_consider_task+0x4/0xb14)
> [57160.070000] [<c001ac78>] (wait_consider_task+0x4/0xb14) from 
> [<c7438000>] (0xc7438000)
> 
> 
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/

-- 
Best Regards,
Artem Bityutskiy
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: This is a digitally signed message part
URL: <http://lists.infradead.org/pipermail/linux-mtd/attachments/20121203/a0baf2d7/attachment-0001.sig>


More information about the linux-mtd mailing list