[Help] Deadlock problem in jffs2 garbage collection thread

吴旋 ppkwuxuan at 163.com
Tue Dec 13 01:58:21 EST 2011


I have found a deadlock problem on our Linux-based machine. When it happens, the current process is jffs2_gcd_mtd7, while some other processes also remain in the Running state. I used sysrq to get the call chain of the thread jffs2_gcd_mtd7, which is as follows:
 [52232.340000] Pid: 195, comm:       jffs2_gcd_mtd7

[52232.340000] CPU: 0    Tainted: P            (2.6.31 #46)

[52232.340000] pc : [<c00e5ecc>]    lr : [<c00e5da4>]    psr: 20000013

[52232.340000] sp : c0843ef8  ip : c0843ef8  fp : c0843f54

[52232.340000] r10: 00000000  r9 : 00000000  r8 : 00000000

[52232.340000] r7 : c0c50e00  r6 : c0c50e00  r5 : c0dbccf0  r4 : c0842000

[52232.340000] r3 : 00000001  r2 : 00000006  r1 : 0000008b  r0 : c0dbccf0

[52232.340000] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user

[52232.340000] Control: 0005317f  Table: 80800000  DAC: 00000015

[52232.340000] Function entered at [<c001f4d0>] from [<c01500fc>]

[52232.340000]  r5:20000013 r4:c02639d0

[52232.340000] Function entered at [<c01500dc>] from [<c014ff34>]

[52232.340000] Function entered at [<c014fe78>] from [<c015006c>]

[52232.340000] Function entered at [<c015003c>] from [<c01553c8>]

[52232.340000]  r5:00008170 r4:c024f5b0

[52232.340000] Function entered at [<c015526c>] from [<c015579c>]

[52232.340000]  r7:0000002d r6:00000000 r5:00000100 r4:c024f5b0

[52232.340000] Function entered at [<c015573c>] from [<c006574c>]

[52232.340000]  r5:00000000 r4:c0d6e940

[52232.340000] Function entered at [<c0065708>] from [<c0067520>]

[52232.340000]  r7:00000002 r6:0000002d r5:c0d6e940 r4:c025a0d8

[52232.340000] Function entered at [<c006744c>] from [<c001d070>]

[52232.340000]  r7:00000002 r6:002d0004 r5:00000000 r4:0000002d

[52232.340000] Function entered at [<c001d000>] from [<c001dacc>]

[52232.340000] Exception stack(0xc0843eb0 to 0xc0843ef8)

[52232.340000] 3ea0:                                     c0dbccf0 0000008b 00000006 00000001 

[52232.340000] 3ec0: c0842000 c0dbccf0 c0c50e00 c0c50e00 00000000 00000000 00000000 c0843f54 

[52232.340000] 3ee0: c0843ef8 c0843ef8 c00e5da4 c00e5ecc 20000013 ffffffff                   

[52232.340000]  r5:fc400000 r4:0000001f

[52232.340000] Function entered at [<c00e5cb8>] from [<c00e7bf4>]

[52232.340000] Function entered at [<c00e7a4c>] from [<c003dc94>]

[52232.340000]  r7:00000000 r6:00000000 r5:00000000 r4:00000000

         From System.map, I found that the kernel thread is running in the function jffs2_garbage_collect_pass (0xc00e5cb8), at the BUG() call shown below:
int jffs2_garbage_collect_pass(struct jffs2_sb_info *c) {
         …
         switch (ic->state) {
         …
         default:
                  BUG();      /* Deadlock situation */
         }
         …
}

         From the register dump, I found that the value of ic->state is 6, which corresponds to the macro INO_STATE_CLEARING. I also checked the other running processes through sysrq and found one that is calling the function jffs2_do_clear_inode (0xc00df720), as follows:

[51840.550000] SysManage     R running      0   263    193 0x00000000

[51840.550000] Backtrace: 

[51840.550000] Function entered at [<c0204224>] from [<c0204728>]

[51840.550000] Function entered at [<c02046d0>] from [<c001db20>]

[51840.550000]  r5:fc400000 r4:0000001f

[51840.550000] Function entered at [<c00dd444>] from [<c00df7ac>]

[51840.550000]  r7:00000003 r6:c0c50e00 r5:c05bfa80 r4:00000000

[51840.550000] Function entered at [<c00df720>] from [<c00e87a4>]    

[51840.550000]  r7:00000003 r6:c093e000 r5:c05bfab0 r4:c05bfaa8

[51840.550000] Function entered at [<c00e8788>] from [<c00a2c80>]

[51840.550000] Function entered at [<c00a2c04>] from [<c00a2e74>]

[51840.550000]  r5:c05bfab0 r4:c05bfaa8

[51840.550000] Function entered at [<c00a2e0c>] from [<c00a3178>]

[51840.550000]  r9:000008ae r8:00000080 r7:00000080 r6:00000080 r5:c093e000

[51840.550000] r4:c05bf568

[51840.550000] Function entered at [<c00a2f58>] from [<c0076d64>]

[51840.550000]  r8:00000000 r7:00000064 r6:c0261a30 r5:00000064 r4:00000080

[51840.550000] Function entered at [<c0076c78>] from [<c00774ec>]

[51840.550000] Function entered at [<c0077324>] from [<c00704a0>]

[51840.550000] Function entered at [<c007017c>] from [<c007e630>]

[51840.550000] Function entered at [<c007df00>] from [<c0023b64>]

[51840.550000] Function entered at [<c0023a6c>] from [<c001d298>]

[51840.550000] Function entered at [<c001d260>] from [<c001de40>]

[51840.550000] Exception stack(0xc093ffb0 to 0xc093fff8)

[51840.550000] ffa0:                                     40175748 00000000 00000000 40024e00 

[51840.550000] ffc0: 00000001 40021e00 4001f5a0 00000001 40025000 40026000 40021960 00011240 

[51840.550000] ffe0: 40025000 be8d0b88 4000dc2c 4000dbb8 60000010 ffffffff      
                   
         In the function jffs2_do_clear_inode, the state of the inocache is set to INO_STATE_CLEARING, as follows:
void jffs2_do_clear_inode(struct jffs2_sb_info *c, struct jffs2_inode_info *f) {
         …
         if (f->inocache && f->inocache->state != INO_STATE_CHECKING)
                  jffs2_set_inocache_state(c, f->inocache, INO_STATE_CLEARING);
         …
}
         I think the deadlock occurs when jffs2_do_clear_inode and jffs2_garbage_collect_pass are called concurrently in different processes. jffs2_do_clear_inode sets the state to INO_STATE_CLEARING; jffs2_garbage_collect_pass then checks the value of ic->state, finds no case for it, and reaches the BUG() statement.

         As for the platform, I use Linux 2.6.31. For the jffs2 part, I have applied the following 3 patches (the jffs2 source code from my system is attached to this mail):
(1) 2009-11-30 jffs2: Fix memory corruption in jffs2_read_inode_range()
(2) 2009-12-16 jffs2: Fix long-standing bug with symlink garbage collection.
(3) [Bug 15572] Bug on JFFS2: some nodes are written back with old size (https://bugzilla.kernel.org/show_bug.cgi?id=15572)
         The System.map file is also attached to this mail.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: System.zip
Type: application/x-zip-compressed
Size: 190044 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-mtd/attachments/20111213/cadeadee/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: jffs2.zip
Type: application/x-zip-compressed
Size: 158085 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-mtd/attachments/20111213/cadeadee/attachment-0003.bin>

