[Help] Deadlock problem in jffs2 garbage collection thread
吴旋
ppkwuxuan at 163.com
Tue Dec 13 01:58:21 EST 2011
I found a deadlock problem on our Linux-based machine. When it occurred, the current process was jffs2_gcd_mtd7, while some other processes also remained in the Running state. I used sysrq to get the call chain of the jffs2_gcd_mtd7 thread, which is as follows:
[52232.340000] Pid: 195, comm: jffs2_gcd_mtd7
[52232.340000] CPU: 0 Tainted: P (2.6.31 #46)
[52232.340000] pc : [<c00e5ecc>] lr : [<c00e5da4>] psr: 20000013
[52232.340000] sp : c0843ef8 ip : c0843ef8 fp : c0843f54
[52232.340000] r10: 00000000 r9 : 00000000 r8 : 00000000
[52232.340000] r7 : c0c50e00 r6 : c0c50e00 r5 : c0dbccf0 r4 : c0842000
[52232.340000] r3 : 00000001 r2 : 00000006 r1 : 0000008b r0 : c0dbccf0
[52232.340000] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
[52232.340000] Control: 0005317f Table: 80800000 DAC: 00000015
[52232.340000] Function entered at [<c001f4d0>] from [<c01500fc>]
[52232.340000] r5:20000013 r4:c02639d0
[52232.340000] Function entered at [<c01500dc>] from [<c014ff34>]
[52232.340000] Function entered at [<c014fe78>] from [<c015006c>]
[52232.340000] Function entered at [<c015003c>] from [<c01553c8>]
[52232.340000] r5:00008170 r4:c024f5b0
[52232.340000] Function entered at [<c015526c>] from [<c015579c>]
[52232.340000] r7:0000002d r6:00000000 r5:00000100 r4:c024f5b0
[52232.340000] Function entered at [<c015573c>] from [<c006574c>]
[52232.340000] r5:00000000 r4:c0d6e940
[52232.340000] Function entered at [<c0065708>] from [<c0067520>]
[52232.340000] r7:00000002 r6:0000002d r5:c0d6e940 r4:c025a0d8
[52232.340000] Function entered at [<c006744c>] from [<c001d070>]
[52232.340000] r7:00000002 r6:002d0004 r5:00000000 r4:0000002d
[52232.340000] Function entered at [<c001d000>] from [<c001dacc>]
[52232.340000] Exception stack(0xc0843eb0 to 0xc0843ef8)
[52232.340000] 3ea0: c0dbccf0 0000008b 00000006 00000001
[52232.340000] 3ec0: c0842000 c0dbccf0 c0c50e00 c0c50e00 00000000 00000000 00000000 c0843f54
[52232.340000] 3ee0: c0843ef8 c0843ef8 c00e5da4 c00e5ecc 20000013 ffffffff
[52232.340000] r5:fc400000 r4:0000001f
[52232.340000] Function entered at [<c00e5cb8>] from [<c00e7bf4>]
[52232.340000] Function entered at [<c00e7a4c>] from [<c003dc94>]
[52232.340000] r7:00000000 r6:00000000 r5:00000000 r4:00000000
From System.map, I found that the kernel thread is running in the function jffs2_garbage_collect_pass (0xc00e5cb8), at the BUG() statement shown below:
int jffs2_garbage_collect_pass(struct jffs2_sb_info *c) {
    …
    switch(ic->state) {
    …
    default:
        BUG(); // Deadlock situation
    }
    …
}
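The System.map lookup used above amounts to finding the symbol with the greatest start address that is not above the address of interest. A minimal sketch in C, assuming a tiny table that holds only the two symbol addresses quoted in this mail (the names struct sym, symtab and lookup_symbol are hypothetical, and a real System.map has thousands of entries):

```c
#include <stddef.h>
#include <stdint.h>

/* One System.map entry: start address and symbol name. */
struct sym {
    uint32_t addr;
    const char *name;
};

/* Illustrative table: only the two addresses quoted in this mail, sorted by
 * ascending address. With so few entries, any address above the last symbol
 * also resolves to it, which a full System.map would not do. */
static const struct sym symtab[] = {
    { 0xc00df720u, "jffs2_do_clear_inode" },
    { 0xc00e5cb8u, "jffs2_garbage_collect_pass" },
};

/* Return the name of the symbol with the greatest start address <= pc,
 * or NULL if pc lies below every symbol in the table. */
static const char *lookup_symbol(uint32_t pc)
{
    const char *best = NULL;
    size_t i;

    for (i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++) {
        if (symtab[i].addr <= pc)
            best = symtab[i].name;
    }
    return best;
}
```

With this table, lookup_symbol(0xc00e5ecc) (the pc from the first trace) resolves to jffs2_garbage_collect_pass, and lookup_symbol(0xc00df7ac) (a return address from the second trace) resolves to jffs2_do_clear_inode.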
From the register dump, I found that the value of ic->state is 6, which corresponds to the macro INO_STATE_CLEARING. I also checked the other running processes through sysrq and found one that is calling the function jffs2_do_clear_inode (0xc00df720), as follows:
[51840.550000] SysManage R running 0 263 193 0x00000000
[51840.550000] Backtrace:
[51840.550000] Function entered at [<c0204224>] from [<c0204728>]
[51840.550000] Function entered at [<c02046d0>] from [<c001db20>]
[51840.550000] r5:fc400000 r4:0000001f
[51840.550000] Function entered at [<c00dd444>] from [<c00df7ac>]
[51840.550000] r7:00000003 r6:c0c50e00 r5:c05bfa80 r4:00000000
[51840.550000] Function entered at [<c00df720>] from [<c00e87a4>]
[51840.550000] r7:00000003 r6:c093e000 r5:c05bfab0 r4:c05bfaa8
[51840.550000] Function entered at [<c00e8788>] from [<c00a2c80>]
[51840.550000] Function entered at [<c00a2c04>] from [<c00a2e74>]
[51840.550000] r5:c05bfab0 r4:c05bfaa8
[51840.550000] Function entered at [<c00a2e0c>] from [<c00a3178>]
[51840.550000] r9:000008ae r8:00000080 r7:00000080 r6:00000080 r5:c093e000
[51840.550000] r4:c05bf568
[51840.550000] Function entered at [<c00a2f58>] from [<c0076d64>]
[51840.550000] r8:00000000 r7:00000064 r6:c0261a30 r5:00000064 r4:00000080
[51840.550000] Function entered at [<c0076c78>] from [<c00774ec>]
[51840.550000] Function entered at [<c0077324>] from [<c00704a0>]
[51840.550000] Function entered at [<c007017c>] from [<c007e630>]
[51840.550000] Function entered at [<c007df00>] from [<c0023b64>]
[51840.550000] Function entered at [<c0023a6c>] from [<c001d298>]
[51840.550000] Function entered at [<c001d260>] from [<c001de40>]
[51840.550000] Exception stack(0xc093ffb0 to 0xc093fff8)
[51840.550000] ffa0: 40175748 00000000 00000000 40024e00
[51840.550000] ffc0: 00000001 40021e00 4001f5a0 00000001 40025000 40026000 40021960 00011240
[51840.550000] ffe0: 40025000 be8d0b88 4000dc2c 4000dbb8 60000010 ffffffff
In the function jffs2_do_clear_inode, the state of the inocache is set to INO_STATE_CLEARING, as follows:
void jffs2_do_clear_inode(struct jffs2_sb_info *c, struct jffs2_inode_info *f) {
    …
    if (f->inocache && f->inocache->state != INO_STATE_CHECKING)
        jffs2_set_inocache_state(c, f->inocache, INO_STATE_CLEARING);
    …
}
I think the deadlock occurs when jffs2_do_clear_inode and jffs2_garbage_collect_pass are called concurrently in different processes. jffs2_do_clear_inode sets the state to INO_STATE_CLEARING; jffs2_garbage_collect_pass then checks ic->state, finds no case matching that value, and reaches the BUG() statement in the default branch.
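The suspected interleaving can be modelled in user space. The following is only a sketch under stated assumptions, not the kernel code: the value 6 for INO_STATE_CLEARING comes from the register dump, the other state values are assumed to match fs/jffs2/nodelist.h, and which states the switch handles explicitly is an assumption because those cases are elided above. The names inocache_model, model_clear_inode and model_gc_hits_bug are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* INO_STATE_CLEARING == 6 matches the register dump in this mail.
 * The remaining values are assumed to follow fs/jffs2/nodelist.h. */
enum ino_state {
    INO_STATE_UNCHECKED     = 0,
    INO_STATE_CHECKING      = 1,
    INO_STATE_PRESENT       = 2,
    INO_STATE_CHECKEDABSENT = 3,
    INO_STATE_GC            = 4,
    INO_STATE_READING       = 5,
    INO_STATE_CLEARING      = 6
};

/* Hypothetical stand-in for the inode cache entry; only the state matters. */
struct inocache_model {
    int state;
};

/* Models the state change quoted from jffs2_do_clear_inode(): the entry is
 * marked CLEARING unless it is currently being checked. */
static void model_clear_inode(struct inocache_model *ic)
{
    if (ic != NULL && ic->state != INO_STATE_CHECKING)
        ic->state = INO_STATE_CLEARING;
}

/* Models the switch quoted from jffs2_garbage_collect_pass(): returns true
 * when the state would fall into the default branch, i.e. BUG().
 * ASSUMPTION: states 0..5 are covered by the elided explicit cases. */
static bool model_gc_hits_bug(const struct inocache_model *ic)
{
    switch (ic->state) {
    case INO_STATE_UNCHECKED:
    case INO_STATE_CHECKING:
    case INO_STATE_PRESENT:
    case INO_STATE_CHECKEDABSENT:
    case INO_STATE_GC:
    case INO_STATE_READING:
        return false;
    default:
        return true;   /* no explicit case: the BUG() path */
    }
}
```

Calling model_clear_inode() on an entry in INO_STATE_PRESENT and then model_gc_hits_bug() on the same entry reproduces one possible ordering of the two code paths: the state becomes 6 and the model reports the default branch. It does not model locking, so it cannot show whether the real code paths can actually interleave this way.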
The platform runs Linux 2.6.31. On the jffs2 side, I have applied the following 3 patches (the jffs2 source code from my system is attached to this mail):
(1) 2009-11-30 jffs2: Fix memory corruption in jffs2_read_inode_range()
(2) 2009-12-16 jffs2: Fix long-standing bug with symlink garbage collection.
(3) [Bug 15572] Bug on JFFS2: some nodes are written back with old size (https://bugzilla.kernel.org/show_bug.cgi?id=15572)
The System.map file is also attached to this mail.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: System.zip
Type: application/x-zip-compressed
Size: 190044 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-mtd/attachments/20111213/cadeadee/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: jffs2.zip
Type: application/x-zip-compressed
Size: 158085 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-mtd/attachments/20111213/cadeadee/attachment-0003.bin>