a UBIFS image makes task pdflush blocked > 120 seconds
Norbert van Bolhuis
nvbolhuis at aimvalley.nl
Fri Nov 13 10:09:31 EST 2009
>
> Try to see deeper at what the other task is doing, and store process ids
> as well.
>
> HTH.
>
First of all, thanks for your help with this.
Please forget about "M=(t= ,xy= )"; it's an internal log code which is not interesting.
OK, so now I understand commit_sem is an RW semaphore which can be held by multiple
readers at once but by only one writer at a time. I failed to see that initially.
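For my own reference, the semantics boil down to this (a minimal sketch of the
kernel rwsem API as I now understand UBIFS uses it; not the actual fs/ubifs code):

  #include <linux/rwsem.h>

  static DECLARE_RWSEM(commit_sem);

  /* Journal writers (make_reservation) take the semaphore as readers,
   * so any number of them can hold it concurrently. */
  static void writer_path(void)
  {
          down_read(&commit_sem);
          /* ... reserve space and write to a journal head ... */
          up_read(&commit_sem);          /* finish_reservation */
  }

  /* The commit takes it as a writer: it waits until all readers have
   * dropped the semaphore and blocks new readers until it is done. */
  static void commit_path(void)
  {
          down_write(&commit_sem);
          /* ... commit ... */
          up_write(&commit_sem);
  }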
I don't think this problem is related to commit_sem, though.
I changed the logging a bit and added the PID info.
The problem is indeed caused by a third task holding the "MTD chip lock". This
blocks the application, which in turn blocks pdflush.
This third task is the UBI background thread, which starts erasing many NOR PEBs
very early on.
To be absolutely certain, I also added log entries for the do_erase_oneblock
function.
To make the problem easier to reproduce I decreased the warning threshold from 120 to 60 seconds.
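(That threshold is the hung-task watchdog timeout; on a stock kernel it can be
set with e.g. "echo 60 > /proc/sys/kernel/hung_task_timeout_secs".)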
Here's the detailed info.
log entries (LE:xyz is the log entry number):
---------------------------------------------
LE:0235, TIME=3984.244, pid=0918, M(xy=00)=do_erase_oneblock, START
LE:0236, TIME=3984.836, pid=1236, M(xy=04)=commit_sem, make_reservation, Req DOWN 1 <- application creating a file
LE:0237, TIME=3984.836, pid=1236, M(xy=05)=commit_sem, make_reservation, DOWN 1
LE:0238, TIME=3984.836, pid=1236, M(xy=02)=io_mutex, reserve_space, Req LOCK 1
LE:0239, TIME=3984.836, pid=1236, M(xy=03)=io_mutex, reserve_space, LOCK 1
LE:0240, TIME=3985.100, pid=0918, M(xy=01)=do_erase_oneblock, END <- ubi_bgt0d continues erasing PEBs
LE:0241, TIME=3985.100, pid=0918, M(xy=00)=do_erase_oneblock, START
<omitted many do_erase_oneblock END/START entries>
LE:0275, TIME=3998.608, pid=0918, M(xy=00)=do_erase_oneblock, START
LE:0276, TIME=3999.256, pid=0108, M(xy=04)=commit_sem, make_reservation, Req DOWN 1
LE:0277, TIME=3999.256, pid=0108, M(xy=05)=commit_sem, make_reservation, DOWN 1
LE:0278, TIME=3999.256, pid=0108, M(xy=02)=io_mutex, reserve_space, Req LOCK 1 <- pdflush blocks here
LE:0279, TIME=3999.444, pid=0918, M(xy=01)=do_erase_oneblock, END
<omitted many do_erase_oneblock END/START entries>
LE:0433, TIME=4062.400, pid=0918, M(xy=01)=do_erase_oneblock, END
LE:0434, TIME=4062.400, pid=0918, M(xy=00)=do_erase_oneblock, START
LE:0435, TIME=4063.124, pid=0918, M(xy=01)=do_erase_oneblock, END
LE:0436, TIME=4063.124, pid=0918, M(xy=00)=do_erase_oneblock, START
LE:0437, TIME=4063.308, pid=0004, M(xy=0a)=pdflush error report
LE:0438, TIME=4066.120, pid=0918, M(xy=01)=do_erase_oneblock, END
LE:0439, TIME=4066.120, pid=0918, M(xy=00)=do_erase_oneblock, START
LE:0440, TIME=4066.876, pid=1236, M(xy=06)=io_mutex, release_head, Req UNLOCK
LE:0441, TIME=4066.876, pid=1236, M(xy=07)=io_mutex, release_head, UNLOCK <- application unlocks io_mutex
LE:0442, TIME=4066.876, pid=1236, M(xy=08)=commit_sem, finish_reservation, Req UP
LE:0443, TIME=4066.876, pid=1236, M(xy=09)=commit_sem, finish_reservation, UP
LE:0444, TIME=4066.880, pid=0108, M(xy=03)=io_mutex, reserve_space, LOCK 1 <- pdflush can continue
LE:0445, TIME=4066.880, pid=0108, M(xy=06)=io_mutex, release_head, Req UNLOCK
LE:0446, TIME=4066.880, pid=0108, M(xy=07)=io_mutex, release_head, UNLOCK
LE:0447, TIME=4066.880, pid=0108, M(xy=08)=commit_sem, finish_reservation, Req UP
LE:0448, TIME=4066.880, pid=0108, M(xy=09)=commit_sem, finish_reservation, UP
relevant processes
------------------
pdflush D 00000000 0 108 2
Call Trace:
[ceb51b50] [7f64db78] 0x7f64db78 (unreliable)
[ceb51c10] [c0008b24] 0xc0008b24 __switch_to
[ceb51c30] [c0291e40] 0xc0291e40 schedule
[ceb51c80] [c0292d60] 0xc0292d60 __mutex_lock_slowpath
[ceb51cb0] [c013d5fc] 0xc013d5fc make_reservation
[ceb51d20] [c013ddf8] 0xc013ddf8 ubifs_jnl_write_inode
[ceb51d60] [c014531c] 0xc014531c ubifs_write_inode
[ceb51d80] [c01410ec] 0xc01410ec ubifs_writepage
[ceb51db0] [c005b124] 0xc005b124 __writepage
[ceb51dc0] [c005b948] 0xc005b948 write_cache_pages
[ceb51e60] [c005babc] 0xc005babc do_writepages
[ceb51e70] [c00a3638] 0xc00a3638 __writeback_single_inode
[ceb51ec0] [c00a3bfc] 0xc00a3bfc generic_sync_sb_inodes
[ceb51f00] [c00a439c] 0xc00a439c writeback_inodes
[ceb51f20] [c005c7e0] 0xc005c7e0 wb_kupdate
[ceb51f80] [c005cfdc] 0xc005cfdc pdflush
[ceb51fd0] [c003bd88] 0xc003bd88 kthread
[ceb51ff0] [c00114a0] 0xc00114a0 kernel_thread
ubi_bgt0d R running 0 918 2
Call Trace:
[ce6e1b80] [00001032] 0x001032 (unreliable)
[ce6e1c40] [c0008b24] 0xc0008b24 __switch_to
[ce6e1c60] [c0291e40] 0xc0291e40 schedule
[ce6e1cb0] [c02926d4] 0xc02926d4 schedule_timeout
[ce6e1cf0] [c002f684] 0xc002f684 msleep
[ce6e1d00] [c01ee900] 0xc01ee900 do_erase_oneblock
[ce6e1d60] [c01e98a0] 0xc01e98a0 cfi_varsize_frob
[ce6e1da0] [c01ecdc4] 0xc01ecdc4 cfi_amdstd_erase_varsize
[ce6e1db0] [c01e30b8] 0xc01e30b8 part_erase
[ce6e1dc0] [c01e5800] 0xc01e5800 concat_erase
[ce6e1e20] [c01e30b8] 0xc01e30b8 part_erase
[ce6e1e30] [c01f8134] 0xc01f8134 do_sync_erase
[ce6e1ec0] [c01f83f0] 0xc01f83f0 ubi_io_sync_erase
[ce6e1f40] [c01f9658] 0xc01f9658 erase_worker
[ce6e1f70] [c01f8cc0] 0xc01f8cc0 do_work
[ce6e1f90] [c01fa078] 0xc01fa078 ubi_thread
[ce6e1fd0] [c003bd88] 0xc003bd88 kthread
[ce6e1ff0] [c00114a0] 0xc00114a0 kernel_thread
mainctrl D 0fbd3f00 0 1236 1215
Call Trace:
[ce7fd9f0] [00000001] 0x000001 (unreliable)
[ce7fdab0] [c0008b24] 0xc0008b24 __switch_to
[ce7fdad0] [c0291e40] 0xc0291e40 schedule
[ce7fdb20] [c01ea090] 0xc01ea090 get_chip
[ce7fdb90] [c01eb8dc] 0xc01eb8dc cfi_amdstd_write_buffers
[ce7fdc20] [c01e2da8] 0xc01e2da8 part_write
[ce7fdc30] [c01e4128] 0xc01e4128 concat_write
[ce7fdc70] [c01e2da8] 0xc01e2da8 part_write
[ce7fdc80] [c01f7854] 0xc01f7854 ubi_io_write
[ce7fdcb0] [c01f63d4] 0xc01f63d4 ubi_eba_write_leb
[ce7fdd10] [c01f546c] 0xc01f546c ubi_leb_write
[ce7fdd30] [c0148c6c] 0xc0148c6c ubifs_wbuf_write_nolock
[ce7fdd70] [c013dd0c] 0xc013dd0c write_head
[ce7fdd90] [c013e9e0] 0xc013e9e0 ubifs_jnl_update
[ce7fde00] [c0143030] 0xc0143030 ubifs_create
[ce7fde50] [c008caf8] 0xc008caf8 vfs_create
[ce7fde70] [c0090a20] 0xc0090a20 do_filp_open
[ce7fdf20] [c008155c] 0xc008155c do_sys_open
[ce7fdf40] [c001167c] 0xc001167c ret_from_syscall
relevant processes (priorities)
-------------------------------
F S UID PID SPID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
1 S root 108 108 2 0 80 0 - 0 5cf64 22:16 ? 00:00:00 [pdflush]
1 S root 918 918 2 0 75 -5 - 0 1fa2dc 22:16 ? 00:00:00 [ubi_bgt0d]
0 S root 1236 1236 1221 0 90 10 - 12617 49354 22:17 ttyS1 00:00:00 mainctrl
So the problem is that the application holding the io_mutex is blocked by the
UBI background task because of the "MTD chip lock".
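That matches the waiting path in get_chip() in the CFI driver
(drivers/mtd/chips/cfi_cmdset_0002.c). Roughly, from my reading of the code (a
simplified sketch; the real function has more states and erase-suspend handling):

  /* Simplified: a writer that finds the chip busy sleeps on the chip's
   * wait queue until the task driving the erase wakes it up. */
  static int get_chip_simplified(struct map_info *map, struct flchip *chip)
  {
          DECLARE_WAITQUEUE(wait, current);

          spin_lock(chip->mutex);
          while (chip->state != FL_READY) {
                  set_current_state(TASK_UNINTERRUPTIBLE);
                  add_wait_queue(&chip->wq, &wait);
                  spin_unlock(chip->mutex);
                  schedule();             /* <- mainctrl sits here */
                  remove_wait_queue(&chip->wq, &wait);
                  spin_lock(chip->mutex);
          }
          chip->state = FL_WRITING;
          spin_unlock(chip->mutex);
          return 0;
  }

Meanwhile do_erase_oneblock holds the chip in FL_ERASING and polls for
completion with msleep(), which is the msleep() visible in the ubi_bgt0d trace
above.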
I'm happy I understand the problem now; I guess it's not *that* serious.
It does cause unnecessary delays, however, which I'd like to get rid of.
I guess it wouldn't make sense to somehow release the per-journal-head io_mutex
whenever its holder is blocked on the "MTD chip lock"?
How can I prevent the UBI background thread from erasing so many NOR PEBs so soon?
(I hope the only option is not creating a full 156 MB UBIFS image (preprogrammed
by manufacturing) for our 156 MB NOR partition.)
Would it make sense to lower the priority of the UBI background thread?
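(For experimenting: ubi_bgt0d is an ordinary kernel thread, so its nice value
can be changed from userspace, e.g. "renice -n 10 -p 918", using the PID from
the ps output above. Though while an erase is physically in progress on the
NOR chip, CPU priority alone may not shorten that wait.)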
thanks,
Norbert.