[JFFS2] The patch "jffs2: Fix lock acquisition order bug in jffs2_write_begin" introduces another dead lock bug.
Thomas.Betker at rohde-schwarz.com
Fri Aug 23 17:42:49 EDT 2013
Hello Chao:
> Yes, my kernel actually did hit this deadlock.
>
> My kernel's version is 2.6.32. After I merged the patch into my
> kernel, I ran my own filesystem test suite "fstess" on the kernel.
> About 9 hours later, I found that the kernel didn't respond to any
> file operation at all. So I used the magic SysRq key to show the state
> of all tasks. Finally I found these two threads' stack traces in the
> SysRq output, as below, which prove the deadlock:
When creating the patch, I was assuming that jffs2_write_begin() and
jffs2_write_end() could not be called at the same time (for the same
page). Obviously, I was wrong.
Also taking into account the report by Ming Liu, I have come to the
conclusion that lock_page() must be called _before_ c->alloc_sem or f->sem
is acquired. In other words, my patch was a mistake; it's not
jffs2_write_begin() that should be fixed, but
jffs2_garbage_collect_live().
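The ordering rule above can be illustrated with a small user-space sketch.
This is not JFFS2 code: the two pthread mutexes are hypothetical stand-ins
for the page lock and f->sem, and write_path() and gc_path() are invented
names that only model the two kernel paths. The point is that when both
paths take the page lock first and the second lock afterwards, neither can
hold one lock while waiting for the other in the opposite order.

```c
/* User-space sketch of a consistent lock order; NOT kernel code. */
#include <assert.h>
#include <pthread.h>

#define ITERATIONS 100000

/* Hypothetical stand-ins: page_lock models the page lock, f_sem models f->sem. */
static pthread_mutex_t page_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t f_sem = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;

/* Every caller uses the same order: page lock first, then f_sem. */
static void locked_increment(void)
{
    pthread_mutex_lock(&page_lock);
    pthread_mutex_lock(&f_sem);
    shared_counter++;
    pthread_mutex_unlock(&f_sem);
    pthread_mutex_unlock(&page_lock);
}

/* Models the write path (assumed name, for illustration only). */
static void *write_path(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++)
        locked_increment();
    return NULL;
}

/* Models the garbage collector path (assumed name, for illustration only). */
static void *gc_path(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++)
        locked_increment();
    return NULL;
}

/* Runs both paths concurrently and returns the final counter value. */
long run_lock_order_demo(void)
{
    pthread_t writer, gc;
    int rc;

    shared_counter = 0;
    rc = pthread_create(&writer, NULL, write_path, NULL);
    assert(rc == 0);
    rc = pthread_create(&gc, NULL, gc_path, NULL);
    assert(rc == 0);
    pthread_join(writer, NULL);
    pthread_join(gc, NULL);
    return shared_counter;
}
```

Calling run_lock_order_demo() from a main() (built with -pthread) should
return 2 * ITERATIONS. If one of the two paths were changed to take f_sem
before page_lock, the two threads could each hold one lock and wait forever
for the other, which is the kind of inversion discussed in this thread.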
The short way out is to revert my patch, which is fine by me. However,
this still leaves us with the original issue, i.e., the deadlock between
jffs2_garbage_collect_live() and jffs2_write_begin(). The difficulty is
that jffs2_garbage_collect_live() grabs f->sem right at the beginning, while
lock_page() or __set_page_locked() is called several layers deep within
the code. So I don't think this is an easy job.
Unfortunately, I won't be in the office for the next three weeks, so there
isn't much I can do at the moment. If you are going to provide a fix, I
will gladly test it when I am back.
Best regards,
Thomas