Another JFFS2 deadlock, kernel 3.4.11

Wed Nov 11 00:16:10 PST 2015

Hello Thomas:

> This looks suspiciously like a deadlock reported by Ming Liu 
> (22-Aug-2013). This deadlock, and another one reported by Deng Chao 
> (23-Jul-2013), were introduced by my patch, "jffs2: Fix lock acquisition 
> order bug in jffs2_write_begin".

> Deng Chao has created a patch which a) removes the deadlock I wanted to 
> get rid of originally, without b) introducing the new deadlocks; see 
> http://lists.infradead.org/pipermail/linux-mtd/2013-August/048352.html. 
> However, his patch modifies mm/filemap.c, and we were hoping to find a 
> more light-weight solution -- which never came to be.

> I do use his patch here around, though, and so far, it has worked fine. I 
> will try to run your test scripts on one of our devices, and see if it 
> holds up.

I did not know that Deng Chao and Ming Liu had already reported this issue,
but I had the same idea for a patch.

Yes, the deadlocks we have found always occur between the GC thread (which
may be triggered by the sync system call) and other user tasks.

The GC thread's call path looks like this:
> for [sync_supers]
> jffs2_garbage_collect_live
>     mutex_lock(&f->sem)                         (A)
>     jffs2_garbage_collect_dnode
>         jffs2_gc_fetch_page
>             read_cache_page_async
>                 do_read_cache_page
>                     lock_page(page)             (B)

If we change lock_page(page) above to a non-blocking attempt such as
trylock_page(page), the deadlock goes away.
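
To make the idea concrete, here is a small user-space sketch using pthread
mutexes. It is not kernel code and not a proposed patch. inode_sem and
page_lock are hypothetical stand-ins for f->sem (A) and the page lock (B),
and gc_collect_once() and gc_collect_retry() are names made up for this
illustration. The GC side takes (A), only tries (B), and backs off with
-EAGAIN instead of blocking while still holding (A).

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical stand-ins: inode_sem ~ f->sem (A), page_lock ~ page lock (B). */
static pthread_mutex_t inode_sem = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t page_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * GC-side path: take (A), then only *try* (B).
 * Returns 0 if the page was processed, -EAGAIN if (B) was busy.
 * Either way (A) is released before returning, so a task that holds (B)
 * and is waiting for (A) can make progress.
 */
static int gc_collect_once(void)
{
    int ret = 0;
    int err;

    pthread_mutex_lock(&inode_sem);            /* (A) */
    err = pthread_mutex_trylock(&page_lock);   /* (B), non-blocking */
    if (err == EBUSY) {
        ret = -EAGAIN;                         /* back off, do not wait */
    } else {
        assert(err == 0);
        /* ... the GC work on the page would happen here ... */
        pthread_mutex_unlock(&page_lock);
    }
    pthread_mutex_unlock(&inode_sem);
    return ret;
}

/*
 * Caller-side retry: the error only helps if somebody tries again later.
 * Gives up after max_tries attempts and returns the last result.
 */
static int gc_collect_retry(int max_tries)
{
    int ret = -EAGAIN;

    for (int i = 0; i < max_tries && ret == -EAGAIN; i++) {
        ret = gc_collect_once();
        if (ret == -EAGAIN)
            sched_yield();                     /* let the lock holder run */
    }
    return ret;
}
```

In this sketch the caller has to decide what -EAGAIN means: retry, skip the
node, or report failure. That is exactly the part I am unsure about for the
real jffs2 code.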

But I am worried about this workaround. It changes the behavior of
jffs2_garbage_collect_live, and that function is called not only by the GC
thread. Is it OK to return an error rather than block?
Can the 'sync' syscall still reach its goal?

wangzaiwei