UBI leb_write_unlock NULL pointer Oops (continuation)

Bill Pringlemeir bpringlemeir at nbsps.com
Mon Feb 24 10:45:50 EST 2014


>> On 22 Feb 2014, richard at nod.at wrote:

>>> Hmm, I'm not sure whether I was able to follow your thought.  But
>>> leb_write_unlock() is balanced with leb_write_trylock() in
>>> ubi_eba_copy_leb() which makes perfect sense to me.  What exactly is
>>> the problem?

> On 24.02.2014 16:09, Bill Pringlemeir wrote:

>> There are two things that must be balanced.  The 'reference count'
>> ubi_ltree_entry -> users and the rw_semaphore down/up.  You are right,
>> the trylock needs to be balanced by the 'leb_write_unlock'.  However,
>> the 'leb_write_trylock()' has already decremented 'users' in preparation
>> to move the 'lnum'.  However, in the case of contention,
>> 'ubi_eba_copy_leb' bails and does the 'leb_write_unlock()', which
>> balances the 'trylock', but unbalances the 'users' reference count (I
>> added some comments on the lines).

On 24 Feb 2014, richard at nod.at wrote:

> My first thought here is "If this is true, why does
> ubi_assert(le->users >= 0) not trigger"?

A call to 'ltree_add_entry()' may add a completely new entry for the
'lnum'.  A context switch may happen at any point where the spinlock
is not held.

Here is ubi_eba_copy_leb(), showing only the mutex and reference count
operations.

leb_write_trylock -> ltree_add_entry(ubi, vol_id, lnum) creates a new
                        entry or finds an old one.
/* could reschedule here... */
leb_write_trylock -> down_write_trylock; we have the write rwsem.
/* could reschedule here... */
leb_write_trylock -> get spin lock and decrement 'users'.
/* could reschedule here... */
on 'if (vol->eba_tbl[lnum] != from)' another thread has this
                        'ltree_entry' so count is >1.
/* could reschedule here... */
call leb_write_unlock() and destroy the in-use ltree_entry.
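
The hazard in the sequence above can be shown with a toy user-space
model.  This is only a sketch: 'toy_entry', 'toy_get', 'toy_put' and
'toy_double_put' are hypothetical names, there is no locking or
scheduling, and it is not the kernel code.  It assumes, as the sequence
above does, that one path drops its reference twice while a second user
still holds the entry.  A 'freed' flag stands in for kfree() so that the
result can be inspected safely.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a lock-tree entry; not the kernel structure. */
struct toy_entry {
	int users;   /* reference count */
	bool freed;  /* set instead of freeing memory, so it can be checked */
};

/* Take a reference on the entry. */
static void toy_get(struct toy_entry *e)
{
	e->users += 1;
}

/* Drop a reference; the entry is torn down when the count reaches zero. */
static void toy_put(struct toy_entry *e)
{
	e->users -= 1;
	if (e->users == 0)
		e->freed = true;
}

/*
 * Replays the sequence sequentially.  Returns true if the entry was
 * torn down while user B still held its reference.
 */
bool toy_double_put(void)
{
	struct toy_entry e = { 0, false };

	toy_get(&e);  /* user A takes a reference (trylock path) */
	toy_get(&e);  /* user B takes a reference on the same entry */
	toy_put(&e);  /* user A: first decrement */
	toy_put(&e);  /* user A: second decrement (unlock path) */

	return e.freed;  /* user B never dropped its reference */
}
```

In this model the count never goes negative, so a check such as
'users >= 0' would not catch the problem; the entry simply reaches zero
one release too early.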

Anyone calling 'ltree_add_entry' may create a new entry.  Also, as the
entry has been freed, the memory will be recycled and 'users' could be
anything in a freed node.  If this is related to the problem that
Thorsten and others have seen, it is puzzling that the 'assert' never
fires.  However, this path seems to violate the reference count by
decrementing it twice.  I am pretty sure it is an issue, although it may
be unrelated and latent (never triggered).  Still, some of the same
'suspects' are involved, so I think it is a possibility worth exploring.

Fwiw,
Bill Pringlemeir.




