Clock framework deadlock with external SPI clockchip

Mike Turquette mturquette at linaro.org
Tue Sep 3 19:22:29 EDT 2013


Quoting Lars-Peter Clausen (2013-08-30 06:24:45)
> Hi,
> 
> I'm currently facing a deadlock in the common clock framework that
> unfortunately is not addressed by the reentrancy patches. I have an
> external clock chip that is controlled via SPI, so configuring the
> chip's rate, for example, requires sending an SPI message. Naturally
> the clock framework holds the prepare lock while configuring the rate.
> Communication in the SPI framework happens asynchronously: spi_sync()
> enqueues a message in the SPI master's queue and then waits using
> wait_for_completion(), and the master calls complete() once the
> transfer has finished. The SPI master runs in its own thread, in which
> it processes the messages. In this thread it also calls clk_set_rate()
> to configure the SPI transfer clock rate based on what the message
> says. The deadlock happens when we try to take the prepare_lock again:
> since the clock chip driver and the SPI master run in different
> threads, the reentrancy code does not kick in.
> 
> The basic sequence is like this:
> 
> === Clock chip driver ===        === SPI master driver ===
>  clk_prepare_lock()
>  spi_sync()
>    wait_for_completion(X)
>                                  clk_get_rate()
>                                    clk_prepare_lock() <--- DEADLOCK
>                                    clk_prepare_unlock()
>                                  ...
>                                  complete(X)
>  ...
>  clk_prepare_unlock()

Is there a truly synchronous equivalent to spi_sync(), i.e. one that
performs the transfer in the caller's context instead of handing it off
to the master's thread?

> 
> I'm wondering if you have any idea how this can be fixed. In my opinion we'd
> need a per clock mutex to address this properly.

Increasing the number of locks is a pain in the ass. Per-clock mutexes
not only have a memory cost, but nested locks in general do not play
nice with lockdep, which caps lock subclasses at 8 (often not enough
for a complex clock tree). Generating unique lock subclasses works
around the false positives, but makes for terrible, ugly code.
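
A hedged sketch of what per-clock locking with subclass annotations would
look like; the struct and helper here are hypothetical, since the real
struct clk carries no per-clock mutex:

```c
#include <linux/lockdep.h>
#include <linux/mutex.h>

/* Hypothetical per-clock lock; not how struct clk is actually laid out. */
struct clk_node {
	struct mutex lock;
	struct clk_node *parent;
};

/*
 * Walking up the tree, every level needs a distinct lockdep subclass or
 * lockdep reports a false self-deadlock.  Subclasses are capped at
 * MAX_LOCKDEP_SUBCLASSES (8), so a deep clock tree overflows the limit.
 */
static void clk_node_lock_path(struct clk_node *clk)
{
	unsigned int subclass = 0;

	for (; clk; clk = clk->parent) {
		if (WARN_ON(subclass >= MAX_LOCKDEP_SUBCLASSES))
			break;
		mutex_lock_nested(&clk->lock, subclass++);
	}
}
```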

Another idea (which I have not thought about much) is to have async clk
APIs. I think there might be many uses for async clock APIs besides
just this case. Think about a GPU driver that scales the main clock
feeding a GPU. Maybe the driver doesn't care to block on clk_set_rate(),
which might block for 100 ms if voltage scaling is involved;
clk_set_rate_async() might be preferable.
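
One possible shape for such an API, purely hypothetical since nothing
like clk_set_rate_async() exists in mainline:

```c
#include <linux/clk.h>

/*
 * Hypothetical async rate-change API.  The request is queued and the
 * caller is notified through a callback once the (possibly slow,
 * voltage-scaling) transition finishes, much like the message/complete
 * pattern used by spi_async().
 */
struct clk_rate_request_async {
	struct clk *clk;
	unsigned long target_rate;
	void (*complete)(struct clk_rate_request_async *req, int status);
	void *context;
};

int clk_set_rate_async(struct clk_rate_request_async *req);
```

With a shape like this, the caller owns the request structure and the
framework invokes ->complete() from its own context, so a GPU driver
could fire the request and keep submitting work instead of sleeping
through a voltage ramp.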

Regards,
Mike

> 
> Thanks,
> - Lars


