[PATCH 02/28] crypto: omap-sham: Don't idle/start SHA device between Encrypt operations

Grygorii Strashko grygorii.strashko at ti.com
Tue Jun 7 05:24:43 PDT 2016


On 06/07/2016 02:52 PM, Tero Kristo wrote:
> On 07/06/16 13:08, Herbert Xu wrote:
>> On Wed, Jun 01, 2016 at 06:03:52PM -0500, Dave Gerlach wrote:
>>> On 06/01/2016 04:53 AM, Grygorii Strashko wrote:
>>>> On 06/01/2016 11:56 AM, Tero Kristo wrote:
>>>>> From: Lokesh Vutla <lokeshvutla at ti.com>
>>>>>
>>>>> Calling the runtime PM API for every block causes a serious perf hit
>>>>> to crypto operations done on a long buffer. As crypto is performed on
>>>>> page boundaries, encrypting a large buffer turns into a series of
>>>>> per-page crypto operations, and the runtime PM API ends up being
>>>>> called that many times as well.
>>>>>
>>>>> We now call pm_runtime_get_sync only at the beginning of the session
>>>>> (cra_init) and pm_runtime_put at the end. This results in up to a 50%
>>>>> speedup. It doesn't make the driver keep the system awake, as runtime
>>>>> get/put is only called during a crypto session, which usually
>>>>> completes quickly.
>>>>>
>>>>> Signed-off-by: Lokesh Vutla <lokeshvutla at ti.com>
>>>>> Signed-off-by: Tero Kristo <t-kristo at ti.com>
>>>>> ---
>>>>>   drivers/crypto/omap-sham.c | 27 +++++++++++++++++----------
>>>>>   1 file changed, 17 insertions(+), 10 deletions(-)
>>>>>
>>>>> diff --git a/drivers/crypto/omap-sham.c b/drivers/crypto/omap-sham.c
>>>>> index 6eefaa2..bd0258f 100644
>>>>> --- a/drivers/crypto/omap-sham.c
>>>>> +++ b/drivers/crypto/omap-sham.c
>>>>> @@ -360,14 +360,6 @@ static void omap_sham_copy_ready_hash(struct ahash_request *req)
>>>>>
>>>>>   static int omap_sham_hw_init(struct omap_sham_dev *dd)
>>>>>   {
>>>>> -    int err;
>>>>> -
>>>>> -    err = pm_runtime_get_sync(dd->dev);
>>>>> -    if (err < 0) {
>>>>> -        dev_err(dd->dev, "failed to get sync: %d\n", err);
>>>>> -        return err;
>>>>> -    }
>>>>> -
>>>
>>> Would it be worth investigating a pm_runtime autosuspend approach
>>> rather than knocking runtime PM out here completely? I'm not clear
>>> whether the overhead comes from the pm_runtime calls themselves or
>>> from the actual idling of the IP, but if it's the idling of the IP
>>> that causes the slowdown, then with a large enough autosuspend_delay
>>> we wouldn't actually sleep between blocks, yet would still suspend
>>> after a long enough period of idle time.
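
For reference, a rough sketch of what the autosuspend variant could look
like (the 100 ms delay is only a placeholder and would have to be
measured; dev/dd->dev are the driver's device, as in the existing code):

	/* in probe(), when setting up runtime PM: */
	pm_runtime_use_autosuspend(dev);
	pm_runtime_set_autosuspend_delay(dev, 100);	/* placeholder value */
	pm_runtime_enable(dev);

	/* per request, instead of one get/put per whole session: */
	int err;

	err = pm_runtime_get_sync(dd->dev);
	if (err < 0) {
		pm_runtime_put_noidle(dd->dev);
		return err;
	}

	/* ... program the SHA engine and run the job ... */

	pm_runtime_mark_last_busy(dd->dev);
	pm_runtime_put_autosuspend(dd->dev);

With that, the IP only actually idles once no request has arrived for the
delay period, so long buffers would not pay the per-block PM cost while
the device still suspends between sessions.
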
>>
>> Indeed, I think this patch is bogus.  cra_init is associated
>> with the tfm object which is usually long-lived.  So doing power
>> management there makes no sense.
>>
>> Cheers,
>>
>
> I can investigate this further, but I believe this patch itself gave a
> noticeable performance boost.
>
> This is an optimization anyway, and not critical for functionality.
>

It is only non-critical as long as the code below does not introduce races:
+	spin_lock_bh(&sham.lock);
+	list_for_each_entry(dd, &sham.dev_list, list) {
+		break;
+	}
+	spin_unlock_bh(&sham.lock);

Is it guaranteed that dd will always be alive at this point?

+
+	pm_runtime_get_sync(dd->dev);
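
Just as an illustration, and not meant as the actual fix: a rough sketch
of how the lookup could at least fail gracefully when the list is empty
(omap_sham_find_dev is a made-up helper name; sham, struct omap_sham_dev
and dev_list are the driver's existing objects):

	/*
	 * Hypothetical helper, not part of this patch: return the first
	 * registered device, or NULL if none is registered, instead of
	 * dereferencing whatever dd happens to point at.
	 */
	static struct omap_sham_dev *omap_sham_find_dev(void)
	{
		struct omap_sham_dev *dd = NULL;

		spin_lock_bh(&sham.lock);
		if (!list_empty(&sham.dev_list))
			dd = list_first_entry(&sham.dev_list,
					      struct omap_sham_dev, list);
		spin_unlock_bh(&sham.lock);

		return dd;
	}

	/* caller, e.g. in cra_init: */
	dd = omap_sham_find_dev();
	if (!dd)
		return -ENODEV;
	pm_runtime_get_sync(dd->dev);

Even then, nothing pins the device while a long-lived tfm holds it, so
the lifetime question above still needs an answer.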



-- 
regards,
-grygorii


