[PATCH v3][for 4.15] dmaengine: dmatest: move callback wait queue to thread context

Adam Wallis awallis at codeaurora.org
Fri Nov 17 09:28:37 PST 2017


On 11/17/2017 12:01 PM, Adam Wallis wrote:
> On 11/17/2017 10:57 AM, Dan Williams wrote:
>> On Fri, Nov 17, 2017 at 7:28 AM, Adam Wallis <awallis at codeaurora.org> wrote:
>>> On 11/17/2017 10:12 AM, Dan Williams wrote:
>>>> On Fri, Nov 17, 2017 at 6:11 AM, Adam Wallis <awallis at codeaurora.org> wrote:
>>>>> Commit adfa543e7314 ("dmatest: don't use set_freezable_with_signal()")
>>>>> introduced a bug (that is in fact documented by the patch commit text)
>>>>> that leaves behind a dangling pointer. Since the done_wait structure is
>>>>> allocated on the stack, future invocations to the DMATEST can produce
>>>>> undesirable results (e.g., corrupted spinlocks).
>>>>>
>>>>> Commit a9df21e34b42 ("dmaengine: dmatest: warn user when dma test times
>>>>> out") attempted to WARN the user that the stack was likely corrupted but
>>>>> did not fix the actual issue.
>>>>>
>>>>> This patch fixes the issue by pushing the wait queue and callback
>>>>> structs into the the thread structure. If a failure occurs due to time,
>>>>> dmaengine_terminate_all will force the callback to safely call
>>>>> wake_up_all() without possibility of using a freed pointer.
>>>>>
>>>>> Cc: stable at vger.kernel.org # 4.13.x: a9df21e: dmatest: Warn User
>>>>> Cc: stable at vger.kernel.org # 4.13.x
>>>>> Cc: stable at vger.kernel.org # 4.14.x
>>>>
>>>
>>> Sure - do you want me to remove them? I was just following the instructions on
>>> stable.
>>
>> It's not broken, just a note for next time.
>>
>>>
>>>> You don't need 3 cc stables, you don't even need the "#
>>>> kernel-version". Since you have the "Fixes:" line the target kernel(s)
>>>> for the backport can be auto-determined. I should go update
>>>> Documentation/process/stable-kernel-rules.rst to mention this.
>>>>
>>>>> Bug: https://bugzilla.kernel.org/show_bug.cgi?id=197605
>>>>> Fixes: adfa543e7314 ("dmatest: don't use set_freezable_with_signal()")
>>>>> Reviewed-by: Sinan Kaya <okaya at codeaurora.org>
>>>>> Suggested-by: Shunyong Yang <shunyong.yang at hxt-semitech.com>
>>>>> Signed-off-by: Adam Wallis <awallis at codeaurora.org>
>>>>> ---
>>>>> changes from v2: Added "Fixes" tag
>>>>> changes from v1: Added pre-req patches for stable
>>>>>
>>>>>  drivers/dma/dmatest.c | 37 ++++++++++++++++---------------------
>>>>>  1 file changed, 16 insertions(+), 21 deletions(-)
>>>>>
>>>>> diff --git a/drivers/dma/dmatest.c b/drivers/dma/dmatest.c
>>>>> index 47edc7f..2573b6c 100644
>>>>> --- a/drivers/dma/dmatest.c
>>>>> +++ b/drivers/dma/dmatest.c
>>>>> @@ -155,6 +155,12 @@ struct dmatest_params {
>>>>>  #define PATTERN_COUNT_MASK     0x1f
>>>>>  #define PATTERN_MEMSET_IDX     0x01
>>>>>
>>>>> +/* poor man's completion - we want to use wait_event_freezable() on it */
>>>>> +struct dmatest_done {
>>>>> +       bool                    done;
>>>>> +       wait_queue_head_t       *wait;
>>>>> +};
>>>>> +
>>>>>  struct dmatest_thread {
>>>>>         struct list_head        node;
>>>>>         struct dmatest_info     *info;
>>>>> @@ -165,6 +171,8 @@ struct dmatest_thread {
>>>>>         u8                      **dsts;
>>>>>         u8                      **udsts;
>>>>>         enum dma_transaction_type type;
>>>>> +       wait_queue_head_t done_wait;
>>>>
>>>> Why are we defining a waitquehead per thread vs defining one globally
>>>> for the whole module with "static DECLARE_WAIT_QUEUE_HEAD(x);"?
>>>
>>> This is how the original dmatest functions. Each thread had a wait queue that it
>>> created so that it could go to sleep while the DMA transfer occurred. Each
>>> thread is dependent on its own DMA transaction for the wakeup call. Again, this
>>> is how the test originally worked. I just moved the wait queue from the stack
>>> (which was getting corrupted) to the thread context to allow for safe cleanup.
>>> In other words, I haven't really changed how the test works...just fixing a bug
>>> with the current implementation.
>>
>> Ok, always takes me a bit to re-orient myself to this file since I
>> only look at it once a year.
>>
>> This fix seems incomplete. The next test iteration after a timeout
>> will now reuse the per-thread 'done' notification. If the engine that
>> timed out still completes its dma it will collide with the next
>> operation that is using the same 'done' variable. So it seems to me
>> that the wait_queue_head should be global, and the 'done' variable
>> should be either allocated per-operation or we should call
>> dmaengine_terminate_all() after a timeout. Since not all engines
>> implement a terminate I think the potential memory leak of a few
>> 'done' variables is a better option.
>>
> Dan
> An important part of my patch was severed in this v3 submission. My apologies.
> 
> There is a change that addresses, I believe, your concern that was in v2
> 
>  	/* terminate all transfers on specified channels */
> -	if (ret)
> +	if (ret || failed_tests)
>  		dmaengine_terminate_all(chan);
> 
> Will clean up again, retest, and resubmit. Thanks for your patience and instruction.

Dan, I thought the patch was truncated, but it's all there in V3. I should have
finished my coffee before responding. You are absolutely right that in the timed
out case that dmaengine_terminate_all(chan) should be called, and that change is
in fact already included in this patch set

@@ -789,7 +782,7 @@ static int dmatest_func(void *data)
 		dmatest_KBs(runtime, total_len), ret);

 	/* terminate all transfers on specified channels */
-	if (ret)
+	if (ret || failed_tests)
 		dmaengine_terminate_all(chan);

Would you prefer that I add a better description in the commit text to address
the fact this was in fact added?

> 
> Adam
> 


-- 
Adam Wallis
Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.



More information about the linux-arm-kernel mailing list