Bug in v7_coherent_kern_range() ?

Sun Apr 1 05:14:38 EDT 2012

On 2012-04-01 16:50, Dirk Behme wrote:
> On 01.04.2012 10:16, Huang Shijie wrote:
>> Hi Dirk:
>>> Hi Huang Shijie,
>>>
>>> On 01.04.2012 09:09, Huang Shijie wrote:
>>>> Hi Dirk:
>>>>> Hi Huang Shijie,
>>>>>
>>>>> On 01.04.2012 05:21, Huang Shijie wrote:
>>>>>> [1] Platform:
>>>>>> Freescale i.MX6Q (4 cores), ARM Cortex-A9
>>>>>>
>>>>>> [2] kernel:
>>>>>> 3.0.15 (I have cherry-picked many patches, and
>>>>>> arch/arm/mm/cache-v7.S is the same code as in the latest
>>>>>> kernel, v3.4-rc1),
>>>>>> SMP enabled, VIPT
>>>>>
>>>>> Could you try an unpatched, clean v3.4-rc1 instead?
>>>> Sorry, I cannot try v3.4-rc1. Some of our BSP drivers do not
>>>> support DT.
>>>
>>> I think we are not talking about drivers, we are talking about some
>>> kernel core code, like cache handling? To test
>>> v7_coherent_kern_range() you might not need too many BSP drivers?
>> Yes, gplay uses the VPU driver. But the VPU driver is not in
>> the kernel. Without the VPU driver, gplay does not work.
>
> You could try to disable the vpu driver and check if the issue is 
> still there, then.
>
:(
I have no idea how to reproduce this issue if I disable the VPU driver.
>>>>> What about your 2.6.38?
>>>> 2.6.38 is not a good version for running the i.MX6Q. It lacks many
>>>> of our driver patches.
>>>>>
>>>>> What about 3.0.26? 3.0.15 seems to be missing some possibly
>>>>> relevant patches.
>>>>>
>>>> Our BSP releases are based on 3.0.15, so we cannot test on 3.0.26
>>>> either.
>>>
>>> You can. Just give git rebase a try.
>> It would be a nightmare for me. We have nearly 1000 patches. It would
>> cost me a lot of time to resolve the conflicts.
>
> IMHO you will get one easy-to-solve merge conflict. So it should
> take you < 10 min to rebase to 3.0.26. Just try it ;)
>
>>>
>>>>>> [3] application:
>>>>>
>>>>> Could you share a (simple) test case?
>>>> The test case is like this:
>>>> #gplay xx.avi
>>>>
>>>> gplay is our own player, similar to mplayer.
>>>
>>> Could you share a (simple) test case? E.g. share 'gplay'? Or try to
>>> reproduce your issue with another test case? E.g. mplayer? Or
>>> better anything simpler the community can use to try to reproduce
>>> your issue?
>> I can email gplay to you. If you have an i.MX6Q board, you can test
>> it.
>> I just wish someone would give me some advice about this issue.
>
> It would help to use a kernel version and a test case the community 
> can use to reproduce.
>
I know.

thanks
Huang Shijie


> Best regards
>
> Dirk
>
>> I found that arch/arm/include/asm/assembler.h is out of date. So I
>> will update it and test again.
>>
>> thanks a lot , Dirk.
>>
>> Huang Shijie
>>>
>>> Best regards
>>>
>>> Dirk
>>>
>>>> I just created a script which will play the video files one by one.
>>>>
>>>> BR
>>>> Huang Shijie
>>>>
>>>>>
>>>>> Best regards
>>>>>
>>>>> Dirk
>>>>>
>>>>>> I use our own application, which clones many threads.
>>>>>> Two threads (call them A and B) may do the same thing at the same
>>>>>> time, as shown in the trace below.
>>>>>>
>>>>>> Most of the time it is OK.
>>>>>> But in some unknown situation, cacheflush() fails and one thread
>>>>>> (say A) may hang in the following code:
>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>> open("/usr/lib/lib_mp3_dec_arm12_elinux.so.2.10.0", O_RDONLY) = 8
>>>>>> read(8, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\20\35\0\0004\0\0\0"..., 512) = 512
>>>>>> fstat64(8, {st_mode=S_IFREG|0644, st_size=56232, ...}) = 0
>>>>>> mmap2(NULL, 88032, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 8, 0) = 0x2ff0a000
>>>>>> mprotect(0x2ff18000, 28672, PROT_NONE) = 0
>>>>>> mmap2(0x2ff1f000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 8, 0xd) = 0x2ff1f000
>>>>>> close(8) = 0
>>>>>> mprotect(0x2ff0a000, 57344, PROT_READ|PROT_WRITE) = 0
>>>>>> mprotect(0x2ff0a000, 57344, PROT_READ|PROT_EXEC) = 0
>>>>>> cacheflush(0x2ff0a000, 0x2ff18000, 0, 0x6, 0x2cd03420) = 0 // System hung up here!!!
>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>
>>>>>> [4] kernel log
>>>>>> I used "echo t > /proc/sysrq-trigger" to show the tasks' information:
>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>> multiqueue0:src D 804cd678 0 7328 5963 0x00000001
>>>>>> [<804cd678>] (__schedule+0x228/0x760) from [<804d0564>] (__down_read+0xa8/0xe0)
>>>>>> [<804d0564>] (__down_read+0xa8/0xe0) from [<800478c4>] (do_page_fault+0xbc/0x480)
>>>>>> [<800478c4>] (do_page_fault+0xbc/0x480) from [<8003841c>] (do_DataAbort+0x34/0x98)
>>>>>> [<8003841c>] (do_DataAbort+0x34/0x98) from [<8003df10>] (__dabt_svc+0x70/0xa0)
>>>>>> Exception stack(0xbae37ea8 to 0xbae37ef0)
>>>>>> 7ea0: 31e05000 31e1d000 00000020 0000001f 31e05000 31e1d000
>>>>>> 7ec0: bfac86b8 31e05000 31e1d000 bae36000 08100075 31e056fc 31e08000 bae37ef0
>>>>>> 7ee0: 800424a8 8004a1fc 800f0013 ffffffff
>>>>>> [<8003df10>] (__dabt_svc+0x70/0xa0) from [<8004a1fc>] (v7_coherent_kern_range+0x20/0x80)
>>>>>> [<8004a1fc>] (v7_coherent_kern_range+0x20/0x80) from [<800424a8>] (arm_syscall+0x2a0/0x2c4)
>>>>>> [<800424a8>] (arm_syscall+0x2a0/0x2c4) from [<8003e500>] (ret_fast_syscall+0x0/0x3c)
>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>
>>>>>> do_cache_op() already holds mm->mmap_sem, but
>>>>>> v7_coherent_kern_range() causes a page fault while it is
>>>>>> flushing the cache. Deadlock! So the thread hangs in
>>>>>> do_page_fault().
>>>>>>
>>>>>> [5] questions:
>>>>>> Why can v7_coherent_kern_range() cause a data abort?
>>>>>> Is there something wrong with v7_coherent_kern_range()?
>>>>>>
>>>>>>
>>>>>> thanks
>>>>>> Huang Shijie
>>>>>>
>>>>>> _______________________________________________
>>>>>> linux-arm-kernel mailing list
>>>>>> linux-arm-kernel at lists.infradead.org
>>>>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
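The deadlock described in section [4] of the original report can be illustrated with a toy model. This is only a sketch under stated assumptions, not kernel code: `FairRWSem` and `simulate` are hypothetical names, the model assumes mmap_sem behaves as a fair reader/writer semaphore (a new reader queues behind a waiting writer), and it assumes that thread B requests the lock for writing (for example from mprotect()) between thread A's two read acquisitions. The trace in the report does not show thread B, so that part is an assumption.

```python
from collections import deque


class FairRWSem:
    """Toy model of a fair reader/writer semaphore (not the kernel's code)."""

    def __init__(self):
        self.readers = 0        # number of current read holders
        self.writer = None      # current write holder, if any
        self.waiters = deque()  # FIFO of (task, mode) that had to block

    def down_read(self, task):
        # A reader is granted only if no writer holds the lock and
        # nobody is already queued; queuing behind waiters is the
        # fairness rule that makes a nested read acquisition dangerous.
        if self.writer is None and not self.waiters:
            self.readers += 1
            return True
        self.waiters.append((task, "read"))
        return False

    def down_write(self, task):
        # A writer needs the lock completely free.
        if self.writer is None and self.readers == 0 and not self.waiters:
            self.writer = task
            return True
        self.waiters.append((task, "write"))
        return False


def simulate():
    sem = FairRWSem()
    events = [
        # A enters cacheflush(): do_cache_op() takes the lock for reading.
        ("A cacheflush: down_read", sem.down_read("A")),
        # B (assumed) wants the lock for writing and queues behind A.
        ("B mprotect: down_write", sem.down_write("B")),
        # A faults inside the flush; do_page_fault() wants the lock again.
        ("A page fault: down_read", sem.down_read("A")),
    ]
    return sem, events


sem, events = simulate()
for step, granted in events:
    print(f"{step:26s} -> {'granted' if granted else 'blocked'}")

# A still holds one read reference but is itself queued behind B,
# so neither task can ever make progress.
blocked = {task for task, _ in sem.waiters}
print("deadlock:", sem.readers == 1 and blocked == {"A", "B"})
```

In this model, A's second read request blocks behind B's write request, while B waits for A to drop the read reference that A can no longer release. That matches the backtrace, where the task sleeps in __down_read() called from do_page_fault().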