[PATCH] usb: ehci: fix update qtd->token in qh_append_tds
Ming Lei
ming.lei at canonical.com
Sat Aug 27 11:33:26 EDT 2011
Hi,
On Sat, Aug 27, 2011 at 11:13 PM, Greg KH <greg at kroah.com> wrote:
> On Sat, Aug 27, 2011 at 10:48:35PM +0800, ming.lei at canonical.com wrote:
>> From: Ming Lei <ming.lei at canonical.com>
>>
>> This patch fixs one performance bug on ARM Cortex A9 dual core platform,
>> which has been reported on quite a few ARM machines(OMAP4, Tegra 2, snowball...),
>> see details from link of https://bugs.launchpad.net/bugs/709245.
>>
>> In fact, one mb() on ARM is enough to flush L2 cache, but
>> 'dummy->hw_token = token;' after mb() is added just for obeying
>> correct mb() usage.
>
> Really? A mb() should not be flushing any caches, it's just a memory
> barrier. Or is ARM somehow "special" in this way?
As Santosh pointed out, mb on ARM will flush L2 write buffer. The
description here is wrong.
I think the below should make the writing reach into memory on all
ARCH after ' token = dummy->hw_token;' is executed.
dummy->hw_token = token;
mb()
token = dummy->hw_token;
The above is the idea introduced to fix the problem.
>
>> The patch has been tested ok on OMAP4 panda A1 board, the performance
>> of 'dd' over usb mass storage can be increased from 4~5MB/sec to
>> 14~16MB/sec after applying this patch.
>
> That's impressive, but I don't think this is really the proper way to do
> this...
>
>> Signed-off-by: Ming Lei <ming.lei at canonical.com>
>> ---
>> drivers/usb/host/ehci-q.c | 14 ++++++++++++++
>> 1 files changed, 14 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/usb/host/ehci-q.c b/drivers/usb/host/ehci-q.c
>> index 0917e3a..65b5021 100644
>> --- a/drivers/usb/host/ehci-q.c
>> +++ b/drivers/usb/host/ehci-q.c
>> @@ -1082,6 +1082,20 @@ static struct ehci_qh *qh_append_tds (
>> wmb ();
>> dummy->hw_token = token;
>>
>> + /* The mb() below is added to make sure that
>> + * 'token' can be writen into qtd, so that ehci
>> + * HC can see the up-to-date qtd descriptor. On
>> + * some archs(at least on ARM Cortex A9 dual core),
>> + * writing into coherenet memory doesn't mean the
>> + * value written can reach physical memory
>> + * immediately, and the value may be buffered
>> + * inside L2 cache. 'dummy->hw_token = token;'
>> + * after mb() is added for obeying correct mb()
>> + * usage.
>> + * */
>> + mb();
>> + token = dummy->hw_token;
>
> Your comment does not match the code, so something is wrong here.
If you mean "L2 cache flush", I confess to the mistaken description,
and will update it later. If you mean others, could you help to point it out?
thanks,
--
Ming Lei
More information about the linux-arm-kernel
mailing list