Bug in split transactions on Raspberry Pi

Doug Anderson dianders at chromium.org
Wed Jan 27 13:25:32 PST 2016


Hi,

On Wed, Jan 27, 2016 at 12:59 PM, Alan Stern <stern at rowland.harvard.edu> wrote:
> On Wed, 27 Jan 2016, Doug Anderson wrote:
>
>> > In the end I just used the contents of the dwc2 directory from Linus's
>> > current tree -- I don't think it has changed since 4.5-rc1.  Your
>> > patches applied on top of that with no issues.
>> >
>> > They seem to work.  Is there anything in particular you would like me
>> > to test?
>>
>> I was sorta wondering if it fixed the original problem you were
>> seeing?  Unless I missed it it sounded like it was still broken on ToT
>> and I was wondering if my series happened to help in any way...
>
> They did not fix the original problem.  I still have no idea what could
> be causing the device's selective behavior.  If you want, I can send
> you the .tdc analyzer files for kernels with and without your patch
> series.  Maybe you'll see something I missed.

Argh.  Sure, go ahead and send the .tdc files.  Caveat emptor that I
know very little about USB (as I indicated in the cover letter of my
series), but I'll do my best.  ;)


> It did.  Without your patches the opposite problem occurred; the status
> polls sometimes didn't get sent when they should have.
>
>> Oh, wait, I think I know what this is.  This is probably ("usb: dwc2:
>> host: Manage frame nums better in scheduler").  I'd bet that if you go
>> before that change then the behavior will be the way it was.
>>
>> In said patch you can see that dwc2_schedule_periodic() will now
>> always pick a new first frame.  As I understand it that will happen if
>> there was ever a short period of time when we weren't scheduled and I
>> was worried about the next slice being in the past.  It wouldn't hurt
>> to change the test to only pick a new first frame if the existing
>> first frame is in the past.  That would probably fix your problem.
>
> Are you perhaps thinking of isochronous transfers?  This is an
> interrupt endpoint, not isochronous.  Individual interrupt transfers
> don't have assigned slices.  The "Add a delay before releasing periodic
> bandwidth" patch seems more relevant.  Or maybe "Schedule periodic
> right away if it's time", although that comes before the tasklet patch.

Please try the past I posted that crossed your reply in the ether.  ;)
 AKA: <https://chromium-review.googlesource.com/#/c/324185/>.  I
reproduced what I think is your problem and this patch fixed it for
me.

In my scheduler all periodic transfers have reservations.  Perhaps
it's not critical to reserve for INT transfers, but that's how things
are setup right now.


>> Question: is the behavior you're seeing illegal / undesirable?  It
>> only happens when there's data available, right?  My current
>> (rewritten) reservation algorithm in dwc2 is pretty crude  From a high
>> speed device perspective it reserves a timeslot for the EP in every
>> microframe even if the EP has a long interval.  It won't typically
>> schedule the EP more often than every "interval" microframes, but the
>> reservation is there.  That means we can't support quite as many
>> devices at the same time, but it has the advantage that if we screw up
>> (like if we miss an SOF) we know we can actually try to make up for it
>> quickly and we won't collide with anyone else.  The whole thing also
>> simplifies the scheduler.  In your particular case, we used our
>> existing reservation to schedule the device a little sooner than
>> requested.
>
> It's not illegal.  Interrupt transfers are guaranteed to be scheduled
> at least as often as bInterval; there's nothing wrong with issuing them
> more frequently.  But it is a waste of bandwidth and CPU time to issue
> transfers more often than necessary, so it is undesirable in that
> sense.

OK.  Well, hopefully fixed with my FIXUP patch.  ;)


> You know, from what you've said it seems that writing a proper periodic
> scheduler would be _easier_ for DWC2 than for EHCI.  The EHCI hardware
> requires special data structures to handle cases where a SPLIT packet
> occurs in microframe 7 or microframe 0 of the following frame; those
> structures are a nuisance to manage and ehci-hcd doesn't bother to
> implement them.  It sounds like DWC2 wouldn't need any of that.
> Furthermore, because coordination between the CPU and the EHCI core is
> rather asynchronous, it's difficult to change the microframe allotment
> for an interrupt endpoint should the schedule need to be rebalanced --
> that hasn't been implemented either.  Again, this should be much
> simpler for DWC2.
>
> I've got the overall design for a proper scheduler worked out in my
> head, but I have never sat down and written any of it because of the
> inertia involved in these two difficult requirements.  :-(

I'm certainly more than happy for any comments you might have or for
any followup or alternate patches for my dwc2 scheduler patches.  I'm
not attached to my scheduler and if someone wants to write a better
one I'm more than happy to have mine ripped out...  The only reason
I'm working on dwc2 stuff in the first place is that it was pretty
broken and I couldn't find anyone else willing / able to fix it.  The
last many years I've stayed many arms' lengths away from anything USB,
but that finally had to change I guess..

...it's unlikely I'll be able to make major changes to my scheduler at
the moment though.  If people agree that my microframe scheduler makes
things better than they were before and there are no huge objections,
I'd vote for merging it and when someone comes up with a better
solution then we can take that instead!  ;)


I'm happy to give an overview of my scheduler patch if the comments in
("usb: dwc2: host: Totally redo the microframe scheduler") aren't
enough.



-Doug



More information about the linux-rpi-kernel mailing list