KVM virtual timer issue with trinity

Will Deacon will.deacon at arm.com
Wed Oct 9 07:00:39 EDT 2013


On Thu, Sep 12, 2013 at 04:27:16PM +0100, Christoffer Dall wrote:
> On Thu, Sep 12, 2013 at 10:37:50AM +0100, Will Deacon wrote:
> > On Fri, Sep 06, 2013 at 05:30:52PM +0100, Will Deacon wrote:
> > > Running trinity as a normal user in a KVM guest on my TC2 (A15s only)
> > > eventually leads to a situation where responsiveness is extremely sluggish.
> > > Further investigation shows that issuing a `sleep 1' command never returns.
> > > This seems to be because the virtual timer has stopped generating interrupts
> > > on CPU0 (CPU1 seems ok).
> > > 
> > > Dumping the timer state (see below), it looks like CPU0's timer expired in
> > > the past, but we're perhaps not receiving the interrupt. The trinity logs
> > > don't reveal anything obvious (and they're huge, so I can't include them
> > > here).
> > > 
> > > I can reproduce this in an hour or so, so if you want me to try anything out
> > > in the host, I can give it a go. I'm using 3.11 as both the guest and host.
> > 
> > Any ideas on things I can do to get to the bottom of this? It's preventing
> > me from running trinity to find any other issues and there's no reason you
> > couldn't hit this lockup under other workloads.
> > 
> I've been thinking on this, sorry about the late response.
> 
> I see something similar when resuming a suspended guest, but I don't
> have very clever ideas or debug strategies yet.  I plan on looking at
> this once I get a new revision of the save/restore QEMU patches out.

Marc was saying that you'd managed to resolve the issue with suspend, but I
can still reproduce the issue with trinity on a 3.12-rc4 kernel (host and
guest).

I tried to reproduce in a model, but I ran into a bunch of other unrelated
problems that look like bugs in the model itself.

Will


