[PATCH RFC 03/11] um: Use a simple time travel handler for line interrupts

Johannes Berg johannes at sipsolutions.net
Mon Nov 13 13:57:51 PST 2023


On Mon, 2023-11-13 at 22:22 +0100, Johannes Berg wrote:

> > My suggestion was to add the virtual interrupt line 
> > state change as a message to the calendar.
> 
> Yes but it doesn't work unless the other side _already_ knows that it
> will happen, because it broke the rule of "only one thing is running".
> 
> Again you can argue that it's fine to return to the calendar even if the
> receiving process won't actually do anything, but because the receiving
> process is still thinking about it, you end up with all the contortions
> you have to do in patch 4 and 7 ... because even the _calendar_ doesn't
> know when the request will actually come to it.
> 
> Perhaps one way of handling this that doesn't require all those
> contortions would be for the sender of the event to actually
> _completely_ handle the calendar request for the interrupt on behalf of
> the receiver, so that the receiver doesn't actually do _anything_ but
> mark "this IRQ is pending" in this case. Once it actually gets a RUN
> message it will actually start running since it assumes that it will not
> get a RUN message without having requested a calendar entry. If the
> calendar entry were already handled on its behalf, you'd not need the
> request and therefore not need the special handling for the request from
> patch 4.
> You'd need a different implementation of patches 2/3, and get rid of
> patches 4, 6, 7 and 8.
> 

So maybe for my future self and all the bystanders, I'll try to explain
how I see the issue that causes patches 4, 6, 7 and 8 to be needed:


Actors: UML, ES (Event Sender), Cal (Calendar)

In UML, we're using a line-based protocol as in drivers/line.c.
The ES is connected to that line and can send it messages.
(The same scenario would probably apply the other way around, in theory,
so we might need to have a way to implement this with roles reversed.)

Let's start by sending a message to UML; that's the whole point of this
discussion:

    ES ---> UML
    // so far nothing happened, the host kernel queued SIGIO to UML
    ES: continues running until idle and returns to Cal

Now we already have a few options, and I don't know which one got
implemented.

A. Perhaps ES also told Cal to expect that UML will be running:
    Cal ---RUN--> UML

B. Perhaps Cal just asks everyone what their current request is:
    Cal ---GET--> UML

C. Perhaps something else?
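
For reference, the calendar messages here are the um_timetravel ones; the
wire format is roughly the following (from memory, so check
include/uapi/linux/um_timetravel.h for the real definition):

    /* Rough userspace sketch of the calendar wire format; the real
     * definition is in include/uapi/linux/um_timetravel.h and uses
     * __u32/__u64.  Op names quoted from memory, not authoritative.
     */
    #include <stdint.h>

    struct um_timetravel_msg {
            uint32_t op;    /* e.g. UM_TIMETRAVEL_RUN, _REQUEST, _ACK, ... */
            uint32_t seq;   /* sequence number, reflected in the ACK */
            uint64_t time;  /* simulation time in nanoseconds */
    };

So A is just the controller sending a RUN at the entry's time; whether
there's anything really suitable for B today I'd have to check.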

In any case, we already see a first race here. Let's say A happened, and
now:

    ! SIGIO -> UML
    UML calls simple_timetravel_handler -> time_travel_add_irq_event
    UML ---REQUEST--> Cal
    UML <----RUN----- Cal
    // UML really confused -> patch 4

Or maybe:

    UML <-- RUN ----- Cal
    UML: starts running, starts manipulating time event list
    ! SIGIO -> UML
    UML calls simple_timetravel_handler -> time_travel_add_irq_event
        manipulates time event list
    // UML corrupts data structures -> patch 8

Or:

    UML <-- RUN ------ Cal
    UML runs only really briefly, sends new request,
        time_travel_ext_prev_request_valid = true
    ! SIGIO -> UML
    UML calls simple_timetravel_handler -> time_travel_add_irq_event
        time_travel_ext_prev_request_valid == true
    // no new request, Cal confused -> patch 6, and maybe 7


I'm not sure that's all completely accurate, and there are almost certainly
more scenarios that cause issues.

But in all cases, the root cause is the asynchronous nature of doing
this: partly internal to UML (list protection etc.) and partly outside
UML (the calendar doesn't know what's supposed to happen next until the
async SIGIO is processed).
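
To make the "list protection" part concrete, the shape of the problem is
roughly the following; this is not the real arch/um code, just a
stripped-down illustration with made-up names:

    /* Stripped-down illustration of the SIGIO vs. RUN race; hypothetical
     * names and structures, not the real arch/um code.
     */
    #include <stdint.h>

    struct tt_event {
            struct tt_event *next;
            uint64_t time;
    };

    static struct tt_event *event_list;     /* sorted pending-event list */
    static struct tt_event irq_event;       /* the line IRQ event */

    /* SIGIO path - async, signal context; patch 3's handler ends up here */
    static void sigio_handler(int sig)
    {
            struct tt_event **p = &event_list;

            (void)sig;
            while (*p && (*p)->time < irq_event.time)
                    p = &(*p)->next;
            irq_event.next = *p;            /* races with run_until() below */
            *p = &irq_event;
            /* ...and this path also sends a REQUEST to the calendar */
    }

    /* RUN path - normal context, runs events up to the granted time */
    static void run_until(uint64_t time)
    {
            while (event_list && event_list->time <= time) {
                    struct tt_event *e = event_list;

                    event_list = e->next;   /* a SIGIO around here corrupts
                                             * the list; patch 8 blocks the
                                             * signal for exactly this */
                    (void)e;                /* handle the event here */
            }
    }

The same applies to time_travel_ext_prev_request_valid and the REQUEST
itself, just with the calendar rather than a list as the thing that gets
confused.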

In contrast, with virtio, you get

    ES -- event ----> UML
    // so far nothing happened, the host kernel queued SIGIO to UML
    // ES waits for ACK
    ! SIGIO -> UML
    UML calls simple_timetravel_handler -> time_travel_add_irq_event
    UML ---REQUEST--> Cal
    UML <--ACK------- Cal
    ES <---ACK------- UML
    ES: continues running until idle and returns to Cal
    Cal: schedules next runnable entry, likely UML

Note how, because of the "// ES waits for ACK", nothing bad happens:
even if the host doesn't deliver the SIGIO to the UML process
immediately, or that takes some time, _nothing_ else in the simulation
makes progress until UML has.
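
In (hypothetical, simplified) code, the sender side of that lock-step
pattern is just:

    /* Hypothetical sketch of the sender (ES) side of the lock-step
     * pattern; 'event_fd' is whichever socket carries the event and the
     * ACK byte, and all names here are illustrative only.
     */
    #include <unistd.h>
    #include <stdint.h>

    static void send_event_and_wait(int event_fd)
    {
            uint8_t kick = 1, ack;

            if (write(event_fd, &kick, 1) != 1)     /* kick the device/UML */
                    return;

            /* Block until UML has taken the SIGIO, made its calendar
             * REQUEST and acknowledged the event.  Nothing else in the
             * simulation runs in the meantime, so there's nothing to
             * race with.
             */
            if (read(event_fd, &ack, 1) != 1)
                    return;

            /* only now continue running until idle and return to Cal */
    }

The blocking read is the whole trick: the rest of the simulation simply
cannot observe the window in which the SIGIO is still in flight.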

IMNSHO that's a far simpler model than taking into account all the
potential races (of which I outlined some above) and trying to work with
them.



Now, I then went on to say that we could basically make it _all_ the
sender's responsibility, acting on behalf of the receiver, and then we'd get:


    ES -- event ----> UML
    // so far nothing happened, the host kernel queued SIGIO to UML
    ES -- add UML --> Cal
    ES <--- ACK ----- Cal
    ES: continues running until idle and returns to Cal
    Cal: schedules next runnable entry, likely UML
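
Sketched from the ES side (and to be clear, nothing like "add a calendar
entry on someone else's behalf" exists in the protocol today, so the op
value and the ID-in-seq trick below are entirely made up), it would look
roughly like this:

    /* Purely hypothetical: the calendar protocol has no way to make a
     * request on somebody else's behalf today, so TT_OP_REQUEST_FOR and
     * carrying a target ID in 'seq' are invented for illustration.
     */
    #include <stdint.h>
    #include <unistd.h>

    #define TT_OP_REQUEST_FOR       0x100   /* invented op value */

    struct tt_msg {                 /* same shape as um_timetravel_msg */
            uint32_t op;
            uint32_t seq;
            uint64_t time;
    };

    static void send_line_event(int line_fd, int cal_fd,
                                uint32_t uml_id, uint64_t now)
    {
            uint8_t byte = 0;
            struct tt_msg msg = {
                    .op   = TT_OP_REQUEST_FOR,
                    .seq  = uml_id,         /* the ID has to go somewhere */
                    .time = now,            /* "run the receiver now" */
            };

            write(line_fd, &byte, 1);               /* queues SIGIO at UML */
            write(cal_fd, &msg, sizeof(msg));       /* add UML's entry for it */
            read(cal_fd, &msg, sizeof(msg));        /* wait for the ACK */
            /* then keep running until idle and return to Cal as usual */
    }

Though, as I get to below, the existing message doesn't really have an
obvious place to put that ID.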

Now the only concurrency we need to handle is in the two scenarios that
continue from there:

1.
    Cal --- RUN ----> UML
    ! SIGIO -> UML

and

2.
    ! SIGIO -> UML
    Cal --- RUN ----> UML

Obviously 2 is what you'd expect, but again you can have races like in 1,
and the SIGIO can happen at roughly any time.

I'm tempted to say the only reasonable way to handle that would be to not
do _anything_ in the SIGIO handler, and instead poll all the file
descriptors this might happen for upon receiving a RUN message (see the
sketch further below).

We could even go so far as to add a new RUN_TRIGGERED_BY_OTHER message
that would make it clear that someone else entered the entry into the
calendar, and only epoll for interrupts upon that message, but I'm not
sure that's needed or even makes sense (if you were going to wake up at
that time anyway, you'd still handle interrupts).

In any case, that feels like a _far_ more tractable problem, and the
only concurrency would be between running and the SIGIO, where the SIGIO
basically no longer matters.
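
In (hypothetical) code, the "do nothing in SIGIO, poll on RUN" idea is
basically:

    /* Hypothetical sketch of deferring all interrupt work to RUN time;
     * the epoll fd holding all the line fds and the IRQ delivery are
     * illustrative stand-ins, not the real arch/um code.
     */
    #include <sys/epoll.h>

    static void sigio_handler(int sig)
    {
            (void)sig;
            /* intentionally empty: no event-list manipulation and no
             * calendar REQUEST - the sender already entered us into the
             * calendar on our behalf
             */
    }

    static void handle_run_message(int irq_epoll_fd)
    {
            struct epoll_event events[16];
            int i, n;

            /* we're the only thing running in the simulation now, so
             * nothing can race with this - just look at what actually
             * became readable and deliver those interrupts
             */
            n = epoll_wait(irq_epoll_fd, events, 16, 0);
            for (i = 0; i < n; i++) {
                    /* mark the IRQ for events[i].data.fd pending and
                     * handle it here
                     */
            }
    }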



However ... if we consider this the other way around, we can see that it's
much harder to implement than it sounds: now, suddenly, instead of just
having to connect the sockets with each other etc., you also have to give
the implementation knowledge about who is on the other side, and a way to
even make a calendar request on their behalf!

We don't even have a protocol for making a request on someone else's behalf
today (though with the shared memory you could just enter one directly),
and changing the protocol to support it might be tricky, since the messages
don't have much space for extra data ... maybe if we split 'op' or used
'seq' for it. But then you'd need a (pre-assigned?) ID for each participant,
or have them exchange IDs over something, ideally the socket that's actually
used for communication, which in turn limits the kinds of sockets you can
use, etc.

So it's by no means trivial either, and I understand that. I just don't
think making the protocol and the implementations handle all the races that
happen once you start doing things asynchronously is really a good idea.
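
Just to illustrate the "not much space for extra data" point: squeezing an
ID into the existing op/seq/time message would have to be something like
the following, which is entirely made up and not a proposal of concrete
values:

    /* Entirely hypothetical encoding sketch: steal the high 16 bits of
     * 'op' for a pre-assigned participant ID.  Nothing like this exists
     * in include/uapi/linux/um_timetravel.h today.
     */
    #include <stdint.h>

    #define TT_OP_MASK      0x0000ffffu
    #define TT_ID_SHIFT     16

    static inline uint32_t tt_encode_op(uint32_t op, uint16_t target_id)
    {
            return (op & TT_OP_MASK) | ((uint32_t)target_id << TT_ID_SHIFT);
    }

    static inline uint16_t tt_decode_id(uint32_t op)
    {
            return (uint16_t)(op >> TT_ID_SHIFT);
    }

And even then the ID assignment or exchange problem still has to be solved
somewhere.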

johannes


