[PATCH RFC 03/11] um: Use a simple time travel handler for line interrupts
Benjamin Beichler
Benjamin.Beichler at uni-rostock.de
Mon Nov 20 05:42:04 PST 2023
On 13.11.2023 at 22:57, Johannes Berg wrote:
>
> So maybe for my future self and all the bystanders, I'll try to
> explain how I see the issue that causes patches 4, 6, 7 and 8 to be
> needed:
--- snip ---
Hello Johannes,
Sorry for my delayed response to your detailed email. I find it quite
hard to discuss such a complex topic via mailing lists without it
sounding impolite.
As some background for my reasoning: I'm quite familiar with discrete
event-based simulation (DES), with a special focus on SystemC. There are
some common (old) principles of DES that map perfectly to the time
travel mode, so my mental model has always been shaped around these
constraints (a small sketch illustrating them follows the list):
- Every instance/object/element in a DES has some inputs/signals that
activate it either at the current moment or at some later point in time.
- Every instance/object/element in a DES runs from its activation time
until it has "finished", without advancing the global simulation time.
- Events (or activations) occurring at the exact same point in
simulation time conceptually happen in parallel. Therefore, the actual
order of execution in a sequential simulator is more or less
unimportant (though some implementations may require a deterministic
order, so that simulations with the same parameters yield the exact
same output, e.g., SystemC). Actually executing same-time events in
parallel is an optimization that may be implemented, but is not always
utilized.
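
To make these principles concrete, here is a minimal, generic event
loop in C; it is only an illustration (none of these names exist in UML
or SystemC):

#include <stdlib.h>

struct event {
        unsigned long long time;        /* activation time */
        void (*handler)(void *data);    /* runs to completion */
        void *data;
        struct event *next;
};

static struct event *queue;             /* sorted by time */
static unsigned long long sim_time;     /* global simulation time */

static void schedule(struct event *ev)
{
        struct event **p = &queue;

        /* insert after all events with the same time: same-time order
         * is deterministic here, but semantically "parallel" */
        while (*p && (*p)->time <= ev->time)
                p = &(*p)->next;
        ev->next = *p;
        *p = ev;
}

static void run(void)
{
        while (queue) {
                struct event *ev = queue;

                queue = ev->next;
                sim_time = ev->time;   /* time advances only here */
                ev->handler(ev->data); /* no time passes inside */
                free(ev);
        }
}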
After reviewing all your analyses, I believe the most significant
difference in my implementation lies in the last point: I did not
enforce the order of message processing when messages occur at exactly
the same simulation time. Consequently, I modified my implementation to
eliminate the synchronous (and, to be honest, quite hacky) read
operation with special handling on the timetravel socket. Instead, I
implemented a central epoll routine, which is called by my master
simulation kernel (NS3). My rationale was that as long as I haven't
received the request from the TT-protocol, I cannot advance time. In
conjunction with running not a single UML instance but many (my current
use case consists of 10 node pairs == 20 UML nodes), this can create
all sorts of race/deadlock conditions, which we have identified.
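
To sketch the idea (all names here are hypothetical placeholders, not
my actual NS3 integration): the central routine only reports that time
may advance once every instance has delivered its TT request:

#include <stdbool.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

#define NUM_NODES 20

struct node {
        int tt_fd;              /* TT protocol socket of one instance */
        bool request_seen;      /* TT request received this round? */
        uint64_t req_time;      /* time requested by the instance */
};

static struct node nodes[NUM_NODES];
static int epfd;                /* epfd = epoll_create1(0) at startup */

/* placeholder: parse one TT message and return the requested time */
static uint64_t read_tt_request(int fd)
{
        uint64_t t = 0;

        read(fd, &t, sizeof(t)); /* real code parses a TT message */
        return t;
}

static void add_node(int i, int fd)
{
        struct epoll_event ev = {
                .events = EPOLLIN,
                .data.ptr = &nodes[i],
        };

        nodes[i].tt_fd = fd;
        epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* returns true once every instance has posted its request, i.e. once
 * the scheduler may safely advance global simulation time */
static bool poll_tt_sockets(void)
{
        struct epoll_event evs[NUM_NODES];
        int i, n;

        n = epoll_wait(epfd, evs, NUM_NODES, -1);
        for (i = 0; i < n; i++) {
                struct node *node = evs[i].data.ptr;

                node->req_time = read_tt_request(node->tt_fd);
                node->request_seen = true;
        }

        for (i = 0; i < NUM_NODES; i++)
                if (!nodes[i].request_seen)
                        return false;
        return true;
}

The point is simply that epoll makes the "wait for all requests"
condition explicit, instead of hiding it in a synchronous read on a
single socket.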
For me, the ACK of vhost/virtio seemed somewhat redundant, as it
provides the same information as the TT-protocol, assuming my device
simulation
resides within the scheduler. I must admit my assumption was incorrect,
primarily because the implementation of the TT-protocol in the kernel is
somewhat fragile and, most importantly, your requirement that
***nothing*** is allowed to interfere (creating SIGIOs) in certain states
of a time-traveling
UML instance is not well-documented. We need the ACK not for an
acknowledgment of registering the interrupt but to know that we are
allowed to send the
next TT-msg. This very tight coupling of these two protocols does not
appear to be the best design or, at the very least, is poorly documented.
The prohibition of interference in certain TT-states also led to my
second mistake: I relaxed my second DES requirement and allowed
interrupts while the UML instance is in the RUN state. This decision
was based on the impression that UML was built to work this way without
TT, so why should it break in TT mode (an assumption you proved wrong)?
Whether this is semantically reasonable can be questioned, but it
triggered technical problems with the current implementation.
With this realization, I tend to agree that the patches ensuring
thread-safe (or reentrant-safe) access to the event list could perhaps
be dropped entirely. Still, we should ensure that SIGIOs are simply
processed synchronously in the idle loop. This aligns with my last DES
constraint: since everything happens at the same moment in simulation
time, we do not need "real" interrupts but can process interrupts
(SIGIOs) later (but at the same simulation time).
I think this approach only works in ext or cpu-inf mode and may be
problematic in "normal" timetravel mode. I might even consider dropping
the signal handler that marks the interrupts as pending and processing
the signals with a signalfd instead, but that is, again, only an
optimization.
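
As a rough sketch of what I mean (illustrative only, not the current
UML code): the signal is blocked, read via a signalfd, and drained
synchronously from the idle loop:

#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

static int sigio_fd;

static int setup_sigio_fd(void)
{
        sigset_t mask;

        sigemptyset(&mask);
        sigaddset(&mask, SIGIO);
        /* block normal delivery so SIGIO is only visible via the fd */
        sigprocmask(SIG_BLOCK, &mask, NULL);
        sigio_fd = signalfd(-1, &mask, SFD_CLOEXEC | SFD_NONBLOCK);
        return sigio_fd;
}

/* called synchronously from the idle loop: drain all pending SIGIOs
 * and process the interrupts at the current simulation time */
static void drain_sigio(void)
{
        struct signalfd_siginfo si;

        while (read(sigio_fd, &si, sizeof(si)) == sizeof(si))
                ;       /* the IRQ processing would be called here */
}

Since the fd is non-blocking, draining it in the idle loop cannot
stall the instance.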
Additionally, to address the interrupt acknowledgment for the serial
line, I'd like to propose this: why not add an extra file descriptor on
the command line that the kernel can write to, such as an eventfd or a
pipe, to signal the acknowledgment of the interrupt? For example, the
command line would change to ssl0=fd:0,fd:1,fd:3. If somebody uses the
serial line driver in timetravel mode but without that acknowledgment
fd, we can emit a warning or an error.
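
A minimal sketch of how that acknowledgment fd could be used on both
sides (the function names and the fd:3 wiring are assumptions for
illustration):

#include <stdint.h>
#include <unistd.h>

/* kernel/driver side: signal "line interrupt handled, the next TT
 * message may be sent" by bumping the ack fd */
static void ack_line_irq(int ack_fd)
{
        uint64_t one = 1;

        /* an eventfd write adds to the counter; a pipe works as well */
        write(ack_fd, &one, sizeof(one));
}

/* device simulation side: block until the ack arrives */
static void wait_line_irq_ack(int ack_fd)
{
        uint64_t cnt;

        read(ack_fd, &cnt, sizeof(cnt));
}

If the ack fd is missing in timetravel mode, the driver setup would be
the natural place for the warning or error mentioned above.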
I believe all these changes should work well with the shared memory
optimization and should make the entire time travel ext protocol a bit
more robust,
easier to use, and harder to misuse. ;-)
However, even after the lengthy discussion on "when" interrupts should
be processed, I still hold the opinion that the response to raising an
interrupt should reach the device simulation immediately and not be
delayed in simulation time. Delaying it only makes the device
simulation harder without real benefit. If you want to delay the
interrupt handling (the ISR and so on), that is still possible and, in
both situations, highly dependent on the UML implementation. If we want
to add an interrupt delay, we need to implement something in UML
anyway. If you want to delay it in the device driver, you can always do
that there as well, but then you are not at the mercy of some
hard-to-determine extra delay from UML.
Overall, if you think that makes sense, I could start on some patches,
or perhaps you feel more comfortable doing that.
Benjamin