[PATCH RFC 04/11] um: Handle UM_TIMETRAVEL_RUN only in idle loop, signal success in ACK

Johannes Berg johannes at sipsolutions.net
Mon Nov 6 12:53:26 PST 2023


On Fri, 2023-11-03 at 16:41 +0000, Benjamin Beichler wrote:
> The Timetravel socket can be read from the SIGIO handler, which may
> catch a RUN message that was expected to be received in the wait loop.
> 
> It can happen that the device simulation only "created" one interrupt,
> but an uncertain (i.e., hard to predict by the calendar/device
> simulation) number of SIGIOs are emitted, with each SIGIO interrupt
> producing a GET message. The calendar might try to send a RUN message
> after the first GET message, which is then processed in the SIGIO
> handler, waiting for the response to a subsequent GET message. However,
> time is advanced by the received RUN message. Typically, this doesn't
> pose problems because the interrupt implies that the next RUN message
> should be at the current time (as sent by the calendar with the GET
> message). But there are corner cases, such as simultaneous other
> interrupts, which may desynchronize the current time in the UML instance
> from the expected time from the calendar. Since there is no real use
> case for advancing time in the SIGIO handler with RUN messages (in
> contrast to UPDATE messages), we logically only expect time to advance
> in the idle loop in TT-ext mode. Therefore, with this patch, we restrict
> time changes to the idle loop.
> 
> Additionally, since both the idle loop and the signal/interrupt handlers
> do blocking reads from the TT socket, a deadlock can occur if a RUN
> message intended for the idle loop is received in the SIGIO handler. In
> this situation, the calendar expects the UML instance to run, but it
> actually waits for another message, either in the SIGIO handler (e.g.,
> a second interrupt) or in a poll in the idle loop, as the previous
> message was handled by the signal handler, which returned execution to
> the main loop and ultimately entered the idle loop.
> 
> Therefore, this patch also allows checking whether the current RUN
> message was handled by the idle loop or by the signal/interrupt handlers
> in the ACK of the RUN.
> 
> With the information in the ACK of the RUN message, the calendar knows
> whether the RUN was answered in a signal handler and can act
> accordingly.
> 
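
For reference, the behaviour described above could be sketched roughly
as follows. This is only an illustration under stated assumptions, not
the code from the patch: the names (tt_handle, tt_ctx, TT_ACK_RUN_*) and
the ACK payload values are hypothetical, and the sketch assumes that a
RUN read outside the idle loop leaves the time untouched.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical message ops, loosely modeled on those named in the thread. */
enum tt_op { TT_ACK, TT_GET, TT_RUN };

/* Where the blocking read on the time-travel socket happened. */
enum tt_ctx { TT_CTX_IDLE, TT_CTX_SIGIO };

struct tt_msg { enum tt_op op; uint64_t time; };
struct tt_state { uint64_t now; };

/* Assumed ACK payload values; not taken from the patch. */
#define TT_ACK_RUN_HANDLED 1
#define TT_ACK_RUN_IGNORED 0

/*
 * Process one incoming message and return the value to carry in the ACK.
 * A RUN advances time only when it was read from the idle loop; a RUN
 * caught by the SIGIO handler leaves the time unchanged and is reported
 * as not handled, so the calendar can react (for example, by resending).
 */
static uint64_t tt_handle(struct tt_state *s, enum tt_ctx ctx, struct tt_msg m)
{
	assert(s);

	if (m.op != TT_RUN)
		return 0;	/* other ops are out of scope for this sketch */

	if (ctx != TT_CTX_IDLE)
		return TT_ACK_RUN_IGNORED;

	s->now = m.time;
	return TT_ACK_RUN_HANDLED;
}
```

With this shape, the calendar side only needs to inspect the ACK value
to tell whether the instance is actually running or still waiting.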

This is going to take a few more review cycles, tbh. I'm still processing
it and probably need to sleep on it :)

johannes


