[PATCH v2] um: time-travel: fix time corruption
Vincent Whitchurch
Vincent.Whitchurch at axis.com
Thu Oct 26 01:49:21 PDT 2023
On Thu, 2023-10-26 at 09:38 +0200, Johannes Berg wrote:
> On Thu, 2023-10-26 at 07:23 +0000, Vincent Whitchurch wrote:
> > > @@ -839,9 +863,7 @@ static u64 timer_read(struct clocksource *cs)
> > > */
> > > if (!irqs_disabled() && !in_interrupt() && !in_softirq() &&
> > > !time_travel_ext_waiting)
> > > - time_travel_update_time(time_travel_time +
> > > - TIMER_MULTIPLIER,
> > > - false);
> > > + time_travel_update_time_rel(TIMER_MULTIPLIER);
> > > return time_travel_time / TIMER_MULTIPLIER;
> > > }
> >
> > The reason I hesitated with putting the whole of
> > time_travel_update_time() under local_irq_save() in my attempt was
> > because I didn't quite understand the reason for the !irqs_disabled()
> > condition here and the comment just above it about recursion and things
> > getting messed up. If it's OK to disable interrupts as this patch does,
> > is the !irqs_disabled() condition valid?
>
> Hmm. I was going to say that's different, because it wants to only
> prevent us from doing this while we're *already* in IRQ context, and the
> bug you found is calling timer_read() not in IRQ context, but getting an
> event queued by the signal.
>
> But ... now that I think about it, I have a feeling that this was a
> workaround for the exact same problem, and I just didn't understand it
> at the time? I mean, recursing into our own processing is now impossible
> here after this patch - either we're running normally, or the interrupt
> cannot hit timer_read() in the middle, same as it cannot hit
> time_travel_handle_real_alarm() in the middle now.
>
> Removing that still seems to work with your test, but it's also not a
> good test for this, since there are no devices etc. that could have
> interrupts, not sure how to test it right now?
>
> Maybe I'll add a comment there saying this might no longer be needed?
I tried removing the !irqs_disabled() check and that blew up pretty
quickly (below) when running the full roadtest suite. It works fine
with your unmodified patch so no need to change the comment.
Kernel panic - not syncing: time-travel: time goes backwards 26374790000864 -> 26374790000853
show_stack.cold (arch/um/kernel/sysrq.c:56)
dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4))
dump_stack (lib/dump_stack.c:114)
panic (kernel/panic.c:262 kernel/panic.c:361)
timer_handler.cold (arch/um/kernel/time.c:51 arch/um/kernel/time.c:510 arch/um/kernel/time.c:634)
timer_real_alarm_handler (arch/um/os-Linux/signal.c:109)
unblock_signals (arch/um/os-Linux/signal.c:338)
tick_nohz_idle_exit (kernel/time/tick-sched.c:1364)
do_idle (kernel/sched/idle.c:310)
cpu_startup_entry (kernel/sched/idle.c:379 (discriminator 1))
kernel_init (init/main.c:1435)
0x60001ce6
0x6000220e
0x60004961
new_thread_handler (arch/um/include/asm/thread_info.h:46 arch/um/kernel/process.c:136)
uml_finishsetup (arch/um/kernel/um_arch.c:268)
More information about the linux-um
mailing list