[PATCH] um: insert scheduler ticks when userspace does not yield
Benjamin Beichler
Benjamin.Beichler at uni-rostock.de
Mon Sep 23 14:50:27 PDT 2024
Hi,
On 23.09.2024 at 16:48, Benjamin Berg wrote:
>> Actually, I think timeouts are no problem if we can ensure that a
>> timeout is never rounded down to 0. Usually, a direct input of 0 has a
>> special meaning, or provokes wrong behavior from the user space
>> program in the first place.
> I don't think that is a problem. The kernel should guarantee that a
> timeout never fires too early.
>
> I believe in the case of the linked python code, the timeout fires at
> exactly the correct time. And then the python code (incorrectly)
> detects that the timeout has not passed and tries to "select" again
> with a timeout of exactly zero.
>
> Really, that implementation is just buggy in subtle ways. It could
> probably just trust the kernel to not wake up early. And, if it does
> check whether the timeout has passed, then it should just accept the
> exact time.
Maybe I'm stating the obvious here, but I had the impression that this
code was written this way to handle interruptions by signals, not because
it doubts the accuracy of the clock. I may be totally wrong, but it seems
quite elegant to simply use the time here and avoid the dance of masking
signals, checking for interruptions, etc.
I believe this code was written on the assumption that time() will
advance, so the loop can never be endless; even the corner case of a
timeout of 0 would be covered by that.
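To make the corner case concrete, here is a minimal sketch of such a retry
loop. It is not the linked Python code. The names wait_readable,
accept_exact, max_rounds and FrozenClock are made up for illustration, and
the fake clock only models a kernel that wakes up at exactly the requested
time and in which a zero timeout costs no time.

```python
import select
import time


def wait_readable(fds, timeout, clock=time.monotonic, select_fn=select.select,
                  accept_exact=True, max_rounds=1000):
    """Wait until one of fds is readable or `timeout` seconds have passed.

    Returns (ready_fds, rounds). Raises RuntimeError after max_rounds,
    which stands in for the endless loop discussed in this thread.
    """
    deadline = clock() + timeout
    remaining = timeout
    for rounds in range(1, max_rounds + 1):
        ready, _, _ = select_fn(fds, [], [], remaining)
        if ready:
            return ready, rounds
        now = clock()
        # accept_exact=True treats "now == deadline" as expired;
        # accept_exact=False models the strict check that retries with 0.
        expired = now >= deadline if accept_exact else now > deadline
        if expired:
            return [], rounds
        remaining = max(0.0, deadline - now)
    raise RuntimeError("clock never passed the deadline")


class FrozenClock:
    """Hypothetical clock that only moves when the fake select sleeps."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def select(self, rlist, wlist, xlist, timeout):
        # Model a wakeup at exactly the requested time; nothing is ready.
        self.now += timeout
        return [], [], []
```

With accept_exact=True the loop accepts the exact wakeup time and returns
after one round. With the strict comparison it keeps calling select with a
timeout of 0 and only stops because of max_rounds, unless reading the clock
itself makes time advance.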
>> Since time-travel mode has a very limited niche, I would not try to
>> prevent every possible dumb behavior that bad user space programs could
>> have. I think busy-waiting on a system clock advancement is not the best
>> style, but acceptable.
>>
>> So my list was:
>>
>> sys_getitimer
>> sys_gettimeofday
>> sys_time
>> sys_timer_gettime
>> sys_clock_gettime
>> sys_timerfd_gettime
>>
>> While overthinking it, I see the possibility of reading the access
>> timestamps of a file to create an endless loop, so maybe the stat
>> syscalls should be included, although this makes me a bit uncomfortable
>> again. I tend to say that this "bad" behavior of asking for the same
>> information over and over again should only be punished if it happens
>> multiple times.
>>
>> I was thinking about storing the PID of a busy-looping process and only
>> advancing time if the same PID is "suspicious". However, this "hack"
>> becomes more and more costly, which on the other hand is not important
>> for time-travel mode.
> Maybe a stupid question, but aren't we overthinking this in general?
>
> While I think that Johannes' solution to make reading the time cost
> time is kind of ingenious, I really wonder how much of an issue this
> actually is. Because if this is just a few userspace applications and
> libraries misbehaving, then we might as well fix the issue there
> instead of doing anything special in UML.
Your point is right, and such bugs may be fixed in user space. On the
other hand, what about software that we can't or don't want to fix and
that simply works in the wild? For my future use cases, I will run code
that I'm not able to compile myself. I would even consider a runtime
switch to change the behavior of this hack, to reduce the overhead in
simulations that behave nicely while keeping a quick workaround for
misbehaving code.
And sorry for repeating myself, but I believe that busy-waiting on an
increasing timer value is not the best style, but is considered okay/normal
for some use cases. So I think it would be helpful to be able to execute
such user space code.
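As a rough illustration of what I mean, here is a small user-space model of
the bookkeeping. It is not kernel code and not the patch under discussion.
ModelClock, read_cost_ns, threshold and enabled are hypothetical names, and
the numbers are arbitrary. Time only advances once the same PID has read
the clock more than threshold times in a row, and enabled plays the role of
the runtime switch.

```python
class ModelClock:
    """Toy model of a simulated clock where repeated reads can cost time."""

    def __init__(self, read_cost_ns=1, threshold=3, enabled=True):
        self.now_ns = 0
        self.read_cost_ns = read_cost_ns  # hypothetical cost of a suspicious read
        self.threshold = threshold        # consecutive reads tolerated for free
        self.enabled = enabled            # hypothetical runtime switch
        self._last_pid = None
        self._streak = 0

    def read(self, pid):
        """Return the current time, charging the caller if it looks like a busy loop."""
        if pid == self._last_pid:
            self._streak += 1
        else:
            self._last_pid = pid
            self._streak = 1
        if self.enabled and self._streak > self.threshold:
            self.now_ns += self.read_cost_ns
        return self.now_ns


def busy_wait(clock, pid, delay_ns, max_reads=10_000):
    """Spin until the clock has advanced by delay_ns; return the number of reads."""
    deadline = clock.read(pid) + delay_ns
    reads = 1
    while True:
        reads += 1
        if clock.read(pid) >= deadline:
            return reads
        if reads >= max_reads:
            # Stands in for the endless loop when time never advances.
            raise RuntimeError("clock did not advance")
```

With the switch enabled, busy_wait(ModelClock(), pid=42, delay_ns=5)
terminates after a handful of reads. With enabled=False the same loop only
stops because of max_reads.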
But I want to bring in another idea: could we use an eBPF program to
dynamically hook into syscalls and do a timetravel_update or something
similar? Actually, I do not know whether eBPF works normally in UM, but
that way it would be flexible and would move the dirty hacks into small
pieces outside the kernel. From what I understand, we would need to
add an eBPF-callable wrapper for the time travel update function,
wouldn't we?
>
>>> One neat side effect is that if reading time does not actually cost
>>> time, then we could implement clock_gettime in the VDSO.
>> That would exactly not work, because of my comment from before.
> Of course. It is just that I have always in the back of my mind that
> syscalls and pagefaults (including minor faults) are really expensive
> in UML. So if the hack is moved elsewhere then implementing
> clock_gettime in the vDSO could be an easy win to speed up the
> simulation.
Hmm, I only took a quick look at "arch/x86/um/vdso/um_vdso.c", and from
my understanding every vDSO call is currently converted into a syscall
of the host. So we would need much more code to use the time travel
clock here, wouldn't we? Of course, my proposed eBPF hook would not work
here either...
>
> Benjamin
>
kind regards
(the other) Benjamin