[RFC PATCH 0/3] um: clean up mm creation - another attempt

Anton Ivanov anton.ivanov at cambridgegreys.com
Tue Sep 26 05:16:29 PDT 2023



On 25/09/2023 16:20, Anton Ivanov wrote:
> 
> On 25/09/2023 15:44, Johannes Berg wrote:
>> On Mon, 2023-09-25 at 15:27 +0100, Anton Ivanov wrote:
>>> On 25/09/2023 14:33, Johannes Berg wrote:
>>>> On Mon, 2023-09-25 at 14:29 +0100, Anton Ivanov wrote:
>>>>> I have rebased the preempt patch on top of these series.
>>>>>
>>>>> PREEMPT works with some performance decrease.
>>>>>
>>>>> VOLUNTARY deadlocks early in boot around the time it starts loading modules.
>>>>>
>>>>> non-preemptible deadlocks very early in boot.
>>>>>
>>>> Well I guess that means there's still some issue in here? Hmm.
>>>>
>>>> Now I don't understand anything anymore, I guess.
>>> PEBKAC. The tree got corrupted somewhere during rebase. Reapplying everything on top of a clean master fixed it.
>>>
>>> So it all works.
>> OK, whew. At least now I no longer _completely_ doubt the mental model I
>> have of UML VM :-)
>>
>>> With some performance penalties compared to the old approach, but works.
>> I still find this odd though, I don't see what the flush would possibly
>> do in a new (mm host) process that's not achievable in arch_dup_mmap()?
>>
>> OK, so let's see - arch_dup_mmap() is _earlier_ than the fork_handler,
>> because that only happens on the very first switch into the process.
>> This is only when it gets scheduled. So we'd be looking for something
>> that copy_process() changes in the MM after copy_mm() and before it can
>> get scheduled?
>>
>> I guess we could even move the flush into copy_thread(), which would be a
>> simpler patch too, but it felt a bit wrong, since that's about the
>> (guest!) process, not the mm.
>>
>> But basically I don't see anything there - fork syscall tail-calls
>> kernel_clone(), which doesn't really do anything with the result of
>> copy_process() except wake it up, and copy_process() doesn't really do
>> anything either?
> 
> I am continuing to dig through that and looking for voluntary preemption points in the process.
> 
> We have none of our own and the generic ones are not invoked - voluntary for all practical purposes does not differ from no-preempt.
> 
> If I have some suspects, I will post their mugshots to the list :)

For the time being it is mostly negative :)

1. The performance after the mm patch is down by 30-40% on my standard benchmark.

2. The preemption patches work fine on top (all 3 cases). The performance difference remains.

3. We do not have anything of value to add in terms of cond_resched() in the drivers :(
Most drivers are fairly simplistic, with no safe points where one could be added.

4. The upper-layer resched points are not enough.
E.g. -EAGAIN from the block layer, retries, etc. should all hit paths in the upper layers
which invoke cond_resched(). Their effect is not noticeable: VOLUNTARY behaves nearly
identically to non-preemptive.

5. There are a couple of places where an if (condition) schedule() should probably be replaced
with cond_resched(), e.g. interrupt_end(). While that looks more appropriate, the performance
effect is nil.
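To illustrate point 5, the pattern in interrupt_end() looks roughly like this (a sketch, not
the exact state of the tree):

```c
/* Sketch only - based on arch/um/kernel/process.c, details may differ. */
void interrupt_end(void)
{
	/*
	 * Before: open-coded voluntary preemption point.
	 *
	 *	if (need_resched())
	 *		schedule();
	 *
	 * After: cond_resched() expresses the same thing, and is the
	 * canonical way to mark a voluntary preemption point. As observed
	 * above, the measured effect of this change is nil.
	 */
	cond_resched();

	/* ... signal delivery and notify-resume handling unchanged ... */
}
```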

6. Do we still need force_flush_all() in arch_dup_mmap()? Everything works with a non-forced
TLB flush using flush_tlb_mm(mm).
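For point 6, the difference is roughly the following (an illustrative sketch assuming the
arch_dup_mmap() hook added by this series; the actual signature in the patch may differ):

```c
/* Hypothetical sketch of the UML hook, not the exact patch. */
static inline int arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm)
{
	/*
	 * Before: force_flush_all() - unconditionally tears down and
	 * rebuilds all host mappings for the new mm.
	 *
	 * After: a regular, non-forced flush appears to be sufficient:
	 */
	flush_tlb_mm(mm);
	return 0;
}
```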

7. In all cases, UML is doing something silly.
The CPU usage while running find -type f -exec cat {} > /dev/null, measured from the outside,
stays around 8-15% for both non-preemptive and PREEMPT_VOLUNTARY. UML sits idle for the
remaining ~85% instead of actually doing work. PREEMPT is slightly better at ~60%, but still
far from 100%. It just keeps going into idle and I cannot understand why.

8. All in all, I do not have an obvious suspect for the performance difference, and no
candidate changes either. I also have no idea why it ends up in idle instead of doing work.

> 
>> johannes
>>

-- 
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/


