[RFC PATCH 0/3] um: clean up mm creation - another attempt
Anton Ivanov
anton.ivanov at cambridgegreys.com
Wed Sep 27 02:59:19 PDT 2023
On 27/09/2023 10:52, Benjamin Berg wrote:
> Hi,
>
> On Tue, 2023-09-26 at 14:38 +0200, Johannes Berg wrote:
>> [SNIP]
>> 1. Start from scratch, without copying, which my other patch [1] did.
>
> I really think we should go ahead with that approach. Then follow up
> with optimizations.
+1
>
>> [SNIP]
>>
>> I think the better approach for correctness and integration into the
>> kernel would be to actually admit that UML is special because page
>> faults are so expensive, and
>>
>> * start with a fresh mm process every time
>> * have vma_needs_copy() return true
>> * completely fill the mappings according to only the new mm's VMAs
>> in arch_dup_mmap() or perhaps later
>>
>> I don't know how that'd behave wrt. performance, though it likely cannot
>> be better than with these patches, but at least it'd be more correct,
>> and more obviously correct too, for starters, because then the actual
>> mappings in the UML mm process would actually reflect the PTEs that
>> Linux knows about.
>
> Yes, performance may degrade, but the implementation should be correct
> in the first place. Note that even though we looked at it (and e.g.
> found that MADV_DONTFORK is incorrect), we have not figured out why the
> first approach is slower currently as everything interesting should be
> getting unmapped by the force_flush_all.
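
For anyone who wants to check the "don't fork" behaviour from inside a
guest, here is a minimal userspace sketch. It assumes the flag meant
above is madvise(MADV_DONTFORK) and needs Python 3.8+ on Linux. The
function name is made up for illustration. A region marked this way
must be absent in the child, so the child should die with SIGSEGV when
it touches it.

```python
import mmap
import os
import signal


def child_sees_dontfork_mapping():
    """Return True if a forked child can still read a MADV_DONTFORK region.

    On a correct kernel the region is not inherited, so the child is
    killed by SIGSEGV on first access and this returns False.
    """
    region = mmap.mmap(-1, mmap.PAGESIZE,
                       flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    region[0] = 42                       # populate the page in the parent
    region.madvise(mmap.MADV_DONTFORK)   # exclude the region from fork()

    pid = os.fork()
    if pid == 0:
        # Child: this read should fault because the mapping was not copied.
        _ = region[0]
        os._exit(0)                      # only reached if the mapping leaked

    _, status = os.waitpid(pid, 0)
    region.close()
    killed_by_segv = (os.WIFSIGNALED(status)
                      and os.WTERMSIG(status) == signal.SIGSEGV)
    return not killed_by_segv
```

If this returns True inside UML, the child inherited a mapping it
should not have.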
>
> Once we are there, we can look for optimizations. The fundamental
> problem is that page faults (even minor ones) are extremely expensive
> for us.
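
To make the minor fault point concrete, a rough sketch that counts the
faults taken on first touch versus second touch of anonymous memory. It
reads ru_minflt from getrusage(), and the helper names are made up for
illustration. The counter is process wide, so expect a little noise
from the interpreter itself.

```python
import mmap
import resource

PAGE = mmap.PAGESIZE


def minor_faults():
    """Minor page faults taken by this process so far (ru_minflt)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt


def touch_pages(region, num_pages):
    """Write one byte into every page; return the minor faults this took."""
    before = minor_faults()
    for i in range(num_pages):
        region[i * PAGE] = 1
    return minor_faults() - before


def first_vs_second_touch(num_pages=256):
    """Return (faults on first touch, faults on second touch)."""
    region = mmap.mmap(-1, num_pages * PAGE,
                       flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        first = touch_pages(region, num_pages)
        second = touch_pages(region, num_pages)
    finally:
        region.close()
    return first, second
```

On a regular host the first pass should take roughly one fault per page
and the second close to none. Timing the two passes inside UML gives a
feel for what each of those faults costs us.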
>
> Just throwing out ideas on what we could do:
> 1. SECCOMP as that reduces the number of context switches.
> (Yes, I know I should resubmit the patchset)
Actually... YES, YES and YES.
I was just looking at all the workarounds which are in place to prevent
guest processes from doing a syscall on the host. If this is prohibited
at a higher level we should get quite a boost as all these PTRACE_PEEKs
will become unnecessary.
> 2. Maybe we can disable/cripple page access tracking? If we
> initially mark all pages as accessed by userspace (i.e.
> pte_mkyoung), then we avoid a minor page fault on first access.
> Doing that will mess with page eviction though.
> 3. Do DAX (direct_access) for files. i.e. mmap files directly in the
> host kernel rather than through UM.
> With a hostfs like file system, one should be able to add an
> intermediate block device that maps host files to physical pages,
> then do DAX in the FS.
> For disk images, the existing iomem infrastructure should be
> usable, this should work with any DAX enabled filesystems (ext2,
> ext4, xfs, virtiofs, erofs).
I had some plans to do a ubd gen 2 which uses mmap and/or this. They are
presently way on the backburner. We can do some of that once we push
the new VM changes.
>
> Benjamin
>
>>
>>> 2. The preemption patches work fine on top (all 3 cases). The
>>> performance difference stays.
>>
>> OK.
>>
>>> 3. We do not have anything of value to add in terms of
>>> cond_resched() to the drivers :(
>>> Most drivers are fairly simplistic with no safe points to add this.
>>
>> Yeah, not surprised by this.
>>
>>> 6. Do we still need force_flush_all() in the arch_dup_mmap()? This
>>> works with a non-forced tlb flush
>>> using flush_tlb_mm(mm);
>>
>> Maybe not, does it make a difference though?
>>
>>> 7. In all cases, UML is doing something silly.
>>> The CPU usage while doing find -type f -exec cat {} > /dev/null
>>> measured from outside in non-preemptive and
>>> PREEMPT_VOLUNTARY stays around 8-15%. The UML takes a sabbatical
>>> for the remaining 85% instead of actually
>>> doing work. PREEMPT is slightly better at 60%, but still far from
>>> 100%. It just keeps going into idle and I
>>> cannot understand why.
>>
>> Is it just waiting for IO?
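
For reference, one way to take the "measured from outside" number is to
sample utime + stime from /proc/<pid>/stat on the host. The sketch
below is only an illustration with made-up helper names, not the tool
used for the figures above. It looks at a single host pid, so it will
miss time spent in the separate host processes that run guest
userspace.

```python
import os
import time


def parse_cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # comm (field 2) may contain spaces or parentheses, so split after the
    # last ')'. The remainder starts at field 3 (state), which puts utime
    # (field 14) at index 11 and stime (field 15) at index 12.
    rest = stat_line.rsplit(")", 1)[1].split()
    return int(rest[11]) + int(rest[12])


def cpu_percent(pid, interval=1.0):
    """Approximate CPU usage of one host pid over `interval` seconds."""
    ticks_per_sec = os.sysconf("SC_CLK_TCK")

    def read_ticks():
        with open(f"/proc/{pid}/stat") as f:
            return parse_cpu_ticks(f.read())

    start = read_ticks()
    time.sleep(interval)
    return 100.0 * (read_ticks() - start) / (ticks_per_sec * interval)
```

Comparing this against the iowait the guest itself reports would help
tell "waiting for IO" apart from plain idling.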
>>
>> johannes
>>
--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/