[PATCH v3 0/3] A Solution to Re-enable hugetlb vmemmap optimize
Nanyong Sun
sunnanyong at huawei.com
Mon Mar 25 08:24:34 PDT 2024
On 2024/3/14 7:32, David Rientjes wrote:
> On Thu, 8 Feb 2024, Will Deacon wrote:
>
>>> How about take a new lock with irq disabled during BBM, like:
>>>
>>> +void vmemmap_update_pte(unsigned long addr, pte_t *ptep, pte_t pte)
>>> +{
>>> +	spin_lock_irq(NEW_LOCK);
>>> + pte_clear(&init_mm, addr, ptep);
>>> + flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
>>> + set_pte_at(&init_mm, addr, ptep, pte);
>>> + spin_unlock_irq(NEW_LOCK);
>>> +}
>> I really think the only maintainable way to achieve this is to avoid the
>> possibility of a fault altogether.
>>
>> Will
>>
>>
> Nanyong, are you still actively working on making HVO possible on arm64?
>
> This would yield a substantial memory savings on hosts that are largely
> configured with hugetlbfs. In our case, the size of this hugetlbfs pool
> is actually never changed after boot, but it sounds from the thread that
> there was an idea to make HVO conditional on FEAT_BBM. Is this being
> pursued?
>
> If so, any testing help needed?
I'm afraid FEAT_BBM may not solve the problem here. From the Arm ARM, I
understand that FEAT_BBM only relaxes the break-before-make requirement
when changing the block size of a translation. So for this HVO feature it
could help at the split-PMD stage, i.e. BBM could be avoided in
vmemmap_split_pmd, but in the subsequent vmemmap_remap_pte the output
address of the PTE still has to change, and I'm afraid FEAT_BBM does not
cover that case. Perhaps my understanding of FEAT_BBM is wrong, and I hope
someone can correct me.
Actually, the solution I first considered was to use stop_machine(), but
we have products that rely on /proc/sys/vm/nr_overcommit_hugepages to
allocate hugepages dynamically, so I have to take the performance impact
into account. If your product does not change the number of hugepages
after boot, stop_machine() may be a feasible way.
So far, I still haven't come up with a good solution.
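For reference, a stop_machine()-based variant of the earlier
vmemmap_update_pte() could look roughly like the sketch below. This is
only an illustration of the idea, not tested code: struct
vmemmap_remap_data and do_vmemmap_remap() are made-up names, and a real
implementation would have to batch all PTE updates of a remap operation
into a single stop_machine() call to keep the cost bearable.

```c
/*
 * Sketch only: serialize the break-before-make window against all other
 * CPUs with stop_machine(), so no CPU can fault on the vmemmap while
 * the PTE is transiently cleared.
 *
 * struct vmemmap_remap_data and do_vmemmap_remap() are hypothetical.
 */
struct vmemmap_remap_data {
	unsigned long addr;
	pte_t *ptep;
	pte_t pte;
};

static int do_vmemmap_remap(void *arg)
{
	struct vmemmap_remap_data *d = arg;

	/* All other CPUs are spinning with interrupts disabled here. */
	pte_clear(&init_mm, d->addr, d->ptep);
	flush_tlb_kernel_range(d->addr, d->addr + PAGE_SIZE);
	set_pte_at(&init_mm, d->addr, d->ptep, d->pte);
	return 0;
}

void vmemmap_update_pte(unsigned long addr, pte_t *ptep, pte_t pte)
{
	struct vmemmap_remap_data d = {
		.addr = addr, .ptep = ptep, .pte = pte,
	};

	/* One stop_machine() per PTE is very expensive; batching needed. */
	stop_machine(do_vmemmap_remap, &d, NULL);
}
```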
More information about the linux-arm-kernel mailing list