[PATCH v3 0/3] A Solution to Re-enable hugetlb vmemmap optimize

David Hildenbrand david at redhat.com
Thu Jun 6 01:30:23 PDT 2024


>> Additionally, we also should alter RO permission of those 7 tail pages
>> to RW to avoid panic().
> 
> We can use RCU, which IMO is a better choice, as the following:
> 
> get_page_unless_zero()
> {
>    int rc = false;
> 
>    rcu_read_lock();
> 
>    if (page_is_fake_head(page) || !page_ref_count(page)) {
>          smp_mb(); // implied by atomic_add_unless()
>          goto unlock;
>    }
> 
>    rc = page_ref_add_unless();
> 
> unlock:
>    rcu_read_unlock();
> 
>    return rc;
> }
> 
> And on the HVO/de-HOV sides:
> 
>    folio_ref_unfreeze();
>    synchronize_rcu();
>    HVO/de-HVO;
> 
> I think this is a lot better than making tail page metadata RW because:
> 1. it helps debug, IMO, a lot;
> 2. I don't think HVO is the only one that needs this.
> 
> David (we missed you in today's THP meeting),

Sorry, I had a private meeting conflict :)

> 
> Please correct me if I'm wrong -- I think virtio-mem also suffers from
> the same problem when freeing offlined struct page, since I wasn't
> able to find anything that would prevent a **speculative** struct page
> walker from trying to access struct pages belonging to pages being
> concurrently offlined.

virtio-mem does currently not yet optimize fake-offlined memory like HVO 
would. So the only way we really remove "struct page" metadata is by 
actually offlining+removing a complete Linux memory block, like ordinary 
memory hotunplug would.

It might be an interesting project to optimize "struct page" metadata 
consumption for fake-offlined memory chunks within an online Linux 
memory block.

The biggest challenge might be interaction with memory hotplug, which 
requires all "struct page" metadata to be allocated. So that would make 
cases where virtio-mem hot-plugs a Linux memory block but keeps parts of 
it fake-offline a bit more problematic to handle .

In a world with memdesc this might all be nicer to handle I think :)


There is one possible interaction between virtio-mem and speculative 
page references: all fake-offline chunks in a Linux memory block do have 
on each page a refcount of 1 and PageOffline() set. When actually 
offlining the Linux memory block to remove it, virtio-mem will drop that 
reference during MEM_GOING_OFFLINE, such that memory offlining can 
proceed (seeing refcount==0 and PageOffline()).

In virtio_mem_fake_offline_going_offline() we have:

if (WARN_ON(!page_ref_dec_and_test(page)))
	dump_page(page, "fake-offline page referenced");

which would trigger on a speculative reference.

We never saw that trigger so far because quite a long time must have 
passed ever since a page might have been part of the page cache / page 
tables, before virtio-mem fake-offlined it (using alloc_contig_range()) 
and the Linux memory block actually gets offlined.

But yes, RCU (e.g., on the memory offlining path) would likely be the 
right approach to make sure GUP-fast and the pagecache will no longer 
grab this page by accident.

> 
> If this is true, we might want to map a "zero struct page" rather than
> leave a hole in vmemmap when offlining pages. And the logic on the hot
> removal side would be similar to that of HVO.

Once virtio-mem would do something like HVO, yes. Right now virtio-mem 
only removes struct-page metadata by removing/unplugging its owned Linux 
memory blocks once they are fully "logically offline".

-- 
Cheers,

David / dhildenb




More information about the linux-arm-kernel mailing list