[Hypervisor Live Update] Notes from September 8, 2025
David Rientjes
rientjes at google.com
Sat Sep 13 18:56:15 PDT 2025
Hi everybody,
Here are the notes from the last Hypervisor Live Update call that happened
on Monday, September 8. Thanks to everybody who was involved!
These notes are intended to bring people up to speed who could not attend
the call as well as keep the conversation going in between meetings.
----->o-----
I relayed updates on KHO, LUO, and memfd preservation:
- Jason Miu shared the page-table-like implementation for stateless KHO
with Pasha, Pratyush, and Mike; they'll send this out more widely after
iterating on it
- Work on LUO v4 continues, with session functionality still a work in
progress. luod will be able to create the session through fds and pass
them to the clients, and fds can only be preserved and unpreserved as
sessions; this requires an update to the design doc
- Pratyush shared an updated version of memfd preservation; we chatted
about the 1GB limitation. He will be incorporating changes for that
into his next patch series.
Chris clarified that the page-table-like implementation being worked on
today does not use real page tables that get loaded into cr3; it is an
abstraction, unlike PKRAM, which used real page tables.
----->o-----
Chris Li brought up a topic from an internal discussion on whether
stateless KHO would support folio split after the page has been preserved.
The answer was no: if a folio needs to be split, it must be unpreserved
and then split before being preserved again at the lower order.
This has some complications if something under the hood is doing collapse
or split. Chris clarified that this happens during the preservation phase
of memory (live update prepare).
I suggested that things like khugepaged doing collapse could easily be
enlightened to avoid this but split is more difficult. David Matlack
asked what actually happens when we try to split folios that have been
preserved, i.e. what's the failure mode? Chris suggested that this isn't
handled at all and the new kernel will see the wrong order. Jason
Gunthorpe said it is user error if folios are split after preservation;
once a memfd, for example, has been serialized, you cannot do things like
punch holes in it.
Chris said that the call should return an error when this happens and we'd
want to avoid silent errors. However, the only rule that exists today is
that a folio must be unpreserved before it is split and then preserved
again.
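The rule discussed above (unpreserve, split, then preserve again at the
lower order), together with Chris's point that a split on a preserved
folio should return an error rather than fail silently, can be sketched as
a toy model. This is only an illustration: FolioTracker,
PreservedFolioError, and split_preserved are hypothetical names, not part
of KHO, LUO, or any kernel API, and the model ignores everything except
folio order bookkeeping.

```python
# Toy model only: all names here are hypothetical, not the KHO or LUO API.
class PreservedFolioError(Exception):
    """Raised when a split is attempted on a folio that is still preserved."""


class FolioTracker:
    def __init__(self):
        # Maps a folio's starting page frame number (pfn) to its order.
        self.folios = {}
        # Starting pfns of folios currently marked as preserved.
        self.preserved = set()

    def add(self, pfn, order):
        self.folios[pfn] = order

    def preserve(self, pfn):
        if pfn not in self.folios:
            raise KeyError(pfn)
        self.preserved.add(pfn)

    def unpreserve(self, pfn):
        self.preserved.discard(pfn)

    def split(self, pfn, new_order):
        order = self.folios[pfn]
        # Fail loudly instead of letting the recorded order go stale.
        if pfn in self.preserved:
            raise PreservedFolioError(f"folio at pfn {pfn} is preserved")
        if not 0 <= new_order < order:
            raise ValueError("new_order must be lower than the current order")
        del self.folios[pfn]
        step = 1 << new_order
        pieces = list(range(pfn, pfn + (1 << order), step))
        for piece in pieces:
            self.folios[piece] = new_order
        return pieces


def split_preserved(tracker, pfn, new_order):
    """Unpreserve, split, then preserve each piece at the lower order."""
    tracker.unpreserve(pfn)
    pieces = tracker.split(pfn, new_order)
    for piece in pieces:
        tracker.preserve(piece)
    return pieces
```

In this sketch, calling split() directly on a preserved folio raises
PreservedFolioError and leaves the recorded order unchanged, while
split_preserved() follows the unpreserve, split, preserve sequence so the
preserved set always matches the current folio orders.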
----->o-----
Chris discussed KSerial and noted that it is a kernel-to-kernel interface,
not a kernel-to-user interface, so it should be considered possible to
update later. Chris mentioned that he was working on incremental helper
functions. There was also discussion about posting patches before any LPC
talks as a general best practice.
For PCI preservation, Chris said he would send an updated patch series
this week.
----->o-----
Andrey Ryabinin noted that he had prepared an update to KSTATE on top of
LUO and suggested that KSerial may be too much. He asked if there was
interest in this being shared, and the group echoed that there was
interest. The patches need a little more work, but the implementation works.
Chris suggested there may be some good collaboration opportunities here.
Andrey would also submit an LPC discussion for this.
----->o-----
David Matlack noted that Vipin Sharma had discussed VFIO support for live
update at KVM Forum[1]. He suggested that we may want to have the same
talk happen in this meeting series. There are no patches available yet,
but they will be available for LPC. We planned on setting up about 25
minutes in the next meeting for this topic.
----->o-----
Next meeting will be on Monday, September 22 at 8am PDT (UTC-7), everybody
is welcome: https://meet.google.com/rjn-dmzu-hgq
Topics for the next meeting:
- update on latest status of LUO after v4 and next steps for merge into
akpm's tree
- any updates on luod based on internal and external feedback
- update on memfd preservation and status of 1GB limitation before
integration into akpm's tree
- discussion on what happens when we split folios while preserved and if
there are failure modes when this is attempted
- update on the latest status of PCI preservation, registration, and
initialization
- any updates on VFIO with noiommu as a next step for preservation after
Chris's v2
- update on status of KSTATE and its integration with LUO
- 25 min: VFIO support for live update from Vipin Sharma
- later: testing methodology to allow downstream consumers to qualify
that live update works from one version to another
- later: reducing blackout window during live update
Please let me know if you'd like to propose additional topics for
discussion, thank you!
[1] https://www.youtube.com/watch?v=IYfR1jYeM1g&t=1583s