[PATCH v5 3/5] liveupdate: block session mutations during reboot
Pasha Tatashin
pasha.tatashin at soleen.com
Wed May 27 13:06:39 PDT 2026
On 05-22 14:52, Pratyush Yadav wrote:
> On Mon, May 18 2026, Pasha Tatashin wrote:
>
> > On 05-18 18:31, Pratyush Yadav wrote:
> >> On Mon, May 18 2026, Pasha Tatashin wrote:
> >>
> >> > During the reboot() syscall, user processes may still be running
> >> > concurrently and attempting to mutate sessions (e.g., creating,
> >> > retrieving, or releasing sessions). To prevent this, introduce
> >> > luo_session_serialize_rwsem to synchronize mutations with the
> >> > serialization process.
> >> >
> >> > All session mutation operations (create, retrieve, release, ioctl) take
> >> > the read lock. The serialization process (luo_session_serialize) takes
> >> > the write lock and holds it indefinitely on success. This effectively
> >> > freezes the LUO session subsystem during the transition to the new
> >> > kernel. If serialization fails, the lock is released to allow recovery.
> >>
> >> Good idea I think.
> >
> > Hi Pratyush,
> >
> >>
> >> But, do we need a new mutex? Can't we use luo_session_header->rwsem?
> >> Session creation and release take the header rwsem at one point anyway,
> >> so perhaps we can just reuse that?
> >
> > The sh->rwsem is for protecting the session list. We only take it as
> > a writer when modifying the list (insert/remove) and as a reader when
> > traversing it. Also, we drop sh->rwsem as soon as we've acquired the
> > per-session mutex to allow other list operations to proceed while a
> > session is being modified.
> >
> > Because of this, many session mutation operations (specifically ioctl
> > calls) don't touch sh->rwsem at all—they jump straight to the session
> > state via the file's private_data. To use sh->rwsem to block
> > these mutations, we would be forced to add down_read(&sh->rwsem) to
> > every ioctl path. This would be a layering violation, coupling list
> > management to per-session data mutations, and would introduce a global
> > bottleneck for operations that are otherwise independent.
>
> As for the layering violation, I think we would need to change the
> semantics of the lock -- it no longer protects only the list, but other
> session operations as well.
>
> But yeah, if we do this then operations like session creation would have
> to wait for ongoing session operations like PRESERVE_FD. My argument was
> based around the fact that session creation or removal should not be
> very frequent (and don't happen in the hot path anyway) so the added
> latency should not affect them as much. By doing this tradeoff we get
> slightly simpler code (and simpler locking scheme).
>
> But I see your point as well. In practice session creation and
> PRESERVE_FD are independent and one should not block the other. Maybe we
> get VMMs creating sessions while another VMM is preserving stuff, and
> this slowing down the live update preparation? Dunno...
>
> I suppose let's go with this patch. But, can you please document the
> lock hierarchy where you explain what this lock is for?
SGTM, I have added documentation about the locking.
Pasha