[PATCH v5 3/5] liveupdate: block session mutations during reboot

Pasha Tatashin pasha.tatashin at soleen.com
Wed May 27 13:06:39 PDT 2026


On 05-22 14:52, Pratyush Yadav wrote:
> On Mon, May 18 2026, Pasha Tatashin wrote:
> 
> > On 05-18 18:31, Pratyush Yadav wrote:
> >> On Mon, May 18 2026, Pasha Tatashin wrote:
> >> 
> >> > During the reboot() syscall, user processes may still be running
> >> > concurrently and attempting to mutate sessions (e.g., creating,
> >> > retrieving, or releasing sessions). To prevent this, introduce
> >> > luo_session_serialize_rwsem to synchronize mutations with the
> >> > serialization process.
> >> >
> >> > All session mutation operations (create, retrieve, release, ioctl) take
> >> > the read lock. The serialization process (luo_session_serialize) takes
> >> > the write lock and holds it indefinitely on success. This effectively
> >> > freezes the LUO session subsystem during the transition to the new
> >> > kernel. If serialization fails, the lock is released to allow recovery.
> >> 
> >> Good idea I think.
> >
> > Hi Pratyush,
> >
> >> 
> >> But, do we need a new lock? Can't we use luo_session_header->rwsem?
> >> Session creation and release take the header rwsem at one point anyway,
> >> so perhaps we can just reuse that?
> >
> > The sh->rwsem is for protecting the session list. We only take it as
> > a writer when modifying the list (insert/remove) and as a reader when 
> > traversing it. Also, we drop sh->rwsem as soon as we've acquired the 
> > per-session mutex to allow other list operations to proceed while a 
> > session is being modified.
> >
> > Because of this, many session mutation operations (specifically ioctl
> > calls) don't touch sh->rwsem at all; they jump straight to the session
> > state via the file's private_data. To use sh->rwsem to block
> > these mutations, we would be forced to add down_read(&sh->rwsem) to 
> > every ioctl path. This would be a layering violation, coupling list 
> > management to per-session data mutations, and would introduce a global
> > bottleneck for operations that are otherwise independent.
> 
> As for the layering violation, I think we would need to change the
> semantics of the lock -- it no longer protects only the list, but other
> session operations as well.
> 
> But yeah, if we do this then operations like session creation would have
> to wait for ongoing session operations like PRESERVE_FD. My argument was
> based around the fact that session creation or removal should not be
> very frequent (and don't happen in the hot path anyway) so the added
> latency should not affect them as much. By doing this tradeoff we get
> slightly simpler code (and simpler locking scheme).
> 
> But I see your point as well. In practice session creation and
> PRESERVE_FD are independent and one should not block the other. Maybe we
> get VMMs creating sessions while another VMM is preserving stuff, and
> this slowing down the live update preparation? Dunno...
> 
> I suppose let's go with this patch. But, can you please document the
> lock hierarchy where you explain what this lock is for?

SGTM, I added documentation about the locking.

Pasha


