[RFC PATCH 0/8] namespace-aware configfs

Hannes Reinecke hare at suse.de
Wed Jun 17 05:55:48 PDT 2026


On 6/17/26 13:58, Christian Brauner wrote:
>> Open Issues:
>> - I've added a new function 'mnt_clone_direct()' to clone the vfsmount
>>    entry (the original code just did a simple_pin_fs()). Not sure if
>>    that's correct. Christian?
> 
> I think we can put that issue aside for a second as there are a few
> bigger design questions.
> 
>> - The current cloning mechanism is not really a hierarchy, but rather
>>    always using the default namespace to find registered subsystems.
>>    Meaning you can only call 'register_subsystem()' for the default
>>    namespace. But then one shouldn't call modprobe in a container, so
>>    that's okay I guess.
> 
> I'm not sure I follow what this is trying to say, sorry.
> 
>> - The original content of the configfs remains visible even from
>>    within the container, and the new 'mount' will just overlay that.
> 
> I again fail to understand what precisely this means.
> 
The configuration in /sys/kernel/config will remain accessible from
the original namespace, but after an 'unshare' call /sys/kernel/config
will start off empty as a different 'instance' of the filesystem is
presented.

>>    Ideally I would have the container start off with an empty
>>    /sys/kernel/config to avoid configuration issues. But again I've
>>    no idea how to do that (or if it's even possible).
> 
> Shouldn't be a problem to implement this. Just means more work.
> 
> So the rough concept in my head would be that configfs is a
> multi-instance filesystem that remembers the namespaces it was mounted
> similar to how kernfs allows to be tagged with namespaces. It then only
> shows/allows registration of subsystems that match the relevant
> namespace tags.
> 
Correct. That was my intention.
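
To make the tagging idea concrete, here is a rough userspace sketch of
how I read it. It is not code from the patchset: 'struct net_ns',
'struct cfs_subsys', 'struct cfs_instance' and both helpers are
hypothetical names made up for illustration, and the real thing would
use the kernel's namespace objects instead. Each mounted instance
remembers one namespace tag and only lists subsystems carrying the same
tag:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a network namespace; only its address is used. */
struct net_ns { int id; };

/* Hypothetical registered subsystem, tagged with the namespace it belongs to. */
struct cfs_subsys {
	const char *name;
	const struct net_ns *ns_tag;
};

/* Hypothetical per-mount instance that remembers the namespace it was mounted in. */
struct cfs_instance {
	const struct net_ns *ns_tag;
};

/* A subsystem is visible only if its tag matches the tag of the instance. */
static int cfs_subsys_visible(const struct cfs_instance *inst,
			      const struct cfs_subsys *s)
{
	assert(inst && s);
	return s->ns_tag == inst->ns_tag;
}

/* Store the names visible in 'inst' into 'out' (at most 'max'); return the count. */
static size_t cfs_list_visible(const struct cfs_instance *inst,
			       const struct cfs_subsys *subsys, size_t n,
			       const char **out, size_t max)
{
	size_t count = 0;

	for (size_t i = 0; i < n && count < max; i++)
		if (cfs_subsys_visible(inst, &subsys[i]))
			out[count++] = subsys[i].name;
	return count;
}
```

With that model an instance mounted in a fresh namespace lists nothing
until a subsystem is registered with its tag, which is the 'start off
empty' behaviour discussed above.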

> The challenge I see is that configfs shows subsystems that may be tied
> to different namespaces.

If by 'namespaces' you mean 'namespace types', then yes. However,
I _think_ we might get away with just considering the network namespace
for now.
Support for multiple namespace types would not only be challenging to
implement, I also don't have a good use-case for it. The only use-case
I have so far is the NVMe target, and that is primarily tied to the
network namespace (if we ignore FC and loopback for the moment).
So I'd be perfectly happy if we restrict ourselves to network namespaces.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare at suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


