[PATCH] nvmet: allow associating port to a cgroup via configfs

Tejun Heo tj at kernel.org
Mon Jul 3 12:16:21 PDT 2023


Hello,

On Mon, Jul 03, 2023 at 04:21:40PM +0300, Sagi Grimberg wrote:
> > Throttling files and passthru isn't possible with cgroup v1 either;
> > cgroup v2 broke the ability to throttle bdevs. The purpose of the patch
> > is to re-enable the broken functionality.
> 
> cgroupv2 didn't break anything; this was never an intended feature of
> the Linux nvme target, so it couldn't have been broken. Did anyone
> know that people are doing this with nvmet?

Maybe he's referring to the fact that cgroup1 allowed throttling root
cgroups? Maybe they were throttling from the root cgroup on the client side?
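
For reference, here's a rough sketch of how cgroup2 blkio throttling is keyed:
limits go into the io.max file of a non-root cgroup, one line per device,
identified by major:minor. The helper name and the example values below are
made up for illustration; this is not code from the patch.

```python
def format_io_max(major, minor, **limits):
    """Build one line for a cgroup2 io.max file.

    Hypothetical helper, for illustration only. Accepts the four io.max
    keys (rbps, wbps, riops, wiops); each value is a positive integer or
    the string "max" (meaning no limit).
    """
    allowed = ("rbps", "wbps", "riops", "wiops")
    unknown = set(limits) - set(allowed)
    if unknown:
        raise ValueError(f"unknown io.max keys: {sorted(unknown)}")

    parts = [f"{major}:{minor}"]
    for key in allowed:  # fixed order so the output is predictable
        if key not in limits:
            continue
        value = limits[key]
        if value != "max" and (not isinstance(value, int) or value <= 0):
            raise ValueError(f"{key} must be a positive integer or 'max'")
        parts.append(f"{key}={value}")
    return " ".join(parts)


# Example: cap reads on device 8:16 at 1 MiB/s and leave write IOPS unlimited.
line = format_io_max(8, 16, rbps=1048576, wiops="max")
```

The resulting line would be written to the io.max file of a child cgroup.
The root cgroup has no io.max file, which is why throttling from the root
isn't possible in cgroup2.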

> I'm pretty sure others on the list are treating this as a suggested
> new feature for nvmet, and designing this feature as something that
> is only supported for blkdevs is undesirable.
> 
> > There was an attempt to re-enable the functionality by allowing io
> > throttle on the root cgroup but it's against the cgroup v2 design.
> > Reference:
> > https://lore.kernel.org/r/20220114093000.3323470-1-yukuai3@huawei.com/
> > 
> > A full blown nvmet cgroup controller may be a complete solution, but it
> > may take some time to achieve,
> 
> I don't see any other sane solution here.
> 
> Maybe Tejun/others think differently here?

I'm not necessarily against the idea of enabling subsystems to assign cgroup
membership to entities which aren't processes or threads. It does make sense
for cases where a kernel subsystem is serving multiple classes of users
which aren't processes, as here, and it's likely that we'd need something
similar for certain memory regions in a limited way (e.g. a tmpfs chunk
shared across multiple cgroups).

That said, because we haven't done this before, we haven't figured out what
the API should look like and we definitely want something which can be used
in a similar fashion across the board. Also, cgroup does assume that
resources are always associated with processes or threads, and making this
work with non-task entities would require some generalization there. Maybe
the solution is to always have a tying kthread which serves as a proxy for
the resource, but that seems a bit nasty, at least at first thought.

In principle, at least from the cgroup POV, I think the idea of being able to
assign cgroup membership to subsystem-specific entities is okay. In
practice, there are quite a few challenges to address.

Thanks.

-- 
tejun



More information about the Linux-nvme mailing list