[PATCH 00/18] bpf: Secure and authenticated preloading of eBPF programs

Fri Apr 1 16:55:37 PDT 2022

On Thu, Mar 31, 2022 at 08:25:22AM +0000, Roberto Sassu wrote:
> > From: Alexei Starovoitov [mailto:alexei.starovoitov at gmail.com]
> > Sent: Thursday, March 31, 2022 4:27 AM
> > On Mon, Mar 28, 2022 at 07:50:15PM +0200, Roberto Sassu wrote:
> > > eBPF already allows programs to be preloaded and kept running without
> > > intervention from user space. There is a dedicated kernel module called
> > > bpf_preload, which contains the light skeleton of the iterators_bpf eBPF
> > > program. If this module is enabled in the kernel configuration, its loading
> > > will be triggered when the bpf filesystem is mounted (unless the module is
> > > built-in), and the links of iterators_bpf are pinned in that filesystem
> > > (they will appear as the progs.debug and maps.debug files).
> > >
> > > However, the current mechanism, if used to preload an LSM, would not offer
> > > the same security guarantees of LSMs integrated in the security subsystem.
> > > Also, it is not generic enough to be used for preloading arbitrary eBPF
> > > programs, unless the bpf_preload code is heavily modified.
> > >
> > > More specifically, the security problems are:
> > > - any program can be pinned to the bpf filesystem without limitations
> > >   (unless a MAC mechanism enforces some restrictions);
> > > - programs being executed can be terminated at any time by deleting the
> > >   pinned objects or unmounting the bpf filesystem.
> > 
> > So many things to untangle here.
> 
> Hi Alexei
> 
> thanks for taking the time to provide such a detailed
> explanation.
> 
> > The above paragraphs are misleading and incorrect.
> > The commit log sounds like there are security issues that this
> > patch set is fixing.
> > This is not true.
> 
> I reiterate the goal: enforce a mandatory policy with
> an out-of-tree LSM (a kernel module is fine), with the
> same guarantees as LSMs integrated in the security
> subsystem.

To make it 100% clear:
Any in-kernel feature that benefits an out-of-tree module will be rejected.

> The root user is not part of the TCB (i.e. is untrusted),
> all the changes that user wants to make must be subject
> to a decision by the LSM enforcing the mandatory policy.
> 
> I thought about adding support for LSMs from kernel
> modules via a new built-in LSM (called LoadLSM), but

Such an approach will be rejected. See above.

> > I suspect there is huge confusion on what these two "progs.debug"
> > and "maps.debug" files are in a bpffs instance.
> > They are debug files to pretty-print loaded maps and progs for folks who
> > like to use 'cat' to examine the state of the system instead of 'bpftool'.
> > Root can remove these files from bpffs.
> > 
> > There is no reason for a kernel module to pin its bpf progs.
> > If you want to develop DIGLIM as a kernel module that uses a light skeleton
> > just do:
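> > #include <linux/init.h>
> > #include <linux/module.h>
> > #include "diglim.lskel.h"
> > 
> > static struct diglim_bpf *skel;
> > 
> > static int __init load(void)
> > {
> >         int err;
> > 
> >         /* open_and_load returns NULL on failure */
> >         skel = diglim_bpf__open_and_load();
> >         if (!skel)
> >                 return -ENOMEM;
> >         err = diglim_bpf__attach(skel);
> >         if (err)
> >                 diglim_bpf__destroy(skel);
> >         return err;
> > }
> > /* detach skel in __fini */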
> > 
> > It's really that short.
> > 
> > Then you will be able to
> > - insmod diglim.ko -> will load and attach bpf progs.
> > - rmmod diglim -> will detach them.
> 
> root can stop the LSM without consulting the security
> policy. The goal of having root untrusted is not achieved.

An out-of-tree module can do any hack.
For example:
1. don't detach skel in __fini
  rmmod will remove the module, but bpf progs will keep running.
2. do __module_get(THIS_MODULE) in __init
  rmmod will return EBUSY
  and have some out-of-band way of dropping mod refcnt.
3. hack into sys_delete_module. if module_name==diglim return EBUSY.
4. add a proper LSM hook to delete_module

> My point was that pinning progs seems to be the
> recommended way of keeping them running. 

Not quite. bpf_link refcnt is what keeps progs attached.
bpffs is mainly used for:
- passing maps/links from one process to another
  when passing an fd is not possible.
- handling the case of crashing user space.
  The user space agent will restart and pick up where
  it left off by reading map, link, prog FDs from bpffs.
- pinning bpf iterators that are later used to 'cat' such files.
  That is what bpf_preload is doing by creating two debug
  files "maps.debug" and "progs.debug".

> Pinning
> them to unreachable inodes intuitively looked the
> way to go for achieving the stated goal. 

We can consider inodes in bpffs that root cannot unlink
in the future, but certainly not for this use case.

> Or maybe I
> should just increment the reference count of links
> and not decrement it during an rmmod?

I suggest abandoning the out-of-tree goal.
Only then can we help and continue this discussion.