[RFCv2 4/7] bpf/kexec: Introduce three bpf kfunc for kexec
Alexei Starovoitov
alexei.starovoitov at gmail.com
Wed Apr 30 09:16:32 PDT 2025
On Wed, Apr 30, 2025 at 3:47 AM Pingfan Liu <piliu at redhat.com> wrote:
>
> On Wed, Apr 30, 2025 at 8:04 AM Alexei Starovoitov
> <alexei.starovoitov at gmail.com> wrote:
> >
> > On Mon, Apr 28, 2025 at 9:13 PM Pingfan Liu <piliu at redhat.com> wrote:
> > > +__bpf_kfunc struct mem_range_result *bpf_kexec_decompress(char *image_gz_payload,
> > > + int image_gz_sz,
> > > + unsigned int expected_decompressed_sz)
> > > +{
> > > + decompress_fn decompressor;
> > > + //todo, use flush to cap the memory size used by decompression
> > > + long (*flush)(void*, unsigned long) = NULL;
> > > + struct mem_range_result *range;
> > > + const char *name;
> > > + void *output_buf;
> > > + char *input_buf;
> > > + int ret;
> > > +
> > > + range = kmalloc(sizeof(struct mem_range_result), GFP_KERNEL);
> > > + if (!range) {
> > > + pr_err("fail to allocate mem_range_result\n");
> > > + return NULL;
> > > + }
> > > + refcount_set(&range->usage, 1);
> > > +
> > > + input_buf = vmalloc(image_gz_sz);
> > > + if (!input_buf) {
> > > + pr_err("fail to allocate input buffer\n");
> > > + kfree(range);
> > > + return NULL;
> > > + }
> > > +
> > > + ret = copy_from_kernel_nofault(input_buf, image_gz_payload, image_gz_sz);
> > > + if (ret < 0) {
> > > + pr_err("Error when copying from 0x%px, size:0x%x\n",
> > > + image_gz_payload, image_gz_sz);
> > > + kfree(range);
> > > + vfree(input_buf);
> > > + return NULL;
> > > + }
> > > +
> > > + output_buf = vmalloc(expected_decompressed_sz);
> > > + if (!output_buf) {
> > > + pr_err("fail to allocate output buffer\n");
> > > + kfree(range);
> > > + vfree(input_buf);
> > > + return NULL;
> > > + }
> > > +
> > > + decompressor = decompress_method(input_buf, image_gz_sz, &name);
> > > + if (!decompressor) {
> > > + pr_err("Can not find decompress method\n");
> > > + kfree(range);
> > > + vfree(input_buf);
> > > + vfree(output_buf);
> > > + return NULL;
> > > + }
> > > + //to do, use flush
> > > + ret = decompressor(image_gz_payload, image_gz_sz, NULL, NULL,
> > > + output_buf, NULL, NULL);
> > > +
> > > + /* Update the range map */
> > > + if (ret == 0) {
> > > + range->buf = output_buf;
> > > + range->size = expected_decompressed_sz;
> > > + range->status = 0;
> > > + } else {
> > > + pr_err("Decompress error\n");
> > > + vfree(output_buf);
> > > + kfree(range);
> > > + return NULL;
> > > + }
> > > + pr_info("%s, return range 0x%lx\n", __func__, range);
> > > + return range;
> > > +}
> >
> > These kfuncs look like generic decompress routines.
> > They're not related to kexec and probably should be in kernel/bpf/helpers.c
> > or kernel/bpf/compression.c instead of kernel/kexec_pe_image.c.
> >
>
> Thanks for your suggestion. I originally considered using these kfuncs
> only in kexec context (Later, introducing a dedicated BPF_PROG_TYPE
> for kexec).

We do not add new prog types anymore.
They're frozen just like the list of helpers.

> They are placed under a lock so that a malicious attack can
> not exhaust the memory by repeatedly calling the decompress
> kfunc.

Attack? This is all root only anyway and all memory is counted
towards memcg.
Make sure to use GFP_KERNEL_ACCOUNT and something similar
to bpf_map_get_memcg.

> To generalize these kfuncs, I think I can add some boundary control of
> the memory usage to prevent such attacks.

Don't reinvent the wheel. memcg is the mechanism.
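
Roughly, the accounting side could look like the fragment below. This is an
untested, kernel-side sketch that does not build on its own and is not taken
from the posted series. The helper names decompress_buf_alloc() and
range_alloc() are made up for illustration, and struct mem_range_result is
assumed to be the one defined by the patch.

```c
#include <linux/gfp.h>
#include <linux/slab.h>
#include <linux/vmalloc.h>

/* Assumed to be defined by the patch under review. */
struct mem_range_result;

/*
 * Hypothetical helper: allocate a decompression buffer whose pages are
 * charged to a memcg. GFP_KERNEL_ACCOUNT is GFP_KERNEL | __GFP_ACCOUNT,
 * so the charge goes to the current task's memcg unless an active memcg
 * was set beforehand.
 */
static void *decompress_buf_alloc(unsigned long size)
{
	return __vmalloc(size, GFP_KERNEL_ACCOUNT);
}

/*
 * Hypothetical helper: the small bookkeeping object is accounted too.
 * The size is passed in so this fragment does not need the struct layout.
 */
static struct mem_range_result *range_alloc(size_t range_sz)
{
	return kzalloc(range_sz, GFP_KERNEL_ACCOUNT);
}
```

With the allocations charged this way, the memcg limit of the caller bounds
how much memory repeated decompress calls can pin, so no separate cap is
needed in the kfunc itself.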
> > They also must be KF_SLEEPABLE.
> > Please test your patches with all kernel debugs enabled.
> > Otherwise you would have seen all these "sleeping while atomic"
> > issues yourself.
> >
>
> I see. I will enable all these debug options when testing V3.
>
> Appreciate your insight.
>
> Regards,
>
> Pingfan
>
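
On the KF_SLEEPABLE point quoted above, a registration along these lines is
what is meant. It is an untested, kernel-side sketch that does not build on
its own. The set name kexec_kfunc_ids, the init function name, the choice of
BPF_PROG_TYPE_TRACING, and the extra KF_ACQUIRE | KF_RET_NULL flags are
assumptions for illustration and are not taken from the posted series.

```c
#include <linux/bpf.h>
#include <linux/btf.h>
#include <linux/btf_ids.h>
#include <linux/init.h>
#include <linux/module.h>

/*
 * KF_SLEEPABLE: the kfunc may sleep (vmalloc, GFP_KERNEL allocations), so
 * the verifier only allows calls from sleepable BPF programs.
 * KF_ACQUIRE | KF_RET_NULL are assumed here because the kfunc returns a
 * refcounted object or NULL.
 */
BTF_KFUNCS_START(kexec_kfunc_ids)
BTF_ID_FLAGS(func, bpf_kexec_decompress, KF_SLEEPABLE | KF_ACQUIRE | KF_RET_NULL)
BTF_KFUNCS_END(kexec_kfunc_ids)

static const struct btf_kfunc_id_set kexec_kfunc_set = {
	.owner = THIS_MODULE,
	.set   = &kexec_kfunc_ids,
};

static int __init kexec_kfunc_init(void)
{
	/* The program type is an assumption; pick the one the loader uses. */
	return register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING,
					 &kexec_kfunc_set);
}
late_initcall(kexec_kfunc_init);
```

Building the test kernel with CONFIG_DEBUG_ATOMIC_SLEEP=y (and lockdep) is
what makes a missing KF_SLEEPABLE show up as a "sleeping function called
from invalid context" splat.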