[PATCH 7/8] sched_ext: Sub-allocator over kernel-claimed BPF arena pages
Andrea Righi
arighi at nvidia.com
Thu May 21 00:56:05 PDT 2026
Hi Tejun,
On Wed, May 20, 2026 at 01:50:51PM -1000, Tejun Heo wrote:
> Build a per-scheduler sub-allocator on top of pages claimed from the BPF
> arena registered in the previous patch. Subsequent kernel-managed
> arena-resident structures (e.g. per-CPU set_cmask cmask) carve their storage
> from this pool.
>
> scx_arena_pool_init() creates a gen_pool. scx_arena_alloc() returns the
> kernel VA. On exhaustion, the pool grows by claiming more pages via
> bpf_arena_alloc_pages_sleepable(). Chunks are added at the kernel-side
> mapping address; callers translate to the BPF-arena form themselves if
> needed.
>
> Allocations sleep (GFP_KERNEL) - they may grow the pool through vzalloc and
> arena page allocation. All current consumers run from the enable path (after
> ops.init() and the kernel-side arena auto-discovery, before validate_ops()),
> where sleeping is fine.
>
> scx_arena_pool_destroy() walks each chunk, returns outstanding ranges to the
> gen_pool with gen_pool_free() and then calls gen_pool_destroy(). The
> underlying arena pages are released when the arena map itself is torn down,
> so the pool destroy doesn't free them explicitly.
>
> Signed-off-by: Tejun Heo <tj at kernel.org>
> ---
...
> +/*
> + * Allocate @size bytes from the arena pool. Returns kernel VA on success, NULL
> + * on failure. May grow the pool via scx_arena_grow() which sleeps. Caller must
> + * be in a GFP_KERNEL context.
> + */
> +void *scx_arena_alloc(struct scx_sched *sch, size_t size)
> +{
> +	unsigned long kern_va;
> +	u32 page_cnt;
> +
> +	might_sleep();
> +
> +	if (!sch->arena_pool)
> +		return NULL;
> +
> +	kern_va = gen_pool_alloc(sch->arena_pool, size);
> +	if (!kern_va) {
> +		page_cnt = max_t(u32, SCX_ARENA_GROW_PAGES,
> +				 (size + PAGE_SIZE - 1) >> PAGE_SHIFT);
> +		if (scx_arena_grow(sch, page_cnt))
> +			return NULL;
> +		kern_va = gen_pool_alloc(sch->arena_pool, size);
> +		if (!kern_va)
> +			return NULL;

IIUC, since @page_cnt is sized to cover @size and the new chunk is added to the
pool entirely free, this second gen_pool_alloc() should always succeed. Should
we do:

	if (WARN_ON_ONCE(!kern_va))
		return NULL;

to catch potential logic bugs / future concurrency / exotic configurations?
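
To make the sizing argument concrete, here is a minimal userspace sketch of the
page-count computation only. It is not the kernel code: PAGE_SHIFT = 12 and
SCX_ARENA_GROW_PAGES = 4 are assumed values picked for illustration, and
grow_page_cnt() / assert_chunk_covers() are hypothetical helper names that do
not exist in the patch. It only shows that a freshly added chunk of
@page_cnt pages is at least @size bytes, which is why the retry is expected to
succeed when nothing else allocates from the pool in between.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration; the real ones come from the kernel. */
#define PAGE_SHIFT		12
#define PAGE_SIZE		(1UL << PAGE_SHIFT)
#define SCX_ARENA_GROW_PAGES	4u

/*
 * Hypothetical helper mirroring the max_t() expression in the quoted patch:
 * grow by at least SCX_ARENA_GROW_PAGES, or by enough pages to hold @size.
 * Assumes @size is small enough that the rounded-up page count fits in u32.
 */
static uint32_t grow_page_cnt(size_t size)
{
	uint32_t need = (uint32_t)((size + PAGE_SIZE - 1) >> PAGE_SHIFT);

	return need > SCX_ARENA_GROW_PAGES ? need : SCX_ARENA_GROW_PAGES;
}

/*
 * Hypothetical check: the new chunk, taken alone, is large enough for @size.
 * This is the invariant a WARN_ON_ONCE() on the retry would be guarding.
 */
static void assert_chunk_covers(size_t size)
{
	assert(((size_t)grow_page_cnt(size) << PAGE_SHIFT) >= size);
}
```

With these assumed values, a request of up to 4 pages grows the pool by 4
pages, and a request of 4 pages plus one byte grows it by 5.
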
Thanks,
-Andrea