OpenWrt Next Generation Ideas
Daniel Golle
daniel at makrotopia.org
Fri Mar 31 08:36:26 PDT 2023
On Fri, Mar 31, 2023 at 05:35:22PM +0300, Arınç ÜNAL wrote:
> On 31.03.2023 16:47, Daniel Golle wrote:
> > On Fri, Mar 31, 2023 at 03:52:47PM +0300, Arınç ÜNAL wrote:
> > > On 31.03.2023 14:33, Daniel Golle wrote:
> > > > Hi!
> > > >
> > > > On Fri, Mar 31, 2023 at 12:44:12PM +0300, Arınç ÜNAL wrote:
> > > > > Hi all,
> > > > >
> > > > > These are the ideas I've been thinking about for the future of OpenWrt for a
> > > > > while. It looks complete enough to share it with all of you.
> > > > >
> > > > > > I'm willing to put a great deal of effort into getting as many
> > > > > > out-of-tree patches into mainline Linux as possible.
> > > > >
> > > > > > You can comment on Notion or discuss it here. I'm wondering whether the
> > > > > > ideas are feasible and how much they would benefit the people maintaining
> > > > > > OpenWrt.
> > > > >
> > > > > https://arinc9.notion.site/OpenWrt-next-gen-ideas-6db745e7584b4823950291c96f2326bb
> > > >
> > > > I will comment here, I don't have an account on Notion and it seems
> > > > to be required to be able to comment there.
> > > >
> > > > > defconfig for each device instead of config for each (sub)target.
> > > >
> > > > Given that we support thousands of devices this will not only increase
> > > > the time needed to build a release or snapshot by several orders of
> > > > magnitude, but also make debugging **much** harder. As of now, all devices
> > > > of a subtarget are using the same kernel, hence e.g. symbol offsets in a
> > > > kernel stack dump match for all of them. To reproduce or investigate
> > > > a problem it's hence enough to have similar hardware, not necessarily
> > > > the exact same board. As we are already lacking testers and maintainers
> > > > for the relatively small number of targets/subtargets, having a build for
> > > > each board would make things much worse...
> > > >
> > > > Per-device builds would also be an invitation to downstream users to
> > > > introduce device-specific (kernel-)hacks. If you want that, better go
> > > > for OpenEmbedded.
> > > >
> > > > We can modularize things more or even have more sub-targets if it's
> > > > really needed to save space.
> > > >
> > > > The disadvantages outweigh the advantages imho when it comes to having
> > > > a complete kernel build for each device.
> > >
> > > Hmm, what about we enable the bare minimum of kernel options for a target,
> > > which is already how it is, then select the rest as kernel modules (like on
> > > the makefile of a target for each device) on the defconfig for each device?
> > > So, in the end, it wouldn't be any different than selecting a kernel module
> > > package from the OpenWrt SDK which, I believe, does not change the symbol
> > > offsets in the kernel stack.
> > >
> > > My reason for pushing for the use of defconfigs is that anyone can build the
> > > Linux kernel for their device, without needing OpenWrt. So the work for
> > > adding support for a device would benefit far more people.
> >
> > This is pretty much what we are currently doing.
> > The exception is network drivers, which are built in to allow failsafe
> > mode to work and provide SSH access **before** any modules are loaded.
>
> Got it, network drivers should also be built into the kernel on the
> defconfigs then.
>
> >
> > >
> > > >
> > > > > Some Modest Virtualization Observations
> > > >
> > > > How is this related? Virtualization (with OpenWrt being the guest)
> > > > matters on x86, which is usually not that space-constrained,
> > > > and maybe armvirt. If space is a problem for older x86 boards, let's
> > > > disable guest support in x86/legacy.
> > >
> > > It was a broad example without any explanation. What caught my attention
> > > there is the configuration of the kernel that causes problems, if I
> > > understand correctly.
> > >
> > > >
> > > > > Contribute defconfigs and all the devicetrees on OpenWrt to Linux.
> > > >
> > > > For devicetrees this would of course be desirable, but also implies a
> > > > lot of work and discussions. If you are up for getting it started (i.e.
> > > > set up a tree to collect cleaned-up and ready-to-submit dts), I think it
> > > > would be worth the effort, at least for more recent targets/SoCs.
> > >
> > > I've been meaning to do this for the mt7621 SoC devices for months. The main
> > > roadblock is that some drivers are out-of-tree, like the NAND flash so it
> > > makes no sense to have them defined on the devicetree. Getting the
> > > out-of-tree patches on mainline Linux is another step so it'll happen
> > > eventually.
> >
> > Hm, I thought that Weijie had sent the mt7621-nand driver also upstream,
> > but I haven't been following the process...
>
> I haven't checked for a while either, maybe it did get in. Then there's the MAC
> address incrementing on the devicetree which doesn't exist on mainline Linux.
I think Rafal has been taking care of that lately, but I might be
wrong.
label-mac is also a downstream, OpenWrt-specific use of DTS which
hasn't made it upstream yet.
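For reference, the arithmetic behind "MAC incrementing" can be sketched in a few lines of Python. This is only an illustration of deriving one interface address from a base address plus an offset. The function name `mac_add` and the wrap-around at 48 bits are assumptions made for the example, not a description of how the downstream OpenWrt patches implement it.

```python
def mac_add(base: str, offset: int) -> str:
    """Return the colon-separated MAC address `base` plus `offset`.

    Hypothetical helper: treats the address as one 48-bit integer and
    wraps around on overflow (an assumption, chosen for simplicity).
    """
    value = int(base.replace(":", ""), 16)
    value = (value + offset) % (1 << 48)
    raw = f"{value:012x}"
    # Re-insert the colons between each pair of hex digits.
    return ":".join(raw[i:i + 2] for i in range(0, 12, 2))


# A carry out of the last byte propagates into the next one.
print(mac_add("00:11:22:33:44:ff", 1))
```

The point of expressing this in the devicetree is that only the base address has to be stored on the flash, and the other interfaces are derived from it.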
>
> >
> > >
> > > I'll get this started with my Linux fork on GitHub.
> >
> > Very nice, I will join in there.
>
> I'm leaving the link here for future reference.
>
> https://github.com/arinc9/linux
Thank you!
>
> >
> > >
> > > >
> > > > Regarding defconfigs I don't think we need an individual defconfig for
> > > > each board. The problem here is also that OpenWrt currently has a layered
> > > > approach (generic->target->subtarget) while Linux itself has
> > > > a flat approach, and using that would result in a lot of duplication,
> > > > which would in turn make keeping all those defconfigs up-to-date quite
> > > > a lot of work...
> > >
> > > Not really, once you make a defconfig for a device, the only case for it to
> > > change in the future is if a kernel option was renamed, which is very rare.
> > > Anyway, even in that case, there are known ways to update them in bulk.
> > >
> > > https://lore.kernel.org/kernel-janitors/20220929090645.1389-1-lukas.bulwahn@gmail.com/
> >
> > I agree for "established" platforms for which drivers are already
> > upstream and not many changes are still expected to happen.
> > Yet, a defconfig **for each board** seems like overkill to me, esp.
> > because 90%+ will not be able to boot the resulting image generated
> > by a vanilla kernel build in any meaningful way.
> > Having a defconfig for each sub-target would be agreeable, though I must
> > say that I do like our layered approach to defconfig.
> > Maybe we should pick a few representative **and resourceful** boards
> > and try upstreaming defconfigs for those (in case it didn't already happen).
> > However, not using the generic defconfig is only relevant for
> > rather resource-constrained boards anyway (as otherwise you'd just use, let's
> > say, multi_v7_defconfig or the like). Are you sure upstream is interested in
> > being flooded with hundreds of defconfigs, most of them for very similar
> > devices?
>
> I'm not sure, there's Thomas who maintains MIPS. There's a lot of ARM
> maintainers. I'll have to talk it through with them.
>
> The main thing that pushes me towards a defconfig per device is that
> every device needs a slightly different set of kernel modules.
>
> It's basically taking the kernel modules from here and putting them into the
> defconfig for the device.
>
> https://github.com/openwrt/openwrt/blob/master/target/linux/ramips/image/mt7621.mk#L137
Taking MT7621 as the example I believe that the main differences are
mostly the selection of wifi drivers, USB support and occasionally AHCI.
Do you really think it makes sense to provide 200+ defconfigs just to
cover the (limited) number of combinations?
Wouldn't a handful or even a single defconfig be sufficient?
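To put a number on "limited number of combinations", one could group the boards by the set of kernel module packages they select and count the distinct sets. Below is a minimal Python sketch of that idea. The board names and the device-to-package mapping in `DEVICES` are made up for illustration and are not taken from mt7621.mk.

```python
from collections import defaultdict

# Hypothetical device -> kernel module package lists, for illustration only.
DEVICES = {
    "board-a": ["kmod-mt7603", "kmod-mt76x2", "kmod-usb3"],
    "board-b": ["kmod-usb3", "kmod-mt76x2", "kmod-mt7603"],
    "board-c": ["kmod-mt7615-firmware", "kmod-usb3"],
    "board-d": ["kmod-mt7603", "kmod-mt76x2", "kmod-usb3", "kmod-ata-ahci"],
}


def group_by_modules(devices):
    """Map each distinct set of modules to the boards that use exactly it."""
    groups = defaultdict(list)
    for name, modules in devices.items():
        # frozenset ignores ordering, so board-a and board-b end up together.
        groups[frozenset(modules)].append(name)
    return groups


groups = group_by_modules(DEVICES)
for modules, boards in groups.items():
    print(sorted(modules), "->", sorted(boards))
print(len(DEVICES), "devices,", len(groups), "distinct module sets")
```

Run against the real package lists, the second number would be the upper bound on how many defconfigs are needed at all.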
>
> >
> > >
> > > >
> > > > > Either submit all out-of-tree patches on OpenWrt to Linux or get rid
> > > > > of them and find a better solution for what the unacceptable patch
> > > > > does.
> > > >
> > > > This would of course be great, but especially for legacy devices it may
> > > > not be possible in many cases. Think of all the devices stuck on
> > > > swconfig, just to name one example... Think of all the completely
> > > > broken vendor bootloaders which require hacks (mangling kernel cmdline
> > > > and such) and cannot easily be replaced...
> > >
> > > Those can stay until eventually the support for them will be dropped on
> > > newer OpenWrt versions. I believe there are a lot of out-of-tree patches
> > > that are not for old devices and can be on mainline Linux.
> >
> > In terms of hardware support you are probably right.
> > When it comes to vendor bootloaders which require mangling kernel
> > bootargs the kernel folks have always argued that users should just
> > change the bootloader environment -- however, this requires a serial
> > console attached and makes reverting to stock firmware harder.
>
> Makes sense. It makes sense to maintain this part on OpenWrt.
>
> >
> > Another example is selection of the rootfs. Kernel folks argue that
> > we should use an initramfs for that, however, we try to avoid the
> > overhead of using an initramfs with its own userland just for that.
> >
> > To resolve this situation at least for future boards (ie. with
> > post-2020 versions of U-Boot) I've put quite some effort into adding
> > native support for mapping the filesystem sub-image(s) of a uImage.FIT
> > image as a block device in Linux:
> >
> > https://github.com/dangowrt/linux/commit/8a70f52fddd518d1d7093a12874ca3a5de86fab1
> >
> > I've already had quite a lot of debate about that, currently I'm waiting
> > for add/remove/size-change notifications for in-kernel block device users
> > being added to the kernel, as that would be required for a really clean
> > implementation.
>
> This is very nice. Please CC me when there's anything new. This was one of
> the things I was planning to figure out. I think the current way, even with
> a FIT image, is to put the location of the filesystem device on the bootargs
> so the kernel can find it once the SPI/NAND driver probes the flash?
Yes, the current way is very platform-specific and storage-type specific.
With recent devboards supporting several boot devices on the same board
I felt the need to have one unified method which can work on all of them,
hence I came up with the uImage.FIT partition parser. The partition parser,
however, has been rejected upstream and I was told to instead implement
this as a tiny block driver.
At least for devices with more or less recent U-Boot this is imho a
very good option.
>
> >
> > Most other patches in target/linux/generic/pending-5.* should be suitable
> > for upstream inclusion. Patches in target/linux/generic/hack-5.* are more
> > tragic cases...
> >
> > Apart from the technical debate the issue here is of course also of
> > organizational nature. While not accepting new patches unless they are
> > already accepted upstream is a matter of policy, upstreaming all the
> > existing patches is a lot of work, and the question arises *who* is
> > going to volunteer to do that...
>
> This is what I'm planning to spend my time on for the next couple of years.
That's very good to hear! Let me know if you have any questions
regarding existing patches, for most I should be able to provide you
with some background information.
>
> >
> > >
> > > >
> > > > > Bugfix backporting should happen only after it's accepted to Linux.
> > > > > The patch must be identical to the commit on Linux.
> > > >
> > > > The wording here might be a bit too strict to support our existing
> > > > mess, but I generally agree. So I'd say 'should' instead of 'must', but
> > > > otherwise agree.
> > > >
> > > > > Feature backporting should be done only if it's thoroughly tested.
> > > >
> > > > ... and testing often happens in the OpenWrt tree. So it's a bit of
> > > > a chicken-egg problem, as often developers don't even have all the
> > > > different hardware needed for testing. But generally I agree.
> > > > A way to ease testing *before* pushing to openwrt.git or posting to
> > > > upstream mailing lists would be to have snapshot builds also for
> > > > developers' staging trees.
> > > >
> > > > > Kernel Solution
> > > > > Make a mode menu.
> > > > > Filesystem only.
> > > >
> > > > So which kernel headers should be used to build e.g. libc and netlink
> > > > users?
> > > > In a way it is also currently possible to build generic images for
> > > > most architectures using the armvirt, malta and x86 targets. Of course,
> > > > also in this case a kernel is being built.
> > > >
> > > > > Make a kernel selection menu where the user can choose to feed the
> > > > > kernel directory of their own or use the longterm one defined on the
> > > > > OpenWrt SDK. Add this as a suboption to the full image mode.
> > > >
> > > > What about CONFIG_EXTERNAL_KERNEL_TREE and friends...?
> > >
> > > I've never tried this but looks like it may need a bit of extra options.
> >
> > No, I'm using this on a daily basis to use linux-next and friends together
> > with OpenWrt userland. For some devices you may need to add some
> > downstream patches, but e.g. recent MediaTek stuff works out-of-the-box.
>
> I couldn't get netifd to work properly on 6.2 when I was just booting the
> OpenWrt filesystem as initramfs. I'll try this option though, thanks.
>
> >
> > > No patches must be applied, kernel module packages must not be compiled,
> > > etc.
> >
> > Kernel module packages work fine, just the version will be wrong in case
> > of an external kernel tree being used.
>
> If I remember correctly, with 5.17, some netfilter kernel option was changed
> which caused the compilation of a kernel module package to fail.
Ah, yes, that's true...
diff --git a/package/kernel/linux/modules/netfilter.mk b/package/kernel/linux/modules/netfilter.mk
index 1772545f25..6cb098527b 100644
--- a/package/kernel/linux/modules/netfilter.mk
+++ b/package/kernel/linux/modules/netfilter.mk
@@ -1171,8 +1171,6 @@ define KernelPackage/nft-offload
CONFIG_NFT_FLOW_OFFLOAD
FILES:= \
$(LINUX_DIR)/net/netfilter/nf_flow_table_inet.ko \
- $(LINUX_DIR)/net/ipv4/netfilter/nf_flow_table_ipv4.ko \
- $(LINUX_DIR)/net/ipv6/netfilter/nf_flow_table_ipv6.ko \
$(LINUX_DIR)/net/netfilter/nft_flow_offload.ko
AUTOLOAD:=$(call AutoProbe,nf_flow_table_inet nf_flow_table_ipv4 nf_flow_table_ipv6 nft_flow_offload)
endef
>
> Having the option to disable them altogether would address this type of edge
> case.
Yep, agreed, such an option would be useful.