From j.granados at samsung.com Wed May 1 02:16:14 2024 From: j.granados at samsung.com (Joel Granados) Date: Wed, 1 May 2024 11:16:14 +0200 Subject: [PATCH v4 1/8] net: Remove the now superfluous sentinel elements from ctl_table array In-Reply-To: <20240429082219.3qev2nzftzp2gecc@joelS2.panther.com> References: <20240425-jag-sysctl_remset_net-v4-0-9e82f985777d@samsung.com> <20240425-jag-sysctl_remset_net-v4-1-9e82f985777d@samsung.com> <20240425155804.66f3bed5@kernel.org> <20240426065931.wyrzevlheburnf47@joelS2.panther.com> <20240426071944.206e9cff@kernel.org> <20240429082219.3qev2nzftzp2gecc@joelS2.panther.com> Message-ID: <20240501091614.7r6wded6quegxr6z@joelS2.panther.com> On Mon, Apr 29, 2024 at 10:22:19AM +0200, Joel Granados wrote: > On Fri, Apr 26, 2024 at 07:19:44AM -0700, Jakub Kicinski wrote: > > On Fri, 26 Apr 2024 08:59:31 +0200 Joel Granados wrote: > > > Sorry about this. I pulled the trigger way too early. This is already > > > fixed in my v4. > > > > | ^~~~~~~~~~ > > > > -- > > > > netdev FAQ tl;dr: > > > > - designate your patch to a tree - [PATCH net] or [PATCH net-next] > I'll add "net" to my V6. I wont change the my base commit which is > v6.9-rc1. Actually, I added net-next and rebased onto net-next/main. Please advise if I need to rebase to some other branch in pub/scm/linux/kernel/git/netdev/net-next Best -- Joel Granados
From devnull+j.granados.samsung.com at kernel.org Wed May 1 02:29:26 2024 From: devnull+j.granados.samsung.com at kernel.org (Joel Granados via B4 Relay) Date: Wed, 01 May 2024 11:29:26 +0200 Subject: [PATCH net-next v6 2/8] net: ipv{6,4}: Remove the now superfluous sentinel elements from ctl_table array In-Reply-To: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> References: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> Message-ID: <20240501-jag-sysctl_remset_net-v6-2-370b702b6b4a@samsung.com> From: Joel Granados This commit comes at the tail end of a greater effort to remove the empty elements at the end of the ctl_table arrays (sentinels) which will reduce the overall build time size of the kernel and run time memory bloat by ~64 bytes per sentinel (further information Link : https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/) * Remove sentinel element from ctl_table structs. * Remove the zeroing out of an array element (to make it look like a sentinel) in sysctl_route_net_init and ipv6_route_sysctl_init. This is no longer needed and is safe after commit c899710fe7f9 ("networking: Update to register_net_sysctl_sz") added the array size to the ctl_table registration. * Remove extra sentinel element in the declaration of devinet_vars.
* Removed the "-1" in __devinet_sysctl_register, sysctl_route_net_init, ipv6_sysctl_net_init and ipv4_sysctl_init_net that adjusted for having an extra empty element when looping over ctl_table arrays * Replace the for loop stop condition in __addrconf_sysctl_register that tests for procname == NULL with one that depends on array size * Removing the unprivileged user check in ipv6_route_sysctl_init is safe as it is replaced by calling ipv6_route_sysctl_table_size; introduced in commit c899710fe7f9 ("networking: Update to register_net_sysctl_sz") * Use a table_size variable to keep the value of ARRAY_SIZE Signed-off-by: Joel Granados --- net/ipv4/devinet.c | 5 ++--- net/ipv4/ip_fragment.c | 2 -- net/ipv4/route.c | 8 ++------ net/ipv4/sysctl_net_ipv4.c | 7 +++---- net/ipv4/xfrm4_policy.c | 1 - net/ipv6/addrconf.c | 8 +++----- net/ipv6/icmp.c | 1 - net/ipv6/reassembly.c | 2 -- net/ipv6/route.c | 5 ----- net/ipv6/sysctl_net_ipv6.c | 8 +++----- net/ipv6/xfrm6_policy.c | 1 - 11 files changed, 13 insertions(+), 35 deletions(-) diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c index 364dbf0cd9bf..a612c57b61c5 100644 --- a/net/ipv4/devinet.c +++ b/net/ipv4/devinet.c @@ -2520,7 +2520,7 @@ static int ipv4_doint_and_flush(struct ctl_table *ctl, int write, static struct devinet_sysctl_table { struct ctl_table_header *sysctl_header; - struct ctl_table devinet_vars[__IPV4_DEVCONF_MAX]; + struct ctl_table devinet_vars[IPV4_DEVCONF_MAX]; } devinet_sysctl = { .devinet_vars = { DEVINET_SYSCTL_COMPLEX_ENTRY(FORWARDING, "forwarding", @@ -2583,7 +2583,7 @@ static int __devinet_sysctl_register(struct net *net, char *dev_name, if (!t) goto out; - for (i = 0; i < ARRAY_SIZE(t->devinet_vars) - 1; i++) { + for (i = 0; i < ARRAY_SIZE(t->devinet_vars); i++) { t->devinet_vars[i].data += (char *)p - (char *)&ipv4_devconf; t->devinet_vars[i].extra1 = p; t->devinet_vars[i].extra2 = net; @@ -2657,7 +2657,6 @@ static struct ctl_table ctl_forward_entry[] = { .extra1 = &ipv4_devconf, .extra2 = 
&init_net, }, - { }, }; #endif diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c index 534b98a0744a..08e2c92e25ab 100644 --- a/net/ipv4/ip_fragment.c +++ b/net/ipv4/ip_fragment.c @@ -580,7 +580,6 @@ static struct ctl_table ip4_frags_ns_ctl_table[] = { .proc_handler = proc_dointvec_minmax, .extra1 = &dist_min, }, - { } }; /* secret interval has been deprecated */ @@ -593,7 +592,6 @@ static struct ctl_table ip4_frags_ctl_table[] = { .mode = 0644, .proc_handler = proc_dointvec_jiffies, }, - { } }; static int __net_init ip4_frags_ns_ctl_register(struct net *net) diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 0fd9a3d7ac4a..5fd54103174f 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -3496,7 +3496,6 @@ static struct ctl_table ipv4_route_table[] = { .mode = 0644, .proc_handler = proc_dointvec, }, - { } }; static const char ipv4_route_flush_procname[] = "flush"; @@ -3530,7 +3529,6 @@ static struct ctl_table ipv4_route_netns_table[] = { .mode = 0644, .proc_handler = proc_dointvec, }, - { }, }; static __net_init int sysctl_route_net_init(struct net *net) @@ -3548,16 +3546,14 @@ static __net_init int sysctl_route_net_init(struct net *net) /* Don't export non-whitelisted sysctls to unprivileged users */ if (net->user_ns != &init_user_ns) { - if (tbl[0].procname != ipv4_route_flush_procname) { - tbl[0].procname = NULL; + if (tbl[0].procname != ipv4_route_flush_procname) table_size = 0; - } } /* Update the variables to point into the current struct net * except for the first element flush */ - for (i = 1; i < ARRAY_SIZE(ipv4_route_netns_table) - 1; i++) + for (i = 1; i < table_size; i++) tbl[i].data += (void *)net - (void *)&init_net; } tbl[0].extra1 = net; diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c index ce5d19978a26..162a0a3b6ba5 100644 --- a/net/ipv4/sysctl_net_ipv4.c +++ b/net/ipv4/sysctl_net_ipv4.c @@ -575,7 +575,6 @@ static struct ctl_table ipv4_table[] = { .extra1 = &sysctl_fib_sync_mem_min, .extra2 = 
&sysctl_fib_sync_mem_max, }, - { } }; static struct ctl_table ipv4_net_table[] = { @@ -1502,11 +1501,11 @@ static struct ctl_table ipv4_net_table[] = { .proc_handler = proc_dou8vec_minmax, .extra1 = SYSCTL_ONE, }, - { } }; static __net_init int ipv4_sysctl_init_net(struct net *net) { + size_t table_size = ARRAY_SIZE(ipv4_net_table); struct ctl_table *table; table = ipv4_net_table; @@ -1517,7 +1516,7 @@ static __net_init int ipv4_sysctl_init_net(struct net *net) if (!table) goto err_alloc; - for (i = 0; i < ARRAY_SIZE(ipv4_net_table) - 1; i++) { + for (i = 0; i < table_size; i++) { if (table[i].data) { /* Update the variables to point into * the current struct net @@ -1533,7 +1532,7 @@ static __net_init int ipv4_sysctl_init_net(struct net *net) } net->ipv4.ipv4_hdr = register_net_sysctl_sz(net, "net/ipv4", table, - ARRAY_SIZE(ipv4_net_table)); + table_size); if (!net->ipv4.ipv4_hdr) goto err_reg; diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c index fccbbd3e1a4b..0294fef577fa 100644 --- a/net/ipv4/xfrm4_policy.c +++ b/net/ipv4/xfrm4_policy.c @@ -152,7 +152,6 @@ static struct ctl_table xfrm4_policy_table[] = { .mode = 0644, .proc_handler = proc_dointvec, }, - { } }; static __net_init int xfrm4_net_sysctl_init(struct net *net) diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 9aa0900abfa1..5c424a0e7232 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -7184,14 +7184,12 @@ static const struct ctl_table addrconf_sysctl[] = { .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_TWO, }, - { - /* sentinel */ - } }; static int __addrconf_sysctl_register(struct net *net, char *dev_name, struct inet6_dev *idev, struct ipv6_devconf *p) { + size_t table_size = ARRAY_SIZE(addrconf_sysctl); int i, ifindex; struct ctl_table *table; char path[sizeof("net/ipv6/conf/") + IFNAMSIZ]; @@ -7200,7 +7198,7 @@ static int __addrconf_sysctl_register(struct net *net, char *dev_name, if (!table) goto out; - for (i = 0; table[i].data; i++) { + for (i = 0; i < table_size; 
i++) { table[i].data += (char *)p - (char *)&ipv6_devconf; /* If one of these is already set, then it is not safe to * overwrite either of them: this makes proc_dointvec_minmax @@ -7215,7 +7213,7 @@ static int __addrconf_sysctl_register(struct net *net, char *dev_name, snprintf(path, sizeof(path), "net/ipv6/conf/%s", dev_name); p->sysctl_header = register_net_sysctl_sz(net, path, table, - ARRAY_SIZE(addrconf_sysctl)); + table_size); if (!p->sysctl_header) goto free; diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c index d285c1f6f1a6..7b31674644ef 100644 --- a/net/ipv6/icmp.c +++ b/net/ipv6/icmp.c @@ -1206,7 +1206,6 @@ static struct ctl_table ipv6_icmp_table_template[] = { .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_ONE, }, - { }, }; struct ctl_table * __net_init ipv6_icmp_sysctl_init(struct net *net) diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c index ee95cdcc8747..439f93512b0a 100644 --- a/net/ipv6/reassembly.c +++ b/net/ipv6/reassembly.c @@ -436,7 +436,6 @@ static struct ctl_table ip6_frags_ns_ctl_table[] = { .mode = 0644, .proc_handler = proc_dointvec_jiffies, }, - { } }; /* secret interval has been deprecated */ @@ -449,7 +448,6 @@ static struct ctl_table ip6_frags_ctl_table[] = { .mode = 0644, .proc_handler = proc_dointvec_jiffies, }, - { } }; static int __net_init ip6_frags_ns_sysctl_register(struct net *net) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 3e0b2cb20fd2..c43b0616742e 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -6428,7 +6428,6 @@ static struct ctl_table ipv6_route_table_template[] = { .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_ONE, }, - { } }; struct ctl_table * __net_init ipv6_route_sysctl_init(struct net *net) @@ -6452,10 +6451,6 @@ struct ctl_table * __net_init ipv6_route_sysctl_init(struct net *net) table[8].data = &net->ipv6.sysctl.ip6_rt_min_advmss; table[9].data = &net->ipv6.sysctl.ip6_rt_gc_min_interval; table[10].data = &net->ipv6.sysctl.skip_notify_on_dev_down; - - /* Don't export sysctls to unprivileged users 
*/ - if (net->user_ns != &init_user_ns) - table[1].procname = NULL; } return table; diff --git a/net/ipv6/sysctl_net_ipv6.c b/net/ipv6/sysctl_net_ipv6.c index 75de55f907b0..c060285ff47f 100644 --- a/net/ipv6/sysctl_net_ipv6.c +++ b/net/ipv6/sysctl_net_ipv6.c @@ -213,7 +213,6 @@ static struct ctl_table ipv6_table_template[] = { .proc_handler = proc_doulongvec_minmax, .extra2 = &ioam6_id_wide_max, }, - { } }; static struct ctl_table ipv6_rotable[] = { @@ -248,11 +247,11 @@ static struct ctl_table ipv6_rotable[] = { .proc_handler = proc_dointvec, }, #endif /* CONFIG_NETLABEL */ - { } }; static int __net_init ipv6_sysctl_net_init(struct net *net) { + size_t table_size = ARRAY_SIZE(ipv6_table_template); struct ctl_table *ipv6_table; struct ctl_table *ipv6_route_table; struct ctl_table *ipv6_icmp_table; @@ -264,7 +263,7 @@ static int __net_init ipv6_sysctl_net_init(struct net *net) if (!ipv6_table) goto out; /* Update the variables to point into the current struct net */ - for (i = 0; i < ARRAY_SIZE(ipv6_table_template) - 1; i++) + for (i = 0; i < table_size; i++) ipv6_table[i].data += (void *)net - (void *)&init_net; ipv6_route_table = ipv6_route_sysctl_init(net); @@ -276,8 +275,7 @@ static int __net_init ipv6_sysctl_net_init(struct net *net) goto out_ipv6_route_table; net->ipv6.sysctl.hdr = register_net_sysctl_sz(net, "net/ipv6", - ipv6_table, - ARRAY_SIZE(ipv6_table_template)); + ipv6_table, table_size); if (!net->ipv6.sysctl.hdr) goto out_ipv6_icmp_table; diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c index 7924e08ee142..cc885d3aa9e5 100644 --- a/net/ipv6/xfrm6_policy.c +++ b/net/ipv6/xfrm6_policy.c @@ -184,7 +184,6 @@ static struct ctl_table xfrm6_policy_table[] = { .mode = 0644, .proc_handler = proc_dointvec, }, - { } }; static int __net_init xfrm6_net_sysctl_init(struct net *net) -- 2.43.0 From devnull+j.granados.samsung.com at kernel.org Wed May 1 02:29:25 2024 From: devnull+j.granados.samsung.com at kernel.org (Joel Granados via B4 Relay) Date: 
Wed, 01 May 2024 11:29:25 +0200 Subject: [PATCH net-next v6 1/8] net: Remove the now superfluous sentinel elements from ctl_table array In-Reply-To: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> References: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> Message-ID: <20240501-jag-sysctl_remset_net-v6-1-370b702b6b4a@samsung.com> From: Joel Granados This commit comes at the tail end of a greater effort to remove the empty elements at the end of the ctl_table arrays (sentinels) which will reduce the overall build time size of the kernel and run time memory bloat by ~64 bytes per sentinel (further information Link : https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/) * Remove sentinel element from ctl_table structs. * Remove the zeroing out of an array element (to make it look like a sentinel) in neigh_sysctl_register and lowpan_frags_ns_sysctl_register. This is no longer needed and is safe after commit c899710fe7f9 ("networking: Update to register_net_sysctl_sz") added the array size to the ctl_table registration.
* Replace the for loop stop condition in sysctl_core_net_init that tests for procname == NULL with one that depends on array size * Removed the "-1" in mpls_net_init that adjusted for having an extra empty element when looping over ctl_table arrays * Use a table_size variable to keep the value of ARRAY_SIZE Signed-off-by: Joel Granados --- net/core/neighbour.c | 5 +---- net/core/sysctl_net_core.c | 13 ++++++------- net/dccp/sysctl.c | 2 -- net/ieee802154/6lowpan/reassembly.c | 6 +----- net/mpls/af_mpls.c | 13 ++++++------- net/unix/sysctl_net_unix.c | 1 - 6 files changed, 14 insertions(+), 26 deletions(-) diff --git a/net/core/neighbour.c b/net/core/neighbour.c index af270c202d9a..45fd88405b6b 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -3733,7 +3733,7 @@ static int neigh_proc_base_reachable_time(struct ctl_table *ctl, int write, static struct neigh_sysctl_table { struct ctl_table_header *sysctl_header; - struct ctl_table neigh_vars[NEIGH_VAR_MAX + 1]; + struct ctl_table neigh_vars[NEIGH_VAR_MAX]; } neigh_sysctl_template __read_mostly = { .neigh_vars = { NEIGH_SYSCTL_ZERO_INTMAX_ENTRY(MCAST_PROBES, "mcast_solicit"), @@ -3784,7 +3784,6 @@ static struct neigh_sysctl_table { .extra2 = SYSCTL_INT_MAX, .proc_handler = proc_dointvec_minmax, }, - {}, }, }; @@ -3812,8 +3811,6 @@ int neigh_sysctl_register(struct net_device *dev, struct neigh_parms *p, if (dev) { dev_name_source = dev->name; /* Terminate the table early */ - memset(&t->neigh_vars[NEIGH_VAR_GC_INTERVAL], 0, - sizeof(t->neigh_vars[NEIGH_VAR_GC_INTERVAL])); neigh_vars_size = NEIGH_VAR_BASE_REACHABLE_TIME_MS + 1; } else { struct neigh_table *tbl = p->tbl; diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c index 6da5995ac86a..c9fb9ad87485 100644 --- a/net/core/sysctl_net_core.c +++ b/net/core/sysctl_net_core.c @@ -661,7 +661,6 @@ static struct ctl_table net_core_table[] = { .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO, }, - { } }; static struct ctl_table 
netns_core_table[] = { @@ -698,7 +697,6 @@ static struct ctl_table netns_core_table[] = { .extra2 = SYSCTL_ONE, .proc_handler = proc_dou8vec_minmax, }, - { } }; static int __init fb_tunnels_only_for_init_net_sysctl_setup(char *str) @@ -716,20 +714,21 @@ __setup("fb_tunnels=", fb_tunnels_only_for_init_net_sysctl_setup); static __net_init int sysctl_core_net_init(struct net *net) { - struct ctl_table *tbl, *tmp; + size_t table_size = ARRAY_SIZE(netns_core_table); + struct ctl_table *tbl; tbl = netns_core_table; if (!net_eq(net, &init_net)) { + int i; tbl = kmemdup(tbl, sizeof(netns_core_table), GFP_KERNEL); if (tbl == NULL) goto err_dup; - for (tmp = tbl; tmp->procname; tmp++) - tmp->data += (char *)net - (char *)&init_net; + for (i = 0; i < table_size; ++i) + tbl[i].data += (char *)net - (char *)&init_net; } - net->core.sysctl_hdr = register_net_sysctl_sz(net, "net/core", tbl, - ARRAY_SIZE(netns_core_table)); + net->core.sysctl_hdr = register_net_sysctl_sz(net, "net/core", tbl, table_size); if (net->core.sysctl_hdr == NULL) goto err_reg; diff --git a/net/dccp/sysctl.c b/net/dccp/sysctl.c index ee8d4f5afa72..3fc474d6e57d 100644 --- a/net/dccp/sysctl.c +++ b/net/dccp/sysctl.c @@ -90,8 +90,6 @@ static struct ctl_table dccp_default_table[] = { .mode = 0644, .proc_handler = proc_dointvec_ms_jiffies, }, - - { } }; static struct ctl_table_header *dccp_table_header; diff --git a/net/ieee802154/6lowpan/reassembly.c b/net/ieee802154/6lowpan/reassembly.c index 2a983cf450da..56ef873828f4 100644 --- a/net/ieee802154/6lowpan/reassembly.c +++ b/net/ieee802154/6lowpan/reassembly.c @@ -338,7 +338,6 @@ static struct ctl_table lowpan_frags_ns_ctl_table[] = { .mode = 0644, .proc_handler = proc_dointvec_jiffies, }, - { } }; /* secret interval has been deprecated */ @@ -351,7 +350,6 @@ static struct ctl_table lowpan_frags_ctl_table[] = { .mode = 0644, .proc_handler = proc_dointvec_jiffies, }, - { } }; static int __net_init lowpan_frags_ns_sysctl_register(struct net *net) @@ -370,10 
+368,8 @@ static int __net_init lowpan_frags_ns_sysctl_register(struct net *net) goto err_alloc; /* Don't export sysctls to unprivileged users */ - if (net->user_ns != &init_user_ns) { - table[0].procname = NULL; + if (net->user_ns != &init_user_ns) table_size = 0; - } } table[0].data = &ieee802154_lowpan->fqdir->high_thresh; diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c index 5d2012d1cf4a..2dc7a908a6bb 100644 --- a/net/mpls/af_mpls.c +++ b/net/mpls/af_mpls.c @@ -1377,13 +1377,13 @@ static const struct ctl_table mpls_dev_table[] = { .proc_handler = mpls_conf_proc, .data = MPLS_PERDEV_SYSCTL_OFFSET(input_enabled), }, - { } }; static int mpls_dev_sysctl_register(struct net_device *dev, struct mpls_dev *mdev) { char path[sizeof("net/mpls/conf/") + IFNAMSIZ]; + size_t table_size = ARRAY_SIZE(mpls_dev_table); struct net *net = dev_net(dev); struct ctl_table *table; int i; @@ -1395,7 +1395,7 @@ static int mpls_dev_sysctl_register(struct net_device *dev, /* Table data contains only offsets relative to the base of * the mdev at this point, so make them absolute. 
*/ - for (i = 0; i < ARRAY_SIZE(mpls_dev_table); i++) { + for (i = 0; i < table_size; i++) { table[i].data = (char *)mdev + (uintptr_t)table[i].data; table[i].extra1 = mdev; table[i].extra2 = net; @@ -1403,8 +1403,7 @@ static int mpls_dev_sysctl_register(struct net_device *dev, snprintf(path, sizeof(path), "net/mpls/conf/%s", dev->name); - mdev->sysctl = register_net_sysctl_sz(net, path, table, - ARRAY_SIZE(mpls_dev_table)); + mdev->sysctl = register_net_sysctl_sz(net, path, table, table_size); if (!mdev->sysctl) goto free; @@ -2653,11 +2652,11 @@ static const struct ctl_table mpls_table[] = { .extra1 = SYSCTL_ONE, .extra2 = &ttl_max, }, - { } }; static int mpls_net_init(struct net *net) { + size_t table_size = ARRAY_SIZE(mpls_table); struct ctl_table *table; int i; @@ -2673,11 +2672,11 @@ static int mpls_net_init(struct net *net) /* Table data contains only offsets relative to the base of * the mdev at this point, so make them absolute. */ - for (i = 0; i < ARRAY_SIZE(mpls_table) - 1; i++) + for (i = 0; i < table_size; i++) table[i].data = (char *)net + (uintptr_t)table[i].data; net->mpls.ctl = register_net_sysctl_sz(net, "net/mpls", table, - ARRAY_SIZE(mpls_table)); + table_size); if (net->mpls.ctl == NULL) { kfree(table); return -ENOMEM; diff --git a/net/unix/sysctl_net_unix.c b/net/unix/sysctl_net_unix.c index 44996af61999..357b3e5f3847 100644 --- a/net/unix/sysctl_net_unix.c +++ b/net/unix/sysctl_net_unix.c @@ -19,7 +19,6 @@ static struct ctl_table unix_table[] = { .mode = 0644, .proc_handler = proc_dointvec }, - { } }; int __net_init unix_sysctl_register(struct net *net) -- 2.43.0 From devnull+j.granados.samsung.com at kernel.org Wed May 1 02:29:30 2024 From: devnull+j.granados.samsung.com at kernel.org (Joel Granados via B4 Relay) Date: Wed, 01 May 2024 11:29:30 +0200 Subject: [PATCH net-next v6 6/8] netfilter: Remove the now superfluous sentinel elements from ctl_table array In-Reply-To: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> 
References: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> Message-ID: <20240501-jag-sysctl_remset_net-v6-6-370b702b6b4a@samsung.com> From: Joel Granados This commit comes at the tail end of a greater effort to remove the empty elements at the end of the ctl_table arrays (sentinels) which will reduce the overall build time size of the kernel and run time memory bloat by ~64 bytes per sentinel (further information Link : https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/) * Remove sentinel elements from ctl_table structs * Remove instances where an array element is zeroed out to make it look like a sentinel. This is no longer needed and is safe after commit c899710fe7f9 ("networking: Update to register_net_sysctl_sz") added the array size to the ctl_table registration * Remove the need for having __NF_SYSCTL_CT_LAST_SYSCTL as the sysctl array size is now in NF_SYSCTL_CT_LAST_SYSCTL * Remove extra element in ctl_table array declarations Acked-by: Kees Cook # loadpin & yama Signed-off-by: Joel Granados --- net/bridge/br_netfilter_hooks.c | 1 - net/ipv6/netfilter/nf_conntrack_reasm.c | 1 - net/netfilter/ipvs/ip_vs_ctl.c | 5 +---- net/netfilter/ipvs/ip_vs_lblc.c | 5 +---- net/netfilter/ipvs/ip_vs_lblcr.c | 5 +---- net/netfilter/nf_conntrack_standalone.c | 6 +----- net/netfilter/nf_log.c | 3 +-- 7 files changed, 5 insertions(+), 21 deletions(-) diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c index 7948a9e7542c..bf30c50b5689 100644 --- a/net/bridge/br_netfilter_hooks.c +++ b/net/bridge/br_netfilter_hooks.c @@ -1226,7 +1226,6 @@ static struct ctl_table brnf_table[] = { .mode = 0644, .proc_handler = brnf_sysctl_call_tables, }, - { } }; static inline void br_netfilter_sysctl_default(struct brnf_net *brnf) diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c index ce8c14d8aff5..5e1b50c6a44d 100644 --- a/net/ipv6/netfilter/nf_conntrack_reasm.c +++
b/net/ipv6/netfilter/nf_conntrack_reasm.c @@ -62,7 +62,6 @@ static struct ctl_table nf_ct_frag6_sysctl_table[] = { .mode = 0644, .proc_handler = proc_doulongvec_minmax, }, - { } }; static int nf_ct_frag6_sysctl_register(struct net *net) diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c index 143a341bbc0a..50b5dbe40eb8 100644 --- a/net/netfilter/ipvs/ip_vs_ctl.c +++ b/net/netfilter/ipvs/ip_vs_ctl.c @@ -2263,7 +2263,6 @@ static struct ctl_table vs_vars[] = { .proc_handler = proc_dointvec, }, #endif - { } }; #endif @@ -4286,10 +4285,8 @@ static int __net_init ip_vs_control_net_init_sysctl(struct netns_ipvs *ipvs) return -ENOMEM; /* Don't export sysctls to unprivileged users */ - if (net->user_ns != &init_user_ns) { - tbl[0].procname = NULL; + if (net->user_ns != &init_user_ns) ctl_table_size = 0; - } } else tbl = vs_vars; /* Initialize sysctl defaults */ diff --git a/net/netfilter/ipvs/ip_vs_lblc.c b/net/netfilter/ipvs/ip_vs_lblc.c index 8ceec7a2fa8f..2423513d701d 100644 --- a/net/netfilter/ipvs/ip_vs_lblc.c +++ b/net/netfilter/ipvs/ip_vs_lblc.c @@ -123,7 +123,6 @@ static struct ctl_table vs_vars_table[] = { .mode = 0644, .proc_handler = proc_dointvec_jiffies, }, - { } }; #endif @@ -563,10 +562,8 @@ static int __net_init __ip_vs_lblc_init(struct net *net) return -ENOMEM; /* Don't export sysctls to unprivileged users */ - if (net->user_ns != &init_user_ns) { - ipvs->lblc_ctl_table[0].procname = NULL; + if (net->user_ns != &init_user_ns) vars_table_size = 0; - } } else ipvs->lblc_ctl_table = vs_vars_table; diff --git a/net/netfilter/ipvs/ip_vs_lblcr.c b/net/netfilter/ipvs/ip_vs_lblcr.c index 0fb64707213f..cdb1d4bf6761 100644 --- a/net/netfilter/ipvs/ip_vs_lblcr.c +++ b/net/netfilter/ipvs/ip_vs_lblcr.c @@ -294,7 +294,6 @@ static struct ctl_table vs_vars_table[] = { .mode = 0644, .proc_handler = proc_dointvec_jiffies, }, - { } }; #endif @@ -749,10 +748,8 @@ static int __net_init __ip_vs_lblcr_init(struct net *net) return -ENOMEM; /* Don't export 
sysctls to unprivileged users */ - if (net->user_ns != &init_user_ns) { - ipvs->lblcr_ctl_table[0].procname = NULL; + if (net->user_ns != &init_user_ns) vars_table_size = 0; - } } else ipvs->lblcr_ctl_table = vs_vars_table; ipvs->sysctl_lblcr_expiration = DEFAULT_EXPIRATION; diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c index bb9dea676ec1..74112e9c5dab 100644 --- a/net/netfilter/nf_conntrack_standalone.c +++ b/net/netfilter/nf_conntrack_standalone.c @@ -616,11 +616,9 @@ enum nf_ct_sysctl_index { NF_SYSCTL_CT_LWTUNNEL, #endif - __NF_SYSCTL_CT_LAST_SYSCTL, + NF_SYSCTL_CT_LAST_SYSCTL, }; -#define NF_SYSCTL_CT_LAST_SYSCTL (__NF_SYSCTL_CT_LAST_SYSCTL + 1) - static struct ctl_table nf_ct_sysctl_table[] = { [NF_SYSCTL_CT_MAX] = { .procname = "nf_conntrack_max", @@ -957,7 +955,6 @@ static struct ctl_table nf_ct_sysctl_table[] = { .proc_handler = nf_hooks_lwtunnel_sysctl_handler, }, #endif - {} }; static struct ctl_table nf_ct_netfilter_table[] = { @@ -968,7 +965,6 @@ static struct ctl_table nf_ct_netfilter_table[] = { .mode = 0644, .proc_handler = proc_dointvec, }, - { } }; static void nf_conntrack_standalone_init_tcp_sysctl(struct net *net, diff --git a/net/netfilter/nf_log.c b/net/netfilter/nf_log.c index efedd2f13ac7..769fd7680fac 100644 --- a/net/netfilter/nf_log.c +++ b/net/netfilter/nf_log.c @@ -395,7 +395,7 @@ static const struct seq_operations nflog_seq_ops = { #ifdef CONFIG_SYSCTL static char nf_log_sysctl_fnames[NFPROTO_NUMPROTO-NFPROTO_UNSPEC][3]; -static struct ctl_table nf_log_sysctl_table[NFPROTO_NUMPROTO+1]; +static struct ctl_table nf_log_sysctl_table[NFPROTO_NUMPROTO]; static struct ctl_table_header *nf_log_sysctl_fhdr; static struct ctl_table nf_log_sysctl_ftable[] = { @@ -406,7 +406,6 @@ static struct ctl_table nf_log_sysctl_ftable[] = { .mode = 0644, .proc_handler = proc_dointvec, }, - { } }; static int nf_log_proc_dostring(struct ctl_table *table, int write, -- 2.43.0 From devnull+j.granados.samsung.com 
at kernel.org Wed May 1 02:29:29 2024 From: devnull+j.granados.samsung.com at kernel.org (Joel Granados via B4 Relay) Date: Wed, 01 May 2024 11:29:29 +0200 Subject: [PATCH net-next v6 5/8] net: Remove ctl_table sentinel elements from several networking subsystems In-Reply-To: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> References: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> Message-ID: <20240501-jag-sysctl_remset_net-v6-5-370b702b6b4a@samsung.com> From: Joel Granados This commit comes at the tail end of a greater effort to remove the empty elements at the end of the ctl_table arrays (sentinels) which will reduce the overall build time size of the kernel and run time memory bloat by ~64 bytes per sentinel (further information Link : https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/) To avoid lots of small commits, this commit brings together network changes from (as they appear in MAINTAINERS) LLC, MPTCP, NETROM NETWORK LAYER, PHONET PROTOCOL, ROSE NETWORK LAYER, RXRPC SOCKETS, SCTP PROTOCOL, SHARED MEMORY COMMUNICATIONS (SMC), TIPC NETWORK LAYER and NETWORKING [IPSEC] * Remove sentinel element from ctl_table structs. * Replace empty array registration with the register_net_sysctl_sz call in llc_sysctl_init * Replace the for loop stop condition that tests for procname == NULL with one that depends on array size in sctp_sysctl_net_register * Remove instances where an array element is zeroed out to make it look like a sentinel in xfrm_sysctl_init.
This is no longer needed and is safe after commit c899710fe7f9 ("networking: Update to register_net_sysctl_sz") added the array size to the ctl_table registration * Use a table_size variable to keep the value of ARRAY_SIZE Signed-off-by: Joel Granados --- net/llc/sysctl_net_llc.c | 8 ++------ net/mptcp/ctrl.c | 1 - net/netrom/sysctl_net_netrom.c | 1 - net/phonet/sysctl.c | 1 - net/rose/sysctl_net_rose.c | 1 - net/rxrpc/sysctl.c | 1 - net/sctp/sysctl.c | 10 +++------- net/smc/smc_sysctl.c | 6 +++--- net/tipc/sysctl.c | 1 - net/xfrm/xfrm_sysctl.c | 5 +---- 10 files changed, 9 insertions(+), 26 deletions(-) diff --git a/net/llc/sysctl_net_llc.c b/net/llc/sysctl_net_llc.c index 8443a6d841b0..72e101135f8c 100644 --- a/net/llc/sysctl_net_llc.c +++ b/net/llc/sysctl_net_llc.c @@ -44,11 +44,6 @@ static struct ctl_table llc2_timeout_table[] = { .mode = 0644, .proc_handler = proc_dointvec_jiffies, }, - { }, -}; - -static struct ctl_table llc_station_table[] = { - { }, }; static struct ctl_table_header *llc2_timeout_header; @@ -56,8 +51,9 @@ static struct ctl_table_header *llc_station_header; int __init llc_sysctl_init(void) { + struct ctl_table empty[1] = {}; llc2_timeout_header = register_net_sysctl(&init_net, "net/llc/llc2/timeout", llc2_timeout_table); - llc_station_header = register_net_sysctl(&init_net, "net/llc/station", llc_station_table); + llc_station_header = register_net_sysctl_sz(&init_net, "net/llc/station", empty, 0); if (!llc2_timeout_header || !llc_station_header) { llc_sysctl_exit(); diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c index 8d661156ab8c..f4e7a53acc5a 100644 --- a/net/mptcp/ctrl.c +++ b/net/mptcp/ctrl.c @@ -156,7 +156,6 @@ static struct ctl_table mptcp_sysctl_table[] = { .mode = 0644, .proc_handler = proc_dointvec_jiffies, }, - {} }; static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet) diff --git a/net/netrom/sysctl_net_netrom.c b/net/netrom/sysctl_net_netrom.c index 79fb2d3f477b..7dc0fa628f2e 100644 ---
a/net/netrom/sysctl_net_netrom.c +++ b/net/netrom/sysctl_net_netrom.c @@ -140,7 +140,6 @@ static struct ctl_table nr_table[] = { .extra1 = &min_reset, .extra2 = &max_reset }, - { } }; int __init nr_register_sysctl(void) diff --git a/net/phonet/sysctl.c b/net/phonet/sysctl.c index 0d0bf41381c2..82fc22467a09 100644 --- a/net/phonet/sysctl.c +++ b/net/phonet/sysctl.c @@ -81,7 +81,6 @@ static struct ctl_table phonet_table[] = { .mode = 0644, .proc_handler = proc_local_port_range, }, - { } }; int __init phonet_sysctl_init(void) diff --git a/net/rose/sysctl_net_rose.c b/net/rose/sysctl_net_rose.c index d391d7758f52..d801315b7083 100644 --- a/net/rose/sysctl_net_rose.c +++ b/net/rose/sysctl_net_rose.c @@ -112,7 +112,6 @@ static struct ctl_table rose_table[] = { .extra1 = &min_window, .extra2 = &max_window }, - { } }; void __init rose_register_sysctl(void) diff --git a/net/rxrpc/sysctl.c b/net/rxrpc/sysctl.c index c9bedd0e2d86..9bf9a1f6e4cb 100644 --- a/net/rxrpc/sysctl.c +++ b/net/rxrpc/sysctl.c @@ -127,7 +127,6 @@ static struct ctl_table rxrpc_sysctl_table[] = { .extra1 = (void *)SYSCTL_ONE, .extra2 = (void *)&four, }, - { } }; int __init rxrpc_sysctl_init(void) diff --git a/net/sctp/sysctl.c b/net/sctp/sysctl.c index 25bdf17c7262..61c6f3027e7f 100644 --- a/net/sctp/sysctl.c +++ b/net/sctp/sysctl.c @@ -80,8 +80,6 @@ static struct ctl_table sctp_table[] = { .mode = 0644, .proc_handler = proc_dointvec, }, - - { /* sentinel */ } }; /* The following index defines are used in sctp_sysctl_net_register(). 
@@ -384,8 +382,6 @@ static struct ctl_table sctp_net_table[] = { .extra1 = SYSCTL_ZERO, .extra2 = &pf_expose_max, }, - - { /* sentinel */ } }; static int proc_sctp_do_hmac_alg(struct ctl_table *ctl, int write, @@ -597,6 +593,7 @@ static int proc_sctp_do_probe_interval(struct ctl_table *ctl, int write, int sctp_sysctl_net_register(struct net *net) { + size_t table_size = ARRAY_SIZE(sctp_net_table); struct ctl_table *table; int i; @@ -604,7 +601,7 @@ int sctp_sysctl_net_register(struct net *net) if (!table) return -ENOMEM; - for (i = 0; table[i].data; i++) + for (i = 0; i < table_size; i++) table[i].data += (char *)(&net->sctp) - (char *)&init_net.sctp; table[SCTP_RTO_MIN_IDX].extra2 = &net->sctp.rto_max; @@ -613,8 +610,7 @@ int sctp_sysctl_net_register(struct net *net) table[SCTP_PS_RETRANS_IDX].extra1 = &net->sctp.pf_retrans; net->sctp.sysctl_header = register_net_sysctl_sz(net, "net/sctp", - table, - ARRAY_SIZE(sctp_net_table)); + table, table_size); if (net->sctp.sysctl_header == NULL) { kfree(table); return -ENOMEM; diff --git a/net/smc/smc_sysctl.c b/net/smc/smc_sysctl.c index 4e8baa2e7ea4..13f2bc092db1 100644 --- a/net/smc/smc_sysctl.c +++ b/net/smc/smc_sysctl.c @@ -90,11 +90,11 @@ static struct ctl_table smc_table[] = { .extra1 = &conns_per_lgr_min, .extra2 = &conns_per_lgr_max, }, - { } }; int __net_init smc_sysctl_net_init(struct net *net) { + size_t table_size = ARRAY_SIZE(smc_table); struct ctl_table *table; table = smc_table; @@ -105,12 +105,12 @@ int __net_init smc_sysctl_net_init(struct net *net) if (!table) goto err_alloc; - for (i = 0; i < ARRAY_SIZE(smc_table) - 1; i++) + for (i = 0; i < table_size; i++) table[i].data += (void *)net - (void *)&init_net; } net->smc.smc_hdr = register_net_sysctl_sz(net, "net/smc", table, - ARRAY_SIZE(smc_table)); + table_size); if (!net->smc.smc_hdr) goto err_reg; diff --git a/net/tipc/sysctl.c b/net/tipc/sysctl.c index 9fb65c988f7f..30d2e06e3d8c 100644 --- a/net/tipc/sysctl.c +++ b/net/tipc/sysctl.c @@ -91,7 +91,6 @@ 
static struct ctl_table tipc_table[] = { .mode = 0644, .proc_handler = proc_doulongvec_minmax, }, - {} }; int tipc_register_sysctl(void) diff --git a/net/xfrm/xfrm_sysctl.c b/net/xfrm/xfrm_sysctl.c index e972930c292b..ca003e8a0376 100644 --- a/net/xfrm/xfrm_sysctl.c +++ b/net/xfrm/xfrm_sysctl.c @@ -38,7 +38,6 @@ static struct ctl_table xfrm_table[] = { .mode = 0644, .proc_handler = proc_dointvec }, - {} }; int __net_init xfrm_sysctl_init(struct net *net) @@ -57,10 +56,8 @@ int __net_init xfrm_sysctl_init(struct net *net) table[3].data = &net->xfrm.sysctl_acq_expires; /* Don't export sysctls to unprivileged users */ - if (net->user_ns != &init_user_ns) { - table[0].procname = NULL; + if (net->user_ns != &init_user_ns) table_size = 0; - } net->xfrm.sysctl_hdr = register_net_sysctl_sz(net, "net/core", table, table_size); -- 2.43.0 From devnull+j.granados.samsung.com at kernel.org Wed May 1 02:29:32 2024 From: devnull+j.granados.samsung.com at kernel.org (Joel Granados via B4 Relay) Date: Wed, 01 May 2024 11:29:32 +0200 Subject: [PATCH net-next v6 8/8] ax.25: x.25: Remove the now superfluous sentinel elements from ctl_table array In-Reply-To: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> References: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> Message-ID: <20240501-jag-sysctl_remset_net-v6-8-370b702b6b4a@samsung.com> From: Joel Granados This commit comes at the tail end of a greater effort to remove the empty elements at the end of the ctl_table arrays (sentinels) which will reduce the overall build time size of the kernel and run time memory bloat by ~64 bytes per sentinel (further information Link : https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo at bombadil.infradead.org/) Avoid a buffer overflow when traversing the ctl_table by ensuring that AX25_MAX_VALUES is the same as the size of ax25_param_table. 
This is done with a BUILD_BUG_ON where ax25_param_table is defined and a CONFIG_AX25_DAMA_SLAVE guard in the unnamed enum definition as well as in the ax25_dev_device_up and ax25_ds_set_timer functions. The overflow happened when the sentinel was removed from ax25_param_table. The sentinel's data element was changed when CONFIG_AX25_DAMA_SLAVE was undefined. This had no adverse effects as it still stopped on the sentinel's null procname but needed to be addressed once the sentinel was removed. Signed-off-by: Joel Granados --- include/net/ax25.h | 2 ++ net/ax25/ax25_dev.c | 3 +++ net/ax25/ax25_ds_timer.c | 1 + net/ax25/sysctl_net_ax25.c | 3 +-- net/x25/sysctl_net_x25.c | 1 - 5 files changed, 7 insertions(+), 3 deletions(-) diff --git a/include/net/ax25.h b/include/net/ax25.h index 0d939e5aee4e..eb9cee8252c8 100644 --- a/include/net/ax25.h +++ b/include/net/ax25.h @@ -139,7 +139,9 @@ enum { AX25_VALUES_N2, /* Default N2 value */ AX25_VALUES_PACLEN, /* AX.25 MTU */ AX25_VALUES_PROTOCOL, /* Std AX.25, DAMA Slave, DAMA Master */ +#ifdef CONFIG_AX25_DAMA_SLAVE AX25_VALUES_DS_TIMEOUT, /* DAMA Slave timeout */ +#endif AX25_MAX_VALUES /* THIS MUST REMAIN THE LAST ENTRY OF THIS LIST */ }; diff --git a/net/ax25/ax25_dev.c b/net/ax25/ax25_dev.c index 282ec581c072..0bc682ffae9c 100644 --- a/net/ax25/ax25_dev.c +++ b/net/ax25/ax25_dev.c @@ -78,7 +78,10 @@ void ax25_dev_device_up(struct net_device *dev) ax25_dev->values[AX25_VALUES_N2] = AX25_DEF_N2; ax25_dev->values[AX25_VALUES_PACLEN] = AX25_DEF_PACLEN; ax25_dev->values[AX25_VALUES_PROTOCOL] = AX25_DEF_PROTOCOL; + +#ifdef CONFIG_AX25_DAMA_SLAVE ax25_dev->values[AX25_VALUES_DS_TIMEOUT]= AX25_DEF_DS_TIMEOUT; +#endif #if defined(CONFIG_AX25_DAMA_SLAVE) || defined(CONFIG_AX25_DAMA_MASTER) ax25_ds_setup_timer(ax25_dev); diff --git a/net/ax25/ax25_ds_timer.c b/net/ax25/ax25_ds_timer.c index c4f8adbf8144..c50a58d9e368 100644 --- a/net/ax25/ax25_ds_timer.c +++ b/net/ax25/ax25_ds_timer.c @@ -55,6 +55,7 @@ void ax25_ds_set_timer(ax25_dev 
*ax25_dev) ax25_dev->dama.slave_timeout = msecs_to_jiffies(ax25_dev->values[AX25_VALUES_DS_TIMEOUT]) / 10; mod_timer(&ax25_dev->dama.slave_timer, jiffies + HZ); + return; } /* diff --git a/net/ax25/sysctl_net_ax25.c b/net/ax25/sysctl_net_ax25.c index e0128dc9def3..68753aa30334 100644 --- a/net/ax25/sysctl_net_ax25.c +++ b/net/ax25/sysctl_net_ax25.c @@ -141,8 +141,6 @@ static const struct ctl_table ax25_param_table[] = { .extra2 = &max_ds_timeout }, #endif - - { } /* that's all, folks! */ }; int ax25_register_dev_sysctl(ax25_dev *ax25_dev) @@ -155,6 +153,7 @@ int ax25_register_dev_sysctl(ax25_dev *ax25_dev) if (!table) return -ENOMEM; + BUILD_BUG_ON(ARRAY_SIZE(ax25_param_table) != AX25_MAX_VALUES); for (k = 0; k < AX25_MAX_VALUES; k++) table[k].data = &ax25_dev->values[k]; diff --git a/net/x25/sysctl_net_x25.c b/net/x25/sysctl_net_x25.c index e9802afa43d0..643f50874dfe 100644 --- a/net/x25/sysctl_net_x25.c +++ b/net/x25/sysctl_net_x25.c @@ -71,7 +71,6 @@ static struct ctl_table x25_table[] = { .mode = 0644, .proc_handler = proc_dointvec, }, - { }, }; int __init x25_register_sysctl(void) -- 2.43.0 From devnull+j.granados.samsung.com at kernel.org Wed May 1 02:29:24 2024 From: devnull+j.granados.samsung.com at kernel.org (Joel Granados via B4 Relay) Date: Wed, 01 May 2024 11:29:24 +0200 Subject: [PATCH net-next v6 0/8] sysctl: Remove sentinel elements from networking Message-ID: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> From: Joel Granados What? These commits remove the sentinel element (last empty element) from the sysctl arrays of all the files under the "net/" directory that register a sysctl array. The merging of the preparation patches [4] to mainline allows us to just remove sentinel elements without changing behavior. This is safe because the sysctl registration code (register_sysctl() and friends) use the array size in addition to checking for a sentinel [1]. Why? 
By removing the sysctl sentinel elements we avoid kernel bloat as ctl_table arrays get moved out of kernel/sysctl.c into their own respective subsystems. This move was started long ago to avoid merge conflicts; the sentinel removal bit came after Matthew Wilcox suggested it to avoid bloating the kernel by one element as arrays moved out. This patchset will reduce the overall build time size of the kernel and run time memory bloat by about 64 bytes per declared ctl_table array (more info here [5]).

When are we done?
There are 4 patchsets (25 commits [2]) that are still outstanding to completely remove the sentinels: files under "net/" (this patchset), files under "kernel/" dir, misc dirs (files under mm/, security/ and others) and the final set that removes the unneeded check for ->procname == NULL.

Testing:
* Ran sysctl selftests (./tools/testing/selftests/sysctl/sysctl.sh)
* Ran this through 0-day with no errors or warnings

Savings in vmlinux:
A total of 64 bytes per sentinel is saved after removal; I measured in x86_64 to give an idea of the aggregated savings. The actual savings will depend on individual kernel configuration.
* bloat-o-meter
  - The "yesall" config saves 3976 bytes (bloat-o-meter output [6])
  - A reduced config [3] saves 1255 bytes net (1263 shrunk, 8 grown; bloat-o-meter output [7])

Savings in allocated memory:
None in this set but will occur when the superfluous allocations are removed from proc_sysctl.c. I include it here for context. The estimated savings during boot for config [3] are 6272 bytes. See [8] for how to measure it.

Comments/feedback greatly appreciated

Changes in v6:
- Rebased onto net-next/main.
- Besides re-running my cocci scripts, I ran a new find script [9]. Found 0 hits in net/
- Moved "i" variable declaration out of for() in sysctl_core_net_init
- Removed forgotten sentinel in mpls_table
- Removed CONFIG_AX25_DAMA_SLAVE guard from net/ax25/ax25_ds_timer.c. It is not needed because that file is compiled only when CONFIG_AX25_DAMA_SLAVE is set.
- When traversing smc_table, stop on ARRAY_SIZE instead of ARRAY_SIZE-1.
- Link to v5: https://lore.kernel.org/r/20240426-jag-sysctl_remset_net-v5-0-e3b12f6111a6 at samsung.com

Changes in v5:
- Added net files with additional variable to my test .config so the typo can be caught next time.
- Fixed typo tabel_size -> table_size
- Link to v4: https://lore.kernel.org/r/20240425-jag-sysctl_remset_net-v4-0-9e82f985777d at samsung.com

Changes in v4:
- Keep reverse xmas tree order when introducing new variables
- Use a table_size variable to keep the value of ARRAY_SIZE
- Separated the original "networking: Remove the now superfluous sentinel elements from ctl_table arra" into smaller commits to ease review
- Merged x.25 and ax.25 commits together.
- Removed any SOB from the commits that were changed
- Link to v3: https://lore.kernel.org/r/20240412-jag-sysctl_remset_net-v3-0-11187d13c211 at samsung.com

Changes in v3:
- Reworked ax.25
  - Added a BUILD_BUG_ON for the ax.25 commit
  - Added a CONFIG_AX25_DAMA_SLAVE guard where needed
- Link to v2: https://lore.kernel.org/r/20240328-jag-sysctl_remset_net-v2-0-52c9fad9a1af at samsung.com

Changes in v2:
- Rebased to v6.9-rc1
- Removed unneeded comment from sysctl_net_ax25.c
- Link to v1: https://lore.kernel.org/r/20240314-jag-sysctl_remset_net-v1-0-aa26b44d29d9 at samsung.com

Best

Joel

[1] https://lore.kernel.org/all/20230809105006.1198165-1-j.granados at samsung.com/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/joel.granados/linux.git/tag/?h=sysctl_remove_empty_elem_v5
[3] https://gist.github.com/Joelgranados/feaca7af5537156ca9b73aeaec093171
[4] https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo at bombadil.infradead.org/

[5] Links Related to the ctl_table sentinel removal:
* Good summaries from Luis:
  https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo at bombadil.infradead.org/
  https://lore.kernel.org/all/ZMFizKFkVxUFtSqa at bombadil.infradead.org/
* Patches adjusting sysctl register calls:
  https://lore.kernel.org/all/20230302204612.782387-1-mcgrof at kernel.org/
  https://lore.kernel.org/all/20230302202826.776286-1-mcgrof at kernel.org/
* Discussions about expectations and approach
  https://lore.kernel.org/all/20230321130908.6972-1-frank.li at vivo.com
  https://lore.kernel.org/all/20220220060626.15885-1-tangmeng at uniontech.com

[6]
add/remove: 0/1 grow/shrink: 2/67 up/down: 76/-4052 (-3976)
Function                           old     new   delta
llc_sysctl_init                    306     377     +71
nf_log_net_init                    866     871      +5
sysctl_core_net_init               375     366      -9
lowpan_frags_init_net              618     598     -20
ip_vs_control_net_init_sysctl     2446    2422     -24
sysctl_route_net_init              521     493     -28
__addrconf_sysctl_register         678     650     -28
xfrm_sysctl_init                   405     374     -31
mpls_net_init                      367     334     -33
sctp_sysctl_net_register           386     346     -40
__ip_vs_lblcr_init                 546     501     -45
__ip_vs_lblc_init                  546     501     -45
neigh_sysctl_register             1011     958     -53
mpls_dev_sysctl_register           475     419     -56
ipv6_route_sysctl_init             450     394     -56
xs_tunables_table                  448     384     -64
xr_tunables_table                  448     384     -64
xfrm_table                         320     256     -64
xfrm6_policy_table                 128      64     -64
xfrm4_policy_table                 128      64     -64
x25_table                          448     384     -64
vs_vars                           1984    1920     -64
unix_table                         128      64     -64
tipc_table                         448     384     -64
svcrdma_parm_table                 832     768     -64
smc_table                          512     448     -64
sctp_table                         256     192     -64
sctp_net_table                    2304    2240     -64
rxrpc_sysctl_table                 704     640     -64
rose_table                         704     640     -64
rds_tcp_sysctl_table               192     128     -64
rds_sysctl_rds_table               384     320     -64
rds_ib_sysctl_table                384     320     -64
phonet_table                       128      64     -64
nr_table                           832     768     -64
nf_log_sysctl_table                768     704     -64
nf_log_sysctl_ftable               128      64     -64
nf_ct_sysctl_table                3200    3136     -64
nf_ct_netfilter_table              128      64     -64
nf_ct_frag6_sysctl_table           256     192     -64
netns_core_table                   320     256     -64
net_core_table                    2176    2112     -64
neigh_sysctl_template             1416    1352     -64
mptcp_sysctl_table                 576     512     -64
mpls_dev_table                     128      64     -64
lowpan_frags_ns_ctl_table          256     192     -64
lowpan_frags_ctl_table             128      64     -64
llc_station_table                   64       -     -64
llc2_timeout_table                 320     256     -64
ipv6_table_template               1344    1280     -64
ipv6_route_table_template          768     704     -64
ipv6_rotable                       320     256     -64
ipv6_icmp_table_template           448     384     -64
ipv4_table                        1024     960     -64
ipv4_route_table                   832     768     -64
ipv4_route_netns_table             320     256     -64
ipv4_net_table                    7552    7488     -64
ip6_frags_ns_ctl_table             256     192     -64
ip6_frags_ctl_table                128      64     -64
ip4_frags_ns_ctl_table             320     256     -64
ip4_frags_ctl_table                128      64     -64
devinet_sysctl                    2184    2120     -64
debug_table                        384     320     -64
dccp_default_table                 576     512     -64
ctl_forward_entry                  128      64     -64
brnf_table                         448     384     -64
ax25_param_table                   960     896     -64
atalk_table                        320     256     -64
addrconf_sysctl                   3904    3840     -64
vs_vars_table                      256     128    -128
Total: Before=440631035, After=440627059, chg -0.00%

[7]
add/remove: 0/0 grow/shrink: 1/22 up/down: 8/-1263 (-1255)
Function                           old     new   delta
sysctl_route_net_init              189     197      +8
__addrconf_sysctl_register         306     294     -12
ipv6_route_sysctl_init             201     185     -16
neigh_sysctl_register              385     366     -19
unix_table                         128      64     -64
netns_core_table                   256     192     -64
net_core_table                    1664    1600     -64
neigh_sysctl_template             1416    1352     -64
ipv6_table_template               1344    1280     -64
ipv6_route_table_template          768     704     -64
ipv6_rotable                       192     128     -64
ipv6_icmp_table_template           448     384     -64
ipv4_table                         768     704     -64
ipv4_route_table                   832     768     -64
ipv4_route_netns_table             320     256     -64
ipv4_net_table                    7040    6976     -64
ip6_frags_ns_ctl_table             256     192     -64
ip6_frags_ctl_table                128      64     -64
ip4_frags_ns_ctl_table             320     256     -64
ip4_frags_ctl_table                128      64     -64
devinet_sysctl                    2184    2120     -64
ctl_forward_entry                  128      64     -64
addrconf_sysctl                   3392    3328     -64
Total: Before=8523801, After=8522546, chg -0.01%

[8]
To measure the in memory savings apply this on top of this patchset.
"
diff --git i/fs/proc/proc_sysctl.c w/fs/proc/proc_sysctl.c
index 37cde0efee57..896c498600e8 100644
--- i/fs/proc/proc_sysctl.c
+++ w/fs/proc/proc_sysctl.c
@@ -966,6 +966,7 @@ static struct ctl_dir *new_dir(struct ctl_table_set *set,
 	table[0].procname = new_name;
 	table[0].mode = S_IFDIR|S_IRUGO|S_IXUGO;
 	init_header(&new->header, set->dir.header.root, set, node, table, 1);
+	printk("%ld sysctl saved mem kzalloc\n", sizeof(struct ctl_table));
 
 	return new;
 }
@@ -1189,6 +1190,7 @@ static struct ctl_table_header *new_links(struct ctl_dir *dir, s>
 		link_name += len;
 		link++;
 	}
+	printk("%ld sysctl saved mem kzalloc\n", sizeof(struct ctl_table));
 	init_header(links, dir->header.root, dir->header.set, node, link_table,
 		    head->ctl_table_size);
 	links->nreg = nr_entries;
"
and then run the following bash script in the kernel:

```
accum=0
for n in $(dmesg | grep kzalloc | awk '{print $3}') ; do
    accum=$(calc "$accum + $n")
done
echo $accum
```

[9]
```
#!/usr/bin/gawk -f

BEGINFILE {
    RS=","
    has_struct = 0
}

/(static )?(const )?struct ctl_table/ {
    has_struct = 1
}

has_struct && /^(\n)?[\t ]*{(\n)*[\t ]*}/ {
    print "Filename : " FILENAME ", Record Number : " FNR
}
```

Signed-off-by: Joel Granados

--

---
---
Joel Granados (8):
      net: Remove the now superfluous sentinel elements from ctl_table array
      net: ipv{6,4}: Remove the now superfluous sentinel elements from ctl_table array
      net: rds: Remove the now superfluous sentinel elements from ctl_table array
      net: sunrpc: Remove the now superfluous sentinel elements from ctl_table array
      net: Remove ctl_table sentinel elements from several networking subsystems
      netfilter: Remove the now superfluous sentinel elements from ctl_table array
      appletalk: Remove the now superfluous sentinel elements from ctl_table array
      ax.25: x.25: Remove the now superfluous sentinel elements from ctl_table array

 include/net/ax25.h                      |  2 ++
 net/appletalk/sysctl_net_atalk.c        |  1 -
 net/ax25/ax25_dev.c                     |  3 +++
 net/ax25/ax25_ds_timer.c                |  1 +
 net/ax25/sysctl_net_ax25.c              |  3 +--
 net/bridge/br_netfilter_hooks.c         |  1 -
 net/core/neighbour.c                    |  5 +----
 net/core/sysctl_net_core.c              | 13 ++++++-------
 net/dccp/sysctl.c                       |  2 --
 net/ieee802154/6lowpan/reassembly.c     |  6 +-----
 net/ipv4/devinet.c                      |  5 ++---
 net/ipv4/ip_fragment.c                  |  2 --
 net/ipv4/route.c                        |  8 ++------
 net/ipv4/sysctl_net_ipv4.c              |  7 +++----
 net/ipv4/xfrm4_policy.c                 |  1 -
 net/ipv6/addrconf.c                     |  8 +++-----
 net/ipv6/icmp.c                         |  1 -
 net/ipv6/netfilter/nf_conntrack_reasm.c |  1 -
 net/ipv6/reassembly.c                   |  2 --
 net/ipv6/route.c                        |  5 -----
 net/ipv6/sysctl_net_ipv6.c              |  8 +++-----
 net/ipv6/xfrm6_policy.c                 |  1 -
 net/llc/sysctl_net_llc.c                |  8 ++------
 net/mpls/af_mpls.c                      | 13 ++++++-------
 net/mptcp/ctrl.c                        |  1 -
 net/netfilter/ipvs/ip_vs_ctl.c          |  5 +----
 net/netfilter/ipvs/ip_vs_lblc.c         |  5 +----
 net/netfilter/ipvs/ip_vs_lblcr.c        |  5 +----
 net/netfilter/nf_conntrack_standalone.c |  6 +-----
 net/netfilter/nf_log.c                  |  3 +--
 net/netrom/sysctl_net_netrom.c          |  1 -
 net/phonet/sysctl.c                     |  1 -
 net/rds/ib_sysctl.c                     |  1 -
 net/rds/sysctl.c                        |  1 -
 net/rds/tcp.c                           |  1 -
 net/rose/sysctl_net_rose.c              |  1 -
 net/rxrpc/sysctl.c                      |  1 -
 net/sctp/sysctl.c                       | 10 +++-------
 net/smc/smc_sysctl.c                    |  6 +++---
 net/sunrpc/sysctl.c                     |  1 -
 net/sunrpc/xprtrdma/svc_rdma.c          |  1 -
 net/sunrpc/xprtrdma/transport.c         |  1 -
 net/sunrpc/xprtsock.c                   |  1 -
 net/tipc/sysctl.c                       |  1 -
 net/unix/sysctl_net_unix.c              |  1 -
 net/x25/sysctl_net_x25.c                |  1 -
 net/xfrm/xfrm_sysctl.c                  |  5 +----
 47 files changed, 48 insertions(+), 119 deletions(-)
---
base-commit: c2e6a872bde9912f1a7579639c5ca3adf1003916
change-id: 20240311-jag-sysctl_remset_net-d403a1a93d6b

Best regards,
--
Joel Granados

From devnull+j.granados.samsung.com at kernel.org Wed May 1 02:29:28 2024
From: devnull+j.granados.samsung.com at kernel.org (Joel Granados via B4 Relay)
Date: Wed, 01 May 2024 11:29:28 +0200
Subject: [PATCH net-next v6 4/8] net: sunrpc: Remove the now superfluous sentinel elements from ctl_table array
In-Reply-To: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com>
References:
<20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> Message-ID: <20240501-jag-sysctl_remset_net-v6-4-370b702b6b4a@samsung.com> From: Joel Granados This commit comes at the tail end of a greater effort to remove the empty elements at the end of the ctl_table arrays (sentinels) which will reduce the overall build time size of the kernel and run time memory bloat by ~64 bytes per sentinel (further information Link : https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo at bombadil.infradead.org/) * Remove sentinel element from ctl_table structs. Signed-off-by: Joel Granados --- net/sunrpc/sysctl.c | 1 - net/sunrpc/xprtrdma/svc_rdma.c | 1 - net/sunrpc/xprtrdma/transport.c | 1 - net/sunrpc/xprtsock.c | 1 - 4 files changed, 4 deletions(-) diff --git a/net/sunrpc/sysctl.c b/net/sunrpc/sysctl.c index 93941ab12549..5f3170a1c9bb 100644 --- a/net/sunrpc/sysctl.c +++ b/net/sunrpc/sysctl.c @@ -160,7 +160,6 @@ static struct ctl_table debug_table[] = { .mode = 0444, .proc_handler = proc_do_xprt, }, - { } }; void diff --git a/net/sunrpc/xprtrdma/svc_rdma.c b/net/sunrpc/xprtrdma/svc_rdma.c index f86970733eb0..474f7a98fe9e 100644 --- a/net/sunrpc/xprtrdma/svc_rdma.c +++ b/net/sunrpc/xprtrdma/svc_rdma.c @@ -209,7 +209,6 @@ static struct ctl_table svcrdma_parm_table[] = { .extra1 = &zero, .extra2 = &zero, }, - { }, }; static void svc_rdma_proc_cleanup(void) diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c index 29b0562d62e7..9a8ce5df83ca 100644 --- a/net/sunrpc/xprtrdma/transport.c +++ b/net/sunrpc/xprtrdma/transport.c @@ -137,7 +137,6 @@ static struct ctl_table xr_tunables_table[] = { .mode = 0644, .proc_handler = proc_dointvec, }, - { }, }; #endif diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index bb9b747d58a1..f62f7b65455b 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -160,7 +160,6 @@ static struct ctl_table xs_tunables_table[] = { .mode = 0644, .proc_handler = proc_dointvec_jiffies, }, - { }, }; /* -- 2.43.0 From 
devnull+j.granados.samsung.com at kernel.org Wed May 1 02:29:27 2024 From: devnull+j.granados.samsung.com at kernel.org (Joel Granados via B4 Relay) Date: Wed, 01 May 2024 11:29:27 +0200 Subject: [PATCH net-next v6 3/8] net: rds: Remove the now superfluous sentinel elements from ctl_table array In-Reply-To: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> References: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> Message-ID: <20240501-jag-sysctl_remset_net-v6-3-370b702b6b4a@samsung.com> From: Joel Granados This commit comes at the tail end of a greater effort to remove the empty elements at the end of the ctl_table arrays (sentinels) which will reduce the overall build time size of the kernel and run time memory bloat by ~64 bytes per sentinel (further information Link : https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo at bombadil.infradead.org/) * Remove sentinel element from ctl_table structs. Signed-off-by: Joel Granados --- net/rds/ib_sysctl.c | 1 - net/rds/sysctl.c | 1 - net/rds/tcp.c | 1 - 3 files changed, 3 deletions(-) diff --git a/net/rds/ib_sysctl.c b/net/rds/ib_sysctl.c index e4e41b3afce7..2af678e71e3c 100644 --- a/net/rds/ib_sysctl.c +++ b/net/rds/ib_sysctl.c @@ -103,7 +103,6 @@ static struct ctl_table rds_ib_sysctl_table[] = { .mode = 0644, .proc_handler = proc_dointvec, }, - { } }; void rds_ib_sysctl_exit(void) diff --git a/net/rds/sysctl.c b/net/rds/sysctl.c index e381bbcd9cc1..025f518a4349 100644 --- a/net/rds/sysctl.c +++ b/net/rds/sysctl.c @@ -89,7 +89,6 @@ static struct ctl_table rds_sysctl_rds_table[] = { .mode = 0644, .proc_handler = proc_dointvec, }, - { } }; void rds_sysctl_exit(void) diff --git a/net/rds/tcp.c b/net/rds/tcp.c index 2dba7505b414..d8111ac83bb6 100644 --- a/net/rds/tcp.c +++ b/net/rds/tcp.c @@ -86,7 +86,6 @@ static struct ctl_table rds_tcp_sysctl_table[] = { .proc_handler = rds_tcp_skbuf_handler, .extra1 = &rds_tcp_min_rcvbuf, }, - { } }; u32 rds_tcp_write_seq(struct rds_tcp_connection *tc) -- 2.43.0 
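For readers skimming the series, the removals in the patches above all rest on one idea: code that walks a ctl_table array can stop at a known element count instead of at an empty terminator. The following is a minimal userspace sketch of that idea, not kernel code. `struct entry`, `count_until_sentinel()` and `count_by_size()` are hypothetical stand-ins introduced here for illustration; the kernel's `struct ctl_table` has more fields and `ARRAY_SIZE` is redefined locally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ctl_table; the real struct has more fields. */
struct entry {
	const char *procname;
	int *data;
};

/* Local copy of the usual element-count macro. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static int a, b;

/* Old style: a trailing empty element marks the end of the array. */
static struct entry with_sentinel[] = {
	{ .procname = "a", .data = &a },
	{ .procname = "b", .data = &b },
	{ .procname = NULL }
};

/* New style: no terminator, so callers must pass the element count. */
static struct entry without_sentinel[] = {
	{ .procname = "a", .data = &a },
	{ .procname = "b", .data = &b },
};

/* Walk until the NULL procname of the sentinel is reached. */
static size_t count_until_sentinel(const struct entry *t)
{
	size_t n = 0;

	assert(t != NULL);
	while (t[n].procname)
		n++;
	return n;
}

/* Walk exactly size elements; nothing past the end is read. */
static size_t count_by_size(const struct entry *t, size_t size)
{
	size_t n = 0;

	assert(t != NULL);
	for (size_t i = 0; i < size; i++)
		if (t[i].procname)
			n++;
	return n;
}
```

Calling `count_by_size(without_sentinel, ARRAY_SIZE(without_sentinel))` visits the same two entries as `count_until_sentinel(with_sentinel)`, while the array itself is one `struct entry` smaller. Running the sentinel-based walk on `without_sentinel` would read past the end of the array, which is why loops such as the ones in sctp and smc had to switch to a size bound.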
From devnull+j.granados.samsung.com at kernel.org Wed May 1 02:29:31 2024 From: devnull+j.granados.samsung.com at kernel.org (Joel Granados via B4 Relay) Date: Wed, 01 May 2024 11:29:31 +0200 Subject: [PATCH net-next v6 7/8] appletalk: Remove the now superfluous sentinel elements from ctl_table array In-Reply-To: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> References: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> Message-ID: <20240501-jag-sysctl_remset_net-v6-7-370b702b6b4a@samsung.com> From: Joel Granados This commit comes at the tail end of a greater effort to remove the empty elements at the end of the ctl_table arrays (sentinels) which will reduce the overall build time size of the kernel and run time memory bloat by ~64 bytes per sentinel (further information Link : https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo at bombadil.infradead.org/) Remove sentinel from atalk_table ctl_table array. Acked-by: Kees Cook # loadpin & yama Signed-off-by: Joel Granados --- net/appletalk/sysctl_net_atalk.c | 1 - 1 file changed, 1 deletion(-) diff --git a/net/appletalk/sysctl_net_atalk.c b/net/appletalk/sysctl_net_atalk.c index d945b7c0176d..7aebfe903242 100644 --- a/net/appletalk/sysctl_net_atalk.c +++ b/net/appletalk/sysctl_net_atalk.c @@ -40,7 +40,6 @@ static struct ctl_table atalk_table[] = { .mode = 0644, .proc_handler = proc_dointvec_jiffies, }, - { }, }; static struct ctl_table_header *atalk_table_header; -- 2.43.0 From dhowells at redhat.com Wed May 1 04:51:13 2024 From: dhowells at redhat.com (David Howells) Date: Wed, 01 May 2024 12:51:13 +0100 Subject: [PATCH v2 07/22] mm: Provide a means of invalidation without using launder_folio In-Reply-To: <20240430140056.261997-8-dhowells@redhat.com> References: <20240430140056.261997-8-dhowells@redhat.com> <20240430140056.261997-1-dhowells@redhat.com> Message-ID: <438908.1714564273@warthog.procyon.org.uk> David Howells wrote: > + .range_start = first, > + .range_end = last, > ... 
> + truncate_inode_pages_range(mapping, first, last); These actually take file offsets and not page ranges and so the attached change is needed. Without this, the generic/412 xfstest fails. David --- diff --git a/mm/filemap.c b/mm/filemap.c index 53516305b4b4..3916fc8b10e6 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -4171,15 +4171,15 @@ int filemap_invalidate_inode(struct inode *inode, bool flush, struct writeback_control wbc = { .sync_mode = WB_SYNC_ALL, .nr_to_write = LONG_MAX, - .range_start = first, - .range_end = last, + .range_start = start, + .range_end = end, }; filemap_fdatawrite_wbc(mapping, &wbc); } /* Wait for writeback to complete on all folios and discard. */ - truncate_inode_pages_range(mapping, first, last); + truncate_inode_pages_range(mapping, start, end); unlock: filemap_invalidate_unlock(mapping); From sd at queasysnail.net Wed May 1 06:15:54 2024 From: sd at queasysnail.net (Sabrina Dubroca) Date: Wed, 1 May 2024 15:15:54 +0200 Subject: [PATCH net-next v6 8/8] ax.25: x.25: Remove the now superfluous sentinel elements from ctl_table array In-Reply-To: <20240501-jag-sysctl_remset_net-v6-8-370b702b6b4a@samsung.com> References: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> <20240501-jag-sysctl_remset_net-v6-8-370b702b6b4a@samsung.com> Message-ID: 2024-05-01, 11:29:32 +0200, Joel Granados via B4 Relay wrote: > From: Joel Granados > > This commit comes at the tail end of a greater effort to remove the > empty elements at the end of the ctl_table arrays (sentinels) which will > reduce the overall build time size of the kernel and run time memory > bloat by ~64 bytes per sentinel (further information Link : > https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo at bombadil.infradead.org/) > > Avoid a buffer overflow when traversing the ctl_table by ensuring that > AX25_MAX_VALUES is the same as the size of ax25_param_table. 
This is > done with a BUILD_BUG_ON where ax25_param_table is defined and a > CONFIG_AX25_DAMA_SLAVE guard in the unnamed enum definition as well as > in the ax25_dev_device_up and ax25_ds_set_timer functions. ^^ nit: not anymore ;) (but not worth a repost IMO) > diff --git a/net/ax25/ax25_ds_timer.c b/net/ax25/ax25_ds_timer.c > index c4f8adbf8144..c50a58d9e368 100644 > --- a/net/ax25/ax25_ds_timer.c > +++ b/net/ax25/ax25_ds_timer.c > @@ -55,6 +55,7 @@ void ax25_ds_set_timer(ax25_dev *ax25_dev) > ax25_dev->dama.slave_timeout = > msecs_to_jiffies(ax25_dev->values[AX25_VALUES_DS_TIMEOUT]) / 10; > mod_timer(&ax25_dev->dama.slave_timer, jiffies + HZ); > + return; nit: return not needed here since we're already at the bottom of the function, but probably not worth a repost of the series. > } -- Sabrina From sd at queasysnail.net Wed May 1 06:22:08 2024 From: sd at queasysnail.net (Sabrina Dubroca) Date: Wed, 1 May 2024 15:22:08 +0200 Subject: [PATCH net-next v6 0/8] sysctl: Remove sentinel elements from networking In-Reply-To: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> References: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> Message-ID: 2024-05-01, 11:29:24 +0200, Joel Granados via B4 Relay wrote: > From: Joel Granados > > What? > These commits remove the sentinel element (last empty element) from the > sysctl arrays of all the files under the "net/" directory that register > a sysctl array. The merging of the preparation patches [4] to mainline > allows us to just remove sentinel elements without changing behavior. > This is safe because the sysctl registration code (register_sysctl() and > friends) use the array size in addition to checking for a sentinel [1]. > > Why? > By removing the sysctl sentinel elements we avoid kernel bloat as > ctl_table arrays get moved out of kernel/sysctl.c into their own > respective subsystems. 
This move was started long ago to avoid merge > conflicts; the sentinel removal bit came after Mathew Wilcox suggested > it to avoid bloating the kernel by one element as arrays moved out. This > patchset will reduce the overall build time size of the kernel and run > time memory bloat by about ~64 bytes per declared ctl_table array (more > info here [5]). > > When are we done? > There are 4 patchest (25 commits [2]) that are still outstanding to > completely remove the sentinels: files under "net/" (this patchset), > files under "kernel/" dir, misc dirs (files under mm/ security/ and > others) and the final set that removes the unneeded check for ->procname > == NULL. > > Testing: > * Ran sysctl selftests (./tools/testing/selftests/sysctl/sysctl.sh) > * Ran this through 0-day with no errors or warnings > > Savings in vmlinux: > A total of 64 bytes per sentinel is saved after removal; I measured in > x86_64 to give an idea of the aggregated savings. The actual savings > will depend on individual kernel configuration. > * bloat-o-meter > - The "yesall" config saves 3976 bytes (bloat-o-meter output [6]) > - A reduced config [3] saves 1263 bytes (bloat-o-meter output [7]) > > Savings in allocated memory: > None in this set but will occur when the superfluous allocations are > removed from proc_sysctl.c. I include it here for context. The > estimated savings during boot for config [3] are 6272 bytes. See [8] > for how to measure it. > > Comments/feedback greatly appreciated > > Changes in v6: > - Rebased onto net-next/main. > - Besides re-running my cocci scripts, I ran a new find script [9]. > Found 0 hits in net/ > - Moved "i" variable declaraction out of for() in sysctl_core_net_init > - Removed forgotten sentinel in mpls_table > - Removed CONFIG_AX25_DAMA_SLAVE guard from net/ax25/ax25_ds_timer.c. It > is not needed because that file is compiled only when > CONFIG_AX25_DAMA_SLAVE is set. > - When traversing smc_table, stop on ARRAY_SIZE instead of ARRAY_SIZE-1. 
> - Link to v5: https://lore.kernel.org/r/20240426-jag-sysctl_remset_net-v5-0-e3b12f6111a6 at samsung.com I pointed out a few tiny details in the ax25 patch but either way, the series looks good to me. Thanks! Series: Reviewed-by: Sabrina Dubroca Note that you could have kept the ack/reviewed-by on patch 4 since it was not modified. Jeff and Chuck, your reviews got lost in the repost. -- Sabrina From dhowells at redhat.com Wed May 1 10:00:59 2024 From: dhowells at redhat.com (David Howells) Date: Wed, 01 May 2024 18:00:59 +0100 Subject: [PATCH v2 14/22] netfs: New writeback implementation In-Reply-To: <20240430140056.261997-15-dhowells@redhat.com> References: <20240430140056.261997-15-dhowells@redhat.com> <20240430140056.261997-1-dhowells@redhat.com> Message-ID: <458060.1714582859@warthog.procyon.org.uk> This needs the attached change. It needs to allow for netfs_perform_write() changing i_size whilst we're doing writeback. The issue is that i_size is cached in the netfs_io_request struct (as that's what we're going to tell the server the new i_size should be), but we're not updating this properly if i_size moves between us creating the request and us deciding to write out the folio in which i_size was when we created the request. This can lead to the folio_zero_segment() that can be seen in the patch below clearing the wrong amount of the final page - assuming it's still the final page. 
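As a rough illustration of the i_size problem described above, here is a minimal userspace sketch of the clamping arithmetic. `valid_len()` is a hypothetical helper, not a netfs function, and the numbers in the note below are made up; it only models how the count of valid bytes in the final folio follows from the folio position, the folio size and i_size, with everything from that count to the end of the folio being the part that would get zeroed.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a netfs function): number of bytes in a folio
 * that lie below i_size. The caller is assumed to have handled the
 * "folio entirely beyond EOF" case already, so fpos < i_size must hold.
 */
static size_t valid_len(long long fpos, size_t fsize, long long i_size)
{
	long long remain;

	assert(fpos < i_size);
	remain = i_size - fpos;
	/* Clamp to the folio size when EOF lies beyond this folio. */
	return remain < (long long)fsize ? (size_t)remain : fsize;
}
```

Assuming a 4096-byte folio at position 0, a cached i_size of 100 yields a valid length of 100, so bytes 100 to 4095 would be cleared. If i_size has meanwhile grown to 300, the right answer is 300, and clearing based on the stale value would wipe 200 bytes of valid data.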
David --- diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c index 69c50f4cbf41..e190043bc0da 100644 --- a/fs/netfs/write_issue.c +++ b/fs/netfs/write_issue.c @@ -315,13 +315,19 @@ static int netfs_write_folio(struct netfs_io_request *wreq, struct netfs_group *fgroup; /* TODO: Use this with ceph */ struct netfs_folio *finfo; size_t fsize = folio_size(folio), flen = fsize, foff = 0; - loff_t fpos = folio_pos(folio); + loff_t fpos = folio_pos(folio), i_size; bool to_eof = false, streamw = false; bool debug = false; _enter(""); - if (fpos >= wreq->i_size) { + /* netfs_perform_write() may shift i_size around the page or from out + * of the page to beyond it, but cannot move i_size into or through the + * page since we have it locked. + */ + i_size = i_size_read(wreq->inode); + + if (fpos >= i_size) { /* mmap beyond eof. */ _debug("beyond eof"); folio_start_writeback(folio); @@ -332,6 +338,9 @@ static int netfs_write_folio(struct netfs_io_request *wreq, return 0; } + if (fpos + fsize > wreq->i_size) + wreq->i_size = i_size; + fgroup = netfs_folio_group(folio); finfo = netfs_folio_info(folio); if (finfo) { @@ -342,14 +351,14 @@ static int netfs_write_folio(struct netfs_io_request *wreq, if (wreq->origin == NETFS_WRITETHROUGH) { to_eof = false; - if (flen > wreq->i_size - fpos) - flen = wreq->i_size - fpos; - } else if (flen > wreq->i_size - fpos) { - flen = wreq->i_size - fpos; + if (flen > i_size - fpos) + flen = i_size - fpos; + } else if (flen > i_size - fpos) { + flen = i_size - fpos; if (!streamw) folio_zero_segment(folio, flen, fsize); to_eof = true; - } else if (flen == wreq->i_size - fpos) { + } else if (flen == i_size - fpos) { to_eof = true; } flen -= foff; From jaltman at auristor.com Thu May 2 06:16:38 2024 From: jaltman at auristor.com (Jeffrey E Altman) Date: Thu, 2 May 2024 09:16:38 -0400 Subject: Regarding CVE-2024-26736: afs: Increase buffer size in afs_update_volume_status() Message-ID: <10629460-7a0c-4489-8850-664beae51f05@auristor.com> 
https://lore.kernel.org/linux-cve-announce/2024040359-CVE-2024-26736-284d at gregkh/T/#u On 2024-04-03 CVE-2024-26736 was announced in response to the merging of commit 6ea38e2aeb72349cad50e38899b0ba6fbcb2af3d Author: Daniil Dulov Date: Mon Feb 19 14:39:03 2024 +0000 afs: Increase buffer size in afs_update_volume_status() The max length of volume->vid value is 20 characters. So increase idbuf[] size up to 24 to avoid overflow. Found by Linux Verification Center (linuxtesting.org) with SVACE. [DH: Actually, it's 20 + NUL, so increase it to 24 and use snprintf()] Fixes: d2ddc776a458 ("afs: Overhaul volume and server record caching and fileserver rotation") Signed-off-by: Daniil Dulov Signed-off-by: David Howells Link: https://lore.kernel.org/r/20240211150442.3416-1-d.dulov at aladdin.ru/ # v1 Link: https://lore.kernel.org/r/20240212083347.10742-1-d.dulov at aladdin.ru/ # v2 Link: https://lore.kernel.org/r/20240219143906.138346-3-dhowells at redhat.com Signed-off-by: Christian Brauner After a careful examination of the change and the code history I believe the referenced "Fixes" commit is incorrect. It should be commit 3b6492df4153b8550d347dfc581856138678a231 Author: David Howells Date: Sat Oct 20 00:57:57 2018 +0100 afs: Increase to 64-bit volume ID and 96-bit vnode ID for YFS Increase the sizes of the volume ID to 64 bits and the vnode ID (inode number equivalent) to 96 bits to allow the support of YFS. This requires the iget comparator to check the vnode->fid rather than i_ino and i_generation as i_ino is not sufficiently capacious. It also requires this data to be placed into the vnode cache key for fscache. For the moment, just discard the top 32 bits of the vnode ID when returning it though stat. Signed-off-by: David Howells which was initially merged as part of v4.20-rc1 and not 4.15 as indicated by CVE-2024-26736.
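As a rough check of the sizes mentioned in the quoted commit message ("20 + NUL"), here is a minimal userspace sketch. It is not kernel code, and decimal_len() is a hypothetical helper introduced only for this illustration. It uses snprintf() with a zero-sized buffer to measure how many bytes the decimal form of a value needs, terminator included.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of fs/afs): number of bytes needed to
 * store v as a decimal C string, including the trailing NUL.
 */
static int decimal_len(uint64_t v)
{
	/* snprintf(NULL, 0, ...) returns the length without the NUL. */
	return snprintf(NULL, 0, "%" PRIu64, v) + 1;
}
```

With this helper, decimal_len(UINT64_MAX) comes to 21 (20 digits plus the NUL) and decimal_len(UINT32_MAX) comes to 11.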
commit 3b6492df4153b8550d347dfc581856138678a231 increased the size of typedef afs_volid_t from "unsigned int" to "u64" without increasing the size of idbuf[] within afs_update_volume_status(). However, since the introduction of 3b6492df4153b8550d347dfc581856138678a231 there has yet to be any implementation of either RPC YFSVL_GetEntryByName64 or YFSVL_GetEntryByID64 which would permit a volume id larger than 32-bits to be stored into the struct afs_volume.vid afs_volid_t typed field. fs/afs uses the VL_GetEntryByNameU and VL_GetEntryByIDU RPC variants which only support 32-bit volume ids. Therefore, I do not believe that the code present in afs_update_volume_status() char idbuf[16]; idsz = sprintf(idbuf, "%llu", volume->vid); could in practice result in a buffer overflow as indicated by CVE-2024-26736 since the C-string generated by all of the possible volume id values would fit within 16 bytes. Sincerely, Jeffrey Altman From jan.henrik.sylvester at uni-hamburg.de Fri May 3 03:20:20 2024 From: jan.henrik.sylvester at uni-hamburg.de (Jan Henrik Sylvester) Date: Fri, 3 May 2024 12:20:20 +0200 Subject: Recursive mounts with kAFS, but not with OpenAFS Message-ID: Hello, I hope this is the correct forum to ask and I do not have to be subscribed to the list. kAFS and OpenAFS behave differently if there are recursive mounts. Should one of the behaviors be considered a bug? For each users' home directory, we (uni-hamburg.de) create a mount point of its respective backup volume AFSBAK directly below the home directory itself. That way ~/AFSBAK is a snapshot from the backup of last night for ~ and the users can find lost files themselves. That was never a problem with OpenAFS, because ~/AFSBAK/AFSBAK is invalid (No such device).
Now I am evaluating kAFS for our clients (long term plan, now triggered by the fact that there is no OpenAFS for Ubuntu 24.04, yet) and a simple execution of find on a home directory had a machine hang by the oom-killer killing vital system processes, because kAFS can go more than a 1000 levels recursively into ~/AFSBAK/AFSBAK/AFSBAK... The volume user is mounted to /afs/math.uni-hamburg.de/users/area/user and user.backup is mounted to /afs/math.uni-hamburg.de/users/area/user/AFSBAK. Should /afs/math.uni-hamburg.de/users/area/user/AFSBAK/AFSBAK automatically be a mount point? Thanks, Jan Henrik From dhowells at redhat.com Fri May 3 07:22:32 2024 From: dhowells at redhat.com (David Howells) Date: Fri, 03 May 2024 15:22:32 +0100 Subject: [PATCH v2] afs: Fix fileserver rotation getting stuck Message-ID: <998836.1714746152@warthog.procyon.org.uk> Hi Christian, Could you pick this up, please? David --- afs: Fix fileserver rotation getting stuck Fix the fileserver rotation code in a couple of ways: (1) op->server_states is an array, not a pointer to a single record, so fix the places that access it to index it. (2) In the places that go through an address list to work out which one has the best priority, fix the loops to skip known failed addresses. Without this, the rotation algorithm may get stuck on addresses that are inaccessible or don't respond. This can be triggered manually by finding a server that advertises a non-routable address and giving it a higher priority, eg.: echo "add udp 192.168.0.0/16 3000" >/proc/fs/afs/addr_prefs if the server, say, includes the address 192.168.7.7 in its address list, and then attempting to access a volume on that server. Fixes: 495f2ae9e355 ("afs: Fix fileserver rotation") Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs at lists.infradead.org Link: https://lore.kernel.org/r/4005300.1712309731 at warthog.procyon.org.uk/ # v1 --- Changes ======= ver #2) - Use the untried address set precomputed in the 'set' variable. 
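The second fix described above (skipping addresses already known to have failed while looking for the best priority) can be sketched in plain C outside the kernel. This is only an illustration under stated assumptions: struct addr and pick_best_addr() are hypothetical names, and failed_set is assumed to be a bitmask with one bit per address index. None of this is the actual afs data layout.

```c
#include <stddef.h>

/* Hypothetical stand-in for an address list entry; only prio matters here. */
struct addr {
	int prio;
};

/*
 * Return the index of the highest-priority address whose bit is clear in
 * failed_set, or -1 if every address has already failed.
 * Assumes nr does not exceed the number of bits in unsigned long.
 */
static int pick_best_addr(const struct addr *addrs, size_t nr,
			  unsigned long failed_set)
{
	int best = -1, best_prio = -1;

	for (size_t i = 0; i < nr; i++) {
		if (failed_set & (1UL << i))
			continue;	/* known bad: do not pick it again */
		if (addrs[i].prio > best_prio) {
			best = (int)i;
			best_prio = addrs[i].prio;
		}
	}
	return best;
}
```

Without the failed_set test, an unreachable address that carries the highest priority would win every pass of the loop, which matches the stuck rotation the commit message describes.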
fs/afs/rotate.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/fs/afs/rotate.c b/fs/afs/rotate.c index ed04bd1eeae8..ed09d4d4c211 100644 --- a/fs/afs/rotate.c +++ b/fs/afs/rotate.c @@ -541,11 +541,13 @@ bool afs_select_fileserver(struct afs_operation *op) test_bit(AFS_SE_EXCLUDED, &se->flags) || !test_bit(AFS_SERVER_FL_RESPONDING, &s->flags)) continue; - es = op->server_states->endpoint_state; + es = op->server_states[i].endpoint_state; sal = es->addresses; afs_get_address_preferences_rcu(op->net, sal); for (j = 0; j < sal->nr_addrs; j++) { + if (es->failed_set & (1 << j)) + continue; if (!sal->addrs[j].peer) continue; if (sal->addrs[j].prio > best_prio) { @@ -605,6 +607,8 @@ bool afs_select_fileserver(struct afs_operation *op) best_prio = -1; addr_index = 0; for (i = 0; i < alist->nr_addrs; i++) { + if (!(set & (1 << i))) + continue; if (alist->addrs[i].prio > best_prio) { addr_index = i; best_prio = alist->addrs[i].prio; @@ -674,7 +678,7 @@ bool afs_select_fileserver(struct afs_operation *op) for (i = 0; i < op->server_list->nr_servers; i++) { struct afs_endpoint_state *estate; - estate = op->server_states->endpoint_state; + estate = op->server_states[i].endpoint_state; error = READ_ONCE(estate->error); if (error < 0) afs_op_accumulate_error(op, error, estate->abort_code); From dhowells at redhat.com Fri May 3 08:07:38 2024 From: dhowells at redhat.com (David Howells) Date: Fri, 3 May 2024 16:07:38 +0100 Subject: [PATCH net 0/5] rxrpc: Miscellaneous fixes Message-ID: <20240503150749.1001323-1-dhowells@redhat.com> Here some miscellaneous fixes for AF_RXRPC: (1) Fix the congestion control algorithm to start cwnd at 4 and to not cut ssthresh when the peer cuts its rwind size. (2) Only transmit a single ACK for all the DATA packets glued together into a jumbo packet to reduce the number of ACKs being generated. (3) Clean up the generation of flags in the protocol header when creating a packet for transmission. 
This means we don't carry the old REQUEST-ACK bit around from previous transmissions, will make it easier to fix the MORE-PACKETS flag and make it easier to do jumbo packet assembly in future. (4) Fix how the MORE-PACKETS flag is driven. We shouldn't be setting it in sendmsg() as the packet is then queued and the bit is left in that state, no matter how long it takes us to transmit the packet - and will still be in that state if the packet is retransmitted. (5) Request an ACK on an impending transmission stall due to the app layer not feeding us new data fast enough. If we don't request an ACK, we may have to hold on to the packet buffers for a significant amount of time until the receiver gets bored and sends us an ACK anyway. David --- The patches can be found here also: http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=rxrpc-fixes David Howells (5): rxrpc: Fix congestion control algorithm rxrpc: Only transmit one ACK per jumbo packet received rxrpc: Clean up Tx header flags generation handling rxrpc: Change how the MORE-PACKETS rxrpc wire header flag is driven rxrpc: Request an ACK on impending Tx stall include/trace/events/rxrpc.h | 2 +- net/rxrpc/ar-internal.h | 2 +- net/rxrpc/call_object.c | 7 +----- net/rxrpc/input.c | 49 +++++++++++++++++++++++++----------- net/rxrpc/output.c | 26 ++++++++++++++----- net/rxrpc/proc.c | 6 ++--- net/rxrpc/sendmsg.c | 3 --- 7 files changed, 61 insertions(+), 34 deletions(-) From dhowells at redhat.com Fri May 3 08:07:40 2024 From: dhowells at redhat.com (David Howells) Date: Fri, 3 May 2024 16:07:40 +0100 Subject: [PATCH net 2/5] rxrpc: Only transmit one ACK per jumbo packet received In-Reply-To: <20240503150749.1001323-1-dhowells@redhat.com> References: <20240503150749.1001323-1-dhowells@redhat.com> Message-ID: <20240503150749.1001323-3-dhowells@redhat.com> Only generate one ACK packet for all the subpackets in a jumbo packet. 
If we would like to generate more than one ACK, we prioritise them base on their reason code, in the order, highest first: OutOfSeq > NoSpace > ExceedsWin > Duplicate > Requested > Delay > Idle For the first four, we reference the lowest offending subpacket; for the last three, the highest. This reduces the number of ACKs we end up transmitting to one per UDP packet transmitted to reduce network loading and packet parsing. Fixes: 5d7edbc9231e ("rxrpc: Get rid of the Rx ring") Signed-off-by: David Howells cc: Marc Dionne cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: linux-afs at lists.infradead.org cc: netdev at vger.kernel.org --- net/rxrpc/input.c | 46 +++++++++++++++++++++++++++++++++++----------- 1 file changed, 35 insertions(+), 11 deletions(-) diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c index 860075f1290b..16d49a861dbb 100644 --- a/net/rxrpc/input.c +++ b/net/rxrpc/input.c @@ -9,6 +9,17 @@ #include "ar-internal.h" +/* Override priority when generating ACKs for received DATA */ +static const u8 rxrpc_ack_priority[RXRPC_ACK__INVALID] = { + [RXRPC_ACK_IDLE] = 1, + [RXRPC_ACK_DELAY] = 2, + [RXRPC_ACK_REQUESTED] = 3, + [RXRPC_ACK_DUPLICATE] = 4, + [RXRPC_ACK_EXCEEDS_WINDOW] = 5, + [RXRPC_ACK_NOSPACE] = 6, + [RXRPC_ACK_OUT_OF_SEQUENCE] = 7, +}; + static void rxrpc_proto_abort(struct rxrpc_call *call, rxrpc_seq_t seq, enum rxrpc_abort_reason why) { @@ -365,7 +376,7 @@ static void rxrpc_input_queue_data(struct rxrpc_call *call, struct sk_buff *skb, * Process a DATA packet. 
*/ static void rxrpc_input_data_one(struct rxrpc_call *call, struct sk_buff *skb, - bool *_notify) + bool *_notify, rxrpc_serial_t *_ack_serial, int *_ack_reason) { struct rxrpc_skb_priv *sp = rxrpc_skb(skb); struct sk_buff *oos; @@ -418,8 +429,6 @@ static void rxrpc_input_data_one(struct rxrpc_call *call, struct sk_buff *skb, /* Send an immediate ACK if we fill in a hole */ else if (!skb_queue_empty(&call->rx_oos_queue)) ack_reason = RXRPC_ACK_DELAY; - else - call->ackr_nr_unacked++; window++; if (after(window, wtop)) { @@ -497,12 +506,16 @@ static void rxrpc_input_data_one(struct rxrpc_call *call, struct sk_buff *skb, } send_ack: - if (ack_reason >= 0) - rxrpc_send_ACK(call, ack_reason, serial, - rxrpc_propose_ack_input_data); - else - rxrpc_propose_delay_ACK(call, serial, - rxrpc_propose_ack_input_data); + if (ack_reason >= 0) { + if (rxrpc_ack_priority[ack_reason] > rxrpc_ack_priority[*_ack_reason]) { + *_ack_serial = serial; + *_ack_reason = ack_reason; + } else if (rxrpc_ack_priority[ack_reason] == rxrpc_ack_priority[*_ack_reason] && + ack_reason == RXRPC_ACK_REQUESTED) { + *_ack_serial = serial; + *_ack_reason = ack_reason; + } + } } /* @@ -513,9 +526,11 @@ static bool rxrpc_input_split_jumbo(struct rxrpc_call *call, struct sk_buff *skb struct rxrpc_jumbo_header jhdr; struct rxrpc_skb_priv *sp = rxrpc_skb(skb), *jsp; struct sk_buff *jskb; + rxrpc_serial_t ack_serial = 0; unsigned int offset = sizeof(struct rxrpc_wire_header); unsigned int len = skb->len - offset; bool notify = false; + int ack_reason = 0; while (sp->hdr.flags & RXRPC_JUMBO_PACKET) { if (len < RXRPC_JUMBO_SUBPKTLEN) @@ -535,7 +550,7 @@ static bool rxrpc_input_split_jumbo(struct rxrpc_call *call, struct sk_buff *skb jsp = rxrpc_skb(jskb); jsp->offset = offset; jsp->len = RXRPC_JUMBO_DATALEN; - rxrpc_input_data_one(call, jskb, &notify); + rxrpc_input_data_one(call, jskb, &notify, &ack_serial, &ack_reason); rxrpc_free_skb(jskb, rxrpc_skb_put_jumbo_subpacket); sp->hdr.flags = jhdr.flags; @@ -548,7
+563,16 @@ static bool rxrpc_input_split_jumbo(struct rxrpc_call *call, struct sk_buff *skb sp->offset = offset; sp->len = len; - rxrpc_input_data_one(call, skb, &notify); + rxrpc_input_data_one(call, skb, &notify, &ack_serial, &ack_reason); + + if (ack_reason > 0) { + rxrpc_send_ACK(call, ack_reason, ack_serial, + rxrpc_propose_ack_input_data); + } else { + call->ackr_nr_unacked++; + rxrpc_propose_delay_ACK(call, sp->hdr.serial, + rxrpc_propose_ack_input_data); + } if (notify) { trace_rxrpc_notify_socket(call->debug_id, sp->hdr.serial); rxrpc_notify_socket(call); From dhowells at redhat.com Fri May 3 08:07:41 2024 From: dhowells at redhat.com (David Howells) Date: Fri, 3 May 2024 16:07:41 +0100 Subject: [PATCH net 3/5] rxrpc: Clean up Tx header flags generation handling In-Reply-To: <20240503150749.1001323-1-dhowells@redhat.com> References: <20240503150749.1001323-1-dhowells@redhat.com> Message-ID: <20240503150749.1001323-4-dhowells@redhat.com> Clean up the generation of the header flags when building packet headers for transmission: (1) Assemble the flags in a local variable rather than in the txb->flags. (2) Do the flags masking and JUMBO-PACKET setting in one bit of code for both the main header and the jumbo headers. (3) Generate the REQUEST-ACK flag afresh each time. There's a possibility we might want to do jumbo retransmission packets in future. (4) Pass the local flags variable to the rxrpc_tx_data tracepoint rather than the combination of the txb flags and the wire header flags (the latter belong only to the first subpacket). This makes it easier to clean up setting the MORE-PACKETS flag Fixes: 44125d5aadda ("rxrpc: Split up the DATA packet transmission function") Signed-off-by: David Howells cc: Marc Dionne cc: "David S.
Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: linux-afs at lists.infradead.org cc: netdev at vger.kernel.org --- include/trace/events/rxrpc.h | 1 - net/rxrpc/ar-internal.h | 2 +- net/rxrpc/output.c | 18 ++++++++++++------ net/rxrpc/proc.c | 3 +-- 4 files changed, 14 insertions(+), 10 deletions(-) diff --git a/include/trace/events/rxrpc.h b/include/trace/events/rxrpc.h index a1b126a6b0d7..7b6c1db53401 100644 --- a/include/trace/events/rxrpc.h +++ b/include/trace/events/rxrpc.h @@ -449,7 +449,6 @@ #define rxrpc_req_ack_traces \ EM(rxrpc_reqack_ack_lost, "ACK-LOST ") \ - EM(rxrpc_reqack_already_on, "ALREADY-ON") \ EM(rxrpc_reqack_more_rtt, "MORE-RTT ") \ EM(rxrpc_reqack_no_srv_last, "NO-SRVLAST") \ EM(rxrpc_reqack_old_rtt, "OLD-RTT ") \ diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h index 08de24658f4f..c11a6043c8f2 100644 --- a/net/rxrpc/ar-internal.h +++ b/net/rxrpc/ar-internal.h @@ -110,7 +110,7 @@ struct rxrpc_net { atomic_t stat_tx_acks[256]; atomic_t stat_rx_acks[256]; - atomic_t stat_why_req_ack[8]; + atomic_t stat_why_req_ack[7]; atomic_t stat_io_loop; }; diff --git a/net/rxrpc/output.c b/net/rxrpc/output.c index 5ea9601efd05..bf2d0f847cdb 100644 --- a/net/rxrpc/output.c +++ b/net/rxrpc/output.c @@ -330,6 +330,8 @@ static void rxrpc_prepare_data_subpacket(struct rxrpc_call *call, struct rxrpc_t struct rxrpc_wire_header *whdr = txb->kvec[0].iov_base; enum rxrpc_req_ack_trace why; struct rxrpc_connection *conn = call->conn; + bool last; + u8 flags; _enter("%x,{%d}", txb->seq, txb->len); @@ -339,6 +341,10 @@ static void rxrpc_prepare_data_subpacket(struct rxrpc_call *call, struct rxrpc_t txb->seq == 1) whdr->userStatus = RXRPC_USERSTATUS_SERVICE_UPGRADE; + txb->flags &= ~RXRPC_REQUEST_ACK; + flags = txb->flags & RXRPC_TXBUF_WIRE_FLAGS; + last = txb->flags & RXRPC_LAST_PACKET; + /* If our RTT cache needs working on, request an ACK. Also request * ACKs if a DATA packet appears to have been lost. 
* @@ -346,9 +352,7 @@ static void rxrpc_prepare_data_subpacket(struct rxrpc_call *call, struct rxrpc_t * service call, lest OpenAFS incorrectly send us an ACK with some * soft-ACKs in it and then never follow up with a proper hard ACK. */ - if (txb->flags & RXRPC_REQUEST_ACK) - why = rxrpc_reqack_already_on; - else if ((txb->flags & RXRPC_LAST_PACKET) && rxrpc_sending_to_client(txb)) + if (last && rxrpc_sending_to_client(txb)) why = rxrpc_reqack_no_srv_last; else if (test_and_clear_bit(RXRPC_CALL_EV_ACK_LOST, &call->events)) why = rxrpc_reqack_ack_lost; @@ -367,15 +371,17 @@ static void rxrpc_prepare_data_subpacket(struct rxrpc_call *call, struct rxrpc_t rxrpc_inc_stat(call->rxnet, stat_why_req_ack[why]); trace_rxrpc_req_ack(call->debug_id, txb->seq, why); - if (why != rxrpc_reqack_no_srv_last) + if (why != rxrpc_reqack_no_srv_last) { txb->flags |= RXRPC_REQUEST_ACK; + flags |= RXRPC_REQUEST_ACK; + } dont_set_request_ack: - whdr->flags = txb->flags & RXRPC_TXBUF_WIRE_FLAGS; + whdr->flags = flags; whdr->serial = htonl(txb->serial); whdr->cksum = txb->cksum; - trace_rxrpc_tx_data(call, txb->seq, txb->serial, txb->flags, false); + trace_rxrpc_tx_data(call, txb->seq, txb->serial, flags, false); } /* diff --git a/net/rxrpc/proc.c b/net/rxrpc/proc.c index 263a2251e3d2..3b7e34dd4385 100644 --- a/net/rxrpc/proc.c +++ b/net/rxrpc/proc.c @@ -519,9 +519,8 @@ int rxrpc_stats_show(struct seq_file *seq, void *v) atomic_read(&rxnet->stat_rx_acks[RXRPC_ACK_DELAY]), atomic_read(&rxnet->stat_rx_acks[RXRPC_ACK_IDLE])); seq_printf(seq, - "Why-Req-A: acklost=%u already=%u mrtt=%u ortt=%u\n", + "Why-Req-A: acklost=%u mrtt=%u ortt=%u\n", atomic_read(&rxnet->stat_why_req_ack[rxrpc_reqack_ack_lost]), - atomic_read(&rxnet->stat_why_req_ack[rxrpc_reqack_already_on]), atomic_read(&rxnet->stat_why_req_ack[rxrpc_reqack_more_rtt]), atomic_read(&rxnet->stat_why_req_ack[rxrpc_reqack_old_rtt])); seq_printf(seq, From dhowells at redhat.com Fri May 3 08:07:43 2024 From: dhowells at redhat.com (David 
Howells) Date: Fri, 3 May 2024 16:07:43 +0100 Subject: [PATCH net 5/5] rxrpc: Request an ACK on impending Tx stall In-Reply-To: <20240503150749.1001323-1-dhowells@redhat.com> References: <20240503150749.1001323-1-dhowells@redhat.com> Message-ID: <20240503150749.1001323-6-dhowells@redhat.com> Set the REQUEST-ACK flag on the DATA packet we're about to send if we're about to stall transmission because the app layer isn't keeping up supplying us with data to transmit. Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Signed-off-by: David Howells cc: Marc Dionne cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: linux-afs at lists.infradead.org cc: netdev at vger.kernel.org --- include/trace/events/rxrpc.h | 1 + net/rxrpc/ar-internal.h | 2 +- net/rxrpc/output.c | 2 ++ net/rxrpc/proc.c | 5 +++-- 4 files changed, 7 insertions(+), 3 deletions(-) diff --git a/include/trace/events/rxrpc.h b/include/trace/events/rxrpc.h index 7b6c1db53401..dca9f4759dcb 100644 --- a/include/trace/events/rxrpc.h +++ b/include/trace/events/rxrpc.h @@ -449,6 +449,7 @@ #define rxrpc_req_ack_traces \ EM(rxrpc_reqack_ack_lost, "ACK-LOST ") \ + EM(rxrpc_reqack_app_stall, "APP-STALL ") \ EM(rxrpc_reqack_more_rtt, "MORE-RTT ") \ EM(rxrpc_reqack_no_srv_last, "NO-SRVLAST") \ EM(rxrpc_reqack_old_rtt, "OLD-RTT ") \ diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h index c11a6043c8f2..08de24658f4f 100644 --- a/net/rxrpc/ar-internal.h +++ b/net/rxrpc/ar-internal.h @@ -110,7 +110,7 @@ struct rxrpc_net { atomic_t stat_tx_acks[256]; atomic_t stat_rx_acks[256]; - atomic_t stat_why_req_ack[7]; + atomic_t stat_why_req_ack[8]; atomic_t stat_io_loop; }; diff --git a/net/rxrpc/output.c b/net/rxrpc/output.c index 4ebd0bd40a02..32626ff377e1 100644 --- a/net/rxrpc/output.c +++ b/net/rxrpc/output.c @@ -372,6 +372,8 @@ static void rxrpc_prepare_data_subpacket(struct rxrpc_call *call, struct rxrpc_t why = rxrpc_reqack_more_rtt; else 
if (ktime_before(ktime_add_ms(call->peer->rtt_last_req, 1000), ktime_get_real())) why = rxrpc_reqack_old_rtt; + else if (!more && !last) + why = rxrpc_reqack_app_stall; else goto dont_set_request_ack; diff --git a/net/rxrpc/proc.c b/net/rxrpc/proc.c index 3b7e34dd4385..1bab7f5a7d0f 100644 --- a/net/rxrpc/proc.c +++ b/net/rxrpc/proc.c @@ -519,10 +519,11 @@ int rxrpc_stats_show(struct seq_file *seq, void *v) atomic_read(&rxnet->stat_rx_acks[RXRPC_ACK_DELAY]), atomic_read(&rxnet->stat_rx_acks[RXRPC_ACK_IDLE])); seq_printf(seq, - "Why-Req-A: acklost=%u mrtt=%u ortt=%u\n", + "Why-Req-A: acklost=%u mrtt=%u ortt=%u stall=%u\n", atomic_read(&rxnet->stat_why_req_ack[rxrpc_reqack_ack_lost]), atomic_read(&rxnet->stat_why_req_ack[rxrpc_reqack_more_rtt]), - atomic_read(&rxnet->stat_why_req_ack[rxrpc_reqack_old_rtt])); + atomic_read(&rxnet->stat_why_req_ack[rxrpc_reqack_old_rtt]), + atomic_read(&rxnet->stat_why_req_ack[rxrpc_reqack_app_stall])); seq_printf(seq, "Why-Req-A: nolast=%u retx=%u slows=%u smtxw=%u\n", atomic_read(&rxnet->stat_why_req_ack[rxrpc_reqack_no_srv_last]), From dhowells at redhat.com Fri May 3 08:07:42 2024 From: dhowells at redhat.com (David Howells) Date: Fri, 3 May 2024 16:07:42 +0100 Subject: [PATCH net 4/5] rxrpc: Change how the MORE-PACKETS rxrpc wire header flag is driven In-Reply-To: <20240503150749.1001323-1-dhowells@redhat.com> References: <20240503150749.1001323-1-dhowells@redhat.com> Message-ID: <20240503150749.1001323-5-dhowells@redhat.com> Currently, the MORE-PACKETS rxrpc header flag is set by sendmsg trying to guess how it should be set by looking to see if there's space in the Tx window and setting it if there is - long before the packet gets transmitted (and it gets left in this state). As a consequence, it's not very meaningful. Change this such that it is turned on at the point of transmission if we have more packets after it in the send buffers and it is left clear if we don't yet. 
Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") cc: David Howells cc: Marc Dionne cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: linux-afs at lists.infradead.org cc: netdev at vger.kernel.org --- net/rxrpc/output.c | 8 +++++++- net/rxrpc/sendmsg.c | 3 --- 2 files changed, 7 insertions(+), 4 deletions(-) diff --git a/net/rxrpc/output.c b/net/rxrpc/output.c index bf2d0f847cdb..4ebd0bd40a02 100644 --- a/net/rxrpc/output.c +++ b/net/rxrpc/output.c @@ -330,7 +330,7 @@ static void rxrpc_prepare_data_subpacket(struct rxrpc_call *call, struct rxrpc_t struct rxrpc_wire_header *whdr = txb->kvec[0].iov_base; enum rxrpc_req_ack_trace why; struct rxrpc_connection *conn = call->conn; - bool last; + bool more, last; u8 flags; _enter("%x,{%d}", txb->seq, txb->len); @@ -345,6 +345,12 @@ static void rxrpc_prepare_data_subpacket(struct rxrpc_call *call, struct rxrpc_t flags = txb->flags & RXRPC_TXBUF_WIRE_FLAGS; last = txb->flags & RXRPC_LAST_PACKET; + more = (!last && + (!list_is_last(&txb->call_link, &call->tx_buffer) || + !list_empty(&call->tx_sendmsg))); + if (more) + flags |= RXRPC_MORE_PACKETS; + /* If our RTT cache needs working on, request an ACK. Also request * ACKs if a DATA packet appears to have been lost. 
* diff --git a/net/rxrpc/sendmsg.c b/net/rxrpc/sendmsg.c index 894b8fa68e5e..eaf4441a340b 100644 --- a/net/rxrpc/sendmsg.c +++ b/net/rxrpc/sendmsg.c @@ -384,9 +384,6 @@ static int rxrpc_send_data(struct rxrpc_sock *rx, (msg_data_left(msg) == 0 && !more)) { if (msg_data_left(msg) == 0 && !more) txb->flags |= RXRPC_LAST_PACKET; - else if (call->tx_top - call->acks_hard_ack < - call->tx_winsize) - txb->flags |= RXRPC_MORE_PACKETS; ret = call->security->secure_packet(call, txb); if (ret < 0) From patchwork-bot+netdevbpf at kernel.org Wed May 1 15:00:30 2024 From: patchwork-bot+netdevbpf at kernel.org (patchwork-bot+netdevbpf at kernel.org) Date: Wed, 01 May 2024 22:00:30 +0000 Subject: [PATCH net v2] rxrpc: Clients must accept conn from any address In-Reply-To: <20240419163057.4141728-1-marc.dionne@auristor.com> References: <20240419163057.4141728-1-marc.dionne@auristor.com> Message-ID: <171460083096.4291.841928169259901736.git-patchwork-notify@kernel.org> Hello: This patch was applied to netdev/net.git (main) by Jakub Kicinski : On Fri, 19 Apr 2024 13:30:57 -0300 you wrote: > From: Jeffrey Altman > > The find connection logic of Transarc's Rx was modified in the mid-1990s > to support multi-homed servers which might send a response packet from > an address other than the destination address in the received packet. > The rules for accepting a packet by an Rx initiator (RX_CLIENT_CONNECTION) > were altered to permit acceptance of a packet from any address provided > that the port number was unchanged and all of the connection identifiers > matched (Epoch, CID, SecurityClass, ...). > > [...] Here is the summary with links: - [net,v2] rxrpc: Clients must accept conn from any address https://git.kernel.org/netdev/net/c/8953285d7bd6 You are awesome, thank you! -- Deet-doot-dot, I am a bot. 
https://korg.docs.kernel.org/patchwork/pwbot.html From ryncsn at gmail.com Thu May 2 01:46:03 2024 From: ryncsn at gmail.com (Kairui Song) Date: Thu, 2 May 2024 16:46:03 +0800 Subject: [PATCH v4 06/12] afs: drop usage of folio_file_pos In-Reply-To: <20240502084609.28376-1-ryncsn@gmail.com> References: <20240502084609.28376-1-ryncsn@gmail.com> Message-ID: <20240502084609.28376-7-ryncsn@gmail.com> From: Kairui Song folio_file_pos is only needed for mixed usage of page cache and swap cache, for pure page cache usage, the caller can just use folio_pos instead. It can't be a swap cache page here. Swap mapping may only call into fs through swap_rw and that is not supported for afs. So just drop it and use folio_pos instead. Signed-off-by: Kairui Song Cc: David Howells Cc: Marc Dionne Cc: linux-afs at lists.infradead.org --- fs/afs/dir.c | 6 +++--- fs/afs/dir_edit.c | 4 ++-- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/fs/afs/dir.c b/fs/afs/dir.c index 67afe68972d5..f8622ed72e08 100644 --- a/fs/afs/dir.c +++ b/fs/afs/dir.c @@ -533,14 +533,14 @@ static int afs_dir_iterate(struct inode *dir, struct dir_context *ctx, break; } - offset = round_down(ctx->pos, sizeof(*dblock)) - folio_file_pos(folio); + offset = round_down(ctx->pos, sizeof(*dblock)) - folio_pos(folio); size = min_t(loff_t, folio_size(folio), - req->actual_len - folio_file_pos(folio)); + req->actual_len - folio_pos(folio)); do { dblock = kmap_local_folio(folio, offset); ret = afs_dir_iterate_block(dvnode, ctx, dblock, - folio_file_pos(folio) + offset); + folio_pos(folio) + offset); kunmap_local(dblock); if (ret != 1) goto out; diff --git a/fs/afs/dir_edit.c b/fs/afs/dir_edit.c index e2fa577b66fe..a71bff10496b 100644 --- a/fs/afs/dir_edit.c +++ b/fs/afs/dir_edit.c @@ -256,7 +256,7 @@ void afs_edit_dir_add(struct afs_vnode *vnode, folio = folio0; } - block = kmap_local_folio(folio, b * AFS_DIR_BLOCK_SIZE - folio_file_pos(folio)); + block = kmap_local_folio(folio, b * AFS_DIR_BLOCK_SIZE - 
folio_pos(folio)); /* Abandon the edit if we got a callback break. */ if (!test_bit(AFS_VNODE_DIR_VALID, &vnode->flags)) @@ -417,7 +417,7 @@ void afs_edit_dir_remove(struct afs_vnode *vnode, folio = folio0; } - block = kmap_local_folio(folio, b * AFS_DIR_BLOCK_SIZE - folio_file_pos(folio)); + block = kmap_local_folio(folio, b * AFS_DIR_BLOCK_SIZE - folio_pos(folio)); /* Abandon the edit if we got a callback break. */ if (!test_bit(AFS_VNODE_DIR_VALID, &vnode->flags)) -- 2.44.0 From allison.henderson at oracle.com Thu May 2 19:27:22 2024 From: allison.henderson at oracle.com (Allison Henderson) Date: Fri, 3 May 2024 02:27:22 +0000 Subject: [PATCH net-next v6 3/8] net: rds: Remove the now superfluous sentinel elements from ctl_table array In-Reply-To: <20240501-jag-sysctl_remset_net-v6-3-370b702b6b4a@samsung.com> References: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> <20240501-jag-sysctl_remset_net-v6-3-370b702b6b4a@samsung.com> Message-ID: <4a2154ccbf8e0e73f09e717d49864eb1003d5cfa.camel@oracle.com> On Wed, 2024-05-01 at 11:29 +0200, Joel Granados via B4 Relay wrote: > From: Joel Granados > > This commit comes at the tail end of a greater effort to remove the > empty elements at the end of the ctl_table arrays (sentinels) which > will reduce the overall build time size of the kernel and run time > memory bloat by ~64 bytes per sentinel (further information Link : > https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo at bombadil.infradead.org/ > ) > > * Remove sentinel element from ctl_table structs. > > Signed-off-by: Joel Granados These changes look fine to me. Thank you! Acked-by: Allison Henderson > --- > net/rds/ib_sysctl.c | 1 - > net/rds/sysctl.c | 1 - > net/rds/tcp.c
| 1 - > 3 files changed, 3 deletions(-) > > diff --git a/net/rds/ib_sysctl.c b/net/rds/ib_sysctl.c > index e4e41b3afce7..2af678e71e3c 100644 > --- a/net/rds/ib_sysctl.c > +++ b/net/rds/ib_sysctl.c > @@ -103,7 +103,6 @@ static struct ctl_table rds_ib_sysctl_table[] = { > .mode = 0644, > .proc_handler = proc_dointvec, > }, > - { } > }; > > void rds_ib_sysctl_exit(void) > diff --git a/net/rds/sysctl.c b/net/rds/sysctl.c > index e381bbcd9cc1..025f518a4349 100644 > --- a/net/rds/sysctl.c > +++ b/net/rds/sysctl.c > @@ -89,7 +89,6 @@ static struct ctl_table rds_sysctl_rds_table[] = { > .mode = 0644, > .proc_handler = proc_dointvec, > }, > - { } > }; > > void rds_sysctl_exit(void) > diff --git a/net/rds/tcp.c b/net/rds/tcp.c > index 2dba7505b414..d8111ac83bb6 100644 > --- a/net/rds/tcp.c > +++ b/net/rds/tcp.c > @@ -86,7 +86,6 @@ static struct ctl_table rds_tcp_sysctl_table[] = { > .proc_handler = rds_tcp_skbuf_handler, > .extra1 = &rds_tcp_min_rcvbuf, > }, > - { } > }; >
> ?u32 rds_tcp_write_seq(struct rds_tcp_connection *tc) > From j.granados at samsung.com Fri May 3 05:18:11 2024 From: j.granados at samsung.com (Joel Granados) Date: Fri, 3 May 2024 14:18:11 +0200 Subject: [PATCH net-next v6 8/8] ax.25: x.25: Remove the now superfluous sentinel elements from ctl_table array In-Reply-To: References: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> <20240501-jag-sysctl_remset_net-v6-8-370b702b6b4a@samsung.com> Message-ID: <20240503121811.fsmriwsgugzm2o7i@joelS2.panther.com> On Wed, May 01, 2024 at 03:15:54PM +0200, Sabrina Dubroca wrote: > 2024-05-01, 11:29:32 +0200, Joel Granados via B4 Relay wrote: > > From: Joel Granados > > > > This commit comes at the tail end of a greater effort to remove the > > empty elements at the end of the ctl_table arrays (sentinels) which will > > reduce the overall build time size of the kernel and run time memory > > bloat by ~64 bytes per sentinel (further information Link : > > https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo at bombadil.infradead.org/) > > > > Avoid a buffer overflow when traversing the ctl_table by ensuring that > > AX25_MAX_VALUES is the same as the size of ax25_param_table. This is > > done with a BUILD_BUG_ON where ax25_param_table is defined and a > > CONFIG_AX25_DAMA_SLAVE guard in the unnamed enum definition as well as > > in the ax25_dev_device_up and ax25_ds_set_timer functions. 
> ^^ > nit: not anymore ;) > (but not worth a repost IMO) > > > > diff --git a/net/ax25/ax25_ds_timer.c b/net/ax25/ax25_ds_timer.c > > index c4f8adbf8144..c50a58d9e368 100644 > > --- a/net/ax25/ax25_ds_timer.c > > +++ b/net/ax25/ax25_ds_timer.c > > @@ -55,6 +55,7 @@ void ax25_ds_set_timer(ax25_dev *ax25_dev) > > ax25_dev->dama.slave_timeout = > > msecs_to_jiffies(ax25_dev->values[AX25_VALUES_DS_TIMEOUT]) / 10; > > mod_timer(&ax25_dev->dama.slave_timer, jiffies + HZ); > > + return; > > nit: return not needed here since we're already at the bottom of the > function, but probably not worth a repost of the series. > Thx. I will not repost, but I have changed them locally so they are there in case a V7 is required. Best -- Joel Granados -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 659 bytes Desc: not available URL: From j.granados at samsung.com Fri May 3 05:20:45 2024 From: j.granados at samsung.com (Joel Granados) Date: Fri, 3 May 2024 14:20:45 +0200 Subject: [PATCH net-next v6 0/8] sysctl: Remove sentinel elements from networking In-Reply-To: References: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> Message-ID: <20240503122045.6uvwnijtem3sbcl5@joelS2.panther.com> On Wed, May 01, 2024 at 03:22:08PM +0200, Sabrina Dubroca wrote: > 2024-05-01, 11:29:24 +0200, Joel Granados via B4 Relay wrote: > > From: Joel Granados > > ... > > Changes in v6: > > - Rebased onto net-next/main. > > - Besides re-running my cocci scripts, I ran a new find script [9]. > > Found 0 hits in net/ > > - Moved "i" variable declaraction out of for() in sysctl_core_net_init > > - Removed forgotten sentinel in mpls_table > > - Removed CONFIG_AX25_DAMA_SLAVE guard from net/ax25/ax25_ds_timer.c. It > > is not needed because that file is compiled only when > > CONFIG_AX25_DAMA_SLAVE is set. > > - When traversing smc_table, stop on ARRAY_SIZE instead of ARRAY_SIZE-1. 
> > - Link to v5: https://lore.kernel.org/r/20240426-jag-sysctl_remset_net-v5-0-e3b12f6111a6 at samsung.com > > I pointed out a few tiny details in the ax25 patch but either way, the > series looks good to me. Thanks! > > Series: > Reviewed-by: Sabrina Dubroca Thx > > Note that you could have kept the ack/reviewed-by on patch 4 since it > was not modified. Jeff and Chuck, your reviews got lost in the repost. Indeed. I have been having issues with my b4. Have posted this https://lore.kernel.org/tools/20240503120506.p7g5nn6jocrrdlck at joelS2.panther.com/ which explains my situation. Best -- Joel Granados -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 659 bytes Desc: not available URL: From patchwork-bot+netdevbpf at kernel.org Fri May 3 05:40:31 2024 From: patchwork-bot+netdevbpf at kernel.org (patchwork-bot+netdevbpf at kernel.org) Date: Fri, 03 May 2024 12:40:31 +0000 Subject: [PATCH net-next v6 1/8] net: Remove the now superfluous sentinel elements from ctl_table array In-Reply-To: <20240501-jag-sysctl_remset_net-v6-1-370b702b6b4a@samsung.com> References: <20240501-jag-sysctl_remset_net-v6-1-370b702b6b4a@samsung.com> Message-ID: <171474003121.32261.9596059257751321282.git-patchwork-notify@kernel.org> Hello: This series was applied to netdev/net-next.git (main) by David S. Miller : On Wed, 01 May 2024 11:29:25 +0200 you wrote: > From: Joel Granados > > This commit comes at the tail end of a greater effort to remove the > empty elements at the end of the ctl_table arrays (sentinels) which > will reduce the overall build time size of the kernel and run time > memory bloat by ~64 bytes per sentinel (further information Link : > https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo at bombadil.infradead.org/) > > [...] 
Here is the summary with links: - [net-next,v6,1/8] net: Remove the now superfluous sentinel elements from ctl_table array https://git.kernel.org/netdev/net-next/c/ce218712b0f6 - [net-next,v6,2/8] net: ipv{6,4}: Remove the now superfluous sentinel elements from ctl_table array https://git.kernel.org/netdev/net-next/c/1c106eb01cee - [net-next,v6,3/8] net: rds: Remove the now superfluous sentinel elements from ctl_table array https://git.kernel.org/netdev/net-next/c/92bedf07836b - [net-next,v6,4/8] net: sunrpc: Remove the now superfluous sentinel elements from ctl_table array https://git.kernel.org/netdev/net-next/c/ca5d1fce7994 - [net-next,v6,5/8] net: Remove ctl_table sentinel elements from several networking subsystems https://git.kernel.org/netdev/net-next/c/73dbd8cf7947 - [net-next,v6,6/8] netfilter: Remove the now superfluous sentinel elements from ctl_table array https://git.kernel.org/netdev/net-next/c/635470eb0aa7 - [net-next,v6,7/8] appletalk: Remove the now superfluous sentinel elements from ctl_table array https://git.kernel.org/netdev/net-next/c/e00e35e217c0 - [net-next,v6,8/8] ax.25: x.25: Remove the now superfluous sentinel elements from ctl_table array https://git.kernel.org/netdev/net-next/c/78a7b5dbc060 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html From dhowells at redhat.com Fri May 3 08:07:39 2024 From: dhowells at redhat.com (David Howells) Date: Fri, 3 May 2024 16:07:39 +0100 Subject: [PATCH net 1/5] rxrpc: Fix congestion control algorithm In-Reply-To: <20240503150749.1001323-1-dhowells@redhat.com> References: <20240503150749.1001323-1-dhowells@redhat.com> Message-ID: <20240503150749.1001323-2-dhowells@redhat.com> Make the following fixes to the congestion control algorithm: (1) Don't vary the cwnd starting value by the size of RXRPC_TX_SMSS since that's currently held constant - set to the size of a jumbo subpacket payload so that we can create jumbo packets on the fly. 
     The current code invariably picks 3 as the starting value. Further,
     the starting cwnd needs to be an even number because we ack every
     other packet, so set it to 4.

 (2) Don't cut ssthresh when we see an ACK come from the peer with a
     receive window (rwind) less than ssthresh. ssthresh keeps track of
     characteristics of the connection whereas rwind may be reduced by the
     peer for any reason - and may be reduced to 0.

Fixes: 1fc4fa2ac93d ("rxrpc: Fix congestion management")
Fixes: 0851115090a3 ("rxrpc: Reduce ssthresh to peer's receive window")
Signed-off-by: David Howells
Suggested-by: Simon Wilkinson
cc: Marc Dionne
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: linux-afs at lists.infradead.org
cc: netdev at vger.kernel.org
---
 net/rxrpc/ar-internal.h | 2 +-
 net/rxrpc/call_object.c | 7 +------
 net/rxrpc/input.c       | 3 ---
 3 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index 08c0a32db8c7..08de24658f4f 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -697,7 +697,7 @@ struct rxrpc_call {
 	 * packets) rather than bytes.
 	 */
 #define RXRPC_TX_SMSS		RXRPC_JUMBO_DATALEN
-#define RXRPC_MIN_CWND		(RXRPC_TX_SMSS > 2190 ? 2 : RXRPC_TX_SMSS > 1095 ? 3 : 4)
+#define RXRPC_MIN_CWND		4
 	u8			cong_cwnd;	/* Congestion window size */
 	u8			cong_extra;	/* Extra to send for congestion management */
 	u8			cong_ssthresh;	/* Slow-start threshold */
diff --git a/net/rxrpc/call_object.c b/net/rxrpc/call_object.c
index 01fa71e8b1f7..f9e983a12c14 100644
--- a/net/rxrpc/call_object.c
+++ b/net/rxrpc/call_object.c
@@ -174,12 +174,7 @@ struct rxrpc_call *rxrpc_alloc_call(struct rxrpc_sock *rx, gfp_t gfp,
 	call->rx_winsize = rxrpc_rx_window_size;
 	call->tx_winsize = 16;
 
-	if (RXRPC_TX_SMSS > 2190)
-		call->cong_cwnd = 2;
-	else if (RXRPC_TX_SMSS > 1095)
-		call->cong_cwnd = 3;
-	else
-		call->cong_cwnd = 4;
+	call->cong_cwnd = RXRPC_MIN_CWND;
 	call->cong_ssthresh = RXRPC_TX_MAX_WINDOW;
 
 	call->rxnet = rxnet;
diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c
index 3dedb8c0618c..860075f1290b 100644
--- a/net/rxrpc/input.c
+++ b/net/rxrpc/input.c
@@ -685,9 +685,6 @@ static void rxrpc_input_ack_trailer(struct rxrpc_call *call, struct sk_buff *skb
 		call->tx_winsize = rwind;
 	}
 
-	if (call->cong_ssthresh > rwind)
-		call->cong_ssthresh = rwind;
-
 	mtu = min(ntohl(trailer->maxMTU), ntohl(trailer->ifMTU));
 
 	peer = call->peer;

From dan.carpenter at linaro.org Fri May 3 08:23:14 2024
From: dan.carpenter at linaro.org (Dan Carpenter)
Date: Fri, 3 May 2024 18:23:14 +0300
Subject: [PATCH net-next v6 8/8] ax.25: x.25: Remove the now superfluous sentinel elements from ctl_table array
In-Reply-To: <20240503121811.fsmriwsgugzm2o7i@joelS2.panther.com>
References: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> <20240501-jag-sysctl_remset_net-v6-8-370b702b6b4a@samsung.com> <20240503121811.fsmriwsgugzm2o7i@joelS2.panther.com>
Message-ID: <21f76a94-1b35-4cf7-914d-e341848b0b9e@moroto.mountain>

On Fri, May 03, 2024 at 02:18:11PM +0200, Joel Granados wrote:
> On Wed, May 01, 2024 at 03:15:54PM +0200, Sabrina Dubroca wrote:
> > 2024-05-01, 11:29:32 +0200, Joel Granados via B4 Relay wrote:
> > > From: Joel Granados
> > > diff --git a/net/ax25/ax25_ds_timer.c
b/net/ax25/ax25_ds_timer.c > > > index c4f8adbf8144..c50a58d9e368 100644 > > > --- a/net/ax25/ax25_ds_timer.c > > > +++ b/net/ax25/ax25_ds_timer.c > > > @@ -55,6 +55,7 @@ void ax25_ds_set_timer(ax25_dev *ax25_dev) > > > ax25_dev->dama.slave_timeout = > > > msecs_to_jiffies(ax25_dev->values[AX25_VALUES_DS_TIMEOUT]) / 10; > > > mod_timer(&ax25_dev->dama.slave_timer, jiffies + HZ); > > > + return; > > > > nit: return not needed here since we're already at the bottom of the > > function, but probably not worth a repost of the series. > > > Thx. I will not repost, but I have changed them locally so they are > there in case a V7 is required. > It's a checkpatch.pl -f warning so we probably will want to fix it eventually. regards, dan carpenter From brauner at kernel.org Sat May 4 03:22:14 2024 From: brauner at kernel.org (Christian Brauner) Date: Sat, 4 May 2024 12:22:14 +0200 Subject: [PATCH v2] afs: Fix fileserver rotation getting stuck In-Reply-To: <998836.1714746152@warthog.procyon.org.uk> References: <998836.1714746152@warthog.procyon.org.uk> Message-ID: <20240504-checken-bankdaten-9f8d7a288ba5@brauner> On Fri, 03 May 2024 15:22:32 +0100, David Howells wrote: > Could you pick this up, please? > > David > > Applied to the vfs.fixes branch of the vfs/vfs.git tree. Patches in the vfs.fixes branch should appear in linux-next soon. Please report any outstanding bugs that were missed during review in a new review to the original patch series allowing us to drop it. It's encouraged to provide Acked-bys and Reviewed-bys even though the patch has now been applied. If possible patch trailers will be updated. Note that commit hashes shown below are subject to change due to rebase, trailer updates or similar. If in doubt, please check the listed branch. 
tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.fixes

[1/1] afs: Fix fileserver rotation getting stuck
      https://git.kernel.org/vfs/vfs/c/21965b764cec

From jaltman at auristor.com Sat May 4 14:04:25 2024
From: jaltman at auristor.com (Jeffrey Altman)
Date: Sat, 4 May 2024 17:04:25 -0400
Subject: Recursive mounts with kAFS, but not with OpenAFS
In-Reply-To:
References:
Message-ID: <4B8BEE81-844D-4450-A830-E22DB8734635@auristor.com>

> On May 3, 2024, at 6:20 AM, Jan Henrik Sylvester wrote:
>
> Hello,
>

Hello Jan,

> I hope this is the correct forum to ask and I do not have to be subscribed to the list.

This is the correct mailing list to discuss net/rxrpc, fs/afs, kafs-client and related functionality.

>
> kAFS and OpenAFS behave differently if there are recursive mounts. Should one of the behaviors be considered a bug?

Kafs behaves differently from OpenAFS in many respects. One significant way that kafs differs is that each AFS or AuriStorFS volume is a distinct device. Another significant difference is that kafs does not have a path-based ioctl interface.

>
> For each users' home directory, we (uni-hamburg.de) create a mount point of its respective backup volume AFSBAK directly below the home directory itself. That way ~/AFSBAK is a snapshot from the backup of last night for ~ and the users can find lost files themselves.

This is fairly common.

>
> That was never a problem with OpenAFS, because ~/AFSBAK/AFSBAK is invalid (No such device). Now I am evaluating kAFS for our clients (long term plan, now triggered by the fact that there is no OpenAFS for Ubuntu 24.04, yet)

Ubuntu 24.04 is less than a week old, and although AuriStorFS clients are available for it, one of the significant benefits of kafs on distributions that support it is that it's built in. There is never a wait for its availability or a need to synchronize an out-of-tree kernel module with the running kernel.

> and a simple execution of find on a home directory had a machine hang by the oom-killer killing vital system processes, because kAFS can go more than a 1000 levels recursively into ~/AFSBAK/AFSBAK/AFSBAK...

Markus Suvanto reported the same problem on Feb 20th. At the time Marc Dionne developed a patch to prevent the evaluation of a mount point to a volume name with the ".backup" extension from a volume of type AFSVL_BACKVOL. Testing of the patch wasn't complete, so it did not get submitted. Hopefully Marc can submit it next week. Once merged by Linus it can be backported to the actively maintained stable kernels.

>
> The volume user is mounted to /afs/math.uni-hamburg.de/users/area/user and user.backup is mounted to /afs/math.uni-hamburg.de/users/area/user/AFSBAK. Should /afs/math.uni-hamburg.de/users/area/user/AFSBAK/AFSBAK automatically be a mount point?

It's a mount point but should not be traversed.

> Thanks,
> Jan Henrik

Jeffrey Altman

From kuba at kernel.org Tue May 7 19:44:47 2024
From: kuba at kernel.org (Jakub Kicinski)
Date: Tue, 7 May 2024 19:44:47 -0700
Subject: [PATCH net 0/5] rxrpc: Miscellaneous fixes
In-Reply-To: <20240503150749.1001323-1-dhowells@redhat.com>
References: <20240503150749.1001323-1-dhowells@redhat.com>
Message-ID: <20240507194447.20bcfb60@kernel.org>

On Fri, 3 May 2024 16:07:38 +0100 David Howells wrote:
> Here some miscellaneous fixes for AF_RXRPC:
>
>  (1) Fix the congestion control algorithm to start cwnd at 4 and to not cut
>      ssthresh when the peer cuts its rwind size.
>
>  (2) Only transmit a single ACK for all the DATA packets glued together
>      into a jumbo packet to reduce the number of ACKs being generated.
>
>  (3) Clean up the generation of flags in the protocol header when creating
>      a packet for transmission.
This means we don't carry the old > REQUEST-ACK bit around from previous transmissions, will make it > easier to fix the MORE-PACKETS flag and make it easier to do jumbo > packet assembly in future. > > (4) Fix how the MORE-PACKETS flag is driven. We shouldn't be setting it > in sendmsg() as the packet is then queued and the bit is left in that > state, no matter how long it takes us to transmit the packet - and > will still be in that state if the packet is retransmitted. > > (5) Request an ACK on an impending transmission stall due to the app layer > not feeding us new data fast enough. If we don't request an ACK, we > may have to hold on to the packet buffers for a significant amount of > time until the receiver gets bored and sends us an ACK anyway. Looks like these got marked as Rejected in patchwork. I think either because lore is confused and attaches an exchange with DaveM from 2022 to them (?) or because I mentioned to DaveM that I'm not sure these are fixes. So let me ask - on a scale of 1 to 10, how convinced are you that these should go to Linus this week rather than being categorized as general improvements and go during the merge window (without the Fixes tags)? From jaltman at auristor.com Wed May 8 00:57:43 2024 From: jaltman at auristor.com (Jeffrey Altman) Date: Wed, 8 May 2024 01:57:43 -0600 Subject: [PATCH net 0/5] rxrpc: Miscellaneous fixes In-Reply-To: <20240507194447.20bcfb60@kernel.org> References: <20240503150749.1001323-1-dhowells@redhat.com> <20240507194447.20bcfb60@kernel.org> Message-ID: <955B77FD-C0C2-479E-9D85-D2F62E3DA48C@auristor.com> > On May 7, 2024, at 8:44?PM, Jakub Kicinski wrote: > > On Fri, 3 May 2024 16:07:38 +0100 David Howells wrote: >> Here some miscellaneous fixes for AF_RXRPC: >> >> (1) Fix the congestion control algorithm to start cwnd at 4 and to not cut >> ssthresh when the peer cuts its rwind size. 
>>
>>  (2) Only transmit a single ACK for all the DATA packets glued together
>>      into a jumbo packet to reduce the number of ACKs being generated.
>>
>>  (3) Clean up the generation of flags in the protocol header when creating
>>      a packet for transmission. This means we don't carry the old
>>      REQUEST-ACK bit around from previous transmissions, will make it
>>      easier to fix the MORE-PACKETS flag and make it easier to do jumbo
>>      packet assembly in future.
>>
>>  (4) Fix how the MORE-PACKETS flag is driven. We shouldn't be setting it
>>      in sendmsg() as the packet is then queued and the bit is left in that
>>      state, no matter how long it takes us to transmit the packet - and
>>      will still be in that state if the packet is retransmitted.
>>
>>  (5) Request an ACK on an impending transmission stall due to the app layer
>>      not feeding us new data fast enough. If we don't request an ACK, we
>>      may have to hold on to the packet buffers for a significant amount of
>>      time until the receiver gets bored and sends us an ACK anyway.
>
> Looks like these got marked as Rejected in patchwork.
> I think either because lore is confused and attaches an exchange with
> DaveM from 2022 to them (?) or because I mentioned to DaveM that I'm
> not sure these are fixes. So let me ask - on a scale of 1 to 10, how
> convinced are you that these should go to Linus this week rather than
> being categorized as general improvements and go during the merge
> window (without the Fixes tags)?

Jakub,

I believe the first two patches in the series are important to backport to the stable branches.

Reviewed-by: Jeffrey Altman

Jeffrey

From kuba at kernel.org Wed May 8 06:54:36 2024
From: kuba at kernel.org (Jakub Kicinski)
Date: Wed, 8 May 2024 06:54:36 -0700
Subject: [PATCH net 0/5] rxrpc: Miscellaneous fixes
In-Reply-To: <955B77FD-C0C2-479E-9D85-D2F62E3DA48C@auristor.com>
References: <20240503150749.1001323-1-dhowells@redhat.com> <20240507194447.20bcfb60@kernel.org> <955B77FD-C0C2-479E-9D85-D2F62E3DA48C@auristor.com>
Message-ID: <20240508065436.2f0c07e9@kernel.org>

On Wed, 8 May 2024 01:57:43 -0600 Jeffrey Altman wrote:
> > Looks like these got marked as Rejected in patchwork.
> > I think either because lore is confused and attaches an exchange with
> > DaveM from 2022 to them (?) or because I mentioned to DaveM that I'm
> > not sure these are fixes. So let me ask - on a scale of 1 to 10, how
> > convinced are you that these should go to Linus this week rather than
> > being categorized as general improvements and go during the merge
> > window (without the Fixes tags)?
>
> Jakub,
>
> I believe the first two patches in the series are important to backport to the stable branches.
>
> Reviewed-by: Jeffrey Altman

Are they regressions? Seems possible from the Fixes tag but unclear from
the text of the commit messages. In any case, taking the first two may
be a reasonable compromise.

Does it sound good to you, David?

From dhowells at redhat.com Wed May 8 07:00:28 2024
From: dhowells at redhat.com (David Howells)
Date: Wed, 08 May 2024 15:00:28 +0100
Subject: [PATCH net 0/5] rxrpc: Miscellaneous fixes
In-Reply-To: <20240507194447.20bcfb60@kernel.org>
References: <20240507194447.20bcfb60@kernel.org> <20240503150749.1001323-1-dhowells@redhat.com>
Message-ID: <1478421.1715176828@warthog.procyon.org.uk>

Jakub Kicinski wrote:

> Looks like these got marked as Rejected in patchwork.
> I think either because lore is confused and attaches an exchange with
> DaveM from 2022 to them (?)
or because I mentioned to DaveM that I'm > not sure these are fixes. So let me ask - on a scale of 1 to 10, how > convinced are you that these should go to Linus this week rather than > being categorized as general improvements and go during the merge > window (without the Fixes tags)? Ah, sorry. I marked them rejected as I put myself as cc: not S-o-b on one of them, but then got distracted and didn't get around to reposting them. And Jeff mentioned that the use of the MORE-PACKETS flag is not exactly consistent between various implementations. So if you could take just the first two for the moment? Thanks, David From kuba at kernel.org Wed May 8 08:07:57 2024 From: kuba at kernel.org (Jakub Kicinski) Date: Wed, 8 May 2024 08:07:57 -0700 Subject: [PATCH net 0/5] rxrpc: Miscellaneous fixes In-Reply-To: <1478421.1715176828@warthog.procyon.org.uk> References: <20240507194447.20bcfb60@kernel.org> <20240503150749.1001323-1-dhowells@redhat.com> <1478421.1715176828@warthog.procyon.org.uk> Message-ID: <20240508080757.5022b867@kernel.org> On Wed, 08 May 2024 15:00:28 +0100 David Howells wrote: > Jakub Kicinski wrote: > > > Looks like these got marked as Rejected in patchwork. > > I think either because lore is confused and attaches an exchange with > > DaveM from 2022 to them (?) or because I mentioned to DaveM that I'm > > not sure these are fixes. So let me ask - on a scale of 1 to 10, how > > convinced are you that these should go to Linus this week rather than > > being categorized as general improvements and go during the merge > > window (without the Fixes tags)? > > Ah, sorry. I marked them rejected as I put myself as cc: not S-o-b on one of > them, but then got distracted and didn't get around to reposting them. And > Jeff mentioned that the use of the MORE-PACKETS flag is not exactly > consistent between various implementations. Ah, mystery solved :) > So if you could take just the first two for the moment? Done! 
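
For reference, the starting-cwnd change in patch 1/5 ("rxrpc: Fix congestion control algorithm") can be checked with a small user-space sketch. The selection expression is copied from the removed RXRPC_MIN_CWND macro; the value 1412 for RXRPC_JUMBO_DATALEN is an assumption that is not stated anywhere in this thread, and old_start_cwnd(), new_start_cwnd() and cwnd_is_even() are hypothetical helper names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* ASSUMPTION: size of a jumbo subpacket payload; not given in this thread. */
#define ASSUMED_JUMBO_DATALEN 1412

/* Old selection, copied from the removed RXRPC_MIN_CWND macro. */
static int old_start_cwnd(int smss)
{
	return smss > 2190 ? 2 : smss > 1095 ? 3 : 4;
}

/* New selection: the patch hard-codes RXRPC_MIN_CWND to 4. */
static int new_start_cwnd(void)
{
	return 4;
}

/* The commit message wants an even cwnd because every other packet is acked. */
static bool cwnd_is_even(int cwnd)
{
	return (cwnd % 2) == 0;
}
```

Under the assumed SMSS, old_start_cwnd(ASSUMED_JUMBO_DATALEN) falls into the middle branch and yields an odd window, which matches the commit message's remark that the old code invariably picks 3; new_start_cwnd() satisfies cwnd_is_even().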
From patchwork-bot+netdevbpf at kernel.org Wed May 8 08:10:30 2024 From: patchwork-bot+netdevbpf at kernel.org (patchwork-bot+netdevbpf at kernel.org) Date: Wed, 08 May 2024 15:10:30 +0000 Subject: [PATCH net 0/5] rxrpc: Miscellaneous fixes In-Reply-To: <20240503150749.1001323-1-dhowells@redhat.com> References: <20240503150749.1001323-1-dhowells@redhat.com> Message-ID: <171518103016.16796.12754313955798930000.git-patchwork-notify@kernel.org> Hello: This series was applied to netdev/net.git (main) by Jakub Kicinski : On Fri, 3 May 2024 16:07:38 +0100 you wrote: > Here some miscellaneous fixes for AF_RXRPC: > > (1) Fix the congestion control algorithm to start cwnd at 4 and to not cut > ssthresh when the peer cuts its rwind size. > > (2) Only transmit a single ACK for all the DATA packets glued together > into a jumbo packet to reduce the number of ACKs being generated. > > [...] Here is the summary with links: - [net,1/5] rxrpc: Fix congestion control algorithm https://git.kernel.org/netdev/net/c/ba4e103848d3 - [net,2/5] rxrpc: Only transmit one ACK per jumbo packet received https://git.kernel.org/netdev/net/c/012b7206918d - [net,3/5] rxrpc: Clean up Tx header flags generation handling (no matching commit) - [net,4/5] rxrpc: Change how the MORE-PACKETS rxrpc wire header flag is driven (no matching commit) - [net,5/5] rxrpc: Request an ACK on impending Tx stall (no matching commit) You are awesome, thank you! -- Deet-doot-dot, I am a bot. 
https://korg.docs.kernel.org/patchwork/pwbot.html From j.granados at samsung.com Tue May 7 01:02:19 2024 From: j.granados at samsung.com (Joel Granados) Date: Tue, 7 May 2024 10:02:19 +0200 Subject: [PATCH net-next v6 8/8] ax.25: x.25: Remove the now superfluous sentinel elements from ctl_table array In-Reply-To: <21f76a94-1b35-4cf7-914d-e341848b0b9e@moroto.mountain> References: <20240501-jag-sysctl_remset_net-v6-0-370b702b6b4a@samsung.com> <20240501-jag-sysctl_remset_net-v6-8-370b702b6b4a@samsung.com> <20240503121811.fsmriwsgugzm2o7i@joelS2.panther.com> <21f76a94-1b35-4cf7-914d-e341848b0b9e@moroto.mountain> Message-ID: <20240507080219.xp6m5lyx5mt655yg@joelS2.panther.com> On Fri, May 03, 2024 at 06:23:14PM +0300, Dan Carpenter wrote: > On Fri, May 03, 2024 at 02:18:11PM +0200, Joel Granados wrote: > > On Wed, May 01, 2024 at 03:15:54PM +0200, Sabrina Dubroca wrote: > > > 2024-05-01, 11:29:32 +0200, Joel Granados via B4 Relay wrote: > > > > From: Joel Granados > > > > diff --git a/net/ax25/ax25_ds_timer.c b/net/ax25/ax25_ds_timer.c > > > > index c4f8adbf8144..c50a58d9e368 100644 > > > > --- a/net/ax25/ax25_ds_timer.c > > > > +++ b/net/ax25/ax25_ds_timer.c > > > > @@ -55,6 +55,7 @@ void ax25_ds_set_timer(ax25_dev *ax25_dev) > > > > ax25_dev->dama.slave_timeout = > > > > msecs_to_jiffies(ax25_dev->values[AX25_VALUES_DS_TIMEOUT]) / 10; > > > > mod_timer(&ax25_dev->dama.slave_timer, jiffies + HZ); > > > > + return; > > > > > > nit: return not needed here since we're already at the bottom of the > > > function, but probably not worth a repost of the series. > > > > > Thx. I will not repost, but I have changed them locally so they are > > there in case a V7 is required. > > > > It's a checkpatch.pl -f warning so we probably will want to fix it > eventually. According to [1] the patchset has already been applied. So I'll just send another patch for it to be applied on top. Thx for pointing this out. 
[1] https://patchwork.kernel.org/project/netdevbpf/patch/20240501-jag-sysctl_remset_net-v6-1-370b702b6b4a at samsung.com/

--
Joel Granados

From linyunsheng at huawei.com Wed May 8 06:34:01 2024
From: linyunsheng at huawei.com (Yunsheng Lin)
Date: Wed, 8 May 2024 21:34:01 +0800
Subject: [PATCH net-next v3 06/13] mm: page_frag: add '_va' suffix to page_frag API
In-Reply-To: <20240508133408.54708-1-linyunsheng@huawei.com>
References: <20240508133408.54708-1-linyunsheng@huawei.com>
Message-ID: <20240508133408.54708-7-linyunsheng@huawei.com>

Currently the page_frag API returns a 'virtual address' or 'va' when
allocating and expects a 'virtual address' or 'va' as input when
freeing.

We are about to support new use cases in which the caller needs to deal
with 'struct page', or with both 'va' and 'struct page'. To
differentiate the API handling between 'va' and 'struct page', add a
'_va' suffix to the corresponding API, mirroring the
page_pool_alloc_va() API of the page_pool. Callers expecting to deal
with va, page, or both va and page may then call the
page_frag_alloc_va*, page_frag_alloc_pg*, or page_frag_alloc* API
accordingly.
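
As a rough user-space illustration of the naming convention described above, the sketch below pairs a va-returning allocator with a va-taking free, in the same alloc / error-path-free shape as the call sites this patch converts. Everything prefixed mock_ is a hypothetical stand-in built on malloc(); it is not the kernel implementation. In particular, the mock free takes a cache argument only so the sketch can count live fragments, whereas the call sites in the patch pass page_frag_free_va() the address alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct page_frag_cache; only counts live frags. */
struct mock_frag_cache {
	unsigned int live;
};

/* Returns a virtual address, like the '_va' flavour of the API. */
static void *mock_page_frag_alloc_va(struct mock_frag_cache *nc, size_t fragsz)
{
	void *va = malloc(fragsz);

	if (va)
		nc->live++;
	return va;
}

/* Takes back the same virtual address that the '_va' allocator returned. */
static void mock_page_frag_free_va(struct mock_frag_cache *nc, void *va)
{
	if (!va)
		return;
	free(va);
	nc->live--;
}

/*
 * Allocate a frame, then free it on the error path. On success a real
 * driver would hand the frame on; the mock frees it to stay leak-free.
 */
static int mock_redirect(struct mock_frag_cache *nc, size_t len, int fail)
{
	void *frame = mock_page_frag_alloc_va(nc, len);

	if (!frame)
		return -1;

	if (fail) {
		mock_page_frag_free_va(nc, frame);
		return -2;
	}

	mock_page_frag_free_va(nc, frame);
	return 0;
}
```

The point of the sketch is only that a caller which works purely with addresses uses the '_va' pair on both sides; callers that need 'struct page' would use a differently named variant.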
CC: Alexander Duyck Signed-off-by: Yunsheng Lin --- drivers/net/ethernet/google/gve/gve_rx.c | 4 ++-- drivers/net/ethernet/intel/ice/ice_txrx.c | 2 +- drivers/net/ethernet/intel/ice/ice_txrx.h | 2 +- drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 2 +- .../net/ethernet/intel/ixgbevf/ixgbevf_main.c | 4 ++-- .../marvell/octeontx2/nic/otx2_common.c | 2 +- drivers/net/ethernet/mediatek/mtk_wed_wo.c | 4 ++-- drivers/nvme/host/tcp.c | 8 +++---- drivers/nvme/target/tcp.c | 22 +++++++++---------- drivers/vhost/net.c | 6 ++--- include/linux/page_frag_cache.h | 21 +++++++++--------- include/linux/skbuff.h | 2 +- kernel/bpf/cpumap.c | 2 +- mm/page_frag_cache.c | 12 +++++----- mm/page_frag_test.c | 11 +++++----- net/core/skbuff.c | 18 +++++++-------- net/core/xdp.c | 2 +- net/rxrpc/txbuf.c | 15 +++++++------ net/sunrpc/svcsock.c | 6 ++--- 19 files changed, 74 insertions(+), 71 deletions(-) diff --git a/drivers/net/ethernet/google/gve/gve_rx.c b/drivers/net/ethernet/google/gve/gve_rx.c index acb73d4d0de6..b6c10100e462 100644 --- a/drivers/net/ethernet/google/gve/gve_rx.c +++ b/drivers/net/ethernet/google/gve/gve_rx.c @@ -729,7 +729,7 @@ static int gve_xdp_redirect(struct net_device *dev, struct gve_rx_ring *rx, total_len = headroom + SKB_DATA_ALIGN(len) + SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); - frame = page_frag_alloc(&rx->page_cache, total_len, GFP_ATOMIC); + frame = page_frag_alloc_va(&rx->page_cache, total_len, GFP_ATOMIC); if (!frame) { u64_stats_update_begin(&rx->statss); rx->xdp_alloc_fails++; @@ -742,7 +742,7 @@ static int gve_xdp_redirect(struct net_device *dev, struct gve_rx_ring *rx, err = xdp_do_redirect(dev, &new, xdp_prog); if (err) - page_frag_free(frame); + page_frag_free_va(frame); return err; } diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c index 8bb743f78fcb..399b317c509d 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c @@ -126,7 +126,7 @@ 
ice_unmap_and_free_tx_buf(struct ice_tx_ring *ring, struct ice_tx_buf *tx_buf) dev_kfree_skb_any(tx_buf->skb); break; case ICE_TX_BUF_XDP_TX: - page_frag_free(tx_buf->raw_buf); + page_frag_free_va(tx_buf->raw_buf); break; case ICE_TX_BUF_XDP_XMIT: xdp_return_frame(tx_buf->xdpf); diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h index feba314a3fe4..6379f57d8228 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.h +++ b/drivers/net/ethernet/intel/ice/ice_txrx.h @@ -148,7 +148,7 @@ static inline int ice_skb_pad(void) * @ICE_TX_BUF_DUMMY: dummy Flow Director packet, unmap and kfree() * @ICE_TX_BUF_FRAG: mapped skb OR &xdp_buff frag, only unmap DMA * @ICE_TX_BUF_SKB: &sk_buff, unmap and consume_skb(), update stats - * @ICE_TX_BUF_XDP_TX: &xdp_buff, unmap and page_frag_free(), stats + * @ICE_TX_BUF_XDP_TX: &xdp_buff, unmap and page_frag_free_va(), stats * @ICE_TX_BUF_XDP_XMIT: &xdp_frame, unmap and xdp_return_frame(), stats * @ICE_TX_BUF_XSK_TX: &xdp_buff on XSk queue, xsk_buff_free(), stats */ diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c index 2719f0e20933..a1a41a14df0d 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c @@ -250,7 +250,7 @@ ice_clean_xdp_tx_buf(struct device *dev, struct ice_tx_buf *tx_buf, switch (tx_buf->type) { case ICE_TX_BUF_XDP_TX: - page_frag_free(tx_buf->raw_buf); + page_frag_free_va(tx_buf->raw_buf); break; case ICE_TX_BUF_XDP_XMIT: xdp_return_frame_bulk(tx_buf->xdpf, bq); diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index b938dc06045d..fcd1b149a45d 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -303,7 +303,7 @@ static bool ixgbevf_clean_tx_irq(struct ixgbevf_q_vector *q_vector, /* free the skb */ if (ring_is_xdp(tx_ring)) - 
page_frag_free(tx_buffer->data); + page_frag_free_va(tx_buffer->data); else napi_consume_skb(tx_buffer->skb, napi_budget); @@ -2413,7 +2413,7 @@ static void ixgbevf_clean_tx_ring(struct ixgbevf_ring *tx_ring) /* Free all the Tx ring sk_buffs */ if (ring_is_xdp(tx_ring)) - page_frag_free(tx_buffer->data); + page_frag_free_va(tx_buffer->data); else dev_kfree_skb_any(tx_buffer->skb); diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c index a85ac039d779..8eb5820b8a70 100644 --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c @@ -553,7 +553,7 @@ static int __otx2_alloc_rbuf(struct otx2_nic *pfvf, struct otx2_pool *pool, *dma = dma_map_single_attrs(pfvf->dev, buf, pool->rbsize, DMA_FROM_DEVICE, DMA_ATTR_SKIP_CPU_SYNC); if (unlikely(dma_mapping_error(pfvf->dev, *dma))) { - page_frag_free(buf); + page_frag_free_va(buf); return -ENOMEM; } diff --git a/drivers/net/ethernet/mediatek/mtk_wed_wo.c b/drivers/net/ethernet/mediatek/mtk_wed_wo.c index 7063c78bd35f..c4228719f8a4 100644 --- a/drivers/net/ethernet/mediatek/mtk_wed_wo.c +++ b/drivers/net/ethernet/mediatek/mtk_wed_wo.c @@ -142,8 +142,8 @@ mtk_wed_wo_queue_refill(struct mtk_wed_wo *wo, struct mtk_wed_wo_queue *q, dma_addr_t addr; void *buf; - buf = page_frag_alloc(&q->cache, q->buf_size, - GFP_ATOMIC | GFP_DMA32); + buf = page_frag_alloc_va(&q->cache, q->buf_size, + GFP_ATOMIC | GFP_DMA32); if (!buf) break; diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index fdbcdcedcee9..79eddd74bfbb 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -500,7 +500,7 @@ static void nvme_tcp_exit_request(struct blk_mq_tag_set *set, { struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq); - page_frag_free(req->pdu); + page_frag_free_va(req->pdu); } static int nvme_tcp_init_request(struct blk_mq_tag_set *set, @@ -514,7 +514,7 @@ static int 
nvme_tcp_init_request(struct blk_mq_tag_set *set, struct nvme_tcp_queue *queue = &ctrl->queues[queue_idx]; u8 hdgst = nvme_tcp_hdgst_len(queue); - req->pdu = page_frag_alloc(&queue->pf_cache, + req->pdu = page_frag_alloc_va(&queue->pf_cache, sizeof(struct nvme_tcp_cmd_pdu) + hdgst, GFP_KERNEL | __GFP_ZERO); if (!req->pdu) @@ -1331,7 +1331,7 @@ static void nvme_tcp_free_async_req(struct nvme_tcp_ctrl *ctrl) { struct nvme_tcp_request *async = &ctrl->async_req; - page_frag_free(async->pdu); + page_frag_free_va(async->pdu); } static int nvme_tcp_alloc_async_req(struct nvme_tcp_ctrl *ctrl) @@ -1340,7 +1340,7 @@ static int nvme_tcp_alloc_async_req(struct nvme_tcp_ctrl *ctrl) struct nvme_tcp_request *async = &ctrl->async_req; u8 hdgst = nvme_tcp_hdgst_len(queue); - async->pdu = page_frag_alloc(&queue->pf_cache, + async->pdu = page_frag_alloc_va(&queue->pf_cache, sizeof(struct nvme_tcp_cmd_pdu) + hdgst, GFP_KERNEL | __GFP_ZERO); if (!async->pdu) diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c index a5422e2c979a..ea356ce22672 100644 --- a/drivers/nvme/target/tcp.c +++ b/drivers/nvme/target/tcp.c @@ -1462,24 +1462,24 @@ static int nvmet_tcp_alloc_cmd(struct nvmet_tcp_queue *queue, c->queue = queue; c->req.port = queue->port->nport; - c->cmd_pdu = page_frag_alloc(&queue->pf_cache, + c->cmd_pdu = page_frag_alloc_va(&queue->pf_cache, sizeof(*c->cmd_pdu) + hdgst, GFP_KERNEL | __GFP_ZERO); if (!c->cmd_pdu) return -ENOMEM; c->req.cmd = &c->cmd_pdu->cmd; - c->rsp_pdu = page_frag_alloc(&queue->pf_cache, + c->rsp_pdu = page_frag_alloc_va(&queue->pf_cache, sizeof(*c->rsp_pdu) + hdgst, GFP_KERNEL | __GFP_ZERO); if (!c->rsp_pdu) goto out_free_cmd; c->req.cqe = &c->rsp_pdu->cqe; - c->data_pdu = page_frag_alloc(&queue->pf_cache, + c->data_pdu = page_frag_alloc_va(&queue->pf_cache, sizeof(*c->data_pdu) + hdgst, GFP_KERNEL | __GFP_ZERO); if (!c->data_pdu) goto out_free_rsp; - c->r2t_pdu = page_frag_alloc(&queue->pf_cache, + c->r2t_pdu = 
page_frag_alloc_va(&queue->pf_cache, sizeof(*c->r2t_pdu) + hdgst, GFP_KERNEL | __GFP_ZERO); if (!c->r2t_pdu) goto out_free_data; @@ -1494,20 +1494,20 @@ static int nvmet_tcp_alloc_cmd(struct nvmet_tcp_queue *queue, return 0; out_free_data: - page_frag_free(c->data_pdu); + page_frag_free_va(c->data_pdu); out_free_rsp: - page_frag_free(c->rsp_pdu); + page_frag_free_va(c->rsp_pdu); out_free_cmd: - page_frag_free(c->cmd_pdu); + page_frag_free_va(c->cmd_pdu); return -ENOMEM; } static void nvmet_tcp_free_cmd(struct nvmet_tcp_cmd *c) { - page_frag_free(c->r2t_pdu); - page_frag_free(c->data_pdu); - page_frag_free(c->rsp_pdu); - page_frag_free(c->cmd_pdu); + page_frag_free_va(c->r2t_pdu); + page_frag_free_va(c->data_pdu); + page_frag_free_va(c->rsp_pdu); + page_frag_free_va(c->cmd_pdu); } static int nvmet_tcp_alloc_cmds(struct nvmet_tcp_queue *queue) diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index f16279351db5..6691fac01e0d 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -686,8 +686,8 @@ static int vhost_net_build_xdp(struct vhost_net_virtqueue *nvq, return -ENOSPC; buflen += SKB_DATA_ALIGN(len + pad); - buf = page_frag_alloc_align(&net->pf_cache, buflen, GFP_KERNEL, - SMP_CACHE_BYTES); + buf = page_frag_alloc_va_align(&net->pf_cache, buflen, GFP_KERNEL, + SMP_CACHE_BYTES); if (unlikely(!buf)) return -ENOMEM; @@ -734,7 +734,7 @@ static int vhost_net_build_xdp(struct vhost_net_virtqueue *nvq, return 0; err: - page_frag_free(buf); + page_frag_free_va(buf); return ret; } diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h index 9da7cbd0ee47..a5747cf7a3a1 100644 --- a/include/linux/page_frag_cache.h +++ b/include/linux/page_frag_cache.h @@ -25,23 +25,24 @@ struct page_frag_cache { void page_frag_cache_drain(struct page_frag_cache *nc); void __page_frag_cache_drain(struct page *page, unsigned int count); -void *__page_frag_alloc_align(struct page_frag_cache *nc, unsigned int fragsz, - gfp_t gfp_mask, unsigned int align_mask); 
+void *__page_frag_alloc_va_align(struct page_frag_cache *nc, + unsigned int fragsz, gfp_t gfp_mask, + unsigned int align_mask); -static inline void *page_frag_alloc_align(struct page_frag_cache *nc, - unsigned int fragsz, gfp_t gfp_mask, - unsigned int align) +static inline void *page_frag_alloc_va_align(struct page_frag_cache *nc, + unsigned int fragsz, + gfp_t gfp_mask, unsigned int align) { WARN_ON_ONCE(!is_power_of_2(align) || align > PAGE_SIZE); - return __page_frag_alloc_align(nc, fragsz, gfp_mask, -align); + return __page_frag_alloc_va_align(nc, fragsz, gfp_mask, -align); } -static inline void *page_frag_alloc(struct page_frag_cache *nc, - unsigned int fragsz, gfp_t gfp_mask) +static inline void *page_frag_alloc_va(struct page_frag_cache *nc, + unsigned int fragsz, gfp_t gfp_mask) { - return __page_frag_alloc_align(nc, fragsz, gfp_mask, ~0u); + return __page_frag_alloc_va_align(nc, fragsz, gfp_mask, ~0u); } -void page_frag_free(void *addr); +void page_frag_free_va(void *addr); #endif diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index ce077d14eab6..adaaa478fdce 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -3337,7 +3337,7 @@ static inline struct sk_buff *netdev_alloc_skb_ip_align(struct net_device *dev, static inline void skb_free_frag(void *addr) { - page_frag_free(addr); + page_frag_free_va(addr); } void *__napi_alloc_frag_align(unsigned int fragsz, unsigned int align_mask); diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c index a8e34416e960..3a6a237e7dd3 100644 --- a/kernel/bpf/cpumap.c +++ b/kernel/bpf/cpumap.c @@ -322,7 +322,7 @@ static int cpu_map_kthread_run(void *data) /* Bring struct page memory area to curr CPU. Read by * build_skb_around via page_is_pfmemalloc(), and when - * freed written by page_frag_free call. + * freed written by page_frag_free_va call. 
*/ prefetchw(page); } diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c index 152ae5dec58a..c0ecfa733727 100644 --- a/mm/page_frag_cache.c +++ b/mm/page_frag_cache.c @@ -61,9 +61,9 @@ void __page_frag_cache_drain(struct page *page, unsigned int count) } EXPORT_SYMBOL(__page_frag_cache_drain); -void *__page_frag_alloc_align(struct page_frag_cache *nc, - unsigned int fragsz, gfp_t gfp_mask, - unsigned int align_mask) +void *__page_frag_alloc_va_align(struct page_frag_cache *nc, + unsigned int fragsz, gfp_t gfp_mask, + unsigned int align_mask) { unsigned int size, offset; struct page *page; @@ -124,16 +124,16 @@ void *__page_frag_alloc_align(struct page_frag_cache *nc, return nc->va + offset; } -EXPORT_SYMBOL(__page_frag_alloc_align); +EXPORT_SYMBOL(__page_frag_alloc_va_align); /* * Frees a page fragment allocated out of either a compound or order 0 page. */ -void page_frag_free(void *addr) +void page_frag_free_va(void *addr) { struct page *page = virt_to_head_page(addr); if (unlikely(put_page_testzero(page))) free_unref_page(page, compound_order(page)); } -EXPORT_SYMBOL(page_frag_free); +EXPORT_SYMBOL(page_frag_free_va); diff --git a/mm/page_frag_test.c b/mm/page_frag_test.c index f1c861709551..92eb288aab75 100644 --- a/mm/page_frag_test.c +++ b/mm/page_frag_test.c @@ -265,7 +265,7 @@ static int page_frag_pop_thread(void *arg) if (obj) { nr--; - page_frag_free(obj); + page_frag_free_va(obj); } else { cond_resched(); } @@ -295,17 +295,18 @@ static int page_frag_push_thread(void *arg) size = clamp(size, 1U, PAGE_SIZE); if (test_align) - va = page_frag_alloc_align(&test_frag, size, GFP_KERNEL, - SMP_CACHE_BYTES); + va = page_frag_alloc_va_align(&test_frag, size, + GFP_KERNEL, + SMP_CACHE_BYTES); else - va = page_frag_alloc(&test_frag, size, GFP_KERNEL); + va = page_frag_alloc_va(&test_frag, size, GFP_KERNEL); if (!va) continue; ret = objpool_push(va, pool); if (ret) { - page_frag_free(va); + page_frag_free_va(va); cond_resched(); } else { nr--; diff --git 
a/net/core/skbuff.c b/net/core/skbuff.c index 466999a7515e..dca4e7445348 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -309,8 +309,8 @@ void *__napi_alloc_frag_align(unsigned int fragsz, unsigned int align_mask) fragsz = SKB_DATA_ALIGN(fragsz); - return __page_frag_alloc_align(&nc->page, fragsz, GFP_ATOMIC, - align_mask); + return __page_frag_alloc_va_align(&nc->page, fragsz, GFP_ATOMIC, + align_mask); } EXPORT_SYMBOL(__napi_alloc_frag_align); @@ -322,15 +322,15 @@ void *__netdev_alloc_frag_align(unsigned int fragsz, unsigned int align_mask) if (in_hardirq() || irqs_disabled()) { struct page_frag_cache *nc = this_cpu_ptr(&netdev_alloc_cache); - data = __page_frag_alloc_align(nc, fragsz, GFP_ATOMIC, - align_mask); + data = __page_frag_alloc_va_align(nc, fragsz, GFP_ATOMIC, + align_mask); } else { struct napi_alloc_cache *nc; local_bh_disable(); nc = this_cpu_ptr(&napi_alloc_cache); - data = __page_frag_alloc_align(&nc->page, fragsz, GFP_ATOMIC, - align_mask); + data = __page_frag_alloc_va_align(&nc->page, fragsz, GFP_ATOMIC, + align_mask); local_bh_enable(); } return data; @@ -740,12 +740,12 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev, unsigned int len, if (in_hardirq() || irqs_disabled()) { nc = this_cpu_ptr(&netdev_alloc_cache); - data = page_frag_alloc(nc, len, gfp_mask); + data = page_frag_alloc_va(nc, len, gfp_mask); pfmemalloc = nc->pfmemalloc; } else { local_bh_disable(); nc = this_cpu_ptr(&napi_alloc_cache.page); - data = page_frag_alloc(nc, len, gfp_mask); + data = page_frag_alloc_va(nc, len, gfp_mask); pfmemalloc = nc->pfmemalloc; local_bh_enable(); } @@ -833,7 +833,7 @@ struct sk_buff *napi_alloc_skb(struct napi_struct *napi, unsigned int len) } else { len = SKB_HEAD_ALIGN(len); - data = page_frag_alloc(&nc->page, len, gfp_mask); + data = page_frag_alloc_va(&nc->page, len, gfp_mask); pfmemalloc = nc->page.pfmemalloc; } diff --git a/net/core/xdp.c b/net/core/xdp.c index 41693154e426..245a2d011aeb 100644 --- a/net/core/xdp.c +++ 
b/net/core/xdp.c @@ -391,7 +391,7 @@ void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct, page_pool_put_full_page(page->pp, page, napi_direct); break; case MEM_TYPE_PAGE_SHARED: - page_frag_free(data); + page_frag_free_va(data); break; case MEM_TYPE_PAGE_ORDER0: page = virt_to_page(data); /* Assumes order0 page*/ diff --git a/net/rxrpc/txbuf.c b/net/rxrpc/txbuf.c index c3913d8a50d3..dccb0353ee84 100644 --- a/net/rxrpc/txbuf.c +++ b/net/rxrpc/txbuf.c @@ -33,8 +33,8 @@ struct rxrpc_txbuf *rxrpc_alloc_data_txbuf(struct rxrpc_call *call, size_t data_ data_align = umax(data_align, L1_CACHE_BYTES); mutex_lock(&call->conn->tx_data_alloc_lock); - buf = page_frag_alloc_align(&call->conn->tx_data_alloc, total, gfp, - data_align); + buf = page_frag_alloc_va_align(&call->conn->tx_data_alloc, total, gfp, + data_align); mutex_unlock(&call->conn->tx_data_alloc_lock); if (!buf) { kfree(txb); @@ -96,17 +96,18 @@ struct rxrpc_txbuf *rxrpc_alloc_ack_txbuf(struct rxrpc_call *call, size_t sack_s if (!txb) return NULL; - buf = page_frag_alloc(&call->local->tx_alloc, - sizeof(*whdr) + sizeof(*ack) + 1 + 3 + sizeof(*trailer), gfp); + buf = page_frag_alloc_va(&call->local->tx_alloc, + sizeof(*whdr) + sizeof(*ack) + 1 + 3 + sizeof(*trailer), gfp); if (!buf) { kfree(txb); return NULL; } if (sack_size) { - buf2 = page_frag_alloc(&call->local->tx_alloc, sack_size, gfp); + buf2 = page_frag_alloc_va(&call->local->tx_alloc, sack_size, + gfp); if (!buf2) { - page_frag_free(buf); + page_frag_free_va(buf); kfree(txb); return NULL; } @@ -180,7 +181,7 @@ static void rxrpc_free_txbuf(struct rxrpc_txbuf *txb) rxrpc_txbuf_free); for (i = 0; i < txb->nr_kvec; i++) if (txb->kvec[i].iov_base) - page_frag_free(txb->kvec[i].iov_base); + page_frag_free_va(txb->kvec[i].iov_base); kfree(txb); atomic_dec(&rxrpc_nr_txbuf); } diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c index 6b3f01beb294..42d20412c1c3 100644 --- a/net/sunrpc/svcsock.c +++ b/net/sunrpc/svcsock.c @@ -1222,8 +1222,8 
@@ static int svc_tcp_sendmsg(struct svc_sock *svsk, struct svc_rqst *rqstp, /* The stream record marker is copied into a temporary page * fragment buffer so that it can be included in rq_bvec. */ - buf = page_frag_alloc(&svsk->sk_frag_cache, sizeof(marker), - GFP_KERNEL); + buf = page_frag_alloc_va(&svsk->sk_frag_cache, sizeof(marker), + GFP_KERNEL); if (!buf) return -ENOMEM; memcpy(buf, &marker, sizeof(marker)); @@ -1235,7 +1235,7 @@ static int svc_tcp_sendmsg(struct svc_sock *svsk, struct svc_rqst *rqstp, iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, rqstp->rq_bvec, 1 + count, sizeof(marker) + rqstp->rq_res.len); ret = sock_sendmsg(svsk->sk_sock, &msg); - page_frag_free(buf); + page_frag_free_va(buf); if (ret < 0) return ret; *sentp += ret; -- 2.33.0 From linyunsheng at huawei.com Wed May 8 06:34:02 2024 From: linyunsheng at huawei.com (Yunsheng Lin) Date: Wed, 8 May 2024 21:34:02 +0800 Subject: [PATCH net-next v3 07/13] mm: page_frag: avoid caller accessing 'page_frag_cache' directly In-Reply-To: <20240508133408.54708-1-linyunsheng@huawei.com> References: <20240508133408.54708-1-linyunsheng@huawei.com> Message-ID: <20240508133408.54708-8-linyunsheng@huawei.com> Use the appropriate page_frag API instead of having callers access the fields of 'page_frag_cache' directly.
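For readers skimming the archive, the accessor pattern this patch describes can be sketched in user space roughly as follows. This is only an illustration: the struct is a simplified stand-in with just the two fields the helpers touch (not the kernel definition), and demo_caller() is a hypothetical caller; the helper names and bodies follow the page_frag_cache.h hunk of this patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct page_frag_cache: only the two
 * fields used by the helpers in this patch are modelled (assumption). */
struct page_frag_cache {
	void *va;
	bool pfmemalloc;
};

/* Same body as the helper added by the patch: an empty cache has no page. */
static inline void page_frag_cache_init(struct page_frag_cache *nc)
{
	nc->va = NULL;
}

/* Same body as the helper added by the patch: callers query the cache
 * instead of reading nc->pfmemalloc themselves. */
static inline bool page_frag_cache_is_pfmemalloc(struct page_frag_cache *nc)
{
	return !!nc->pfmemalloc;
}

/* Hypothetical caller, for illustration only. */
bool demo_caller(void)
{
	char dummy;
	struct page_frag_cache cache = { .va = &dummy, .pfmemalloc = true };

	page_frag_cache_init(&cache);                 /* was: cache.va = NULL;   */
	return page_frag_cache_is_pfmemalloc(&cache); /* was: cache.pfmemalloc   */
}
```

The point of the indirection is that the layout of the cache can later change without touching every caller; only the helpers would need updating.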
CC: Alexander Duyck Signed-off-by: Yunsheng Lin --- drivers/vhost/net.c | 2 +- include/linux/page_frag_cache.h | 10 ++++++++++ mm/page_frag_test.c | 2 +- net/core/skbuff.c | 6 +++--- net/rxrpc/conn_object.c | 4 +--- net/rxrpc/local_object.c | 4 +--- net/sunrpc/svcsock.c | 6 ++---- 7 files changed, 19 insertions(+), 15 deletions(-) diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index 6691fac01e0d..b2737dc0dc50 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -1325,7 +1325,7 @@ static int vhost_net_open(struct inode *inode, struct file *f) vqs[VHOST_NET_VQ_RX]); f->private_data = n; - n->pf_cache.va = NULL; + page_frag_cache_init(&n->pf_cache); return 0; } diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h index a5747cf7a3a1..024ff73a7ea4 100644 --- a/include/linux/page_frag_cache.h +++ b/include/linux/page_frag_cache.h @@ -23,6 +23,16 @@ struct page_frag_cache { bool pfmemalloc; }; +static inline void page_frag_cache_init(struct page_frag_cache *nc) +{ + nc->va = NULL; +} + +static inline bool page_frag_cache_is_pfmemalloc(struct page_frag_cache *nc) +{ + return !!nc->pfmemalloc; +} + void page_frag_cache_drain(struct page_frag_cache *nc); void __page_frag_cache_drain(struct page *page, unsigned int count); void *__page_frag_alloc_va_align(struct page_frag_cache *nc, diff --git a/mm/page_frag_test.c b/mm/page_frag_test.c index 92eb288aab75..8a974d0588bf 100644 --- a/mm/page_frag_test.c +++ b/mm/page_frag_test.c @@ -329,7 +329,7 @@ static int __init page_frag_test_init(void) u64 duration; int ret; - test_frag.va = NULL; + page_frag_cache_init(&test_frag); atomic_set(&nthreads, 2); init_completion(&wait); diff --git a/net/core/skbuff.c b/net/core/skbuff.c index dca4e7445348..caee22db1cc7 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -741,12 +741,12 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev, unsigned int len, if (in_hardirq() || irqs_disabled()) { nc = this_cpu_ptr(&netdev_alloc_cache); 
data = page_frag_alloc_va(nc, len, gfp_mask); - pfmemalloc = nc->pfmemalloc; + pfmemalloc = page_frag_cache_is_pfmemalloc(nc); } else { local_bh_disable(); nc = this_cpu_ptr(&napi_alloc_cache.page); data = page_frag_alloc_va(nc, len, gfp_mask); - pfmemalloc = nc->pfmemalloc; + pfmemalloc = page_frag_cache_is_pfmemalloc(nc); local_bh_enable(); } @@ -834,7 +834,7 @@ struct sk_buff *napi_alloc_skb(struct napi_struct *napi, unsigned int len) len = SKB_HEAD_ALIGN(len); data = page_frag_alloc_va(&nc->page, len, gfp_mask); - pfmemalloc = nc->page.pfmemalloc; + pfmemalloc = page_frag_cache_is_pfmemalloc(&nc->page); } if (unlikely(!data)) diff --git a/net/rxrpc/conn_object.c b/net/rxrpc/conn_object.c index 1539d315afe7..694c4df7a1a3 100644 --- a/net/rxrpc/conn_object.c +++ b/net/rxrpc/conn_object.c @@ -337,9 +337,7 @@ static void rxrpc_clean_up_connection(struct work_struct *work) */ rxrpc_purge_queue(&conn->rx_queue); - if (conn->tx_data_alloc.va) - __page_frag_cache_drain(virt_to_page(conn->tx_data_alloc.va), - conn->tx_data_alloc.pagecnt_bias); + page_frag_cache_drain(&conn->tx_data_alloc); call_rcu(&conn->rcu, rxrpc_rcu_free_connection); } diff --git a/net/rxrpc/local_object.c b/net/rxrpc/local_object.c index 504453c688d7..a8cffe47cf01 100644 --- a/net/rxrpc/local_object.c +++ b/net/rxrpc/local_object.c @@ -452,9 +452,7 @@ void rxrpc_destroy_local(struct rxrpc_local *local) #endif rxrpc_purge_queue(&local->rx_queue); rxrpc_purge_client_connections(local); - if (local->tx_alloc.va) - __page_frag_cache_drain(virt_to_page(local->tx_alloc.va), - local->tx_alloc.pagecnt_bias); + page_frag_cache_drain(&local->tx_alloc); } /* diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c index 42d20412c1c3..4b1e87187614 100644 --- a/net/sunrpc/svcsock.c +++ b/net/sunrpc/svcsock.c @@ -1609,7 +1609,6 @@ static void svc_tcp_sock_detach(struct svc_xprt *xprt) static void svc_sock_free(struct svc_xprt *xprt) { struct svc_sock *svsk = container_of(xprt, struct svc_sock, sk_xprt); - struct 
page_frag_cache *pfc = &svsk->sk_frag_cache; struct socket *sock = svsk->sk_sock; trace_svcsock_free(svsk, sock); @@ -1619,8 +1618,7 @@ static void svc_sock_free(struct svc_xprt *xprt) sockfd_put(sock); else sock_release(sock); - if (pfc->va) - __page_frag_cache_drain(virt_to_head_page(pfc->va), - pfc->pagecnt_bias); + + page_frag_cache_drain(&svsk->sk_frag_cache); kfree(svsk); } -- 2.33.0 From andrea.righi at canonical.com Thu May 9 10:15:27 2024 From: andrea.righi at canonical.com (Andrea Righi) Date: Thu, 9 May 2024 19:15:27 +0200 Subject: [PATCH v5 40/40] 9p: Use netfslib read/write_iter In-Reply-To: <20231221132400.1601991-41-dhowells@redhat.com> References: <20231221132400.1601991-1-dhowells@redhat.com> <20231221132400.1601991-41-dhowells@redhat.com> Message-ID: On Thu, Dec 21, 2023 at 01:23:35PM +0000, David Howells wrote: > Use netfslib's read and write iteration helpers, allowing netfslib to take > over the management of the page cache for 9p files and to manage local disk > caching. In particular, this eliminates write_begin, write_end, writepage > and all mentions of struct page and struct folio from 9p. > > Note that netfslib now offers the possibility of write-through caching if > that is desirable for 9p: just set the NETFS_ICTX_WRITETHROUGH flag in > v9inode->netfs.flags in v9fs_set_netfs_context(). > > Note also this is untested as I can't get ganesha.nfsd to correctly parse > the config to turn on 9p support. It looks like this patch has introduced a regression with autopkgtest, see: https://bugs.launchpad.net/bugs/2056461 I haven't looked at the details yet, I just did some bisecting and apparently reverting this one seems to fix the problem. Let me know if you want me to test something in particular or if you already have a potential fix. Otherwise I'll take a look. 
Thanks, -Andrea From dhowells at redhat.com Thu May 9 14:33:37 2024 From: dhowells at redhat.com (David Howells) Date: Thu, 09 May 2024 22:33:37 +0100 Subject: [PATCH v5 40/40] 9p: Use netfslib read/write_iter In-Reply-To: References: <20231221132400.1601991-1-dhowells@redhat.com> <20231221132400.1601991-41-dhowells@redhat.com> Message-ID: <1567252.1715290417@warthog.procyon.org.uk> Andrea Righi wrote: > On Thu, Dec 21, 2023 at 01:23:35PM +0000, David Howells wrote: > > Use netfslib's read and write iteration helpers, allowing netfslib to take > > over the management of the page cache for 9p files and to manage local disk > > caching. In particular, this eliminates write_begin, write_end, writepage > > and all mentions of struct page and struct folio from 9p. > > > > Note that netfslib now offers the possibility of write-through caching if > > that is desirable for 9p: just set the NETFS_ICTX_WRITETHROUGH flag in > > v9inode->netfs.flags in v9fs_set_netfs_context(). > > > > Note also this is untested as I can't get ganesha.nfsd to correctly parse > > the config to turn on 9p support. > > It looks like this patch has introduced a regression with autopkgtest, > see: https://bugs.launchpad.net/bugs/2056461 > > I haven't looked at the details yet, I just did some bisecting and > apparently reverting this one seems to fix the problem. > > Let me know if you want me to test something in particular or if you > already have a potential fix. Otherwise I'll take a look. Do you have a reproducer? I'll be at LSF next week, so if I can't fix it tomorrow, I won't be able to poke at it until after that. 
David From andrea.righi at canonical.com Thu May 9 22:53:52 2024 From: andrea.righi at canonical.com (Andrea Righi) Date: Fri, 10 May 2024 07:53:52 +0200 Subject: [PATCH v5 40/40] 9p: Use netfslib read/write_iter In-Reply-To: <1567252.1715290417@warthog.procyon.org.uk> References: <20231221132400.1601991-1-dhowells@redhat.com> <20231221132400.1601991-41-dhowells@redhat.com> <1567252.1715290417@warthog.procyon.org.uk> Message-ID: On Thu, May 09, 2024 at 10:33:37PM +0100, David Howells wrote: > Andrea Righi wrote: > > > On Thu, Dec 21, 2023 at 01:23:35PM +0000, David Howells wrote: > > > Use netfslib's read and write iteration helpers, allowing netfslib to take > > > over the management of the page cache for 9p files and to manage local disk > > > caching. In particular, this eliminates write_begin, write_end, writepage > > > and all mentions of struct page and struct folio from 9p. > > > > > > Note that netfslib now offers the possibility of write-through caching if > > > that is desirable for 9p: just set the NETFS_ICTX_WRITETHROUGH flag in > > > v9inode->netfs.flags in v9fs_set_netfs_context(). > > > > > > Note also this is untested as I can't get ganesha.nfsd to correctly parse > > > the config to turn on 9p support. > > > > It looks like this patch has introduced a regression with autopkgtest, > > see: https://bugs.launchpad.net/bugs/2056461 > > > > I haven't looked at the details yet, I just did some bisecting and > > apparently reverting this one seems to fix the problem. > > > > Let me know if you want me to test something in particular or if you > > already have a potential fix. Otherwise I'll take a look. > > Do you have a reproducer? > > I'll be at LSF next week, so if I can't fix it tomorrow, I won't be able to > poke at it until after that. 
> > David The only reproducer that I have at the moment is the autopkgtest command mentioned in the bug, that is a bit convoluted, I'll try to see if I can better isolate the problem and find a simpler reproducer, but I'll also be travelling next week to a Canonical event. At the moment I'll temporarily revert the commit (that seems to prevent the issue from happening) and I'll keep you posted if I find something. Thanks, -Andrea
From dhowells at redhat.com Fri May 10 00:57:22 2024 From: dhowells at redhat.com (David Howells) Date: Fri, 10 May 2024 08:57:22 +0100 Subject: [PATCH v5 40/40] 9p: Use netfslib read/write_iter In-Reply-To: References: <20231221132400.1601991-1-dhowells@redhat.com> <20231221132400.1601991-41-dhowells@redhat.com> <1567252.1715290417@warthog.procyon.org.uk> Message-ID: <1578871.1715327842@warthog.procyon.org.uk> Andrea Righi wrote: > The only reproducer that I have at the moment is the autopkgtest command > mentioned in the bug, that is a bit convoluted, I'll try to see if I can > better isolate the problem and find a simpler reproducer, but I'll also > be travelling next week to a Canonical event. Note that the netfslib has some tracepoints that might help debug it. David From ryncsn at gmail.com Fri May 10 04:47:41 2024 From: ryncsn at gmail.com (Kairui Song) Date: Fri, 10 May 2024 19:47:41 +0800 Subject: [PATCH v5 06/12] afs: drop usage of folio_file_pos In-Reply-To: <20240510114747.21548-1-ryncsn@gmail.com> References: <20240510114747.21548-1-ryncsn@gmail.com> Message-ID: <20240510114747.21548-7-ryncsn@gmail.com> From: Kairui Song folio_file_pos is only needed for mixed usage of page cache and swap cache, for pure page cache usage, the caller can just use folio_pos instead. It can't be a swap cache page here. Swap mapping may only call into fs through swap_rw and that is not supported for afs. So just drop it and use folio_pos instead.
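The distinction the commit message draws can be illustrated with a small user-space model. This is a sketch under stated assumptions, not kernel code: struct model_folio, its fields, MODEL_PAGE_SHIFT and both model_* functions are hypothetical stand-ins, and the model assumes (as the commit message says) that the two helpers differ only for swap-cache folios.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical, heavily simplified folio; not the kernel structure. */
struct model_folio {
	uint64_t index;       /* page-cache index within the file */
	bool in_swap_cache;   /* true only for swap-cache folios */
	uint64_t swap_offset; /* index that applies while in the swap cache */
};

/* Model of folio_pos(): byte position derived from the page-cache index. */
int64_t model_folio_pos(const struct model_folio *f)
{
	return (int64_t)(f->index << MODEL_PAGE_SHIFT);
}

/* Model of folio_file_pos(): same result, except that a swap-cache folio
 * is positioned by its swap offset instead of its page-cache index. */
int64_t model_folio_file_pos(const struct model_folio *f)
{
	uint64_t idx = f->in_swap_cache ? f->swap_offset : f->index;

	return (int64_t)(idx << MODEL_PAGE_SHIFT);
}
```

For a folio that can never be in the swap cache, which is the case the commit message argues for afs directories, the two functions in this model always return the same value, so the simpler one suffices.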
Signed-off-by: Kairui Song Cc: David Howells Cc: Marc Dionne Cc: linux-afs at lists.infradead.org --- fs/afs/dir.c | 6 +++--- fs/afs/dir_edit.c | 4 ++-- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/fs/afs/dir.c b/fs/afs/dir.c index 67afe68972d5..f8622ed72e08 100644 --- a/fs/afs/dir.c +++ b/fs/afs/dir.c @@ -533,14 +533,14 @@ static int afs_dir_iterate(struct inode *dir, struct dir_context *ctx, break; } - offset = round_down(ctx->pos, sizeof(*dblock)) - folio_file_pos(folio); + offset = round_down(ctx->pos, sizeof(*dblock)) - folio_pos(folio); size = min_t(loff_t, folio_size(folio), - req->actual_len - folio_file_pos(folio)); + req->actual_len - folio_pos(folio)); do { dblock = kmap_local_folio(folio, offset); ret = afs_dir_iterate_block(dvnode, ctx, dblock, - folio_file_pos(folio) + offset); + folio_pos(folio) + offset); kunmap_local(dblock); if (ret != 1) goto out; diff --git a/fs/afs/dir_edit.c b/fs/afs/dir_edit.c index e2fa577b66fe..a71bff10496b 100644 --- a/fs/afs/dir_edit.c +++ b/fs/afs/dir_edit.c @@ -256,7 +256,7 @@ void afs_edit_dir_add(struct afs_vnode *vnode, folio = folio0; } - block = kmap_local_folio(folio, b * AFS_DIR_BLOCK_SIZE - folio_file_pos(folio)); + block = kmap_local_folio(folio, b * AFS_DIR_BLOCK_SIZE - folio_pos(folio)); /* Abandon the edit if we got a callback break. */ if (!test_bit(AFS_VNODE_DIR_VALID, &vnode->flags)) @@ -417,7 +417,7 @@ void afs_edit_dir_remove(struct afs_vnode *vnode, folio = folio0; } - block = kmap_local_folio(folio, b * AFS_DIR_BLOCK_SIZE - folio_file_pos(folio)); + block = kmap_local_folio(folio, b * AFS_DIR_BLOCK_SIZE - folio_pos(folio)); /* Abandon the edit if we got a callback break. 
*/ if (!test_bit(AFS_VNODE_DIR_VALID, &vnode->flags)) -- 2.45.0 From marc.dionne at auristor.com Fri May 10 15:14:40 2024 From: marc.dionne at auristor.com (Marc Dionne) Date: Fri, 10 May 2024 19:14:40 -0300 Subject: [PATCH] afs: Don't cross .backup mountpoint from backup volume Message-ID: <20240510221440.755019-1-marc.dionne@auristor.com> Don't cross a mountpoint that explicitly specifies a backup volume (target is .backup) when starting from a backup volume. It is not uncommon to mount a volume's backup directly in the volume itself. This can cause tools that are not paying attention to get into a loop mounting the volume onto itself as they attempt to traverse the tree, leading to a variety of problems. This doesn't prevent the general case of loops in a sequence of mountpoints, but addresses a common special case in the same way as other afs clients. Signed-off-by: Marc Dionne --- fs/afs/mntpt.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c index 97f50e9fd9eb..fb6eea31ff31 100644 --- a/fs/afs/mntpt.c +++ b/fs/afs/mntpt.c @@ -140,6 +140,11 @@ static int afs_mntpt_set_params(struct fs_context *fc, struct dentry *mntpt) put_page(page); if (ret < 0) return ret; + + /* Don't cross a backup volume mountpoint from a backup volume */ + if (src_as->volume && src_as->volume->type == AFSVL_BACKVOL + && ctx->type == AFSVL_BACKVOL) + return -ENODEV; } return 0; -- 2.45.0 From jaltman at auristor.com Fri May 10 21:03:50 2024 From: jaltman at auristor.com (Jeffrey E Altman) Date: Sat, 11 May 2024 00:03:50 -0400 Subject: [PATCH] afs: Don't cross .backup mountpoint from backup volume In-Reply-To: <20240510221440.755019-1-marc.dionne@auristor.com> References: <20240510221440.755019-1-marc.dionne@auristor.com> Message-ID: <0672ef81-6f15-4949-b7b6-37b8281bd038@auristor.com> On 5/10/2024 6:14 PM, Marc Dionne wrote: > Don't cross a mountpoint that explicitly specifies a backup volume > (target is .backup) when starting from a backup volume.
> > It it not uncommon to mount a volume's backup directly in the volume > itself. This can cause tools that are not paying attention to get > into a loop mounting the volume onto itself as they attempt to > traverse the tree, leading to a variety of problems. > > This doesn't prevent the general case of loops in a sequence of > mountpoints, but addresses a common special case in the same way > as other afs clients. > > Signed-off-by: Marc Dionne > --- > fs/afs/mntpt.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c > index 97f50e9fd9eb..fb6eea31ff31 100644 > --- a/fs/afs/mntpt.c > +++ b/fs/afs/mntpt.c > @@ -140,6 +140,11 @@ static int afs_mntpt_set_params(struct fs_context *fc, struct dentry *mntpt) > put_page(page); > if (ret < 0) > return ret; > + > + /* Don't cross a backup volume mountpoint from a backup volume */ > + if (src_as->volume && src_as->volume->type == AFSVL_BACKVOL > + && ctx->type == AFSVL_BACKVOL) > + return -ENODEV; > } > > return 0; Reviewed-by: Jeffrey Altman Reported-by: Jan Henrik Sylvester Link: http://lists.infradead.org/pipermail/linux-afs/2024-May/008454.html Reported-by: Markus Suvanto Link: http://lists.infradead.org/pipermail/linux-afs/2024-February/008074.html Cc: stable at vger.kernel.org
From gregkh at linuxfoundation.org Wed May 15 00:35:45 2024 From: gregkh at linuxfoundation.org (gregkh at linuxfoundation.org) Date: Wed, 15 May 2024 09:35:45 +0200 Subject: Patch "keys: Fix overwrite of key expiration on instantiation" has been added to the 6.9-stable tree Message-ID: <2024051545-trinity-walk-e7e1@gregkh> This is a note to let you know that I've just added the patch titled keys: Fix overwrite of key expiration on instantiation to the 6.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: keys-fix-overwrite-of-key-expiration-on-instantiation.patch and it can be found in the queue-6.9 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let know about it. >From 9da27fb65a14c18efd4473e2e82b76b53ba60252 Mon Sep 17 00:00:00 2001 From: Silvio Gissi Date: Fri, 15 Mar 2024 15:05:39 -0400 Subject: keys: Fix overwrite of key expiration on instantiation From: Silvio Gissi commit 9da27fb65a14c18efd4473e2e82b76b53ba60252 upstream. The expiry time of a key is unconditionally overwritten during instantiation, defaulting to turn it permanent. This causes a problem for DNS resolution as the expiration set by user-space is overwritten to TIME64_MAX, disabling further DNS updates.
Fix this by restoring the condition that key_set_expiry is only called
when the pre-parser sets a specific expiry.

Fixes: 39299bdd2546 ("keys, dns: Allow key types (eg. DNS) to be reclaimed immediately on expiry")
Signed-off-by: Silvio Gissi
cc: David Howells
cc: Hazem Mohamed Abuelfotoh
cc: linux-afs at lists.infradead.org
cc: linux-cifs at vger.kernel.org
cc: keyrings at vger.kernel.org
cc: netdev at vger.kernel.org
cc: stable at vger.kernel.org
Reviewed-by: Jarkko Sakkinen
Signed-off-by: Jarkko Sakkinen
Signed-off-by: Greg Kroah-Hartman
---
 security/keys/key.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/security/keys/key.c
+++ b/security/keys/key.c
@@ -463,7 +463,8 @@ static int __key_instantiate_and_link(st
 			if (authkey)
 				key_invalidate(authkey);
 
-			key_set_expiry(key, prep->expiry);
+			if (prep->expiry != TIME64_MAX)
+				key_set_expiry(key, prep->expiry);
 		}
 	}
 

Patches currently in stable-queue which might be from sifonsec at amazon.com are

queue-6.9/keys-fix-overwrite-of-key-expiration-on-instantiation.patch

From gregkh at linuxfoundation.org  Wed May 15 00:40:36 2024
From: gregkh at linuxfoundation.org (gregkh at linuxfoundation.org)
Date: Wed, 15 May 2024 09:40:36 +0200
Subject: Patch "keys: Fix overwrite of key expiration on instantiation" has been added to the 5.10-stable tree
Message-ID: <2024051535-crummiest-angrily-f531@gregkh>

This is a note to let you know that I've just added the patch titled

    keys: Fix overwrite of key expiration on instantiation

to the 5.10-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     keys-fix-overwrite-of-key-expiration-on-instantiation.patch
and it can be found in the queue-5.10 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let know about it.

>From 9da27fb65a14c18efd4473e2e82b76b53ba60252 Mon Sep 17 00:00:00 2001
From: Silvio Gissi
Date: Fri, 15 Mar 2024 15:05:39 -0400
Subject: keys: Fix overwrite of key expiration on instantiation

From: Silvio Gissi

commit 9da27fb65a14c18efd4473e2e82b76b53ba60252 upstream.

The expiry time of a key is unconditionally overwritten during
instantiation, defaulting to turn it permanent. This causes a problem
for DNS resolution as the expiration set by user-space is overwritten to
TIME64_MAX, disabling further DNS updates.

Fix this by restoring the condition that key_set_expiry is only called
when the pre-parser sets a specific expiry.

Fixes: 39299bdd2546 ("keys, dns: Allow key types (eg. DNS) to be reclaimed immediately on expiry")
Signed-off-by: Silvio Gissi
cc: David Howells
cc: Hazem Mohamed Abuelfotoh
cc: linux-afs at lists.infradead.org
cc: linux-cifs at vger.kernel.org
cc: keyrings at vger.kernel.org
cc: netdev at vger.kernel.org
cc: stable at vger.kernel.org
Reviewed-by: Jarkko Sakkinen
Signed-off-by: Jarkko Sakkinen
Signed-off-by: Greg Kroah-Hartman
---
 security/keys/key.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/security/keys/key.c
+++ b/security/keys/key.c
@@ -464,7 +464,8 @@ static int __key_instantiate_and_link(st
 			if (authkey)
 				key_invalidate(authkey);
 
-			key_set_expiry(key, prep->expiry);
+			if (prep->expiry != TIME64_MAX)
+				key_set_expiry(key, prep->expiry);
 		}
 	}
 

Patches currently in stable-queue which might be from sifonsec at amazon.com are

queue-5.10/keys-fix-overwrite-of-key-expiration-on-instantiation.patch

From gregkh at linuxfoundation.org  Wed May 15 00:40:50 2024
From: gregkh at linuxfoundation.org (gregkh at linuxfoundation.org)
Date: Wed, 15 May 2024 09:40:50 +0200
Subject: Patch "keys: Fix overwrite of key expiration on instantiation" has been added to the 5.15-stable tree
Message-ID: <2024051550-helium-modulator-c5cf@gregkh>

This is a note to let you know that I've just added the patch titled

    keys: Fix overwrite of key expiration on instantiation

to the 5.15-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     keys-fix-overwrite-of-key-expiration-on-instantiation.patch
and it can be found in the queue-5.15 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let know about it.

>From 9da27fb65a14c18efd4473e2e82b76b53ba60252 Mon Sep 17 00:00:00 2001
From: Silvio Gissi
Date: Fri, 15 Mar 2024 15:05:39 -0400
Subject: keys: Fix overwrite of key expiration on instantiation

From: Silvio Gissi

commit 9da27fb65a14c18efd4473e2e82b76b53ba60252 upstream.

The expiry time of a key is unconditionally overwritten during
instantiation, defaulting to turn it permanent. This causes a problem
for DNS resolution as the expiration set by user-space is overwritten to
TIME64_MAX, disabling further DNS updates.

Fix this by restoring the condition that key_set_expiry is only called
when the pre-parser sets a specific expiry.

Fixes: 39299bdd2546 ("keys, dns: Allow key types (eg. DNS) to be reclaimed immediately on expiry")
Signed-off-by: Silvio Gissi
cc: David Howells
cc: Hazem Mohamed Abuelfotoh
cc: linux-afs at lists.infradead.org
cc: linux-cifs at vger.kernel.org
cc: keyrings at vger.kernel.org
cc: netdev at vger.kernel.org
cc: stable at vger.kernel.org
Reviewed-by: Jarkko Sakkinen
Signed-off-by: Jarkko Sakkinen
Signed-off-by: Greg Kroah-Hartman
---
 security/keys/key.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/security/keys/key.c
+++ b/security/keys/key.c
@@ -464,7 +464,8 @@ static int __key_instantiate_and_link(st
 			if (authkey)
 				key_invalidate(authkey);
 
-			key_set_expiry(key, prep->expiry);
+			if (prep->expiry != TIME64_MAX)
+				key_set_expiry(key, prep->expiry);
 		}
 	}
 

Patches currently in stable-queue which might be from sifonsec at amazon.com are

queue-5.15/keys-fix-overwrite-of-key-expiration-on-instantiation.patch

From gregkh at linuxfoundation.org  Wed May 15 00:41:01 2024
From: gregkh at linuxfoundation.org (gregkh at linuxfoundation.org)
Date: Wed, 15 May 2024 09:41:01 +0200
Subject: Patch "keys: Fix overwrite of key expiration on instantiation" has been added to the 6.1-stable tree
Message-ID: <2024051501-vocally-herbal-992b@gregkh>

This is a note to let you know that I've just added the patch titled

    keys: Fix overwrite of key expiration on instantiation

to the 6.1-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     keys-fix-overwrite-of-key-expiration-on-instantiation.patch
and it can be found in the queue-6.1 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let know about it.

>From 9da27fb65a14c18efd4473e2e82b76b53ba60252 Mon Sep 17 00:00:00 2001
From: Silvio Gissi
Date: Fri, 15 Mar 2024 15:05:39 -0400
Subject: keys: Fix overwrite of key expiration on instantiation

From: Silvio Gissi

commit 9da27fb65a14c18efd4473e2e82b76b53ba60252 upstream.

The expiry time of a key is unconditionally overwritten during
instantiation, defaulting to turn it permanent. This causes a problem
for DNS resolution as the expiration set by user-space is overwritten to
TIME64_MAX, disabling further DNS updates.

Fix this by restoring the condition that key_set_expiry is only called
when the pre-parser sets a specific expiry.

Fixes: 39299bdd2546 ("keys, dns: Allow key types (eg. DNS) to be reclaimed immediately on expiry")
Signed-off-by: Silvio Gissi
cc: David Howells
cc: Hazem Mohamed Abuelfotoh
cc: linux-afs at lists.infradead.org
cc: linux-cifs at vger.kernel.org
cc: keyrings at vger.kernel.org
cc: netdev at vger.kernel.org
cc: stable at vger.kernel.org
Reviewed-by: Jarkko Sakkinen
Signed-off-by: Jarkko Sakkinen
Signed-off-by: Greg Kroah-Hartman
---
 security/keys/key.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/security/keys/key.c
+++ b/security/keys/key.c
@@ -464,7 +464,8 @@ static int __key_instantiate_and_link(st
 			if (authkey)
 				key_invalidate(authkey);
 
-			key_set_expiry(key, prep->expiry);
+			if (prep->expiry != TIME64_MAX)
+				key_set_expiry(key, prep->expiry);
 		}
 	}
 

Patches currently in stable-queue which might be from sifonsec at amazon.com are

queue-6.1/keys-fix-overwrite-of-key-expiration-on-instantiation.patch

From gregkh at linuxfoundation.org  Wed May 15 00:41:14 2024
From: gregkh at linuxfoundation.org (gregkh at linuxfoundation.org)
Date: Wed, 15 May 2024 09:41:14 +0200
Subject: Patch "keys: Fix overwrite of key expiration on instantiation" has been added to the 6.6-stable tree
Message-ID: <2024051514-feast-twisting-59da@gregkh>

This is a note to let you know that I've just added the patch titled

    keys: Fix overwrite of key expiration on instantiation

to the 6.6-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     keys-fix-overwrite-of-key-expiration-on-instantiation.patch
and it can be found in the queue-6.6 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let know about it.

>From 9da27fb65a14c18efd4473e2e82b76b53ba60252 Mon Sep 17 00:00:00 2001
From: Silvio Gissi
Date: Fri, 15 Mar 2024 15:05:39 -0400
Subject: keys: Fix overwrite of key expiration on instantiation

From: Silvio Gissi

commit 9da27fb65a14c18efd4473e2e82b76b53ba60252 upstream.

The expiry time of a key is unconditionally overwritten during
instantiation, defaulting to turn it permanent. This causes a problem
for DNS resolution as the expiration set by user-space is overwritten to
TIME64_MAX, disabling further DNS updates.

Fix this by restoring the condition that key_set_expiry is only called
when the pre-parser sets a specific expiry.

Fixes: 39299bdd2546 ("keys, dns: Allow key types (eg. DNS) to be reclaimed immediately on expiry")
Signed-off-by: Silvio Gissi
cc: David Howells
cc: Hazem Mohamed Abuelfotoh
cc: linux-afs at lists.infradead.org
cc: linux-cifs at vger.kernel.org
cc: keyrings at vger.kernel.org
cc: netdev at vger.kernel.org
cc: stable at vger.kernel.org
Reviewed-by: Jarkko Sakkinen
Signed-off-by: Jarkko Sakkinen
Signed-off-by: Greg Kroah-Hartman
---
 security/keys/key.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/security/keys/key.c
+++ b/security/keys/key.c
@@ -464,7 +464,8 @@ static int __key_instantiate_and_link(st
 			if (authkey)
 				key_invalidate(authkey);
 
-			key_set_expiry(key, prep->expiry);
+			if (prep->expiry != TIME64_MAX)
+				key_set_expiry(key, prep->expiry);
 		}
 	}
 

Patches currently in stable-queue which might be from sifonsec at amazon.com are

queue-6.6/keys-fix-overwrite-of-key-expiration-on-instantiation.patch

From gregkh at linuxfoundation.org  Wed May 15 00:41:25 2024
From: gregkh at linuxfoundation.org (gregkh at linuxfoundation.org)
Date: Wed, 15 May 2024 09:41:25 +0200
Subject: Patch "keys: Fix overwrite of key expiration on instantiation" has been added to the 6.8-stable tree
Message-ID: <2024051525-swell-destitute-5478@gregkh>

This is a note to let you know that I've just added the patch titled

    keys: Fix overwrite of key expiration on instantiation

to the 6.8-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     keys-fix-overwrite-of-key-expiration-on-instantiation.patch
and it can be found in the queue-6.8 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let know about it.

>From 9da27fb65a14c18efd4473e2e82b76b53ba60252 Mon Sep 17 00:00:00 2001
From: Silvio Gissi
Date: Fri, 15 Mar 2024 15:05:39 -0400
Subject: keys: Fix overwrite of key expiration on instantiation

From: Silvio Gissi

commit 9da27fb65a14c18efd4473e2e82b76b53ba60252 upstream.

The expiry time of a key is unconditionally overwritten during
instantiation, defaulting to turn it permanent. This causes a problem
for DNS resolution as the expiration set by user-space is overwritten to
TIME64_MAX, disabling further DNS updates.

Fix this by restoring the condition that key_set_expiry is only called
when the pre-parser sets a specific expiry.

Fixes: 39299bdd2546 ("keys, dns: Allow key types (eg. DNS) to be reclaimed immediately on expiry")
Signed-off-by: Silvio Gissi
cc: David Howells
cc: Hazem Mohamed Abuelfotoh
cc: linux-afs at lists.infradead.org
cc: linux-cifs at vger.kernel.org
cc: keyrings at vger.kernel.org
cc: netdev at vger.kernel.org
cc: stable at vger.kernel.org
Reviewed-by: Jarkko Sakkinen
Signed-off-by: Jarkko Sakkinen
Signed-off-by: Greg Kroah-Hartman
---
 security/keys/key.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/security/keys/key.c
+++ b/security/keys/key.c
@@ -464,7 +464,8 @@ static int __key_instantiate_and_link(st
 			if (authkey)
 				key_invalidate(authkey);
 
-			key_set_expiry(key, prep->expiry);
+			if (prep->expiry != TIME64_MAX)
+				key_set_expiry(key, prep->expiry);
 		}
 	}
 

Patches currently in stable-queue which might be from sifonsec at amazon.com are

queue-6.8/keys-fix-overwrite-of-key-expiration-on-instantiation.patch
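
Editorial illustration (not part of the patch): the effect of the
one-line change above can be modelled in user space. The sketch below is
only a simplified stand-in, not kernel code. The struct layouts, the
time64_t typedef and the two helper names instantiate_unconditional()
and instantiate_conditional() are hypothetical and exist only for this
example; only key_set_expiry, prep->expiry and TIME64_MAX come from the
patch itself. It assumes, as the commit message describes, that
user space has already set an expiry on the key before instantiation and
that the pre-parser leaves prep->expiry at TIME64_MAX when it has no
specific expiry.

```c
#include <stdint.h>

/* Simplified stand-ins for the kernel types (assumption, not the real layout). */
typedef int64_t time64_t;
#define TIME64_MAX INT64_MAX	/* models "no specific expiry" / permanent */

struct key {
	time64_t expiry;	/* expiry currently recorded on the key */
};

struct key_preparsed_payload {
	time64_t expiry;	/* expiry requested by the pre-parser, if any */
};

/* Minimal model of key_set_expiry(): just record the new expiry. */
static void key_set_expiry(struct key *key, time64_t expiry)
{
	key->expiry = expiry;
}

/* Hypothetical helper: behaviour before the fix, always overwrite. */
static void instantiate_unconditional(struct key *key,
				      const struct key_preparsed_payload *prep)
{
	key_set_expiry(key, prep->expiry);
}

/* Hypothetical helper: behaviour after the fix, overwrite only when
 * the pre-parser supplied a specific expiry. */
static void instantiate_conditional(struct key *key,
				    const struct key_preparsed_payload *prep)
{
	if (prep->expiry != TIME64_MAX)
		key_set_expiry(key, prep->expiry);
}
```

With a key whose expiry was set to some finite value and a prep whose
expiry is TIME64_MAX, the unconditional variant turns the key permanent,
while the conditional variant leaves the earlier expiry in place; a
finite prep->expiry is applied by both.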