[openwrt/openwrt] kernel: scale nf_conntrack_max more reasonably
LEDE Commits
lede-commits at lists.infradead.org
Thu Aug 11 13:54:57 PDT 2022
rsalvaterra pushed a commit to openwrt/openwrt.git, branch openwrt-22.03:
https://git.openwrt.org/0855549b4bdfb7ff0aacfcfe888919c4060ed102
commit 0855549b4bdfb7ff0aacfcfe888919c4060ed102
Author: Vincent Pelletier <plr.vincent at gmail.com>
AuthorDate: Sat Feb 19 02:06:23 2022 +0000
kernel: scale nf_conntrack_max more reasonably
Use the kernel's built-in formula for computing this value.
The value applied by OpenWrt's sysctl configuration file does not scale
with the available memory, under-using hardware capabilities.
That formula also influences net.netfilter.nf_conntrack_buckets,
which should improve conntrack performance on average (fewer connections
per hashtable bucket).
Backport the upstream commit that reduces the number of connections per
hashtable bucket.
Apply a hack patch to set the RAM size divisor to a more reasonable value (2048,
down from 16384) for our use case, a typical router handling several thousand
connections.
Signed-off-by: Vincent Pelletier <plr.vincent at gmail.com>
Signed-off-by: Rui Salvaterra <rsalvaterra at gmail.com>
(cherry picked from commit 15fbb916669dcdfcc706e9e75263ab63f9f27c00)
---
.../kernel/linux/files/sysctl-nf-conntrack.conf | 1 -
...onntrack-sanitize-table-size-default-sett.patch | 100 +++++++++++++++++++++
...nel-ct-size-the-hashtable-more-adequately.patch | 25 ++++++
3 files changed, 125 insertions(+), 1 deletion(-)
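The effect of the two patches on the default table size can be approximated with a small userspace calculation. The sketch below is not kernel code: it mirrors the arithmetic visible in the nf_conntrack_init_start() hunks of this commit, under the assumptions that PAGE_SIZE is 4096, that sizeof(struct hlist_head) equals the pointer size (4 bytes on 32-bit, 8 bytes on 64-bit), and that all RAM is counted as available pages. The function names and parameters are hypothetical, and any rounding the kernel may apply later when allocating the table is not modeled.

```python
# Illustrative sketch only; names and parameters are hypothetical.
PAGE_SIZE = 4096  # assumption: 4 KiB pages
GIB_PAGES = 1024 * 1024 * 1024 // PAGE_SIZE  # number of pages in 1 GiB


def default_htable_size(ram_bytes, pointer_size=4, divisor=2048):
    """Approximate the default nf_conntrack_buckets value.

    pointer_size stands in for sizeof(struct hlist_head);
    divisor is 16384 upstream and 2048 with the hack patch.
    """
    nr_pages = ram_bytes // PAGE_SIZE
    size = (nr_pages * PAGE_SIZE) // divisor // pointer_size
    # Fixed sizes for large-memory systems, as in the backported patch.
    if pointer_size >= 8 and nr_pages > 4 * GIB_PAGES:
        size = 262144
    elif nr_pages > GIB_PAGES:
        size = 65536
    # Lower bound introduced by the backported patch.
    if size < 1024:
        size = 1024
    return size


def default_conntrack_max(ram_bytes, pointer_size=4, divisor=2048):
    """With max_factor = 1, nf_conntrack_max equals the bucket count."""
    max_factor = 1
    return max_factor * default_htable_size(ram_bytes, pointer_size, divisor)


if __name__ == "__main__":
    MIB = 1024 * 1024
    for ram_mib in (64, 128, 256, 512):
        ram = ram_mib * MIB
        print(ram_mib, "MiB, 32-bit:",
              "divisor 16384 ->", default_conntrack_max(ram, 4, 16384),
              "| divisor 2048 ->", default_conntrack_max(ram, 4, 2048))
```

Under these assumptions, a 32-bit device with 128 MiB of RAM ends up with 16384 buckets and the same nf_conntrack_max as the removed sysctl line, while devices with more memory scale upwards. With only the upstream divisor of 16384, the same device would default to 2048.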
diff --git a/package/kernel/linux/files/sysctl-nf-conntrack.conf b/package/kernel/linux/files/sysctl-nf-conntrack.conf
index 37baf5fd6f..c6a0ef362b 100644
--- a/package/kernel/linux/files/sysctl-nf-conntrack.conf
+++ b/package/kernel/linux/files/sysctl-nf-conntrack.conf
@@ -3,7 +3,6 @@
net.netfilter.nf_conntrack_acct=1
net.netfilter.nf_conntrack_checksum=0
-net.netfilter.nf_conntrack_max=16384
net.netfilter.nf_conntrack_tcp_timeout_established=7440
net.netfilter.nf_conntrack_udp_timeout=60
net.netfilter.nf_conntrack_udp_timeout_stream=180
diff --git a/target/linux/generic/backport-5.10/612-v5.15-netfilter-conntrack-sanitize-table-size-default-sett.patch b/target/linux/generic/backport-5.10/612-v5.15-netfilter-conntrack-sanitize-table-size-default-sett.patch
new file mode 100644
index 0000000000..55bf0f612b
--- /dev/null
+++ b/target/linux/generic/backport-5.10/612-v5.15-netfilter-conntrack-sanitize-table-size-default-sett.patch
@@ -0,0 +1,100 @@
+From d532bcd0b2699d84d71a0c71d37157ac6eb3be25 Mon Sep 17 00:00:00 2001
+Message-Id: <d532bcd0b2699d84d71a0c71d37157ac6eb3be25.1645246598.git.plr.vincent at gmail.com>
+From: Florian Westphal <fw at strlen.de>
+Date: Thu, 26 Aug 2021 15:54:19 +0200
+Subject: [PATCH] netfilter: conntrack: sanitize table size default settings
+
+conntrack has two distinct table size settings:
+nf_conntrack_max and nf_conntrack_buckets.
+
+The former limits how many conntrack objects are allowed to exist
+in each namespace.
+
+The second sets the size of the hashtable.
+
+As all entries are inserted twice (once for original direction, once for
+reply), there should be at least twice as many buckets in the table than
+the maximum number of conntrack objects that can exist at the same time.
+
+Change the default multiplier to 1 and increase the chosen bucket sizes.
+This results in the same nf_conntrack_max settings as before but reduces
+the average bucket list length.
+
+Signed-off-by: Florian Westphal <fw at strlen.de>
+Signed-off-by: Pablo Neira Ayuso <pablo at netfilter.org>
+---
+ .../networking/nf_conntrack-sysctl.rst | 13 ++++----
+ net/netfilter/nf_conntrack_core.c | 30 +++++++++----------
+ 2 files changed, 22 insertions(+), 21 deletions(-)
+
+--- a/Documentation/networking/nf_conntrack-sysctl.rst
++++ b/Documentation/networking/nf_conntrack-sysctl.rst
+@@ -17,9 +17,8 @@ nf_conntrack_acct - BOOLEAN
+ nf_conntrack_buckets - INTEGER
+ Size of hash table. If not specified as parameter during module
+ loading, the default size is calculated by dividing total memory
+- by 16384 to determine the number of buckets but the hash table will
+- never have fewer than 32 and limited to 16384 buckets. For systems
+- with more than 4GB of memory it will be 65536 buckets.
++ by 16384 to determine the number of buckets. The hash table will
++ never have fewer than 1024 and never more than 262144 buckets.
+ This sysctl is only writeable in the initial net namespace.
+
+ nf_conntrack_checksum - BOOLEAN
+@@ -100,8 +99,12 @@ nf_conntrack_log_invalid - INTEGER
+ Log invalid packets of a type specified by value.
+
+ nf_conntrack_max - INTEGER
+- Size of connection tracking table. Default value is
+- nf_conntrack_buckets value * 4.
++ Maximum number of allowed connection tracking entries. This value is set
++ to nf_conntrack_buckets by default.
++ Note that connection tracking entries are added to the table twice -- once
++ for the original direction and once for the reply direction (i.e., with
++ the reversed address). This means that with default settings a maxed-out
++ table will have a average hash chain length of 2, not 1.
+
+ nf_conntrack_tcp_be_liberal - BOOLEAN
+ - 0 - disabled (default)
+--- a/net/netfilter/nf_conntrack_core.c
++++ b/net/netfilter/nf_conntrack_core.c
+@@ -2575,26 +2575,24 @@ int nf_conntrack_init_start(void)
+ spin_lock_init(&nf_conntrack_locks[i]);
+
+ if (!nf_conntrack_htable_size) {
+- /* Idea from tcp.c: use 1/16384 of memory.
+- * On i386: 32MB machine has 512 buckets.
+- * >= 1GB machines have 16384 buckets.
+- * >= 4GB machines have 65536 buckets.
+- */
+ nf_conntrack_htable_size
+ = (((nr_pages << PAGE_SHIFT) / 16384)
+ / sizeof(struct hlist_head));
+- if (nr_pages > (4 * (1024 * 1024 * 1024 / PAGE_SIZE)))
+- nf_conntrack_htable_size = 65536;
++ if (BITS_PER_LONG >= 64 &&
++ nr_pages > (4 * (1024 * 1024 * 1024 / PAGE_SIZE)))
++ nf_conntrack_htable_size = 262144;
+ else if (nr_pages > (1024 * 1024 * 1024 / PAGE_SIZE))
+- nf_conntrack_htable_size = 16384;
+- if (nf_conntrack_htable_size < 32)
+- nf_conntrack_htable_size = 32;
++ nf_conntrack_htable_size = 65536;
+
+- /* Use a max. factor of four by default to get the same max as
+- * with the old struct list_heads. When a table size is given
+- * we use the old value of 8 to avoid reducing the max.
+- * entries. */
+- max_factor = 4;
++ if (nf_conntrack_htable_size < 1024)
++ nf_conntrack_htable_size = 1024;
++ /* Use a max. factor of one by default to keep the average
++ * hash chain length at 2 entries. Each entry has to be added
++ * twice (once for original direction, once for reply).
++ * When a table size is given we use the old value of 8 to
++ * avoid implicit reduction of the max entries setting.
++ */
++ max_factor = 1;
+ }
+
+ nf_conntrack_hash = nf_ct_alloc_hashtable(&nf_conntrack_htable_size, 1);
diff --git a/target/linux/generic/hack-5.10/661-kernel-ct-size-the-hashtable-more-adequately.patch b/target/linux/generic/hack-5.10/661-kernel-ct-size-the-hashtable-more-adequately.patch
new file mode 100644
index 0000000000..dd67c76b13
--- /dev/null
+++ b/target/linux/generic/hack-5.10/661-kernel-ct-size-the-hashtable-more-adequately.patch
@@ -0,0 +1,25 @@
+From 804fbb3f2ec9283f7b778e057a68bfff440a0be6 Mon Sep 17 00:00:00 2001
+From: Rui Salvaterra <rsalvaterra at gmail.com>
+Date: Wed, 30 Mar 2022 22:51:55 +0100
+Subject: [PATCH] kernel: ct: size the hashtable more adequately
+
+To set the default size of the connection tracking hash table, a divider of
+16384 becomes inadequate for a router handling lots of connections. Divide by
+2048 instead, making the default size scale better with the available RAM.
+
+Signed-off-by: Rui Salvaterra <rsalvaterra at gmail.com>
+---
+ net/netfilter/nf_conntrack_core.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+--- a/net/netfilter/nf_conntrack_core.c
++++ b/net/netfilter/nf_conntrack_core.c
+@@ -2576,7 +2576,7 @@ int nf_conntrack_init_start(void)
+
+ if (!nf_conntrack_htable_size) {
+ nf_conntrack_htable_size
+- = (((nr_pages << PAGE_SHIFT) / 16384)
++ = (((nr_pages << PAGE_SHIFT) / 2048)
+ / sizeof(struct hlist_head));
+ if (BITS_PER_LONG >= 64 &&
+ nr_pages > (4 * (1024 * 1024 * 1024 / PAGE_SIZE)))