Restarting ocserv doesn't clean up all workers
Niels Peen
niels at peen.ch
Sun Oct 5 09:43:00 PDT 2014
> On 05 Oct 2014, at 03:17, Nikos Mavrogiannopoulos <nmav at gnutls.org> wrote:
>
> So, if I understand correctly, there was a user connection at some
> point, which go stuck?
Yes. As far as I can tell these are worker processes that handle a user’s connection. At some point the user disconnects (or loses signal - many of the disconnects are unintentional) and the worker doesn’t get killed. Looking at today’s log It happens to about 1 in 400 workers.
> There are numerous places where this could occur. Would it be possible
> to run:
> $ gdb /usr/sbin/ocserv 21306
> $ bt full
Hope this helps:
#0 0x00007f00907fcc62 in recv () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1 0x00007f0091538e8d in ?? () from /usr/lib/x86_64-linux-gnu/libgnutls-deb0.so.28
No symbol table info available.
#2 0x00007f0091533db2 in ?? () from /usr/lib/x86_64-linux-gnu/libgnutls-deb0.so.28
No symbol table info available.
#3 0x00007f0091535119 in _gnutls_recv_int () from /usr/lib/x86_64-linux-gnu/libgnutls-deb0.so.28
No symbol table info available.
#4 0x00007f009153575d in gnutls_record_recv () from /usr/lib/x86_64-linux-gnu/libgnutls-deb0.so.28
No symbol table info available.
#5 0x000000000040fdb5 in cstp_recv_nb (ws=ws at entry=0xa803a0, data=data at entry=0xa807d4, data_size=<optimized out>) at tlslib.c:144
ret = <optimized out>
#6 0x000000000040afc4 in tls_mainloop (tnow=0x7fffff131b70, ws=0xa803a0) at worker-vpn.c:1210
ret = <optimized out>
l = <optimized out>
#7 connect_handler (ws=ws at entry=0xa803a0) at worker-vpn.c:1964
req = 0xa803b8
rfds = {fds_bits = {256, 0 <repeats 15 times>}}
e = <optimized out>
max = 16
ret = <optimized out>
t = 2048
p = <optimized out>
tv = {tv_sec = 10, tv_nsec = 0}
tls_pending = 0
dtls_pending = 0
i = <optimized out>
tnow = {tv_sec = 1412487975, tv_nsec = 300402000}
proto_overhead = 28
ip6 = <optimized out>
sl = 4
emptyset = {__val = {0 <repeats 16 times>}}
blockset = {__val = {16384, 0 <repeats 15 times>}}
#8 0x000000000040cc0c in vpn_server (ws=ws at entry=0xa803a0) at worker-vpn.c:829
buf = "CONNECT /CSCOSSLC/tunnel HTTP/1.1\r\nHost: XXXX\r\nUser-Agent: OpenConnect VPN Agent (Java) v6.00-unknown\r\nCookie: webvpn=sQxf2ORWgUJY3DIMhgvKTBpcZj9ab1C1XTxN6kR8U/LSiwXPQgk+1oovHfkft6Az0Q3A3A3Lb"...
ret = <optimized out>
nparsed = <optimized out>
nrecvd = <optimized out>
session = 0xa7df90
parser = {type = 0, flags = 0, state = 18, header_state = 0, index = 0, nread = 0, content_length = 18446744073709551615, http_major = 1, http_minor = 1, status_code = 0, method = 5, http_errno = 0,
upgrade = 1, data = 0xa803a0}
settings = {on_message_begin = 0, on_url = 0x409030 <url_cb>, on_status = 0, on_header_field = 0x4097f0 <header_field_cb>, on_header_value = 0x409870 <header_value_cb>,
on_headers_complete = 0x4097d0 <header_complete_cb>, on_body = 0x409130 <body_cb>, on_message_complete = 0x408f00 <message_complete_cb>}
requests_left = <optimized out>
#9 0x000000000040772f in main (argc=<optimized out>, argv=<optimized out>) at main.c:1182
stype = 0
fd = 8
pid = <optimized out>
e = <optimized out>
ltmp = <optimized out>
ctmp = <optimized out>
cpos = <optimized out>
req = 0xa803b8
rfds = {fds_bits = {256, 0 <repeats 15 times>}}
e = <optimized out>
max = 16
ret = <optimized out>
t = 2048
p = <optimized out>
tv = {tv_sec = 10, tv_nsec = 0}
tls_pending = 0
dtls_pending = 0
i = <optimized out>
tnow = {tv_sec = 1412487975, tv_nsec = 300402000}
proto_overhead = 28
ip6 = <optimized out>
sl = 4
emptyset = {__val = {0 <repeats 16 times>}}
blockset = {__val = {16384, 0 <repeats 15 times>}}
#8 0x000000000040cc0c in vpn_server (ws=ws at entry=0xa803a0) at worker-vpn.c:829
buf = "CONNECT /CSCOSSLC/tunnel HTTP/1.1\r\nHost: XXXX\r\nUser-Agent: OpenConnect VPN Agent (Java) v6.00-unknown\r\nCookie: webvpn=sQxf2ORWgUJY3DIMhgvKTBpcZj9ab1C1XTxN6kR8U/LSiwXPQgk+1oovHfkft6Az0Q3A3A3Lb"...
ret = <optimized out>
nparsed = <optimized out>
nrecvd = <optimized out>
session = 0xa7df90
parser = {type = 0, flags = 0, state = 18, header_state = 0, index = 0, nread = 0, content_length = 18446744073709551615, http_major = 1, http_minor = 1, status_code = 0, method = 5, http_errno = 0,
upgrade = 1, data = 0xa803a0}
settings = {on_message_begin = 0, on_url = 0x409030 <url_cb>, on_status = 0, on_header_field = 0x4097f0 <header_field_cb>, on_header_value = 0x409870 <header_value_cb>,
on_headers_complete = 0x4097d0 <header_complete_cb>, on_body = 0x409130 <body_cb>, on_message_complete = 0x408f00 <message_complete_cb>}
requests_left = <optimized out>
#9 0x000000000040772f in main (argc=<optimized out>, argv=<optimized out>) at main.c:1182
stype = 0
fd = 8
pid = <optimized out>
e = <optimized out>
ltmp = <optimized out>
ctmp = <optimized out>
cpos = <optimized out>
rd_set = {fds_bits = {16, 0 <repeats 15 times>}}
wr_set = {fds_bits = {0 <repeats 16 times>}}
n = 16
ret = <optimized out>
flags = <optimized out>
ts = {tv_sec = 30, tv_nsec = 0}
cmd_fd = {15, 16}
ws = 0xa803a0
worker_pool = <optimized out>
main_pool = 0xa764c0
set = <optimized out>
s = 0xa80600
emptyset = {__val = {0 <repeats 16 times>}}
blockset = {__val = {90115, 0 <repeats 15 times>}}
creds = {xcred = 0xa778a0, cprio = 0xa7c840, dh_params = 0xa5d760}
> There is a debug level log which should say:
> "removing client 'XYZ' with id 'PID'
>
> Do you see that message for these specific clients?
I’ll have to enable debug logging. Will do that next time I restart. (Don’t want to upset the users too often.)
Regards,
Niels
More information about the openconnect-devel
mailing list