Restarting ocserv doesn't clean up all workers

Nikos Mavrogiannopoulos nmav at gnutls.org
Sat Oct 4 12:17:16 PDT 2014


On Sat, 2014-10-04 at 21:36 +0800, Niels Peen wrote:

> > Seeing it again there may be an issue in the way waitpid() is handled.
> > That should fix it:
> > http://git.infradead.org/ocserv.git/commitdiff/accdb24050a1de06c0408c9d783aa0575e35e831
> The problem seems to start earlier than that. When I left the server untouched for a while 
> users started complaining that logins were rejected. The log indicates that ocserv believes 
> these users are already connected the maximum number of times:
> ocserv[21306]: main: 1.2.3.4:58843 user 'XYZ' tried to connect more than 2 times
> Doing a tcpdump on the interfaces created for this user showed no traffic at all, which is 
> unlikely if something was truly connected.

So, if I understand correctly, there was a user connection at some
point, which got stuck?

> Just to be sure I also did a tcpdump on the external interface to see if there was at least
> some DPD traffic going to the client’s IP. There was none. 
> Typical strace for these processes:
> strace -p 21306
> Process 21306 attached - interrupt to quit
> recvfrom(8, 

There are numerous places where this could occur. Would it be possible
to attach gdb to one of the stuck processes and get a full backtrace:
$ gdb /usr/sbin/ocserv 21306
(gdb) bt full

> These are also the processes that didn’t die after I restart ocserv.

There is a debug-level log message which should say:
"removing client 'XYZ' with id 'PID'"

Do you see that message for these specific clients?
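
For reference, on the waitpid() handling mentioned at the top: the
following is only a minimal sketch of the general technique, not the
actual ocserv code and not the content of the commit linked above. The
names on_sigchld, install_sigchld_handler and reap_children are
hypothetical. The point it illustrates is that several child exits can
be reported by a single SIGCHLD, so the parent has to call waitpid() in
a loop until nothing is left to reap; otherwise some exited workers are
never cleaned up.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set from the SIGCHLD handler; meant to be checked in the main loop. */
static volatile sig_atomic_t got_sigchld = 0;

static void on_sigchld(int signo)
{
    (void)signo;
    got_sigchld = 1;
}

/* Hypothetical helper: install the handler above. Returns 0 on success. */
int install_sigchld_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Hypothetical helper: reap every child that has already exited and
 * return how many were reaped. A single waitpid() call per signal is
 * not enough, because pending SIGCHLDs are not queued individually. */
int reap_children(void)
{
    int reaped = 0;
    int status;
    pid_t pid;

    /* Clear the flag first so an exit during the loop is not lost. */
    got_sigchld = 0;

    for (;;) {
        pid = waitpid(-1, &status, WNOHANG);
        if (pid > 0) {
            reaped++;   /* per-client cleanup would go here */
            continue;
        }
        if (pid == -1 && errno == EINTR)
            continue;   /* interrupted, try again */
        break;          /* 0: children still running; ECHILD: none left */
    }
    return reaped;
}
```

In a main loop one would call reap_children() whenever got_sigchld is
set, and do the per-client bookkeeping for each reaped PID.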

regards,
Nikos





More information about the openconnect-devel mailing list