tunnel upstream is unusably slow

Ortwin Glück odi at odi.ch
Wed Apr 2 06:55:10 EDT 2014


Hi,

The tunnel throughput I can achieve with openconnect is just around 50
KB/s of upload:

$ scp LSIUtil_1.62.zip user at remote.example.com
LSIUtil_1.62.zip                                              100%
2290KB  63.6KB/s 896.0KB/s   00:36

Downstream is fast however:
$ scp user at remote.example.com:LSIUtil_1.62.zip .
LSIUtil_1.62.zip                                              100%
2290KB   1.1MB/s   1.5MB/s   00:02

It can't be the network (it's 100MBit symmetrical). And with other
connections I have no such problem.

With Wireshark I see that the TCP connection is not saturated at all. I
see that there are huge delays (around 200ms) between an incoming ACK
and the next outgoing data packet. Like the sender is waiting for something.

Also strace shows that the write() by scp returns EAGAIN very often even
though the selector gets a signal that the fd is ready for writing:
[pid 32093] 12:24:27.106894 write(6,
"N\203E\354,\20\371\247\200\277\210\371k\246\300\347\3\237\217\365m@\340\217\3\177\355\216B\353sr"...,
16384) = -1 EAGAIN (Resource temporarily unavailable)
[pid 32093] 12:24:27.107005 write(6,
"N\203E\354,\20\371\247\200\277\210\371k\246\300\347\3\237\217\365m@\340\217\3\177\355\216B\353sr"...,
16384) = 16384
[pid 32093] 12:24:27.107145 write(6,
"\263\25\r\355\305\354\267\311>E\326\243\322\373\356Z\31\355\37W\243\362.\342\0237\361x\230\36:\346"...,
16384) = 16384
[pid 32093] 12:24:27.107282 write(6,
"\r\251=\261\373gq{i\242\210\16\357N\364g\337\377\230Q;\326s\177\242\31\236\207\273Z@\372"...,
16384) = -1 EAGAIN (Resource temporarily unavailable)
[pid 32094] 12:24:27.197132 write(3,
"Y\357\31\355V-\342\337\322\347\317\17\377\243[)\n\377\315v\366(t\375\204\313F\243\367\16\254Q"...,
131456) = 19940
[pid 32093] 12:24:27.198135 write(6,
"\r\251=\261\373gq{i\242\210\16\357N\364g\337\377\230Q;\326s\177\242\31\236\207\273Z@\372"...,
16384) = 16384
[pid 32093] 12:24:27.198429 write(6,
"\247\237\344\320oK\333\207\207\22\251=\345P\376\212\243\17],\177E\246\375\256\276\323\220\277\2\372P"...,
16384) = -1 EAGAIN (Resource temporarily unavailable)
[pid 32093] 12:24:27.198575 write(6,
"\247\237\344\320oK\333\207\207\22\251=\345P\376\212\243\17],\177E\246\375\256\276\323\220\277\2\372P"...,
16384) = 16384
[pid 32093] 12:24:27.198717 write(6,
"\300}\275\322V\273\276\256\2405\373\351\376Ia\177r[\10\376b5\32_\245\374\36\261\362k6\365"...,
16384) = -1 EAGAIN (Resource temporarily unavailable)

Maybe it's just (buggy) control of your selectors that is introducing
the delays, I don't know. But there seems to be a design problem here:

Looking at the code in mainloop.c I see that input and output are
handeled by the same thread:
- all packets are read from SSL and queued (*)
- all queued packets are written to SSL
- up to qlen packets are read from tun and queued (*)
- all queued packets are written to tun
- polls the command fd
- if any step marked with (*) saw a packet, loop, otherwise sleep in
selector before loop

I am not surprised that this introduces anti-symmetric behaviour for up-
and downstream. It should probably be rewritten to have one thread per
direction (like a normal pipe), which of course requires proper locking
for the shared data structures. Then you could actually get rid of the
queue completely (write each packet received on tun directly to SSL and
vice versa), which conveniently also eliminates buffer bloat problems.
By the way I also find it very ugly that each packet causes an malloc()
/ free() pair and copying between the buffers. Eliminating the queue
will also eliminate that.

Unfortunately my C skills are not fresh enough to be of any coding help
here.

Ortwin




More information about the openconnect-devel mailing list