www.cendio.com
Bug 7089 - too large XML-RPC messages hang Windows client
Status: CLOSED FIXED
Product: ThinLinc
Component: Client
Version: 1.3.1
Platform: PC Unknown
Importance: P2 Normal
Target Milestone: 4.9.0
Reported: 2017-12-14 16:39
Modified: 2017-12-29 16:27

Description From cendio 2017-12-14 16:39:22
We have gotten a report that the Windows client will lock up if the user has
too many sessions running. The last line of the log is:

> 2017-12-14T16:06:55: Calling XML-RPC method 'get_user_sessions'

The problem does not occur on other platforms.

The issue seems to be with the IPC between ssh and tlclient. Some more debug
logging from XML-RPC shows this on Windows:

> 2017-12-14T16:06:55: XmlRpcSocket::nbRead: read/recv returned 4095.
> 2017-12-14T16:06:55: XmlRpcClient::readHeader: client has read 4095 bytes
> 2017-12-14T16:06:55: client read content length: 11205

Whereas on Linux the read continues and does not hang there:

> 2017-12-14T15:20:24: XmlRpcSocket::nbRead: read/recv returned 4095.
> 2017-12-14T15:20:24: XmlRpcClient::readHeader: client has read 4095 bytes
> 2017-12-14T15:20:24: client read content length: 11205
> 2017-12-14T15:20:24: XmlRpcSocket::nbRead: read/recv returned 4095.
> 2017-12-14T15:20:24: XmlRpcSocket::nbRead: read/recv returned 3057.
> 2017-12-14T15:20:24: XmlRpcClient::readResponse (read 11205 bytes)
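
The Linux trace shows the pattern the reader depends on: the first read returns the headers plus part of the body, and further reads are issued until the advertised content length has arrived. A minimal sketch of such a loop in Python follows. It is not the actual XML-RPC client code; `read_body`, `CHUNK`, and the use of `io.BytesIO` as a stand-in for the pipe are assumptions made for illustration.

```python
import io

CHUNK = 4095  # read size seen in the log above


def read_body(stream, already_read: bytes, content_length: int) -> bytes:
    """Keep reading until content_length bytes of body have arrived."""
    body = bytearray(already_read)
    while len(body) < content_length:
        data = stream.read(min(CHUNK, content_length - len(body)))
        if not data:
            # EOF before the full body arrived: report it instead of
            # waiting forever for bytes that will never come.
            raise EOFError(f"got {len(body)} of {content_length} bytes")
        body += data
    return bytes(body)


# Hypothetical stand-in for the pipe: an 11205-byte body of which 4000
# bytes were already consumed together with the headers.
payload = b"x" * 11205
stream = io.BytesIO(payload[4000:])
body = read_body(stream, payload[:4000], 11205)
```

If the writer drops data when its pipe is full, a loop like this never reaches the content length on a live pipe, which matches the hang seen on Windows.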

The IPC consists of pipes, and increasing the buffer of the pipes makes
everything start working. So the issue seems to be that we aren't handling full
pipe buffers properly.
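
To illustrate what a full pipe buffer looks like from the writer's side, here is a small POSIX sketch in Python. It is only an analogy: the Windows pipes used by ssh behave differently and report a different error. The sketch puts the write end in non-blocking mode and writes until the kernel refuses more data. The helper name `fill_pipe` is made up for this example.

```python
import os


def fill_pipe(write_fd: int, chunk: bytes = b"x" * 4095) -> int:
    """Write to a non-blocking pipe until it is full; return bytes accepted."""
    os.set_blocking(write_fd, False)
    total = 0
    while True:
        try:
            total += os.write(write_fd, chunk)  # may be a short write
        except BlockingIOError:
            # Buffer is full. The unwritten data must be kept and retried
            # later; discarding it here would truncate the message.
            return total


read_fd, write_fd = os.pipe()
accepted = fill_pipe(write_fd)

# Once the reader drains the pipe, the writer can make progress again.
drained = os.read(read_fd, 65536)
more = os.write(write_fd, b"y" * 100)

os.close(write_fd)
os.close(read_fd)
```

The point of the sketch is the `except` branch: a writer that treats "buffer full" as success loses exactly the bytes the reader is still waiting for.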

The tlclient side of things is very simple, so I don't think the issue is
there. So it's either in ssh, or the data gets dropped by Windows somewhere.
------- Comment #2 From cendio 2017-12-15 16:09:29 -------
More debugging shows that the issue is in ssh. There is no way to check if a
pipe is writeable, so we simply claim it always is. Microsoft's documentation
claims that a pipe should be blocking by default, so the expected behaviour is
intermittent hangs in ssh until tlclient empties the pipe buffer. However, in
practice it is non-blocking and write() returns ENOSPC.

Need to check if the documentation is wrong or if write() is misbehaving.
------- Comment #3 From cendio 2017-12-15 16:44:13 -------
The documentation was wrong. The pipes are non-blocking by default. And setting
them to blocking solves the issue.
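
The behaviour that a blocking pipe gives, where the writer simply stalls until the reader empties the buffer, can be sketched as follows. This is a POSIX illustration in Python, not the change made in ssh; the thread layout and the names `writer` and `PAYLOAD` are assumptions for the example.

```python
import os
import threading

PAYLOAD = b"x" * 200_000  # larger than a typical pipe buffer


def writer(fd: int, data: bytes) -> None:
    """Blocking write loop: os.write may accept only part of the data."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)  # blocks while the pipe is full
        view = view[written:]
    os.close(fd)  # signal EOF to the reader


read_fd, write_fd = os.pipe()  # descriptors from os.pipe() start out blocking
thread = threading.Thread(target=writer, args=(write_fd, PAYLOAD))
thread.start()

received = bytearray()
while True:
    data = os.read(read_fd, 4095)
    if not data:  # EOF: the writer closed its end
        break
    received += data

thread.join()
os.close(read_fd)
```

With blocking writes no error path is needed for a full buffer; the cost is that the writing side pauses whenever the reader falls behind.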
------- Comment #4 From cendio 2017-12-15 16:46:38 -------
Or maybe not... I found some code in ssh that sets things to non-blocking
(haven't checked if it is called yet). However that code also figured out some
way to check the outgoing buffer. There might be room to improve things.
------- Comment #7 From cendio 2017-12-20 10:36:08 -------
Seems to work well now.

Tester should check that the Windows client can connect to a server where the
user already has many (5+) sessions.
------- Comment #8 From cendio 2017-12-29 16:26:48 -------
I could reproduce the issue on Windows 10 with client build 5621 and can verify
that it is fixed in build 5656. I could start 10 sessions as the same user
on the same server without any problem.