www.cendio.com
Bug 7285 - ssh hangs on connect on Debian 10 (Buster)
Status: CLOSED FIXED
Product: ThinLinc
Component: Client
Version: 1.3.1
Hardware: PC Unknown
Importance: P2 Normal
Target Milestone: 4.10.0
Reported: 2018-11-28 15:30
Modified: 2018-12-18 16:31


Attachments
test program (228 bytes, text/x-python)
2018-11-29 09:56, Pierre Ossman




Description From cendio 2018-11-28 15:30:09
We got a report that the ThinLinc client hangs on connect when used on Debian
10 (Buster). Specifically, it is ssh that locks up somewhere early in the
process, and these are the last lines in the log:

> 2018-11-16T21:58:52: ssh[E]: CONFIRM HOST KEY: xxx
> 2018-11-16T21:58:53: User accepted the new host key.
> 2018-11-16T21:58:53: Storing new host key for xxx.
------- Comment #1 From cendio 2018-11-28 15:34:31 -------
After some digging it turns out that ssh tries to connect to a UNIX socket whose
address consists only of NUL ("\0") bytes. This is apparently a valid address,
but on most systems nothing is listening on it. On the customer system, however,
"irqbalance" is listening on this address.

We don't know why ssh tries to connect here yet, but it has something to do
with SSH agent support.

We also tried reproducing it here but initially failed. It turned out we did
not get irqbalance installed by default. As soon as we installed it things
broke for us as well. We don't know why irqbalance is only installed in some
cases.
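
To check whether a given machine is affected, something like the following
could be used. This is only a sketch and not code from ThinLinc or OpenSSH. It
assumes Linux, where sun_path is 108 bytes and a leading NUL byte selects the
abstract socket namespace, and it assumes the address ssh ends up with is the
whole sun_path filled with NUL bytes. The names ALL_NUL_ADDRESS and
something_is_listening are made up for illustration.

```python
import socket

# Assumption: on Linux, sun_path is 108 bytes long and a leading NUL byte
# selects the abstract namespace, so an all-NUL path is an abstract name.
ALL_NUL_ADDRESS = b"\0" * 108


def something_is_listening(address=ALL_NUL_ADDRESS, timeout=1.0):
    """Return True if a stream connect to the UNIX address succeeds.

    Returns False if the connection is refused, i.e. nothing is bound
    there. Any other error (including a timeout) is propagated.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(address)
        return True
    except ConnectionRefusedError:
        return False
    finally:
        sock.close()
```

On a system without an affected irqbalance, something_is_listening() would be
expected to return False.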
------- Comment #3 From cendio 2018-11-28 15:38:08 -------
It was also reported on our mailing list:

http://lists.cendio.se/pipermail/thinlinc-technical/2018-November/005887.html
------- Comment #4 From cendio 2018-11-28 15:39:37 -------
It seems like a bug in irqbalance that was introduced here:

https://github.com/Irqbalance/irqbalance/commit/19c25ddc5a13cf0b993cdb0edac0eee80143be34

So any distribution that uses 1.5.0 or newer will be affected.

I've filed a bug with them here:

https://github.com/Irqbalance/irqbalance/issues/85

We still need to fix ssh so it doesn't attempt to connect to that bogus address
though.
------- Comment #5 From cendio 2018-11-29 09:55:07 -------
It looks like this was caused by r23835 for bug 3430. In that commit we tried
to make sure our ssh wouldn't use any random ssh agent, and we did so by
setting $SSH_AUTH_SOCK to "". Apparently ssh does not interpret "" as "no
agent" and never has. Instead it tries to connect to "", which ends up as an
address consisting only of NUL ("\0") bytes.

It looks like what we need to do is completely remove $SSH_AUTH_SOCK, not just
empty it.
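
As an illustration of the difference between emptying and removing the
variable, a child environment could be prepared roughly like this. This is a
minimal sketch, not the actual change made in the client; the helper name
env_without_agent is hypothetical.

```python
import os


def env_without_agent(base_env=None):
    """Return a copy of the environment with SSH_AUTH_SOCK removed.

    The variable is deleted rather than set to "", since ssh treats an
    empty value as a socket path to connect to, not as "no agent".
    """
    env = dict(os.environ if base_env is None else base_env)
    env.pop("SSH_AUTH_SOCK", None)  # no error if it was not set
    return env
```

The result could then be passed as the env argument when starting ssh, for
example subprocess.Popen(["ssh", ...], env=env_without_agent()).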
------- Comment #6 From cendio 2018-11-29 09:56:01 -------
Created an attachment (id=899) [details]
test program

Test program that provokes the bug. Simply run this on your client and tlclient
will hang on connect.
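
The attachment itself is not reproduced here. For readers without access to it,
a program with a similar effect might look like the sketch below. This is a
guess at the technique, not the attached file: it assumes Linux, a 108-byte
sun_path, and that the address ssh connects to is that path filled entirely
with NUL bytes. The name open_silent_listener is made up for illustration.

```python
import socket

# Assumption: 108 is the size of sun_path on Linux. A path starting with a
# NUL byte lives in the abstract namespace, so no file is created.
ALL_NUL_ADDRESS = b"\0" * 108


def open_silent_listener(address=ALL_NUL_ADDRESS):
    """Bind and listen on a UNIX stream socket without ever replying.

    Clients can connect (they are queued in the backlog), but nothing
    answers them, so a client waiting for a reply blocks.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(address)
        sock.listen(5)
    except OSError:
        sock.close()
        raise
    return sock
```

To try it, call open_silent_listener() and keep the returned socket open (for
example by waiting on input()) while starting tlclient. The bind fails if
something else, such as an affected irqbalance, already holds the address.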
------- Comment #7 From cendio 2018-11-29 13:51:22 -------
The behaviour is still present in the latest OpenSSH, so I reported a bug here:

https://bugzilla.mindrot.org/show_bug.cgi?id=2936
------- Comment #10 From cendio 2018-11-30 14:19:03 -------
The changed code affects passwords and public keys, so I tested password and
both our public key methods ("regular" and smart card) on Linux, Windows and
macOS.

I could no longer provoke the bug on Linux, and I could not see any regressions
on any platform.
------- Comment #11 From cendio 2018-12-18 16:31:35 -------
Verified that thinlinc-client_4.9.0post-5988_amd64 does work where
thinlinc-client_4.9.0-5775_amd64 failed to start on Debian Buster.

The release notes are fine.