We got a report that the ThinLinc client hangs on connect when used on Debian 10 (Buster). Specifically, it is ssh that locks up somewhere early in the process, and these are the last lines in the log:
> 2018-11-16T21:58:52: ssh[E]: CONFIRM HOST KEY: xxx
> 2018-11-16T21:58:53: User accepted the new host key.
> 2018-11-16T21:58:53: Storing new host key for xxx.
After some digging it turns out that ssh tries to connect to a UNIX socket whose address consists only of "\0" bytes. This is apparently a valid address, but on most systems there isn't anything listening on it. However, on the customer system "irqbalance" is listening on this address.
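To check whether a given machine has something listening on that address, a probe along these lines might help. This is only a sketch under assumptions: it is Linux-only, it assumes the usual 108-byte sun_path so that the all-zero address is 108 "\0" bytes in the abstract namespace, and the function name is made up for illustration.

```python
import socket

# Assumption: Linux sockaddr_un has a 108-byte sun_path. A leading NUL byte
# selects the abstract namespace, so this is the "all zeros" address.
ALL_ZERO_ADDRESS = b"\0" * 108


def something_listens_on_zero_address(timeout=1.0):
    """Return True if a connect to the all-zero UNIX address succeeds."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(ALL_ZERO_ADDRESS)
        except OSError:
            # Connection refused, timeout, or unsupported platform:
            # treat all of these as "nothing usable is listening".
            return False
        return True


if __name__ == "__main__":
    print(something_listens_on_zero_address())
```

If this prints True on a system, that system is probably affected in the same way as the customer's.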
We don't know why ssh tries to connect here yet, but it has something to do with SSH agent support.
We also tried reproducing it here but initially failed. It turned out we did not get irqbalance installed by default. As soon as we installed it things broke for us as well. We don't know why irqbalance is only installed in some cases.
It was also reported on our mailing list:
It seems like a bug in irqbalance that was introduced here:
So any distribution that uses 1.5.0 or newer will be affected.
I've filed a bug with them here:
We still need to fix ssh so it doesn't attempt to connect to that bogus address though.
It looks like this was caused by r23835 for bug 3430. In that commit we tried to make sure our ssh wouldn't use any random ssh agent, and we did so by setting $SSH_AUTH_SOCK to "". Apparently ssh does not interpret "" as "no agent" and never has. Instead it tries to connect to "", which ends up as an address consisting only of "\0" bytes.
It looks like what we need to do is completely remove $SSH_AUTH_SOCK, not just empty it.
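As an illustration of the difference between emptying and removing the variable, here is a minimal sketch of building a child environment with $SSH_AUTH_SOCK dropped entirely. The helper name is hypothetical and this is not the actual client code, just the general idea.

```python
import os


def env_without_agent(base_env=None):
    """Return a copy of the environment with SSH_AUTH_SOCK removed.

    Hypothetical helper for illustration. The variable is deleted rather
    than set to "", since ssh treats "" as a socket path to connect to.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.pop("SSH_AUTH_SOCK", None)  # no error if it was never set
    return env


if __name__ == "__main__":
    child_env = env_without_agent()
    print("SSH_AUTH_SOCK" in child_env)
```

The resulting dict can then be passed as the env argument when spawning ssh, for example via subprocess.Popen.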
Created attachment 899
Test program that provokes the bug. Simply run this on your client and tlclient will hang on connect.
The behaviour is still present in latest OpenSSH, so I reported a bug here:
The changed code affects passwords and public keys, so I tested password and both our public key methods ("regular" and smart card) on Linux, Windows and macOS.
I could no longer provoke the bug on Linux, and I could not see any regressions on any platform.
Verified that thinlinc-client_4.9.0post-5988_amd64 does work where thinlinc-client_4.9.0-5775_amd64 failed to start on Debian Buster.
The release notes are fine.