www.cendio.com
Bug 5589 - upgrade PulseAudio
Status: CLOSED FIXED
: ThinLinc
Sound
: pre-1.0
: PC Unknown
: P2 Normal
: 4.5.0
: 4194
Reported: 2015-07-01 16:13 by
Modified: 2015-09-25 14:00 (History)
Attachments
Debug log file from freeze of thinlinc session. (308.09 KB, text/x-log)
2015-08-11 15:13, Henrik Andersson
tlclient.log from eLux RL where audio doesn't work (6.82 KB, text/x-log)
2015-08-21 15:26, Samuel Mannehed


Description From cendio 2015-07-01 16:13:51
We're currently on 4.0 and the latest version is 6.0. We should upgrade to get
any bug fixes and improvements, and to stay up to date with what's happening.
------- Comment #1 From cendio 2015-07-06 13:46:58 -------
We've now upgraded to 6.0 and switched over to the new tunnel modules.
Something is very off though. Using Fedora 22 as both the server and client,
and without bug 4194, I'm getting either complete stall or choppy audio when
trying to use youtube. I'm also getting insane errors in the log:

2015-07-06T13:44:44: pulseaudio[E]: W: [pulseaudio] protocol-native.c: Client
sent non-aligned memblock: index 0, length 16364, frame size: 8
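
The numbers in that warning can be checked by hand: a memblock is frame aligned only if its length is a whole multiple of the frame size. A minimal Python sketch using the values from the log line (the helper name is made up for illustration and is not part of PulseAudio):

```python
def split_into_frames(length, frame_size):
    """Return (whole_frames, leftover_bytes) for a block of `length` bytes."""
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")
    return divmod(length, frame_size)


# Values taken from the log line: length 16364, frame size 8.
frames, leftover = split_into_frames(16364, 8)
print(frames, leftover)  # 2045 4
```

The block holds 2045 whole frames plus 4 stray bytes, i.e. half a frame, which is what the server is complaining about.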
------- Comment #2 From cendio 2015-07-06 16:18:33 -------
I found fixes for this upstream. I'm seeing a few more issues though:

 - No logging of buffer handling in the new modules. Would like at least
changes to buffer size, and underrun/overflow

 - No volume control in the new modules

 - Latency is rather low. This is normally a good thing, but I'm seeing a lot
of underruns on at least the armhf board (with F22) until it forces a
sufficient increase in the buffers. Hopefully we can fix this without screwing
up the low latency on other systems.
------- Comment #3 From cendio 2015-07-06 16:21:18 -------
(In reply to comment #1)
> We've now upgraded to 6.0 and switched over to the new tunnel modules.
> Something is very off though. Using Fedora 22 as both the server and client,
> and without bug 4194, I'm getting either complete stall or choppy audio when
> trying to use youtube. I'm also getting insane errors in the log:
> 
> 2015-07-06T13:44:44: pulseaudio[E]: W: [pulseaudio] protocol-native.c: Client
> sent non-aligned memblock: index 0, length 16364, frame size: 8

Fixed in r30576. See upstream:

https://bugs.freedesktop.org/show_bug.cgi?id=88452
------- Comment #4 From cendio 2015-07-07 09:07:31 -------
(In reply to comment #2)
>  - Latency is rather low. This is normally a good thing, but I'm seeing a lot
> of underruns on at least the armhf board (with F22) until it forces a
> sufficient increase in the buffers. Hopefully we can fix this without screwing
> up the low latency on other systems.

Not even aplay and paplay work reliably on that setup so I'm going to dismiss
that one as a platform issue rather than a ThinLinc bug, even if it happens to
be worse now than with 4.4.0. I'll try another distribution and see if it's
more sane.
------- Comment #5 From cendio 2015-07-07 09:42:22 -------
We also lost X11 support when we moved to the new modules. This time it's a bit
more tricky to solve as the handling is in the common libpulse, and not just
the tunnel modules. I think we'll have to do some dlopen() magic.
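
For readers unfamiliar with the idea, this is roughly what loading an optional library at run time looks like. The sketch uses Python's ctypes, which wraps dlopen() on Linux, as a stand-in for the C code. It is an illustration of the general technique only, not the change that went into libpulse, and the helper name is hypothetical:

```python
import ctypes
import ctypes.util


def try_load(name):
    """Load a shared library at run time; return None if it is unavailable."""
    path = ctypes.util.find_library(name)
    if path is None:
        return None
    try:
        # On Linux this ends up calling dlopen() on the resolved name.
        return ctypes.CDLL(path)
    except OSError:
        return None


x11 = try_load("X11")
if x11 is None:
    print("libX11 not available, X11 property lookup disabled")
else:
    print("libX11 loaded, X11 property lookup can be enabled")
```

The point of the pattern is that the binary has no hard link-time dependency on the library, so it still starts on systems where the library is missing.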
------- Comment #6 From cendio 2015-07-07 11:25:06 -------
(In reply to comment #4)
> (In reply to comment #2)
> >  - Latency is rather low. This is normally a good thing, but I'm seeing a lot
> > of underruns on at least the armhf board (with F22) until it forces a
> > sufficient increase in the buffers. Hopefully we can fix this without screwing
> > up the low latency on other systems.
> 
> Not even aplay and paplay work reliably on that setup so I'm going to dismiss
> that one as a platform issue rather than a ThinLinc bug, even if it happens to
> be worse now than with 4.4.0. I'll try another distribution and see if it's
> more sane.

Works well with the same board but with Xubuntu 13.10 instead.
------- Comment #7 From cendio 2015-07-07 13:07:25 -------
(In reply to comment #2)
> 
>  - No logging of buffer handling in the new modules. Would like at least
> changes to buffer size, and underrun/overflow
> 

r30578. Also sent upstream:

https://bugs.freedesktop.org/show_bug.cgi?id=91257
------- Comment #8 From cendio 2015-07-09 12:34:47 -------
(In reply to comment #2)
> 
>  - No volume control in the new modules
> 

Fixed in r30581 and sent upstream:
https://bugs.freedesktop.org/show_bug.cgi?id=91280
------- Comment #9 From cendio 2015-07-09 14:53:22 -------
(In reply to comment #5)
> We also lost X11 support when we moved to the new modules. This time it's a bit
> more tricky to solve as the handling is in the common libpulse, and not just
> the tunnel modules. I think we'll have to do some dlopen() magic.

Fixed in r30583.
------- Comment #10 From cendio 2015-07-09 15:12:58 -------
Should be all done now.

Tester needs to verify playback and recording on every arch. Preferably test
against an older server in order to avoid influences by bug 4194.

On Linux you also need to test the four modes (auto, pulse, alsa, oss). One
platform is sufficient for this.

Volume control needs to be tested on at least one platform. Remember to test
both playback and recording. Also remember to test both client to server and
server to client. Note that gnome-control-center is buggy when handling
recording streams. pactl can be used for more low level control.

Reading of X11 properties needs to be tested on Linux. E.g. screwing up the
PULSE_SERVER root window property and seeing that this causes us to fail to
connect to the local pulseaudio server.
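
When checking the property by hand, `xprop -root PULSE_SERVER` prints string properties as `NAME(TYPE) = "value"`. A small Python sketch for pulling the value out of such a line, in case a tester wants to script the check. The helper name and the sample value are illustrative, and escaped quotes inside the value are not handled:

```python
import re

# Matches lines such as: PULSE_SERVER(STRING) = "tcp:example.com:4713"
_XPROP_LINE = re.compile(r'^(?P<name>\w+)\((?P<type>\w+)\) = "(?P<value>.*)"$')


def parse_xprop_string(line):
    """Return (name, value) for a string property line, or None if it doesn't match."""
    match = _XPROP_LINE.match(line.strip())
    if match is None:
        return None
    return match.group("name"), match.group("value")


sample = 'PULSE_SERVER(STRING) = "tcp:example.com:4713"'
print(parse_xprop_string(sample))  # ('PULSE_SERVER', 'tcp:example.com:4713')
```

Any line that is not in that form, for example the message xprop prints when the property is unset, makes the helper return None.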
------- Comment #11 From cendio 2015-07-16 16:53:33 -------
It's not working well on the ARM board, even with Xubuntu. It gets into a
buffer underrun state that it has severe difficulties getting out of. Most
easily triggered together with bug 4194.

The problem seems to be that our new PulseAudio is aiming for a much lower
latency. When playing youtube in Firefox I can see that the tunnel sink is
configuring itself for a latency of just 25 ms. Our old client used a fixed
latency of 250 ms. And I guess the board is simply too weak to keep up with the
25 ms latency.

Switching over to ALSA also shows the issue, but that module is smart enough to
compensate. It quickly detects that the configured latency is too low and
increases it (without glitches even). It stabilises slightly under 200 ms.

I see a few options here:

 - Revert behaviour back to a static 250 ms. Not nice as it punishes all decent
platforms.

 - Conditionally force a lower bound on weak platforms. Not sure how to
identify them though. All ARM?

 - Make the tunnel module more like the ALSA one and increase the latency as it
detects problems.
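
To make the third option concrete, here is a rough Python sketch of the idea: start at the low latency and raise it each time an underrun is seen, up to a cap. The 25 ms and 250 ms figures come from the numbers above. The doubling policy and all names are assumptions made for illustration, not what the ALSA module actually does:

```python
class AdaptiveLatency:
    """Toy model: raise the target latency whenever an underrun is detected."""

    def __init__(self, initial_ms=25.0, max_ms=250.0, factor=2.0):
        self.latency_ms = initial_ms
        self.max_ms = max_ms
        self.factor = factor  # assumed growth policy, purely illustrative

    def on_underrun(self):
        """Grow the target latency after an underrun, never past max_ms."""
        self.latency_ms = min(self.latency_ms * self.factor, self.max_ms)
        return self.latency_ms


latency = AdaptiveLatency()
for _ in range(5):
    latency.on_underrun()
print(latency.latency_ms)  # 250.0
```

A fast machine that never underruns stays at 25 ms, while a weak board ratchets up until playback is stable, so decent platforms are not punished.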
------- Comment #12 From cendio 2015-07-17 10:49:24 -------
(In reply to comment #11)
> 
>  - Make the tunnel module more like the ALSA one and increase the latency as it
> detects problems.

A very basic version of this has been implemented in r30610 and sent upstream:

https://bugs.freedesktop.org/show_bug.cgi?id=91370
------- Comment #13 From cendio 2015-08-11 15:13:36 -------
Created an attachment (id=635) [details]
Debug log file from freeze of thinlinc session.

Testing client build 4843 on Fc22 x86_64 workstation against ThinLinc 4.4.0 on
eudemo.thinlinc.com, the following problems occurred:

1. Played a youtube video for >15 minutes and the pulseaudio server died on the
client with the following assert:

  2015-08-11T14:16:17: pulseaudio[E]: E: [tunnel-sink] queue.c: Assertion
'!e->next' failed at pulsecore/queue.c:104, function pa_queue_pop(). Aborting.

2. Tried to reproduce the above issue but failed; the video still played fine
after 40 minutes. I then held down the Tab key in a terminal to produce several
bells, and metacity hung. Killing the client-side pulseaudio server released
the freeze in the ThinLinc session. This time I also ran the ThinLinc client
with debug output, see the attached file.
------- Comment #14 From cendio 2015-08-11 15:25:12 -------
(In reply to comment #13)

> 2. Tried to reproduce the above issue but failed; the video still played fine
> after 40 minutes. I then held down the Tab key in a terminal to produce several
> bells, and metacity hung. Killing the client-side pulseaudio server released
> the freeze in the ThinLinc session. This time I also ran the ThinLinc client
> with debug output, see the attached file.

This also triggers another bug:

The pulseaudio server on the client is running at 100% CPU, and closing
tlclient, which should clean up hung processes, doesn't kill the pulseaudio
server process. Killing the pulseaudio server from a terminal using SIGKILL
works as expected.
------- Comment #15 From cendio 2015-08-11 15:41:05 -------
(In reply to comment #13)

> 
> 2. Tried to reproduce the above issue but failed; the video still played fine
> after 40 minutes. I then held down the Tab key in a terminal to produce several
> bells, and metacity hung. Killing the client-side pulseaudio server released
> the freeze in the ThinLinc session. This time I also ran the ThinLinc client
> with debug output, see the attached file.

This is not reproducible using a nightly build of the client on Windows.
------- Comment #16 From cendio 2015-08-11 15:45:21 -------
(In reply to comment #13)

> 2. Tried to reproduce the above issue but failed; the video still played fine
> after 40 minutes. I then held down the Tab key in a terminal to produce several
> bells, and metacity hung. Killing the client-side pulseaudio server released
> the freeze in the ThinLinc session. This time I also ran the ThinLinc client
> with debug output, see the attached file.

Not reproducible using the same Fc22 workstation but with the 4.4.0 x86_64 client.
------- Comment #17 From cendio 2015-08-12 08:36:05 -------
(In reply to comment #14)
> 
> This also triggers another bug:
> 
> The pulseaudio server on the client is running at 100% CPU, and closing
> tlclient, which should clean up hung processes, doesn't kill the pulseaudio
> server process. Killing the pulseaudio server from a terminal using SIGKILL
> works as expected.


Bug 5606.
------- Comment #18 From cendio 2015-08-12 10:44:46 -------
(In reply to comment #13)
> Created an attachment (id=635) [details]
> Debug log file from freeze of thinlinc session.
> 
> Testing client build 4843 on Fc22 x86_64 workstation against ThinLinc 4.4.0 on
> eudemo.thinlinc.com, the following problems occurred:
> 
> 1. Played a youtube video for >15 minutes and the pulseaudio server died on the
> client with the following assert:
> 
>   2015-08-11T14:16:17: pulseaudio[E]: E: [tunnel-sink] queue.c: Assertion
> '!e->next' failed at pulsecore/queue.c:104, function pa_queue_pop(). Aborting.
> 
> 2. Tried to reproduce the above issue but failed; the video still played fine
> after 40 minutes. I then held down the Tab key in a terminal to produce several
> bells, and metacity hung. Killing the client-side pulseaudio server released
> the freeze in the ThinLinc session. This time I also ran the ThinLinc client
> with debug output, see the attached file.

Both are probably the same bug. The mainloop was accessed from two threads,
which isn't supported. Fixed in r30654.

The bug was in the volume handling, so it is primarily that which needs
re-testing.
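
For anyone hitting the same class of bug: the usual pattern is to never touch mainloop state from a second thread, and instead hand the work to the mainloop thread through a thread-safe queue. A toy Python sketch of that pattern follows. The class and function names are invented for illustration, and this is not the code from r30654:

```python
import queue
import threading


class MainLoop:
    """Toy event loop: every callback runs on the thread that calls run()."""

    def __init__(self):
        self._tasks = queue.Queue()

    def post(self, callback):
        """Thread-safe: queue a callback for the loop thread to execute."""
        self._tasks.put(callback)

    def quit(self):
        self._tasks.put(None)  # sentinel that makes run() return

    def run(self):
        while True:
            callback = self._tasks.get()
            if callback is None:
                return
            callback()


def demo():
    loop = MainLoop()
    seen = []  # only ever modified from the loop thread

    def set_volume(value):
        seen.append((value, threading.get_ident()))

    loop_thread = threading.Thread(target=loop.run)
    loop_thread.start()

    # Worker threads never call set_volume() directly, they only post it.
    workers = [
        threading.Thread(target=loop.post, args=(lambda v=v: set_volume(v),))
        for v in (10, 20, 30)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()

    loop.quit()
    loop_thread.join()
    return seen, loop_thread.ident
```

Calling demo() returns the three volume changes together with the thread id each one ran on; all of them match the loop thread's id even though they were requested from three different threads.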
------- Comment #19 From cendio 2015-08-13 08:59:04 -------
Commit r30547 upgrades the dependencies for the new version of pulseaudio.
However, building libjson, which is one component of the update, fails. It
looks like a missing build dependency on autoconf.

The log is from the ARM build, but all archs fail the same way:

  + /opt/cendio-build/bin/cbrun armhf make -j4
  (CDPATH="${ZSH_VERSION+.}:" && cd . && /bin/sh /home/hean01/Development/cenbuild/repo/rpmbuild/BUILD/json-c-0.12/missing autoheader)
  /home/hean01/Development/cenbuild/repo/rpmbuild/BUILD/json-c-0.12/missing: line 81: autoheader: command not found
  WARNING: 'autoheader' is missing on your system.
           You should only need it if you modified 'acconfig.h' or
           'configure.ac' or m4 files included by 'configure.ac'.
           The 'autoheader' program is part of the GNU Autoconf package:
           <http://www.gnu.org/software/autoconf/>
           It also requires GNU m4 and Perl in order to run:
           <http://www.gnu.org/software/m4/>
           <http://www.perl.org/>
  make: *** [config.h.in] Error 127
------- Comment #20 From cendio 2015-08-13 10:36:21 -------
Fixed in r30662.
------- Comment #21 From cendio 2015-08-13 12:04:07 -------
(In reply to comment #20)
> Fixed in r30662.

Works as expected.
------- Comment #22 From cendio 2015-08-21 15:22:39 -------
Using 4.4.0 server on Fedora 20 and a client from nightly build:

I have tested playback and recording with success using the client on Windows
10, OSX 10.10 and Fedora 22 (pulseaudio backend).

When I got to testing on eLux RL (the S900 terminal) neither playback nor
recording worked. Looking at tlclient.log it seems like when it doesn't detect
PulseAudio it doesn't go further and just gives up on audio.
------- Comment #23 From cendio 2015-08-21 15:26:08 -------
Created an attachment (id=636) [details]
tlclient.log from eLux RL where audio doesn't work
------- Comment #24 From cendio 2015-08-21 15:31:19 -------
(In reply to comment #22)
> When I got to testing on eLux RL (the S900 terminal) neither playback nor
> recording worked. Looking at tlclient.log it seems like when it doesn't detect
> PulseAudio it doesn't go further and just gives up on audio.

I also verified that there is no problem when using a 4.4.0 client on eLux RL.
------- Comment #25 From cendio 2015-08-21 15:45:40 -------
(In reply to comment #22)
> When I got to testing on eLux RL (the S900 terminal) neither playback nor
> recording worked. Looking at tlclient.log it seems like when it doesn't detect
> PulseAudio it doesn't go further and just gives up on audio.

Urgh. Our simplistic approach of auto detecting the audio system doesn't work
well with the new tunnel modules. They behave a bit too asynchronously. I think
we'll have to go with a more complex detection logic. :/
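
A rough Python sketch of what a more patient detection could look like: start a backend, wait for it to report a definite success or failure (or time out), and only then move on to the next candidate. Everything here is hypothetical, including the names, the timeout, and the pulse, alsa, oss order. It is not the method that was implemented:

```python
import threading


class Probe:
    """Outcome of one asynchronous connection attempt (hypothetical helper)."""

    def __init__(self):
        self._done = threading.Event()
        self.ok = False

    def finish(self, ok):
        self.ok = ok
        self._done.set()

    def wait(self, timeout):
        """Block until the attempt has finished; a timeout counts as failure."""
        return self._done.wait(timeout) and self.ok


def detect_backend(starters, timeout=2.0):
    """Try each (name, start) pair in order; return the first name that connects."""
    for name, start in starters:
        probe = start()
        if probe.wait(timeout):
            return name
    return None


def delayed(ok, delay):
    """Build a fake starter whose result only arrives after `delay` seconds."""
    def start():
        probe = Probe()
        threading.Timer(delay, probe.finish, args=(ok,)).start()
        return probe
    return start


order = [
    ("pulse", delayed(False, 0.05)),  # fails, but only after a short delay
    ("alsa", delayed(True, 0.05)),
    ("oss", delayed(True, 0.05)),
]
print(detect_backend(order))  # alsa
```

The difference from a simplistic check is that a late failure from the first backend leads to a fallback instead of giving up on audio altogether.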
------- Comment #26 From cendio 2015-08-21 16:28:31 -------
New method in r30695.
------- Comment #27 From cendio 2015-08-28 11:27:46 -------
(In reply to comment #10)
> Tester needs to verify playback and recording on every arch. Preferably test
> against an older server in order to avoid influences by bug 4194.

I have verified playback and recording using a ThinLinc 4.4.0 server and
audacity on a Fedora 20 machine. I have tested the following clients:

- tlclient build 4865 (rpm) on Fedora 22
- tlclient build 4865 (exe) on Windows 10
- tlclient build 4871 (UC_RP) on eLux RP 4.8.0
- tlclient build 4871 (armhf) on Xubuntu 15.04
- tlclient build 4871 (UC_ARM) on eLux RT 3.0.0

> On Linux you also need to test the four modes (auto, pulse, alsa, oss). One
> platform is sufficient for this.

Verified auto, pulse and alsa on eLux RP 4.8.0 with client build 4871. I can't
find any machine to test oss on, so I'm skipping that.

> Volume control needs to be tested on at least one platform. Remember to test
> both playback and recording. Also remember to test both client to server and
> server to client. Note that gnome-control-center is buggy when handling
> recording streams. pactl can be used for more low level control.

Verified on Fedora 22 with client build 4865.

> Reading of X11 properties needs to be tested on Linux. E.g. screwing up the
> PULSE_SERVER root window property and seeing that this causes us to fail to
> connect to the local pulseaudio server.

Verified on Fedora 22 with client build 4865. I used the following command to
change the property:

# xprop -root -f PULSE_SERVER 8s -set PULSE_SERVER tcp:example.com:4713

After that neither playback nor recording worked in tlclient. Running the
client with -d 5 gave the following information in the log:

2015-08-28T10:55:11: pulseaudio[E]: D: [tunnel-sink] context.c: Trying to
connect to tcp:example.com:4713...
2015-08-28T10:55:11: pulseaudio[E]: D: [tunnel-source] context.c: Trying to
connect to tcp:example.com:4713...
2015-08-28T10:55:11: pulseaudio[E]: D: [tunnel-sink] module-tunnel-sink-new.c:
Context failed: Connection refused.
2015-08-28T10:55:11: pulseaudio[E]: D: [tunnel-source]
module-tunnel-source-new.c: Context failed with err Connection refused.