Bug 4735 - congestion control misbehaves with large frame updates

Status: CLOSED FIXED
Product: ThinLinc
Component: VNC
Version: trunk
Hardware: PC Unknown
Priority: P2 Normal
Target Milestone: 4.10.0
Assigned To:
Depends on: 7158
Blocks: 5106
Reported: 2013-07-01 11:42 by cendio
Modified: 2018-09-28 13:10
Acceptance Criteria:
* Performance should be equal, or better, compared to an old server for all network conditions
* Frame rate (and bandwidth) should reach the maximum faster compared to the old server (mostly noticeable with high latency)


Attachments:
* Measurement for current code (347.00 KB, text/csv), 2017-03-31 11:08, Pierre Ossman
* Measurement for new code (679.94 KB, text/csv), 2017-03-31 11:09, Pierre Ossman
* Gnuplot script (149 bytes, text/plain), 2017-03-31 11:09, Pierre Ossman




Description From cendio 2013-07-01 11:42:54
Currently a VNC screen update is sent all in one go. This is problematic when
we have large updates, as it messes with the congestion control. The congestion
control relies on "ping" packets between the client and server. If there is too
large a delay between these packets, it reverts back to one update per RTT.

We should introduce a mechanism to split up framebuffer updates over several
messages. That way we could interleave other commands between the pieces.

One easy way to solve this is to introduce a new "end of frame"
pseudo-encoding. The client would then look for such a rect instead of
triggering on the FBU message.
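
As an illustration only, a minimal client-side sketch of that idea might look
like the following. The pseudo-encoding value and the function names here are
made up for the example and are not from the RFB spec or the TigerVNC code:

    // Hypothetical sketch; the encoding value and names are invented.
    #include <cstdint>
    #include <cstdio>

    const int32_t pseudoEncodingEndOfFrame = -1000;  // assumed value

    void decodeRect(int x, int y, int w, int h, int32_t encoding) {
      std::printf("decode %dx%d rect at (%d,%d), encoding %d\n",
                  w, h, x, y, (int)encoding);
    }

    void framebufferUpdateEnd() {
      std::printf("frame complete, flush to screen\n");
    }

    // A rect carrying the end-of-frame pseudo-encoding marks the frame
    // boundary, so the server is free to split one logical frame across
    // several FramebufferUpdate messages and interleave fences between
    // the pieces.
    void handleRect(int x, int y, int w, int h, int32_t encoding) {
      if (encoding == pseudoEncodingEndOfFrame) {
        framebufferUpdateEnd();
        return;
      }
      decodeRect(x, y, w, h, encoding);
    }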


One slight implementation problem that needs to be solved is the timestamps of
the interleaved fence messages. They will most likely all be the same, even
though they will be gradually pushed onto the pipe, giving an artificially
inflated RTT.
------- Comment #1 From cendio 2013-07-01 11:43:37 -------
We probably also need to solve bug 4734 for this to have the desired effect.
------- Comment #2 From cendio 2014-12-03 19:33:58 -------
Thinking about this a bit more, splitting up the updates is probably not the
answer, or at least not the whole answer.

In a "normal" congestion control system, you get lots and lots of feedback. TCP
gets more or less an ACK every 1500 bytes. I don't think we can split things up
that much without getting clobbered by the overhead.

A better alternative is to improve the congestion control handling:

Instead of the fire-and-forget approach we use now, we keep track of every
congestion marker we send out. When we get one back, we can look at the next
expected marker (that is still on the wire), compare the timestamps and data
counters, and make a qualified guess as to how much data has already been
consumed. That way we only need two markers per RTT.

Another upside of this is that we don't need to fill the markers with a lot of
data, reducing their overhead to just a few bytes.
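
A rough sketch of the bookkeeping this would involve, assuming the "qualified
guess" is a simple linear interpolation between the last acknowledged marker
and the next outstanding one; the class and names below are invented for
illustration and are not the code from the actual implementation:

    #include <algorithm>
    #include <chrono>
    #include <cstdint>
    #include <deque>

    using Clock = std::chrono::steady_clock;

    static double seconds(Clock::duration d) {
      return std::chrono::duration<double>(d).count();
    }

    struct Marker {
      Clock::time_point sent;  // when the marker was written to the socket
      uint64_t bytesBefore;    // total bytes written before this marker
    };

    class Congestion {
    public:
      // Called whenever a marker (just a few bytes) is sent; only a
      // couple of these are needed per RTT.
      void markerSent(uint64_t totalBytesWritten) {
        pending.push_back({Clock::now(), totalBytesWritten});
      }

      // Called when the client echoes a marker back: everything written
      // before that marker has now been consumed.
      void markerReceived() {
        if (pending.empty())
          return;
        lastAckedSent = pending.front().sent;
        lastAckedBytes = pending.front().bytesBefore;
        lastAckTime = Clock::now();
        haveAck = true;
        pending.pop_front();
      }

      // Estimate how much data is still on the wire. Between two acks we
      // guess that the data between the last acked marker and the next
      // outstanding one drains linearly over the time that separated
      // their transmissions.
      uint64_t getInFlight(uint64_t totalBytesWritten) const {
        uint64_t consumed = lastAckedBytes;
        if (haveAck && !pending.empty()) {
          const Marker& next = pending.front();
          double interval = seconds(next.sent - lastAckedSent);
          double elapsed = seconds(Clock::now() - lastAckTime);
          if (interval > 0.0) {
            double frac = std::min(1.0, elapsed / interval);
            consumed += (uint64_t)((next.bytesBefore - lastAckedBytes) * frac);
          }
        }
        consumed = std::min(consumed, totalBytesWritten);
        return totalBytesWritten - consumed;
      }

    private:
      std::deque<Marker> pending;
      Clock::time_point lastAckedSent;
      Clock::time_point lastAckTime;
      uint64_t lastAckedBytes = 0;
      bool haveAck = false;
    };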
------- Comment #3 From cendio 2016-10-10 15:53:45 -------
An improved version of the congestion control system is available here:

https://github.com/CendioOssman/tigervnc/tree/congestion

Works well for the obvious cases, but needs more testing to make sure there
aren't any corner cases that misbehave.
------- Comment #4 From cendio 2017-03-31 11:03:59 -------
There's a PR now which will hopefully get some testing:

https://github.com/TigerVNC/tigervnc/pull/442
------- Comment #5 From cendio 2017-03-31 11:08:30 -------
Created an attachment (id=782)
Measurement for current code

Test run with the existing code and dump of relevant stats:

 - Throttled to 1 Mbps and 1500 ms RTT (BDP 183 KiB)
 - Started glxgears in fullscreen over 1024x768
 - Paused it for a long period (to trigger closing of the congestion window)
 - Paused it for a short period

The columns show our computed congestion window and the computed in-flight
data, and the same values as reported by the underlying socket.
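
(For reference, the BDP quoted above follows directly from the link
parameters: 1 Mbps is 125,000 bytes/s, and 125,000 bytes/s * 1.5 s =
187,500 bytes, i.e. roughly 183 KiB.)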
------- Comment #6 From cendio 2017-03-31 11:09:04 -------
Created an attachment (id=783)
Measurement for new code

Same measurement with the suggested changes.
------- Comment #7 From cendio 2017-03-31 11:09:49 -------
Created an attachment (id=784)
Gnuplot script

Visualise the data using:

 $ gnuplot -c plot.gp before.csv
------- Comment #9 From cendio 2018-09-24 16:28:08 -------
The above files show the technical details, and how the internal calculations
are better.

But from a user point of view the differences are not that noticeable. When the
latency is low, things are the same as before (as the connection isn't
congested). If the bandwidth is insufficient, we also get the same behaviour
(although the old code relied on the fallback hack for this). It is a small win
that we now get the same result without hacks.

However, in the case where there is sufficient bandwidth but high latency, the
new code shows an improvement. It is able to reach the maximum bandwidth much
faster now. You mostly see this when some kind of animation starts, but you can
also see it in the middle of videos if there is a prolonged "fade to black". At
that point the frame rate drops again, but the new code picks up speed much
faster.
------- Comment #10 From cendio 2018-09-24 16:29:40 -------
> * Performance should be equal, or better, compared to an old server for all network conditions

I could not find a scenario where it was worse.

> * Frame rate (and bandwidth) should reach the maximum faster compared to the old server (mostly noticeable with high latency)

I need 100+ ms of latency to see this, but it does indeed happen. It becomes
more noticeable the higher the latency is.