www.cendio.com
Bug 5748 - upgrade pixman to get new performance enhancements
Status: CLOSED FIXED
: ThinLinc
Build system
: pre-1.0
: PC Unknown
: P2 Normal
: 4.6.0
: 5106
Reported: 2015-12-11 13:01 by
Modified: 2016-04-12 12:24 (History)

Description From cendio 2015-12-11 13:01:21
I was playing around with bug 5648 and noticed that most of the time was not
spent encoding, but rather in a function called
bits_image_fetch_bilinear_affine_pad_x8r8g8b8(). This is part of pixman.

Our pixman is a few years behind, so I tried upgrading it to the latest. And
there were massive improvements. The relevant function fell from being 50% of
the CPU usage to 15%. Encoding is now back on top as the main CPU bottleneck.

There were no problems playing YouTube in fullscreen at 1080p when both this
upgrade and bug 5648 were in place, whilst before there were noticeable frame
drops.
------- Comment #1 From cendio 2015-12-11 13:05:17 -------
For reference, the test case was Firefox 42.0 on my Fedora 23 workstation
(i7-3770) and YouTube in full screen using Firefox's video player.
------- Comment #2 From cendio 2015-12-30 23:04:42 -------
I did some benchmarking using this tool I found here:

http://cgit.freedesktop.org/~aplattner/xrenderbenchmark/

Unfortunately it showed no significant changes between the old and new code.
But we must conclude that this test is then insufficient as we saw noticeable
improvements with Firefox.

So I found this nice little tirade on why all synthetic benchmarks suck:

https://cworth.org/intel/performance_measurement/

And it also points to a tracing tool in cairo that can replay graphical
operations from real application use. So let's see what that gives us.
------- Comment #3 From cendio 2016-01-04 15:19:05 -------
Urgh. That didn't really show much either. Need to make sure I'm not doing the
tests incorrectly. Could also try getting a trace from the Firefox usage we saw
was improved.
------- Comment #4 From cendio 2016-01-04 15:43:12 -------
No dice. Firefox's rendering of video is not showing up in cairo traces.
------- Comment #5 From cendio 2016-01-05 14:27:57 -------
Did some more digging using perf and gdb.

Firefox is doing two CPU heavy operations; upscaling the video to the target
size, and compositing it in the browser offscreen pixmap.

The second of these is handled by sse2_composite_src_x888_8888 and was already
present in the old version of pixman.

The first step however was only partially accelerated in the old pixman, and
Firefox was using things in a way that was not accelerated. The new modes that
have been added are bilinear scaling with repeat modes active. The existing
code could only handle scaling with no repeat active.

There has also been some acceleration for adding a constant to all pixels of a
buffer, and bilinear scaling with a simple mask.



So the quick summary is that many more forms of scaling are now faster. I will
try to get a test of exactly how much faster.
------- Comment #7 From cendio 2016-01-05 15:33:10 -------
(In reply to comment #5)
>
> The first step however was only partially accelerated in the old pixman, and
> Firefox was using things in a way that was not accelerated. The new modes that
> have been added are bilinear scaling with repeat modes active. The existing
> code could only handle scaling with no repeat active.
> 

Only partially correct. The old code handled different repeat modes as well.
What it didn't handle was format conversion from x888 to 8888. So it's becoming
more and more of a corner case (although Firefox is a pretty common use case).
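
To make the corner case concrete, here is a rough sketch of what the operation
computes: a bilinear fetch from an x8r8g8b8 source with PAD repeat, written out
as a8r8g8b8. This is only an illustration of the arithmetic, not pixman's
actual code (pixman uses fixed-point weights and SIMD). The function names
fetch_pad_x888 and bilinear_sample are made up for this example.

```python
import math


def fetch_pad_x888(src, width, height, x, y):
    """Fetch one x8r8g8b8 pixel with PAD repeat and return it as a8r8g8b8.

    PAD repeat clamps out-of-range coordinates to the nearest edge pixel.
    The unused top byte of x8r8g8b8 is replaced by a fully opaque alpha.
    """
    x = max(0, min(x, width - 1))
    y = max(0, min(y, height - 1))
    return src[y * width + x] | 0xFF000000


def bilinear_sample(src, width, height, fx, fy):
    """Bilinearly interpolate at source coordinate (fx, fy).

    Illustrative only: float weights, pixel centres at integer coordinates.
    """
    x0, y0 = math.floor(fx), math.floor(fy)
    wx, wy = fx - x0, fy - y0

    p00 = fetch_pad_x888(src, width, height, x0, y0)
    p10 = fetch_pad_x888(src, width, height, x0 + 1, y0)
    p01 = fetch_pad_x888(src, width, height, x0, y0 + 1)
    p11 = fetch_pad_x888(src, width, height, x0 + 1, y0 + 1)

    out = 0
    for shift in (24, 16, 8, 0):  # alpha, red, green, blue channels
        c00 = (p00 >> shift) & 0xFF
        c10 = (p10 >> shift) & 0xFF
        c01 = (p01 >> shift) & 0xFF
        c11 = (p11 >> shift) & 0xFF
        top = c00 * (1 - wx) + c10 * wx
        bottom = c01 * (1 - wx) + c11 * wx
        value = int(top * (1 - wy) + bottom * wy + 0.5)
        out |= value << shift
    return out


# A 2x1 source image: black and pure blue, with an unused top byte of zero.
image = [0x00000000, 0x000000FF]
print(hex(bilinear_sample(image, 2, 1, 0.5, 0)))
```

Sampling far outside the image (for example at x = -3 or x = 5) just returns
the nearest edge pixel, which is what distinguishes PAD from the other repeat
modes.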

I've modified xrenderbenchmark to replicate these conditions. With the old
pixman I get:

> $ DISPLAY=:2 ./xrenderbenchmark -ops SRC -tests filter -time 20 -argb
> xrenderbenchmark version 1.0.2-agp1
> X Server from: The X.Org Foundation, Release: 11400000
> 	Xrender version: 0.11
> ---------------------------------------------
> Test: Src
> 		 Transformation/Bilinear filter...................96600 frames in 20.002 seconds = 4829.511 FPS

And perf shows this function being used:
bits_image_fetch_bilinear_affine_pad_x8r8g8b8


An upgraded pixman shows:

> $ DISPLAY=:2 ./xrenderbenchmark -ops SRC -tests filter -time 20 -argb
> xrenderbenchmark version 1.0.2-agp1
> X Server from: The X.Org Foundation, Release: 11400000
> 	Xrender version: 0.11
> ---------------------------------------------
> Test: Src
> 		 Transformation/Bilinear filter...................595650 frames in 20.0009 seconds = 29781.218 FPS

And perf now shows this function instead:
fast_composite_scaled_bilinear_sse2_x888_8888_pad_SRC

So roughly a sixfold speedup (about a 500% increase). Not shabby. :)
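
For anyone wanting to check the arithmetic, a small sketch that pulls the FPS
figure out of the two result lines quoted above and computes the ratio. The
regular expression is just my guess at a pattern that fits these two lines; it
is not part of xrenderbenchmark.

```python
import re

# Result lines copied from the two benchmark runs above.
OLD_LINE = ("Transformation/Bilinear filter..................."
            "96600 frames in 20.002 seconds = 4829.511 FPS")
NEW_LINE = ("Transformation/Bilinear filter..................."
            "595650 frames in 20.0009 seconds = 29781.218 FPS")

# Assumed line format: "<frames> frames in <seconds> seconds = <fps> FPS".
RESULT_RE = re.compile(r"(\d+) frames in ([\d.]+) seconds = ([\d.]+) FPS")


def parse_fps(line):
    """Return the reported FPS value from one benchmark result line."""
    match = RESULT_RE.search(line)
    if match is None:
        raise ValueError(f"no benchmark result found in: {line!r}")
    return float(match.group(3))


old_fps = parse_fps(OLD_LINE)
new_fps = parse_fps(NEW_LINE)

speedup = new_fps / old_fps           # how many times faster
increase_pct = (speedup - 1) * 100    # relative increase in percent

print(f"{speedup:.2f}x faster, a {increase_pct:.0f}% increase")
```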
------- Comment #8 From cendio 2016-01-05 15:34:24 -------
There might also be more, smaller improvements in pixman. There have been 208
commits since our last update.
------- Comment #9 From cendio 2016-01-08 17:05:55 -------
I can't find any regressions. I have tested build 4996 on Fedora 23 using a
variety of different media players and browsers in the session. I have also
compared the performance to the old code playing a video in Firefox and
verified the performance improvements. Closing.