There is a bug in our version of Xorg that makes fbBlt use an un-optimised code path. This is a fairly common operation so it is important for many use cases that it performs well.
This is fixed in newer versions of Xorg:
A quick test here with glxgears, Firefox and some YouTube playback showed fbBlt's share of overall CPU usage, as measured by perf, dropping from 17% to 12%.
Did a test with Xvnc from 4.9.0 and from build 5901. Manually started Xvnc with no client connected and ran just glxgears. perf then reports 34% vs 17% of the time spent in fbBlt, respectively.
I can also see the new version calling an SSE2-optimised sub-function.
Lastly, glxgears reports a somewhat higher frame rate (2000 with the new build vs 1700 with the old one).
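For reference, the figures quoted above can be put on a common footing as relative changes. This is only a rough sketch: the labels and the helper `relative_change` are made up for illustration, and it assumes that in each pair the larger fbBlt share (and the lower frame rate) belongs to the old build. Note also that perf percentages are shares of sampled time, so a halved share is not necessarily a halved absolute cost.

```python
def relative_change(before: float, after: float) -> float:
    """Return the change from `before` to `after` as a fraction of `before`."""
    return (after - before) / before


# (before, after) pairs taken from the comments above.
# Assumption: "before" is the old build, "after" is the new one.
measurements = {
    "fbBlt share, glxgears/Firefox/YouTube test (%)": (17.0, 12.0),
    "fbBlt share, Xvnc with only glxgears (%)": (34.0, 17.0),
    "glxgears frame rate": (1700.0, 2000.0),
}

for name, (before, after) in measurements.items():
    change = relative_change(before, after)
    print(f"{name}: {before:g} -> {after:g} ({change:+.0%})")
```

The Xvnc-only run shows the largest relative drop, which would be expected if little else is competing for CPU time in that setup.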
Seems like we are indeed getting the faster version now.