Bug 3054 - xsri isn't included in most distributions
Summary: xsri isn't included in most distributions
Status: CLOSED FIXED
Alias: None
Product: ThinLinc
Classification: Unclassified
Component: Other (show other bugs)
Version: 2.1.0
Hardware: PC All
: P2 Enhancement
Target Milestone: 4.15.0
Assignee: Linn
URL:
Keywords: relnotes, tobfa_tester
: 5233 5718 (view as bug list)
Depends on: 7904
Blocks: 7906
  Show dependency treegraph
 
Reported: 2009-03-25 14:56 CET by Pierre Ossman
Modified: 2022-07-06 16:43 CEST (History)
5 users (show)

See Also:
Acceptance Criteria:
* The set background image and background colour should always be respected * If the background image cannot be found, the chosen background colour should be shown. * Transparent images should let the chosen background colour show through. * The background image and background colour should be set by the following hiveconf parameters: - /sessionstart/background_image - /sessionstart/background_color * The solution should gracefully handle faulty input data both for colour and image path * The solution should gracefully exit if pycairo can't be imported. * If the input colour is faulty, a default color should be used * The solution should work regardless of the distribution - Should not depend on packages that might "go out of fashion" and become unsupported


Attachments
Patch to select background colour for transparent pics (2.05 KB, patch)
2022-05-10 09:31 CEST, Linn
Details

Description Pierre Ossman cendio 2009-03-25 14:56:35 CET
Since bug 2768 we require xsri in order to set the background during session startup. Unfortunately xsri isn't installed by default on many dists, and it's non-trivial to ship ourselves as it uses GDK and GTK.

We should see if we can produce a shippable version of xsri (perhaps we've created .a libraries of gtk now?) or make a more stripped down version that uses just libX11 and libpng.
Comment 1 Pierre Ossman cendio 2014-09-02 12:59:05 CEST
*** Bug 5233 has been marked as a duplicate of this bug. ***
Comment 2 Henrik Andersson cendio 2015-11-18 09:31:44 CET
*** Bug 5718 has been marked as a duplicate of this bug. ***
Comment 3 Pierre Ossman cendio 2022-04-12 13:50:34 CEST
This seems to have gotten a lot worse over the years. It doesn't seem to be included at all in neither Fedora nor Ubuntu. So it is now very difficult for users to get this tool.
Comment 5 Pierre Ossman cendio 2022-04-13 10:37:02 CEST
The tools feh and wmsetbg seem to be current alternatives to xsri. If we don't want to write our own tool.
Comment 6 Linn cendio 2022-05-10 09:31:47 CEST
Created attachment 1036 [details]
Patch to select background colour for transparent pics
Comment 7 Linn cendio 2022-05-10 09:35:10 CEST
feh seems to be a good match for what we want to do, as well as being lightweight. Its official gitrepo can be found here: https://github.com/derf/feh

However, to get it to fully work with transparent pictures I had to add a small patch as otherwise flags --bg-fill and --bg-scale did not work, see the attached file.
Comment 10 Linn cendio 2022-05-11 09:03:11 CEST
Note that the vendordrop structure for feh has not been fully tested - I manually committed feh to the vendor tree, and then ran update to upstream script. However, when aborting the first commit of the script, it still continued to do the tagging commit.

However, these 2 commits (adding to vendor and tagging) are what the update script should do, so instead of running the script again I manually added feh to ctc in commit r38359.

It is questionable if feh ever will need to be updated since our use of it is very limited, so I will leave this untested for now.


As a side note, the latest commit included in our feh is this one:
https://github.com/derf/feh/commit/cd02c0eacef0c080c2d62a097142cebbeaa0ba78
Comment 11 Linn cendio 2022-05-11 09:05:15 CEST
Changing back acceptance criteria as these were unintentionally changed in comment 10.
Comment 12 Linn cendio 2022-05-23 16:37:09 CEST
The idea with of using 'feh' have been scrapped for the moment. In order for us to be able to ship it as an executable we need to statically link its dependencies, which proved to not be attainable due to an issue with statically linking imlib2 and its dependencies. 

In order to find a new way forward, I have looked into the following options:
  
* Including xsri as an executable. However, it is not a good fit since it has a dependency on GTK which would be troublesome to link statically. 

* By using the xsri code as a base, I tried to implement a basic prototype in Gdk and Cairo. However, xsri uses pixmap to set the background, and pixmaps were removed in GTK3. In order to change the background through Surfaces (replacements of pixmap), the caller has to manually make sure X11 variables have the correct values. This was not needed with pixmap, so this option turned out more complex than initially thought.

* Using ctypes in python to use X11 directly, via library 'pyxlib-ctypes'. Currently examining the possibilities for this by making a prototype.
Comment 24 Linn cendio 2022-06-10 17:15:27 CEST
The route using pyxlib-ctypes worked, and I have made a script based on it that successfully changes the root background to a solid color during profile choosing.

The next step is being able to set a background image. The plan is to use pycairo and its Surface class to do it.

Note however that currently pycairo is not installed correctly on SLES and Ubuntu, since it is not marked as a dependency of GTK's python bindings (see bug 7904 comment 27). This issue has to be fixed before this bug can be fully closed.
Comment 42 Linn cendio 2022-06-23 12:53:59 CEST
Pycairo's ImageSurfaces are used to set both the background color and the background image. 

However, note that in version 1.10 (seen on SLES 12 and RHEL 7), function get_data() is not available. Instead, we have a fallback behaviour using GdkPixbuf to get the pixel values. Once we don't need to support any dist with Pycairo 1.10, this fallback can be removed.

I also checked the version of Pycairo on some other dists, all of which had get_data() available:

SLES 15       |  v.1.15
RHEL 8        |  v.1.16
Ubuntu 18.04  |  v.1.16
Comment 56 Linn cendio 2022-06-28 17:46:24 CEST
Pycairo 1.10 also does not have attribute 'cairo.Error'. This issue was solved by at least Pycairo 1.12 if not earlier. 

For now, our code has some fallback behaviour for Pycairo < 1.12, but this can be removed once these versions no longer has to be supported. See comment 42 for more details regarding versions on different dists.
Comment 57 Linn cendio 2022-06-28 17:53:04 CEST
Updated acceptance criteria with a bullet point about not crashing if pycairo is not installed or can't be imported for any other reason.
Comment 58 Linn cendio 2022-06-28 17:55:17 CEST
Tested the acceptance criteria on RHEL 7 (Pycairo 1.10.0) and Ubuntu 22.04 (Pycairo 1.20.1).


> * The set background image and background colour should always be respected
Yes, both the correct image and color show up as the background.

> * If the background image cannot be found, the chosen background colour should be shown. 
Yes, only the colour is shown and a warning about 'file not found' is shown in xinit.log.

> * Transparent images should let the chosen background colour show through.
Yes, it does.

> * The background image and background colour should be set by the following hiveconf parameters:
>   - /sessionstart/background_image
>   - /sessionstart/background_color
Yes, the background image and colour are changed according to these two parameters.

> * The solution should gracefully handle faulty input data both for colour and image path
Yes, there are warnings in the logs about bad input, but the script uses "correct input" when possible, e.g. setting the image along with the default colour, or only setting the colour without the image. Checked the following:

Image:
 - Bad format (using a jpg)
 - Non-existing path
 - Corrupted image (renaming a .jpg to .jpg.png)

Colour:
 - Too long hexstring
 - Too short hexstring
 - Non-hex value
 - Negative hex value (e.g. #99-a99)
 
* The solution should gracefully exit if pycairo can't be imported.
Yes, the script writes an error to xinit.log and exits if pycairo can't be imported.

> * If the input colour is faulty, a default color should be used
Yes. Black is the chosen default colour.

> * The solution should work regardless of the distribution
>   - Should not depend on packages that might "go out of fashion" and become unsupported
This solution depends on Python, Pycairo and Gdk. These are all quite stable project that already is used in other parts of our product, which should make this solution more robust than what xsri was.
Comment 59 Linn cendio 2022-06-28 18:00:44 CEST
There are a few things left before this bug is fully done:

* Improve the tests by making them runable independently of where the parentfolder startupbg is placed, i.e. don't depend on other structures through hardcoded paths.

* Optimise the conversion from python bytearray to ctypes array. A more suitable way to convert it is by using 'from_buffer()'
Comment 60 Linn cendio 2022-06-29 13:21:35 CEST
Tips for tester:

* Test on dists with different versions of Pycairo. A small list of versions can be found in comment 42.

* If your test machine skips the profile choosing step because of only having one available profile, add the terminal profile. Install xterm and enable terminal profile from webadmin (Profiles -> Profile List) or directly in the hconf files.
Comment 61 William Sjöblom cendio 2022-06-29 15:32:51 CEST
The tester should also make sure the performance is sufficient with large framebuffers. Without the `from_buffer()' fix discussed in comment 59 it takes roughly 8 seconds between session startup and the point where the background is first rendered. This was measured with a framebuffer size of 4320x1920.
Comment 62 Linn cendio 2022-06-29 15:39:32 CEST
(In reply to Linn from comment #59)
> There are a few things left before this bug is fully done:
> 
> * Improve the tests by making them runable independently of where the
> parentfolder startupbg is placed, i.e. don't depend on other structures
> through hardcoded paths.
The thing we want to remove from the tests are these two lines, since they depend on where folder tlmisc/modules/.
> modules_path = os.path.abspath(os.path.join(get_origin_dir(), "../../modules"))
> sys.path = [modules_path] + sys.path
When removing these lines, 'thinlinc' has to be added to the list in variable 'fakemods' in order for the tests to run. However, adding 'thinlinc' to this list causes issues with the mocking, seen by the fact that the tests cast this assertion error:
> AssertionError: Depth of screen is 1, must be 24 or 32
The '1' is the return value for int(MagicMock()), but it is unclear why this issue with unmatching mocks happens. It may be something regarding nestling mocks, but in that case it is not obvious what causes the issue.

Below are some code examples of mocks that gives different behaviours:
> @patch("thinlinc.pyxlibctypes.xlib") # Gives unmatching mocks
> @patch("tl-startup-bg.xlib") # Gives correct mocks
Some tips for troubleshooting is to print mocks both on top-level of the script and inside main() and see if they differ. It might also help to print the mocked contents of sys.modules, to see that the intermediate mocks look correct.
Comment 65 William Sjöblom cendio 2022-06-30 11:05:46 CEST
The ctypes array construction has now been sped up significantly and the tests no longer rely on their tlmisc-relative position in the tree.

Marking as resolved.
Comment 68 Tobias cendio 2022-07-04 09:02:46 CEST
Loading of background is unreliable on RHEL8, with a success rate spanning roughly 75% to 90%, while the phenomenon cannot be reproduced on neither RHEL7 nor RHEL9. This has been observed while testing server build #2728 on RHEL8 and client build #2632 on Fedora36.
Comment 69 Pierre Ossman cendio 2022-07-04 13:05:48 CEST
I've been able to reproduce it on my RHEL 8 as well, so it seems to be a general issue.

There are no log lines from tl-startup-bg, unfortunately. And no obvious differences in xinit.log between a working and broken case.
Comment 70 Pierre Ossman cendio 2022-07-04 13:10:54 CEST
A theory is that XFlush() isn't completely reliably and may end up discarding data, e.g. if the socket is full. And since we use XPutImage() rather than XShmPutImage(), we try to send a lot of data to the server.

Under this theory, this is a race between tl-startup-bg sending data and Xvnc consuming data quickly enough for the final XFlush() to not overlook anything.

To test this theory, I added a sleep(2) to ProcPutImage() in Xvnc. With that in place, I can trigger the issue every time.

Next, I replaced XFlush() with an XSync(), to more robustly push out everything. With this, the bug goes away, even with the sleep() in Xvnc.

So the issue seems to be an improper flushing and closing from tl-startup-bg. We should probably even consider XCloseDisplay() to really neatly terminate things.
Comment 71 Pierre Ossman cendio 2022-07-04 13:12:21 CEST
(In reply to Pierre Ossman from comment #70)
> A theory is that XFlush() isn't completely reliably and may end up
> discarding data, e.g. if the socket is full.

To clarify, the theory isn't that XFlush() *discards* data. Rather, that it leaves data in some internal buffer. Which then gets discarded when we terminate, as we only make the single call to XFlush().
Comment 73 Pierre Ossman cendio 2022-07-04 13:37:19 CEST
Should be reliable now.
Comment 74 Tobias cendio 2022-07-04 14:46:06 CEST
Concerning comment 70 and comment 71: reducing the number of CPUs from 2 to 1 further impedes the ability of the server to display our image before XFlush(), resulting in a very low success rate -- less than 1/10 attempts. However, while still on 1 CPU, replacing XFlush() with XCloseDisplay() consistently yields our image as intended.
Comment 75 Tobias cendio 2022-07-05 13:26:15 CEST
Having pycairo but missing pygobject results in the following traceback,

Traceback (most recent call last):
  File "/opt/thinlinc/libexec/tl-startup-bg", line 230, in <module>
    i1Ii ( )
  File "/opt/thinlinc/libexec/tl-startup-bg", line 223, in i1Ii
    OOOOO = oooOoO0000 ( OoOOooO0oOO0Oo )
  File "/opt/thinlinc/libexec/tl-startup-bg", line 133, in oooOoO0000
    O0OOooO , oOOoOOOO000 )
KeyError: 'could not find foreign type Surface'

which concerns the call Gdk.pixbuf_get_from_surface().
Comment 76 Pierre Ossman cendio 2022-07-05 13:28:25 CEST
This is the kind of issues we saw on bug 7904. I guess we'll have to port a few fixes over.
Comment 80 Pierre Ossman cendio 2022-07-05 15:46:57 CEST
All relevant checks from tlgtk have now been ported. Only the fallback path has been affected.
Comment 81 Tobias cendio 2022-07-06 12:40:03 CEST
General
=======

Tested server build #2736 -- containing later fixes relating to problems
mentioned in comment 68 through comment 80 -- on dists

Ubuntu20.04
Ubuntu22.04
RHEL7
RHEL8
RHEL9
SLES12
OpenSUSE15.3
Fedora33
Fedora36

with client build #2639 on

Fedora36.

Both background image and color are displayed as intended and additionally the
image is sharp on a HiDPI setup.

The principle behaviour between official release 4.14 and the developers build
is identical, in terms of output when color and background hiveconf values are a
combination of missing values (e.g. commented or deleted row), empty values
(e.g. empty string background_color=), or faulty values.

The way errors are communicated in the log
(/var/opt/thinlinc/sessions/$USER/last/xinit.log) is a bit different though --
it has generally been improved; for instance an empty parameter value is not
complained about anymore. However, in the case with a faulty color it no longer
writes specifically what color the program couldn't parse, which would be
valuable debugging information for the user.

Acceptance criteria
===================

[x] The set background image and background colour should always be respected

[x] If the background image cannot be found, the chosen background colour should
    be shown.

[x] Transparent images should let the chosen background colour show through.

[x] The background image and background colour should be set by the following
    hiveconf parameters /sessionstart/background_image and
    /sessionstart/background_color.

[x] The solution should gracefully handle faulty input data both for colour and
    image path.

[x] The solution should gracefully exit if pycairo can't be imported.

[x] If the input colour is faulty, a default color should be used.

[x] The solution should work regardless of the distribution. Should not depend
    on packages that might "go out of fashion" and become unsupported.

Conclusion
==========

Making a tiny change to logging of faulty color, after which this bug will be
Closed.
Comment 84 Tobias cendio 2022-07-06 14:24:23 CEST
The logging change mentioned in comment 81 has been implemented and tested using
server build #2741 on RHEL9 and client build #2639 on Fedora36. The faulty color
string is now mentioned in the error log followed by the default fallback.

Closing.

Note You need to log in before you can comment on or make changes to this bug.