Xpra: Ticket #1129: Use XDG_RUNTIME_DIR

Currently, xpra defaults to ~/.xpra for several paths related to sockets, logfiles (1) and temporary scripts (2).

The XDG specification (3) says: $XDG_RUNTIME_DIR defines the base directory relative to which user-specific non-essential runtime files and other file objects (such as sockets, named pipes, ...) should be stored. The directory MUST be owned by the user, and he MUST be the only one having read and write access to it. Its Unix access mode MUST be 0700.

This seems to be the ideal location for these files.

In fact, Fedora already bends the paths to point to that location (4), (5), though currently literally instead of by using $XDG_RUNTIME_DIR (6).



Fri, 19 Feb 2016 03:01:51 GMT - Antoine Martin: status, description changed

The move to /var/run for sockets and log files is complete already and will be part of the 0.17 release, see: #888, #963. The change that moves the run-xpra script there is a very bad idea: it is too simplistic and it makes Fedora incompatible with every other distro out there (and MS Windows, OSX, etc - then people just think xpra doesn't work... ouch).

It looks like replacing /var/run/user/$UID/ with XDG_RUNTIME_DIR makes sense... except that again, it's not that simple: we use this same code to generate the default config file. And you really don't want it to contain the buildbot's UID.


Fri, 19 Feb 2016 03:03:31 GMT - Antoine Martin: description changed


Fri, 19 Feb 2016 08:55:10 GMT - jonathan.underwood:

Replying to antoine:

The move to /var/run for sockets and log files is complete already and will be part of the 0.17 release, see: #888, #963.

Any hope of backporting this to 0.16? I'd really liketo get this change in to F24 if possible.

The change that moves the run-xpra script there is a very bad idea: it is too simplistic and it makes Fedora incompatible with every other distro out there (and MS Windows, OSX, etc - then people just think xpra doesn't work... ouch).

OK, this was only done on the rawhide branch to see what broke :). However, I think this does need some consideration moving forward - having the run script in the home directory is fragile on NFS home directory mounted systems.

It looks like replacing /var/run/user/$UID/ with XDG_RUNTIME_DIR makes sense... except that again, it's not that simple: we use this same code to generate the default config file. And you really don't want it to contain the buildbot's UID.

Hm. OK, this could be achieved with a bit of seddism after building the config file, or some judicious use of quoting during building. Will look into it.

The reporter (urzds) seems to be doing some intensive testing using xpra in minimal containers (judging by the Fedora bugs filed), which is really great - this is a great opportunity to make this use case solid.


Fri, 19 Feb 2016 09:48:50 GMT - Antoine Martin: owner, status changed

Any hope of backporting this to 0.16? I'd really liketo get this change in to F24 if possible.


None whatsoever - too big. 0.17 will be frozen in the next 2 weeks though. BTW: you own #963 ;)

having the run script in the home directory is fragile on NFS home directory mounted systems.


Not just that, but there's no reason you can't mount your home directory noexec or readonly.

The problem here is that all existing clients connect via ssh by running this command:

xpra initenv;~/.xpra/run-xpra _proxy

(the _proxy subcommand just opens stdin/stdout pipes to the xpra server) If we can come up with a safe and backwards compatible way of changing that string, then I'm happy to backport it. Then we can move the script anywhere. Another difficulty here is that this needs to work with all shells. Any ideas?


r11996 looks up XDG_RUNTIME_DIR for both socket dirs and log dirs. Note: this goes into the config file with the UID replaced with $UID which we convert back to a UID at runtime.. (so really, not much has changed except we use the env var rather than building the path ourselves) We could have the actual string $XDG_RUNTIME_DIR in the config file, and it this string to the env substitution whitelist - not sure there's much to be gained from doing that though.


... xpra in minimal containers..


Sounds like a great use case, send bug reports this way!


Fri, 19 Feb 2016 09:51:29 GMT - jonathan.underwood: cc set


Fri, 19 Feb 2016 11:59:18 GMT - jonathan.underwood:

Replying to antoine:

Any hope of backporting this to 0.16? I'd really liketo get this change in to F24 if possible.


None whatsoever - too big. 0.17 will be frozen in the next 2 weeks though. BTW: you own #963 ;)

Ah, yes. Have been bogged down with $dayjob, but will get to that, thansk for the poke.


having the run script in the home directory is fragile on NFS home directory mounted systems.


Not just that, but there's no reason you can't mount your home directory noexec or readonly.

The problem here is that all existing clients connect via ssh by running this command:

xpra initenv;~/.xpra/run-xpra _proxy

(the _proxy subcommand just opens stdin/stdout pipes to the xpra server) If we can come up with a safe and backwards compatible way of changing that string, then I'm happy to backport it. Then we can move the script anywhere. Another difficulty here is that this needs to work with all shells. Any ideas?

Something like this?

xpra initenv;~/.xpra/run-xpra _proxy || $XDG_RUNTIME_DIR/run-xpra _proxy

I think conditional execution is posix compliant, so I think that's portable.

So, in paths.py we could have do_get_script_bin_dir() return a list of places to look for the run_xpra script, and then compose the ssh line from that list, with || as the separator.

This seems too simple, I am sure I am missing a bunch of complexity :)


r11996 looks up XDG_RUNTIME_DIR for both socket dirs and log dirs. Note: this goes into the config file with the UID replaced with $UID which we convert back to a UID at runtime.. (so really, not much has changed except we use the env var rather than building the path ourselves) We could have the actual string $XDG_RUNTIME_DIR in the config file, and it this string to the env substitution whitelist - not sure there's much to be gained from doing that though.


... xpra in minimal containers..


Sounds like a great use case, send bug reports this way!

So far, most of the bugs have been me missing some Requires for the Fedora package :).


Fri, 19 Feb 2016 16:32:54 GMT - Antoine Martin: owner, status changed

I think conditional execution is posix compliant, so I think that's portable.


I think I had considered that and decided against it. But now that I look at it again, I can't see why this wouldn't work!

Let me play with it.


Sun, 21 Feb 2016 07:28:13 GMT - Antoine Martin: owner, status changed

Done in r12005 (+fixup in r12006) - please see commit message for more details. I've tried to do this as cleanly as I can, which also means this is not 0.16 backport material..

It would be a little bit difficult to test because by default we run "xpra initenv", which re-creates all the scripts we may need. To workaround that, use XPRA_INITENV_COMMAND="" xpra attach ssh:... to skip this step.

If the scripts we try to run on the remote end do not exist (intentionally removed for testing, or just not present when connecting to an older xpra version), you get this sort of message on stderr (slightly ugly, but we can live with it):

bash: /run/user/1000/username/run-xpra: No such file or directory
bash: /home/username/.xpra/run-xpra: No such file or directory

At some point in the future, we can drop compatibility with older versions and remove "~/.xpra" from the list of directories where we save the script / try to run from. Until then, failure to create or run a script from this location is no longer fatal. (the same approach applies to socket-dir, see #963)

Note: we could actually drop the call to "xpra initenv" at the same time: the new remote command line will also try to run just a plain "xpra" (from the PATH), so this is now a little bit redundant.

@urzds: does that work for you?


Sun, 13 Mar 2016 02:52:03 GMT - Antoine Martin: priority changed

We do have a problem, on Fedora at least, when the /run/user/$UID directory does not exist: things don't work at all. Ouch.

I was expecting su to create this directory, but it doesn't, so this breaks badly:

su - test
xpra start ...

What are we supposed to do? If this directory cannot be relied upon, maybe I need to undo some of those changes?

@urzds / @jonathan.underwood: ideas?

The only slightly related ticket that I found are: https://bugzilla.redhat.com/show_bug.cgi?id=967509. The claim is that su -l should create the directory, but this does not work on Fedora 23.


Sun, 13 Mar 2016 12:58:02 GMT - jonathan.underwood:

Well, this problem is actually more general, and related to a problem I was trying to understand with xpra and Fedora: if I start an xpra session from within a gnome session, putting the socket etc under /run/user/$UID, and then log out of the gnome session, the xpra Xdummy process ends up bring killed (though the xpra process itself remains, oddly). The reason for that is that logging into gnome creates a new cgroup, and process started within gnome are started within that cgroup. On logging out, the cgroup is torn down. By default, it shouldn't really be killing off the Xdummy process, but for some reason it does. I haven't got to the bottom of what's going on here. But it's related to this issue of what to do if /run/user/$UID isn't there - that directory is created when systemd is asked to fire up a new cgroup.

What I think xpra should really be doing is starting itself as a systemd managed service. That would create the directory, and also mean the process could be nicely isolated in its own cgroup. On systems where systemd isn't being used, libcg and related tools could be used instead.

I realize this is a very incomplete picture... I am still reading about all this to try and understand it, and where the integration points lie. But, this would be a really nice thing to get working.


Sun, 13 Mar 2016 22:38:59 GMT - jonathan.underwood:

Replying to antoine:

We do have a problem, on Fedora at least, when the /run/user/$UID directory does not exist: things don't work at all. Ouch.

Just to clarify: /run/user/$UID is created by the pam_systemd PAM module on user login. So, the only case where that directory wouldn't exist would be when the user trying to start xpra isn't logged in, and so doesn't have an active login session. I am struggling a bit to work out when that could actually happen.

I was expecting su to create this directory, but it doesn't, so this breaks badly:

su - test
xpra start ...

What are we supposed to do? If this directory cannot be relied upon, maybe I need to undo some of those changes?

That does seem to be a su bug - I am not sure why that's going around the pam stack.

@urzds / @jonathan.underwood: ideas?

The only slightly related ticket that I found are: https://bugzilla.redhat.com/show_bug.cgi?id=967509. The claim is that su -l should create the directory, but this does not work on Fedora 23.

It's working here on F23...


Sun, 13 Mar 2016 22:51:21 GMT - jonathan.underwood:

Replying to jonathan.underwood: [snip]

The only slightly related ticket that I found are: https://bugzilla.redhat.com/show_bug.cgi?id=967509. The claim is that su -l should create the directory, but this does not work on Fedora 23.

It's working here on F23...

Addendum: actually no it's not working on F23. This is a nasty bug of su.


Sun, 13 Mar 2016 22:57:57 GMT - jonathan.underwood:

Bugzilla entry opened against su:

https://bugzilla.redhat.com/show_bug.cgi?id=1317304


Sun, 13 Mar 2016 23:10:54 GMT - jonathan.underwood:

Actually, after reading this:

https://bugzilla.redhat.com/show_bug.cgi?id=753882#c29

I suspect this won't get fixed. Bottom line is that su --login (or su -l or su -) isn't actually designed to behave as though the new user was logging in. The recommended way would be ssh localhost or some variant thereof.

I suppose xpra could grow a command line option to specify the user to run as. But then xpra would also have to get into the business of logging a user in, and starting a session etc.


Mon, 14 Mar 2016 13:15:03 GMT - Antoine Martin:

.. isn't actually designed to behave as though the new user was logging in...


What a mess. The man pages reads Start the shell as a login shell with an environment similar to a real login. One would expect this environment to actually be usable... Requiring an ssh server to get a directory created is beyond bizarre.

Whatever the justification is, this means that the new XDG_RUNTIME_DIR is useless for a large number of valid use cases. End users will not care that $XYZ is the right way of doing something, they will just rightfully complain that something that used to work just fine no longer does.

So I am tempted to undo this change in the default config, those who really insist on using this location can make the change themselves and deal with the fallout.


Mon, 14 Mar 2016 13:34:22 GMT - jonathan.underwood:

Yeah. It is a mess. Why not check to see if $XDG_RUNTIME_DIR is set, and if not, fall back to using the home directory?

Of relevance to this whole issue:

https://tlhp.cf/lennart-poettering-su/


Mon, 14 Mar 2016 13:40:43 GMT - Antoine Martin:

Why not check to see if $XDG_RUNTIME_DIR is set


I believe we now continue after printing a warning when we fail to create a socket in one of the locations specified in the config file (or command line), handling for the server log files would require more ugly code.

And we can't even do the fallback for the Xorg command line, because it is generated in the config file, and the vfb will just fail to start. That is, unless we substitute the log location in that string at runtime. More code, more chances of breakage. More documentation to update, more bug reports to deal with. Sigh.


Thu, 31 Mar 2016 15:09:17 GMT - Antoine Martin: status changed; resolution set

As per the comments above, this directory is not usable so we won't try to use it.

r12297 disables all this code, nice waste of my time writing it.


Fri, 22 Apr 2016 18:22:50 GMT - urzds:

@urzds: does that work for you?

This appears to work with r12469 on Fedora Rawhide, as can be tested with the Docker image quay.io/urzds/xpra-test:v0.17-pre12469-fedora-1, which is built from these layered repositories:

As a hack I made use of Xpra automatically creating the ~/.xpra directory and set XDG_RUNTIME_DIR=$HOME/.xpra, to make sure everything ends up in the same spot without relying on any system directories.


Mon, 02 May 2016 08:03:21 GMT - Antoine Martin: status changed; resolution deleted

The syntax suggested in comment:6 is apparently not posix compliant and causes failures with tcsh, see xpra-0.17.0 on Centos 7.

So r12511 tries "~/.xpra/run-xpra" first which makes tcsh happy again. But that's not ideal, and trying all the proxy locations with || is not ideal either: there could be more than one script present and if the first script returns an error, we shouldn't try to run the next one..

@urzds / @jonathan.underwood: Ideas?


Wed, 01 Jun 2016 10:32:28 GMT - Antoine Martin:

I'm now on Fedora 24 and this seems to have been fixed? Upon ssh-ing (or su-ing) to another newly created user, XDG_RUNTIME_DIR is set and exists, when this user logs out the directory disappears. No idea when / how this got fixed or how we're supposed to check if the system libraries are good enough for our needs. So switching to this location is likely to require a build time switch, which can then be enabled for Fedora>=24... (and RHEL 8.x, etc..)

See also: #1105.


Sun, 14 Aug 2016 14:01:05 GMT - Antoine Martin: milestone changed

This will help with the selinux policy, see #815.


Tue, 16 Aug 2016 13:12:15 GMT - Antoine Martin: owner, status changed

Gah this is painful.

Platform testing notes:


So:

@smo: ready for testing.


Wed, 17 Aug 2016 09:15:18 GMT - Antoine Martin:

See also ticket:1105#comment:8 about pam_systemd: If it does not exist yet, the user runtime directory /run/user/$USER is created and its ownership changed to the user that is logging in


Tue, 13 Sep 2016 03:45:30 GMT - Antoine Martin: owner, status changed

As reported in ticket:1303#comment:2, XDG_RUNTIME_DIR is not set and not created if you su - USERNAME from root... but it is set if you sudo su - THATSAMEUSERNAME. What an awful mess.


Tue, 13 Sep 2016 11:16:27 GMT - Antoine Martin:

And it just keeps getting worse: XAUTHORITY now lives in /run/user/$UID/gdm/Xauthority if you login via gdm... which means that we cannot access a session started from a GUI session (I assume other dm place their xauth file in similar locations) without overriding the xauth settings since we use the "standard" xauth file $HOME/.Xauthority, ie just to launch an xterm:

DISPLAY=:100 XAUTHORITY=$HOME/.Xauthority xterm

And since we have no way of knowing which xauthority path is the right one to use, this looks unfixable. What an epic clusterfsck.


Sun, 09 Oct 2016 14:54:09 GMT - Antoine Martin: status changed; resolution set

Some updates: ticket:888#comment:60.

We now use XDG_RUNTIME_DIR if it is present for sockets, log files and the "run-xpra" script, and we still use the older default as fallback.


Sat, 16 Sep 2017 07:56:35 GMT - Antoine Martin:

More catch 22: systemd : Issues with XDG_RUNTIME_DIR when working in HPC environments The systemd answer given in that ticket is that logins should call pam to setup the environment, the big problem is that oses like centos7 do not do that when you "su"... leaving users with no workable option and no known workarounds.


Sat, 16 Sep 2017 08:35:06 GMT - jonathan.underwood:

Replying to Antoine Martin:

More catch 22: systemd : Issues with XDG_RUNTIME_DIR when working in HPC environments The systemd answer given in that ticket is that logins should call pam to setup the environment, the big problem is that oses like centos7 do not do that when you "su"... leaving users with no workable option and no known workarounds.

Don't have a centos7 box to hand right now, but is it possible to do a "machinectl shell"? Upstream systemd proposes that as the preferred approach to getting a new shell with the correct environment. At this point there seems to be little hope of su being fixed to work as you/I would have expected.


Sat, 16 Sep 2017 08:36:32 GMT - Antoine Martin:

but is it possible to do a "machinectl shell"?

This is not supported on centos7 either! (no such subcommand)


Sat, 16 Sep 2017 08:43:02 GMT - jonathan.underwood:

Replying to Antoine Martin:

but is it possible to do a "machinectl shell"?

This is not supported on centos7 either! (no such subcommand)

That is miserable. Little option but to have a fullback code path for when the XDG_* directories are absent.

I really wish the systemd devs had consulted more thoroughly before implementing the current design, and ensured tools like su - continued to work.


Sat, 16 Sep 2017 09:10:00 GMT - Antoine Martin:

Little option but to have a fullback code path for when the XDG_* directories are absent.

We have this already: in this case, we log a warning and we create in the legacy locations only (ie: $HOME/.xpra)


Wed, 27 Jun 2018 09:27:45 GMT - Antoine Martin:

See also #1723


Sat, 23 Jan 2021 05:15:47 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/1129