xpra icon
Bug tracker and wiki

Opened 21 months ago

Closed 14 months ago

Last modified 2 months ago

#1129 closed defect (fixed)

Use XDG_RUNTIME_DIR

Reported by: urzds Owned by: Antoine Martin
Priority: blocker Milestone: 1.0
Component: platforms Version: trunk
Keywords: Cc: jonathan.underwood@…

Description (last modified by Antoine Martin)

Currently, xpra defaults to ~/.xpra for several paths related to sockets, logfiles (1) and temporary scripts (2).

The XDG specification (3) says:
$XDG_RUNTIME_DIR defines the base directory relative to which user-specific non-essential runtime files and other file objects (such as sockets, named pipes, ...) should be stored. The directory MUST be owned by the user, and he MUST be the only one having read and write access to it. Its Unix access mode MUST be 0700.

This seems to be the ideal location for these files.

In fact, Fedora already bends the paths to point to that location (4), (5), though currently literally instead of by using $XDG_RUNTIME_DIR (6).

Change History (32)

comment:1 Changed 21 months ago by Antoine Martin

Description: modified (diff)
Status: newassigned

The move to /var/run for sockets and log files is complete already and will be part of the 0.17 release, see: #888, #963.
The change that moves the run-xpra script there is a very bad idea: it is too simplistic and it makes Fedora incompatible with every other distro out there (and MS Windows, OSX, etc - then people just think xpra doesn't work... ouch).

It looks like replacing /var/run/user/$UID/ with XDG_RUNTIME_DIR makes sense... except that again, it's not that simple: we use this same code to generate the default config file. And you really don't want it to contain the buildbot's UID.

comment:2 Changed 21 months ago by Antoine Martin

Description: modified (diff)

comment:3 in reply to:  1 Changed 21 months ago by jonathan.underwood

Replying to antoine:

The move to /var/run for sockets and log files is complete already and will be part of the 0.17 release, see: #888, #963.

Any hope of backporting this to 0.16? I'd really liketo get this change in to F24 if possible.

The change that moves the run-xpra script there is a very bad idea: it is too simplistic and it makes Fedora incompatible with every other distro out there (and MS Windows, OSX, etc - then people just think xpra doesn't work... ouch).

OK, this was only done on the rawhide branch to see what broke :). However, I think this does need some consideration moving forward - having the run script in the home directory is fragile on NFS home directory mounted systems.

It looks like replacing /var/run/user/$UID/ with XDG_RUNTIME_DIR makes sense... except that again, it's not that simple: we use this same code to generate the default config file. And you really don't want it to contain the buildbot's UID.

Hm. OK, this could be achieved with a bit of seddism after building the config file, or some judicious use of quoting during building. Will look into it.

The reporter (urzds) seems to be doing some intensive testing using xpra in minimal containers (judging by the Fedora bugs filed), which is really great - this is a great opportunity to make this use case solid.

Last edited 21 months ago by jonathan.underwood (previous) (diff)

comment:4 Changed 21 months ago by Antoine Martin

Owner: changed from Antoine Martin to jonathan.underwood
Status: assignednew

Any hope of backporting this to 0.16? I'd really liketo get this change in to F24 if possible.


None whatsoever - too big. 0.17 will be frozen in the next 2 weeks though.
BTW: you own #963 ;)

having the run script in the home directory is fragile on NFS home directory mounted systems.


Not just that, but there's no reason you can't mount your home directory noexec or readonly.

The problem here is that all existing clients connect via ssh by running this command:

xpra initenv;~/.xpra/run-xpra _proxy

(the _proxy subcommand just opens stdin/stdout pipes to the xpra server)
If we can come up with a safe and backwards compatible way of changing that string, then I'm happy to backport it. Then we can move the script anywhere.
Another difficulty here is that this needs to work with all shells. Any ideas?


r11996 looks up XDG_RUNTIME_DIR for both socket dirs and log dirs.
Note: this goes into the config file with the UID replaced with $UID which we convert back to a UID at runtime.. (so really, not much has changed except we use the env var rather than building the path ourselves)
We could have the actual string $XDG_RUNTIME_DIR in the config file, and it this string to the env substitution whitelist - not sure there's much to be gained from doing that though.


... xpra in minimal containers..


Sounds like a great use case, send bug reports this way!

comment:5 Changed 21 months ago by jonathan.underwood

Cc: jonathan.underwood@… added

comment:6 in reply to:  4 Changed 21 months ago by jonathan.underwood

Replying to antoine:

Any hope of backporting this to 0.16? I'd really liketo get this change in to F24 if possible.


None whatsoever - too big. 0.17 will be frozen in the next 2 weeks though.
BTW: you own #963 ;)

Ah, yes. Have been bogged down with $dayjob, but will get to that, thansk for the poke.


having the run script in the home directory is fragile on NFS home directory mounted systems.


Not just that, but there's no reason you can't mount your home directory noexec or readonly.

The problem here is that all existing clients connect via ssh by running this command:

xpra initenv;~/.xpra/run-xpra _proxy

(the _proxy subcommand just opens stdin/stdout pipes to the xpra server)
If we can come up with a safe and backwards compatible way of changing that string, then I'm happy to backport it. Then we can move the script anywhere.
Another difficulty here is that this needs to work with all shells. Any ideas?

Something like this?

xpra initenv;~/.xpra/run-xpra _proxy || $XDG_RUNTIME_DIR/run-xpra _proxy

I think conditional execution is posix compliant, so I think that's portable.

So, in paths.py we could have do_get_script_bin_dir() return a list of places to look for the run_xpra script, and then compose the ssh line from that list, with || as the separator.

This seems too simple, I am sure I am missing a bunch of complexity :)


r11996 looks up XDG_RUNTIME_DIR for both socket dirs and log dirs.
Note: this goes into the config file with the UID replaced with $UID which we convert back to a UID at runtime.. (so really, not much has changed except we use the env var rather than building the path ourselves)
We could have the actual string $XDG_RUNTIME_DIR in the config file, and it this string to the env substitution whitelist - not sure there's much to be gained from doing that though.


... xpra in minimal containers..


Sounds like a great use case, send bug reports this way!

So far, most of the bugs have been me missing some Requires for the Fedora package :).

Last edited 21 months ago by jonathan.underwood (previous) (diff)

comment:7 Changed 21 months ago by Antoine Martin

Owner: changed from jonathan.underwood to Antoine Martin
Status: newassigned

I think conditional execution is posix compliant, so I think that's portable.


I think I had considered that and decided against it. But now that I look at it again, I can't see why this wouldn't work!

Let me play with it.

Last edited 21 months ago by Antoine Martin (previous) (diff)

comment:8 Changed 21 months ago by Antoine Martin

Owner: changed from Antoine Martin to urzds
Status: assignednew

Done in r12005 (+fixup in r12006) - please see commit message for more details. I've tried to do this as cleanly as I can, which also means this is not 0.16 backport material..

It would be a little bit difficult to test because by default we run "xpra initenv", which re-creates all the scripts we may need. To workaround that, use XPRA_INITENV_COMMAND="" xpra attach ssh:... to skip this step.

If the scripts we try to run on the remote end do not exist (intentionally removed for testing, or just not present when connecting to an older xpra version), you get this sort of message on stderr (slightly ugly, but we can live with it):

bash: /run/user/1000/username/run-xpra: No such file or directory
bash: /home/username/.xpra/run-xpra: No such file or directory

At some point in the future, we can drop compatibility with older versions and remove "~/.xpra" from the list of directories where we save the script / try to run from.
Until then, failure to create or run a script from this location is no longer fatal.
(the same approach applies to socket-dir, see #963)

Note: we could actually drop the call to "xpra initenv" at the same time: the new remote command line will also try to run just a plain "xpra" (from the PATH), so this is now a little bit redundant.

@urzds: does that work for you?

Last edited 21 months ago by Antoine Martin (previous) (diff)

comment:9 Changed 21 months ago by Antoine Martin

Priority: majorblocker

We do have a problem, on Fedora at least, when the /run/user/$UID directory does not exist: things don't work at all. Ouch.

I was expecting su to create this directory, but it doesn't, so this breaks badly:

su - test
xpra start ...

What are we supposed to do?
If this directory cannot be relied upon, maybe I need to undo some of those changes?

@urzds / @jonathan.underwood: ideas?

The only slightly related ticket that I found are: https://bugzilla.redhat.com/show_bug.cgi?id=961235 and https://bugzilla.redhat.com/show_bug.cgi?id=967509.
The claim is that su -l should create the directory, but this does not work on Fedora 23.

comment:10 Changed 21 months ago by jonathan.underwood

Well, this problem is actually more general, and related to a problem I was trying to understand with xpra and Fedora: if I start an xpra session from within a gnome session, putting the socket etc under /run/user/$UID, and then log out of the gnome session, the xpra Xdummy process ends up bring killed (though the xpra process itself remains, oddly). The reason for that is that logging into gnome creates a new cgroup, and process started within gnome are started within that cgroup. On logging out, the cgroup is torn down. By default, it shouldn't really be killing off the Xdummy process, but for some reason it does. I haven't got to the bottom of what's going on here. But it's related to this issue of what to do if /run/user/$UID isn't there - that directory is created when systemd is asked to fire up a new cgroup.

What I think xpra should really be doing is starting itself as a systemd managed service. That would create the directory, and also mean the process could be nicely isolated in its own cgroup. On systems where systemd isn't being used, libcg and related tools could be used instead.

I realize this is a very incomplete picture... I am still reading about all this to try and understand it, and where the integration points lie. But, this would be a really nice thing to get working.

comment:11 in reply to:  9 ; Changed 21 months ago by jonathan.underwood

Replying to antoine:

We do have a problem, on Fedora at least, when the /run/user/$UID directory does not exist: things don't work at all. Ouch.

Just to clarify: /run/user/$UID is created by the pam_systemd PAM module on user login. So, the only case where that directory wouldn't exist would be when the user trying to start xpra isn't logged in, and so doesn't have an active login session. I am struggling a bit to work out when that could actually happen.

I was expecting su to create this directory, but it doesn't, so this breaks badly:

su - test
xpra start ...

What are we supposed to do?
If this directory cannot be relied upon, maybe I need to undo some of those changes?

That does seem to be a su bug - I am not sure why that's going around the pam stack.

@urzds / @jonathan.underwood: ideas?

The only slightly related ticket that I found are: https://bugzilla.redhat.com/show_bug.cgi?id=961235 and https://bugzilla.redhat.com/show_bug.cgi?id=967509.
The claim is that su -l should create the directory, but this does not work on Fedora 23.

It's working here on F23...

comment:12 in reply to:  11 Changed 21 months ago by jonathan.underwood

Replying to jonathan.underwood:
[snip]

The only slightly related ticket that I found are: https://bugzilla.redhat.com/show_bug.cgi?id=961235 and https://bugzilla.redhat.com/show_bug.cgi?id=967509.
The claim is that su -l should create the directory, but this does not work on Fedora 23.

It's working here on F23...

Addendum: actually no it's not working on F23. This is a nasty bug of su.

comment:13 Changed 21 months ago by jonathan.underwood

Bugzilla entry opened against su:

https://bugzilla.redhat.com/show_bug.cgi?id=1317304

comment:14 Changed 21 months ago by jonathan.underwood

Actually, after reading this:

https://bugzilla.redhat.com/show_bug.cgi?id=753882#c29

I suspect this won't get fixed. Bottom line is that su --login (or su -l or su -) isn't actually designed to behave as though the new user was logging in. The recommended way would be ssh localhost or some variant thereof.

I suppose xpra could grow a command line option to specify the user to run as. But then xpra would also have to get into the business of logging a user in, and starting a session etc.

comment:15 Changed 21 months ago by Antoine Martin

.. isn't actually designed to behave as though the new user was logging in...


What a mess. The man pages reads Start the shell as a login shell with an environment similar to a real login. One would expect this environment to actually be usable...
Requiring an ssh server to get a directory created is beyond bizarre.

Whatever the justification is, this means that the new XDG_RUNTIME_DIR is useless for a large number of valid use cases.
End users will not care that $XYZ is the right way of doing something, they will just rightfully complain that something that used to work just fine no longer does.

So I am tempted to undo this change in the default config, those who really insist on using this location can make the change themselves and deal with the fallout.

comment:16 Changed 21 months ago by jonathan.underwood

Yeah. It is a mess. Why not check to see if $XDG_RUNTIME_DIR is set, and if not, fall back to using the home directory?

Of relevance to this whole issue:

https://tlhp.cf/lennart-poettering-su/

comment:17 Changed 21 months ago by Antoine Martin

Why not check to see if $XDG_RUNTIME_DIR is set


I believe we now continue after printing a warning when we fail to create a socket in one of the locations specified in the config file (or command line), handling for the server log files would require more ugly code.

And we can't even do the fallback for the Xorg command line, because it is generated in the config file, and the vfb will just fail to start.
That is, unless we substitute the log location in that string at runtime. More code, more chances of breakage. More documentation to update, more bug reports to deal with. Sigh.

comment:18 Changed 20 months ago by Antoine Martin

Resolution: wontfix
Status: newclosed

As per the comments above, this directory is not usable so we won't try to use it.

r12297 disables all this code, nice waste of my time writing it.

comment:19 Changed 19 months ago by urzds

@urzds: does that work for you?

This appears to work with r12469 on Fedora Rawhide, as can be tested with the Docker image quay.io/urzds/xpra-test:v0.17-pre12469-fedora-1, which is built from these layered repositories:

As a hack I made use of Xpra automatically creating the ~/.xpra directory and set XDG_RUNTIME_DIR=$HOME/.xpra, to make sure everything ends up in the same spot without relying on any system directories.

comment:20 Changed 19 months ago by Antoine Martin

Resolution: wontfix
Status: closedreopened

The syntax suggested in comment:6 is apparently not posix compliant and causes failures with tcsh, see xpra-0.17.0 on Centos 7.

So r12511 tries "~/.xpra/run-xpra" first which makes tcsh happy again.
But that's not ideal, and trying all the proxy locations with || is not ideal either: there could be more than one script present and if the first script returns an error, we shouldn't try to run the next one..

@urzds / @jonathan.underwood: Ideas?

Last edited 19 months ago by Antoine Martin (previous) (diff)

comment:21 Changed 18 months ago by Antoine Martin

I'm now on Fedora 24 and this seems to have been fixed?
Upon ssh-ing (or su-ing) to another newly created user, XDG_RUNTIME_DIR is set and exists, when this user logs out the directory disappears.
No idea when / how this got fixed or how we're supposed to check if the system libraries are good enough for our needs. So switching to this location is likely to require a build time switch, which can then be enabled for Fedora>=24... (and RHEL 8.x, etc..)

See also: #1105.

Last edited 18 months ago by Antoine Martin (previous) (diff)

comment:22 Changed 16 months ago by Antoine Martin

Milestone: 0.171.0
  • as of r13343, we enable /var/run/user/$UID/xpra for storing the unix domain sockets (as well as the other legacy paths, see r13342)
  • r13344 does the same for the run-xpra script and log files (both Xorg's and xpra's)

This will help with the selinux policy, see #815.

Last edited 16 months ago by Antoine Martin (previous) (diff)

comment:23 Changed 15 months ago by Antoine Martin

Owner: changed from urzds to Smo
Status: reopenednew

Gah this is painful.

Platform testing notes:

  • centos<7, debian jessie, ubuntu 14.04 and earlier, etc don't have /var/run/user so we use the old defaults (".xpra")
  • centos 7, debian stretch, fedora 23, ubuntu 16.04, etc have /var/run/user but fail to create the $UID directory if you use "su --login", it does get created with an ssh login or regular interactive login
  • fedora>=24 is the only distro that works OK (directory is created with su as well as interactive / ssh logins)

So:

  • r13367 adds patches to disable XDG_RUNTIME_DIR and tweak the selinux policy to allow access to the socket in "~/.xpra"
  • r13369 enables those 2 patches for all distros except for Fedora >= 24

@smo: ready for testing.

comment:24 Changed 15 months ago by Antoine Martin

See also ticket:1105#comment:8 about pam_systemd: If it does not exist yet, the user runtime directory /run/user/$USER is created and its ownership changed to the user that is logging in

comment:25 Changed 15 months ago by Antoine Martin

Owner: changed from Smo to Antoine Martin
Status: newassigned

As reported in ticket:1303#comment:2, XDG_RUNTIME_DIR is not set and not created if you su - USERNAME from root... but it is set if you sudo su - THATSAMEUSERNAME.
What an awful mess.

comment:26 Changed 15 months ago by Antoine Martin

And it just keeps getting worse: XAUTHORITY now lives in /run/user/$UID/gdm/Xauthority if you login via gdm... which means that we cannot access a session started from a GUI session (I assume other dm place their xauth file in similar locations) without overriding the xauth settings since we use the "standard" xauth file $HOME/.Xauthority, ie just to launch an xterm:

DISPLAY=:100 XAUTHORITY=$HOME/.Xauthority xterm

And since we have no way of knowing which xauthority path is the right one to use, this looks unfixable. What an epic clusterfsck.

comment:27 Changed 14 months ago by Antoine Martin

Resolution: fixed
Status: assignedclosed

Some updates: ticket:888#comment:60.

We now use XDG_RUNTIME_DIR if it is present for sockets, log files and the "run-xpra" script, and we still use the older default as fallback.

comment:28 Changed 2 months ago by Antoine Martin

More catch 22: systemd : Issues with XDG_RUNTIME_DIR when working in HPC environments
The systemd answer given in that ticket is that logins should call pam to setup the environment, the big problem is that oses like centos7 do not do that when you "su"... leaving users with no workable option and no known workarounds.

comment:29 in reply to:  28 Changed 2 months ago by jonathan.underwood

Replying to Antoine Martin:

More catch 22: systemd : Issues with XDG_RUNTIME_DIR when working in HPC environments
The systemd answer given in that ticket is that logins should call pam to setup the environment, the big problem is that oses like centos7 do not do that when you "su"... leaving users with no workable option and no known workarounds.

Don't have a centos7 box to hand right now, but is it possible to do a "machinectl shell"? Upstream systemd proposes that as the preferred approach to getting a new shell with the correct environment. At this point there seems to be little hope of su being fixed to work as you/I would have expected.

comment:30 Changed 2 months ago by Antoine Martin

but is it possible to do a "machinectl shell"?

This is not supported on centos7 either! (no such subcommand)

comment:31 in reply to:  30 Changed 2 months ago by jonathan.underwood

Replying to Antoine Martin:

but is it possible to do a "machinectl shell"?

This is not supported on centos7 either! (no such subcommand)

That is miserable. Little option but to have a fullback code path for when the XDG_* directories are absent.

I really wish the systemd devs had consulted more thoroughly before implementing the current design, and ensured tools like su - continued to work.

comment:32 Changed 2 months ago by Antoine Martin

Little option but to have a fullback code path for when the XDG_* directories are absent.

We have this already: in this case, we log a warning and we create in the legacy locations only (ie: $HOME/.xpra)

Last edited 2 months ago by Antoine Martin (previous) (diff)
Note: See TracTickets for help on using tickets.