Xpra: Ticket #492: suspending a local client with opengl windows can show corrupted pixels

This is only relevant to local servers: resuming a client connected to a remote server should break the connection (eventually - we may want to break it quicker then) which is fine.

On Linux, we should be able to get the event from the UPower Resuming dbus signal. Found some example code:

We could use this same code to force a server encoder refresh too, as hardware encoders (nvenc / opencl / cuda) tend to get messed up during suspend-resume, but we only notice next time we try to use them.

On win32, we could detect WM_POWERBROADCAST events.
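
A minimal sketch of how the WM_POWERBROADCAST detection could be classified once the message reaches a window procedure. The numeric constants are the documented Win32 values; the function name `classify_power_event` and the "suspend" / "resume" labels are made up for illustration and are not taken from the xpra code.

```python
# Win32 constants (values from the Windows SDK headers)
WM_POWERBROADCAST = 0x0218
PBT_APMSUSPEND = 0x0004          # system is about to suspend
PBT_APMRESUMESUSPEND = 0x0007    # resumed after user activity
PBT_APMRESUMEAUTOMATIC = 0x0012  # resumed (sent on every wake-up)


def classify_power_event(msg, wparam):
    """Hypothetical helper: map a window message to 'suspend', 'resume' or None."""
    if msg != WM_POWERBROADCAST:
        return None
    if wparam == PBT_APMSUSPEND:
        return "suspend"
    if wparam in (PBT_APMRESUMESUSPEND, PBT_APMRESUMEAUTOMATIC):
        return "resume"
    # other power events (battery status changes, etc.) are ignored here
    return None
```

Windows may deliver both resume notifications for a single wake-up, so a caller would probably want to treat a repeated "resume" as harmless.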

Then we can just ask for a server lossless refresh to make sure the windows display clean contents.

This looks like a driver bug to me: the GPU buffers should be preserved, maybe it is the OpenGL paint state that is inconsistent?



Mon, 20 Jan 2014 11:16:10 GMT - Antoine Martin: owner, status changed

Easy to reproduce, and should be easy to fix too.


Tue, 04 Feb 2014 15:02:21 GMT - Antoine Martin: description changed


Tue, 04 Feb 2014 16:29:39 GMT - Antoine Martin:

The dbus approach sounds nice, except it doesn't work... I can't get any of the code examples to fire. This is also meant to fire the same Resuming signal (found in /usr/lib/systemd/system/upower.service), but does nothing:

dbus-send --system --type=signal --dest=org.freedesktop.UPower \
    /org/freedesktop/UPower org.freedesktop.UPower.Resuming

Posted a question here: system suspend - dbus upower signals are not seen

I have now also created a Fedora ticket for this: bugzilla 1064906


Wed, 05 Feb 2014 06:49:45 GMT - Antoine Martin: attachment set

dbus hooks for trying to get suspend/resume notifications


Sun, 16 Mar 2014 15:21:00 GMT - Antoine Martin:

According to this answer: Newer upower versions no longer emit that signal since this is handled by systemd. Now I have to ask systemd what we're supposed to do... this is not going to make the code any nicer!


Sun, 16 Mar 2014 15:42:10 GMT - Antoine Martin:

The systemd / logind equivalent is PrepareForSleep: The PrepareForShutdown() resp. PrepareForSleep() signals are sent right before (with the argument True) and after (with the argument False) the system goes down for reboot/poweroff, resp. suspend/hibernate.

So it looks like we need to look for logind and listen for this new signal, and fall back to upower otherwise.
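
A rough sketch of that "logind first, upower otherwise" logic, assuming the dbus-python bindings. `watch_sleep` and `main` are hypothetical names, checking `name_has_owner` is just one possible way of "looking for logind", and the upower signal names (Sleeping / Resuming) are those of the older upower releases. This is not the code from the attached scripts.

```python
LOGIND = dict(bus_name="org.freedesktop.login1",
              path="/org/freedesktop/login1",
              dbus_interface="org.freedesktop.login1.Manager")
UPOWER = dict(bus_name="org.freedesktop.UPower",
              path="/org/freedesktop/UPower",
              dbus_interface="org.freedesktop.UPower")


def watch_sleep(bus, on_suspend, on_resume):
    """Register the callbacks on `bus` (a dbus.SystemBus), return the backend used."""
    if bus.name_has_owner(LOGIND["bus_name"]):
        def prepare_for_sleep(going_down):
            # True right before suspend, False right after resume
            if going_down:
                on_suspend()
            else:
                on_resume()
        bus.add_signal_receiver(prepare_for_sleep,
                                signal_name="PrepareForSleep", **LOGIND)
        return "logind"
    # no logind: fall back to the older upower signals
    bus.add_signal_receiver(on_suspend, signal_name="Sleeping", **UPOWER)
    bus.add_signal_receiver(on_resume, signal_name="Resuming", **UPOWER)
    return "upower"


def main():
    # imported lazily: requires dbus-python and PyGObject
    import dbus
    from dbus.mainloop.glib import DBusGMainLoop
    from gi.repository import GLib
    DBusGMainLoop(set_as_default=True)
    backend = watch_sleep(dbus.SystemBus(),
                          lambda: print("system is suspending"),
                          lambda: print("system resumed"))
    print("listening for sleep events via", backend)
    GLib.MainLoop().run()
```

Calling main() and then suspending the machine should print both messages if the signals are delivered as documented.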


Mon, 17 Mar 2014 04:47:07 GMT - Antoine Martin: attachment set

dbus script using the new login1 interface


Mon, 17 Mar 2014 08:01:45 GMT - Antoine Martin: owner, status changed

r5821 simply logs the suspend and resume events, like so:

2014-03-17 18:55:55,400 system is suspending
2014-03-17 20:09:40,209 system resumed, was suspended for 1:13:44
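
For illustration only, messages in this format could be produced by something like the sketch below. `SuspendTracker` is a hypothetical name and this is not the code from r5821. It uses the wall clock because, on Linux at least, the monotonic clock does not advance while the system is suspended.

```python
import time
from datetime import timedelta


class SuspendTracker:
    """Hypothetical helper: logs suspend / resume and the time spent asleep."""

    def __init__(self, clock=time.time, log=print):
        self.clock = clock      # wall clock, so suspended time is counted
        self.log = log
        self.suspended_at = None

    def suspend(self):
        self.suspended_at = self.clock()
        self.log("system is suspending")

    def resume(self):
        if self.suspended_at is None:
            # we never saw the suspend event
            self.log("system resumed")
            return
        # drop the fractional seconds so the duration prints as H:MM:SS
        elapsed = timedelta(seconds=int(self.clock() - self.suspended_at))
        self.suspended_at = None
        self.log("system resumed, was suspended for %s" % elapsed)
```

The suspend() and resume() methods would be wired to whichever platform callbacks deliver the events.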

afarr: please test that the message does show up on the platforms that are meant to be already supported and which I am unable to test as virtualbox does not support OS level suspend and resume:

Eventually, we may also fire other actions from those callbacks to notify the server or re-connect if necessary.

r5824 contains some critical fixes, and r5826 fires the window refresh. It works here. Bug fixed.

As for OSX... it's never simple, and again I won't be able to test with virtualbox, here are some pointers:


Mon, 17 Mar 2014 15:48:05 GMT - Antoine Martin:

OSX is done in r5828, it's not pretty but it works!

So please test this too. On virtualbox, suspending via the apple menu shows "suspending", followed by "resuming" just 2 seconds later, so it seems to be working.


Mon, 17 Mar 2014 19:45:44 GMT - alas:

I don't currently have access to a debian or ubuntu system for testing.

You explicitly mention the following:

This is only relevant to local servers: resuming a client connected to a remote server should break the connection (eventually - we may want to break it quicker then) which is fine.

... which I will confirm is the case. With a win32 (0.12.0 r5828) client attached to a fedora 19 server, when I suspend (sleep) the windows 7 machine the connection is nearly instantly severed. The server session carries on happily, but the client disconnects.

However, when I try to run a "local server" - I am informed that "(This xpra installation does not support starting local servers.)"

C:\Program Files (x86)\Xpra>xpra_cmd.exe --no-daemon --bind-tcp=0.0.0.0:1201 --start-child=xterm --start-child=xterm start :17
Usage:
        xpra_cmd.exe attach [DISPLAY]
        xpra_cmd.exe detach [DISPLAY]
        xpra_cmd.exe screenshot filename [DISPLAY]
        xpra_cmd.exe info [DISPLAY]
        xpra_cmd.exe control DISPLAY command [arg1] [arg2]..
        xpra_cmd.exe version [DISPLAY]
        xpra_cmd.exe shadow [DISPLAY]
(This xpra installation does not support starting local servers.)

Tue, 18 Mar 2014 01:11:21 GMT - Antoine Martin:

Sorry, I should have made this clearer. The visual corruption is only relevant to local servers, since usually only local servers are still connected after resume (this also applies to virtual machines on the same host). What I am interested in here is the suspend & resume state detection code, i.e. whether these lines show up:

system is suspending
system resumed, was suspended for XX:XX:XX

And whether the state detection is timely and accurate.

I don't currently have access to a debian or ubuntu system for testing


I believe smo does, you can re-assign to him once you have tested the platforms you do have.

FYI: it may be used in the future (ie: #493), and will probably be used in this release to warn the server that a disconnection event is likely, and stop wasting bandwidth sending data that will never arrive at its destination - as per #543. I can only add this code once I am confident that the suspend and resume events are received reliably.


Thu, 20 Mar 2014 14:22:44 GMT - Antoine Martin: priority changed

Raising as this is blocking #543


Thu, 20 Mar 2014 19:31:11 GMT - alas:

Trying to test with windows 7, I'm not seeing any suspend messages.

Is there a different suspend mode that you have in mind for the windows client while connected, other than sleep?

Server side, I just get the "Disconnecting ... reason is: client ping timeout, - waited 60 seconds without a response" message.


Thu, 20 Mar 2014 20:46:39 GMT - alas:

I think I found the problem - just noticed the previous testing was with r5444, repeating with r5828...

2014-03-20 13:40:49,585 system is suspending
2014-03-20 13:40:54,286 server is not responding, drawing spinners over the windows
2014-03-20 13:40:57,993 system resumed, was suspended for 0:00:08
2014-03-20 13:40:59,197 server is OK again
2014-03-20 13:40:59,891 re-starting speaker because of overrun
2014-03-20 13:41:03,323 using audio codec: MPEG 1 Audio, Layer 3 (MP3)
2014-03-20 13:41:05,269 re-starting speaker because of overrun
2014-03-20 13:41:06,025 using audio codec: MPEG 1 Audio, Layer 3 (MP3)
2014-03-20 13:42:21,933 system is suspending
2014-03-20 13:42:24,628 server is not responding, drawing spinners over the windows
2014-03-20 13:43:40,279 system resumed, was suspended for 0:01:18
2014-03-20 13:43:40,358 WM_TIMECHANGE: time change event: 0 / 0
2014-03-20 13:43:40,390 server ping timeout - waited 60 seconds without a response
2014-03-20 13:43:41,920 Connection lost

Thu, 20 Mar 2014 21:04:51 GMT - alas:

Testing with osx r5458 ...

2014-03-20 13:54:15,269 system is suspending
2014-03-20 13:54:18,129 re-starting speaker because of overrun
2014-03-20 13:54:26,124 server is not responding, drawing spinners over the windows
2014-03-20 13:54:42,131 system resumed, was suspended for 0:00:26
2014-03-20 13:54:47,912 using audio codec: MPEG 1 Audio, Layer 3 (MP3)
2014-03-20 13:54:47,918 server is OK again
2014-03-20 13:54:47,920 re-starting speaker because of overrun
2014-03-20 13:54:50,492 using audio codec: MPEG 1 Audio, Layer 3 (MP3)
2014-03-20 13:57:35,566 system is suspending
2014-03-20 13:58:54,053 server is not responding, drawing spinners over the windows
2014-03-20 13:59:00,285 read connection reset for SocketConnection(('10.0.11.191', 51408) - ('10.0.32.172', 1201))
2014-03-20 13:59:00,287 connection lost: read connection reset: [Errno 54] Connection reset by peer
2014-03-20 13:59:00,289 Connection lost

Fri, 21 Mar 2014 04:36:45 GMT - Antoine Martin:

I've added a test application ("Events_Test.exe") for win32 in r5873, which should make it easier to investigate power events.


I've hooked power events into the window refresh code in r5875 - see #543. More follow up work in #540.


afarr: The OpenGL issue remains fixed, so please just check that a quick suspend-resume cycle works as well as it did before and then close this ticket.


Mon, 24 Mar 2014 23:03:44 GMT - J. Max Mena:

Tested with r5903:

However, on my laptop I was not able to get a resume even with a short sleep cycle; instead, only the following errors were printed regardless of sleep length (I have only pasted the relevant output).

2014-03-24 15:55:42,141 system is suspending
2014-03-24 15:55:43,796 read error for SocketConnection(('10.0.11.77', 54092) - ('10.0.32.172', 1200))
Traceback (most recent call last):
  File "xpra\net\protocol.pyc", line 606, in _io_thread_loop
  File "xpra\net\protocol.pyc", line 660, in _read
  File "xpra\net\bytestreams.pyc", line 117, in read
  File "xpra\net\bytestreams.pyc", line 60, in _read
  File "xpra\net\bytestreams.pyc", line 52, in untilConcludes
  File "xpra\net\bytestreams.pyc", line 22, in untilConcludes
error: [Errno 10053] An established connection was aborted by the software in your host machine
2014-03-24 15:55:43,798 connection lost: read error on connection: [Errno 10053] An established connection was aborted by the software in your host machine
2014-03-24 15:55:43,798 Connection lost

This seems to be related only to this individual laptop's sleep cycle. If it seems worth pursuing, let me know and I'll test it further; otherwise this is good to be closed.


Tue, 25 Mar 2014 08:56:58 GMT - Antoine Martin: status changed; resolution set

Judging by the Windows Sockets Error Codes

In any case, this led to a "connection lost", which is fine. (what we don't want is for the connection to stay up after we told the server to slow down, without telling it to speed up again)

r5904 should remove the ugly stacktrace (we add a bunch of win32 specific error codes to the ignore list)
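
As an illustration of the idea (not the actual r5904 change), an ignore list keyed on error codes might look like the sketch below. The names `EXPECTED_DISCONNECT_CODES` and `is_expected_disconnect` are hypothetical, and which codes were really added is an assumption: 10053 is the one seen in the trace above and 10054 is its Winsock "connection reset" sibling.

```python
import errno

# Winsock error codes (numeric values from the Windows Sockets documentation)
WSAECONNABORTED = 10053   # "connection was aborted by the software in your host machine"
WSAECONNRESET = 10054     # "connection was forcibly closed by the remote host"

# hypothetical ignore list: errors that just mean "the connection went away"
EXPECTED_DISCONNECT_CODES = {
    WSAECONNABORTED,
    WSAECONNRESET,
    errno.ECONNRESET,
    errno.EPIPE,
}


def is_expected_disconnect(exc):
    """True if `exc` is a socket error we can log without a stacktrace."""
    return isinstance(exc, OSError) and exc.errno in EXPECTED_DISCONNECT_CODES
```

A read loop could then log a one-line "connection lost" message for these errors and keep the full traceback for anything unexpected.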


Fri, 05 Sep 2014 07:02:19 GMT - Antoine Martin: attachment set

this is what my xterm looked like when I resumed


Wed, 09 Sep 2015 10:44:15 GMT - Antoine Martin:

r10573 should finally fix this properly: refreshing the pixels is not always enough, we may have to also reinitialize the window backing.

See also: #901, #924

Minor regressions: #2482, #2484


Sat, 23 Jan 2021 04:57:14 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/492