This is only relevant to local servers: resuming a client connected to a remote server should break the connection (eventually - we may want to break it quicker then) which is fine.
On Linux, we should be able to get the event from the UPower Resuming dbus signal. Found some example code:
We could use this same code to force a server encoder refresh too, as hardware encoders (nvenc / opencl / cuda) tend to get messed up during suspend-resume, but we only notice next time we try to use them.
On win32, we could detect WM_POWERBROADCAST events.
Then we can just ask for a server lossless refresh to make sure the windows display clean contents.
This looks like a driver bug to me: the GPU buffers should be preserved, maybe it is the OpenGL paint state that is inconsistent?
Easy to reproduce, and should be easy to fix too.
The dbus approach sounds nice, except it doesn't work... I can't get any of the code examples to fire. This is also meant to fire the same Resuming signal (found in /usr/lib/systemd/system/upower.service), but does nothing:
dbus-send --system --type=signal --dest=org.freedesktop.UPower \
    /org/freedesktop/UPower org.freedesktop.UPower.Resuming
Posted a question here: system suspend - dbus upower signals are not seen
I have now also created a Fedora ticket for this: bugzilla 1064906
dbus hooks for trying to get suspend/resume notifications
According to this answer: Newer upower versions no longer emit that signal since this is handled by systemd. Now I have to ask systemd what we're supposed to do... this is not going to make the code any nicer!
The systemd / logind equivalent is PrepareForSleep:
The PrepareForShutdown() resp. PrepareForSleep() signals are sent right before (with the argument True) and after (with the argument False) the system goes down for reboot/poweroff, resp. suspend/hibernate.
So it looks like we need to look for logind and listen for this new signal, and fall back to upower otherwise.
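A minimal sketch of what listening for the logind signal could look like with dbus-python. This is an illustration under stated assumptions, not the code from any xpra changeset: the helper names make_sleep_handler and listen_for_sleep_events are hypothetical, the upower fallback is left out, and the caller is assumed to run a GLib main loop so that signals get dispatched.

```python
# Well-known logind names on the system bus.
LOGIND_BUS = "org.freedesktop.login1"
LOGIND_PATH = "/org/freedesktop/login1"
LOGIND_IFACE = "org.freedesktop.login1.Manager"


def make_sleep_handler(on_suspend, on_resume):
    """Build a PrepareForSleep callback (hypothetical helper).

    logind sends True right before suspending and False after resuming.
    """
    def prepare_for_sleep(going_down):
        if going_down:
            on_suspend()
        else:
            on_resume()
    return prepare_for_sleep


def listen_for_sleep_events(on_suspend, on_resume):
    """Subscribe to PrepareForSleep (hypothetical helper).

    Assumes dbus-python is installed; imported lazily so the module
    still loads on systems without it.
    """
    import dbus
    from dbus.mainloop.glib import DBusGMainLoop

    DBusGMainLoop(set_as_default=True)
    bus = dbus.SystemBus()
    bus.add_signal_receiver(
        make_sleep_handler(on_suspend, on_resume),
        signal_name="PrepareForSleep",
        dbus_interface=LOGIND_IFACE,
        bus_name=LOGIND_BUS,
        path=LOGIND_PATH,
    )
    return bus
```

Keeping the True/False dispatch in a separate function means it can be exercised without a running system bus.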
dbus script using the new login1 interface
r5821 simply logs the suspend and resume events, like so:
2014-03-17 18:55:55,400 system is suspending
2014-03-17 20:09:40,209 system resumed, was suspended for 1:13:44
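As a rough illustration of how these two messages can be produced, here is a small sketch that records the suspend time and reports the elapsed duration on resume. The class name SuspendTracker and its methods are hypothetical and are not taken from r5821. It assumes wall-clock time is the right source, since monotonic clocks may not advance while the system is suspended.

```python
import time
from datetime import timedelta


class SuspendTracker:
    """Hypothetical helper: remembers when the system went to sleep."""

    def __init__(self, clock=time.time):
        # Wall-clock time by default; injectable so it can be tested.
        self.clock = clock
        self.suspended_at = None

    def suspending(self):
        self.suspended_at = self.clock()
        return "system is suspending"

    def resumed(self):
        if self.suspended_at is None:
            # We never saw the matching suspend event.
            return "system resumed"
        # Truncate to whole seconds, as in the log line above.
        elapsed = int(self.clock() - self.suspended_at)
        self.suspended_at = None
        return "system resumed, was suspended for %s" % timedelta(seconds=elapsed)
```

The returned strings would be passed to whatever logger the client already uses.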
afarr: please test that the message does show up on the platforms that are meant to be already supported and which I am unable to test as virtualbox does not support OS level suspend and resume:
Eventually, we may also fire other actions from those callbacks to notify the server or re-connect if necessary.
r5824 contains some critical fixes, and r5826 fires the window refresh. It works here. Bug fixed.
As for OSX... it's never simple, and again I won't be able to test with virtualbox, here are some pointers:
OSX is done in r5828, it's not pretty but it works!
So please test this too - on virtualbox, suspending via the apple menu, shows "suspending", followed by "resuming" just 2 seconds later. So it seems to be working.
I don't currently have access to a debian or ubuntu system for testing.
You explicitly mention the following:
This is only relevant to local servers: resuming a client connected to a remote server should break the connection (eventually - we may want to break it quicker then) which is fine.
... which I will confirm is the case. With a win32 (0.12.0 r5828) client attached to a fedora 19 server, when I suspend (sleep) the windows 7 machine the connection is nearly instantly severed. The server session carries on happily, but the client disconnects.
However, when I try to run a "local server" - I am informed that "(This xpra installation does not support starting local servers.)"
C:\Program Files (x86)\Xpra>xpra_cmd.exe --no-daemon --bind-tcp=0.0.0.0:1201 --start-child=xterm --start-child=xterm start :17
Usage:
    xpra_cmd.exe attach [DISPLAY]
    xpra_cmd.exe detach [DISPLAY]
    xpra_cmd.exe screenshot filename [DISPLAY]
    xpra_cmd.exe info [DISPLAY]
    xpra_cmd.exe control DISPLAY command [arg1] [arg2]..
    xpra_cmd.exe version [DISPLAY]
    xpra_cmd.exe shadow [DISPLAY]
(This xpra installation does not support starting local servers.)
Sorry I should have made this clearer: although the visual corruption is only relevant to local servers, as only local servers will still be connected when resumed (usually - but this also works with virtual machines on the same host), the suspend & resume state detection code is what I am interested in. The lines:
system is suspending
system resumed, was suspended for XX:XX:XX
And whether the state detection is timely and accurate.
I don't currently have access to a debian or ubuntu system for testing
I believe smo does, you can re-assign to him once you have tested the platforms you do have.
FYI: it may be used in the future (ie: #493), and will probably be used in this release to warn the server that a disconnection event is likely, and stop wasting bandwidth sending data that will never arrive at its destination - as per #543. I can only add this code once I am confident that the suspend and resume events are received reliably.
Raising as this is blocking #543
Trying to test with windows 7, I'm not seeing any suspend messages.
unexpected message: 50006 / 0 / 0
2014-03-20 12:13:11,664 re-starting speaker because of overrun
2014-03-20 12:13:12,351 using audio codec: MPEG 1 Audio, Layer 3 (MP3)
2014-03-20 12:13:38,335 unexpected message: WM_POWERBROADCAST / 4 / 0
2014-03-20 12:13:40,029 re-starting speaker because of overrun
2014-03-20 12:13:40,717 using audio codec: MPEG 1 Audio, Layer 3 (MP3)
2014-03-20 12:13:49,749 unexpected message: WM_POWERBROADCAST / 18 / 0
2014-03-20 12:13:49,815 unexpected message: WM_POWERBROADCAST / 7 / 0
2014-03-20 12:13:49,826 unexpected message: WM_TIMECHANGE / 0 / 0
2014-03-20 12:13:49,977 server is not responding, drawing spinners over the windows
2014-03-20 12:13:57,947 server is OK again
2014-03-20 12:13:57,960 re-starting speaker because of overrun
2014-03-20 12:13:59,098 using audio codec: MPEG 1 Audio, Layer 3 (MP3)

2014-03-20 12:18:29,937 unexpected message: WM_POWERBROADCAST / 4 / 0
2014-03-20 12:19:50,506 server ping timeout - waited 60 seconds without a response
2014-03-20 12:19:52,766 server is not responding, drawing spinners over the windows
2014-03-20 12:19:52,811 Connection lost
2014-03-20 12:19:52,953 server is not responding, drawing spinners over the windows
Is there a different suspend mode that you have in mind for the windows client while connected, other than sleep?
2014-03-20 12:24:54,490 unexpected message: WM_POWERBROADCAST / 4 / 0
2014-03-20 12:24:58,250 unexpected message: WM_NCCALCSIZE / 1 / 1635532
2014-03-20 12:24:58,286 unexpected message: WM_WINDOWPOSCHANGED / 0 / 1635572
2014-03-20 12:24:58,349 unexpected message: WM_NCCALCSIZE / 1 / 1634520
2014-03-20 12:24:58,414 unexpected message: 798 / 0 / 0
2014-03-20 12:27:31,762 unexpected message: WM_TIMECHANGE / 0 / 0
2014-03-20 12:27:32,118 unexpected message: WM_POWERBROADCAST / 7 / 0
2014-03-20 12:27:33,349 server is not responding, drawing spinners over the windows
2014-03-20 12:27:33,960 unexpected message: WM_POWERBROADCAST / 18 / 0
2014-03-20 12:27:39,076 unexpected message: WM_NCCALCSIZE / 1 / 1635532
2014-03-20 12:27:39,085 unexpected message: WM_WINDOWPOSCHANGED / 0 / 1635572
2014-03-20 12:27:39,098 unexpected message: WM_NCCALCSIZE / 1 / 1634520
2014-03-20 12:27:39,108 unexpected message: 798 / 0 / 0
2014-03-20 12:27:42,506 unexpected message: WM_WININICHANGE / 47 / 582344
2014-03-20 12:27:44,555 unexpected message: WM_WININICHANGE / 47 / 582344
Server side, I just get the "Disconnecting ... reason is: client ping timeout, - waited 60 seconds without a response" message.
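For reference, the wParam values seen with WM_POWERBROADCAST in the logs above (4, 7 and 18) match the documented Win32 constants PBT_APMSUSPEND, PBT_APMRESUMESUSPEND and PBT_APMRESUMEAUTOMATIC. A minimal sketch of how a message hook might classify them follows. The function name classify_power_event is hypothetical, and this is not the handler that xpra actually uses.

```python
# Win32 constants (values from the Windows SDK headers).
WM_POWERBROADCAST = 0x0218

PBT_APMSUSPEND = 0x0004          # system is about to suspend
PBT_APMRESUMESUSPEND = 0x0007    # resumed, triggered by user activity
PBT_APMRESUMEAUTOMATIC = 0x0012  # resumed automatically (18 in the logs)


def classify_power_event(msg, wparam):
    """Hypothetical helper: map a window message to 'suspend', 'resume' or None."""
    if msg != WM_POWERBROADCAST:
        return None
    if wparam == PBT_APMSUSPEND:
        return "suspend"
    if wparam in (PBT_APMRESUMESUSPEND, PBT_APMRESUMEAUTOMATIC):
        return "resume"
    # Other power events (battery status changes, etc.) are ignored here.
    return None
```

Since the logs show both resume events arriving for a single wake-up, a caller would probably want to ignore the second "resume" when no "suspend" has been seen in between.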
I think I found the problem - just noticed the previous testing was with r5444, repeating with r5828...
2014-03-20 13:36:30,243 unexpected message: 49841 / 0 / 0
2014-03-20 13:38:04,336 unexpected message: 49841 / 0 / 0

2014-03-20 13:40:49,585 system is suspending
2014-03-20 13:40:54,286 server is not responding, drawing spinners over the windows
2014-03-20 13:40:57,993 system resumed, was suspended for 0:00:08
2014-03-20 13:40:59,197 server is OK again
2014-03-20 13:40:59,891 re-starting speaker because of overrun
2014-03-20 13:41:03,323 using audio codec: MPEG 1 Audio, Layer 3 (MP3)
2014-03-20 13:41:05,269 re-starting speaker because of overrun
2014-03-20 13:41:06,025 using audio codec: MPEG 1 Audio, Layer 3 (MP3)

2014-03-20 13:42:21,933 system is suspending
2014-03-20 13:42:24,628 server is not responding, drawing spinners over the windows
2014-03-20 13:43:40,279 system resumed, was suspended for 0:01:18
2014-03-20 13:43:40,358 WM_TIMECHANGE: time change event: 0 / 0
2014-03-20 13:43:40,390 server ping timeout - waited 60 seconds without a response
2014-03-20 13:43:41,920 Connection lost
Testing with osx r5458 ...
2014-03-20 13:54:15,269 system is suspending
2014-03-20 13:54:18,129 re-starting speaker because of overrun
2014-03-20 13:54:26,124 server is not responding, drawing spinners over the windows
2014-03-20 13:54:42,131 system resumed, was suspended for 0:00:26
2014-03-20 13:54:47,912 using audio codec: MPEG 1 Audio, Layer 3 (MP3)
2014-03-20 13:54:47,918 server is OK again
2014-03-20 13:54:47,920 re-starting speaker because of overrun
2014-03-20 13:54:50,492 using audio codec: MPEG 1 Audio, Layer 3 (MP3)

2014-03-20 13:57:35,566 system is suspending
2014-03-20 13:58:54,053 server is not responding, drawing spinners over the windows
2014-03-20 13:59:00,285 read connection reset for SocketConnection(('10.0.11.191', 51408) - ('10.0.32.172', 1201))
2014-03-20 13:59:00,287 connection lost: read connection reset: [Errno 54] Connection reset by peer
2014-03-20 13:59:00,289 Connection lost
I've added a test application ("Events_Test.exe") for win32 in r5873, which should make it easier to investigate power events.
I've hooked power events into the window refresh code in r5875 - see #543. More follow up work in #540.
afarr: The OpenGL issue remains fixed, so please just check that a quick suspend-resume cycle works as well as it did before and then close this ticket.
Tested with r5903:
However, on my laptop I was not able to get a resume even with a short sleep cycle, instead only the following errors printed regardless of sleep length (I only pasted the relevant prints).
2014-03-24 15:55:42,141 system is suspending
2014-03-24 15:55:43,796 read error for SocketConnection(('10.0.11.77', 54092) - ('10.0.32.172', 1200))
Traceback (most recent call last):
  File "xpra\net\protocol.pyc", line 606, in _io_thread_loop
  File "xpra\net\protocol.pyc", line 660, in _read
  File "xpra\net\bytestreams.pyc", line 117, in read
  File "xpra\net\bytestreams.pyc", line 60, in _read
  File "xpra\net\bytestreams.pyc", line 52, in untilConcludes
  File "xpra\net\bytestreams.pyc", line 22, in untilConcludes
error: [Errno 10053] An established connection was aborted by the software in your host machine
2014-03-24 15:55:43,798 connection lost: read error on connection: [Errno 10053] An established connection was aborted by the software in your host machine
2014-03-24 15:55:43,798 Connection lost
This seems to be only related to the individual laptop's sleep cycle, if it seems worth pursuing let me know and I'll test it further; otherwise this is good to be closed.
Judging by the Windows Sockets Error Codes
In any case, this led to a "connection lost", which is fine. (what we don't want is for the connection to stay up after we told the server to slow down, without telling it to speed up again)
r5904 should remove the ugly stacktrace (we add a bunch of win32 specific error codes to the ignore list)
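To illustrate the idea of an ignore list, here is a hedged sketch of how a read loop might decide that an error just means "connection lost" and skip the stack trace. It is not the r5904 change itself: the names CONNECTION_LOST_CODES and is_connection_lost are hypothetical. Only 10053 comes from the log above; 10054 (WSAECONNRESET) is added as an assumption about what else belongs in the list.

```python
import errno

# Winsock codes given numerically so the sketch also runs off Windows.
WSAECONNABORTED = 10053  # seen in the traceback above
WSAECONNRESET = 10054    # assumption: the Winsock "reset by peer" code

CONNECTION_LOST_CODES = {
    errno.ECONNRESET,
    errno.ECONNABORTED,
    errno.EPIPE,
    WSAECONNABORTED,
    WSAECONNRESET,
}


def is_connection_lost(exc):
    """Hypothetical helper: True if the error only means the peer went away."""
    code = getattr(exc, "errno", None)
    # On Windows, OSError may also carry the raw code in 'winerror'.
    winerror = getattr(exc, "winerror", None)
    return code in CONNECTION_LOST_CODES or winerror in CONNECTION_LOST_CODES
```

A caller would log a one-line "connection lost" message when this returns True and keep the full traceback for anything else.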
this is what my xterm looked like when I resumed
r10573 should finally fix this properly: refreshing the pixels is not always enough, we may have to also reinitialize the window backing.
Minor regressions: #2482, #2484
this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/492