Solutions proposed:
Took approach number 2 in r229: we keep the benefits of doing it all server-side (it works with all apps that rely on keyboard state), but we rely on the client to regularly send us the "key-repeat" signal. If we don't get a signal in time, we unpress the key. (We also unpress all keys whenever we lose focus, which should cover other cases too.)
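The idea can be sketched as follows (a simplified model in plain Python, not xpra's actual code; all names and the timeout value are illustrative):

```python
KEY_REPEAT_TIMEOUT = 0.25  # hypothetical; the real timeout derives from the repeat delay

class KeyState:
    """Track pressed keys; unpress any key whose "key-repeat" signal stops."""
    def __init__(self, timeout=KEY_REPEAT_TIMEOUT):
        self.timeout = timeout
        self.pressed = {}               # keycode -> time of last press/repeat

    def key_action(self, now, keycode, pressed):
        # the client pressed (True) or released (False) a key
        if pressed:
            self.pressed[keycode] = now
        else:
            self.pressed.pop(keycode, None)

    def key_repeat(self, now, keycode):
        # the client confirms the key is still physically held down
        if keycode in self.pressed:
            self.pressed[keycode] = now

    def lost_focus(self):
        # unpress everything when we lose focus
        self.pressed.clear()

    def expire(self, now):
        """Run periodically: unpress keys we stopped hearing about in time."""
        stale = [k for (k, t) in self.pressed.items() if now - t > self.timeout]
        for k in stale:
            del self.pressed[k]         # the real server fakes a key-release here
        return stale

ks = KeyState()
ks.key_action(0.0, 37, True)            # Control_L pressed
ks.key_repeat(0.2, 37)                  # "key-repeat" arrived: still held
assert ks.expire(0.3) == []             # heard from it recently: stays pressed
assert ks.expire(0.5) == [37]           # signal stopped: key gets unpressed
```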
I have tested this by setting an artificially low timeout and the key was unpressed as expected. (could also test by delaying the "key-repeat" signal - not done yet). Can you please confirm if this works for you?
I'm not the original reporter, just discovered xpra a few days ago and ran straight into this problem quite badly (Wifi + cable modem + multihop SSH doesn't help jitter). Have tried briefly now with 0.0.7.28 and it works fine for me. Will report back after some prolonged use.
PS. xpra is awesome, thanks!
Don't know whether that's jitter or just a programming bug ...
I used to have the repeat error, until I used "xset r rate" to turn off key repeat on the server. (This was with 0.7.27 or something like that.)
Now I'm on 0.7.30; the problem is now a bit different. In xchat I switch channels with Ctrl+PageUp (resp. PageDown); this used to work, but now it switches many channels at once. (Easily visible because all visited channels lose the "unread messages" color.)
I hold Ctrl and press PageUp/PageDown for a very short time - but it seems that the keypress gets sent multiple times to the server.
Perhaps it's just a very fast repetition?
"xset q" gives me (on my primary X)
auto repeat delay: 250 repeat rate: 30
so it should take at least 250 msec for repeat to kick in - and I'm _not_ holding the keys that long. Perhaps the time for Ctrl is counted somehow?
Damn, I'm not seeing this here at all... Can you tell me your exact client and server OS/lib versions? My desktop is set to 500/33 and when I test the key repeat with a simple xterm (no xchat installed here), I see the expected number of keys at the expected rate... can you try that too?
Can you please also check "xset q" from within an xpra session to verify that the settings did get propagated to the virtual session correctly?
Otherwise, the last option is to look at the debug log with: "xpra -d all | grep -i key" (this is going to be quite verbose but should tell us what key events we simulate on the virtual session and for what reason)
Both are at 0.0.7.30+dfsg-1. Yes, "xset q" via xpra gives the same result: auto repeat delay: 250 repeat rate: 30
Here's the logfile.
xpra attach ... -d all | grep -i Key
The client log looks absolutely fine, can you post the server log? (same grepping)
More details: the key repeat timer fires for the Control key, but the one for the Page_Up key gets cancelled before firing (since you released the key).
Well, that means killing the running clients ... or should a "xpra upgrade -d all" work here?
BTW, I'm in #winswitch now, easier to talk there.
server log
The client log shows that the client correctly sends "key-repeat" messages to the server.
And here is the relevant bit from the server log:
got ['key-action', 1, 'Control_L', 1, [], 65507, '', 37]
now 1pressing keycode=37, keyname=Control_L
handle keycode pressing 37: key Control_L
scheduling key repeat timer with delay 250
got ['key-repeat', 37]
cancelling key repeat timer: 742
scheduling key repeat timer with delay 250
got ['key-repeat', 37]
(...)
got ['key-action', 1, 'Page_Down', 1, ['control'], 65366, '', 117]
now 1pressing keycode=117, keyname=Page_Down
handle keycode pressing 117: key Page_Down
scheduling key repeat timer with delay 250
got ['key-repeat', 37]
cancelling key repeat timer: 739
scheduling key repeat timer with delay 250
got ['key-repeat', 37]
cancelling key repeat timer: 742
scheduling key repeat timer with delay 250
key repeat timeout for keycode 117 / 'Page_Down' - clearing it
handle keycode releasing 117: key Page_Down
cancelling key repeat timer: 741
key repeat timeout for keycode 37 / 'Control_L' - clearing it
handle keycode releasing 37: key Control_L
cancelling key repeat timer: 746
got ['key-repeat', 37]
cancelling key repeat timer: 746
scheduling key repeat timer with delay 250
got ['key-repeat', 37]
cancelling key repeat timer: 777
scheduling key repeat timer with delay 250
got ['key-action', 1, 'Page_Down', 0, ['control'], 65366, '', 117]
now 0pressing keycode=117, keyname=Page_Down
handle keycode 117: key Page_Down was already pressed/released, ignoring
got ['key-repeat', 37]
cancelling key repeat timer: 778
scheduling key repeat timer with delay 250
got ['key-repeat', 37]
cancelling key repeat timer: 779
scheduling key repeat timer with delay 250
Between the moment the "Page_Down" key is pressed and released (look for the "key-action" messages), we get a "key repeat timeout" for both "Control_L" (37) and "Page_Down" (117).
This means that both the key-release event for Control_L (a "key-action" with the press flag set to 0) and the usual "key-repeat" for "Control_L" took an unusually long time to reach the server's read processing loop.
Since this does not happen at any other time during this log trace, it would seem that whatever "Control_L + Page_Down" triggered (quite likely a screen update) also caused the delay in processing/receiving those key messages, and therefore caused the keys to remain pressed longer than intended.
It would be quite useful to print timing information in the log messages to be able to confirm this...
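For illustration, timing information could be added with Python's standard logging module (a generic sketch, not xpra's own logging setup):

```python
import logging

# Millisecond-resolution timestamps make the gaps between consecutive
# "got [...]" lines visible, which is what we need to confirm a delayed
# key-release or key-repeat message.
logging.basicConfig(
    format="%(asctime)s.%(msecs)03d %(message)s",
    datefmt="%H:%M:%S",
    level=logging.DEBUG,
)
log = logging.getLogger("keyboard")
log.debug("got ['key-repeat', 37]")   # now prefixed with a HH:MM:SS.mmm timestamp
```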
Idea discussed on IRC: provide the key-release messages additionally via a UDP channel (openvpn?), so that a single packet loss is not enough to cause the problem.
(This assumes that the problem is a dropped TCP packet, with the timeout, retransmission, etc.)
The threaded server introduced in r284 drastically reduces latency on the server and r285 prioritizes "key-repeat" packets (just in case - probably not making a huge difference). Please let me know if this improves the situation, and if the problem still occurs, what is the load on both server and client at the time?
The only location where the client can lag is when receiving very large damage data; at the moment this is processed in the main thread (see draw() in client.py).
The same threaded approach cannot be used here, as the expensive call is in gtk and must therefore be made from the main thread... but maybe we can split the loading into chunks (won't be easy):
loader = gtk.gdk.PixbufLoader(coding)
loader.write(img_data, len(img_data))
loader.close()
pixbuf = loader.get_pixbuf()
The splitting into chunks is not the hard part (decide on a chunk size and just consume the buffer via idle_add calls); the problem is that we would end up consuming network packets faster than we actually process them... so the "damage-sequence" code would need to be changed (not sure if firing it after we actually finish loading and drawing something would be enough: we may have accumulated too much pending data already...)
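The chunked consumption could look roughly like this (plain Python; in the real client each callback would be handed to gobject.idle_add so the main loop stays responsive between chunks, and `write` would be the PixbufLoader's write method; names and the chunk size are illustrative):

```python
CHUNK_SIZE = 64 * 1024  # hypothetical chunk size

def chunk_writes(img_data, write, chunk_size=CHUNK_SIZE):
    """Yield one callable per chunk; each call feeds one chunk to `write`.
    Scheduled via gobject.idle_add, each callable would run in its own
    main-loop iteration instead of one long blocking write."""
    for offset in range(0, len(img_data), chunk_size):
        chunk = img_data[offset:offset + chunk_size]
        yield lambda c=chunk: write(c)

# Simulated main loop: collect what the "loader" receives
received = []
for idle_callback in chunk_writes(b"x" * 200000, received.append):
    idle_callback()

assert b"".join(received) == b"x" * 200000
assert len(received) == 4   # 200000 bytes in 64 KiB chunks -> 4 callbacks
```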
r288 makes the parsing and decompression of network packets threaded so we don't starve the main loop for long. This reduces the latency significantly when dealing with very large packets (like full screen updates at high res).
This helps the client the most as it receives much more data than the server.
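The structure of that change can be sketched in plain Python (zlib and a Queue stand in for xpra's actual protocol code; the names are illustrative):

```python
import threading
import zlib
from queue import Queue

def parse_worker(raw_packets, parsed_queue):
    """Decompress/parse packets in a worker thread so the main loop is
    never blocked on a large packet; it only dequeues finished results."""
    for raw in raw_packets:
        parsed_queue.put(zlib.decompress(raw))
    parsed_queue.put(None)                # sentinel: stream finished

# Three compressed "network packets" standing in for real protocol data:
raw = [zlib.compress(b"packet-%d" % i * 1000) for i in range(3)]

parsed = Queue()
worker = threading.Thread(target=parse_worker, args=(raw, parsed))
worker.start()

results = []                              # what the "main loop" consumes
while True:
    item = parsed.get()
    if item is None:
        break
    results.append(item)
worker.join()
```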
If you add the latency-measurement, please provide some way to get the current value - dbus, commandline, whatever.
At least for bug reports it might be interesting to have a floating average for the last 1, 5, 10, and 60 seconds, too ;)
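A floating average like that could be kept with a bounded sample history; a minimal sketch (illustrative, not xpra's implementation):

```python
from collections import deque

class LatencyHistory:
    """Keep (timestamp, latency_ms) samples and report floating averages
    over the last N seconds (e.g. the suggested 1, 5, 10 and 60)."""
    def __init__(self, max_age=60.0):
        self.samples = deque()
        self.max_age = max_age

    def add(self, now, latency_ms):
        self.samples.append((now, latency_ms))
        # drop samples older than the largest window we report on
        while self.samples and now - self.samples[0][0] > self.max_age:
            self.samples.popleft()

    def average(self, now, window):
        values = [l for (t, l) in self.samples if now - t <= window]
        return sum(values) / len(values) if values else None

h = LatencyHistory()
for i in range(10):                  # one sample per second: 10, 20, ... 100 ms
    h.add(float(i), (i + 1) * 10.0)
assert h.average(9.0, 1) == 95.0     # samples at t=8 and t=9: (90+100)/2
assert h.average(9.0, 60) == 55.0    # all ten samples
```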
This simple patch just sends a ping every second and prints the latency when the echo arrives - comments/ideas welcome:
Index: xpra/server.py
===================================================================
--- xpra/server.py	(revision 322)
+++ xpra/server.py	(working copy)
@@ -1292,6 +1292,8 @@
         self._send_new_window_packet(window)
         if self.send_cursors:
             self.send_cursor()
+        if True:
+            self.send_ping()
 
     def send_hello(self, client_capabilities):
         capabilities = {}
@@ -1533,6 +1535,16 @@
         log("received sequence: %s", packet_sequence)
         self._server_source.last_client_packet_sequence = packet_sequence
 
+    def send_ping(self):
+        if not self._protocol._closed:
+            self._send(["ping", long(1000.0*time.time())])
+            gobject.timeout_add(1000, self.send_ping)
+
+    def _process_ping_echo(self, proto, packet):
+        (_, ts) = packet
+        now = long(1000.0*time.time())
+        log.info("ping_echo time=%s, now=%s, ts=%s, latency=%s", time.time(), now, ts, now-ts)
+
     def _process_buffer_refresh(self, proto, packet):
         (_, id, _, jpeg_qual) = packet
         if self.encoding=="jpeg":
@@ -1591,6 +1603,7 @@
         "shutdown-server": _process_shutdown_server,
         "jpeg-quality": _process_jpeg_quality,
         "damage-sequence": _process_damage_sequence,
+        "ping_echo": _process_ping_echo,
         "buffer-refresh": _process_buffer_refresh,
         "desktop_size": _process_desktop_size,
         "encoding": _process_encoding,
Index: xpra/client.py
===================================================================
--- xpra/client.py	(revision 315)
+++ xpra/client.py	(working copy)
@@ -876,6 +876,11 @@
         log("_process_notify_close(%s)", id)
         self._client_extras.close_notify(id)
 
+    def _process_ping(self, packet):
+        (_, ts) = packet
+        log.info("ping: %s", ts)
+        self.send(["ping_echo", ts])
+
     def _process_window_metadata(self, packet):
         (_, id, metadata) = packet
         window = self._id_to_window[id]
@@ -918,6 +923,7 @@
         "bell": _process_bell,
         "notify_show": _process_notify_show,
         "notify_close": _process_notify_close,
+        "ping": _process_ping,
         "window-metadata": _process_window_metadata,
         "configure-override-redirect": _process_configure_override_redirect,
         "lost-window": _process_lost_window,
Sorry, I can't test that ...
# patch -l --dry-run -p1 /usr/lib/xpra/xpra/client.py < /tmp/xpra-patch
patching file /usr/lib/xpra/xpra/client.py
Hunk #1 FAILED at 1292.
Hunk #2 FAILED at 1533.
Hunk #3 FAILED at 1591.
3 out of 3 hunks FAILED -- saving rejects to file /usr/lib/xpra/xpra/client.py.rej
patching file /usr/lib/xpra/xpra/client.py
Hunk #1 succeeded at 877 (offset 1 line).
Hunk #2 succeeded at 924 (offset 1 line).
This is with xpra 0.0.7.31.
patch adding latency measurement via a simple ping
[~/Xpra/trunk/src] patch -p0 < ./ping.patch
patching file xpra/server.py
patching file xpra/client.py
Applies cleanly to trunk
new session-info window as shown when connecting to old servers (limited details)
session-info for new servers (more details)
I needed an easy way to see the latency from both sides, so r331 now has a dedicated window for seeing this sort of useful debugging information. See screenshots.
Please try it out and report on the type of values you see when problems occur.
r349 adds the "--no-keyboard-sync" option; the man page reads:
Disables keyboard state synchronization. Normally the key presses and key release events are sent to the server as they occur so that the server can maintain a consistent keyboard state. Using this option can prevent keys from repeating unexpectedly on high latency links but it may also disrupt applications which access the keyboard directly (games, etc).
This will not be made the default, and even if we somehow hooked it up to the latency measurements, we probably would not want to enable it on the fly automatically.
Feel free to re-open if you can still reproduce.
this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/21