Xpra: Ticket #581: xpra detach may cause a connection lost rather than a clean disconnection

Start client, call xpra detach.

Expected result: client is told to disconnect cleanly.

Actual result: client may get a "Connection lost" error a little while later, and it will exit with an error code.



Tue, 27 May 2014 13:20:34 GMT - Antoine Martin: owner, status changed

r6587 was only meant as a cleanup, but seems to fix the problem! odd.


Tue, 27 May 2014 14:11:04 GMT - Antoine Martin: owner, status changed

Damn: going back to v0.13.x, I can reproduce the bug again, but as soon as I enable -d protocol on the server, the bug disappears! Heisenbug, we meet again.

afarr: can you reproduce this? with protocol debugging enabled?


Tue, 27 May 2014 22:36:38 GMT - alas:

With 0.13.1 r6587 both (osx) client and (fedora20) server... I can't reproduce it, as far as I can tell. I see multiple instances of Connection lost and xpra client disconnected., but none seem to be coming with an error and none seem to be outputting any error code.

I tried against a fedora 19 server as well, 0.13.1 r 6527 (as I recall)... and I wasn't able to produce any errors there either.

I wasn't able to produce any error codes without -d protocol either... both seemed to be outputting similar messages.


Wed, 28 May 2014 02:07:47 GMT - alas:

In case it helps, this is the info I see with the -d protocol flag off:

xpra --bind-tcp=0.0.0.0:1201 --start-child=xterm --start-child=xterm start :23
 2014-05-27 18:57:34,043 New connection received: SocketConnection(/home/jimador/.xpra/zapopan.spikes.eng-23)
2014-05-27 18:57:34,056 Handshake complete; enabling connection
2014-05-27 18:57:34,056 Disconnecting existing client Protocol(SocketConnection(('10.0.32.53', 1201) - ('10.0.11.81', 60585))), reason is: new valid connection received, this sessi$
2014-05-27 18:57:34,057 xpra client disconnected.
2014-05-27 18:57:34,058 Connection lost
2014-05-27 18:57:34,061 Python/GObject Linux client version 0.13.1 connected from 'zapopan.spikes.eng' as 'jimador'
2014-05-27 18:57:34,058 Connection lost
2014-05-27 18:57:34,061 Python/GObject Linux client version 0.13.1 connected from 'zapopan.spikes.eng' as 'jimador'
2014-05-27 18:57:34,061 windows/pixels forwarding is disabled for this client
2014-05-27 18:57:34,075 Connection lost
2014-05-27 18:57:34,129 Connection lost
2014-05-27 18:57:34,129 xpra client disconnected.

While this is what I get with the -d protocol flag on:

xpra --bind-tcp=0.0.0.0:1201 --start-child=xterm --start-child=xterm -d protocol start :23
2014-05-27 19:00:33,116 New connection received: SocketConnection(/home/jimador/.xpra/zapopan.spikes.eng-23)
2014-05-27 19:00:33,089 add_packet_to_queue(sound-data ...)
2014-05-27 19:00:33,116 New connection received: SocketConnection(/home/jimador/.xpra/zapopan.spikes.eng-23)
2014-05-27 19:00:33,116 enable_bencode()
2014-05-27 19:00:33,117 io_thread_loop(write, <bound method Protocol._write of Protocol(SocketConnection(/home/jimador/.xpra/zapopan.spikes.eng-23))>) loop starting
2014-05-27 19:00:33,118 read_parse_thread_loop starting
2014-05-27 19:00:33,119 io_thread_loop(read, <bound method Protocol._read of Protocol(SocketConnection(/home/jimador/.xpra/zapopan.spikes.eng-23))>) loop starting
2014-05-27 19:00:33,119 add_packet_to_queue(sound-data ...)
2014-05-27 19:00:33,120 write_format_thread_loop starting
2014-05-27 19:00:33,129 processing packet hello
2014-05-27 19:00:33,130 enable_rencode()
2014-05-27 19:00:33,130 enable_lz4()
2014-05-27 19:00:33,130 Handshake complete; enabling connection
2014-05-27 19:00:33,130 Disconnecting existing client Protocol(SocketConnection(('10.0.32.53', 1201) - ('10.0.11.81', 60587))), reason is: new valid connection received, this sessi$
2014-05-27 19:00:33,130 flush_then_close: acquired the write lock
2014-05-27 19:00:33,130 flush_then_close: queue is now empty, sending the last packet and closing
2014-05-27 19:00:33,131 close() closed=False
2014-05-27 19:00:33,131 terminate_queue_threads()
2014-05-27 19:00:33,132 io_thread_loop(write, <bound method Protocol._write of Protocol(None)>) loop ended, closed=True
2014-05-27 19:00:33,132 xpra client disconnected.
2014-05-27 19:00:33,133 Connection lost
2014-05-27 19:00:33,134 read thread: empty marker, exiting
2014-05-27 19:00:33,136 Python/GObject Linux client version 0.13.1 connected from 'zapopan.spikes.eng' as 'jimador'
2014-05-27 19:00:33,136 windows/pixels forwarding is disabled for this client
2014-05-27 19:00:33,141 add_packet_to_queue(hello ...)
2014-05-27 19:00:33,150 close() closed=True
2014-05-27 19:00:33,150 Connection lost
2014-05-27 19:00:33,205 read thread: eof
2014-05-27 19:00:33,205 close() closed=False
2014-05-27 19:00:33,205 Connection lost
2014-05-27 19:00:33,206 terminate_queue_threads()
2014-05-27 19:00:33,206 close() closed=True
2014-05-27 19:00:33,206 read thread: empty marker, exiting
2014-05-27 19:00:33,207 xpra client disconnected.
2014-05-27 19:00:33,209 write thread: empty marker, exiting
2014-05-27 19:00:33,209 close() closed=True
2014-05-27 19:00:33,210 io_thread_loop(write, <bound method Protocol._write of Protocol(None)>) loop ended, closed=True
2014-05-27 19:00:33,209 close() closed=True
2014-05-27 19:00:33,209 io_thread_loop(read, <bound method Protocol._read of Protocol(None)>) loop ended, closed=True
2014-05-27 19:00:38,137 close() closed=True
2014-05-27 19:00:39,275 io_thread_loop(read, <bound method Protocol._read of Protocol(None)>) loop ended, closed=True

Tue, 24 Jun 2014 17:59:53 GMT - alas:

I still haven't been able to reproduce the "connection lost" issue. Were there any other circumstances that might be worth probing further?


Tue, 24 Jun 2014 19:37:17 GMT - Antoine Martin:

I am definitely seeing this problem occasionally, possibly more often on win32 than on Linux. I've not had time to investigate this further, but there is definitely a bug in there. I just need to get steps to reproduce with debug switched on...


Tue, 29 Jul 2014 16:47:27 GMT - Antoine Martin:

Found two problems related to detach (3 counting a trunk-only bug in r7012), both should be backported, details are in the commit messages r7015 and r7016. As noted in the bug description, this was causing the client to wrongly exit with an error code.

afarr: with versions before r7015, this was fairly easily reproducible here by repeating the following sequence a number of times (usually less than 3 or 4 times to hit it):

Please try it a few times and close this ticket.


Thu, 31 Jul 2014 18:29:01 GMT - J. Max Mena: status changed; resolution set

Retested with r7041 Windows and CentOS r6929(old, but behaved nice) against r7041 Fedora 20. Connected and used xpra detach :13 a few times. Output from Windows Xpra_cmd.exe was:

C:\Users\labrat\Desktop>cd "C:\Program Files *(x86)\Xpra"
C:\Program Files (x86)\Xpra>cmd
Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.
C:\Program Files (x86)\Xpra>Xpra_cmd.exe attach tcp:10.0.32.52:1200
2014-07-31 11:01:48,454 xpra client version 0.14.0
2014-07-31 11:01:51,746 OpenGL_accelerate module loaded
2014-07-31 11:01:51,746 Using accelerated ArrayDatatype
2014-07-31 11:01:52,334 detected keyboard: layout=us
2014-07-31 11:01:52,335 desktop size is 1920x1080 with 1 screen(s):
2014-07-31 11:01:52,338   '1\WinSta-Default' (508x285 mm)
2014-07-31 11:01:52,339     DISPLAY1 (677x381 mm)
2014-07-31 11:01:52,380 server: Linux Fedora 20 Heisenbug, Xpra version 0.14.0 (
r7041)
2014-07-31 11:01:52,611 Attached to tcp:10.0.32.52:1200 (press Control-C to deta
ch)
2014-07-31 11:01:53,821 using audio codec: MPEG 1 Audio, Layer 3 (MP3)
2014-07-31 11:01:54,941 server requested disconnect: detach requested
2014-07-31 11:01:55,328 Connection lost
C:\Program Files (x86)\Xpra>Xpra_cmd.exe attach tcp:10.0.32.52:1200
2014-07-31 11:01:59,237 xpra client version 0.14.0
2014-07-31 11:02:02,644 OpenGL_accelerate module loaded
2014-07-31 11:02:02,644 Using accelerated ArrayDatatype
2014-07-31 11:02:03,214 detected keyboard: layout=us
2014-07-31 11:02:03,217 desktop size is 1920x1080 with 1 screen(s):
2014-07-31 11:02:03,223   '1\WinSta-Default' (508x285 mm)
2014-07-31 11:02:03,224     DISPLAY1 (677x381 mm)
2014-07-31 11:02:03,266 server: Linux Fedora 20 Heisenbug, Xpra version 0.14.0 (
r7041)
2014-07-31 11:02:03,499 Attached to tcp:10.0.32.52:1200 (press Control-C to deta
ch)
2014-07-31 11:02:04,243 using audio codec: MPEG 1 Audio, Layer 3 (MP3)
2014-07-31 11:02:06,595 server requested disconnect: detach requested
2014-07-31 11:02:06,944 Connection lost
C:\Program Files (x86)\Xpra>Xpra_cmd.exe attach tcp:10.0.32.52:1200
2014-07-31 11:02:17,844 xpra client version 0.14.0
2014-07-31 11:02:21,125 OpenGL_accelerate module loaded
2014-07-31 11:02:21,125 Using accelerated ArrayDatatype
2014-07-31 11:02:21,703 detected keyboard: layout=us
2014-07-31 11:02:21,704 desktop size is 1920x1080 with 1 screen(s):
2014-07-31 11:02:21,707   '1\WinSta-Default' (508x285 mm)
2014-07-31 11:02:21,709     DISPLAY1 (677x381 mm)
2014-07-31 11:02:21,756 server: Linux Fedora 20 Heisenbug, Xpra version 0.14.0 (
r7041)
2014-07-31 11:02:21,986 Attached to tcp:10.0.32.52:1200 (press Control-C to deta
ch)
2014-07-31 11:02:22,528 using audio codec: MPEG 1 Audio, Layer 3 (MP3)
2014-07-31 11:02:25,022 server requested disconnect: detach requested
2014-07-31 11:02:25,394 Connection lost
C:\Program Files (x86)\Xpra>

Output from CentOS client was identical even after 6 or 7 repeats(can provide logs if requested...on another machine). Closing.


Fri, 01 Aug 2014 03:03:18 GMT - Antoine Martin:

Thanks, backports in r7061 + r7062

See also #643 / #659.


Sat, 23 Jan 2021 04:59:59 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/581