Running 3 windows of google-chrome, I experienced a couple of crashes of the Windows client Xpra_Setup_0.15.0-r7577M.exe running against a 0.14.4 Linux server. When the error occurs, the log on Windows (Win7 Pro) says:
Gdk:ERROR:gdkdrawable-win32.c:2040:_gdk_win32_drawable_finish: assertion failed: (impl->hdc_count == 0)
Then it says "This application has requested the Runtime to terminate it in an unusual way". The server terminates 60 seconds later due to a timeout.
Ouch, is this a new thing that only happens with the beta? Can you easily reproduce it? If so, can you try an older beta build until you find one without the bug? (that will help us narrow it down)
This could be a serious bug. afarr, can you reproduce it?
Testing with xpra_setup_0.15.0-r7804 win client (windows 7), and your 0.14.4 xpra-0.14.4-1.fc20.x86_64.rpm fedora 20 server I wasn't able to get the crash listed with 3 google-chrome windows... but when I tried to open the session info my server kept disconnecting me with a huge amount of what looks like keyboard mapping debug info.
On the client side I did notice this:
C:\Program Files (x86)\Xpra>xpra_cmd.exe attach tcp:10.0.32.53:1201
2014-09-25 17:52:33,023 rencode import error: No module named rencode
2014-09-25 17:52:34,226 Warning: 'rencode' packet encoder not found
2014-09-25 17:52:34,230 the other packet encoders are much slower
2014-09-25 17:52:34,232 xpra client version 0.15.0
2014-09-25 17:52:38,543 OpenGL_accelerate module loaded
2014-09-25 17:52:38,545 Using accelerated ArrayDatatype
2014-09-25 17:52:38,548 OpenGL support could not be enabled:
2014-09-25 17:52:38,549 some required OpenGL functions are not available: glActiveTexture, glMultiTexCoord2i
... on connection, but the client worked despite that. The disconnection message on the client side was more innocuous:
2014-09-25 17:52:53,686 Connection lost
2014-09-25 17:52:54,714 server is not responding, drawing spinners over the windows
but when I tried to open the session info my server kept disconnecting me with a huge amount of what looks like keyboard mapping debug info
This sounds like a bug caused by a problem encoding info data with the fallback bencoder. This is a serious bug that needs to be fixed. Please reproduce it and create a new ticket for it (linking to #614, which is where this should have been caught), forcing bencode, if rencode has been fixed in your builds, with: --packet-encoders=bencode. (You can verify it is using bencode via session info or xpra info.)
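As background on why a fallback encoder can choke on data that another encoder handles, here is a minimal Python sketch of classic (BitTorrent-style) bencode, which only knows integers, byte strings, lists and dictionaries. This is an illustration only: it is not xpra's own bencode module (which may well differ), it is not a diagnosis of this bug, and the sample dictionary is made up.

```python
def bencode(value):
    """Encode a value using classic bencode: ints, bytes, lists and dicts only."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python, but classic bencode has no boolean type
        raise TypeError("booleans have no bencode representation")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, (list, tuple)):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        out = [b"d"]
        for key in sorted(value):  # keys are emitted in sorted order
            if not isinstance(key, bytes):
                raise TypeError("dictionary keys must be bytes")
            out.append(bencode(key))
            out.append(bencode(value[key]))
        out.append(b"e")
        return b"".join(out)
    # anything else (float, None, str, ...) cannot be represented
    raise TypeError("cannot bencode %s" % type(value).__name__)


# Hypothetical "info"-style data, for illustration only
print(bencode({b"encoder": b"bencode", b"windows": [1, 2, 3]}))
try:
    bencode({b"latency": 0.25})
except TypeError as e:
    print("rejected:", e)
```

The point of the sketch is only that a value type one encoder accepts may have no representation in another, which is one generic way for a fallback path to fail.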
The opengl failures would be worth recording in #679: which chipset this is, with the opengl failure this causes: some required OpenGL functions are not available: glActiveTexture, glMultiTexCoord2i.
To test for this crash, I believe you need to use a machine which has opengl enabled. (at least I think that is where the _gdk_win32_drawable_finish is coming from)
And since it looks opengl related, playing with single and double buffering may also make a difference. (using XPRA_OPENGL_DOUBLE_BUFFERED=0 for disabling double buffering, double buffering status is shown on session info)
Sorry for the delay. I tried to reproduce the bug. Right now I get "internal error: error in network packet reading/parsing" in xpra\net\protocol.pyc (line 585). I used just two "google-chrome" windows with heise.de opened. Then, after some patience and maybe some window moves (I have not figured out any pattern yet), it crashes...
The same error occurs with Win32 0.14.4 (then it's line 587). So I guess I cannot find an older version without the bug. OpenGL is disabled. Session info says "n/a".
I need some instructions on how to proceed. I can offer to record a "tcpdump" of all the traffic. In that case I need to know whether I should record the traffic between client and proxy or between proxy and server. However, I would like to send it by PM since it might contain personal information.
Right now I get "internal error: error in network packet reading/parsing" in xpra\net\protocol.pyc (line 585).
That's odd, and completely different from the bug above.
In that case I need to know whether I should record the traffic between client and proxy or between proxy and server.
So you're using the proxy... that's a crucial bit of information which was missing until now.
I suspect that for one reason or another, your proxy ends up using bencode instead of rencode, and chokes on something.
Getting xpra info from it might help. As would running with -d network to see more details about the cause of the parsing loop crash.
How can I get xpra info from the proxy process?
$ xpra info :100
Warning: running as root server requires authentication, please provide a password
According to the xpra man page, there is no way to pass a username and password. And if I use the tcp:user:host:port syntax, I would reach the xpra server, not the proxy. Or did you actually mean getting "xpra info" from the server process?
With 0.14.4 client and server I just reproduced the bug. Here is the relevant part of the proxy log:
...
2014-09-26 21:58:11,197 processing packet draw
2014-09-26 21:58:11,197 add_packet_to_queue(draw ...)
2014-09-26 21:58:11,215 processing packet damage-sequence
2014-09-26 21:58:11,215 add_packet_to_queue(damage-sequence ...)
2014-09-26 21:58:11,232 internal error: read connection SocketConnection(('1.2.3.1', 123) - ('1.2.3.4', 49542)) reset: [Errno 104] Connection reset by peer
2014-09-26 21:58:11,233 connection lost: read connection SocketConnection(('1.2.3.1', 123) - ('1.2.3.4', 49542)) reset: [Errno 104] Connection reset by peer
...
The log for the server is almost the same (except host and port numbers). And here is the relevant part of the client log:
2014-09-26 21:58:10,015 do_expose_event(<gtk.gdk.Event at 046018F0: GDK_EXPOSE area=[56, 181, 970, 250]>) area=gtk.gdk.Rectangle(56, 181, 970, 250)
2014-09-26 21:58:10,030 processing packet draw
2014-09-26 21:58:10,030 process_draw 2455083 bytes for window 2 using rgb24 encoding with options={'rgb_format': 'RGB'}
2014-09-26 21:58:10,030 draw_region(0, 0, 1151, 711, rgb24, 2455083 bytes, 3453, {'rgb_format': 'RGB'}, [<function record_decode_time at 0x04703730>, <function after_draw_refresh at 0x047035F0>])
2014-09-26 21:58:10,030 record_decode_time(True) wid=2, rgb24: 1151x711, 0.0ms
2014-09-26 21:58:10,046 after_draw_refresh(True) 1151x711 at 0x0 encoding=rgb24, options={'rgb_format': 'RGB'}
2014-09-26 21:58:10,046 do_expose_event(<gtk.gdk.Event at 046018D8: GDK_EXPOSE area=[0, 0, 1151, 711]>) area=gtk.gdk.Rectangle(0, 0, 1151, 711)
2014-09-26 21:58:10,046 add_packet_to_queue(damage-sequence ...)
2014-09-26 21:58:10,078 internal error: error in network packet reading/parsing
Traceback (most recent call last):
  File "xpra\net\protocol.pyc", line 587, in _read_parse_thread_loop
  File "xpra\net\protocol.pyc", line 616, in do_read_parse_thread_loop
MemoryError
2014-09-26 21:58:10,078 connection lost: error in network packet reading/parsing
2014-09-26 21:58:10,078 close() closed=False
2014-09-26 21:58:10,078 terminate_queue_threads()
2014-09-26 21:58:10,078 Connection lost
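A quick arithmetic check on the client log above: the process_draw line reports 2455083 bytes for a 1151x711 rgb24 update, and the draw_region call lists 3453 right after the byte count, which looks like the rowstride. Both numbers are what raw, unpadded 24-bit pixel data would give. A minimal Python sketch of that check (the function name is made up for illustration and is not part of xpra):

```python
def rgb24_frame_bytes(width, height, bytes_per_pixel=3):
    """Return (rowstride, total size) in bytes for raw, unpadded pixel data."""
    rowstride = width * bytes_per_pixel  # bytes in one row of pixels
    return rowstride, rowstride * height


rowstride, size = rgb24_frame_bytes(1151, 711)
print(rowstride, size)  # prints: 3453 2455083
```

So each full-window update in this log carries about 2.4 MB of pixel data to be buffered on the client. This says nothing about how the data was compressed on the wire.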
Additionally, I have attached the output of "xpra info :DISPLAY" on the server process.
Output of "xpra info" from 0.14.4 server
How can I get xpra info from the proxy process?
Run xpra list as the user who owns the proxy instance (not the proxy master server) and you will see an entry you can connect to.
Traceback (most recent call last):
  File "xpra\net\protocol.pyc", line 587, in _read_parse_thread_loop
  File "xpra\net\protocol.pyc", line 616, in do_read_parse_thread_loop
MemoryError
That's odd, is your client under memory pressure? The only relevant link I found is this ticket: socket read() can cause MemoryError in Windows
Does it make any difference if you use --encodings=jpeg (to force jpeg only)?
xpra list does not show the proxy instance, only the xpra server. ps lists the forked proxy instance and the xpra server, but xpra list just displays the latter one as LIVE session at :1001.
I tried again, with the latest Windows client 0.15.0-r7639, using the default xpra.conf and the command-line parameters attach --debug=all --username=... --socket-dir=C:\temp\xpra --password-file=... --encoding=jpeg -z 0 --border=... --video-encoders=x264 tcp:... (still using the 0.14.4 server).
The system has 4 GB of RAM with nothing other than xpra running. I monitored memory consumption with the Windows task manager. Now I again get the "impl->hdc_count == 0" error from the beginning.
When I start the xpra client, it restores the two google-chrome windows which I immediately minimized. In that state, the xpra_cmd.exe process uses about 60 MB of memory. However, as soon as I unminimize both windows again, the memory consumption starts growing linearly. Then at some point, it crashes. From a few runs I guess the crash occurs somewhere between 600 MB and 1.2 GB, after about 1 or 2 minutes of waiting.
I already tried to remove the optional command line options one by one. Even with attach --username=... --password-file=... tcp:... and just one google-chrome window, memory keeps growing.
From my experience, memory grows if some rendering updates are transmitted. To clarify what I mean, a short example:
I started Firefox with AdblockPlus showing heise.de. Memory consumption is stable. Whenever I reduce the overlap from other windows, memory consumption jumps up a bit. When I disable AdblockPlus, some animated ad shows up on the right. If I don't overlap it, memory consumption grows linearly and quite fast (a few MB/s).
Without looking into the code I guess there is a severe memory leak in the xpra win32 client. At least according to task manager there was always more than 2 GB of free memory available. I don't know why it crashes far before the 4 GB boundary, maybe stack corruption?
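To put a number on "grows linearly" (for example when comparing encodings or client versions), one option is to note a few (seconds, MB) readings from the task manager and fit a straight line through them. The sketch below is a generic least-squares slope in Python, nothing xpra-specific; the function names and the sample readings are made up for illustration and are not measurements from this ticket.

```python
def growth_rate(samples):
    """Least-squares slope of (seconds, megabytes) samples, in MB per second."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must be taken at different times")
    cov = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    return cov / var_t


def seconds_until(limit_mb, current_mb, rate_mb_per_s):
    """Time left until memory reaches limit_mb at a constant growth rate, or None if not growing."""
    if rate_mb_per_s <= 0:
        return None
    return max(0.0, (limit_mb - current_mb) / rate_mb_per_s)


# Hypothetical readings: (seconds since unminimizing, MB shown for xpra_cmd.exe)
readings = [(0, 60), (10, 110), (20, 160), (30, 210)]
rate = growth_rate(readings)
print("growth: %.1f MB/s" % rate)
print("time to reach 1200 MB: %.0f s" % seconds_until(1200, readings[-1][1], rate))
```

Comparing the slope between runs gives a simple way to say whether one configuration leaks faster than another.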
xpra list does not show the proxy instance
It does, but as I mentioned before, you need to run xpra list as the same user as the proxy instance. Maybe root in your case?
I have added more information here: wiki/ProxyServer.
Now I again get the "impl->hdc_count == 0" error from the beginning.
"Now"? As in, did it stop happening?
memory keeps growing
Sounds like there is a memory leak (which we will fix), but I doubt this has anything to do with the hdc_count crash. I've moved this one to #696.
crashes far before the 4 GB boundary
Xpra is a 32-bit process, and not all 4GB are addressable, so it would be expected to crash before reaching 4GB with this memory leak.
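For reference, a generic way to check whether a given Python interpreter is a 32-bit build is to look at its native pointer size. The sketch below is plain Python and not taken from xpra; the function names are made up for illustration. Note that 4 GiB is only the theoretical limit of 32-bit pointers: on Windows, a 32-bit process gets 2 GB of user address space by default, and large allocations can fail earlier still once the address space is fragmented.

```python
import struct


def pointer_bits():
    """Width of a native pointer in the running interpreter, in bits."""
    return struct.calcsize("P") * 8


def address_space_gib(bits):
    """Theoretical size of a flat address space with pointers of the given width, in GiB."""
    return 2 ** bits / 2 ** 30


print("%d-bit interpreter" % pointer_bits())
print("32-bit pointers can address at most %.0f GiB" % address_space_gib(32))
```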
Can I close this ticket and #696?
It seems that only I can reproduce this issue. Maybe the best way to proceed is that I will try 0.15 trunk for both, client and server and see if I still can reproduce it then.
I don't know why I got "internal error: error in network packet reading/parsing"; actually, I can only reproduce the "impl->hdc_count == 0" error.
Concerning #696 I have no idea about which "more details" could help.
I plan to check with trunk next week. I suggest leaving the ticket open until then.
I reproduced the crash with the following setup:
Server:
Centos 7, up to date
Additional repo: www.xpra.org/dists/CentOS/7
xpra version: trunk (SVN, a few hours ago)
./setup.py build+install with no further options, only "export PKG_CONFIG_PATH=/usr/lib64/xpra/pkgconfig:/usr/share/pkgconfig" is needed
default xpra.conf
Started server processes:
xpra proxy :100 --socket-dir=/tmp --bind-tcp=1.2.3.4:555 --auth=file --password-file=/etc/xpra/xpra.auth --no-daemon
xpra start --debug=all :1234 --bind-tcp=127.0.0.1:31234 --no-daemon
Client:
Win7 x64, 4GB RAM
xpra version 0.15.0-r7928
client command line parameters: attach --username=xyz --password-file=PATH\TO\FILE tcp:1.2.3.4:5555
default xpra.conf
Scenario:
Start google-chrome
1. Open any page with lots of movement, I chose youtube.de this time
2. Watch memory consumption grow and the crash approach.
3. After the crash, just reconnect and continue with step 2.
The reason I previously was not able to provide the content of xpra info was the option --socket-dir=/tmp. I needed to specify the socket-dir also on list and info. Attached you will find the output of the proxy instance at the point in time the client is just crashing.
I tried multiple times: the memory consumption always seems to go up to about 1.8 GB and then it crashes.
If I use encoding=rgb for the client, I get a different error. For details I will attach screenshots of both scenarios.
I am afraid the error is again different. However, with encoding=rgb it clearly says "Memory Error". So I still guess the error messages are just many different ways of indicating something like stack overflow.
Let me know if I can provide any further information.
All this latest info seems to be related to the memory leak (which should have gone into #696) and not to the GTK assertion hdc_count==0 of this ticket, is that right? Or are you still also getting the hdc error?
Can you try with 0.14.10 just released today? (Client side, server probably does not matter much)
Does the memory leak progress at the same speed with all encodings?
Can I close this? And follow up the memory leak in #696?
Yes, that's fine.
this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/684