Xpra: Ticket #863: encode error with only video encodings enabled

I guess we probably want to support this somehow, even though it is going to be dreadfully slow. Originally reported in ticket:854#comment:23

Launch the client with:

xpra attach --encodings=h264 ...

If there are no non-video encodings, we need to send full window updates, every time.



Sat, 16 May 2015 05:41:01 GMT - Antoine Martin: owner changed

Fixed in r9386 + r9392 (trunk) - will backport.

Please close unless you can break it again somehow.


Mon, 18 May 2015 18:51:57 GMT - J. Max Mena:

Connected using Windows 8.1 and a Win32 0.15.0 beta build r9445 against a Fedora 21 trunk r9445 build:

Launching the server with xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --html=on --start-child=firefox --start-child=xterm

Traceback (most recent call last):
  File "xpra\client\ui_client_base.pyc", line 1975, in _draw_thread_loop
  File "xpra\client\ui_client_base.pyc", line 2021, in _do_draw
  File "xpra\client\client_window_base.pyc", line 423, in draw_region
  File "xpra\client\window_backing_base.pyc", line 473, in draw_region
  File "xpra\client\window_backing_base.pyc", line 264, in paint_rgb24
  File "xpra\client\window_backing_base.pyc", line 175, in process_delta
Exception: expected 5976 bytes for 83x24 with rowstride=249 but received 26 (34
compressed)
2015-05-18 11:23:42,654 internal error: read connection SocketConnection(('10.0.
11.124', 57327) - ('10.0.32.138', 2200)) reset: [Errno 10054] An existing connec
tion was forcibly closed by the remote host
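
The size mismatch in this traceback can be checked by hand. The sketch below is not xpra's `process_delta` code; `expected_rgb_size` and `check_rgb_size` are hypothetical helpers, and they assume the expected byte count is simply rowstride times height, which matches the numbers in the message (249 x 24 = 5976).

```python
def expected_rgb_size(height, rowstride):
    """Bytes needed for `height` rows of `rowstride` bytes each (hypothetical helper)."""
    return rowstride * height


def check_rgb_size(data, width, height, rowstride):
    # Illustration only: mimics the kind of length check the error message implies.
    expected = expected_rgb_size(height, rowstride)
    if len(data) != expected:
        raise ValueError(
            "expected %i bytes for %ix%i with rowstride=%i but received %i"
            % (expected, width, height, rowstride, len(data))
        )
    return data


# 83x24 at 3 bytes per pixel gives rowstride=249, as in the log above
print(expected_rgb_size(24, 249))  # 5976
```

A 26-byte payload is far too small for any of the window regions reported in this ticket, so the check fails regardless of the region size.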

Switching to a 0.15.0 r9445 build on the server and launching with the same parameters does not disconnect me, but the session appears to freeze after a second or so and I get the following tracebacks:

server side:

2015-05-18 11:36:28,543 error processing damage data: failed to get buffer from pixel object: <type 'memoryview'> (returned -1)
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xpra/server/source.py", line 1734, in encode_loop
    fn_and_args[0](*fn_and_args[1:])
  File "/usr/lib64/python2.7/site-packages/xpra/server/window_source.py", line 1187, in make_data_packet_cb
    packet = self.make_data_packet(damage_time, process_damage_time, wid, image, coding, sequence, options)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window_source.py", line 1524, in make_data_packet
    ret = encoder(coding, image, options)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window_source.py", line 1595, in webp_encode
    return webp_encode(coding, image, self.rgb_formats, self.supports_transparency, q, s, options)
  File "/usr/lib64/python2.7/site-packages/xpra/server/picture_encode.py", line 62, in webp_encode
    cdata = enc_webp.compress(image.get_pixels(), w, h, stride=stride/4, quality=quality, speed=speed, has_alpha=alpha)
  File "xpra/codecs/webp/encode.pyx", line 342, in xpra.codecs.webp.encode.compress (xpra/codecs/webp/encode.c:1839)
AssertionError: failed to get buffer from pixel object: <type 'memoryview'> (returned -1)

and client side:

2015-05-18 11:36:28,759 error processing draw packet
Traceback (most recent call last):
  File "xpra\client\ui_client_base.pyc", line 1975, in _draw_thread_loop
  File "xpra\client\ui_client_base.pyc", line 2021, in _do_draw
  File "xpra\client\client_window_base.pyc", line 423, in draw_region
  File "xpra\client\window_backing_base.pyc", line 473, in draw_region
  File "xpra\client\window_backing_base.pyc", line 264, in paint_rgb24
  File "xpra\client\window_backing_base.pyc", line 175, in process_delta
Exception: expected 7488 bytes for 117x16 with rowstride=468 but received 26 (34
 compressed)
2015-05-18 11:40:38,854 error processing damage data:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xpra/server/source.py", line 1734, in encode_loop
    fn_and_args[0](*fn_and_args[1:])
  File "/usr/lib64/python2.7/site-packages/xpra/server/window_source.py", line 1187, in make_data_packet_cb
    packet = self.make_data_packet(damage_time, process_damage_time, wid, image, coding, sequence, options)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window_source.py", line 1524, in make_data_packet
    ret = encoder(coding, image, options)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window_video_source.py", line 1261, in video_encode
    ret = self._video_encoder.compress_image(csc_image, quality, speed, options)
  File "xpra/codecs/enc_x264/encoder.pyx", line 520, in xpra.codecs.enc_x264.encoder.Encoder.compress_image (xpra/codecs/enc_x264/encoder.c:5861)
AssertionError

Mon, 18 May 2015 19:12:32 GMT - Antoine Martin:

0.15 should not be using memoryview by default, where did you get this build? What build command was used? As usual, having "xpra info" would help clarify things.

Is reconnecting necessary to get the first crash? I'll try to get a gdb backtrace tomorrow - assuming I can reproduce. Feel free to beat me to it.

The failure in the x264 encoder is hard to diagnose because 0.15.x doesn't have the debug code - I may have to backport it, unless you can reproduce with trunk? (you may need to build it with "--without-memoryview" to get the same behaviour as 0.15.x)


Mon, 18 May 2015 19:32:12 GMT - J. Max Mena:

The builds are from the trunk or 0.15.0 tagged repositories I have on one of my Fedora 21 test VMs.

I use the following command to build:

LDFLAGS=-Wl,-rpath=/usr/lib64/xpra PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/lib64/xpra/pkgconfig     ./setup.py install

Re-building with:

LDFLAGS=-Wl,-rpath=/usr/lib64/xpra PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/lib64/xpra/pkgconfig     ./setup.py install --without-memoryview

Switching VMs to another Fedora 21 machine using your latest Fedora 21 build from the beta repo works fine. It looks like it's an issue with my build environment.


Mon, 18 May 2015 19:45:55 GMT - Antoine Martin:

I'll put my money on webp issues. You can confirm this by enabling just webp, or by enabling everything but webp.

You need to make sure that you have the same webp at build time and at runtime. This is actually a known problem with webp, see #848.
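
The two test runs suggested above (only webp, then everything except webp) can be generated rather than typed. A minimal sketch, assuming the encoding names that appear in this ticket and the `--encodings=` option used in the commands here; `bisect_commands` and the default base command line are illustrative and not part of xpra.

```python
def bisect_commands(suspect, all_encodings,
                    base="xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --start-child=xterm"):
    """Return two command lines: one with only `suspect`, one with everything else."""
    only = "%s --encodings=%s" % (base, suspect)
    others = [e for e in all_encodings if e != suspect]
    without = "%s --encodings=%s" % (base, ",".join(others))
    return only, without


# Encoding names taken from this ticket; the real list depends on the build
ENCODINGS = ["h264", "vp8", "vp9", "png", "jpeg", "rgb", "webp"]

for cmd in bisect_commands("webp", ENCODINGS):
    print(cmd)
```

If the problem follows the single-encoding run and disappears in the other, the suspect encoding is implicated; the same helper works for `rgb`.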


Mon, 18 May 2015 21:13:15 GMT - J. Max Mena:

Relaunched with

xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --html=on --start-child=firefox --start-child=xterm --start-child=xterm --encodings=h264,png,jpeg,rgb
2015-05-18 14:10:17,647 error processing draw packet
Traceback (most recent call last):
  File "xpra\client\ui_client_base.pyc", line 1975, in _draw_thread_loop
  File "xpra\client\ui_client_base.pyc", line 2021, in _do_draw
  File "xpra\client\client_window_base.pyc", line 423, in draw_region
  File "xpra\client\window_backing_base.pyc", line 473, in draw_region
  File "xpra\client\window_backing_base.pyc", line 264, in paint_rgb24
  File "xpra\client\window_backing_base.pyc", line 175, in process_delta
Exception: expected 4964 bytes for 1241x1 with rowstride=4964 but received 26 (3
4 compressed)

Relaunching with

xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --html=on --start-child=firefox --start-child=xterm --start-child=xterm --encodings=h264,png,jpeg

Relaunching with

xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --html=on --start-child=firefox --start-child=xterm --start-child=xterm --encodings=h264,png,jpeg,webp

Interestingly, it looks like enabling rgb is what's causing the segfaults here.


EDIT: Changed wording for consistency. EDIT2: Added client output on crash. EDIT3: I am bad at copy-paste. Fixed.


Tue, 19 May 2015 08:43:51 GMT - Antoine Martin: priority changed

It looks like the client is still a win32 system of some sort? (the log output is wrapping at 80 characters) Have you tried connecting from the same Fedora machine? Does it make any difference? (probably needs to have mmap turned off to trigger the bug)

The first 2 command lines in comment:6 are identical. But you said "Relaunching with..", which seems to imply that it should be a different command? Why would it work better the second time? Was anything else changed? Do I need firefox to trigger it? Any page in particular?

(general advice: it is always best to trim down the command lines and remove things that aren't relevant to the bug, i.e. if html is not used, take it off; if sound or clipboard aren't relevant, turn them off; if you don't need two xterms, don't start two; if you need two during testing, test again afterwards without, etc.)

Assuming that the problem is with rgb, please try with "-z 0" and "-z 9" to see if it triggers it more easily. Please also provide the "-d encoding" server log, and the "-d paint,delta" client log around the time of the problem.

FWIW: I have tried many times, with different client OS, no crash whatsoever. I did hit this bug: #861, but that's a different issue. (and I tested before doing that fix) Are you sure that your build environment is configured properly? (all the dependencies like libwebp-xpra are installed, etc) Was there anything at all in the server log?

What makes you think that this has something to do with rgb? (the second one did not crash, and it had rgb enabled) Have you tried just with rgb? With h264 + rgb?

When you say it "crashes", is it the server or the client? Where is the crash message? (or does it just print this stacktrace and continue?) (xpra info is still missing.)

Raising priority again..


Wed, 20 May 2015 20:33:44 GMT - J. Max Mena:

Updated my server to trunk r9459 and re-built.

Using the following command to set up a server session:

xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --start-child=firefox --start-child=xterm --start-child=xterm

Also, I'm connecting from Windows 8.1 using the r9445 Beta Win32 build from http://xpra.org/beta. I'd use Fedora to connect but we don't have any hardware Fedora 19/20/21 machines yet... they're not cooperating with our cloning solution, but that's a problem for another time. In addition, I have my trusty old Cent6.4 machine that I can use as well.


The first 2 command lines in comment:6 are identical.


My bad, I copied and pasted wrong: the working server start didn't have RGB. I'll edit the comment to fix it.


For starters, I am pretty sure this issue has something to do with my build environment, and even then only when the server paints with RGB. If I specify the client to connect with encodings other than RGB, then I can use the session with no problems. If I use a server from your beta repository (same server operating system - Fedora 21, just a different VM) then RGB encoding works fine, even after connecting with the same client, with the same server and client start commands.

That being said, setting up a session with only RGB:

xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --start-child=firefox --start-child=xterm --start-child=xterm  --encodings=rgb

And connecting with:

Xpra_cmd.exe attach tcp:10.0.32.137:2200

Connects and the server does not seg-fault (or print any errors, actually); however, all my windows are black and I cannot interact with anything, and the client floods the CMD window with tracebacks before I disconnect:

2015-05-20 12:58:22,726 error processing draw packet
Traceback (most recent call last):
  File "xpra\client\ui_client_base.pyc", line 1975, in _draw_thread_loop
  File "xpra\client\ui_client_base.pyc", line 2021, in _do_draw
  File "xpra\client\client_window_base.pyc", line 423, in draw_region
  File "xpra\client\window_backing_base.pyc", line 473, in draw_region
  File "xpra\client\window_backing_base.pyc", line 264, in paint_rgb24
  File "xpra\client\window_backing_base.pyc", line 175, in process_delta
Exception: expected 630736 bytes for 499x316 with rowstride=1996 but received 26
 (34 compressed)
2015-05-20 12:58:22,726 invalid img data <type 'str'>: <memory at 0x7f3a466c0640
>
2015-05-20 12:58:22,726 draw error
Traceback (most recent call last):
  File "xpra\client\ui_client_base.pyc", line 2021, in _do_draw
  File "xpra\client\client_window_base.pyc", line 423, in draw_region
  File "xpra\client\window_backing_base.pyc", line 473, in draw_region
  File "xpra\client\window_backing_base.pyc", line 264, in paint_rgb24
  File "xpra\client\window_backing_base.pyc", line 175, in process_delta
Exception: expected 630736 bytes for 499x316 with rowstride=1996 but received 26
 (34 compressed)
2015-05-20 12:58:22,727 error processing draw packet
Traceback (most recent call last):
  File "xpra\client\ui_client_base.pyc", line 1975, in _draw_thread_loop
  File "xpra\client\ui_client_base.pyc", line 2021, in _do_draw
  File "xpra\client\client_window_base.pyc", line 423, in draw_region
  File "xpra\client\window_backing_base.pyc", line 473, in draw_region
  File "xpra\client\window_backing_base.pyc", line 264, in paint_rgb24
  File "xpra\client\window_backing_base.pyc", line 175, in process_delta
Exception: expected 630736 bytes for 499x316 with rowstride=1996 but received 26
 (34 compressed)
2015-05-20 12:58:22,732 server requested disconnect: client request
2015-05-20 12:58:22,776 Connection lost

Connecting with a CentOS 6.4 beta client (05/18 build date) gives me the same errors, for what it's worth. I'll try in different OSs if/when I get a chance.


I will also attach the logs you requested. I set up a session with

xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --html=on --start-child=firefox --start-child=xterm --start-child=xterm -d encoding > xpra863encoding.txt 2>&1

and connected from my Cent6.4 machine with:

xpra attach tcp:10.0.32.138:2200 -d paint,delta > xpra863deltapaint.txt 2>&1

Connecting, and clicking on the close button on Firefox, causing it to load the homepage causes a seg-fault on the server.


Also, using "-z 0" and "-z 9" doesn't seem to have any effect. I'm not sure if I'm using them correctly...I'm just starting the server with

xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --html=on --start-child=firefox --start-child=xterm --start-child=xterm --encodings=rgb -z 9

For what it's worth: selecting just h264 as an encoder works; it's just using RGB with other encoders that's causing issues... and even then only in my build environment.


Wed, 20 May 2015 20:38:56 GMT - J. Max Mena: attachment set

Connecting and loading the firefox home page (the fedora start page) before the server seg faults


Wed, 20 May 2015 20:40:21 GMT - J. Max Mena: attachment set

Same steps as the other logs, connecting and loading the fedora firefox start page before a server seg fault. This time from the Cent6.4 client's perspective.


Thu, 21 May 2015 02:47:34 GMT - Antoine Martin:

xpra info is still missing.


Thu, 21 May 2015 05:22:29 GMT - Antoine Martin: owner, status changed

If I use a server from your beta repository...


Then we need to figure out what is different between those two build environments and the packages that they produce. Having xpra info will help, also ls -l /usr/lib64/xpra/pkgconfig/, how you build and install the RPM, etc. And maybe PM me one of those problematic packages.


I'd use Fedora to connect but we don't have any hardware Fedora 19/20/21 machines yet.


I'm confused, I thought you had a Fedora VM you used as server? comment:4 says "I have on one of my Fedora 21 test VMs".


Exception: expected 630736 bytes for 499x316 with rowstride=1996 but received 26 (34 compressed)
2015-05-20 12:58:22,726 invalid img data <type 'str'>: <memory at 0x7f3a466c0640>


It is the memoryview stuff that is causing this. It is only enabled by default in trunk. I'll look into it.
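
One possible reading of these two log lines, offered as an observation rather than a confirmed diagnosis: the 26 bytes received match the length of the text `<memory at 0x7f3a466c0640>`, which is what a memoryview looks like when it is converted to a string instead of being read as pixel data. The sketch below only demonstrates that arithmetic in plain Python; it does not use any xpra code, and the 4 bytes per pixel figure is inferred from rowstride=1996 for a 499 pixel wide region.

```python
# 499x316 region at an assumed 4 bytes per pixel: rowstride = 499 * 4 = 1996
pixels = bytearray(1996 * 316)
view = memoryview(pixels)

# The real pixel data has the size the client expected
print(len(view.tobytes()))  # 630736

# Text copied from the client log above: the description of a memoryview,
# not its contents
placeholder = "<memory at 0x7f3a466c0640>"
print(len(placeholder))  # 26

# On CPython, str() of a memoryview yields this kind of description
print(str(view).startswith("<memory at 0x"))  # True
```

If that reading is right, the fix belongs wherever the pixel object is turned into bytes before being packed, which is consistent with the problem only showing up in builds with memoryview enabled.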

Although I am glad to see trunk getting some testing (any version getting some testing), the focus should be on 0.15 at this point. (I know that I did ask you to run trunk to get a log message previously).


I will also attach the logs you requested.


Thanks, that's very useful. I already fixed a bug I found in there: #865. Any errors in the logs like this one should always be investigated and reported as bugs, whether there is visual corruption on screen or not.


Connecting, and clicking on the close button on Firefox, causing it to load the homepage causes a seg-fault on the server.


Which close button? Close tab?

allows me to interact with the Xterms for a similar amount of time before the server seg-faults again


How do you make it crash? Close? Resize? It could well be related to #865. In which case the crash should be gone with latest trunk.


Thu, 21 May 2015 16:26:52 GMT - J. Max Mena: attachment set

requested Xpra Info.


Thu, 21 May 2015 16:56:11 GMT - J. Max Mena:

I uploaded the Xpra info.

I started the server with

xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --html=on --start-child=firefox --start-child=xterm --start-child=xterm --encodings=rgb

(I use only RGB because then I can keep the server running long enough to get the requested xpra info.)

and then connected from Windows 8.1 with:

Xpra_cmd.exe attach tcp:10.0.32.138:2200

I should specify our testing environment:

We have a number of Fedora 20/21 VMs (2 Fedora 20 and as many Fedora 21 VMs as we need) but they aren't connected to any display; they just sit on a KVM server somewhere in our server room, so we can't use them as clients at all.

That being said, we do have a number of hardware machines that we can use as clients (or servers) to test with, but they aren't playing well with Fedora at the moment, and I haven't had the time to sit down and really investigate the issue we have with them. Other than those machines (including a couple of Mac Minis... one with Intel graphics, one with nvidia), I have my laptop (a Macbook something that I run Windows on) and a low power Cent6.4 machine. Also whatever machines Alex has access to.


Firefox close button


When Firefox detects that it has recovered from a crash, or can't open a new tab, it gives you two options. One is to restore your previous tabs, and the other is a button marked "close" that just opens a new session. That's the button that I was referring to... all it does is launch a new session.


ls -l /usr/lib64/xpra/pkgconfig/ :

total 36
-rw-r--r--. 1 root root 405 Apr 11 07:22 libavcodec.pc
-rw-r--r--. 1 root root 422 Apr 11 07:22 libavfilter.pc
-rw-r--r--. 1 root root 443 Apr 11 07:22 libavformat.pc
-rw-r--r--. 1 root root 270 Apr 11 07:22 libavutil.pc
-rw-r--r--. 1 root root 299 Apr 11 07:22 libpostproc.pc
-rw-r--r--. 1 root root 307 Apr 11 07:22 libswresample.pc
-rw-r--r--. 1 root root 300 Apr 11 07:22 libswscale.pc
-rw-r--r--. 1 root root 311 Apr  4 09:32 vpx.pc
-rw-r--r--. 1 root root 255 Jan 18 22:49 x264.pc

Finally, I will try to recompile without memoryview and leave a comment in a bit.


Fri, 22 May 2015 13:21:58 GMT - Antoine Martin: owner, status changed

they aren't connected to any display; they just sit on a KVM server somewhere in our server room, so we can't use them as clients at all.


That's not true: you could use an xpra session to launch another xpra client connecting to another session. (or use VNC or whatever)



I am in serious need of a recap here. There is more than one issue I think, and so many versions and combinations that it is making my head spin.

And whether this affects all builds or just yours / mine.

Assuming that some of these bugs are still present (and assuming that the option is relevant for the version tested), please try:

If there are remaining issues, maybe we should split them into new tickets to clarify things. The original ticket description is about using --encodings=h264, which works fine for me, not about rgb. The most important thing is to check that 0.15 runs OK; we can worry about 0.16 later.


Wed, 27 May 2015 22:26:47 GMT - J. Max Mena:

Okay, recap time:


Firstly, --encodings=h264 is working flawlessly, so as far as I can tell, everything within the scope of this ticket has been fixed (unless the encoding issues I'm seeing are directly caused by the fix). Then again, it's your Trac, so I'll defer to you if you'd prefer to spin off other tickets.


Using xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --start-child=xterm and launching glxgears, I've tested every permutation I can think of using Fedora 21 as a server and Win8.1 and Cent6.4 as clients:

The only time I'm getting encoding issues is with my 0.15.0 branch server built from source. In all other instances it works fine. This includes trunk server built from source and trunk client built from source, which are working fine.

In addition, I no longer see "invalid img data" in either the 0.15.X branch or trunk.

Just for clarity:


Switching compressors seems to have no noticeable effect.


Building --with-memoryview and --without-memoryview has no noticeable effect in the 0.15.X branch or trunk.


Thu, 28 May 2015 12:47:03 GMT - Antoine Martin:

OK, sounds good. The only thing I can think of is that you're hitting a compilation bug, maybe related to this I found in your xpra info:

server.build.cython=0.22.beta0

Can you try updating to 0.22 final to see if that helps?
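
Spotting a pre-release Cython in saved `xpra info` output can be automated. A minimal sketch, assuming the output consists of `key=value` lines like the one quoted above; `parse_info` and `is_prerelease` are hypothetical helpers, and the pre-release markers they look for ("alpha", "beta", "rc") are a guess rather than an official versioning rule.

```python
import re


def parse_info(text):
    """Parse key=value lines (as in the xpra info excerpt above) into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            info[key.strip()] = value.strip()
    return info


def is_prerelease(version):
    # Assumption: pre-release versions carry one of these markers
    return re.search(r"alpha|beta|rc", version) is not None


sample = "server.build.cython=0.22.beta0"
version = parse_info(sample).get("server.build.cython", "")
print(version, is_prerelease(version))
```

Comparing the parsed values from a working build and a broken build side by side is a quick way to find which build-time components differ.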

If not, maybe you can PM me a download link of the compressed virtual machine image that you use so that I can run it here?


Thu, 28 May 2015 18:58:48 GMT - J. Max Mena:

Built Cython from their latest release http://cython.org/#download in Fedora 21 and Cent6.4, and I'm still getting assert errors, and glxgears totally stops painting after a second or two.


For what it's worth, using other encodings is fine (vp9, vp8, jpeg, webp, png, h264), so I'm only getting these errors when I have rgb enabled.


I'll bother Smo to see if we can get you a compressed image of the machine.


Thu, 28 May 2015 19:51:44 GMT - J. Max Mena:

Update:



Building on the new VM with trunk or 0.15.X works fine. Looks like the issue is confined to that specific machine. If you still want the image to the broken machine let me know.


Fri, 29 May 2015 03:10:03 GMT - Antoine Martin:

If you still want the image to the broken machine let me know.


I think it would be useful to get to the bottom of this issue, so that we can prevent it in the future. And maybe add a sanity test for the rgb encoder.


Thu, 18 Jun 2015 18:28:37 GMT - Antoine Martin: owner changed

@smo: can you help?


Thu, 17 Sep 2015 17:49:56 GMT - Smo: status changed; resolution set

I have removed the image from our system and compressed it. I can upload it if you like, but I'm closing this ticket for now.


Sat, 23 Jan 2021 05:08:12 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/863