Testing 0.15 through the proxy server, got this error:
error processing draw packet Traceback (most recent call last): File "/usr/lib64/python2.7/site-packages/xpra/client/ui_client_base.py", line 1975, in _draw_thread_loop self._do_draw(packet) File "/usr/lib64/python2.7/site-packages/xpra/client/ui_client_base.py", line 2021, in _do_draw window.draw_region(x, y, width, height, coding, data, rowstride, packet_sequence, options, [record_decode_time]) File "/usr/lib64/python2.7/site-packages/xpra/client/client_window_base.py", line 423, in draw_region self._backing.draw_region(x, y, width, height, coding, img_data, rowstride, options, callbacks) File "/usr/lib64/python2.7/site-packages/xpra/client/window_backing_base.py", line 473, in draw_region self.paint_rgb24(img_data, x, y, width, height, rowstride, options, callbacks) File "/usr/lib64/python2.7/site-packages/xpra/client/window_backing_base.py", line 264, in paint_rgb24 rgb24_data = self.process_delta(raw_data, width, height, rowstride, options) File "/usr/lib64/python2.7/site-packages/xpra/client/window_backing_base.py", line 186, in process_delta rgb_data = xor_str(img_data, ldata) File "xpra/codecs/xor/cyxor.pyx", line 19, in xpra.codecs.xor.cyxor.xor_str (xpra/codecs/xor/cyxor.c:736) assert len(buf)==len(xor), "cyxor cannot xor strings of different lengths (%s:%s vs %s:%s)" % (type(buf), len(buf), type(xor), len(xor)) AssertionError: cyxor cannot xor strings of different lengths (<type 'str'>:38192 vs <type 'str'>:28644)
Which is what you would expect if you were xoring RGBX with RGB (25% bigger). I don't see how that is possible because we verify the pixel format AND the size of the buffer before we xor on the server side...
Although I cannot reproduce this right now, I suspect that I hit this issue whilst testing through the proxy server, which does things with "draw" packets. If I cannot find the bug, we can always disable delta buckets (set it to 0) when going through the proxy.
Managed to reproduce, at least something similar... with:
xpra start :10 --bind-tcp=0.0.0.0:10000 --start-child=xterm
XPRA_PROXY_PASSTHROUGH=1 xpra proxy :20 --bind-tcp=0.0.0.0:10001 --auth=allow -d encoding
xpra attach tcp:192.168.42.100:10001 --no-mmap --password-file=./password.txt
XPRA_PROXY_PASSTHROUGHto make it easier to trigger and debug the passthrough code. Which normally only fires when the video context creation fails and PIL is missing... (I believe I was hitting the first condition because of r9413, not sure about the second!?)
The fix for this bug is in r9435, which also makes the code more generic: we no longer assume the pixels format is BGRX or RGBX format (with X aka alpha at the end). Backported to v0.14.x and v0.15.x in r9438.
I am keeping this ticket open because I am not convinced that the "25% bigger" in the ticket description is necessarily the same bug. In any case, we should also not assume that "rgb32" is available in all clients. But since I can no longer trigger any proxy encoding bugs, I am also lowering the priority.
I have hit this again, without the proxy... raising.
ensures we never mix encodings with delta - which could make us mix pixel formats on the receiving end
Maybe r9238 is wrong because maybe we do care what encoding is used. Different encodings use different decoders, and some might give us RGB, others RGBX... for the same input pixel format. Which could explain the differing sizes. So r9448 makes the checks more strict. (I couldn't get it to trigger during testing, but we should not be making assumptions like these)
But that's not all, setting
XPRA_MAX_DELTA_HITS low enough does trigger the problem.
The fix is obvious and embarrassing: see r9449.
Backports in r9450.
Seen the bug again.
Stacktraces always seem to come in pairs. Could be related to the window edges?
More fixes for multi delta and the expiry code, see #866.
Those should be different from the "25% bigger" bug reported here though...
If it does occur again, we can also enable
XPRA_INTEGRITY_HASH=1 as per ticket:866#comment:7 (maybe the debug output will help)
Maybe only occurring with opengl=on and rgb24 as per ticket:891#comment:10. If that's true, I have no idea why.
Found the problem, we sometimes call
rgb_reformat to save space (from say
RGB - which is 25% smaller!), and the client may not need to convert it back: with opengl we can just upload
RGB to the gpu.
But for xoring, we need the same format as the original...
This is not new to 0.15, but it just makes it easier to hit.
Fix on its way, but if it's too big the backport may more brutal: maybe just disable delta altogether!
Now I know:
RGB- by the time the second window update comes along, we have the full list, and we may send the pixels in a different format
Fixed in r9970: we just don't bother with delta if the pixel format is going to require reformatting as it's too difficult to guarantee that the client will be able to reconstruct the same buffer format at the other end in that case. (we often drop the unused 'X' part for example). Backports to v0.14.x and v0.15.x in r9971.
I still want to investigate why those errors didn't trigger a paint refresh. All paint errors should at worst cause a log message and slower repaint, not persistent visual corruption.
r9978 ensures we refresh the whole window if we ever get client-side decoding errors (previously this only worked for video modes) - this may be worth applying to 0.15 at some point.
I think this ticket is done, feel free to re-open if you ever hit it again.
Note: as of r9976, trunk is likely to fail with a more precise error message which will look like this:
delta region uses XXXX format, was expecting XXXX (where the XXXXs are usually
this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/861