#861 closed defect (fixed)
xor error with differing pixel format
Reported by: | Antoine Martin | Owned by: | Antoine Martin |
---|---|---|---|
Priority: | blocker | Milestone: | 0.16 |
Component: | encodings | Version: | 0.15.x |
Keywords: | Cc: | onlyjob@… |
Description
Testing 0.15 through the proxy server, got this error:
error processing draw packet Traceback (most recent call last): File "/usr/lib64/python2.7/site-packages/xpra/client/ui_client_base.py", line 1975, in _draw_thread_loop self._do_draw(packet) File "/usr/lib64/python2.7/site-packages/xpra/client/ui_client_base.py", line 2021, in _do_draw window.draw_region(x, y, width, height, coding, data, rowstride, packet_sequence, options, [record_decode_time]) File "/usr/lib64/python2.7/site-packages/xpra/client/client_window_base.py", line 423, in draw_region self._backing.draw_region(x, y, width, height, coding, img_data, rowstride, options, callbacks) File "/usr/lib64/python2.7/site-packages/xpra/client/window_backing_base.py", line 473, in draw_region self.paint_rgb24(img_data, x, y, width, height, rowstride, options, callbacks) File "/usr/lib64/python2.7/site-packages/xpra/client/window_backing_base.py", line 264, in paint_rgb24 rgb24_data = self.process_delta(raw_data, width, height, rowstride, options) File "/usr/lib64/python2.7/site-packages/xpra/client/window_backing_base.py", line 186, in process_delta rgb_data = xor_str(img_data, ldata) File "xpra/codecs/xor/cyxor.pyx", line 19, in xpra.codecs.xor.cyxor.xor_str (xpra/codecs/xor/cyxor.c:736) assert len(buf)==len(xor), "cyxor cannot xor strings of different lengths (%s:%s vs %s:%s)" % (type(buf), len(buf), type(xor), len(xor)) AssertionError: cyxor cannot xor strings of different lengths (<type 'str'>:38192 vs <type 'str'>:28644)
Which is what you would expect if you were xoring RGBX with RGB (25% bigger).
I don't see how that is possible because we verify the pixel format AND the size of the buffer before we xor on the server side...
Attachments (1)
Change History (14)
comment:1 Changed 6 years ago by
Status: | new → assigned |
---|
comment:2 Changed 6 years ago by
Priority: | critical → minor |
---|
Managed to reproduce, at least something similar...
with:
- server started with:
xpra start :10 --bind-tcp=0.0.0.0:10000 --start-child=xterm
- proxy server started with:
XPRA_PROXY_PASSTHROUGH=1 xpra proxy :20 --bind-tcp=0.0.0.0:10001 --auth=allow -d encoding
- client started with:
xpra attach tcp:192.168.42.100:10001 --no-mmap --password-file=./password.txt
Notes:
- r9434 makes it easier to debug encoding related issues with the proxy server (adds
-d encoding
support) - r9399 adds
XPRA_PROXY_PASSTHROUGH
to make it easier to trigger and debug the passthrough code. Which normally only fires when the video context creation fails and PIL is missing... (I believe I was hitting the first condition because of r9413, not sure about the second!?)
The fix for this bug is in r9435, which also makes the code more generic: we no longer assume the pixels format is BGRX or RGBX format (with X aka alpha at the end).
Backported to v0.14.x and v0.15.x in r9438.
I am keeping this ticket open because I am not convinced that the "25% bigger" in the ticket description is necessarily the same bug.
In any case, we should also not assume that "rgb32" is available in all clients.
But since I can no longer trigger any proxy encoding bugs, I am also lowering the priority.
comment:3 Changed 6 years ago by
Priority: | minor → blocker |
---|
I have hit this again, without the proxy... raising.
Changed 6 years ago by
Attachment: | strict-delta-coding.patch added |
---|
ensures we never mix encodings with delta - which could make us mix pixel formats on the receiving end
comment:4 Changed 6 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Maybe r9238 is wrong because maybe we do care what encoding is used. Different encodings use different decoders, and some might give us RGB, others RGBX... for the same input pixel format. Which could explain the differing sizes.
So r9448 makes the checks more strict. (I couldn't get it to trigger during testing, but we should not be making assumptions like these)
But that's not all, setting XPRA_MAX_DELTA_HITS
low enough does trigger the problem.
The fix is obvious and embarrassing: see r9449.
Backports in r9450.
comment:5 Changed 6 years ago by
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Seen the bug again.
Stacktraces always seem to come in pairs. Could be related to the window edges?
comment:6 Changed 6 years ago by
More fixes for multi delta and the expiry code, see #866.
Those should be different from the "25% bigger" bug reported here though...
If it does occur again, we can also enable XPRA_INTEGRITY_HASH=1
as per ticket:866#comment:7 (maybe the debug output will help)
comment:7 Changed 6 years ago by
Maybe only occurring with opengl=on and rgb24 as per ticket:891#comment:10. If that's true, I have no idea why.
comment:8 Changed 6 years ago by
Cc: | onlyjob@… added |
---|
comment:9 Changed 6 years ago by
Found the problem, we sometimes call rgb_reformat
to save space (from say BGRX
to RGB
- which is 25% smaller!), and the client may not need to convert it back: with opengl we can just upload RGB
to the gpu.
But for xoring, we need the same format as the original...
This is not new to 0.15, but it just makes it easier to hit.
Fix on its way, but if it's too big the backport may more brutal: maybe just disable delta altogether!
comment:10 Changed 6 years ago by
Now I know:
- why this only affects OR windows like menus: we send the pixels for those windows before the client maps the window, and therefore before we know the full list of pixel modes which are going to be available - and so we use conservative values.. like
RGB
- by the time the second window update comes along, we have the full list, and we may send the pixels in a different format - this only affects opengl=on, because the non opengl backends don't support
BGRX
directly
comment:11 Changed 6 years ago by
Fixed in r9970: we just don't bother with delta if the pixel format is going to require reformatting as it's too difficult to guarantee that the client will be able to reconstruct the same buffer format at the other end in that case. (we often drop the unused 'X' part for example).
Backports to v0.14.x and v0.15.x in r9971.
I still want to investigate why those errors didn't trigger a paint refresh. All paint errors should at worst cause a log message and slower repaint, not persistent visual corruption.
comment:12 Changed 6 years ago by
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
r9978 ensures we refresh the whole window if we ever get client-side decoding errors (previously this only worked for video modes) - this may be worth applying to 0.15 at some point.
I think this ticket is done, feel free to re-open if you ever hit it again.
Note: as of r9976, trunk is likely to fail with a more precise error message which will look like this: delta region uses XXXX format, was expecting XXXX
(where the XXXXs are usually BGRX
and RGB
)
comment:13 Changed 2 days ago by
this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/861
Although I cannot reproduce this right now, I suspect that I hit this issue whilst testing through the proxy server, which does things with "draw" packets.
If I cannot find the bug, we can always disable delta buckets (set it to 0) when going through the proxy.