Xpra: Ticket #1056: server encoding queue errors: AssertionError in video_encode

Running yet another overnight video playback test (0.16.0 r11392 win32 client against a 0.16.0 r11366 Fedora 21 server), I got some encode queue errors about three hours in.

2015-12-15 20:05:15,480 error processing encode queue:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xpra/server/window/window_source.py", line 1336, in encode_from_queue
    self.make_data_packet_cb(*item)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window/window_source.py", line 1365, in make_data_packet_cb
    packet = self.make_data_packet(damage_time, process_damage_time, wid, image, coding, sequence, options, flush)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window/window_source.py", line 1708, in make_data_packet
    ret = encoder(coding, image, options)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window/window_video_source.py", line 1298, in video_encode
    assert ve
AssertionError
2015-12-15 20:56:13,910 error processing encode queue:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xpra/server/window/window_source.py", line 1336, in encode_from_queue
    self.make_data_packet_cb(*item)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window/window_source.py", line 1365, in make_data_packet_cb
    packet = self.make_data_packet(damage_time, process_damage_time, wid, image, coding, sequence, options, flush)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window/window_source.py", line 1708, in make_data_packet
    ret = encoder(coding, image, options)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window/window_video_source.py", line 1298, in video_encode
    assert ve
AssertionError
2015-12-15 21:15:03,622 Warning: client decoding error: 'NoneType' object does not support item assignment
2015-12-15 21:15:03,696 Warning: client decoding error: video decoder avcodec failed to decode 87829 bytes of h264 data
2015-12-16 00:08:44,201 error processing encode queue:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xpra/server/window/window_source.py", line 1336, in encode_from_queue
    self.make_data_packet_cb(*item)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window/window_source.py", line 1365, in make_data_packet_cb
    packet = self.make_data_packet(damage_time, process_damage_time, wid, image, coding, sequence, options, flush)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window/window_source.py", line 1708, in make_data_packet
    ret = encoder(coding, image, options)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window/window_video_source.py", line 1298, in video_encode
    assert ve
AssertionError

I'll attach xpra info as well (grabbed many hours later from the still-active session), just in case.



Wed, 16 Dec 2015 19:07:27 GMT - alas: attachment set

xpra info


Thu, 17 Dec 2015 02:35:39 GMT - Antoine Martin: priority, status changed

How lucky! (or unlucky, depending on how you look at it). This is a very small race condition: we check that the video encoder instance exists after check_pipeline returns True, and check_pipeline only returns True once it has verified that the encoder and csc instances are correct for the given input, creating new ones if necessary. So check_pipeline_score (which can run from the timer thread to re-evaluate the best video pipeline) must be clearing the video encoder right between the call to check_pipeline and the next instruction!
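
To make the interleaving concrete, here is a minimal sketch of the check-then-use pattern. It is not the xpra code: the `PipelineSketch` class, the `_video_encoder` attribute and the `between` hook are assumptions for illustration, and the hook stands in for the timer thread being scheduled at exactly the wrong moment so that the race can be reproduced deterministically.

```python
class PipelineSketch:
    """Hypothetical stand-in for the window video source."""

    def __init__(self):
        self._video_encoder = None

    def check_pipeline(self):
        # Verify the encoder is usable, creating a new one if necessary.
        if self._video_encoder is None:
            self._video_encoder = object()
        return True

    def check_pipeline_score(self):
        # Timer thread: re-evaluates the pipeline and, in this sketch,
        # always decides to drop the reference to the encoder.
        self._video_encoder = None


def video_encode(pipeline, between=lambda: None):
    if not pipeline.check_pipeline():
        return "fallback"
    # The window for the race: nothing prevents another thread from
    # running here. `between` simulates that thread.
    between()
    ve = pipeline._video_encoder
    assert ve  # fails if the reference was cleared in the window
    return "encoded"


if __name__ == "__main__":
    p = PipelineSketch()
    print(video_encode(p))  # normal case
    try:
        video_encode(p, between=p.check_pipeline_score)
    except AssertionError:
        print("AssertionError, as in the traceback")
```

With real threads the window is only a few bytecode instructions wide, which would explain why it took hours of video playback to hit it.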

So r11412 works around that by using the fallback code for this rare case. I will backport it, then see if I can come up with a more proper fix (though this one isn't too bad apart from the warning it prints: we only clear the reference to the instance, and the actual encoder cleanup is always done in the encode thread, so there is no risk of memory corruption or worse).
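
A rough sketch of what this kind of workaround can look like, under stated assumptions: the names (`VideoSourceSketch`, `fallback_encode`, `clear_encoder`, the `between` hook) are hypothetical and this is not the content of r11412. The idea is to read the encoder reference once into a local variable and take the fallback path, with a warning, if it has been cleared in the meantime, instead of asserting.

```python
import logging

log = logging.getLogger("sketch")


class VideoSourceSketch:
    """Hypothetical stand-in, not the real window video source."""

    def __init__(self):
        self._video_encoder = None

    def check_pipeline(self):
        # Create the encoder if we don't have a suitable one.
        if self._video_encoder is None:
            self._video_encoder = "encoder-instance"
        return True

    def clear_encoder(self):
        # What the timer thread does in the race: only the reference is
        # dropped, the actual cleanup is left to the encode thread.
        self._video_encoder = None

    def fallback_encode(self, image):
        # Placeholder for a non-video encoding of the same image.
        return ("fallback", image)

    def video_encode(self, image, between=lambda: None):
        if not self.check_pipeline():
            return self.fallback_encode(image)
        between()  # simulates the timer thread running in the window
        ve = self._video_encoder  # read the attribute exactly once
        if ve is None:
            log.warning("video encoder was cleared, using fallback")
            return self.fallback_encode(image)
        return ("video", image)


if __name__ == "__main__":
    vs = VideoSourceSketch()
    print(vs.video_encode("frame-1"))
    print(vs.video_encode("frame-2", between=vs.clear_encoder))
```

The frame that loses the race is still sent, just with the fallback encoding, and the next call to `check_pipeline` creates a fresh encoder.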


Tue, 22 Dec 2015 13:43:48 GMT - Antoine Martin: status changed; resolution set

I am closing this as "fixed" because all the more "proper" fixes for this small race condition are just worse (i.e. adding locking). We'll just have to live with the odd warning.


Tue, 22 Dec 2015 13:46:42 GMT - Antoine Martin: summary changed

(more descriptive bug title)


Sat, 23 Jan 2021 05:13:42 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/1056