Opened 13 days ago

Last modified 13 days ago

#1633 assigned defect

protocol queue deadlock on close

Reported by: Antoine Martin Owned by: Antoine Martin
Priority: major Milestone: 2.2
Component: network Version: trunk
Keywords: Cc:

Description

Was easily triggered with the RFB adaptor (#1620) and an ultravnc client.
Shows up as this gdb backtrace:

Traceback (most recent call first):
  Waiting for the GIL
  File "/usr/lib64/python2.7/threading.py", line 340, in wait
    waiter.acquire()
  File "/usr/lib64/python2.7/Queue.py", line 126, in put
    self.not_full.wait()
  File "/usr/lib64/python2.7/site-packages/xpra/server/rfb/rfb_protocol.py", line 242, in raw_write
    self._write_queue.put(contents)
  File "/usr/lib64/python2.7/site-packages/xpra/server/rfb/rfb_protocol.py", line 233, in send
    self.raw_write(packet)
  File "/usr/lib64/python2.7/site-packages/xpra/server/rfb/rfb_source.py", line 88, in send
    p.send(msg)
  File "/usr/lib64/python2.7/site-packages/xpra/server/rfb/rfb_source.py", line 74, in damage
    self.send(pixels)
  File "/usr/lib64/python2.7/site-packages/xpra/server/server_base.py", line 3054, in _damage
    ss.damage(wid, window, x, y, width, height, options)
  File "/usr/lib64/python2.7/site-packages/xpra/server/rfb/rfb_server.py", line 152, in _process_rfb_FramebufferUpdateRequest
    self._damage(model, x, y, w, h)
  File "/usr/lib64/python2.7/site-packages/xpra/gtk_common/gtk_util.py", line 423, in gtk_main
    gtk.main()
  File "/usr/lib64/python2.7/site-packages/xpra/server/gtk_server_base.py", line 64, in do_run
    gtk_main()
  File "/usr/lib64/python2.7/site-packages/xpra/server/server_core.py", line 479, in run
    self.do_run()
  File "/usr/lib64/python2.7/site-packages/xpra/server/server_base.py", line 881, in run
    return ServerCore.run(self)
  File "/usr/lib64/python2.7/site-packages/xpra/scripts/server.py", line 1000, in run_server
    e = app.run()
  File "/usr/lib64/python2.7/site-packages/xpra/scripts/main.py", line 1469, in run_mode
    return run_server(error_cb, options, mode, script_file, args, current_display)
  File "/usr/lib64/python2.7/site-packages/xpra/scripts/main.py", line 177, in main
    return run_mode(script_file, err, options, args, mode, defaults)
  File "/usr/bin/xpra", line 15, in <module>
    sys.exit(main(sys.argv[0], sys.argv))

The main thread is blocked in put(), waiting for room in the write queue before it can add a packet. But by then the queue has already been flushed, the None packet has been put in it, and the IO thread has terminated, leaving the None packet in there. Nothing will ever empty the queue again, so this thread will wait forever. And since the RFB code is called from the main thread, it is even worse here: the whole server hangs.
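The sequence can be reproduced outside of xpra with a bounded queue and no consumer. This is a minimal sketch and not the rfb_protocol.py code: it is written for Python 3 (where the module is named queue rather than Queue), the names reproduce_blocked_put and write_thread_loop are made up for illustration, the close ordering is simplified, and the queue size of 1 is an assumption based on the fix described in comment:1.

```python
import queue
import threading


def reproduce_blocked_put(timeout=0.2):
    """Return True if a put() issued after close would have blocked."""
    write_queue = queue.Queue(1)      # assumed: room for a single packet
    closed = threading.Event()

    def write_thread_loop():
        # Hypothetical IO thread: stops as soon as the connection is closed.
        while not closed.is_set():
            try:
                write_queue.get(timeout=0.05)
            except queue.Empty:
                pass

    io_thread = threading.Thread(target=write_thread_loop)
    io_thread.start()

    # Simplified close sequence: the IO thread terminates, the queue is
    # flushed, and the None packet is left in it with nobody to read it.
    closed.set()
    io_thread.join()
    while not write_queue.empty():
        write_queue.get_nowait()
    write_queue.put(None)

    # A late sender (raw_write in the backtrace) now finds the queue full.
    # The real code calls put() without a timeout and so never returns;
    # the timeout here only lets the sketch terminate.
    try:
        write_queue.put(b"pixels", timeout=timeout)
    except queue.Full:
        return True
    return False
```

Calling reproduce_blocked_put() returns True: the put() gives up after the timeout because nothing will ever consume the None packet.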

Change History (1)

comment:1 Changed 13 days ago by Antoine Martin

Status: changed from new to assigned

r16796 takes the lazy approach of allowing 2 items in the write queue.

I am keeping this ticket open because:

  • a cleaner solution would be better
  • it may be possible to trigger a similar deadlock using the regular protocol class: if not now, then later, since this behaviour is not obvious
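
One possible shape for a cleaner solution is sketched below. It is not what r16796 does and not an existing xpra class: ClosableWriteQueue, its methods and the poll interval are all hypothetical. The idea is to replace the None packet with a closed flag, and to have senders wait with a timeout so they can notice the flag instead of blocking forever. Written for Python 3.

```python
import queue
import threading


class ClosableWriteQueue:
    """Hypothetical bounded write queue that can be closed safely."""

    def __init__(self, maxsize=1):
        self._queue = queue.Queue(maxsize)
        self._closed = threading.Event()

    def put(self, item, poll=0.1):
        # Returns True once the item is queued, False if the queue is closed.
        # Waiting in short slices lets a blocked sender notice close().
        while not self._closed.is_set():
            try:
                self._queue.put(item, timeout=poll)
                return True
            except queue.Full:
                continue
        return False

    def get(self, poll=0.1):
        # Returns the next item, or None once closed and fully drained.
        while True:
            try:
                return self._queue.get(timeout=poll)
            except queue.Empty:
                if self._closed.is_set():
                    return None

    def close(self):
        # No sentinel is queued, so close() itself can never block.
        self._closed.set()
```

With this sketch, a send() that arrives after close() gets False back straight away and can drop the packet, which matters most when the caller is the main thread. The cost is a periodic wakeup while the queue is full.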