Xpra: Ticket #669: speaker forwarding may cause server to hang

Attaching/disconnecting with speaker forwarding sometimes leaves server in unresponsive state. Here are fragments ofom server log taken with "-d sound":

Server started:

2014-09-04 19:10:39,969 xpra server version 0.14.4 (r7467)
2014-09-04 19:10:39,969 running with pid 9207
E: [pulseaudio] pid.c: Daemon already running.
2014-09-04 19:10:40,012 child 'pulseaudio' with pid 9219 has terminated
2014-09-04 19:10:40,098 xpra is ready.
2014-09-04 19:10:41,965 Warning: pulseaudio has terminated. Either fix the pulseaudio command line or use --no-pulseaudio to avoid this warning.
2014-09-04 19:10:41,965  usually, only a single pulseaudio instance can be running for each user account, and one may be running already

Client attached and disconnected (1st time, all good):

2014-09-04 19:11:28,295 new_sound_buffer(<SoundSource object at 0x7f2fea99f7d0 (xpra+sound+src+SoundSource at 0x237fec0)>, 60, {'duration': 500000000L, 'timestamp': 285000
00000L}) source=<SoundSource object at 0x7f2fea99f7d0 (xpra+sound+src+SoundSource at 0x237fec0)>, suspended=False, sequence=-1
2014-09-04 19:11:28,793 new_sound_buffer(<SoundSource object at 0x7f2fea99f7d0 (xpra+sound+src+SoundSource at 0x237fec0)>, 60, {'duration': 500000000L, 'timestamp': 290000
00000L}) source=<SoundSource object at 0x7f2fea99f7d0 (xpra+sound+src+SoundSource at 0x237fec0)>, suspended=False, sequence=-1
2014-09-04 19:11:29,005 Connection lost
2014-09-04 19:11:29,006 stop_sending_sound() sound_source=<SoundSource object at 0x7f2fea99f7d0 (xpra+sound+src+SoundSource at 0x237fec0)>
2014-09-04 19:11:29,006 xpra client disconnected.
2014-09-04 19:11:29,006 stop_sending_sound_thread(())
2014-09-04 19:11:29,007 SoundPipeline.cleanup()
2014-09-04 19:11:29,007 SoundPipeline.stop()
2014-09-04 19:11:29,008 state changed: <gst.Message GstMessageState, old-state=(GstState)GST_STATE_PLAYING, new-state=(GstState)GST_STATE_PAUSED, pending-state=(GstState)G
ST_STATE_VOID_PENDING; from wavpackenc0 at 0x2659530>
2014-09-04 19:11:29,008 state changed: <gst.Message GstMessageState, old-state=(GstState)GST_STATE_PLAYING, new-state=(GstState)GST_STATE_PAUSED, pending-state=(GstState)G
ST_STATE_VOID_PENDING; from volume at 0x26594b0>
2014-09-04 19:11:29,008 state changed: <gst.Message GstMessageState, old-state=(GstState)GST_STATE_PLAYING, new-state=(GstState)GST_STATE_PAUSED, pending-state=(GstState)G
ST_STATE_VOID_PENDING; from audioconvert0 at 0x2659430>
2014-09-04 19:11:29,008 state changed: <gst.Message GstMessageState, old-state=(GstState)GST_STATE_PLAYING, new-state=(GstState)GST_STATE_PAUSED, pending-state=(GstState)G
ST_STATE_VOID_PENDING; from pulsesrc0 at 0x2659330>
2014-09-04 19:11:29,008 new-state=PAUSED
2014-09-04 19:11:29,009 SoundPipeline.stop() done
2014-09-04 19:11:29,009 SoundPipeline.cleanup() done
2014-09-04 19:11:29,009 stop_sending_sound_thread(()) done

Client attached and disconnected (2nd try, still good):

2014-09-04 19:11:47,894 new_sound_buffer(<SoundSource object at 0x7f2fea99f870 (xpra+sound+src+SoundSource at 0x7f2fa8002500)>, 60, {'duration': 500000000L, 'timestamp': 7
000000000L}) source=<SoundSource object at 0x7f2fea99f870 (xpra+sound+src+SoundSource at 0x7f2fa8002500)>, suspended=False, sequence=-1
2014-09-04 19:11:48,393 new_sound_buffer(<SoundSource object at 0x7f2fea99f870 (xpra+sound+src+SoundSource at 0x7f2fa8002500)>, 60, {'duration': 500000000L, 'timestamp': 7
500000000L}) source=<SoundSource object at 0x7f2fea99f870 (xpra+sound+src+SoundSource at 0x7f2fa8002500)>, suspended=False, sequence=-1
2014-09-04 19:11:48,841 client Protocol(SocketConnection(/home/hawk/.xpra/debstor-22)) has requested disconnection: exit on signal SIGINT
2014-09-04 19:11:48,841 Disconnecting client Protocol(SocketConnection(/home/hawk/.xpra/debstor-22)): client request
2014-09-04 19:11:48,842 stop_sending_sound() sound_source=<SoundSource object at 0x7f2fea99f870 (xpra+sound+src+SoundSource at 0x7f2fa8002500)>
2014-09-04 19:11:48,842 stop_sending_sound_thread(())
2014-09-04 19:11:48,842 SoundPipeline.cleanup()
2014-09-04 19:11:48,842 xpra client disconnected.
2014-09-04 19:11:48,843 SoundPipeline.stop()
2014-09-04 19:11:48,844 state changed: <gst.Message GstMessageState, old-state=(GstState)GST_STATE_PLAYING, new-state=(GstState)GST_STATE_PAUSED, pending-state=(GstState)G
ST_STATE_VOID_PENDING; from wavpackenc1 at 0x7f2fa4001ec0>
2014-09-04 19:11:48,844 state changed: <gst.Message GstMessageState, old-state=(GstState)GST_STATE_PLAYING, new-state=(GstState)GST_STATE_PAUSED, pending-state=(GstState)G
ST_STATE_VOID_PENDING; from volume at 0x7f2fa4001f40>
2014-09-04 19:11:48,844 state changed: <gst.Message GstMessageState, old-state=(GstState)GST_STATE_PLAYING, new-state=(GstState)GST_STATE_PAUSED, pending-state=(GstState)G
ST_STATE_VOID_PENDING; from audioconvert1 at 0x2674850>
2014-09-04 19:11:48,844 state changed: <gst.Message GstMessageState, old-state=(GstState)GST_STATE_PLAYING, new-state=(GstState)GST_STATE_PAUSED, pending-state=(GstState)G
ST_STATE_VOID_PENDING; from pulsesrc1 at 0x26748d0>
2014-09-04 19:11:48,845 new-state=PAUSED
2014-09-04 19:11:48,845 state changed: <gst.Message GstMessageState, old-state=(GstState)GST_STATE_PLAYING, new-state=(GstState)GST_STATE_PAUSED, pending-state=(GstState)G
ST_STATE_READY; from sink at 0x26749d0>
2014-09-04 19:11:48,845 SoundPipeline.stop() done
2014-09-04 19:11:48,845 SoundPipeline.cleanup() done
2014-09-04 19:11:48,845 stop_sending_sound_thread(()) done

3rd attach, renders empty windows:

2014-09-04 19:11:58,224 New connection received: SocketConnection(/home/hawk/.xpra/debstor-22)
2014-09-04 19:12:02,211 Handshake complete; enabling connection
2014-09-04 19:12:02,215 pulseaudio id=, server=, sound decoders=['wavpack'], sound encoders=['wavpack', 'flac', 'wav', 'speex'], receive=True, send=True
2014-09-04 19:12:02,216 Python/Gtk2 Linux client version 0.14.4 connected from 'deblab' as 'hawk' ('hawk')
2014-09-04 19:12:02,216 client supplied an mmap_file: /tmp/xpra.xdWpza.mmap but we cannot find it
2014-09-04 19:12:02,216 using rgb as primary encoding, also available: h264, vp8, png, png/P, png/L, webp, rgb24, jpeg, rgb32
2014-09-04 19:12:02,216 client root window size is 4400x2560 with 1 displays:
2014-09-04 19:12:02,216   ':0.0' (1161x675 mm) workarea: 4400x2527 at 0x33
2014-09-04 19:12:02,216     DisplayPort-1 1280x1024 at 0x1536 (360x270 mm)
2014-09-04 19:12:02,217     DVI-0 1440x2560 at 1280x0 (597x336 mm)
2014-09-04 19:12:02,217     DisplayPort-0 1680x1050 at 2720x1510 (474x296 mm)
2014-09-04 19:12:02,217 best resolution matching 4400x2560 is unchanged: 5120x3200
2014-09-04 19:12:02,219 setting key repeat rate from client: 660ms delay / 40ms interval
2014-09-04 19:12:02,221 keyboard mapping already configured (skipped)
2014-09-04 19:12:02,232 pactl_output('list',) returned 0
2014-09-04 19:12:02,242 pactl_output('list',) returned 0
2014-09-04 19:12:02,252 pactl_output('list',) returned 0
2014-09-04 19:12:02,264 pactl_output('list',) returned 0
2014-09-04 19:12:02,314 sound_control(start, ('wavpack',))
2014-09-04 19:12:02,314 start_sending_sound(wavpack)
2014-09-04 19:12:02,314 using sound codec wavpack
2014-09-04 19:12:02,327 pactl_output('stat',) returned 0
2014-09-04 19:12:02,327 get_pactl_stat_line(Server String:)=/tmp/pulse-8XiBnTK3trYQ/native
2014-09-04 19:12:02,328 start sound, remote pulseaudio server=, local pulseaudio server=/tmp/pulse-8XiBnTK3trYQ/native
2014-09-04 19:12:02,328 start sound, client id=, server id=
2014-09-04 19:12:02,341 pactl_output('list',) returned 0
2014-09-04 19:12:02,341 found pulseaudio monitor devices: {'auto_null.monitor': 'Monitor of Dummy Output'}
2014-09-04 19:12:02,353 pactl_output('set-source-mute', 'auto_null.monitor', '0') returned 0
2014-09-04 19:12:02,354 set_source_mute: output=
2014-09-04 19:12:02,354 pipeline elements=['pulsesrc device=auto_null.monitor', 'audioconvert', 'volume name=volume volume=1.0', 'wavpackenc', None, 'appsink name=sink']
2014-09-04 19:12:02,354 pipeline=pulsesrc device=auto_null.monitor ! audioconvert ! volume name=volume volume=1.0 ! wavpackenc ! appsink name=sink
2014-09-04 19:12:02,357 starting sound capture using pulseaudio device: Monitor of Dummy Output
2014-09-04 19:12:02,357 start_sending_sound() sound source=<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>
2014-09-04 19:12:02,357 SoundPipeline.start()
2014-09-04 19:12:02,362 SoundPipeline.start() done
2014-09-04 19:12:02,363 state changed: <gst.Message GstMessageState, old-state=(GstState)GST_STATE_NULL, new-state=(GstState)GST_STATE_READY, pending-state=(GstState)GST_S
TATE_VOID_PENDING; from sink at 0x26749d0>
2014-09-04 19:12:05,188 new preroll0: 112 bytes
2014-09-04 19:12:05,189 new_sound_buffer(<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, 112, {'duration': 500000000L, 'timestamp':
0L}) source=<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, suspended=False, sequence=-1
2014-09-04 19:12:05,189 new_sound_buffer(<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, 112, {'duration': 500000000L, 'timestamp':
0L}) source=<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, suspended=False, sequence=-1
2014-09-04 19:12:05,688 new_sound_buffer(<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, 60, {'duration': 500000000L, 'timestamp': 5
00000000L}) source=<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, suspended=False, sequence=-1
2014-09-04 19:12:06,186 new_sound_buffer(<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, 60, {'duration': 500000000L, 'timestamp': 1
000000000L}) source=<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, suspended=False, sequence=-1
2014-09-04 19:12:06,685 new_sound_buffer(<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, 60, {'duration': 500000000L, 'timestamp': 1
500000000L}) source=<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, suspended=False, sequence=-1

After disconnect server keeps logging:

2014-09-04 19:12:57,173 new_sound_buffer(<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, 60, {'duration': 500000000L, 'timestamp': 5
2000000000L}) source=<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, suspended=False, sequence=-1
2014-09-04 19:12:57,682 new_sound_buffer(<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, 60, {'duration': 500000000L, 'timestamp': 5
2500000000L}) source=<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, suspended=False, sequence=-1
2014-09-04 19:12:58,181 new_sound_buffer(<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, 60, {'duration': 500000000L, 'timestamp': 5
3000000000L}) source=<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, suspended=False, sequence=-1
2014-09-04 19:12:58,680 new_sound_buffer(<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, 60, {'duration': 500000000L, 'timestamp': 5
3500000000L}) source=<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, suspended=False, sequence=-1
2014-09-04 19:12:59,178 new_sound_buffer(<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, 60, {'duration': 500000000L, 'timestamp': 5
4000000000L}) source=<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, suspended=False, sequence=-1
2014-09-04 19:12:59,677 new_sound_buffer(<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, 60, {'duration': 500000000L, 'timestamp': 5
4500000000L}) source=<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, suspended=False, sequence=-1
2014-09-04 19:13:00,176 new_sound_buffer(<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, 60, {'duration': 500000000L, 'timestamp': 5
5000000000L}) source=<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, suspended=False, sequence=-1
2014-09-04 19:13:00,674 new_sound_buffer(<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, 60, {'duration': 500000000L, 'timestamp': 5
5500000000L}) source=<SoundSource object at 0x7f2fe822c6e0 (xpra+sound+src+SoundSource at 0x7f2fdc0021a0)>, suspended=False, sequence=-1

At this point server do not respond to new connections so I terminate it with "kill -9".

Workaround: attach with --no-speaker.

This is a very old issue which I can reproduce easily with 0.14.4.



Thu, 04 Sep 2014 09:42:46 GMT - onlyjob: version changed


Fri, 05 Sep 2014 14:34:51 GMT - Antoine Martin: owner changed

Can you check if maybe we're leaking something every time we start + stop sound? Maybe run it without limits, and / or even as root to try? This would cause us to hit some limit eventually (file descriptors?), which could explain why some things still run (new_sound_buffer) but others become unresponsive.


Tue, 23 Sep 2014 06:08:12 GMT - Antoine Martin:

See also: #663, ticket:400#comment:8.


Tue, 23 Sep 2014 13:34:29 GMT - Antoine Martin: attachment set

backtrace from a deep clipboard loop


Tue, 23 Sep 2014 13:35:06 GMT - Antoine Martin:

Well, well, maybe I can reproduce this after all. I was investigating something completely different and I think I managed to reproduce it semi reliably with:

The trick here was to also have synergy syncing the clipboard between the laptop and the host running the xp vm, as well as virtualbox doing more clipboard synchronization between the host and the xp vm, then finally we have xpra doing clipboard sync between the laptop and xp... full circle.

At one point, I saw 90+ kworker threads in my process list - they eventually cleared themselves.

But I also managed to lockup the laptop's display, requiring a reboot (killing X11 would have done it too I suppose).

Interestingly, killing a stuck server made it spit out:

2014-09-23 20:30:43,823 xpra client disconnected.
2014-09-23 20:30:43,827 loop nesting too deep: 10
2014-09-23 20:30:43,827 you may have a clipboard forwarding loop, disabling the clipboard
2014-09-23 20:30:43,828 New connection received: SocketConnection(('192.168.42.103', 10000) - ('192.168.42.195', 1316))
2014-09-23 20:30:43,829 New connection received: SocketConnection(/home/antoine/.xpra/laptop-0)
2014-09-23 20:30:43,830 New connection received: SocketConnection(/home/antoine/.xpra/laptop-0)
2014-09-23 20:30:43,831 New connection received: SocketConnection(/home/antoine/.xpra/laptop-0)
2014-09-23 20:30:43,831 New connection received: SocketConnection(/home/antoine/.xpra/laptop-0)
2014-09-23 20:30:43,832 New connection received: SocketConnection(/home/antoine/.xpra/laptop-0)
2014-09-23 20:30:43,833 New connection received: SocketConnection(/home/antoine/.xpra/laptop-0)
2014-09-23 20:30:43,849 client Protocol(SocketConnection(('192.168.42.103', 10000) - ('192.168.42.195', 1315))) has requested disconnection: exit on signal SIGINT
2014-09-23 20:30:43,849 Disconnecting client Protocol(SocketConnection(('192.168.42.103', 10000) - ('192.168.42.195', 1315))): client request
2014-09-23 20:30:43,850 internal error: read connection SocketConnection(('192.168.42.103', 10000) - ('192.168.42.195', 1316)) reset: [Errno 104] Connection reset by peer
2014-09-23 20:30:43,856 Handshake complete; enabling connection
2014-09-23 20:30:43,857 Python/Gtk2 Microsoft Windows XP client version 0.15.0 connected from 'XP-Pro-Dev-VBOX' as 'XPPro' ('XPPro')
2014-09-23 20:30:43,857 using h264 as primary encoding, also available: vp8, png, png/P, png/L, webp, rgb24, jpeg, rgb32
2014-09-23 20:30:43,857 client root window size is 1575x1177
2014-09-23 20:30:43,858 shadow server: setting default keymap translation
2014-09-23 20:30:43,907 Connection lost
2014-09-23 20:30:43,907 Connection lost
2014-09-23 20:30:43,907 Connection lost
2014-09-23 20:30:43,907 Connection lost
2014-09-23 20:30:43,907 Connection lost
2014-09-23 20:30:43,907 Connection lost

@onlyjob: can you reproduce your problem with clipboard disabled? (I cannot so far)


Tue, 23 Sep 2014 16:01:09 GMT - Antoine Martin: priority changed; milestone set

I've now connected well over a hundred times (scripted detach, with attach in a while true loop). No problem as long as the clipboard is disabled. I think that sound just makes it worse as it relies on the main loop to be running to get the data out of gstreamer + pulseaudio in time. But maybe there are other ways of getting the main loop stuck than the clipboard?

Maybe I made things worse with r7751? As it will cause a timeout on the main loop waiting for the reply to the message that causes us to stop the clipboard. I don't think so, as the main loop should still be running, just nested. But maybe this nesting doesn't play well with gstreamer? Worst case scenario: we have to rewrite the ugly clipboard nested loop code, or move the sound code to its own process (using multiprocessing).

@onlyjob: if you can reproduce this problem with the clipboard disabled, please provide a gdb backtrace.


Thu, 02 Oct 2014 17:14:29 GMT - onlyjob: owner changed

That's right, I can't reproduce when attaching with --no-clipboard.


Fri, 17 Oct 2014 10:08:04 GMT - Antoine Martin: status changed

I have finally managed to reproduce this one. I had to kill -9 the server and use xpra start --use-display to recover the session.

It took a laptop running the Fedora 21 alpha as server, and just my regular Fedora 20 client. Just a shame I am going away (hence the laptop upgrade)... this will have to wait until I get back.


Mon, 27 Oct 2014 18:08:41 GMT - Antoine Martin:

I have an idea: until we rewrite the clipboard using native calls (which would be nice as it would get rid of the ugly nested loop hack - but this is going to take a long time), we can move the sound code to its own process using multiprocessing - which means it will get its own main loop, independent of the clipboard code. It is Python 2.6+ only, which is fine for v0.15+. We can push sound data using a multiprocessing queue, and to get the events back out maybe we have to keep a thread around, just waiting for the out queue. It also means that this process will be less subjected to latency issues found in the main process, so less likely to get overruns.


Wed, 19 Nov 2014 04:04:29 GMT - onlyjob:

I love this idea. :) Debian dropped Python-2.6 and 2.7 is in Wheezy for a while so I believe it is worth abandoning support for 2.6 in order to benefit from multiprocessing.


Wed, 19 Nov 2014 22:51:07 GMT - Antoine Martin:

multiprocessing is available since python 2.6

Unfortunately, it just doesn't work with pygst, which relies on the gobject main loop and we cannot run two copies of the main loop...

The only solution that I can see would be to use a distinct subprocess that we spawn and talk to using pipes: we could use its stdin to send it messages (set_volume and stop? anything else? stop can be done with sigint) and get the compressed sound data from its stdout? Better yet would be to use mmap and re-use our video mmap code.


Mon, 16 Feb 2015 07:08:27 GMT - Antoine Martin: attachment set

moves the sound capture / play to a separate process - hackish


Mon, 16 Feb 2015 07:14:20 GMT - Antoine Martin:

The patch above is very hackish and would need cleaning up. It only handles sound capture via a separate process, not playback.

It doesn't handle any message exchange, we just assume the data is the compressed sound buffers.. (don't want to add the network layer there if it can be avoided)

It doesn't handle mmap transfers which would cave us a lot of unnecessary events and process context switching, if only we had an easy way of signalling the parent that some data is available without writing to the pipe. Can't see how.

That said, it works very well and start + stop are very reliable.


Mon, 09 Mar 2015 14:24:58 GMT - Antoine Martin: attachment set

very slightly updated patch


Tue, 10 Mar 2015 05:44:12 GMT - Antoine Martin: attachment set

define a new xpra subcommand and use the standard protocol for wrapping the data to transfer


Tue, 10 Mar 2015 11:11:33 GMT - Antoine Martin: attachment set

use the regular network stack for forwarding sound data and signals going both ways - also emulate method calls (but not return values..)


Tue, 10 Mar 2015 12:17:46 GMT - Antoine Martin:

Things left to do:


Tue, 10 Mar 2015 13:35:11 GMT - Antoine Martin: attachment set

mostly working patch


Wed, 11 Mar 2015 13:56:46 GMT - Antoine Martin: attachment set

latest work in progress


Sat, 14 Mar 2015 11:23:12 GMT - Antoine Martin: attachment set

updated patch with almost working client side and get_state emulation


Sat, 14 Mar 2015 13:55:54 GMT - Antoine Martin: attachment set

it all works, except when it doesn't (need to resolve state races now that the state updates take much longer to propagate - through the protocol layer and pipes)


Mon, 16 Mar 2015 05:13:55 GMT - Antoine Martin: attachment set

lots of problems on win32 (surprise surprise) because pipes don't work properly there


Mon, 16 Mar 2015 16:36:39 GMT - Antoine Martin: attachment set

updated patch - the pipes really are getting closed (ignoring the eof does not help)


Tue, 17 Mar 2015 03:29:36 GMT - Antoine Martin: attachment set

win32 works (with ugly platform code), but now we get underruns.. and choppy sound!


Tue, 17 Mar 2015 11:01:47 GMT - Antoine Martin: attachment set

work in progress: linux works fine (I think), but win32 still does not (jitter and underruns)


Tue, 17 Mar 2015 11:14:03 GMT - Antoine Martin:

Latest status: the win32 client seems to suffer from clock drift. But maybe this isn't a gstreamer pipeline clock drift but rather a system clock problem?? (added the "time" metadata attribute to see how this changes over time)

Notes:

Latest TODO list:

Later:


Tue, 17 Mar 2015 11:41:00 GMT - Antoine Martin: attachment set

I think this is it!


Tue, 17 Mar 2015 14:06:34 GMT - Antoine Martin: owner, status changed

Finally applied in r8786. See long commit message for details. (only very minor differences since v11 patch)

Some new environment variables (which should not be needed - I hope):

@afarr: what needs testing:

Please re-assign to me so that I can try to close the remaining issues (as per comment:13 - though many are likely to be re-scheduled unless critical)


Wed, 18 Mar 2015 05:00:46 GMT - Antoine Martin:

@afarr: as of r8793, we can tune the minimum queue time (the amount of data we buffer before we start feeding the sound card) as well as the overall buffer size.

ie:

XPRA_SOUND_QUEUE_TIME=300 XPRA_SOUND_QUEUE_MINTIME=100 xpra attach -d sound

shows:

pipeline elements=['appsrc name=src max-bytes=32768 emit-signals=0 block=0 is-live=0 stream-type=stream format=4', \
    'mp3parse', 'mad', 'audioconvert', 'audioresample', 'volume name=volume volume=1.0', \
    'queue name=queue min-threshold-time=100000000 max-size-buffers=0 max-size-bytes=0 max-size-time=300000000 leaky=2', \
    'autoaudiosink async=True qos=True sync=False']

I don't think we want to reduce the buffer size too much, or we will get more overruns. But maybe a little? (better to get overruns than having the sound go too far out of sync) To see the level of the sound buffer, you can

We should be able to reduce the min queue size. The default was to use half the buffer size, which was 225ms. r8794 reduces this to just a quarter, 112ms. This should be enough so that we don't get underruns (which can cause some crackling), without getting too much out of sync with the picture.

Note: a great idea that Lonnie has had: we should be able to see what range of values the queue buffer is using and self tune it. ie: if the queue has always between 200ms and 250ms worth of sound data in it, then we don't need such a large buffer: having just 150ms total and 50ms minimum would be enough to have 50ms safety on each side. Then we also need to take the video latency into account to try to sync things up... but at least we then know the range (or target?) we can aim for.


Fri, 20 Mar 2015 08:08:36 GMT - Antoine Martin: attachment set

move the subprocess bits to a generic class and add tests


Fri, 20 Mar 2015 15:02:11 GMT - Antoine Martin:

r8805 makes the code much more generic and robust. This shouldn't affect things (and I've tested it on multiple platforms, and it adds a unit test which should help us catch regressions in the future), but I have to mention it.

PS: the idea of prioritizing sound packets was a bad one, they are already queued up ahead of the pixel data, and there is very little else anyway - so no changes needed in that area.

See also: ticket:790#comment:20 which seems to have a way of causing sound restarts.


Sun, 22 Mar 2015 06:48:47 GMT - Antoine Martin: attachment set

osx always requires more effort and works less well - PITA


Mon, 23 Mar 2015 07:41:43 GMT - Antoine Martin: attachment set

osx is being really difficult


Mon, 23 Mar 2015 10:56:27 GMT - Antoine Martin:

Can you please test on various OSX versions and let me know how it goes? I am seeing lots of overruns on my 10.5 VM, but not on my 10.6 one. We can always tune the queue size if we find that it overruns too much (as suggested in comment:15): I think it is better to have sound that is out-of-sync by up to a second rather than having restarts every 2 seconds.. Testing will tell.


Mon, 23 Mar 2015 17:19:45 GMT - Antoine Martin:

If you have problems with stuttering sound and underruns, see ticket:400#comment:15


Mon, 30 Mar 2015 08:51:38 GMT - Antoine Martin:

Looks like something broke py3k (#640) support along the way, yay. I had to commit some minor fixes in r8864. It does spawn the new sound process, and the pipeline looks correct, but nothing happens:

2015-03-30 15:37:54,409 initialized CODECS:
2015-03-30 15:37:54,409 * wavpack : ('wavpackenc', None, 'wavpackdec', 'wavpackparse')
2015-03-30 15:37:54,409 * wav : ('wavenc', None, None, 'wavparse')
2015-03-30 15:37:54,409 * flac : ('flacenc', 'oggmux', 'flacdec', 'oggdemux')
2015-03-30 15:37:54,409 * speex : ('speexenc', 'oggmux', 'speexdec', 'oggdemux')
2015-03-30 15:37:54,412 is_pa_installed() pactl_bin=
2015-03-30 15:37:54,412 SoundSink(..) found matching codecs ['flac']
2015-03-30 15:37:54,413 pipeline elements=['appsrc name=src max-bytes=32768 emit-signals=0 block=0 is-live=0 stream-type=stream format=4', \
    'oggdemux', 'flacdec', 'audioconvert', 'audioresample', 'volume name=volume volume=1.0', \
    'queue name=queue min-threshold-time=112000000 max-size-buffers=0 max-size-bytes=0 max-size-time=450000000 leaky=2', \
    'autoaudiosink sync=False async=True qos=True']
2015-03-30 15:37:54,413 pipeline=appsrc name=src max-bytes=32768 emit-signals=0 block=0 is-live=0 stream-type=stream format=4 ! \
    oggdemux ! flacdec ! audioconvert ! audioresample ! volume name=volume volume=1.0 ! \
    queue name=queue min-threshold-time=112000000 max-size-buffers=0 max-size-bytes=0 max-size-time=450000000 leaky=2 ! \
    autoaudiosink sync=False async=True qos=True
2015-03-30 15:37:54,422 SoundPipeline.start()
2015-03-30 15:37:54,427 sound sink queue underrun: level=0
2015-03-30 15:37:54,428 SoundPipeline.start() done
2015-03-30 15:37:54,429 state changed: <GstMessage at 0x1728440>
2015-03-30 15:37:54,429 state changed: <GstMessage at 0x17284c0>
2015-03-30 15:37:54,429 state changed: <GstMessage at 0x1728540>
2015-03-30 15:37:54,429 state changed: <GstMessage at 0x17285c0>
2015-03-30 15:37:54,429 state changed: <GstMessage at 0x1728640>
2015-03-30 15:37:54,430 state changed: <GstMessage at 0x17286c0>
2015-03-30 15:37:54,430 state changed: <GstMessage at 0x1728740>
2015-03-30 15:37:54,430 state changed: <GstMessage at 0x17283c0>
2015-03-30 15:37:54,430 new-state=READY
2015-03-30 15:37:54,430 stream status: <GstMessage at 0x17561c0>
2015-03-30 15:37:54,431 state changed: <GstMessage at 0x1756240>
2015-03-30 15:37:54,431 state changed: <GstMessage at 0x17562c0>
2015-03-30 15:37:54,431 state changed: <GstMessage at 0x1756340>
2015-03-30 15:37:54,431 stream status: <GstMessage at 0x1756640>
2015-03-30 15:37:54,431 state changed: <GstMessage at 0x17563c0>
2015-03-30 15:37:54,431 state changed: <GstMessage at 0x1756440>
2015-03-30 15:37:54,431 state changed: <GstMessage at 0x17564c0>
2015-03-30 15:37:54,431 stream status: <GstMessage at 0x1756540>
2015-03-30 15:37:54,431 state changed: <GstMessage at 0x17565c0>
2015-03-30 15:37:54,432 stream status: <GstMessage at 0x7f5f60002480>
2015-03-30 15:37:54,432 sound_sink_state_changed(sink_subprocess_wrapper(<subprocess.Popen object at 0x7f58d4145f10>), b'ready') \
    on_sink_ready=<function UIXpraClient.start_receiving_sound.<locals>.sink_ready at 0x7f58d47cbef0>

But this won't hold up the release, the GTK3 port is not in a great state anyway.


Mon, 30 Mar 2015 09:49:54 GMT - Antoine Martin:

As of r8870, we lower the min queue time to just 50ms by default.


Tue, 31 Mar 2015 08:48:31 GMT - Antoine Martin:

@onlyjob: thinking of releasing 0.15.x quite soon, now would be a good time to shout if you still have problems... or wait another 6 months!


Tue, 31 Mar 2015 10:46:50 GMT - onlyjob:

I would be glad to test but I may not be able to do it for few weeks... Too many things on my plate at the moment... Give at least a week or two and hopefully I'll be back with you with a feedback. :)


Tue, 31 Mar 2015 23:47:37 GMT - alas:

Some testing with win/osx 0.15.0 r8872 clients against a fedora 20 0.15.0 r8872 server, turning the speakers on/off with the tray/menu controls works reliably.

Playing with the XPRA_SOUND_QUEUE_TIME & XPRA_SOUND_QUEUE_MINTIME I discovered that, with the settings you mentioned in the example above (TIME=300 & MINTIME=100) I was able to cause some speaker mischief by mousing quickly over assorted link widgets (youtube right side tile menu of suggested videos was particularly useful)... the lower the value the TIME was set to, the easier it was to not only cause overruns, but to then disable the speakers entirely (both osx and window clients).

-d sound logging, just FYI, looks like this:

...
2015-03-31 15:16:11,540 sound sink: pushed 417 bytes, new buffer level: 444ms
2015-03-31 15:16:11,541 add_data(522 bytes, {'duration': 26122449, 'timestamp': 14315102040L, 'caps': 'audio/mpeg, mpegversion=(int)1, mpegaudioversion=(int)1, layer=(int)3, channels=(int)2, rate=(int)44100', 'sequence': 2, 'time': 1427840171322L}) queue_state=pushing, src=/GstPipeline:pipeline0/GstAppSrc:src (__main__.GstAppSrc)
2015-03-31 15:16:11,541 sound sink: pushed 522 bytes, new buffer level: 470ms
2015-03-31 15:16:11,541 sound sink queue overrun: level=470
2015-03-31 15:16:11,542 re-starting speaker because of overrun
2015-03-31 15:16:11,542 stop_receiving_sound() sound sink=sink_subprocess_wrapper(<subprocess.Popen object at 0x7dbda90>)
2015-03-31 15:16:11,542 add_data(835 bytes, {'duration': 26122449, 'timestamp': 14341224489L, 'caps': 'audio/mpeg, mpegversion=(int)1, mpegaudioversion=(int)1, layer=(int)3, channels=(int)2, rate=(int)44100', 'sequence': 2, 'time': 1427840171354L}) queue_state=overrun, src=/GstPipeline:pipeline0/GstAppSrc:src (__main__.GstAppSrc)
2015-03-31 15:16:11,543 dropping new data because of overrun: 100%
2015-03-31 15:16:11,543 ignoring sound data with old sequence number 2
2015-03-31 15:16:11,543 internal error: write connection TwoFileConnection() reset: [Errno 32] Broken pipe
2015-03-31 15:16:11,543 the sound sink process has stopped (())
2015-03-31 15:16:11,543 stop_receiving_sound() sound sink=None
2015-03-31 15:16:11,543 SoundPipeline.stop()
2015-03-31 15:16:11,544 speaker is now disabled - dropping packet
2015-03-31 15:16:11,545 speaker is now disabled - dropping packet
2015-03-31 15:16:11,550 speaker is now disabled - dropping packet
2015-03-31 15:16:11,551 speaker is now disabled - dropping packet
2015-03-31 15:16:11,556 speaker is now disabled - dropping packet
2015-03-31 15:16:11,556 speaker is now disabled - dropping packet
2015-03-31 15:16:11,559 speaker is now disabled - dropping packet
2015-03-31 15:16:11,563 speaker is now disabled - dropping packet
2015-03-31 15:16:11,564 speaker is now disabled - dropping packet
2015-03-31 15:16:11,565 speaker is now disabled - dropping packet
2015-03-31 15:16:11,570 SoundPipeline.stop() done
2015-03-31 15:16:11,746 restart() sound_sink=None, codec=mp3, server_sound_sequence=True
2015-03-31 15:16:11,746 send_new_sound_sequence() sequence=3
2015-03-31 15:16:11,746 start_receiving_sound() sound sink=None
2015-03-31 15:16:11,746 start_sound_sink(mp3)
2015-03-31 15:16:11,747 starting mp3 sound sink
2015-03-31 15:16:11,747 start_receiving_sound(mp3)
2015-03-31 15:16:11,761 mp3 sound sink started
2015-03-31 15:16:11,912 ignoring sound data with old sequence number 2
2015-03-31 15:16:11,914 ignoring sound data with old sequence number 2
2015-03-31 15:16:11,914 ignoring sound data with old sequence number 2
2015-03-31 15:16:11,915 ignoring sound data with old sequence number 2
2015-03-31 15:16:11,915 ignoring sound data with old sequence number 2
2015-03-31 15:16:11,915 ignoring sound data with old sequence number 2
2015-03-31 15:16:11,916 ignoring sound data with old sequence number 2
...

Eventually the speakers don't restart without using a tray/menu link.

2015-03-31 15:16:12,012 sound_sink_state_changed(sink_subprocess_wrapper(<subprocess.Popen object at 0x7dbddd0>), ready) on_sink_ready=<function sink_ready at 0xd4d9cb0>
2015-03-31 15:16:12,012 sink_ready(()) codec=mp3
2015-03-31 15:16:12,017 server sent end-of-stream, closing sound pipeline
2015-03-31 15:16:12,018 stop_receiving_sound() sound sink=sink_subprocess_wrapper(<subprocess.Popen object at 0x7dbddd0>)
2015-03-31 15:16:12,019 SoundPipeline.stop()
2015-03-31 15:16:12,020 SoundPipeline.stop() done
2015-03-31 15:16:12,292 the sound sink process has stopped (())
2015-03-31 15:16:12,293 stop_receiving_sound() sound sink=None
2015-03-31 15:16:12,297 speaker is now disabled - dropping packet

With low TIME settings starting sound in a second tab (at the same time as some first tab) will also cause stuttering and make it even easier to disable speakers.

Testing and tweaking, it looks like XPRA_SOUND_QUEUE_TIME=500 and the now default XPRA_SOUND_QUEUE_MINTIME=50 makes it nearly impossible to disable the speakers with this trick.

Seems very stable otherwise.


Fri, 03 Apr 2015 13:51:15 GMT - Antoine Martin: owner, status changed

Some minor improvements and fixes in r8905, r8906, r8907, r8908. Note: this may actually make things "worse" in the sense that the overruns are now more likely to cause the sound to stop and not restart. I needed to fix that properly anyway. I still cannot figure out why we get the autorelease warnings (see ticket:533#comment:71) - maybe because we tear down the pipeline without giving it enough time to do it cleanly?


Tue, 07 Apr 2015 15:32:53 GMT - Antoine Martin:

more minor updates and refactorings: r8951, r8952, r8953. r8954 should also make the sound stop "cleaner" / more graceful: we try to close the pipeline properly (stopping it then destroying the objects) before killing the process.


Tue, 07 Apr 2015 16:43:26 GMT - Antoine Martin: owner, status changed

r8956 makes it easier to test overruns without changing the queue settings, just start the client with:

XPRA_FAKE_OVERRUN=N xpra attach tcp:192.168.42.100:10000

Where N is the number of seconds to wait before faking an overrun which restarts the sound pipeline.

Using this patch and printing the call stack from stop_receiving_sound, I can see that we call it from the sound_sink_overrun callback as we should, but also from sound_process_stopped and we end up killing the new process and telling the server to stop sending sound data.. r8955 fixes that.

I guess we could also restart the sound when the process dies, but this should never happen, so the log warning message should be enough for now.

I am also using the same XPRA_FAKE_OVERRUN=2 env var to trigger lots of restarts on win32, so far so good. No crashes, restarts cleanly everytime.

@afarr: looks ready for more testing. Can you break it again somehow?


Wed, 08 Apr 2015 01:29:45 GMT - alas:

Well... I couldn't get it to break, using a win32 0.15.0 r8957 client against a fedora 20 0.15.0 r8957 server, but setting the client with XPRA_FAKE_OVERRUN=7 I am noticing that the speaker overruns do cause around a half-second break before the speakers re-start (might just be an issue of the fake overruns in contrast with the "genuine" ones that usually have a much less noticeable "break" in the sound during the re-start process).

The speaker restarts were dependable though.

I did run into a couple of tracebacks server-side though.

2015-04-07 18:15:42,985 sound-source codec: MPEG-1 Layer 3 (MP3)
Exception in thread parse (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
  File "/usr/lib64/python2.7/threading.py", line 764, in run
  File "/usr/lib64/python2.7/site-packages/xpra/net/protocol.py", line 592, in
_read_parse_thread_loop
  File "/usr/lib64/python2.7/site-packages/xpra/net/protocol.py", line 552, in
_internal_error
<type 'exceptions.AttributeError'>: 'NoneType' object has no attribute 'error'
2015-04-07 18:19:20,947 error processing callback <bound method ServerSource.new_stream of
<xpra.server.source.ServerSource object at 0x7fe8e40255d0>> for new-stream packet
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xpra/net/subprocess_wrapper.py", line 343, in
process_packet
    cb(self, *all_args)
  File "/usr/lib64/python2.7/site-packages/xpra/server/source.py", line 798, in new_stream
    self.sound_source.codec = codec
AttributeError: 'NoneType' object has no attribute 'codec'
2015-04-07 18:19:21,002 using Pulseaudio device 'Monitor of Null Output'
** Message: pygobject_register_sinkfunc is deprecated (GstObject)
2015-04-07 18:19:21,406 sound-source codec: MPEG-1 Layer 3 (MP3)
2015-04-07 18:19:22,017 using Pulseaudio device 'Monitor of Null Output'
** Message: pygobject_register_sinkfunc is deprecated (GstObject)
2015-04-07 18:19:22,279 using Pulseaudio device 'Monitor of Null Output'
** Message: pygobject_register_sinkfunc is deprecated (GstObject)
2015-04-07 18:19:22,495 error setting up sound: a sound source already exists
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xpra/server/source.py", line 775, in
start_sending_sound
    assert self.sound_source is None, "a sound source already exists"
AssertionError: a sound source already exists

Wed, 08 Apr 2015 05:40:28 GMT - Antoine Martin: attachment set

work in progress patch to restart the sink pipeline faster


Wed, 08 Apr 2015 05:47:23 GMT - Antoine Martin: owner, status changed

The speaker restarts were dependable though.


That's the most important thing.

I am noticing that the speaker overruns do cause around a half-second break before the speakers re-start


Hopefully, there won't be too many restarts then.

We do buffer a tiny bit more than before (the min queue thing), we wait before the pipeline is actually ready before throwing data at it.. (rather than hoping for the best and filling up the queue too quickly, which is what we did before) which does mean that this delay will actually be worse over high latency links.

I am working on making the restarts faster (which we should be able to do now in a different way: we don't have to wait for the previous sound process to terminate to start a new one - as long as it does terminate eventually...). The problem is that restarts tell the server to stop and start again, we get the "end-of-stream" message and that kills the new pipeline rather the old one! Will fix.

As for the stacktraces above (thanks for posting them), they are caused by us killing the sound process before it has a chance of exiting cleanly - I will try to see if I can just raise the kill delay to allow for a more graceful exit.

Taking the ticket back to deal with those small issues.


Wed, 08 Apr 2015 12:43:45 GMT - Antoine Martin:

Lots more tweaks and subtle fixes in r8958, r8959, r8960, r8961, r8962, r8965, r8966. Some of which should be backported (started in r8963 + r8964 - more needed).

Problem is that I get deadlocks on OSX after restarts, sometimes very quickly, sometimes after 20+ restarts. Not sure what causes it, but the UI thread is deadlocked:

2015-04-08 13:37:06,710 sound-sink pushed 522 bytes, new buffer level: 0ms
2015-04-08 13:37:06,711 sound-sink add_data(417 bytes, {'duration': 26122449, 'timestamp': 574693877, 'caps': 'audio/mpeg, mpegversion=(int)1, mpegaudioversion=(int)1, layer=(int)3, channels=(int)2, rate=(int)44100', 'sequence': 38, 'time': 1428496626675L}) queue_state=underrun, src=/GstPipeline:pipeline0/GstAppSrc:src (__main__.GstAppSrc)
2015-04-08 13:37:06,711 sound-sink pushed 417 bytes, new buffer level: 26ms
2015-04-08 13:37:06,712 sound-sink queue running: level=52
2015-04-08 13:37:06,712 sound-sink queue pushing: level=52
2015-04-08 13:37:06,768 sound-sink queue underrun: level=0
2015-04-08 13:37:06,775 poll_UI_loop() last_UI_thread_time was 1.5 seconds ago (max 1000), UI_blocked=False
2015-04-08 13:37:06,775 UI thread is now blocked
2015-04-08 13:37:06,779 check_server_echo(0) last=True, server_ok=True
(..)
2015-04-08 13:37:07,902 sound-sink pushed 417 bytes, new buffer level: 470ms
2015-04-08 13:37:07,902 sound-sink queue overrun: level=470
2015-04-08 13:37:07,903 process_packet callbacks(overrun)=[(<bound method XpraClient.sound_sink_overrun of gtk2.client>, [])]
2015-04-08 13:37:07,903 re-starting speaker because of overrun
2015-04-08 13:37:07,903 stop_receiving_sound(True) sound sink=sink_subprocess_wrapper(<subprocess.Popen object at 0x19586890>)
2015-04-08 13:37:07,903 stop_receiving_sound(True) calling <bound method sink_subprocess_wrapper.cleanup of sink_subprocess_wrapper(<subprocess.Popen object at 0x19586890>)>
2015-04-08 13:37:07,903 cleanup() sending request to cleanup
2015-04-08 13:37:07.904 Xpra[98954:5bfb] *** _NSAutoreleaseNoPool(): Object 0xc900d0 of class NSCFNumber autoreleased with no pool in place - just leaking
Stack: (0x1d5df4f 0x1c6a432 0x64b09c1 0x5b5c387 0x5b5cf07 0x1ac5ff7 0x1ac21ce 0x1add099 0x1adc5ea 0x1adca35 0x1ac894d 0x1ac8d35 0x1ac8ca1 0x5bc47b4 0x1ac4f88 0x1ac2829 0x1ac21ce 0x1adce0f 0x1adc5ea 0x1adca35 0x5c9aac9 0x5bc414e 0x79083ba 0x2b1b80 0x2b431c 0x22d9cb 0x1fba95 0x2aa5ce 0x1fbb40 0x7897684 0x1ac21ce 0x1add099 0x1adb7c7 0x7890484 0x2b1b80 0x2b431c 0x2b20ab 0x2b431c 0x22d9cb 0x1fba95 0x2aea26 0x2b3a9f 0x2b431c 0x2b20ab 0x2b431c 0x22dad2 0x1fba95 0x2aea26 0x2b3a9f 0x2b3a9f 0x2b431c 0x22d9cb 0x1fba95 0x20d29e 0x1fba95 0x2aa5ce 0x2f37ec 0x412055 0x411f12)
2015-04-08 13:37:07.905 Xpra[98954:5bfb] *** _NSAutoreleaseNoPool(): Object 0x116aaae0 of class NSCFDictionary autoreleased with no pool in place - just leaking
Stack: (0x1d5df4f 0x1c6a432 0x6c01be 0x64b09e9 0x5b5c387 0x5b5cf07 0x1ac5ff7 0x1ac21ce 0x1add099 0x1adc5ea 0x1adca35 0x1ac894d 0x1ac8d35 0x1ac8ca1 0x5bc47b4 0x1ac4f88 0x1ac2829 0x1ac21ce 0x1adce0f 0x1adc5ea 0x1adca35 0x5c9aac9 0x5bc414e 0x79083ba 0x2b1b80 0x2b431c 0x22d9cb 0x1fba95 0x2aa5ce 0x1fbb40 0x7897684 0x1ac21ce 0x1add099 0x1adb7c7 0x7890484 0x2b1b80 0x2b431c 0x2b20ab 0x2b431c 0x22d9cb 0x1fba95 0x2aea26 0x2b3a9f 0x2b431c 0x2b20ab 0x2b431c 0x22dad2 0x1fba95 0x2aea26 0x2b3a9f 0x2b3a9f 0x2b431c 0x22d9cb 0x1fba95 0x20d29e 0x1fba95 0x2aa5ce 0x2f37ec 0x412055 0x411f12)
2015-04-08 13:37:07.907 Xpra[98954:5bfb] *** _NSAutoreleaseNoPool(): Object 0xc93c40 of class NSCFNumber autoreleased with no pool in place - just leaking
Stack: (0x1d5df4f 0x1c6a432 0x64b09c1 0x5b5c3a8 0x5b5cf07 0x1ac5ff7 0x1ac21ce 0x1add099 0x1adc5ea 0x1adca35 0x1ac894d 0x1ac8d35 0x1ac8ca1 0x5bc47b4 0x1ac4f88 0x1ac2829 0x1ac21ce 0x1adce0f 0x1adc5ea 0x1adca35 0x5c9aac9 0x5bc414e 0x79083ba 0x2b1b80 0x2b431c 0x2b20ab 0x2b431c 0x2b20ab 0x2b431c 0x22d9cb 0x1fba95 0x2aa5ce 0x1fbb40 0x7897684 0x1ac21ce 0x1add099 0x1adb7c7 0x7890484 0x2b1b80 0x2b431c 0x2b20ab 0x2b431c 0x22d9cb 0x1fba95 0x2aea26 0x2b3a9f 0x2b431c 0x2b20ab 0x2b431c 0x22dad2 0x1fba95 0x2aea26 0x2b3a9f 0x2b3a9f 0x2b431c 0x22d9cb 0x1fba95 0x20d29e 0x1fba95 0x2aa5ce 0x2f37ec 0x412055 0x411f12)
2015-04-08 13:37:07.907 Xpra[98954:5bfb] *** _NSAutoreleaseNoPool(): Object 0x116aab20 of class NSCFDictionary autoreleased with no pool in place - just leaking
Stack: (0x1d5df4f 0x1c6a432 0x6c01be 0x64b09e9 0x5b5c3a8 0x5b5cf07 0x1ac5ff7 0x1ac21ce 0x1add099 0x1adc5ea 0x1adca35 0x1ac894d 0x1ac8d35 0x1ac8ca1 0x5bc47b4 0x1ac4f88 0x1ac2829 0x1ac21ce 0x1adce0f 0x1adc5ea 0x1adca35 0x5c9aac9 0x5bc414e 0x79083ba 0x2b1b80 0x2b431c 0x2b20ab 0x2b431c 0x2b20ab 0x2b431c 0x22d9cb 0x1fba95 0x2aa5ce 0x1fbb40 0x7897684 0x1ac21ce 0x1add099 0x1adb7c7 0x7890484 0x2b1b80 0x2b431c 0x2b20ab 0x2b431c 0x22d9cb 0x1fba95 0x2aea26 0x2b3a9f 0x2b431c 0x2b20ab 0x2b431c 0x22dad2 0x1fba95 0x2aea26 0x2b3a9f 0x2b3a9f 0x2b431c 0x22d9cb 0x1fba95 0x20d29e 0x1fba95 0x2aa5ce 0x2f37ec 0x412055 0x411f12)
2015-04-08 13:37:07,909 stop_receiving_sound(True) done
2015-04-08 13:37:07,909 sound_sink_overrun() will restart in 0ms (server supports eos sequence: True)
...

Could be related to the auto-release stuff?


Wed, 08 Apr 2015 16:45:54 GMT - Antoine Martin: owner, status changed

More tweaks and fixes in r8970 + r8971 + r8972 + r8974. Restarts should be quicker now. Though we still have to spawn a sound sink process client side, wait for it, then we tell the server we're ready, the server spawns its sound source process and we eventually get some data to forward... there will always be a delay. (we could shave off some of this setup cost - but not easily, so not doing it unless we have to)

Done most of the remaining backporting of the fixes to v0.14.x in r8973.

I was struggling to figure out why OSX was being so crap and deadlocking at random... It turns out that fixing the stacktraces from comment:27 also fixes the OSX deadlocks! Small rant: no idea why an unclean shutdown in one process manages to lockup another process' UI thread, but that's OSX for you.

r8975 gives us a lot more time to shutdown the gstreamer pipeline cleanly: we wait 250 ms in the sound process, 500 ms in the xpra client before we force kill it. (and in the meantime we can start a new process)

@afarrr: ready for more testing, and OSX should be stable too! (I've just done 350+ overrun restarts without any issue) I have a few minor things I would like to tidy up, but I can do this in parallel.


Thu, 09 Apr 2015 06:02:13 GMT - Antoine Martin: attachment set

on osx, we also need to wait for a clean sound subprocess exit on client exit... this patch attempts to solve that (and causes more problems instead)


Thu, 09 Apr 2015 09:42:37 GMT - Antoine Martin: attachment set

apply this patch to get deadlocks quickly on OSX just doing normal restarts! :(


Thu, 09 Apr 2015 11:37:36 GMT - Antoine Martin: attachment set

more simplified patch - still causes deadlocks


Thu, 09 Apr 2015 11:38:25 GMT - Antoine Martin: attachment set

even more simple patch, still deadlocks


Thu, 09 Apr 2015 14:55:48 GMT - Antoine Martin:

It takes AGES to trigger the bug on osx. Not sure r8976, r8977 or r8978 make any difference.

It requires a fairly large number of restarts to hit - not sure if the number of restarts is directly relevant or if it just a rare error condition which takes a lot of attempts to hit.


Thu, 09 Apr 2015 15:44:38 GMT - Antoine Martin: attachment set

this patch makes it much more likely to deadlock - not sure why


Thu, 09 Apr 2015 16:40:00 GMT - Antoine Martin: attachment set

thought maybe the deadlock was in stop - try using a thread, no difference


Thu, 09 Apr 2015 16:40:35 GMT - Antoine Martin: attachment set

thought just waiting a bit would prevent the deadlocks on sigint - no difference


Thu, 09 Apr 2015 16:41:20 GMT - Antoine Martin: attachment set

try to close the pipes before killing the process, and add some debug - no difference


Fri, 10 Apr 2015 12:39:06 GMT - Antoine Martin:

More fixes and improvements, including to the child reaper (r8982) and adding hooks for it (r8984), misc + cleanups (r8983). And FINALLY, fixed OSX signal exit in r8986, improved in r8987.

@afarr: the OSX client now exits reliably again, whether you exit from the osx global menu, or using control-C.


Fri, 10 Apr 2015 14:28:23 GMT - Antoine Martin:

I have published beta builds for all platforms with all these changes.

Caveats:


Sun, 12 Apr 2015 12:11:10 GMT - Antoine Martin:

r8992 improves crash and error handling and more, see commit message and info below.

Here's a recap of the new env vars that can be used for testing (making it easier to test error handling):

We should be handling almost all of these errors gracefully now. In particular, we now tell the other end to stop sending sound when the process died - something that is still a problem in the v0.14.x branch and needs fixing.

Note: see also #835: " synchronize sound with video frames "


Fri, 17 Apr 2015 01:31:33 GMT - alas:

Testing on OSX (0.15.0 r9016) against fedora 20 (also 0.15.0 r9016).

XPRA_SOUND_FAKE_OVERRUN works as expected, and the speakers dependably re-start... but I periodically see an internal error: sound-sink internal error: write connection TwoFileConnection() reset: [Errno 32] Broken pipe, and the speaker re-start is still noticeably delayed, about a long .7 seconds... as indicated in the timestamps of the logs:

2015-04-16 17:53:33,985 re-starting speaker because of overrun
** Message: pygobject_register_sinkfunc is deprecated (GstObject)
2015-04-16 17:53:34,489 sound-sink internal error: write connection TwoFileConnection() reset: [Errno 32] Broken pipe
2015-04-16 17:53:34,685 sound-sink using audio codec: MPEG 1 Audio, Layer 3 (MP3)

The others all seem to behave as expected, though I did notice that, after the FAKE_CRASH was induced, I was able to restart speakers... but when I then clicked on the Sound application menu item (top left menu), the UI thread blocking then induced the speakers to stop - not then restarting when the UI thread resumes... and caused a traceback server-side.

Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xpra/server/server_base.py", line 1804, in _process_resume
    ss.resume(ui, wd)
  File "/usr/lib64/python2.7/site-packages/xpra/server/source.py", line 450, in resume
    ws.resume(window)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window_source.py", line 384, in resume
    self.send_window_icon(window)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window_source.py", line 412, in send_window_icon
    surf = window.get_property("icon")
TypeError: object of type `xpra+x11+gtk_x11+window+OverrideRedirectWindowModel' does not have property `icon'

Fri, 17 Apr 2015 02:11:35 GMT - Antoine Martin:

but I periodically see an internal error: sound-sink internal error: write connection TwoFileConnection?() reset: [Errno 32] Broken pipe


That's OK. The sound process and the main client process are talking to each other over pipes, and when we shutdown the sound process, those pipes are closed causing the warning. If we continue to get many overruns, I guess I could add some code to silence this warning.

the UI thread blocking then induced the speakers to stop - not then restarting when the UI thread resumes...


Can you provide the -d sound client log output?

and caused a traceback server-side.


Good catch. That's fixed in r9029 and will need backporting. You need an OR window to trigger this bug (ie: usually menus are OR windows) and suspend + resume (can be triggered using the control channel on platforms other than OSX).


Fri, 17 Apr 2015 22:53:53 GMT - alas:

Hmm, looks like the UI thread was a false positive. Tested with no environment variables, and sound was fine. Re-set XPRA_SOUND_FAKE_CRASH=7 and then realized that re-starting speakers just restarted counter for the fake sound crashes (which had just been coincidentally hitting while the UI thread was blocked, apparently.

Just to make it easy to confirm, here's an excerpt for the speaker stopping:

2015-04-17 14:26:56,794 sound-sink add_data(261 bytes, {'duration': 26122449, 'timestamp': 22439183673L, 'caps': 'audio/mpeg, mpegversion=(int)1, mpegaudioversion=(int)1, layer=(int)3, channels=(int)2, rate=(int)44100', 'sequence': 0, 'time': 1429306016665L}) queue_state=pushing, src=/GstPipeline:pipeline0/GstAppSrc:src (__main__.GstAppSrc)
2015-04-17 14:26:56,795 sound-sink pushed 261 bytes, new buffer level: 0ms
2015-04-17 14:26:56,796 info_update: {'pipeline': 'appsrc name=src max-bytes=32768 emit-signals=0 block=0 is-live=0 stream-type=stream format=4 ! mp3parse ! mad ! audioconvert ! audioresample ! volume name=volume volume=1.0 ! queue name=queue min-threshold-time=45000000 max-size-buffers=0 max-size-bytes=0 max-size-time=450000000 leaky=2 ! osxaudiosink async=True qos=True sync=False', 'queue.min_time': 45, 'queue.time': 450, 'queue.state': 'pushing', 'codec_description': 'MPEG 1 Audio, Layer 3 (MP3)', 'bytes': 266380, 'pid': 10225, 'volume': 100, 'state': 'active', 'queue.overruns': 0, 'queue.used_pct': 0, 'speaker.bitrate': 99000, 'codec_mode': 'joint', 'queue.used': 0, 'codec': 'mp3', 'buffers': 861}
2015-04-17 14:26:56,824 the mp3 sound sink has stopped
2015-04-17 14:26:56,825 stop_receiving_sound(True) sound sink=sink_subprocess_wrapper(<subprocess.Popen object at 0x91cb9b0>)
2015-04-17 14:26:56,825 stop_receiving_sound(True) calling <bound method sink_subprocess_wrapper.cleanup of sink_subprocess_wrapper(<subprocess.Popen object at 0x91cb9b0>)>
2015-04-17 14:26:56,825 cleanup() sending cleanup request to sound-sink
2015-04-17 14:26:56,825 sound_sink_overrun() not the current sink, ignoring it
2015-04-17 14:26:56,825 stop_receiving_sound(True) done
2015-04-17 14:26:56,826 sound_process_stopped(sink_subprocess_wrapper(<subprocess.Popen object at 0x91cb9b0>), ()) not the current sink, ignoring it
2015-04-17 14:26:56,828 speaker is now disabled - dropping packet
2015-04-17 14:27:03,585 UI thread is now blocked
2015-04-17 14:27:05,691 UI thread is running again, resuming
2015-04-17 14:27:05,693 start_receiving_sound() sound sink=None
2015-04-17 14:27:05,693 start_sound_sink(mp3)
2015-04-17 14:27:05,693 starting mp3 sound sink
2015-04-17 14:27:05,693 start_receiving_sound(mp3)
2015-04-17 14:27:05,702 mp3 sound sink started

Will test the fix after the backporting I suppose. Otherwise, I think I'll just let some sound stream on osx for a while and see if it has any problems.


Mon, 20 Apr 2015 16:52:09 GMT - Antoine Martin:

The new wrapper was not firing the emulated signals from the UI thread, which was causing crashes on win32 with GTK3, r9079 fixes that. (and this could have caused other problems elsewhere - a nice find!)


Thu, 30 Apr 2015 14:42:29 GMT - Antoine Martin:

r9201 fixes the "microphone" forwarding (sound from client to server).

This should be the last piece of the puzzle, further improvements will go in the next release, see #835 and #849.


Fri, 01 May 2015 22:29:59 GMT - alas:

Testing with osx client 0.15.0 r9202 vs. fedora 20 0.15.0 r9137, with just a pair of idle xterms, I'm seeing client-side output indicating that the UI thread is blocked/running again about once per minute.

2015-05-01 14:55:27,066 UI thread is now blocked
2015-05-01 14:55:27,067 UI thread is running again, resuming
2015-05-01 14:56:27,088 UI thread is now blocked
2015-05-01 14:56:27,089 UI thread is running again, resuming
2015-05-01 14:57:11,056 re-starting speaker because of overrun
** Message: pygobject_register_sinkfunc is deprecated (GstObject)
2015-05-01 14:57:11,558 sound-sink internal error: write connection TwoFileConnection() reset: [Errno 32] Broken pipe
2015-05-01 14:57:13,556 sound-sink using audio codec: MPEG 1 Audio, Layer 3 (MP3)
2015-05-01 14:57:18,915 UI thread is now blocked
2015-05-01 14:57:18,915 UI thread is running again, resuming
2015-05-01 14:58:18,920 UI thread is now blocked
2015-05-01 14:58:18,921 UI thread is running again, resuming

Connecting and waiting patiently for a minute (or doing something else then glancing back at the terminal) with just an xterm or two should reproduce.

Enabling -d sound on the server side I'm not seeing anything that looks interesting, just a lot of these over and over::

2015-05-01 15:01:27,069 info_update: {'pipeline': 'pulsesrc ! audioconvert ! volume name=volume volume=1.0 ! lamemp3enc ! appsink name=sink', 'codec_description': 'MPEG-1 Layer 3 (MP3)', 'bytes': 1012232, 'pid': 10090, 'volume': 100, 'state': 'active', 'codec': 'mp3', 'buffers': 9733}2015-05-01 15:01:27,069 sound_source_info: {'pipeline': 'pulsesrc ! audioconvert ! volume name=volume volume=1.0 ! lamemp3enc ! appsink name=sink', 'codec_description': 'MPEG-1 Layer 3 (MP3)', 'bytes': 1012232, 'pid': 10090, 'volume': 100, 'state': 'active', 'codec': 'mp3', 'time': 1430517687, 'buffers': 9733}
2015-05-01 15:01:27,078 new_sound_buffer(source_subprocess_wrapper(<subprocess.Popen object at 0x2014e90>), 104, {'duration': 26122449, 'timestamp': 254223673469, 'caps': 'audio/mpeg, mpegversion=(int)1, mpegaudioversion=(int)1, layer=(int)3, channels=(int)2, rate=(int)44100', 'time': 1430517687077}) suspended=False

Oddly enough, starting a browser and playing something that actually has sound ends the output. Closing the browser again, it takes about 5 minutes for the UI messages to resume.


Sat, 02 May 2015 04:45:28 GMT - Antoine Martin:

@afarr: is this related to sound? (ie: does this go away if you turn sound off) If not, then this belongs in another ticket. Is this a recent regression? I am obviously not seeing this, I get plenty of overruns when testing in virtualbox but no UI thread issues.


Mon, 11 May 2015 16:06:59 GMT - Antoine Martin: status changed; resolution set

I have moved the UI thread issue to #855. I think we can close this.


Tue, 12 May 2015 08:10:06 GMT - Antoine Martin:

Minor fixes (mostly for those that try py3k support):


Tue, 12 May 2015 18:04:40 GMT - Antoine Martin: status changed; resolution deleted

One more thing that needs doing: I can get the win32 client to get stuck on control-c...


Tue, 12 May 2015 18:05:15 GMT - Antoine Martin: attachment set

try to force the win32 client to exit on control-c...


Wed, 13 May 2015 04:23:25 GMT - Antoine Martin: owner, status changed

Well, that was not easy - or clean (now we can get stacktraces on shutdown...), but at least the client will exit reliably on win32 now.

For v0.15.x, we will need to apply at least: r9334, r9339, r9340 And maybe also: r9335, r9337


Wed, 13 May 2015 09:21:59 GMT - Antoine Martin: status changed; resolution set

Backports in r9347 + r9350.


Mon, 18 May 2015 17:03:49 GMT - Antoine Martin:

Related tickets:


Sat, 23 Jan 2021 05:02:29 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/669