In trying to fix support for running gtkperf, I found out that r1783 (use synchronized X11 calls by default) caused a fairly big performance regression.
The problem is that we cannot simply go back to unsynchronized calls because then we get random blowups in unpredictable places when the system is stressed:
So I've mitigated this by optimizing the hell out of the critical codepaths, see: r2307, r2303, r2302, r2294, r2291, r2290, r2275, r2271, r2270, r2269.
This includes writing an inlined Cython version of trap.call for get_pywindow (see r2289).
Also, we now try to group more X11 calls before calling XSync (see r2281), and we keep broken window models around longer to avoid attempting to create them dozens of times before giving up: we take a shortcut (see r2304), which also fixes a bug where the same gdk window would end up having dozens of event receivers (as made apparent by r2267).
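The shortcut for broken window models can be pictured roughly as follows. This is only a sketch and not the actual xpra code: `ModelRegistry`, `factory`, `attempts` and `broken_factory` are hypothetical names, and it assumes a failed creation is remembered as `None`, whereas the real change keeps the broken model object itself around.

```python
class ModelRegistry:
    """Remembers window models, including the ones that failed to build."""

    def __init__(self, factory):
        self.factory = factory   # hypothetical callable: window id -> model
        self.models = {}         # window id -> model, or None if creation failed
        self.attempts = 0        # how many times the factory was really called

    def get(self, wid):
        # Shortcut: a known window is never rebuilt, even if it is broken.
        if wid in self.models:
            return self.models[wid]
        self.attempts += 1
        try:
            model = self.factory(wid)
        except Exception:
            model = None         # remember the failure instead of retrying
        self.models[wid] = model
        return model

    def forget(self, wid):
        # Called once the window is really gone, so the id can be reused.
        self.models.pop(wid, None)


def broken_factory(wid):
    raise ValueError("window %i vanished during setup" % wid)


registry = ModelRegistry(broken_factory)
results = [registry.get(7) for _ in range(12)]
```

Twelve lookups for the same broken window cost a single creation attempt here, instead of twelve.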
I've reviewed every single call to trap.call and trap.swallow and replaced them with their more explicit counterparts (synced/unsynced), see: r2285, r2284, r2282, r2280, r2279, r2277, r2276.
The general rule is to use synced calls when not in the critical path (ie: during setup or rare/important events) or when we must ensure a consistent state (ie: setting up window model wrappers, etc) - there are only a few cases where we can do unsynced calls: generally when we are certain that the call will be followed by another synced call before returning from the thread, or when the syscall is deemed safe.
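To illustrate why the unsynced variant is only safe when a synced call follows, here is a minimal simulation. It does not use the real xpra trap API or a real X11 connection: `FakeDisplay`, `synced`, `unsynced` and `XError` are all made-up names, and the only behaviour assumed is that X11 requests are queued and errors surface at the next sync.

```python
from contextlib import contextmanager


class XError(Exception):
    """Raised when the simulated X server reports a failed request."""


class FakeDisplay:
    """Stand-in for an X11 connection: requests are queued and errors
    only become visible when the queue is flushed with sync()."""

    def __init__(self):
        self.pending = []
        self.sync_count = 0

    def request(self, name, fails=False):
        self.pending.append((name, fails))

    def sync(self):
        self.sync_count += 1
        failed = [name for name, fails in self.pending if fails]
        self.pending.clear()
        return failed


@contextmanager
def synced(display):
    """Flush on exit and raise if any queued request failed."""
    yield
    failed = display.sync()
    if failed:
        raise XError(", ".join(failed))


@contextmanager
def unsynced(display):
    """No flush: any error stays queued until some later sync."""
    yield


display = FakeDisplay()

# The failing request is not noticed here...
with unsynced(display):
    display.request("ConfigureWindow", fails=True)

# ...it blows up later, inside an unrelated synced block.
late_error = None
try:
    with synced(display):
        display.request("GetWindowAttributes")
except XError as e:
    late_error = str(e)

# Grouping several requests costs a single round trip.
with synced(display):
    display.request("MapWindow")
    display.request("ChangeProperty")
    display.request("SelectInput")
```

The error from `ConfigureWindow` ends up reported by the block that only issued `GetWindowAttributes`, which is the kind of misplaced failure the synced calls are meant to prevent. The last block shows the grouping idea: three requests, one sync.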
We now have much better support for profiling CPU usage: see r2298, r2297
What is left:
gtkperf -a, and probably other applications too, can still generate hundreds of DamageNotify events per second, more than we can deal with. So we end up starving the main UI thread and the server latency goes through the roof: often higher than 5 seconds! This also makes us raise the batch delay way too high, but in a way this is the right response, so there is probably no need to change that. What we need to do is ensure we deal with the DamageNotify events much more quickly. Options:
idle_add or timeout_add) instead of just one damage rectangle.
CompositeHelper then WindowModel then Server (each signal call has to go from one class to another via python to C and back...)
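One possible reading of the idle_add / timeout_add option is to accumulate damage rectangles and handle them all in one scheduled callback, rather than doing the full work for every DamageNotify. The sketch below is an assumption about what that could look like, not the implementation: `DamageBatcher`, `schedule` and `process` are hypothetical names, and `schedule` stands in for an idle_add-style function so the example runs without GLib.

```python
class DamageBatcher:
    """Merges damage events per window and processes them in one go."""

    def __init__(self, schedule, process):
        self.schedule = schedule   # idle_add-style function taking a callback
        self.process = process     # called as process(wid, x, y, w, h)
        self.pending = {}          # window id -> bounding box (x1, y1, x2, y2)
        self.scheduled = False

    def damage(self, wid, x, y, w, h):
        box = (x, y, x + w, y + h)
        old = self.pending.get(wid)
        if old is not None:
            # Grow the bounding box to cover both the old and new regions.
            box = (min(old[0], box[0]), min(old[1], box[1]),
                   max(old[2], box[2]), max(old[3], box[3]))
        self.pending[wid] = box
        if not self.scheduled:
            # Only one callback is queued, however many events arrive.
            self.scheduled = True
            self.schedule(self.flush)

    def flush(self):
        batch, self.pending = self.pending, {}
        self.scheduled = False
        for wid, (x1, y1, x2, y2) in batch.items():
            self.process(wid, x1, y1, x2 - x1, y2 - y1)
        return False   # with GLib's idle_add, False means "do not call again"


callbacks = []   # stands in for the main loop's queue of idle callbacks
processed = []
batcher = DamageBatcher(callbacks.append,
                        lambda *args: processed.append(args))

# A burst of 100 small damage events on window 1, plus one on window 2.
for i in range(100):
    batcher.damage(1, i, 0, 10, 10)
batcher.damage(2, 5, 5, 1, 1)

# The "main loop" runs whatever was scheduled.
for callback in callbacks:
    callback()
```

The burst of 101 events results in one scheduled callback and two calls to `process`, one per window. A bounding box over-sends pixels when the damaged areas are far apart, so a real version would probably need something smarter than a single rectangle per window.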
shows most of the calls handling damage requests
graph of calls for sending data via mmap
UI thread side of network sending
gtkperf is really a special case and will need special treatment: see #232
Also see the X11 errors debugging wiki, which has a sample error.py which can be used to debug or trace these sync issues.
Looks like pretty much every single X11 call needs to be synced to prevent crashes later on.. sigh
This caused #258
this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/224