Bug tracker and wiki

This bug tracker and wiki are being discontinued; please use https://github.com/Xpra-org/xpra instead.

Opened 9 years ago

Last modified 9 months ago

#224 closed task

synchronized X11 calls and performance — at Initial Version

Reported by: Antoine Martin
Owned by: Antoine Martin
Priority: major
Milestone: 0.9
Component: server
Version: trunk
Keywords: (none)
Cc: (none)


In trying to fix support for running gtkperf, I found out that r1783 (use synchronized X11 calls by default) caused a fairly big performance regression.
The problem is that we cannot simply go back to unsynchronized calls because then we get random blowups in unpredictable places when the system is stressed:

  • e.g. when running gtkperf -a in a loop
  • e.g. #208, #167, #71, etc.
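To illustrate the tradeoff, here is a minimal, purely illustrative Python sketch (the names XErrorTrap, call_synced, call_unsynced and xsync are hypothetical stand-ins; xpra's real error trap lives in Cython bindings): a synced call pays for an XSync round-trip so that any X11 error surfaces immediately, while an unsynced call is free but lets errors blow up later in an unrelated place.

```python
# Hypothetical sketch of synced vs unsynced X11 calls - not xpra's real API.

SYNC_COUNT = 0

def xsync():
    # stand-in for XSync(): each call costs one X server round-trip
    global SYNC_COUNT
    SYNC_COUNT += 1

class XErrorTrap:
    """Collects X11 errors raised between push() and pop()."""
    def __init__(self):
        self.errors = []

    def push(self):
        self.errors.clear()

    def pop(self, synced=False):
        if synced:
            # flush the request queue so any error for OUR request
            # arrives before we return - safe, but expensive
            xsync()
        if self.errors:
            raise RuntimeError("X11 error: %s" % self.errors[0])

trap = XErrorTrap()

def call_synced(fn, *args):
    trap.push()
    try:
        return fn(*args)
    finally:
        trap.pop(synced=True)

def call_unsynced(fn, *args):
    # fast path: no round-trip, but an error may surface much later
    return fn(*args)

for _ in range(100):
    call_synced(lambda: None)     # 100 server round-trips
for _ in range(100):
    call_unsynced(lambda: None)   # 0 extra round-trips
print(SYNC_COUNT)
```

Run under load, the synced variant's one-round-trip-per-call cost is exactly the regression described above, which is why the fix has to be selective rather than a blanket revert.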

So I've mitigated this by optimizing the hell out of the critical codepaths, see: r2307, r2303, r2302, r2294, r2291, r2290, r2275, r2271, r2270, r2269.
This includes an inlined Cython version of trap.call for get_pywindow (see r2289).
We also now try to group more X11 calls before calling XSync (see r2281), and we keep broken window models around longer instead of attempting to re-create them dozens of times before giving up: we take a shortcut (see r2304). This also fixes a bug where the same gdk window could end up with dozens of event receivers (as made apparent by r2267).
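The grouping idea behind r2281 can be sketched as follows (the x11_batch context manager and set_property are hypothetical illustrations, not xpra's actual functions): issue several X11 requests, then pay for a single XSync at the end instead of one per call.

```python
# Illustrative sketch of grouping X11 calls under one XSync - names are
# hypothetical, not xpra's real API.
from contextlib import contextmanager

SYNCS = []

def xsync():
    # stand-in for XSync(): record one server round-trip
    SYNCS.append(1)

@contextmanager
def x11_batch():
    # run many X11 requests, then pay for one round-trip at the end
    try:
        yield
    finally:
        xsync()

def set_property(window, name, value):
    # stand-in for a real XChangeProperty request (no round-trip by itself)
    pass

with x11_batch():
    for i in range(10):
        set_property(0x123, "prop-%i" % i, i)

print(len(SYNCS))
```

Ten requests, one XSync: the error-reporting granularity gets coarser (an error only tells you the batch failed, not which request), which is the price of the optimization.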
I've reviewed every single call to trap.call and trap.swallow and replaced them with their more explicit counterparts (synced/unsynced), see: r2285, r2284, r2282, r2280, r2279, r2277, r2276.
The general rule: use synced calls when not in the critical path (i.e. during setup or for rare/important events) and whenever we must ensure a consistent state (e.g. setting up window model wrappers); use unsynced calls in the hot paths.
We now have much better support for profiling CPU usage: see r2298, r2297
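The actual profiling hooks added in r2298/r2297 are xpra-specific; as a minimal stand-in, the same kind of CPU profiling can be done with Python's built-in cProfile (damage_handler here is just a hypothetical hot function):

```python
# Minimal CPU-profiling sketch using the standard library - the function
# being profiled is a hypothetical stand-in for a server hot path.
import cProfile
import io
import pstats

def damage_handler():
    # stand-in for the server's damage processing loop
    total = 0
    for i in range(10000):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
damage_handler()
profiler.disable()

# print the top entries sorted by cumulative time
out = io.StringIO()
stats = pstats.Stats(profiler, stream=out)
stats.sort_stats("cumulative").print_stats(5)
report = out.getvalue()
print("damage_handler" in report)
```

Sorting by cumulative time is what points at code like trap.call wrappers: cheap per call, but sitting on every critical path.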

What is left:

  • check which X11 calls really do need synchronized error handling and which ones do not (in particular, property change/get/set calls - unsure about those).
  • gtkperf -a (and probably other applications too) can still generate hundreds of DamageNotify events per second, more than we can deal with. We end up starving the main UI thread and the server latency goes through the roof: often higher than 5 seconds! This also makes us raise the batch delay way too high, but in a way that is the right response, so there is probably no need to change it. What we need is to deal with each DamageNotify much more quickly. Options:
    • do more profiling; maybe I've missed something.
    • do some initial batching in the Cython event handler, and pass a list of damage regions (via idle_add or timeout_add) instead of just one damage rectangle.
    • avoid most of the gobject signal indirection going via CompositeHelper, then WindowModel, then Server (each signal emission has to hop from one class to another, from Python to C and back).
    • give up and re-write most of the server in C.
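The "initial batching" option above can be sketched like this (everything here is hypothetical: the real handler would be Cython and use glib's idle_add, which is simulated below with a plain queue). The event handler does only a cheap append and schedules at most one flush, so a burst of events costs the UI thread a single wakeup:

```python
# Illustrative sketch of batching DamageNotify events - idle_add is
# simulated; real code would use glib's idle_add from the event thread.

pending = []              # damage rects accumulated since the last flush
idle_scheduled = [False]  # is a flush already queued?
deliveries = []           # batches the "UI thread" actually receives
idle_queue = []           # simulated main-loop idle callbacks

def idle_add(fn):
    # real code: glib.idle_add(fn); here we just queue the callback
    idle_queue.append(fn)

def flush_damage():
    # runs once in the main loop, however many events piled up
    idle_scheduled[0] = False
    regions, pending[:] = list(pending), []
    deliveries.append(regions)

def on_damage_notify(x, y, w, h):
    # the low-level event handler: cheap append + at most one flush queued
    pending.append((x, y, w, h))
    if not idle_scheduled[0]:
        idle_scheduled[0] = True
        idle_add(flush_damage)

# a burst of 100 DamageNotify events arrives before the main loop wakes up
for i in range(100):
    on_damage_notify(i, 0, 1, 1)
while idle_queue:
    idle_queue.pop(0)()

print(len(deliveries), len(deliveries[0]))
```

One idle callback delivering a list of 100 rectangles replaces 100 separate main-loop dispatches, which is exactly the per-event overhead the option aims to remove.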

Change History (0)
