Opened 8 years ago

Closed 8 years ago

Last modified 8 years ago

#224 closed task (fixed)

synchronized X11 calls and performance

Reported by: Antoine Martin Owned by: Antoine Martin
Priority: major Milestone: 0.9
Component: server Version: trunk
Keywords: Cc:

Description (last modified by Antoine Martin)

In trying to fix support for running gtkperf, I found out that r1783 (use synchronized X11 calls by default) caused a fairly big performance regression.
The problem is that we cannot simply go back to unsynchronized calls because then we get random blowups in unpredictable places when the system is stressed:

  • e.g. when running gtkperf -a in a loop
  • e.g. #208, #167, #71, etc.

So I've mitigated this by optimizing the hell out of the critical codepaths, see: r2307, r2303, r2302, r2294, r2291, r2290, r2275, r2271, r2270, r2269.
This includes writing an inlined Cython version of trap.call for get_pywindow (see r2289).
Also, we now try to group more X11 calls before calling XSync (see r2281), and we keep broken window models around longer to avoid attempting to create them dozens of times before giving up: we take a shortcut instead (see r2304). This also fixes a bug where the same gdk window would end up having dozens of event receivers (as made apparent by r2267).
I've reviewed every single call to trap.call and trap.swallow and replaced them with their more explicit counterparts (synced/unsynced), see: r2285, r2284, r2282, r2280, r2279, r2277, r2276.
The general rule is to use synced calls when not in the critical path (e.g. during setup or rare/important events) or when we must ensure a consistent state (e.g. setting up window model wrappers, etc). There are only a few cases where we can do unsynced calls: generally when we are certain that the call will be followed by another synced call before returning from the thread, or when the call is deemed safe.
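To make the synced/unsynced trade-off concrete, here is a minimal sketch that does not talk to a real X server. FakeDisplay, FakeXError, Trap, call_synced and call_unsynced are all hypothetical names made up for illustration, not xpra's actual trap API. It only models the behaviour described above: requests are queued without an error check, and errors only surface at the next sync.

```python
class FakeXError(Exception):
    """Stands in for an X11 protocol error (hypothetical, for illustration)."""


class FakeDisplay:
    """Toy model of an asynchronous X11 connection, not a real Xlib binding."""

    def __init__(self):
        self.pending = []     # requests queued but not yet checked for errors
        self.round_trips = 0  # how many times we blocked waiting for the "server"

    def request(self, name, fails=False):
        # Returns immediately: nothing tells the caller whether it failed.
        self.pending.append((name, fails))

    def sync(self):
        # Modelled on XSync: process the queue, wait, and collect any errors.
        self.round_trips += 1
        errors = [name for name, fails in self.pending if fails]
        self.pending = []
        return errors


class Trap:
    """Hypothetical error trap with explicit synced and unsynced variants."""

    def __init__(self, display):
        self.display = display

    def call_unsynced(self, fn, *args, **kwargs):
        # Cheap: no round trip. Errors stay hidden until the next sync.
        return fn(*args, **kwargs)

    def call_synced(self, fn, *args, **kwargs):
        # Costs one round trip, but every queued error is raised right here.
        value = fn(*args, **kwargs)
        errors = self.display.sync()
        if errors:
            raise FakeXError(", ".join(errors))
        return value


display = FakeDisplay()
trap = Trap(display)
caught = None
# Group two cheap calls, then pay for a single round trip at the end.
trap.call_unsynced(display.request, "ChangeProperty")
trap.call_unsynced(display.request, "ConfigureWindow", fails=True)
try:
    trap.call_synced(display.request, "GetGeometry")
except FakeXError as e:
    caught = str(e)
```

In this toy model the failing request is only reported by the final synced call, after one round trip. If that last call were unsynced as well, the error would surface at whatever unrelated sync happens next, which is the "random blowups in unpredictable places" problem.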
We now have much better support for profiling CPU usage: see r2298, r2297


What is left:

  • #225: improve damage handling in python UI thread
  • check which X11 calls really do need synchronized error handling and which ones do not (in particular, property change/get/set calls - unsure about those).
  • gtkperf -a, and probably other applications too, can still generate hundreds of DamageNotify events per second, which is more than we can deal with. We end up starving the main UI thread and the server latency goes through the roof: often higher than 5 seconds! This also makes us raise the batch delay way too high, but in a way this is the right response, so there is probably no need to change that. What we need to do is ensure we deal with the DamageNotify events much more quickly. Options:
    • do more profiling, maybe I've missed something..
    • do some initial batching in the cython event handler, and pass a list of damage regions (via idle_add or timeout_add) instead of just one damage rectangle.
    • avoid most of this indirection stuff with gobject signals going via CompositeHelper then WindowModel then Server (each signal call has to go from one class to another via python to C and back...)
    • give up and re-write most of the server in C...
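The batching option above could look roughly like the following sketch. DamageBatcher and its method names are hypothetical, and the schedule argument is a stand-in for gobject.idle_add or timeout_add so that the example runs without a main loop. It is not the code that ended up in the server.

```python
class DamageBatcher:
    """Collects damage rectangles and hands them over as one list per flush."""

    def __init__(self, schedule, handler):
        self.schedule = schedule  # stand-in for gobject.idle_add / timeout_add
        self.handler = handler    # receives a list of (x, y, w, h) tuples
        self.pending = []

    def add(self, x, y, w, h):
        # Only the first rectangle of a batch schedules a flush, so the
        # main loop sees one callback per batch instead of one per event.
        if not self.pending:
            self.schedule(self.flush)
        self.pending.append((x, y, w, h))

    def flush(self):
        regions, self.pending = self.pending, []
        if regions:
            self.handler(regions)
        return False  # gobject convention: False means "do not call again"


scheduled = []  # callbacks the fake main loop will run when idle
batches = []    # what the (hypothetical) server-side handler received
batcher = DamageBatcher(scheduled.append, batches.append)

# Simulate a burst of 300 DamageNotify events arriving before the loop is idle.
for i in range(300):
    batcher.add(i, 0, 10, 10)

# Simulate the main loop going idle and running the scheduled callbacks.
for callback in scheduled:
    callback()
```

With this shape, a burst of 300 events costs a single scheduled callback and a single handler call carrying all 300 rectangles, rather than 300 trips through the signal chain.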

Attachments (3)

pycallgraph-damage-full.png (442.1 KB) - added by Antoine Martin 8 years ago.
shows most of the calls handling damage requests
pycallgraph-data_to_packet-mmap.png (67.5 KB) - added by Antoine Martin 8 years ago.
graph of calls for sending data via mmap
pycallgraph-net.png (152.5 KB) - added by Antoine Martin 8 years ago.
UI thread side of network sending

Change History (7)

comment:1 Changed 8 years ago by Antoine Martin

Description: modified (diff)
Status: new → accepted

Changed 8 years ago by Antoine Martin

Attachment: pycallgraph-damage-full.png added

shows most of the calls handling damage requests

Changed 8 years ago by Antoine Martin

Attachment: pycallgraph-data_to_packet-mmap.png added

graph of calls for sending data via mmap

Changed 8 years ago by Antoine Martin

Attachment: pycallgraph-net.png added

UI thread side of network sending

comment:2 Changed 8 years ago by Antoine Martin

Description: modified (diff)

comment:3 Changed 8 years ago by Antoine Martin

Resolution: fixed
Status: accepted → closed
  • profiling did not reveal any more severe bottlenecks - gtkperf is really a special case and will need special treatment: see #232
  • r2358 fixes one more explicit synced call

Also see the X11 errors debugging wiki page, which has a sample error.py that can be used to debug or trace these sync issues.

Last edited 8 years ago by Antoine Martin

comment:4 Changed 8 years ago by Antoine Martin

Looks like pretty much every single X11 call needs to be synced to prevent crashes later on.. sigh

This caused #258
