When we get many small areas in damage requests, we may end up scheduling hundreds of calls to idle_add from damage_to_data, which causes the latency to shoot right up.
Solutions:
* run damage_to_data's region processing in one idle_add call - or even run it directly from damage (via damage_now or send_delayed). This would reduce the number of idle_add calls at the cost of making one longer idle_add call... (see the sketch below)
* increase packet_cost and/or decrease pixels_threshold
* or a combination of the above? Or??
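As a rough illustration of the first option - a hypothetical sketch only, where process_damage_regions and regions are simplified stand-ins rather than the real xpra code - the difference comes down to scheduling one callback for the whole batch instead of one per region:

import gobject

def schedule_per_region(process_damage_regions, regions):
    # current behaviour: one idle_add per damage region, so many small
    # regions end up queuing hundreds of callbacks on the main loop
    for region in regions:
        gobject.idle_add(process_damage_regions, [region])

def schedule_batched(process_damage_regions, regions):
    # alternative: a single, longer idle_add call covering the whole batch
    gobject.idle_add(process_damage_regions, regions)

(in both cases the callback passed to idle_add should return False so it only runs once)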
Once done, we can deal with #135
Using this simple patch:
Index: src/xpra/server.py
===================================================================
--- src/xpra/server.py	(revision 883)
+++ src/xpra/server.py	(working copy)
@@ -539,9 +540,10 @@
             except Exception, e:
                 log.error("damage_to_data: error processing region %s: %s", damage, e)
                 continue
-        gobject.idle_add(self._process_damage_regions, wid, window, ww, wh, regions, coding, sequence, options)
+        gobject.idle_add(self._process_damage_regions, time.time(), wid, window, ww, wh, regions, coding, sequence, options)
 
-    def _process_damage_regions(self, wid, window, ww, wh, regions, coding, sequence, options):
+    def _process_damage_regions(self, req_time, wid, window, ww, wh, regions, coding, sequence, options):
+        log.info("process_damage_regions: idle_add latency=%s", int(1000*1000*(time.time()-req_time)))
         if self._damage_cancelled.get(wid, 0)>=sequence:
             log("process_damage_regions: dropping damage request with sequence=%s", sequence)
             return
And running the xterm performance test, we can clearly see the idle_add latency (in microseconds, as measured by the patch above) shooting right up:
process_damage_regions: idle_add latency=39
process_damage_regions: idle_add latency=601
process_damage_regions: idle_add latency=793
process_damage_regions: idle_add latency=1873
process_damage_regions: idle_add latency=8841
process_damage_regions: idle_add latency=2904
process_damage_regions: idle_add latency=3646
process_damage_regions: idle_add latency=4523
process_damage_regions: idle_add latency=4644
process_damage_regions: idle_add latency=5743
Uh-oh, our size doesn't fit window sizing constraints: 979x633 vs 979x628
Uh-oh, our size doesn't fit window sizing constraints: 979x633 vs 979x628
unknown modifier: meta
process_damage_regions: idle_add latency=135900
process_damage_regions: idle_add latency=135923
process_damage_regions: idle_add latency=135952
process_damage_regions: idle_add latency=106001
process_damage_regions: idle_add latency=289604
(...)
process_damage_regions: idle_add latency=1162372
process_damage_regions: idle_add latency=1150017
process_damage_regions: idle_add latency=124438
process_damage_regions: idle_add latency=93
process_damage_regions: idle_add latency=28
process_damage_regions: idle_add latency=41
process_damage_regions: idle_add latency=46
process_damage_regions: idle_add latency=59
Then suddenly dropping back down... now need to figure out why!
A simple time.sleep(0) added in r884 ensures we don't queue too many idle_add calls before giving them a chance to run.
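The idea, shown here as a simplified hypothetical sketch rather than the actual r884 change, is to yield after scheduling each callback so the main loop thread gets a chance to service the idle callbacks already queued:

import time
import gobject

def damage_to_data_sketch(process_damage_regions, regions):
    # hypothetical stand-in for the damage_to_data scheduling loop
    for region in regions:
        gobject.idle_add(process_damage_regions, region)
        # release the GIL briefly so the main loop thread can run the
        # callbacks already queued, instead of piling them all up first
        time.sleep(0)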
A better solution is probably to remove the threading from damage_to_data and to keep track of the idle_add latency as a measure of how smoothly the system is running, and batch more when it goes too high?
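A rough sketch of what such a feedback loop could look like (hypothetical names and thresholds, not the code that was eventually committed): record how late each idle_add callback fires, and stretch the batch delay whenever the measured latency goes above what we consider acceptable.

import time

class BatchDelayController(object):
    # Hypothetical sketch: grow the damage batch delay when the measured
    # idle_add latency shows the main loop is falling behind, and shrink
    # it again once things run smoothly.

    def __init__(self, min_delay=5, max_delay=100, acceptable_latency=15):
        self.delay = min_delay                          # batch delay in ms
        self.min_delay = min_delay
        self.max_delay = max_delay
        self.acceptable_latency = acceptable_latency    # threshold in ms

    def record(self, scheduled_time):
        # called at the start of the idle_add callback, passing the
        # time.time() value captured when the callback was scheduled
        latency = 1000.0 * (time.time() - scheduled_time)
        if latency > self.acceptable_latency:
            self.delay = min(self.delay * 2, self.max_delay)    # batch more
        else:
            self.delay = max(self.delay - 1, self.min_delay)    # relax again
        return latency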
Removes threading from the damage codepath.
Updated patch which dumps the damage latency at various points in the pipeline.
With this code in _get_rgb_rawdata:
pixbuf = gtk.gdk.Pixbuf(gtk.gdk.COLORSPACE_RGB, False, 8, width, height)
log.info("get_rgb_rawdata(..) creating pixbuf object took %s ms", int(1000*(time.time()-start)))
pixbuf.get_from_drawable(pixmap, pixmap.get_colormap(), x, y, 0, 0, width, height)
log.info("get_rgb_rawdata(..) pixbuf.get_from_drawable took %s ms", int(1000*(time.time()-start)))
raw_data = pixbuf.get_pixels()
log.info("get_rgb_rawdata(..) pixbuf.get_pixels took %s ms", int(1000*(time.time()-start)))
rowstride = pixbuf.get_rowstride()
log.info("get_rgb_rawdata(..) took %s ms", int(1000*(time.time()-start)))
We can see:
get_rgb_rawdata(..) creating pixbuf object took 0 ms
get_rgb_rawdata(..) pixbuf.get_from_drawable took 182 ms
get_rgb_rawdata(..) pixbuf.get_pixels took 183 ms
get_rgb_rawdata(..) took 184 ms
So the expensive call is pixbuf.get_from_drawable() in gtk.
Solutions:
A patch for protocol.py so that the latency reported is the time at which the packet has actually been sent by the network layer.
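The gist of that patch, as a hypothetical sketch (the class and method names here are invented, not protocol.py's actual API): remember when each damage request came in, and only compute the latency once the network layer reports the corresponding packet as sent.

import time

class SendLatencyTracker(object):
    # Hypothetical sketch: report damage latency only once the packet
    # has actually been handed off to the wire by the network layer.

    def __init__(self):
        self._pending = {}                  # packet sequence -> damage time

    def packet_queued(self, sequence, damage_time):
        # called when the damage packet is queued for sending
        self._pending[sequence] = damage_time

    def packet_sent(self, sequence):
        # called from the network layer once the packet has been written out
        damage_time = self._pending.pop(sequence, None)
        if damage_time is None:
            return None
        return 1000.0 * (time.time() - damage_time)     # latency in ms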
r886 commits the unthreaded damage code and adds the latency measurement via "xpra info"; see the changeset commit message for details.
This is good enough in trunk with the latency minimization code from #153.
This ticket has been moved to: https://github.com/Xpra-org/xpra/issues/132