Bug tracker and wiki

This bug tracker and wiki are being discontinued;
please use https://github.com/Xpra-org/xpra instead.

Opened 3 years ago

Closed 22 months ago

Last modified 4 months ago

#1851 closed task (fixed)

tune vpx threading

Reported by: Antoine Martin Owned by: Antoine Martin
Priority: major Milestone: 2.4
Component: encodings Version: 2.3.x
Keywords: Cc:

Description (last modified by Antoine Martin)

Same as #1840 but for libvpx.

Some links:

It looks like the threading improvements are part of the reason why vp8 and vp9 are now faster, and why I chose to use vpx more (see ticket:832#comment:22).
This does mean that reducing the number of threads might reduce performance too much.

Attachments (1)

test_vpx.tar.gz (138.2 KB) - added by Smo 22 months ago.
VPX data and charts for threads 1/2/4


Change History (9)

comment:1 Changed 3 years ago by Antoine Martin

You can choose the maximum number of threads with:

XPRA_VPX_THREADS=2 xpra start ...

We want to see how this affects frame latency, bandwidth, CPU load, etc.
Unlike x264, it looks like we don't have a lot of room for manoeuvre here.
(the current value is "number-of-cpus" minus 1)
Maybe this should be capped at 2 threads.

comment:2 Changed 3 years ago by J. Max Mena

I've set up a quick script that should run a series of three test runs with XPRA_VPX_THREADS set to 1, 2, and 4. For reference, the test box is an 8-core system. I'm more curious to see how much of an impact it has on low-end machines, so I'm going to update one of my low-end test boxes and run the tests again on there.

Last edited 3 years ago by J. Max Mena

comment:3 Changed 3 years ago by Antoine Martin

Description: modified (diff)

comment:4 Changed 2 years ago by Antoine Martin

Owner: changed from J. Max Mena to Jonathan Anthony
Status: changed from assigned to new

comment:5 Changed 22 months ago by Smo

Owner: changed from Jonathan Anthony to Smo

Changed 22 months ago by Smo

Attachment: test_vpx.tar.gz added

VPX data and charts for threads 1/2/4

comment:6 Changed 22 months ago by Smo

Owner: changed from Smo to Antoine Martin

I've attached some test data and charts.

The data seems to show that more threads is better.

Can you check these over and let me know if any other action is required?

comment:7 Changed 22 months ago by Antoine Martin

Resolution: fixed
Status: changed from new to closed

Interesting data:

  • we encode more pixels per second with more threads but when it comes to actually sending ("pixels sent"), the benefits are much lower as other costs come into play (and maybe we're hitting a performance ceiling?)
  • there seems to be a sweet spot with 2 threads, at least for the batch delay and damage latency
  • going up to 4 threads doesn't gain much (i.e. marginal improvement in damage latency and pixels sent per second) - I suspect that this may vary with bigger picture sizes
  • decoding takes a little bit longer with more threads - which is fine, we're almost never bound by the client's decoding speed
  • using 4 threads takes quite a bit more server-side memory

So, r23474 makes us use fewer threads by default (the previous default was number-of-cpus minus 1). The new default is the integer part of sqrt(number-of-cpus + 1):

>>> import math
>>> for i in range(8):
...  print("%-3i: %2i" % (2**i, math.sqrt(2**i+1)))
1  :  1
2  :  1
4  :  2
8  :  3
16 :  4
32 :  5
64 :  8
128: 11

This can still be overridden using the XPRA_VPX_THREADS environment variable.
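
For illustration only, here is a minimal sketch of how the square-root default and the environment override could fit together. The function names are hypothetical and not taken from the xpra source, and the clamping to at least one thread and the handling of invalid values are assumptions rather than what r23474 actually does:

```python
import math
import os


def default_vpx_threads(cpu_count):
    # Hypothetical helper: integer part of sqrt(cpus + 1), as in the table above.
    # Clamping to a minimum of 1 is an assumption made for this sketch.
    return max(1, int(math.sqrt(cpu_count + 1)))


def legacy_vpx_threads(cpu_count):
    # The previous default described in this ticket: number-of-cpus minus 1.
    # Clamping to a minimum of 1 is an assumption made for this sketch.
    return max(1, cpu_count - 1)


def vpx_threads(cpu_count=None, environ=os.environ):
    # Hypothetical helper: XPRA_VPX_THREADS wins if it is a positive integer,
    # otherwise fall back to the square-root default.
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    try:
        override = int(environ.get("XPRA_VPX_THREADS", ""))
    except ValueError:
        override = 0
    if override > 0:
        return override
    return default_vpx_threads(cpu_count)


if __name__ == "__main__":
    # Compare the old and new defaults for a few CPU counts.
    for cpus in (1, 2, 4, 8, 16, 32, 64, 128):
        print("%-3i: old=%3i new=%2i" % (cpus, legacy_vpx_threads(cpus), default_vpx_threads(cpus)))
```

On the 8-core test box mentioned earlier, this sketch would pick 3 threads by default instead of 7, unless XPRA_VPX_THREADS is set.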

comment:8 Changed 4 months ago by migration script

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/1851
