CSC Performance

The point of providing different CSC implementations is to be able to get the best performance out of the hardware.

Unfortunately, it is impossible to say in advance which module will be the fastest on any given piece of hardware, though swscale is a good bet.

Also, some modules offload to the GPU while others remain on the CPU only, and often this will have an impact on the rest of the system. Some modules take longer to initialize, which may or may not be an issue.

So the best way to choose the right CSC module is to test each one and weigh the costs and benefits.

Running the performance tests

To get performance figures for your own hardware, run the tests yourself. To prevent conflicts between the source tree and the installed version of xpra, the easiest approach is to check out the tests in a temporary area:

mkdir tmp && cd tmp
svn co http://xpra.org/svn/Xpra/trunk/src/tests/
PYTHONPATH=. tests/xpra/codecs/test_csc_cython.py
PYTHONPATH=. tests/xpra/codecs/test_csc_opencl.py 
PYTHONPATH=. tests/xpra/codecs/test_csc_swscale.py

Caveats

  • ensure that no other tasks are running on the system: even an X11 GUI uses the GPU, which takes away some of its memory, bandwidth and performance
  • ensure that the CPU and GPU are not running at lower clock speeds to save power (e.g. powermizer for nvidia, the CPU governor on Linux)
  • run the tests repeatedly and average the results; results that vary too widely should be investigated or simply discarded
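
The last caveat can be sketched in a few lines of Python. This is only an illustration: the `summarize` helper, the 10% threshold and the sample figures are assumptions made up for the example, not part of the xpra test scripts.

```python
import statistics

def summarize(samples, max_rel_spread=0.10):
    """Average repeated MPixels/s measurements and judge their spread.

    Returns (mean, relative spread, trusted). The 10% default threshold
    is an arbitrary assumption, adjust it to taste.
    """
    if len(samples) < 2:
        raise ValueError("need at least two runs to judge the spread")
    mean = statistics.mean(samples)
    # sample standard deviation relative to the mean
    rel_spread = statistics.stdev(samples) / mean
    return mean, rel_spread, rel_spread <= max_rel_spread

# Hypothetical figures, not taken from a real test run:
stable = [150.0, 152.0, 151.0]
noisy = [150.0, 90.0, 210.0]

for name, runs in (("stable", stable), ("noisy", noisy)):
    mean, spread, trusted = summarize(runs)
    verdict = "keep" if trusted else "investigate or discard"
    print("%s: mean=%.1f MPixels/s, spread=%.1f%% -> %s" % (name, mean, spread * 100, verdict))
```

A set of runs that fails the check is more likely to point at background load or clock scaling than at the CSC module itself.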




Warning: the results below do not include libyuv, which was added in version 0.17 and is now the clear winner. (see #973)



Results with 0.16.0 pre-release

  • 1920x1080 in MPixels/s
  • ffmpeg version 2.7.1
  • pyopencl 2015.1
  • Cython 0.22.1
Module | Options (ICD) | CPU | GPU | BGRX to YUV (YUV420P YUV422P YUV444P), YUV to BGRX (YUV420P YUV422P YUV444P)
cython | | AMD X4 945 | GTX 970 | 23 23
swscale | | AMD X4 945 | GTX 970 | 152 150 119, 385 462 182
opencl | NVIDIA | AMD X4 945 | GTX 970 | 378 331 272
cython | | Core i5-4440 | GTX 760 | 70 69
swscale | | Core i5-4440 | GTX 760 | 307 303 248, 854 820 417
opencl | NVIDIA | Core i5-4440 | GTX 760 | 623 500 417
opencl | Intel | Core i5-4440 | GTX 760 | 309 258 289
cython | | Core i3-3110M | Intel HD 4000 | 50 48
swscale | | Core i3-3110M | Intel HD 4000 | 190 195 154, 704 704 298
opencl | pocl | Core i3-3110M | Intel HD 4000 | 58 51 50, 52 49 45
opencl | Intel | Core i3-3110M | Intel HD 4000 | 143 126 151
opencl | AMD | Core i3-3110M | Intel HD 4000 | 63 61 50, 47 42 33

Results with 0.15.4

  • 1920x1080 in MPixels/s
  • ffmpeg version 2.7.1
  • pyopencl 2015.1
  • Cython 0.22.1
Module | Options (ICD) | CPU | GPU | BGRX to YUV (YUV420P YUV422P YUV444P), YUV to BGRX (YUV420P YUV422P YUV444P)
cython | | AMD X4 945 | GTX 970 | 31 29
swscale | | AMD X4 945 | GTX 970 | 150 149 121, 334 471 225
opencl | NVIDIA | AMD X4 945 | GTX 970 | 376 323 274
cython | | Core i5-4440 | GTX 760 | 103 80
swscale | | Core i5-4440 | GTX 760 | 307 303 248, 624 760 352
opencl | NVIDIA | Core i5-4440 | GTX 760 | 673 580 454
opencl | Intel | Core i5-4440 | GTX 760 | 252 256 278
cython | | Core i3-3110M | Intel HD 4000 | 63 46
swscale | | Core i3-3110M | Intel HD 4000 | 198 195 163, 443 616 277
opencl | pocl | Core i3-3110M | Intel HD 4000 | 59 52 52, 45 49 40
opencl | Intel | Core i3-3110M | Intel HD 4000 | 139 124 144
opencl | AMD | Core i3-3110M | Intel HD 4000 | 65 62 48, 46 42 34

Results with 0.14.28

  • 1920x1080 in MPixels/s
  • ffmpeg version 2.7.1
  • pyopencl 2015.1
  • Cython 0.22.1
Module | Options (ICD) | CPU | GPU | BGRX to YUV (YUV420P YUV422P YUV444P), YUV to BGRX (YUV420P YUV422P YUV444P)
cython | | AMD X4 945 | GTX 970 | 23 21
swscale | | AMD X4 945 | GTX 970 | 151 150 122, 398 474 182
opencl | NVIDIA | AMD X4 945 | GTX 970 | 380 329 281
cython | | Core i5-4440 | GTX 760 | 71 64
swscale | | Core i5-4440 | GTX 760 | 306 303 251, 554 785 387
opencl | NVIDIA | Core i5-4440 | GTX 760 | 679 584 442
opencl | Intel | Core i5-4440 | GTX 760 | 206 190 266
cython | | Core i3-3110M | Intel HD 4000 | 52 45
swscale | | Core i3-3110M | Intel HD 4000 | 197 197 162, 468 608 237
opencl | pocl | Core i3-3110M | Intel HD 4000 | 59 51 51, 53 48 45
opencl | Intel | Core i3-3110M | Intel HD 4000 | 154 133 149
opencl | AMD | Core i3-3110M | Intel HD 4000 | 60 63 50, 48 42 35

Results with 0.11.0 release

All tests at 1920x1080 in MPixels/s

Module | Options | CPU | GPU | BGRX to YUV (YUV420P YUV422P YUV444P), YUV to BGRX (YUV420P YUV422P YUV444P)
cython | | AMD X4 945 | GTX 760 | 47
swscale | | AMD X4 945 | GTX 760 | 119 163 132, 199 345 229
nvcuda | | AMD X4 945 | GTX 760 | 126 109 114
opencl | NVIDIA | AMD X4 945 | GTX 760 | 382 326 278, 275 315 266
opencl | AMD | AMD X4 945 | GTX 760 | 61 54 43, 44 37 24
opencl | Intel | AMD X4 945 | GTX 760 | 57 39 40, 43 37 19
cython | | Intel i3-3110M | Intel HD 4000 | 73
swscale | | Intel i3-3110M | Intel HD 4000 | 150 199 164, 341 361 351
opencl | AMD | Intel i3-3110M | Intel HD 4000 | 70 70 62, 49 43 34
opencl | Intel | Intel i3-3110M | Intel HD 4000 | 159 119 152
cython | | Intel i7-4500U | Intel HD 4400 | 105
swscale | | Intel i7-4500U | Intel HD 4400 | 206 284 228, 458 480 362
cython | | 2xIntel Xeon E5-2670 | GTX 760 | 92
swscale | | 2xIntel Xeon E5-2670 | GTX 760 | 184 257 199, 343 446 421
nvcuda | | 2xIntel Xeon E5-2670 | GTX 760 | 84 80 76
opencl | NVIDIA | 2xIntel Xeon E5-2670 | GTX 760 | 333 289 222, 242 269 233
cython | | AMD FX-6100 | Radeon HD 6870 | 60
swscale | | AMD FX-6100 | Radeon HD 6870 | 175 168 138, 433 577 353
opencl | AMD | AMD FX-6100 | Radeon HD 6870 | 235 232 201, 204 192 210

Previous Results

These values were obtained with r4272 and later. Different combinations may have been tested with different revisions, so comparisons between them should not be trusted.

(results are in MPixels/s):

  • 1920x1080 RGB to YUV???P:
Module | CPU/GPU | YUV420P | YUV422P | YUV444P
swscale | AMD FX 8150 | 142 | 182 | 151
swscale | AMD X4 945 | 120 | 165 | 131
swscale | AMD X2 260 | 124 | 170 | 140
swscale | Intel Core i3-3110M | 164 | 229 | 181
swscale | 2xIntel Xeon E5-2670 | 215 | 322 | 253
CUDA-Nvidia | AMD X4 945 + GTS 450 | 366 | 341 | 290
CUDA-Nvidia | 2xIntel Xeon E5-2670 / 2xK1 | 173 | 177 | 160
OpenCL-Nvidia | AMD FX8150 + GTX 760 | 345 | 303 | 254
OpenCL-Nvidia | AMD X4 945 + GTS 450 | 357 | 303 | 260
OpenCL-Nvidia | 2xIntel Xeon E5-2670 / 2xK1 | 210 | 211 | 192
OpenCL-Nvidia | Intel Xeon E5-2620 / GTX 650ti | 502 | 457 | 399
OpenCL-Intel | AMD FX 8150 | 129 | 114 | 119
OpenCL-Intel | Intel Core i3-3110M | 141 | 92 | 53
OpenCL-Intel | 2xIntel Xeon E5-2670 | 472 | 412 | 263
OpenCL-Intel | Intel Xeon E5-2620 | 254 | 213 | 131
OpenCL-Intel | Intel i7-4500U | 155 | 125 | 166
OpenCL-AMD | AMD FX 8150 + Radeon HD5450 | 110 | 49 | 42
OpenCL-AMD | AMD FX 8150 | 93 | 79 | 76
OpenCL-AMD | AMD FX 6100 + Radeon HD6870 | 274 | 234 | 219
OpenCL-AMD | AMD FX 6100 | 126 | 115 | 90
OpenCL-AMD | AMD X4 945 | 63 | 54 | 53
OpenCL-AMD | AMD M300 | 14 | 12 | 12
OpenCL-AMD | AMD X2 + Radeon HD5450 | 151 | 61 | 57
OpenCL-AMD | AMD X2 | 15 | 14 | 11
OpenCL-AMD | Intel Core i3-3110M | 71 | 58 | 63
OpenCL-Apple | Intel Core2Duo P8600 + GeForce 320 | 22 | 28 | 22
  • 1920x1080 RGB to GBR (simple byte swapping):
Module | CPU/GPU | MPixels/s
swscale | AMD FX 8150 | 718
swscale | AMD FX 6100 | 608
swscale | AMD X4 945 | 524
swscale | AMD X2 260 | 582
swscale | Intel Core i3-3110M | 550
swscale | Intel i7-4500U | 627
swscale | 2xIntel Xeon E5-2670 | 758
  • 1920x1080 YUV???P to BGR(X):
Module | CPU/GPU | YUV420P | YUV422P | YUV444P
swscale | AMD FX 8150 | 381 | 406 | 416
swscale | AMD FX 6100 | 361 | 375 | 370
swscale | AMD X4 945 | 369 | 323 | 237
swscale | AMD X2 260 | 312 | 255 | 330
swscale | Intel Core i3-3110M | 350 | 309 | 310
swscale | 2xIntel Xeon E5-2670 | 177 | 168 | 163
CUDA-Nvidia | AMD X4 945 + GTS 450 | 202 | 191 | 180
CUDA-Nvidia | 2xIntel Xeon E5-2670 / 2xK1 | 180 | 155 | 151
OpenCL-Nvidia | AMD FX 8150 + GTX 760 | 331 | 289 | 257
OpenCL-Nvidia | AMD X4 945 + GTS 450 | ? | ? | ?
OpenCL-Nvidia | Intel Xeon E5-2620 / GTX 650ti | 458 | 377 | 358
OpenCL-Nvidia | 2xIntel Xeon E5-2670 / 2xK1 | 190 | 165 | 148
OpenCL-Intel | AMD FX 8150 | 96 | 70 | 67
OpenCL-Intel | Intel Core i3-3110M | 82 | 88 | 87
OpenCL-Intel | Intel Xeon E5-2620 | 146 | 123 | 116
OpenCL-Intel | 2xIntel Xeon E5-2670 | 265 | 271 | 268
OpenCL-Intel | Intel i7-4500U | 162 | 122 | 153
OpenCL-AMD | AMD FX 8150 + Radeon HD5450 | 84 | 82 | 70
OpenCL-AMD | AMD FX 8150 | 60 | 55 | 47
OpenCL-AMD | AMD FX 6100 + Radeon HD6870 | 179 | 231 | 197
OpenCL-AMD | AMD FX 6100 | 78 | 68 | 56
OpenCL-AMD | AMD X4 945 | 54 | 51 | 50
OpenCL-AMD | AMD M300 | 11 | 9 | 7
OpenCL-AMD | AMD X2 260 + Radeon HD5450 | 107 | 98 | 98
OpenCL-AMD | AMD X2 260 | 11 | 10 | 7
OpenCL-AMD | Intel Core i3-3110M | 60 | 56 | 58
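
To relate these MPixels/s figures to something more tangible, the sketch below converts them into 1920x1080 frames per second and picks the fastest module. It assumes that 1 MPixel means 1,000,000 pixels, which this page does not state. The `frames_per_second` helper is made up for the example, and the sample rates are the RGB to YUV420P figures from the 2xIntel Xeon E5-2670 rows above.

```python
FRAME_PIXELS = 1920 * 1080  # 2,073,600 pixels per 1080p frame

def frames_per_second(mpixels_per_s, frame_pixels=FRAME_PIXELS):
    """Convert a throughput in MPixels/s into frames per second.

    Assumption: 1 MPixel = 1,000,000 pixels.
    """
    return mpixels_per_s * 1000000.0 / frame_pixels

# RGB to YUV420P, 2xIntel Xeon E5-2670 (with 2xK1 for the Nvidia rows)
rates = {
    "swscale": 215,
    "CUDA-Nvidia": 173,
    "OpenCL-Nvidia": 210,
    "OpenCL-Intel": 472,
}

# the module with the highest throughput for this conversion
fastest = max(rates, key=rates.get)
for module, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print("%-14s %4d MPixels/s = %6.1f frames/s" % (module, rate, frames_per_second(rate)))
print("fastest:", fastest)
```

Raw throughput is only one side of the comparison: as noted at the top of the page, GPU offloading and initialization time also affect the rest of the system.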

And here are some charts based on those figures. Note: for historical reasons, we include results for the now-deleted csc nvcuda module, even though it never worked reliably.

Last modified on 04/18/16 07:42:10