Opened 2 years ago

Closed 19 months ago

Last modified 4 months ago

#1260 closed task (fixed)

nvenc v7 support

Reported by: Antoine Martin
Owned by: Smo
Priority: major
Milestone: 1.0
Component: encodings
Version: trunk
Keywords:
Cc:

Description

Download link: https://developer.nvidia.com/nvidia-video-codec-sdk.
Anand: Maxwell Display Matters: New Display Controller, HDR, & HEVC

Key features that are relevant to us:

  • HEVC 8K (8192 pixels x 8192 pixels) encoding
  • HEVC 4:4:4 encoding
  • HEVC 10-bit encoding
  • HEVC lossless encoding
  • Rate control and quality improvements

Claims are that the performance is doubled over Maxwell.

Related tickets:

  • #828 Debian and Ubuntu
  • #1046 nvenc v6 support
  • #825 nvenc v5 support

1.1 should drop support for all older nvenc codecs

Attachments (1)

nvenc7.pc (305 bytes) - added by Antoine Martin 2 years ago.
pkg-config file for building against nvenc7

Change History (23)

Changed 2 years ago by Antoine Martin

Attachment: nvenc7.pc added

pkg-config file for building against nvenc7

comment:1 Changed 2 years ago by Antoine Martin

Status: new → assigned

Stub added in r13060, minor api updates in r13061, build switch in r13087. Missing symbol fix in r13088.

Still TODO:

  • we should now probe the max encoding size instead of hard-coding to 4k
  • support ARGB and ABGR as input so we can drop the CUDA kernels?
  • add new parameters to NV_ENC_RC_PARAMS and elsewhere, including those for b-frames #800 and 10-bit colour #909
  • I will need a card to take this further..

comment:2 Changed 2 years ago by Antoine Martin

  • r13335 dropped support for all the older nvenc versions, AFAICT the newer SDK is backwards compatible.
  • r13363: lots of improvements (see commit message), includes support for HEVC (aka h265)
  • r13364: probe max-encoder-size at runtime (support 8K with HEVC)
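
The idea behind probing the maximum encoder size, rather than hard-coding 4k, can be sketched roughly as below. This is only an illustration and not the code from r13364: `query_caps` is a hypothetical stand-in for whatever call returns the encoder's capability values, and the `width-max` / `height-max` keys are made up for this example.

```python
# Hypothetical sketch: "query_caps" and the dictionary keys are assumptions
# for illustration, they are not xpra or NVENC names.
FALLBACK_MAX_SIZE = (4096, 4096)   # the old hard-coded 4k limit


def probe_max_size(query_caps, fallback=FALLBACK_MAX_SIZE):
    """Return (max_width, max_height), falling back if the probe fails."""
    try:
        caps = query_caps()
        width = int(caps["width-max"])
        height = int(caps["height-max"])
    except (KeyError, TypeError, ValueError, RuntimeError):
        # the probe failed or returned something unusable
        return fallback
    if width <= 0 or height <= 0:
        return fallback
    return width, height


def fits(width, height, max_size):
    """True if a frame of this size is within the probed limits."""
    return width <= max_size[0] and height <= max_size[1]
```

With a probe reporting 8192x8192 (the HEVC limit listed in the description), an 8K frame passes `fits()`, whereas it would be rejected with the 4096x4096 fallback.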

comment:3 Changed 2 years ago by Antoine Martin

10-bit HEVC support moved to #1308.

comment:4 Changed 23 months ago by Antoine Martin

Owner: changed from Antoine Martin to Smo
Status: assigned → new

Well, well. I've spent a small fortune on a GTX1070 to test this ticket, in particular the performance of HEVC. Problem is that I get hard system lockups running the tests.
I thought there was something wrong with the code, but then I went back to the 0.17.x branch and all the nvenc codec versions from that branch also lock up the system...

So it could be one of two things:

  • hardware problem, this card could be a dud: it does have problems with 4k (not working on one monitor) and some glittering pixels at boot
  • the code just isn't compatible with Pascal cards (10xx) - in which case we'll probably need to feed the YUV data directly rather than using the CUDA buffer (sigh)

@smo: you have a card, can you run the tests:

mkdir tmp && cd tmp && cp -apr ../tests ./
PYTHONPATH=. ./tests/xpra/codecs/test_nvenc7.py 

comment:5 Changed 22 months ago by Antoine Martin

Owner: changed from Smo to Antoine Martin
Status: new → assigned

Found the bug: using "sliceMode" and "sliceModeData" causes nvenc to lock up completely (I have rebooted my desktop system ~20 times today as a result of that bug).
Fix in r14303 (backported to v0.17.x in r14304).

It is quite a bit faster than the previous card I had been testing with, though some of the gains could also be due to the faster CPU / memory. I'm getting:

nvenc(BGRX/NV12/H264 - low-latency - 3840x2160) finished encoding 100    BGRX frames at  3840x2160: \
     521 MPixels/s,   15ms/frame,        9KB/frame (NV12)

That would allow for up to 60 fps at 4K.
Or almost 1500fps at 480p! (or 60 clients at 30 fps)
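
The frame rate estimates follow from dividing the measured pixel throughput by the number of pixels per frame. A quick sketch of that arithmetic, assuming 720x480 for "480p" (the comment does not say which width was meant, and the per-client figure changes with that choice):

```python
def frames_per_second(mpixels_per_second, width, height):
    """Frames per second that a given pixel throughput can sustain."""
    return mpixels_per_second * 1e6 / (width * height)


def clients_at(mpixels_per_second, width, height, client_fps):
    """How many clients can be served at client_fps each (rounded down)."""
    return int(frames_per_second(mpixels_per_second, width, height) // client_fps)


MEASURED = 521  # MPixels/s, from the test output above

print("4K:   %.1f fps" % frames_per_second(MEASURED, 3840, 2160))
# 720x480 is an assumption: "480p" could also mean 640x480 or 854x480
print("480p: %.1f fps" % frames_per_second(MEASURED, 720, 480))
print("480p clients at 30 fps: %i" % clients_at(MEASURED, 720, 480, 30))
```

Under that assumption the 4K figure comes out a little above 60 fps and the 480p figure at roughly 1500 fps, which is about 50 clients at 30 fps each; a narrower 640x480 frame gives a few more.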

Still TODO:

  • fix some test failures
  • remove driver version warning
  • finish or re-schedule ARGB mode

comment:6 Changed 22 months ago by Antoine Martin

Owner: changed from Antoine Martin to Smo
Status: assigned → new

@smo: please test with any cards you have that support NVENC (not just the Pascal generation), or just close.

comment:7 Changed 22 months ago by Antoine Martin

Recently, I've started seeing a lot of these (improved error handling in r14337):

Error: cannot initialize CUDA
 cuInit failed: unknown error

Not much we can do about it! ("unknown error" - sigh)

comment:8 Changed 21 months ago by Smo

Trying to run the tests, but maybe I'm missing something? This is weird: it should work like you said.

[cosmo@explosivo src] $ pwd
/home/cosmo/work/Xpra/trunk/src
[cosmo@explosivo src] $ mkdir tmp && cd tmp && cp -apr ../tests ./
[cosmo@explosivo tmp] $ PYTHONPATH=. ./tests/xpra/codecs/test_nvenc7.py 
Traceback (most recent call last):
  File "./tests/xpra/codecs/test_nvenc7.py", line 7, in <module>
    from tests.xpra.codecs import test_nvenc
ImportError: No module named tests.xpra.codecs
[cosmo@explosivo tmp] $ pwd
/home/cosmo/work/Xpra/trunk/src/tmp
[cosmo@explosivo tmp] $ ls
tests

Not entirely sure why it is complaining or ignoring PYTHONPATH.

comment:9 Changed 21 months ago by Antoine Martin

You need r14403, which is just this:

touch tests/__init__.py
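
The ImportError above is what Python 2 raises when a directory on the import path lacks an `__init__.py`, so PYTHONPATH was not being ignored: `tests` simply was not a package yet. A small sketch for spotting this kind of problem; the helper name `missing_init_dirs` is made up for this example and is not part of xpra:

```python
import os


def missing_init_dirs(root, module_path):
    """
    List the directories between root and module_path (relative to root)
    that have no __init__.py and so are not regular Python packages.
    """
    root = os.path.abspath(root)
    module_dir = os.path.dirname(os.path.abspath(module_path))
    rel = os.path.relpath(module_dir, root)
    missing = []
    current = ""
    for part in rel.split(os.sep):
        if part in ("", "."):
            continue
        current = os.path.join(current, part) if current else part
        if not os.path.isfile(os.path.join(root, current, "__init__.py")):
            missing.append(current)
    return missing


if __name__ == "__main__":
    # run from the "tmp" directory used in the comments above
    print(missing_init_dirs(".", "./tests/xpra/codecs/test_nvenc7.py"))
```

Note that Python 3 treats such directories as namespace packages, so the import would not necessarily fail there; the check is mainly relevant to the Python 2 setup used in this ticket.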

comment:10 Changed 21 months ago by Smo

After installing python2-pycuda from your repo this is what is happening while trying to run that test.

PYTHONPATH=. ./tests/xpra/codecs/test_nvenc7.py 

2016-11-16 19:07:05,771 CUDA initialization (this may take a few seconds)
2016-11-16 19:07:06,027 CUDA 7.5.0 / PyCUDA 2016.1.2, found 1 device:
2016-11-16 19:07:06,027   + Graphics Device @ 0000:05:00.0 (memory: 89% free, compute: 6.1)
2016-11-16 19:07:06,156 NVidia driver version 370.28
2016-11-16 19:07:08,143 NVENC successfully initialized
creating sample data for size 4096
Traceback (most recent call last):
  File "./tests/xpra/codecs/test_nvenc7.py", line 23, in <module>
    main()
  File "./tests/xpra/codecs/test_nvenc7.py", line 12, in main
    test_nvenc.test_encode_one()
  File "/home/cosmo/work/Xpra/trunk/src/tmp/tests/xpra/codecs/test_nvenc.py", line 30, in test_encode_one
    test_encoder(encoder_module)
  File "/home/cosmo/work/Xpra/trunk/src/tmp/tests/xpra/codecs/test_encoder.py", line 94, in test_encoder
    do_test_encoder(e, src_format, actual_w, actual_h, images, log=log, after_encode_cb=after_encode_cb)
  File "/home/cosmo/work/Xpra/trunk/src/tmp/tests/xpra/codecs/test_encoder.py", line 120, in do_test_encoder
    c = encoder.compress_image(image)
  File "xpra/codecs/nvenc7/encoder.pyx", line 2035, in xpra.codecs.nvenc7.encoder.Encoder.compress_image (xpra/codecs/nvenc7/encoder.c:24541)
  File "xpra/codecs/nvenc7/encoder.pyx", line 2213, in xpra.codecs.nvenc7.encoder.Encoder.do_compress_image (xpra/codecs/nvenc7/encoder.c:27908)
  File "xpra/codecs/nvenc7/encoder.pyx", line 1317, in xpra.codecs.nvenc7.encoder.raiseNVENC (xpra/codecs/nvenc7/encoder.c:9269)
xpra.codecs.nvenc7.encoder.NVENCException: locking output buffer - returned 8: This indicates that one or more of the parameter passed to the API call is invalid.

After pressing ctrl+c the whole system locks up and I have no choice but to power cycle it.

comment:11 Changed 21 months ago by Smo

Owner: changed from Smo to Antoine Martin

comment:12 Changed 21 months ago by Antoine Martin

Owner: changed from Antoine Martin to Smo

That's the error I was seeing before, which was meant to be fixed in r14303.
Please include more details about the setup: full revision, GPU details, ie:

python ./xpra/codecs/nv_util.py
python ./xpra/codecs/cuda_common/cuda_context.py

etc.

comment:13 Changed 21 months ago by Smo

python ./xpra/codecs/nv_util.py
2016-11-17 12:48:46,868 NVidia driver version 370.28
2016-11-17 12:48:46,868 NVENC license keys:
2016-11-17 12:48:46,887 * version common: 0 key(s)
2016-11-17 12:48:46,887 * version 7: 0 key(s)
2016-11-17 12:48:46,892 
2016-11-17 12:48:46,892 1 card:
2016-11-17 12:48:46,901 * 0
2016-11-17 12:48:46,902   - clock-info-graphics           : 1354
2016-11-17 12:48:46,902   - clock-info-graphics-max       : 1974
2016-11-17 12:48:46,902   - clock-info-mem                : 3504
2016-11-17 12:48:46,902   - clock-info-mem-max            : 3504
2016-11-17 12:48:46,902   - clock-info-sm                 : 1354
2016-11-17 12:48:46,902   - clock-info-sm-max             : 1974
2016-11-17 12:48:46,902   - fan-speed                     : 30
2016-11-17 12:48:46,902   - memory
2016-11-17 12:48:46,903     - free                        : 3715563520
2016-11-17 12:48:46,903     - total                       : 4234018816
2016-11-17 12:48:46,903     - used                        : 518455296
2016-11-17 12:48:46,903   - name                          : Graphics Device
2016-11-17 12:48:46,903   - pci
2016-11-17 12:48:46,903     - bus                         : 5
2016-11-17 12:48:46,903     - busId                       : 0000:05:00.0
2016-11-17 12:48:46,903     - device                      : 0
2016-11-17 12:48:46,904     - domain                      : 0
2016-11-17 12:48:46,904     - pciDeviceId                 : 478286046
2016-11-17 12:48:46,904     - pciSubSystemId              : 1649621058
2016-11-17 12:48:46,904   - pcie-link-generation          : 2
2016-11-17 12:48:46,904   - pcie-link-generation-max      : 2
2016-11-17 12:48:46,904   - pcie-link-width               : 16
2016-11-17 12:48:46,904   - pcie-link-width-max           : 16
2016-11-17 12:48:46,904   - power-state                   : 0
2016-11-17 12:48:46,904   - temperature                   : 33
2016-11-17 12:48:46,905   - uuid                          : GPU-f6898fc2-4cc7-0e8c-5b3e-02000396306e
2016-11-17 12:48:46,905   - vbios-version                 : 86.07.22.00.50
python ./xpra/codecs/cuda_common/cuda_context.py
2016-11-17 12:49:28,438 pycuda_info
2016-11-17 12:49:28,439 CUDA initialization (this may take a few seconds)
2016-11-17 12:49:28,674 CUDA 7.5.0 / PyCUDA 2016.1.2, found 1 device:
2016-11-17 12:49:28,674   + Graphics Device @ 0000:05:00.0 (memory: 86% free, compute: 6.1)
2016-11-17 12:49:28,789 * version                         : 2016.1.2
2016-11-17 12:49:28,789   - text                          : 2016.1.2
2016-11-17 12:49:28,790 cuda_info
2016-11-17 12:49:28,790 * driver
2016-11-17 12:49:28,790   - driver_version                : 8000
2016-11-17 12:49:28,790   - version                       : 7.5.0
2016-11-17 12:49:28,790 preferences:
rpm -qa xpra
xpra-1.0-0.20161115r14430.fc24.x86_64

I'm using the rpm packages from the beta repo. Do you think building this myself would make a difference?

comment:14 Changed 21 months ago by Antoine Martin

"Graphics Device"... sigh.

Can you please try with the latest drivers to see if that improves things: 375.20 is out.
Please also post the gl_check output, we may be able to get more GPU information that way. (though loading opengl could also be a problem in itself..)

I'll try to downgrade mine.

For the record, here's what I get with my overpriced GTX 1070 (trick of the day: XPRA_LOG_FORMAT):

XPRA_LOG_FORMAT="" python ./xpra/codecs/nv_util.py 
NVidia driver version 375.10
NVENC license keys:
* version common: 0 key(s)
* version 7: 0 key(s)

1 card:
* 0
  - clock-info-graphics           : 961
  - clock-info-graphics-max       : 1987
  - clock-info-mem                : 4006
  - clock-info-mem-max            : 4004
  - clock-info-sm                 : 961
  - clock-info-sm-max             : 1987
  - fan-speed                     : 0
  - memory
    - free                        : 7338196992
    - total                       : 8507162624
    - used                        : 1168965632
  - name                          : GeForce GTX 1070
  - pci
    - bus                         : 1
    - busId                       : 0000:01:00.0
    - device                      : 0
    - domain                      : 0
    - pciDeviceId                 : 461443294
    - pciSubSystemId              : 0
  - pcie-link-generation          : 2
  - pcie-link-generation-max      : 2
  - pcie-link-width               : 16
  - pcie-link-width-max           : 16
  - power-state                   : 0
  - temperature                   : 57
  - uuid                          : GPU-5ae4275b-349c-124a-b4ac-072e50f886f2
  - vbios-version                 : 86.04.26.00.3E
XPRA_LOG_FORMAT="" ./xpra/codecs/cuda_common/cuda_context.py 
pycuda_info
CUDA initialization (this may take a few seconds)
CUDA 7.5.0 / PyCUDA 2016.1.2, found 1 device:
  + GeForce GTX 1070 @ 0000:01:00.0 (memory: 85% free, compute: 6.1)
* version                         : 2016.1.2
  - text                          : 2016.1.2
cuda_info
* driver
  - driver_version                : 8000
  - version                       : 7.5.0
preferences:
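
For anyone wondering why an empty XPRA_LOG_FORMAT strips the timestamps: Python's `logging.Formatter` falls back to a plain `%(message)s` format when given an empty string. A minimal sketch of an environment variable overriding a log format; the default format string and the `make_logger` helper are assumptions for this example, not xpra's actual code:

```python
import logging
import os
import sys

# assumed default, chosen to resemble the timestamped output shown earlier
DEFAULT_FORMAT = "%(asctime)s %(message)s"


def make_logger(name, stream=None):
    """Create a logger whose format can be overridden via XPRA_LOG_FORMAT."""
    fmt = os.environ.get("XPRA_LOG_FORMAT", DEFAULT_FORMAT)
    handler = logging.StreamHandler(stream or sys.stderr)
    # an empty format string makes Formatter use just "%(message)s"
    handler.setFormatter(logging.Formatter(fmt))
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    make_logger("demo").info("NVidia driver version 375.10")
```

Running the script with `XPRA_LOG_FORMAT=""` set in the environment prints only the message; without it, each line is prefixed with a timestamp.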

comment:15 Changed 21 months ago by Smo

Thanks for the trick of the day :) Video card name is showing up properly with the new driver.

XPRA_LOG_FORMAT="" python ./xpra/codecs/nv_util.py 
NVidia driver version 375.20
NVENC license keys:
* version common: 0 key(s)
* version 7: 0 key(s)

1 card:
* 0
  - clock-info-graphics           : 759
  - clock-info-graphics-max       : 1974
  - clock-info-mem                : 810
  - clock-info-mem-max            : 3504
  - clock-info-sm                 : 759
  - clock-info-sm-max             : 1974
  - fan-speed                     : 30
  - memory
    - free                        : 3770810368
    - total                       : 4267573248
    - used                        : 496762880
  - name                          : GeForce GTX 1050 Ti
  - pci
    - bus                         : 5
    - busId                       : 0000:05:00.0
    - device                      : 0
    - domain                      : 0
    - pciDeviceId                 : 478286046
    - pciSubSystemId              : 1649621058
  - pcie-link-generation          : 2
  - pcie-link-generation-max      : 2
  - pcie-link-width               : 16
  - pcie-link-width-max           : 16
  - power-state                   : 5
  - temperature                   : 32
  - uuid                          : GPU-f6898fc2-4cc7-0e8c-5b3e-02000396306e
  - vbios-version                 : 86.07.22.00.50
XPRA_LOG_FORMAT="" ./xpra/codecs/cuda_common/cuda_context.py
pycuda_info
CUDA initialization (this may take a few seconds)
CUDA 7.5.0 / PyCUDA 2016.1.2, found 1 device:
  + GeForce GTX 1050 Ti @ 0000:05:00.0 (memory: 87% free, compute: 6.1)
* version                         : 2016.1.2
  - text                          : 2016.1.2
cuda_info
* driver
  - driver_version                : 8000
  - version                       : 7.5.0
XPRA_LOG_FORMAT="" ./xpra/client/gl/gl_check.py 
OpenGL_accelerate module loaded


OpenGL properties:
* GLU.extensions                  : GLU_EXT_nurbs_tessellator GLU_EXT_object_space_tess 
* GLU.version                     : 1.3
* accelerate                      : 3.1.1a1
* display_mode                    : ALPHA, SINGLE
* extensions                      : GL_AMD_multi_draw_indirect, GL_AMD_seamless_cubemap_per_texture, GL_AMD_vertex_shader_viewport_index, GL_AMD_vertex_shader_layer, GL_ARB_arrays_of_arrays, GL_ARB_base_instance, GL_ARB_bindless_texture, GL_ARB_blend_func_extended, GL_ARB_buffer_storage, GL_ARB_clear_buffer_object, GL_ARB_clear_texture, GL_ARB_clip_control, GL_ARB_color_buffer_float, GL_ARB_compatibility, GL_ARB_compressed_texture_pixel_storage, GL_ARB_conservative_depth, GL_ARB_compute_shader, GL_ARB_compute_variable_group_size, GL_ARB_conditional_render_inverted, GL_ARB_copy_buffer, GL_ARB_copy_image, GL_ARB_cull_distance, GL_ARB_debug_output, GL_ARB_depth_buffer_float, GL_ARB_depth_clamp, GL_ARB_depth_texture, GL_ARB_derivative_control, GL_ARB_direct_state_access, GL_ARB_draw_buffers, GL_ARB_draw_buffers_blend, GL_ARB_draw_indirect, GL_ARB_draw_elements_base_vertex, GL_ARB_draw_instanced, GL_ARB_enhanced_layouts, GL_ARB_ES2_compatibility, GL_ARB_ES3_compatibility, GL_ARB_ES3_1_compatibility, GL_ARB_ES3_2_compatibility, GL_ARB_explicit_attrib_location, GL_ARB_explicit_uniform_location, GL_ARB_fragment_coord_conventions, GL_ARB_fragment_layer_viewport, GL_ARB_fragment_program, GL_ARB_fragment_program_shadow, GL_ARB_fragment_shader, GL_ARB_fragment_shader_interlock, GL_ARB_framebuffer_no_attachments, GL_ARB_framebuffer_object, GL_ARB_framebuffer_sRGB, GL_ARB_geometry_shader4, GL_ARB_get_program_binary, GL_ARB_get_texture_sub_image, GL_ARB_gl_spirv, GL_ARB_gpu_shader5, GL_ARB_gpu_shader_fp64, GL_ARB_gpu_shader_int64, GL_ARB_half_float_pixel, GL_ARB_half_float_vertex, GL_ARB_imaging, GL_ARB_indirect_parameters, GL_ARB_instanced_arrays, GL_ARB_internalformat_query, GL_ARB_internalformat_query2, GL_ARB_invalidate_subdata, GL_ARB_map_buffer_alignment, GL_ARB_map_buffer_range, GL_ARB_multi_bind, GL_ARB_multi_draw_indirect, GL_ARB_multisample, GL_ARB_multitexture, GL_ARB_occlusion_query, GL_ARB_occlusion_query2, GL_ARB_parallel_shader_compile, 
GL_ARB_pipeline_statistics_query, GL_ARB_pixel_buffer_object, GL_ARB_point_parameters, GL_ARB_point_sprite, GL_ARB_post_depth_coverage, GL_ARB_program_interface_query, GL_ARB_provoking_vertex, GL_ARB_query_buffer_object, GL_ARB_robust_buffer_access_behavior, GL_ARB_robustness, GL_ARB_sample_locations, GL_ARB_sample_shading, GL_ARB_sampler_objects, GL_ARB_seamless_cube_map, GL_ARB_seamless_cubemap_per_texture, GL_ARB_separate_shader_objects, GL_ARB_shader_atomic_counter_ops, GL_ARB_shader_atomic_counters, GL_ARB_shader_ballot, GL_ARB_shader_bit_encoding, GL_ARB_shader_clock, GL_ARB_shader_draw_parameters, GL_ARB_shader_group_vote, GL_ARB_shader_image_load_store, GL_ARB_shader_image_size, GL_ARB_shader_objects, GL_ARB_shader_precision, GL_ARB_shader_storage_buffer_object, GL_ARB_shader_subroutine, GL_ARB_shader_texture_image_samples, GL_ARB_shader_texture_lod, GL_ARB_shading_language_100, GL_ARB_shader_viewport_layer_array, GL_ARB_shading_language_420pack, GL_ARB_shading_language_include, GL_ARB_shading_language_packing, GL_ARB_shadow, GL_ARB_sparse_buffer, GL_ARB_sparse_texture, GL_ARB_sparse_texture2, GL_ARB_sparse_texture_clamp, GL_ARB_stencil_texturing, GL_ARB_sync, GL_ARB_tessellation_shader, GL_ARB_texture_barrier, GL_ARB_texture_border_clamp, GL_ARB_texture_buffer_object, GL_ARB_texture_buffer_object_rgb32, GL_ARB_texture_buffer_range, GL_ARB_texture_compression, GL_ARB_texture_compression_bptc, GL_ARB_texture_compression_rgtc, GL_ARB_texture_cube_map, GL_ARB_texture_cube_map_array, GL_ARB_texture_env_add, GL_ARB_texture_env_combine, GL_ARB_texture_env_crossbar, GL_ARB_texture_env_dot3, GL_ARB_texture_filter_minmax, GL_ARB_texture_float, GL_ARB_texture_gather, GL_ARB_texture_mirror_clamp_to_edge, GL_ARB_texture_mirrored_repeat, GL_ARB_texture_multisample, GL_ARB_texture_non_power_of_two, GL_ARB_texture_query_levels, GL_ARB_texture_query_lod, GL_ARB_texture_rectangle, GL_ARB_texture_rg, GL_ARB_texture_rgb10_a2ui, GL_ARB_texture_stencil8, GL_ARB_texture_storage, 
GL_ARB_texture_storage_multisample, GL_ARB_texture_swizzle, GL_ARB_texture_view, GL_ARB_timer_query, GL_ARB_transform_feedback2, GL_ARB_transform_feedback3, GL_ARB_transform_feedback_instanced, GL_ARB_transform_feedback_overflow_query, GL_ARB_transpose_matrix, GL_ARB_uniform_buffer_object, GL_ARB_vertex_array_bgra, GL_ARB_vertex_array_object, GL_ARB_vertex_attrib_64bit, GL_ARB_vertex_attrib_binding, GL_ARB_vertex_buffer_object, GL_ARB_vertex_program, GL_ARB_vertex_shader, GL_ARB_vertex_type_10f_11f_11f_rev, GL_ARB_vertex_type_2_10_10_10_rev, GL_ARB_viewport_array, GL_ARB_window_pos, GL_ATI_draw_buffers, GL_ATI_texture_float, GL_ATI_texture_mirror_once, GL_S3_s3tc, GL_EXT_texture_env_add, GL_EXT_abgr, GL_EXT_bgra, GL_EXT_bindable_uniform, GL_EXT_blend_color, GL_EXT_blend_equation_separate, GL_EXT_blend_func_separate, GL_EXT_blend_minmax, GL_EXT_blend_subtract, GL_EXT_compiled_vertex_array, GL_EXT_Cg_shader, GL_EXT_depth_bounds_test, GL_EXT_direct_state_access, GL_EXT_draw_buffers2, GL_EXT_draw_instanced, GL_EXT_draw_range_elements, GL_EXT_fog_coord, GL_EXT_framebuffer_blit, GL_EXT_framebuffer_multisample, GL_EXTX_framebuffer_mixed_formats, GL_EXT_framebuffer_multisample_blit_scaled, GL_EXT_framebuffer_object, GL_EXT_framebuffer_sRGB, GL_EXT_geometry_shader4, GL_EXT_gpu_program_parameters, GL_EXT_gpu_shader4, GL_EXT_multi_draw_arrays, GL_EXT_packed_depth_stencil, GL_EXT_packed_float, GL_EXT_packed_pixels, GL_EXT_pixel_buffer_object, GL_EXT_point_parameters, GL_EXT_polygon_offset_clamp, GL_EXT_post_depth_coverage, GL_EXT_provoking_vertex, GL_EXT_raster_multisample, GL_EXT_rescale_normal, GL_EXT_secondary_color, GL_EXT_separate_shader_objects, GL_EXT_separate_specular_color, GL_EXT_shader_image_load_formatted, GL_EXT_shader_image_load_store, GL_EXT_shader_integer_mix, GL_EXT_shadow_funcs, GL_EXT_sparse_texture2, GL_EXT_stencil_two_side, GL_EXT_stencil_wrap, GL_EXT_texture3D, GL_EXT_texture_array, GL_EXT_texture_buffer_object, GL_EXT_texture_compression_dxt1, 
GL_EXT_texture_compression_latc, GL_EXT_texture_compression_rgtc, GL_EXT_texture_compression_s3tc, GL_EXT_texture_cube_map, GL_EXT_texture_edge_clamp, GL_EXT_texture_env_combine, GL_EXT_texture_env_dot3, GL_EXT_texture_filter_anisotropic, GL_EXT_texture_filter_minmax, GL_EXT_texture_integer, GL_EXT_texture_lod, GL_EXT_texture_lod_bias, GL_EXT_texture_mirror_clamp, GL_EXT_texture_object, GL_EXT_texture_shared_exponent, GL_EXT_texture_sRGB, GL_EXT_texture_sRGB_decode, GL_EXT_texture_storage, GL_EXT_texture_swizzle, GL_EXT_timer_query, GL_EXT_transform_feedback2, GL_EXT_vertex_array, GL_EXT_vertex_array_bgra, GL_EXT_vertex_attrib_64bit, GL_EXT_x11_sync_object, GL_EXT_import_sync_object, GL_NV_robustness_video_memory_purge, GL_IBM_rasterpos_clip, GL_IBM_texture_mirrored_repeat, GL_KHR_context_flush_control, GL_KHR_debug, GL_KHR_no_error, GL_KHR_robust_buffer_access_behavior, GL_KHR_robustness, GL_KTX_buffer_region, GL_NV_alpha_to_coverage_dither_control, GL_NV_bindless_multi_draw_indirect, GL_NV_bindless_multi_draw_indirect_count, GL_NV_bindless_texture, GL_NV_blend_equation_advanced, GL_NV_blend_equation_advanced_coherent, GL_NVX_blend_equation_advanced_multi_draw_buffers, GL_NV_blend_square, GL_NV_clip_space_w_scaling, GL_NV_command_list, GL_NV_compute_program5, GL_NV_conditional_render, GL_NV_conservative_raster, GL_NV_conservative_raster_dilate, GL_NV_conservative_raster_pre_snap_triangles, GL_NV_copy_depth_to_color, GL_NV_copy_image, GL_NV_depth_buffer_float, GL_NV_depth_clamp, GL_NV_draw_texture, GL_NV_draw_vulkan_image, GL_NV_ES1_1_compatibility, GL_NV_ES3_1_compatibility, GL_NV_explicit_multisample, GL_NV_fence, GL_NV_fill_rectangle, GL_NV_float_buffer, GL_NV_fog_distance, GL_NV_fragment_coverage_to_color, GL_NV_fragment_program, GL_NV_fragment_program_option, GL_NV_fragment_program2, GL_NV_fragment_shader_interlock, GL_NV_framebuffer_mixed_samples, GL_NV_framebuffer_multisample_coverage, GL_NV_geometry_shader4, GL_NV_geometry_shader_passthrough, 
GL_NV_gpu_program4, GL_NV_internalformat_sample_query, GL_NV_gpu_program4_1, GL_NV_gpu_program5, GL_NV_gpu_program5_mem_extended, GL_NV_gpu_program_fp64, GL_NV_gpu_shader5, GL_NV_half_float, GL_NV_light_max_exponent, GL_NV_multisample_coverage, GL_NV_multisample_filter_hint, GL_NV_occlusion_query, GL_NV_packed_depth_stencil, GL_NV_parameter_buffer_object, GL_NV_parameter_buffer_object2, GL_NV_path_rendering, GL_NV_path_rendering_shared_edge, GL_NV_pixel_data_range, GL_NV_point_sprite, GL_NV_primitive_restart, GL_NV_register_combiners, GL_NV_register_combiners2, GL_NV_sample_locations, GL_NV_sample_mask_override_coverage, GL_NV_shader_atomic_counters, GL_NV_shader_atomic_float, GL_NV_shader_atomic_float64, GL_NV_shader_atomic_fp16_vector, GL_NV_shader_atomic_int64, GL_NV_shader_buffer_load, GL_NV_shader_storage_buffer_object, GL_NV_stereo_view_rendering, GL_NV_texgen_reflection, GL_NV_texture_barrier, GL_NV_texture_compression_vtc, GL_NV_texture_env_combine4, GL_NV_texture_multisample, GL_NV_texture_rectangle, GL_NV_texture_shader, GL_NV_texture_shader2, GL_NV_texture_shader3, GL_NV_transform_feedback, GL_NV_transform_feedback2, GL_NV_uniform_buffer_unified_memory, GL_NV_vdpau_interop, GL_NV_vertex_array_range, GL_NV_vertex_array_range2, GL_NV_vertex_attrib_integer_64bit, GL_NV_vertex_buffer_unified_memory, GL_NV_vertex_program, GL_NV_vertex_program1_1, GL_NV_vertex_program2, GL_NV_vertex_program2_option, GL_NV_vertex_program3, GL_NV_viewport_array2, GL_NV_viewport_swizzle, GL_NVX_conditional_render, GL_NVX_gpu_memory_info, GL_NVX_nvenc_interop, GL_NV_shader_thread_group, GL_NV_shader_thread_shuffle, GL_KHR_blend_equation_advanced, GL_KHR_blend_equation_advanced_coherent, GL_SGIS_generate_mipmap, GL_SGIS_texture_lod, GL_SGIX_depth_texture, GL_SGIX_shadow, GL_SUN_slice_accum, 
* gdkgl
  - version                       : 1.4
* gdkglext
  - version                       : 1.2.0
* glconfig                        : <gtk.gdkgl.Config object at 0x7f9095059c30 (GdkGLConfigImplX11 at 0x55ab518c1920)>
* gtkglext
  - version                       : 1.2.0
* has_alpha                       : True
* max-viewport-dims               : (32768, 32768)
* opengl                          : 4, 5
* pygdkglext
  - version                       : 1.1.0
* pyopengl                        : 3.1.1a1
* renderer                        : GeForce GTX 1050 Ti/PCIe/SSE2
* rgba                            : True
* safe                            : True
* shading-language-version        : 4.50 NVIDIA
* texture-size-limit              : 32768
* transparency                    : True
* vendor                          : NVIDIA Corporation
* zerocopy                        : True

comment:16 Changed 21 months ago by Smo

Pasting this before I ctrl+c my client because I think it will lock up my machine.

PYTHONPATH=. ./tests/xpra/codecs/test_nvenc7.py 
2016-11-22 12:00:28,143 CUDA initialization (this may take a few seconds)
2016-11-22 12:00:28,388 CUDA 7.5.0 / PyCUDA 2016.1.2, found 1 device:
2016-11-22 12:00:28,388   + GeForce GTX 1050 Ti @ 0000:05:00.0 (memory: 86% free, compute: 6.1)
2016-11-22 12:00:28,498 NVidia driver version 375.20
2016-11-22 12:00:30,496 NVENC successfully initialized
creating sample data for size 4096

Traceback (most recent call last):
  File "./tests/xpra/codecs/test_nvenc7.py", line 23, in <module>
    main()
  File "./tests/xpra/codecs/test_nvenc7.py", line 12, in main
    test_nvenc.test_encode_one()
  File "/home/cosmo/work/Xpra/trunk/src/tmp/tests/xpra/codecs/test_nvenc.py", line 30, in test_encode_one
    test_encoder(encoder_module)
  File "/home/cosmo/work/Xpra/trunk/src/tmp/tests/xpra/codecs/test_encoder.py", line 94, in test_encoder
    do_test_encoder(e, src_format, actual_w, actual_h, images, log=log, after_encode_cb=after_encode_cb)
  File "/home/cosmo/work/Xpra/trunk/src/tmp/tests/xpra/codecs/test_encoder.py", line 120, in do_test_encoder
    c = encoder.compress_image(image)
  File "xpra/codecs/nvenc7/encoder.pyx", line 2035, in xpra.codecs.nvenc7.encoder.Encoder.compress_image (xpra/codecs/nvenc7/encoder.c:24541)
  File "xpra/codecs/nvenc7/encoder.pyx", line 2213, in xpra.codecs.nvenc7.encoder.Encoder.do_compress_image (xpra/codecs/nvenc7/encoder.c:27908)
  File "xpra/codecs/nvenc7/encoder.pyx", line 1317, in xpra.codecs.nvenc7.encoder.raiseNVENC (xpra/codecs/nvenc7/encoder.c:9269)
xpra.codecs.nvenc7.encoder.NVENCException: locking output buffer - returned 8: This indicates that one or more of the parameter passed to the API call is invalid.

comment:17 Changed 21 months ago by Smo

Yes, this did lock up my workstation.

comment:18 Changed 21 months ago by Antoine Martin

r14473 + r14472 require newer drivers (375.x or later), so we can be sure that we detect the newer cards, and then we blacklist the 10xx ones.

At some later point, we can relax this check when we figure out what works and what doesn't...
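
The check described above boils down to two tests: is the driver new enough to report the card name reliably, and does that name match a blacklisted model. A rough sketch under those assumptions; the pattern, the minimum version tuple and the function names are illustrative and are not taken from r14472 / r14473:

```python
import re

MIN_DRIVER_VERSION = (375, 0)   # "375.x or later"
# assumed pattern for the Pascal "10xx" cards, e.g. "GeForce GTX 1050 Ti"
BLACKLIST = re.compile(r"GTX 10\d\d")


def parse_version(version):
    """Turn a driver version string like '375.20' into a tuple (375, 20)."""
    return tuple(int(part) for part in version.split("."))


def can_use_device(device_name, driver_version):
    """False if the driver is too old or the device name is blacklisted."""
    if parse_version(driver_version) < MIN_DRIVER_VERSION:
        # older drivers may report a generic name ("Graphics Device"),
        # so the blacklist could not be trusted
        return False
    return BLACKLIST.search(device_name) is None


if __name__ == "__main__":
    for name, driver in (
        ("Graphics Device", "370.28"),
        ("GeForce GTX 1050 Ti", "375.20"),
        ("GeForce GTX 970", "375.20"),
    ):
        print("%-20s driver %s: usable=%s" % (name, driver, can_use_device(name, driver)))
```

Requiring the newer driver first matters because, as seen earlier in this ticket, the 370.28 driver reported the 1050 Ti as "Graphics Device", which no name-based blacklist would catch.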

comment:19 Changed 19 months ago by Smo

Tested today with 1.0.2

Warning: device 'GeForce GTX 1050 Ti @ 0000:05:00.0' is blacklisted and will not be used
NVidia driver version 375.26

comment:20 Changed 19 months ago by Smo

Resolution: fixed
Status: new → closed

comment:21 Changed 14 months ago by Antoine Martin

Follow up in #1550: other cards get the API error now, fortunately without the lockups.
NVENC v8 support in #1552.

comment:22 Changed 4 months ago by Antoine Martin

For nvenc v8 support see #1823
