Xpra: Ticket #1500: Firefox and Chrome causing hard client crash

Using the latest trunk, r15617 (both client and server are Fedora 25), I am getting hard client crashes after using Firefox or Chrome. Chrome behaves better: I can use it almost indiscriminately, but at some point something changes and it causes a hard client crash. Firefox is the easiest trigger: I just opened it and watched a YouTube video, and that was enough to cause the crash. Even opening a text-only website like Reddit or Wikipedia and scrolling will cause a crash. Once the client crashes, any subsequent connection fails with the same error (if you were watching a video) until you kill Firefox or Chrome. Of note, just leaving Xterms running is fine: no crashes or errors. Running with -d gtk turns up nothing of interest.

Here's the last bit of the client log before it crashes:

(Xpra:19270): Gdk-ERROR **: The program 'Xpra' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadMatch (invalid parameter attributes)'.
  (Details: serial 79698 error_code 8 request_code 72 minor_code 0)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the --sync command line
   option to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)
2017-04-14 10:30:54,287 sound output stopping
Trace/breakpoint trap (core dumped)

One quick question:

What -d flag would be most useful here?



Sat, 15 Apr 2017 06:38:34 GMT - Antoine Martin: owner changed

When did this regression start? Can you bisect it? I'm not seeing that here and I'm always running trunk on F25. Do you use mmap? Have you tried enabling / disabling opengl, clipboard, etc.?

As for debug flags, when in doubt go for "-d all": better have too much than too little.


Tue, 18 Apr 2017 17:04:34 GMT - J. Max Mena:

Looks like it was OpenGL - running with --opengl=no I don't get the crash anymore.

Running with OpenGL enabled and -d all, here are the last few lines of output:

2017-04-18 10:08:01,913 check_server_echo(0) last=True, server_ok=True
2017-04-18 10:08:01,913 add_packet_to_queue(add_data ...)
2017-04-18 10:08:01,928 gtk2.GLWindowBacking(12, (460, 185), None).gl_show() swapping buffers now
2017-04-18 10:08:01,929 gl_show after  79ms took  0ms,  1 updates
2017-04-18 10:08:01,929 gtk2.GLWindowBacking(12, (460, 185), None).gl_frame_terminator()
2017-04-18 10:08:01,929 <OpenGL.platform.baseplatform.glBindFramebuffer object at 0x7fed61243fd0>(GL_FRAMEBUFFER (36160), c_uint(1L))
2017-04-18 10:08:01,929 gtk2.GLWindowBacking(12, (460, 185), None).do_present_fbo() done
(Xpra:13520): Gdk-ERROR **: The program 'Xpra' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadMatch (invalid parameter attributes)'.
  (Details: serial 5782 error_code 8 request_code 72 minor_code 0)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the --sync command line
   option to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)
2017-04-18 10:08:04,399 sound output stopping
Trace/breakpoint trap (core dumped)
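
For reference, the numeric fields in the "Details" line can be decoded against the core X11 protocol tables. The sketch below is only an illustration and is not part of Xpra: the function name `decode_x_error` and the lookup tables are made up for this example, and the tables cover only a handful of codes. Request codes below 128 are core protocol requests, and 72 is PutImage, so the BadMatch appears to come from an image upload rather than from a GLX extension request. The ticket itself does not analyse this any further.

```python
import re

# Small subsets of the core X11 protocol tables (values as defined in X.h / Xproto.h).
X_ERROR_CODES = {2: "BadValue", 3: "BadWindow", 8: "BadMatch", 9: "BadDrawable"}
X_REQUEST_CODES = {62: "CopyArea", 72: "PutImage", 73: "GetImage"}

DETAILS_RE = re.compile(
    r"serial (\d+) error_code (\d+) request_code (\d+) minor_code (\d+)"
)


def decode_x_error(line):
    """Decode a Gdk 'Details:' line into names; returns None if it does not match.

    Hypothetical helper written for this example only.
    """
    m = DETAILS_RE.search(line)
    if m is None:
        return None
    serial, error_code, request_code, minor_code = map(int, m.groups())
    return {
        "serial": serial,
        "error": X_ERROR_CODES.get(error_code, "unknown (%d)" % error_code),
        "request": X_REQUEST_CODES.get(request_code, "unknown (%d)" % request_code),
        # Codes 128 and above belong to extensions (GLX, XFixes, ...).
        "core_request": request_code < 128,
        "minor_code": minor_code,
    }


info = decode_x_error(
    "  (Details: serial 5782 error_code 8 request_code 72 minor_code 0)"
)
print(info)
```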


> Can you bisect it?


I'll get on that now since this morning seems a bit slow.


Tue, 18 Apr 2017 17:06:57 GMT - J. Max Mena:

Okay, I found a very reliable trigger:


That seems to reliably cause a crash here. Now, on to bisecting.


Tue, 18 Apr 2017 17:28:26 GMT - J. Max Mena:

First things first, rebuilding the client as 1.X causes the issue to go away, so it's limited to 2.X for now.

So I've rolled the client back to trunk r15500, and I'm still getting the crash.

I'll try rolling the server back as well, just to see if it makes a difference, but I'm 75% sure the issue is limited to the client.


Tue, 18 Apr 2017 17:47:00 GMT - J. Max Mena:

Of interest: my client's OpenGL information lists a VMWare, Inc GPU with OpenGL version 2.1, the same as the server, which has no GPU. This desktop has a dedicated Nvidia 745 GPU, so I'm a little confused as to why it's listed as VMWare.

Edit:

This would probably explain the issues I'm running into. They seem to occur only on this machine: afarr's laptop with an Intel iGPU works fine running Windows 10 with the 2.1 r15608 client against a 2.1 r15664 Fedora 25 server.


Tue, 18 Apr 2017 17:47:17 GMT - J. Max Mena: attachment set


Wed, 19 Apr 2017 05:22:58 GMT - Antoine Martin: priority changed; milestone set

We need to figure out if this crash is caused by:


Thu, 20 Apr 2017 20:21:09 GMT - J. Max Mena:

Starting with the low-hanging fruit:

I perused #1309 and found that the fix was added in r15094. I rolled back to r15093 and the crash went away. However, moving up to r15094 still doesn't cause the crash, so that isn't it.

Next up, walking through the versions you mentioned.


Thu, 20 Apr 2017 20:28:27 GMT - J. Max Mena:

So I still get the same crash even with r15560 - so it's been around for a while now. Not sure why I didn't catch this earlier. The fact that I get it before those changes tells me that they aren't the cause. Probably.


Thu, 20 Apr 2017 23:25:19 GMT - J. Max Mena:

Okay, I've tried all the revisions mentioned in comment:6 and they all have the same crash.


Fri, 21 Apr 2017 05:55:49 GMT - Antoine Martin:

Please try:

And bisect from there as needed (bearing in mind you may have to force-enable opengl on older revisions).
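
The bisection itself is a plain binary search over revision numbers. The following is a minimal sketch, not a script from the Xpra tree: `first_bad_revision` and `fake_is_bad` are hypothetical names, and in practice the `is_bad` callback would mean "build the client at this revision, run the trigger, and report whether it crashed". It assumes the behaviour flips exactly once between a known good and a known bad revision.

```python
def first_bad_revision(good, bad, is_bad):
    """Return the first revision in (good, bad] for which is_bad() is True.

    Assumes is_bad(good) is False, is_bad(bad) is True, and that the
    result changes only once somewhere in between.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid    # the regression is at mid or earlier
        else:
            good = mid   # the regression is after mid
    return bad


# Stand-in for a real build-and-test step; the cut-off 15300 is made up.
def fake_is_bad(rev):
    return rev >= 15300


result = first_bad_revision(15093, 15500, fake_is_bad)
print(result)
```

Each step halves the range, so a span of a few hundred revisions needs fewer than ten build-and-test cycles.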


Thu, 27 Apr 2017 19:58:41 GMT - J. Max Mena:

Quick update on this one:

After talking about it, the going theory right now is that the Nvidia driver isn't being loaded properly, so we are falling back to software OpenGL, which is causing the issue. That being said, "VMWare" should work.


Thu, 11 May 2017 17:55:18 GMT - J. Max Mena:

Update:

Looks like it's the software OpenGL that's the problem.


Fri, 12 May 2017 05:54:44 GMT - Antoine Martin: summary changed

r15811 moves "vmware" to the blacklist, so opengl won't be used by default on this buggy driver.
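
As a rough illustration of what such a blacklist check could look like (this is not the code from r15811; the table layout and the names `is_blacklisted` and `opengl_enabled` are assumptions made for this sketch), the driver's reported properties are matched against known-bad substrings, and an explicit user option still overrides the default. Only `--opengl=no` appears in this ticket; the "yes" and "auto" values below are assumed for the example.

```python
# Hypothetical blacklist: the ticket only confirms that "vmware" was added to it.
BLACKLIST = {"vendor": ["vmware"]}


def is_blacklisted(gl_props, blacklist=BLACKLIST):
    """Return True if any reported GL property contains a blacklisted substring."""
    for key, patterns in blacklist.items():
        value = str(gl_props.get(key, "")).lower()
        if any(pattern.lower() in value for pattern in patterns):
            return True
    return False


def opengl_enabled(gl_props, option="auto"):
    """Decide whether to use OpenGL.

    option: "no" disables it, "yes" forces it on (assumed value),
    "auto" (assumed default) enables it unless the driver is blacklisted.
    """
    if option == "no":
        return False
    if option == "yes":
        return True
    return not is_blacklisted(gl_props)


software = {"vendor": "VMWare, Inc", "opengl": "2.1"}
print(opengl_enabled(software))          # default: blacklisted driver, OpenGL off
print(opengl_enabled(software, "yes"))   # explicit override still turns it on
```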

@maxmylyn: I think we can close this ticket?


Fri, 12 May 2017 16:20:47 GMT - J. Max Mena: status changed; resolution set

Agreed, closing.


Sat, 23 Jan 2021 05:25:59 GMT - migration script:

this ticket has been moved to: https://github.com/Xpra-org/xpra/issues/1500