Adventures in Mesa – EGL and Transparency

A bug report lands in my notifications, I open it, transparency is broken it says, that can’t be right. I test it out on my rig, looks broken to me – no that can’t be right either.

Hah! I don’t have a compositor running, so I start Compton – no cigar, still broken. “An interesting bug” I mumble to myself, I guess the hunt is on.

As with any good bug hunt, the first stage is research, and so I wonder: how do the folks at GLFW handle this? First result, a PR titled “support for X11/GLX window transparency”.

“Simple!” All I got to do is check that the X11 configs have an alphaMask, easy-peasy.

Of course, had it ended there, I would not be writing this post. I test it out, it works, but here is the catch: only with GLX. Apparently Mesa does not report EGL configs with an alpha mask… hmm. One could settle there, as we only use EGL as a fallback, but that’s no fun.

So I searched around – to no avail. I then decided to ask at #dri-devel, and after some discussion regarding what behavior I was expecting, they pointed me at this issue, opened circa 2013. 2013! Yikes!

So a hacky workaround could have been just removing the if (>depth == 24) { line. I say “could have” instead of “would have” because it just wasn’t so. Go on, go try it! Did it work? Of course not! Why would I lie to you? Or maybe it did work for you, I don’t know, but it sure didn’t work for me!

But that’s okay, simple problems got simple solutions, right? And transparency is a pretty simple problem, is it not? You got to comfort yourself somehow if you want to fall asleep, I’m sure this will only take a couple days. Yet lo and behold! Witness as days drag into a week, then a week into weeks, all on transparency!

So tell me, why does Mesa only store four DRI configs per EGL config? Why not one for one? I didn’t know at the time, so I set out to make the ratio 1:1! I thought about it… what are we going to need? Let’s see, one vector, and…. that’s about it!

Well, this is C, not C++ or something else, just C, and there are no built-in vector types, but that’s okay, Mesa is a large project, they probably have one. Two stnvims of the source code later and I’ve tracked down a type called u_vector. And so I use it!

Ah, but hold your horses! I’m now getting SEGVs and EGL formats with negative numbers of RGB bits, something that’s certifiably insane! Tell me, what’s wrong with this code:

Do you see it? No? Yes? Maybe? Okay, here is a hint:

Given up? Okay, so I passed sizeof(struct dri2_egl_config_bundle *) to u_vector_init. That means I passed the size of a pointer (8 bytes) to a function expecting the size of the actual struct (a number a lot bigger than 8 bytes, trust me). This was causing every bundle to partially overwrite the ones prior.
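For anyone who has not been bitten by this one yet, here is a minimal sketch of the same class of mistake. It does not use Mesa's u_vector; `tiny_vector`, `config_bundle` and every function below are hypothetical stand-ins I made up for illustration. The only point is what happens to the slot spacing when a size-based container is handed `sizeof(T *)` instead of `sizeof(T)`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the bundle; the real struct's fields differ. */
struct config_bundle {
    int red_bits, green_bits, blue_bits, alpha_bits;
    const void *driver_configs[4];
};

/* Hypothetical fixed-capacity vector that, like many C containers,
 * only knows its element type through a byte size. */
struct tiny_vector {
    unsigned char *data;
    size_t element_size;
    size_t capacity;
};

/* Returns 1 on success, 0 if the allocation failed. */
static int tiny_vector_init(struct tiny_vector *v, size_t element_size,
                            size_t capacity)
{
    v->data = calloc(capacity, element_size);
    v->element_size = element_size;
    v->capacity = capacity;
    return v->data != NULL;
}

/* Byte offset of slot `index` from the start of the storage. */
static size_t slot_offset(const struct tiny_vector *v, size_t index)
{
    assert(index < v->capacity);
    return index * v->element_size;
}

/* True when an object of `object_size` bytes written into one slot
 * would spill into the neighbouring slot. */
static int slots_overlap(const struct tiny_vector *v, size_t object_size)
{
    return object_size > v->element_size;
}
```

With `sizeof(struct config_bundle *)` the slots sit one pointer apart, so each bundle written into the vector lands on top of its neighbours. A habit that tends to avoid this is writing `sizeof(*bundle)` on an actual variable, so the size follows the type of the thing being stored.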

So I resolve that, but to no avail! The gods are just not on my side! Some investigation later I discover three things:

  1. unreachable should be ABORTing my program.
  2. I had accidentally been compiling with debug assertions disabled, so that was not happening.
  3. Had I compiled with them enabled, I would have discovered that u_vector only supports types with a power-of-two size.
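The three points above boil down to a check that only exists in debug builds. As a rough sketch (these helpers are hypothetical and are not Mesa's actual code), this is what a power-of-two test looks like, and how a check can be made to survive a build where `NDEBUG` is defined and `assert` expands to nothing:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True when n has exactly one bit set, i.e. n is 1, 2, 4, 8, ... */
static bool is_power_of_two(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Hypothetical init-time validation.
 * Returns 0 on success and -1 on an unsupported element size.
 * The explicit return value is still there in release builds,
 * unlike a check written only as assert(). */
static int check_element_size(size_t element_size)
{
    if (!is_power_of_two(element_size))
        return -1;
    /* Debug-only sanity check; compiled out when NDEBUG is defined. */
    assert(element_size != 0);
    return 0;
}
```

A struct made of a few ints and pointers rarely happens to have a power-of-two size, so a container with this restriction needs either padding or a different container.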

So I switched to the _EGLArray type, but that does not end my tale! No siree! We are just getting started! Two memory leaks, four NULL accesses, three pointer type mismatches, some other thing I’ve forgotten, and an uncountable number of tears later I’ve finally got something that compiles. Only compiles.

Does it work? Of course not. Not one bit. Now nothing using EGL can successfully swap its buffers! In fact, I accidentally ran out of memory one day, and when I restarted, my changes to Mesa did this to me:

See the colours in the screenshot? Those were not the colours I could see. The greens in the screenshot were blues on my monitor, and the blues were oranges.

I was stumped on this issue, for a very good while, until I saw this warning appear when closing Weston via Mod-Shift-Q instead of my usual Ctrl-C: libEGL warning: FIXME: egl/x11 doesn't support front buffer rendering. It took some thinking to realize what’s up… but eventually…

Do you remember how Mesa used to store up to four DRI configs per EGL config? Well, that’s because they had to! You see, the EGL configs don’t specify whether they are for windows or pbuffers, or whether they have sRGB support; that gets decided when the surface is created. For that reason four DRI configs had to share the same EGL config. Now, when I decided to not share them, the EGL configs with DRI configs meant for pbuffers just so happened to be first, so I was causing the EGL clients to use EGL configs that used DRI configs meant for pbuffers (what a mouthful), which would obviously fail to work.
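If that paragraph made your head spin, here is a small sketch of the sharing idea. None of these names come from Mesa; `driver_config`, `shared_config` and `pick_driver_config` are hypothetical. I am also assuming that the window versus pbuffer split maps onto double versus single buffering, which is an assumption of this sketch. The point is only that the final pick happens at surface creation, not at config selection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver-level (DRI) config. */
struct driver_config {
    bool double_buffered;
    bool srgb;
};

/* Hypothetical EGL-level config that shares up to four driver configs,
 * indexed as variants[double_buffered][srgb]. Missing variants are NULL. */
struct shared_config {
    const struct driver_config *variants[2][2];
};

enum surface_kind { SURFACE_WINDOW, SURFACE_PBUFFER };

/* Called when a surface is created: only now is it known which of the
 * shared driver configs is actually wanted. Returns NULL if the requested
 * combination is not available. */
static const struct driver_config *
pick_driver_config(const struct shared_config *cfg,
                   enum surface_kind kind, bool want_srgb)
{
    /* Assumption of this sketch: windows are double buffered,
     * pbuffers are single buffered. */
    bool double_buffered = (kind == SURFACE_WINDOW);
    return cfg->variants[double_buffered][want_srgb];
}
```

Splitting the four variants into separate EGL configs removes this late pick, so whichever variant happens to be listed first is the one clients end up with, regardless of the surface they create.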

Four iterations later and I had this issue resolved, allowing EGL to support more configs than it ever did before. So everything is peachy, right? Well, have a look at this:

The background should be transparent FWIW.

After some more investigation, I discovered that I’d replaced the rgba_masks I was passing to dri2_add_config with NULL while testing and forgotten. Undoing this mistake has taken us back to square one, with a Mesa implementation that lacks transparency support.

Indeed, after exactly 21 days of strife we have gone full circle! What’s the moral of this article, you ask? I dunno, maybe don’t try to patch Mesa? I just felt like documenting my adventures, I guess. Maybe come back in a couple weeks for the second installment of Adventures in Mesa, where we (hopefully) actually resolve this transparency issue.

One thought on “Adventures in Mesa – EGL and Transparency”

  1. Thank You for sharing your adventures with us.

    It was very interesting to read about your work from your POV. And while reading I hope Rust replace C ASAP. Great work!
