CRM32Pro SDK  v5.22
OpenGL video backend (glSDL)


We have used the glSDL v0.8 wrapper (http://olofson.net/mixed.html), converting it into an SDL video backend, adding support for some nice effects and fixing some memory leaks.

The glSDL developers have a web page (included below) explaining how to use this video backend.

We have added resize, antialiasing filtering, color filters and rotation effects that are directly usable through the CSprite interface.

After migrating some existing projects to glSDL, we have found that one of the most common issues is the conversion from surface to texture at run time (further information can be found in the Blitting section). To make these issues easy to identify, we have added two counters that are shown through the log system after calling CRM32Pro.VideoInfo():

-> Surface to Texture conversions: iS2Tload (at loading time) and iS2Trun (at run time)

If iS2Trun is large and keeps growing while the application runs, it means that these conversions are happening at run time, which usually kills performance.

How to use the glSDL video backend for SDL
http://icps.u-strasbg.fr/~marchesin/sdl/glsdl.html

Guidelines for using the glSDL video backend for SDL


glSDL tries hard to be fully SDL-compliant. That means the following advice is only guidelines, not absolute rules, and your program will work and will give correct results even if you don't follow it. However, you'll gain a lot, performance-wise, from following it (read : if you don't, rendering will be slower than any software mode, but if you do, you might very well get the best possible performance for your platform). Moreover, if you follow those guidelines, the speed of most other video backends will probably be improved too.

Basic glSDL internals

glSDL is a video backend for SDL that uses OpenGL. To achieve hardware acceleration, the surfaces that it's given are converted to OpenGL textures before being blitted to the screen. This special functioning has some implications for performance, as the cost for creating and/or modifying an OpenGL texture is usually quite high. Similarly, reading from video memory with OpenGL is slow and should be avoided. The surface format you decide to use also impacts the application speed. This document tries to explain all the things to do and not to do when using glSDL.

Conventions

In this document, the following conventions are used :
  • a software surface is a surface meeting the following condition :
    (surface->flags & SDL_HWSURFACE) == SDL_SWSURFACE
  • a hardware surface is a surface meeting the following condition :
    (surface->flags & SDL_HWSURFACE) == SDL_HWSURFACE
Note : the screen is always a hardware surface with glSDL.
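For reference, these two conditions translate into a trivial check such as the following sketch (is_hw_surface is just an illustrative helper, not part of SDL) :
#include "SDL.h"

int is_hw_surface(const SDL_Surface *s)
{
    return (s->flags & SDL_HWSURFACE) == SDL_HWSURFACE;
}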

A backend is, in SDL parlance, the underlying driver SDL uses on the current platform (examples of drivers : X11, DirectX, framebuffer, ASCII art...). To choose which backend you want to use, you have to set the SDL_VIDEODRIVER environment variable. For example, to use glSDL, you should do the following (under Linux, with a bash-style shell) :
export SDL_VIDEODRIVER=glSDL
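If you prefer selecting the backend from inside the program rather than from the shell, here is a minimal sketch using SDL 1.2's SDL_putenv() ; the variable must be set before SDL_Init(), otherwise the default backend is used :
#include <stdio.h>
#include "SDL.h"

int main(int argc, char *argv[])
{
    SDL_putenv("SDL_VIDEODRIVER=glSDL");   /* must happen before SDL_Init() */

    if (SDL_Init(SDL_INIT_VIDEO) < 0) {
        fprintf(stderr, "SDL_Init failed: %s\n", SDL_GetError());
        return 1;
    }
    /* ... SDL_SetVideoMode() and the rest of the program ... */
    SDL_Quit();
    return 0;
}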

Video Initialization

You should let SDL choose the bpp, either by requesting a bpp of 0 or by using the SDL_ANYFORMAT flag during SDL_SetVideoMode. If you want to make use of the OpenGL acceleration, you should also request a hardware surface (explicitly by asking for an SDL_HWSURFACE, or implicitly by asking for an SDL_DOUBLEBUF surface). Also keep in mind that a hardware-accelerated single-buffered video surface will cause a lot of tearing/blinking, so the best solution is probably to use a double buffer all the time :
SDL_SetVideoMode(640, 480, 0, SDL_DOUBLEBUF);
As creating a shadow surface would disable any OpenGL acceleration, glSDL always satisfies the requested bpp. glSDL is tested at 8, 15, 16, 24 and 32 bpp ; other bpps are untested, but you are welcome to report results and we would be happy to fix any problem you encounter. Also keep in mind that SDL_SetVideoMode will fail if your system or your current video mode doesn't support OpenGL (like for example XFree86 running at 8bpp). If you don't request 0 bpp, glSDL will have to convert your hardware surfaces to 32 bpp before being able to create a texture (indeed newer glSDL versions running on OpenGL 1.2+ might remove this limitation for some of the pixel formats, but it's not done at the moment). If you don't request a hardware surface, a shadow buffer will be created and OpenGL acceleration won't be used at all.

Note : ideally, if you want to be user-friendly, you'll provide the user with a means of setting the bpp, such as a menu option or a command line switch.
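As a rough sketch of the initialization advice above (init_video is an illustrative helper ; pass 0 as bpp to keep the "let SDL choose" behaviour, or a user-chosen value coming from such a menu option or command line switch) :
#include <stdio.h>
#include "SDL.h"

SDL_Surface *init_video(int width, int height, int bpp)
{
    SDL_Surface *screen;

    /* Double-buffered hardware video surface, as recommended above.
       This fails if the system or video mode has no OpenGL support. */
    screen = SDL_SetVideoMode(width, height, bpp, SDL_DOUBLEBUF | SDL_HWSURFACE);
    if (screen == NULL) {
        fprintf(stderr, "SDL_SetVideoMode failed: %s\n", SDL_GetError());
        return NULL;
    }

    printf("Running at %d bpp on a %s video surface\n",
           screen->format->BitsPerPixel,
           (screen->flags & SDL_HWSURFACE) ? "hardware" : "shadowed software");
    return screen;
}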

Use hardware surfaces

Use hardware surfaces when the pixel content is static and when the main purpose is on-screen blitting. To create a hardware surface, set the SDL_HWSURFACE flag during surface creation :
surface = SDL_CreateRGBSurface(SDL_HWSURFACE, width, height, bpp, rmask, gmask, bmask, amask);
If you have to use procedural surfaces (a.k.a. dynamic surfaces, surfaces that are modified all the time) you should consider using software surfaces instead. However, blitting software surfaces is slow, so here are ways to avoid using software surfaces :
  • if you use a software surface for a precomputed animation, you should create different hardware surfaces corresponding to the frames of the animation (see the sketch below)
  • if you're using a software surface to save the background of a mouse cursor, you should simply re-blit what's behind the cursor every frame rather than doing a full restore-save-draw cycle. Such a cycle will surely be slow.
If you're on a target that supports real hardware surfaces and you're doing stuff that isn't accelerated, VRAM is the worst place you could possibly put your source surfaces. It might actually be faster to stream the data from disk! (Really!)
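As a concrete sketch of the "one hardware surface per animation frame" idea above (split_frames, sheet, frames and frame_w/frame_h are illustrative names ; the sheet is assumed to be a true-colour software surface, and the software-to-hardware blits happen only once, at load time) :
#include "SDL.h"

/* Split a software sprite sheet into one hardware surface per frame,
   once, at load time. */
void split_frames(SDL_Surface *sheet, SDL_Surface **frames,
                  int n_frames, int frame_w, int frame_h)
{
    int i;

    for (i = 0; i < n_frames; i++) {
        SDL_Rect src;
        src.x = i * frame_w;
        src.y = 0;
        src.w = frame_w;
        src.h = frame_h;

        /* Hardware surface with the same pixel format as the sheet */
        frames[i] = SDL_CreateRGBSurface(SDL_HWSURFACE, frame_w, frame_h,
                                         sheet->format->BitsPerPixel,
                                         sheet->format->Rmask,
                                         sheet->format->Gmask,
                                         sheet->format->Bmask,
                                         sheet->format->Amask);

        /* One software -> hardware blit per frame, at load time only */
        SDL_BlitSurface(sheet, &src, frames[i], NULL);
    }
}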

Surface format

Use SDL_DisplayFormat()/SDL_DisplayFormatAlpha() on surfaces that will be blitted to screen. Using it has an impact not only on glSDL (where it will set the SDL_HWSURFACE flag), but also on other video backends :
SDL_Surface * tmp;
tmp = SDL_DisplayFormat(surface);
SDL_FreeSurface(surface);
surface = tmp;
and for surfaces with alpha :
SDL_Surface * tmp;
tmp = SDL_DisplayFormatAlpha(surface);
SDL_FreeSurface(surface);
surface = tmp;
Note : SDL_DisplayFormat*() calls might have an impact with procedural surfaces, i.e. surfaces that are constantly modified. The simplest way to go is to use 24 or 32 bpp surfaces (which glSDL is able to handle directly) although that might not be friendly with other video backends.
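For example, a small loading helper along those lines (a sketch using plain SDL_LoadBMP ; adapt it if your images come from SDL_image or from the CRM32Pro resource system) :
#include "SDL.h"

/* Load a BMP and convert it to the display format right away */
SDL_Surface *load_display_ready(const char *path)
{
    SDL_Surface *tmp, *surface;

    tmp = SDL_LoadBMP(path);
    if (tmp == NULL)
        return NULL;

    /* Use SDL_DisplayFormatAlpha() instead for surfaces with an alpha channel */
    surface = SDL_DisplayFormat(tmp);
    if (surface == NULL)
        return tmp;          /* fall back to the unconverted surface */

    SDL_FreeSurface(tmp);
    return surface;
}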

Surface properties

The following guidelines apply to choose your surface properties :
  • RLE is useless with glSDL, and causes a little slowdown, because glSDL will have to un-RLE the surface before converting it to an OpenGL texture. Even worse, if you're using software RLE surfaces, glSDL will have to un-RLE the surface at every blit.
  • Per surface alpha is almost free (indeed you won't notice the difference on most video cards, and starting with a TNT2, per surface alpha is totally free).
  • Per pixel alpha has a small cost, mostly due to the higher memory bandwidth requirements needed to handle per pixel alpha surfaces (RGBA uses 25% more memory compared to RGB).
  • Colorkey has the same cost as per pixel alpha, because glSDL uses per pixel alpha to emulate colorkeying (using only min (transparent) and max (opaque) alpha values).
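A small sketch of these points with plain SDL calls (the magenta key colour and the 50% alpha are just illustrative values ; note that SDL_RLEACCEL is deliberately not passed) :
#include "SDL.h"

void prepare_sprite(SDL_Surface *sprite)
{
    /* Colorkey without SDL_RLEACCEL : RLE only costs time under glSDL */
    SDL_SetColorKey(sprite, SDL_SRCCOLORKEY,
                    SDL_MapRGB(sprite->format, 255, 0, 255));

    /* Per-surface alpha, nearly free under glSDL : 50% translucent */
    SDL_SetAlpha(sprite, SDL_SRCALPHA, 128);
}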

Notes :
  • You most likely won't notice the performance difference between alpha and non alpha surfaces, even when using older video cards and per pixel alpha. However, if you care about usability on backends that don't accelerate alpha blits, try to avoid doing too many of these.

Locking/unlocking

Locking/unlocking has a very high cost with glSDL :
  • Avoid locking/unlocking hardware surfaces. Unlocking a hardware surface requires that the whole OpenGL texture is recreated and sent to the video card.
  • Avoid using hardware surfaces that change all the time. Touching a hardware surface requires locking/unlocking it. Replace those with multiple different surfaces if you can (for example if your changing surface is a cycling 10-frame animation, use 10 different hardware surfaces, one per frame). Using software surfaces for that is not a good idea, because modifying a software surface and blitting it every frame is about as costly as modifying a similar hardware surface and blitting it every frame.
  • Avoid reading the video surface contents. For one, its content isn't reliable precision-wise and will vary depending on the video bpp, and reading the video surface requires transferring some data over the AGP bus, which is slow (especially with an OpenGL driver using memory mapped IO, where you might have a throughput as low as 4 MB/s !). Even reading small rectangles will have a visible impact on performance, so you might want to avoid it altogether. A typical bad example is software mouse cursors that save the cursor background every frame and restore it afterwards.
  • Obviously, the cost of locking/unlocking will also rule out most real-time direct pixel access on hardware surfaces.
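A sketch of the pattern these points lead to, i.e. touch the pixels once at load time between a single lock/unlock pair, and only blit afterwards (a 32 bpp surface is assumed purely to keep the pixel access short) :
#include "SDL.h"

/* Fill a surface procedurally ONCE, at load time */
void fill_gradient_once(SDL_Surface *s)
{
    int x, y;

    if (SDL_MUSTLOCK(s))
        SDL_LockSurface(s);

    for (y = 0; y < s->h; y++) {
        Uint32 *row = (Uint32 *)((Uint8 *)s->pixels + y * s->pitch);
        for (x = 0; x < s->w; x++)
            row[x] = SDL_MapRGB(s->format, (Uint8)x, (Uint8)y, 128);
    }

    if (SDL_MUSTLOCK(s))
        SDL_UnlockSurface(s);   /* under glSDL, the texture gets rebuilt after this */
}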

Blitting

To avoid ending up uploading surfaces to video memory every frame, and thus getting a visible slow down in your program, you should :
  • Avoid blitting a hardware surface (including the video surface) to a software surface
  • Avoid blitting a software surface to a hardware surface (including the video surface)
  • Avoid blitting the video surface to another surface (including the video surface)
  • Use as few direct pixel accesses on hardware surfaces (including the video surface) as possible.
Indeed, modifying a hardware surface means this surface needs to be converted into a texture and sent to the video card after the changes are done.
Also note that glSDL uses a lazy uploading scheme, i.e. the surface is uploaded to video memory only when it is needed, so the first blit is slower than subsequent ones.
As a rule of thumb, mixing software and hardware surfaces in blits is not the way to go. Hardware/hardware blits will probably happen in video memory, and likewise, software/software blits will happen in system memory. But hardware/software blits will most likely end up moving the hardware surface to system memory first, which is a very slow operation.
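Putting it together, a frame that follows these rules might look like this sketch (background, sprites and positions are illustrative names ; everything blitted to the screen is a hardware, display-format surface, and nothing is read back from it) :
#include "SDL.h"

void render_frame(SDL_Surface *screen, SDL_Surface *background,
                  SDL_Surface **sprites, SDL_Rect *positions, int n)
{
    int i;

    /* Repaint the background instead of saving/restoring what was
       under each sprite (no reads from the video surface) */
    SDL_BlitSurface(background, NULL, screen, NULL);

    /* hardware -> hardware blits only */
    for (i = 0; i < n; i++)
        SDL_BlitSurface(sprites[i], NULL, screen, &positions[i]);

    SDL_Flip(screen);   /* swap the SDL_DOUBLEBUF video surface */
}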

Known problems

  • If the window is (partially) occluded and the application tries to read its contents, the occluded pixels will be full of zeros.
  • Some older cards may not support per pixel alpha or per surface alpha correctly. For example, the ATI 3D Rage Pro silently uses a different OpenGL blending mode than the one requested, which has a nasty visual effect. Colorkeying in glSDL uses alpha, so if your card has such alpha problems colorkeying might not work correctly either.
  • Some applications crash on exit with the nvidia closed source drivers v5x.xx for Linux. The bug is somewhere in the nvidia driver, so if it's causing you trouble, switch to another version. At the time of writing, the v4x.xx drivers work fine.

Small FAQ

  • I followed all your advice, and glSDL is damn slow !

    You are probably using software OpenGL rendering. Using glSDL on a non hardware accelerated OpenGL system is pointless, since the SDL software blitters are optimized for the 2D case, while a software OpenGL driver has to support generic 3D texturing and thus will be slower.
  • glSDL seems to use a lot of RAM (seen in 'top')

    glSDL doesn't use all that memory; it's just AGP memory that gets attached to every OpenGL process (this usually happens with the Nvidia 3D driver).
  • Surface creation/destruction is slow

    This is normal. That's the time it takes to upload the surface to the video card or to destroy the corresponding OpenGL texture. Remember that sending a surface to video memory means it must be read from system memory (you will be limited by your memory speed), go through the front side bus (where you'll be limited by your FSB speed) then be sent across the AGP bus (where you'll be limited by your AGP bus speed). An OpenGL implementation using dma transfers will somewhat lift this limitation, but you shouldn't rely on it for portability reasons.

Contact

In case you want to contact the authors :)
david@olofson.net
stephane.marchesin@wanadoo.fr