Introduction

Several drivers are available for ATI Radeon graphics cards under Linux. This article compares them to each other, focusing mostly on 3d performance in games. Update July 23rd: Some clarifications & corrections were added.

Tested drivers

- The DRI driver: this is the driver included in XFree86. Its development is hosted at dri.sourceforge.net; for this comparison a CVS checkout from the 1st of July 2003 was used - XFree86 4.3 contains a version of this driver which is quite a bit older. It is open-source and supports all graphics cards based on the R100 and R200 chips (including the derivatives RV100, RV200, RV250 and RV280), that is, card models Radeon 7000-9200. It does not (currently) support the integrated graphics of the ATI IGP chipsets. Newer cards (based on R300 and up chips) such as the 9700 are not supported. This driver also runs on non-x86 and some non-Linux platforms. Update: For the adventurous, patches for the IGP chipsets are available (not yet in CVS, but available at the XFree86 bugzilla).
- XiG Summit Accelerated-X: strictly speaking, this is not only a driver but a commercial replacement for the XFree86 X server. For this comparison a demo version of the Desktop DX Platinum Edition, version 2.2-11, was used. According to the XiG website, it is fully functional except that it will only run for 25 minutes. XiG Summit Accelerated-X servers are also available for other platforms, but the demos are restricted to x86 with Linux or Solaris. It supports about the same cards as the dri driver (i.e. no 9500 and up cards).
- ATI's fglrx driver: this is the driver provided by ATI for XFree86. However, the driver downloadable directly from ATI is very old and doesn't work with XFree86 4.3. For this comparison the newest driver downloadable from http://www.schneider-digital.de/ was used (this driver seems to be intended for OEM customers of ATI). The version used was 2.9.13 (the first version which supports the DGA extension, so no more jerky mouse movement in QuakeIII). It only runs on Linux/x86 systems. In contrast to the two other drivers it does not support cards based on the R100 chip (or its derivatives), but it does support the newer cards based on the R300 chips. So it supports all Radeons with a model number of at least 8500 (and of course it also supports the FireGL cards, for which it is primarily intended). Update: This driver seems to be the same one that is now available from ATI's site.

System Setup

Hardware:
CPU: Athlon XP 1600
Board: Asus A7V133 1.05
Chipset: KT133A
RAM: 1GB (2x512MB) PC133-222 SDRAM
HD: 120GB Seagate Barracuda ATA V
Graphics card: HIS Excalibur Radeon 9000pro 64MB

OS:
SuSE Linux 8.1, updated to XFree86 4.3 and KDE 3.1, self-compiled kernel 2.4.21
For comparison: Windows 2000 SP4

Graphics drivers:
ATI X4.3.0-2.9.13
XiG Summit Accelerated-X DX Platinum 2.2-11
DRI cvs from 1st July 2003
(for Windows) Catalyst 3.5

Software used for benchmarking:
x11perf (included in the SuSE XFree86 4.3 package)
SpecViewperf 7.1
Quake III Arena 1.32
Return to Castle Wolfenstein 1.41
Unreal Tournament (4.51)
Unreal Tournament 2003 Demo (version 2206)
Neverwinter Nights 1.29

The dri and xig summit drivers were set to use AGP 4x mode with fast writes off (switching them on caused hard lockups). Interestingly, ATI's driver seemed to use AGP 4x mode with fast writes on. Windows used AGP 4x mode with fast writes off as well. Additionally, the "EnablePageFlip" option was used for the dri driver. All tests were run from within a normal X session (1280x1024x24 virtual screen resolution) with KDE 3.1 (including artsd) running. The Summit driver and the Windows Catalyst driver were set not to convert 32bit textures to 16bit (the dri driver doesn't do that anyway; I don't know about ATI's Linux driver; in Windows this setting is controlled with the texture quality slider). Update: Forgot to mention, all drivers were running the 1280x1024 resolution at 85Hz.
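For reference, the dri driver settings described above correspond roughly to the following XF86Config device section (a sketch, not the literal config used; "AGPMode", "AGPFastWrite" and "EnablePageFlip" are options of the radeon driver, the identifier is made up):

```
Section "Device"
    Identifier "Radeon 9000pro"
    Driver     "radeon"
    Option     "AGPMode"        "4"      # AGP 4x
    Option     "AGPFastWrite"   "off"    # fast writes on caused hard lockups
    Option     "EnablePageFlip" "true"
EndSection
```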
The dri driver also had a patch applied which basically changes the reported framebuffer size to 32 bits instead of 24. This is necessary to get Neverwinter Nights running (note that this patch is not necessary for stock XFree86 4.3, as the GLX visual matching code was different there).

Driver limitations / General issues

The dri driver does not support dual-head and 3d acceleration at the same time. If you're feeling adventurous, you should be able to get this to work with a patch, as long as your screens don't exceed 2048 pixels in one dimension. However, testing dual-head capabilities is beyond the scope of this comparison (and I don't have two monitors to test it anyway). ATI's driver should work with dual-head, though it is restricted to 2048x2048 pixels as well on the R200 based cards (and 2560x2560 on R300 based cards). I don't know if the xig summit driver suffers from the same limitation.
All three drivers were installed in parallel, which wasn't much of a problem: it is only necessary to change the symlink to the libGL.so library and either change the XF86Config file or change the symlink to the X binary. Each of the three drivers has its own kernel module (radeon.o for the dri driver, xsvc.o for xig summit, fglrx.o for the ATI driver); loading the ATI or xig kernel module will of course cause the tainted flag to be set. Regardless of that, the kernel modules all seemed to behave reasonably well; no kernel panics etc. were noted. However, changing the X server / kernel module without rebooting (terminate the X session, unload the kernel module, change the symlinks / XF86Config file, insert the appropriate kernel module and restart X) more often than not resulted in a GPU/X lockup (though the kernel was still alive).
Both the dri driver and the driver from ATI suffered from some visible "snow" in the picture when the physical resolution was 1280x1024. It didn't matter whether the 3d area was 1280x1024 or smaller, so an 800x600 QuakeIII running in a window with the physical resolution still at 1280x1024 also suffered from it. Basically it looks as if not all values could be read from the framebuffer for display (a bit similar to what happens when you overclock the graphics card's memory too much under Windows). Update: This probably gives these two drivers a (slight) unfair advantage in benchmarks.
ATI's driver and the xig summit driver failed to automatically calculate the dpi of the monitor (which is possible by using the DDC/EDID data from the monitor, which contains the physical screen size in millimeters), but it was possible to configure this manually.
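The calculation the drivers fail to do is trivial once the EDID physical size is known; a minimal sketch (the monitor dimensions in the example are made up):

```python
def dpi(pixels, size_mm):
    """Dots per inch along one axis, given the resolution in pixels
    and the physical screen size in millimeters (25.4 mm per inch)."""
    return pixels * 25.4 / size_mm

# e.g. a 1280x1024 mode on a monitor reporting 340 mm x 270 mm:
# horizontal: dpi(1280, 340) ~ 95.6, vertical: dpi(1024, 270) ~ 96.3
```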
The xig summit driver does not support some XFree86-specific extensions, most notably the VidMode extension used to change resolutions. This means that all tested games (with the exception of UT2K3, which can also use XiG extensions) could not change the resolution and always ran in a window. Update: While I didn't try it, it should be possible to get all games which use the SDL library to change resolutions and run fullscreen by recompiling libSDL (after installing the XiG drivers).

CPU load / xawtv / glthreads

Apart from 3d performance, some other quick tests were run. xawtv was used to test overlay capabilities (especially when not using the crappy XFree86 v4l module). glthreads (from mesa/xdemos) was used because people reported lockups with XFree86 drivers when using multiple 3d apps (especially on SMP systems) - glthreads creates a (specified) number of threads, each with its own rendering context and own window, just displaying a rotating cube in each window. The CPU load test is simple: it just consists of watching CPU usage while running a 3d application. This is used to determine whether the driver is well-behaved, i.e. it shouldn't eat all available CPU cycles while it's waiting for the GPU. Update: this test was improved so that a second (CPU-heavy, but running at the lowest priority) application is actually running in the background, which should give a more meaningful impression of how a driver actually behaves. Also, two GL applications (glxgears and QuakeIII, both running in a window - not very practical, I know) were run at the same time.
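The per-process CPU usage numbers below were simply read off a system monitor; the same measurement can be sketched by sampling a process's utime+stime from /proc (Linux-specific; a simplified illustration, not the tool actually used):

```python
import os
import time

def cpu_share(pid, interval=1.0):
    """Approximate share of one CPU (in percent) a process used over
    `interval` seconds, from utime+stime in /proc/<pid>/stat."""
    def ticks(p):
        with open("/proc/%d/stat" % p) as f:
            # fields after the "(comm)" part start with the state;
            # utime and stime are at offsets 11 and 12 from there
            fields = f.read().rsplit(")", 1)[1].split()
        return int(fields[11]) + int(fields[12])
    before = ticks(pid)
    time.sleep(interval)
    delta = ticks(pid) - before
    return 100.0 * delta / (os.sysconf("SC_CLK_TCK") * interval)
```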

dri driver: xawtv worked flawlessly.
However, this driver failed the glthreads test - running only a few threads worked, but increasing the number would increase the chance of a GPU (or X) lockup when, for instance, the windows were moved around. Using more than 32 threads resulted in an immediate lockup as soon as the 33rd context was about to be created.
CPU usage when running glxgears was anywhere from 3% (almost fullscreen) to 25% (default size). CPU usage when running the quake3 timedemo in a window (size 1152x864) ranged from 10% to 100%. Update: prime95 running in the background got almost all CPU time (>95%) when a fullscreen glxgears was running. When playing QuakeIII, prime95 got about half of all CPU time, and the QuakeIII "demo four" timedemo was about 15%-20% slower than when nothing was running in the background. Running glxgears and QuakeIII at the same time resulted in a GPU / X lockup after fractions of a second (not unexpected considering the glthreads result).

ATI driver: xawtv had some problems. Whenever it was started, the monitor just went into sleep mode (!), except when it was started with -remote (this bug did not happen with an older driver). Switching to a virtual console and back fixed this problem. But that was not the only problem: the overlay also corrupted the screen content outside the tv window (the corruption increased over time). Grabdisplay worked correctly.
glthreads unfortunately didn't run correctly - there were no lockups (up to 64 threads), but nothing was drawn inside the windows at all. However, judging from the responsiveness of the system, all calculations were actually done.
CPU usage was always 100% when running a 3d application, even for the fullscreen glxgears. But don't worry too much, the Windows drivers do this too (not tested with glxgears for obvious reasons, but with quake3). Update: prime95 got a constant 14% of CPU time regardless of whether glxgears or QuakeIII was run (both demo playback and gaming). On the upside, QuakeIII got less than 10% slower compared to when nothing was running in the background. QuakeIII framerates when running glxgears simultaneously weren't that bad, however the game was choppy.

xig summit driver: xawtv didn't run very well. With the default option (-xv) the picture looked good, but it was impossible to change channels. The only option which worked half-way decently was -noxv-video; this produced no image when using overlay, but worked when using grabdisplay. Unfortunately, it was not possible to scale the picture to a size larger than 768x576 (PAL); scaling beyond that caused the picture to be mostly gray with some pink and green bars at the right side.
Glthreads worked flawlessly up to 64 threads, though of course the system wasn't very responsive...
CPU usage was interesting. Using glxgears, the CPU load was always 81%, no matter how large the glxgears window was. Running quake3 in a window showed similar behaviour; the load never dropped below 81%. Update: this driver showed an interesting behaviour. prime95 got almost all CPU time (>95%) with a fullscreen glxgears, however this dropped significantly if the glxgears window had focus and the mouse was moved inside the window. Running QuakeIII simultaneously with prime95 was also interesting: when playing back a timedemo, the playback got quite choppy (prime95 got almost all and QuakeIII almost no CPU time) unless the mouse was moved from time to time (in that case both prime95 and QuakeIII got about the same CPU time, and the timedemo score dropped about 20% compared to when nothing was running in the background). Playing QuakeIII unfortunately didn't work too well; the framerate dropped too much (and prime95 got a lot of CPU time). Running glxgears and QuakeIII simultaneously worked better; while the QuakeIII framerate was noticeably slower, the game remained quite smooth. Obviously, this driver tries to be fair to other running processes and does its own scheduling, which could give it an advantage in some cases (for example, multiple OpenGL applications) but didn't work too well for this simple test case (where one OpenGL application always waits for the GPU and a number-crunching application gets the CPU time while the application/driver is waiting).

2d performance

2d performance was measured with x11perf. Sorry, no graph - a bar chart with over 350x3 entries just isn't fun, and a mathematical average doesn't really do justice to these synthetic benchmarks. However, you can view all results - the first row shows repetitions/s for the dri driver, the second row the performance of the ATI driver relative to the dri driver (so a score of 3 means 3 times as fast, 0.1 means 10 times slower), and the third row the performance of the xig summit driver relative to the dri driver.
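The relative numbers in that table are plain ratios of the x11perf repetition rates; a sketch of the computation (the test names and reps/s values below are made up for illustration):

```python
def relative_scores(baseline, other):
    """Relative performance of `other` vs. `baseline`, both dicts of
    {test name: repetitions/s}. A ratio of 3 means 3 times as fast,
    0.1 means 10 times slower."""
    return {test: other[test] / reps
            for test, reps in baseline.items() if test in other}

# hypothetical numbers:
dri = {"Dot": 30000000.0, "Fill 100x100 rectangle": 320000.0}
ati = {"Dot": 3000000.0,  "Fill 100x100 rectangle": 960000.0}
print(relative_scores(dri, ati))
# {'Dot': 0.1, 'Fill 100x100 rectangle': 3.0}
```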
Just a couple of notes:
- The xig driver failed all antialiased trapezoid tests ("BadImplementation" was the error message - though x11perf didn't notice the failure). Update: according to XiG, these tests are based on features currently not completely specified (it looks like the RENDER extension isn't quite finished) and thus it's impossible to implement them correctly. And probably nobody uses them (at least currently).
- In general, the ATI driver just seems to be slow. In a lot of tests it is more than 3 times slower than the dri driver, in a few cases more than 100 times slower.
- The dri driver seems to have a slight problem with the "Copy yxy n-bit deep plane" tests - the only tests where it gets beaten by a wide margin even by ATI's driver.
- The xig driver appears to be the fastest overall. In a lot of tests it is very comparable to the dri driver, it gets beaten by significant margins in only very few tests, and it trounces the other drivers in some of the character-based tests (the "Char in x0-char aa line (Charter yy)" and "Char in x0-char rgb line (Charter yy)" tests). The much faster anti-aliased font rendering is definitely noticeable if you're using anti-aliased fonts.
