We first heard the name Matrox at the start of the graphics card craze, when the company released the Millennium and Mystique. The Canadian company later became popular with its Millennium and Marvel Gxxx series. For us, Matrox has always stood for high-quality analog graphics output compared with its competitors.
By holding to this standard, Matrox became the brand of choice for artists who spend hours in front of huge monitors in graphics programs such as Photoshop. Moreover, Matrox keeps supporting its products for years after release. For instance, you can still find BIOS updates and drivers for the Millennium and Mystique.
Matrox’s only drawbacks were weak support for popular 3D games and low frame rates. Then, a few months ago, Matrox announced its new GPU.
The new GPU is called Parhelia-512. It is the first 512-bit GPU in the world and uses 256-bit DDR memory. Matrox chose a 256-bit DDR interface to attack the most important problem in graphics cards, the memory bandwidth bottleneck. At the same clock speed, 256-bit DDR offers twice the bandwidth of 128-bit DDR.
Parhelia’s 256-bit DDR memory runs at 275 MHz and offers a memory bandwidth of 17.6 GB/s. A GeForce 4 that also runs its memory at 275 MHz but uses a 128-bit DDR interface reaches 8.8 GB/s. In addition, the sub-controllers, fast Z-buffer clear and specialized buffer memories increase the efficiency with which that bandwidth is used.
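The bandwidth figures above follow from bus width, clock and the two transfers per clock that DDR provides. Here is a minimal sketch of that arithmetic; the function name is made up for illustration, and 1 GB is taken as 10^9 bytes.

```python
def peak_bandwidth_gb_s(bus_width_bits, clock_mhz, transfers_per_clock=2):
    """Peak memory bandwidth in GB/s, with 1 GB taken as 10**9 bytes.

    transfers_per_clock=2 models DDR, which moves data on both clock edges.
    """
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = clock_mhz * 1_000_000 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second / 1e9


parhelia = peak_bandwidth_gb_s(256, 275)  # 256-bit DDR at 275 MHz
geforce4 = peak_bandwidth_gb_s(128, 275)  # 128-bit DDR at 275 MHz

print(f"Parhelia:  {parhelia:.1f} GB/s")
print(f"GeForce 4: {geforce4:.1f} GB/s")
print(f"Ratio:     {parhelia / geforce4:.1f}x")
```

These are theoretical peaks; real throughput depends on how well the memory controller keeps the bus busy.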
Parhelia is currently on sale with 128 MB of 3.3 ns RAM, although versions with 64 MB and 256 MB of DDR RAM can also be manufactured. The standard Parhelia runs at 220 MHz core and 275 MHz memory clocks; this model is called “R” (Retail). Another 128 MB version runs at 200 MHz core and 250 MHz memory clocks and is called “B” (Bulk).
As you can see in the picture below, Parhelia uses 3.3 ns DDR SGRAM chips from Infineon. I don’t know why the card ships with a 275 MHz memory clock, since 3.3 ns chips can run at up to about 300 MHz without any trouble (1000 / 3.3 ≈ 303).
Parhelia’s core is built on a 0.15 micron process and has 80 million transistors. As you can see, the die is the biggest among all GPUs and resembles a Socket 370 CPU. The heatsink on the GPU is smaller than those on NVIDIA’s GeForce 4 Ti 4xxx series. When people saw the first Parhelia prototypes, many thought the GPU would run really hot; on the contrary, Parhelia has no thermal problems (at least while operating at 220 MHz).
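The 1000 / 3.3 conversion can be written out as a small sketch. The function name and the headroom calculation are illustrative additions, not something from Matrox or Infineon.

```python
def rated_clock_mhz(cycle_time_ns):
    """Clock frequency in MHz that corresponds to a cycle time in nanoseconds."""
    return 1000.0 / cycle_time_ns


rated = rated_clock_mhz(3.3)   # what the 3.3 ns chips are good for
shipped = 275.0                # memory clock of the retail card, in MHz
headroom_percent = (rated - shipped) / shipped * 100

print(f"Rated clock:   {rated:.0f} MHz")
print(f"Shipped clock: {shipped:.0f} MHz")
print(f"Headroom:      {headroom_percent:.1f}%")
```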
Parhelia has 4 pixel rendering pipelines, and each pipeline can process 4 textures at the same time (quad texturing). That is twice the rate of the GeForce 4. Moreover, its four programmable 128-bit vertex shader units support Vertex Shader v2.0, which will arrive with DirectX 9.0. These four independent units offer great processing power thanks to 256-bit registers, a 512-instruction cache and optimized management. Parhelia’s vertex shader unit can reach 10 M tri/s in complex, 90-instruction vertex shader operations, and up to 100 M vert/s in simple ones.
Besides a powerful vertex shader unit, Parhelia has a pixel shader unit designed to support Pixel Shader v1.3, so it fully supports DirectX 8.1. Vertex and pixel shader operations can be emulated by some drivers, but then you have to bear serious performance losses. Although Parhelia’s v1.3 support is satisfactory at the moment, I think Pixel Shader v2.0 would have been much better, since DirectX 9.0 requires it. For now, Parhelia’s support sits somewhere between DX8 and DX9.
Parhelia’s drivers install easily under WinXP; however, the main menu used to make adjustments can only be opened if the Microsoft .NET Framework is installed. In other words, you have to install the .NET Framework (about 20 MB) to see the menu below. You can get the setup file from the bundled CD or download it from Windows Update. The interface is professional and easy to use, and all of Parhelia’s settings can be adjusted from it. The multi-display and desktop management features in particular are greatly improved. If you use multiple displays frequently, you will really like Parhelia.
Parhelia also offers AGP 8x, but we will only be able to test this feature later. It brings brand new features as well: displacement mapping, 10-bit GigaColor, 16X Fragment Antialiasing, Glyph Text Antialiasing, Triple Head Surround Gaming and 64 Supersample texture filtering.
Hardware Displacement Mapping
Parhelia is the first card to offer an “Internal Displacement Mapping” unit. In brief, this feature generates the Z-axis data of 3D models from the grayscale tones of a bitmap. The idea is not new; it is an old technique that has been used for years to store and build images. What Matrox does is support and accelerate it in hardware. The picture below should make this clearer.
Parhelia first takes the 3D shape and increases its surface count using a method called “adaptive tessellation”. Then, referring to the displacement map, it builds the Z-axis data and, after texturing, produces fast, high-quality results. Parhelia also uses a method called “Depth-Adaptive Tessellation” to handle the high polygon counts of 3D scenes built this way: the surface count of objects far from the camera is decreased, while that of closer objects is increased. Since objects far from the camera cover only 3-5 pixels, it makes no sense for them to have high surface counts. With this method you can drastically reduce the number of surfaces to be rendered without decreasing quality. These two techniques are not only useful for background objects. After creating a character and its motions, you can make multiple versions of the character just by changing the displacement map.
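The two ideas described above can be sketched in software. This is only a rough illustration and not how Parhelia’s hardware unit works: the grid layout, the height scale, the `near` distance and the rule of dropping one tessellation level each time the distance doubles are all assumptions.

```python
def displace_grid(height_map, scale=1.0):
    """Turn a grayscale bitmap (rows of 0..255 values) into (x, y, z) vertices.

    Each pixel becomes one vertex; its gray tone sets the Z coordinate.
    """
    vertices = []
    for y, row in enumerate(height_map):
        for x, gray in enumerate(row):
            vertices.append((x, y, gray / 255.0 * scale))
    return vertices


def tessellation_level(distance, max_level=4, near=10.0):
    """Pick a subdivision level from the camera distance (hypothetical rule).

    Objects closer than `near` get max_level; every doubling of the
    distance beyond that drops one level, down to 0.
    """
    level = max_level
    threshold = near
    while distance > threshold and level > 0:
        level -= 1
        threshold *= 2
    return level


def triangles_per_quad(level):
    """Each subdivision step splits a quad into 4 quads of 2 triangles each."""
    return 2 * 4 ** level


height_map = [
    [0, 64, 0],
    [64, 255, 64],
    [0, 64, 0],
]
print(displace_grid(height_map, scale=2.0)[4])  # center vertex, fully raised

for distance in (5.0, 15.0, 100.0):
    level = tessellation_level(distance)
    print(distance, level, triangles_per_quad(level))
```

Swapping in a different `height_map` changes the shape without touching the base mesh, which is the point made about character variants.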
10bit GigaColor
Today’s graphics cards use 8 bits for each of the three primary colors, plus an extra 8 bits for alpha. Eight bits (2^8) give 256 tones of red, 256 tones of green and 256 tones of blue, for a palette of 256x256x256=16,777,216 colors. The 32-bit choice in a graphics card’s menu stands for 24-bit color + 8-bit alpha.
Matrox now goes one step further and introduces GigaColor, which widens the 8-bit scale to 10 bits (2^10=1024). Each primary color now has 1024 tones, which gives a palette of 1024x1024x1024=1,073,741,824 colors, 64 times as many as before. Of course, Parhelia does not only feature 10-bit color channels; it also has a 10-bit RAMDAC and a 10-bit TV encoder.
Moreover, Matrox has developed a Photoshop plug-in (for the .tiff and .png formats) that supports 10-bit color per channel, and it is planning a plug-in for 3D Studio Max very soon. I guess 10-bit GigaColor will be welcomed mostly by graphic artists. At last it is possible to work on scanned images with high color resolution using a 10-bit palette. Photoshop itself can already work at 16-bit color depth. This feature will give you smoother color gradients when you zoom in for detailed work. In the near future we may see games that use this mode for better color and image quality.
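The palette sizes follow directly from the number of bits per channel. A short sketch of the arithmetic, with a helper name chosen only for illustration:

```python
def palette_size(bits_per_channel, channels=3):
    """Number of distinct colors for a given bit depth per color channel."""
    tones_per_channel = 2 ** bits_per_channel
    return tones_per_channel ** channels


eight_bit = palette_size(8)    # conventional 24-bit color
ten_bit = palette_size(10)     # 10-bit GigaColor

print(f"8-bit channels:  {eight_bit:,} colors")
print(f"10-bit channels: {ten_bit:,} colors")
print(f"Ratio: {ten_bit // eight_bit}x")
```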
16X Fragment Anti-Aliasing
Today, anti-aliasing (AA) is the most popular way to increase the quality of 3D images. The most common methods are supersampling AA and multisampling AA. In brief, for 4x supersampling AA the image is first rendered at 4 times its size. For instance, to get a 1024×768 image, it is first rendered at 2048×1536 and then scaled down to 1024×768. Rendering at the larger size gives you 4 pixels of data for each pixel of the result. However, this technique greatly decreases performance, since 4 times as many pixels must be rendered and resizing the image also takes time.
Parhelia offers 16x AA instead of 4x. In the conventional 16x method, the whole 1024×768 image is first rendered at a gigantic 4096×3072 by supersampling and then scaled back down. This requires a lot of RAM and drastically decreases speed. Matrox, aware of these problems, uses a different method called 16X Fragment AA. It finds the edges in the image that require antialiasing and applies 16x AA only to those parts. Generally only 5-20% of a scene requires antialiasing, and Parhelia makes good use of this fact to offer fast 16x AA.
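To get a feel for the savings, the sample counts of the methods described above can be compared. This is a deliberately simplified cost model, not Matrox’s algorithm: it assumes every edge pixel costs exactly 16 samples, every other pixel costs one, and it ignores the cost of finding the edges.

```python
def supersample_count(width, height, factor):
    """Samples rendered when the whole frame is supersampled `factor` times."""
    return width * height * factor


def fragment_aa_count(width, height, edge_percent, edge_samples=16):
    """Samples rendered when only edge pixels get `edge_samples` samples.

    edge_percent is the share of pixels that lie on edges (an integer, 0..100).
    """
    pixels = width * height
    interior = pixels * (100 - edge_percent)          # 1 sample each
    edges = pixels * edge_percent * edge_samples      # 16 samples each
    return (interior + edges) // 100


W, H = 1024, 768
print("no AA:           ", supersample_count(W, H, 1))
print("4x supersample:  ", supersample_count(W, H, 4))
print("16x supersample: ", supersample_count(W, H, 16))
print("16x fragment, 5% edges: ", fragment_aa_count(W, H, 5))
print("16x fragment, 20% edges:", fragment_aa_count(W, H, 20))
```

Under these assumptions, a scene with 20% edge pixels costs about as many samples as 4x supersampling of the whole frame, and one with 5% edge pixels costs far fewer.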
However, 16X Fragment AA also has some disadvantages. It cannot be applied in scenes where the stencil buffer is used (especially for shadows). Also, the edges that require antialiasing are sometimes not detected well. If you look at the right-hand boxes in the images above, you will see that 16x AA performs better than 4x AA. However, in the areas enclosed by the left-hand boxes, 4x AA seems to antialias more successfully. Still, 16x Fragment AA has better quality in general.
Let’s look at the speed of AA on Parhelia. We use Quake 3 Arena v1.17 demo001 as the benchmark. The three cards gave the results below at the standard “High Quality” settings and 1024×768 resolution. (You can find the hardware setup on the following pages.)
As you can see, the GeForce4 Ti4200 and Ti4600 are faster than Parhelia. In 4X AA, the Ti4600 is nearly 3 times as fast as Parhelia. However, in 16X Fragment AA Parhelia renders more frames than the Ti4200 does in 4X AA. If you consider the quality of 16X AA, the overall performance is quite satisfactory. Still, Parhelia’s 4X AA performance is lower than expected.
64 Supersample Texture Filtering
Like its competitors, Parhelia has 4 rendering pipelines. The difference is that Parhelia has 4 texture units per pipeline instead of 2. Theoretically, this means double the performance. Parhelia can process 64 samples in one clock cycle: 4 pipelines x 4 units x 4 samples. Accordingly, each pipeline can take 16 samples per clock, enough for one anisotropic filtering operation. The competitors need two clock cycles to do the same job. Let’s look at Parhelia’s anisotropic filtering performance.
As you can see in the graph below, Parhelia is half as fast as the GeForce4 Ti4600 in trilinear filtering, but the gap shrinks to 57% in anisotropic filtering. It is also about 25% slower than the Ti4200, which runs at similar clock speeds.
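The 64-sample figure is a product of three numbers, and the “two clock cycles” for the competition follows from halving the texture units. A small sketch, with names chosen for illustration and the 4 samples per unit taken from the text:

```python
import math

PIPELINES = 4          # both Parhelia and its competitors, per the text
SAMPLES_PER_UNIT = 4   # samples each texture unit takes per clock


def samples_per_clock(texture_units_per_pipeline):
    """Texture samples the whole chip can take in one clock cycle."""
    return PIPELINES * texture_units_per_pipeline * SAMPLES_PER_UNIT


parhelia = samples_per_clock(4)     # 4 texture units per pipeline
competitor = samples_per_clock(2)   # 2 texture units per pipeline

# Clock cycles each design needs for a 64-sample workload.
parhelia_clocks = math.ceil(64 / parhelia)
competitor_clocks = math.ceil(64 / competitor)

print(parhelia, competitor)
print(parhelia_clocks, competitor_clocks)
```

This is theoretical throughput only; the benchmark results in the article show that it does not translate directly into frame rate.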
However, before making any judgments, I recommend looking at the graphic below. As you can see, on the GeForce4 it makes no difference whether filtering is enabled or not. On Parhelia, by contrast, both trilinear and anisotropic filtering improve the image quality drastically. The reason is that the GeForce4 cannot use anisotropic filtering in Direct3D (D3D); NVIDIA supports it only in OpenGL applications. You can still find anisotropic filtering benchmarks of D3D applications in many places, but the results may not be reliable. Although anisotropic filtering can be enabled in D3D applications with utilities like NVMax, even NVIDIA’s latest Detonator drivers, v30.82, do not support this feature in D3D.
In OpenGL applications, on the other hand, the GeForce4 performs anisotropic filtering much better, and it does so much faster than Parhelia does in D3D.
I can’t figure out why NVIDIA does not support anisotropic filtering in D3D, since the number of D3D games is huge and probably larger than the number of OpenGL games. The lack of anisotropic filtering in D3D games is a serious drawback for NVIDIA.
Glyph Antialiasing
Parhelia also has a dedicated hardware unit for “Glyph Antialiasing”, which makes on-screen text look smoother. Similar antialiasing can be done in software without hardware support; for instance, it is available in Windows 2K and XP. However, the software methods really decrease 2D performance. Parhelia, unlike its rivals, has a dedicated unit for this task and applies the required gamma corrections while doing it.
Although Matrox denies any performance loss, I still wanted to measure the number of characters drawn on the screen per second. To do this I used the “Strings” test in PCMark2002 Pro’s Windows XP performance suite. And my suspicions turned out to be correct 🙂 As you can see in the graph below, the number of characters per second really does decrease when Glyph Antialiasing is on. Yet the rate is still satisfactory.
While WinXP draws 320K characters per second with Glyph AA off, the number falls to 194K with Glyph AA on. That is a decrease of about 39%. The performance is the same as with WinXP’s standard text antialiasing, but I can say that the quality of Glyph AA is far better than all the others. Also, WinXP’s “Clear Type” mode is very slow compared with Glyph AA. “Clear Type” is almost unusable here, since it blurs the text instead of making it more readable. In fact, “Clear Type” is designed for LCD screens. On CRT monitors it is best to use either the standard mode or Parhelia’s Glyph AA.
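A quick check of the two throughput figures shows how the percentage depends on which number is used as the base. The helper names are illustrative only.

```python
def percent_decrease(before, after):
    """How much lower `after` is, as a percentage of `before`."""
    return (before - after) / before * 100


def percent_faster(fast, slow):
    """How much higher `fast` is, as a percentage of `slow`."""
    return (fast / slow - 1) * 100


glyph_off = 320_000   # characters per second, Glyph AA off
glyph_on = 194_000    # characters per second, Glyph AA on

decrease = percent_decrease(glyph_off, glyph_on)
faster = percent_faster(glyph_off, glyph_on)

print(f"Drop when Glyph AA is switched on:   {decrease:.1f}%")
print(f"Speed advantage with Glyph AA off:   {faster:.1f}%")
```

Measured against the 320K baseline the drop is about 39%; the mid-60s figure only appears when the slower result is used as the base.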
Triple Head – Surround Gaming
After dual-head, Matrox has now developed Triple-Head. Parhelia offers 2048x1536x32 analog and 1920x1200x32 digital output thanks to its two 400 MHz RAMDACs and one 10-bit RAMDAC. A 165 MHz DVI output and a 10-bit TV encoder are also included. Parhelia’s best feature is that it can drive 3 analog monitors at a combined 3840x1024x32 resolution at the same time. Matrox calls this feature “surround gaming”. The third monitor output is added by a splitter that comes in the box. Three screens will give you a new gaming experience. Of course, not every game can be played this way, and the configuration files may require some adjustments. Still, you can use it in games built on the Quake 3 engine or in Microsoft’s Flight Simulator. In general, it is easy to add support for this mode to 3D games.
The only problem left is finding 3 monitors of the same size 🙂 I think very few people will buy two more monitors just to play games; the cost would be really high. Still, it is an attractive feature for simulator fans, graphic artists and companies that run simulation tasks.
UltraSharp Display Tech.
As I said before, Matrox is known for its 2D image quality. It is a fact that Matrox has always been one step ahead of its competitors with its analog outputs. The same holds for Parhelia, thanks to its 10-bit 400 MHz RAMDAC and the 5 filters it applies. I got the sharpest picture I have ever seen at every resolution and refresh rate that my 21” Compaq Pro110 supports.
After this brief introduction to the features, let’s look at Parhelia’s performance in benchmarks. We have already covered FSAA and anisotropic filtering performance, so we will continue with the rest. The hardware setup of the test system is as follows.
TEST SETUP | |
CPU | Pentium 4B 2.4GHz |
Memory | 4 x 128MB PC800 45ns SAMSUNG RDRAM |
Motherboard | Intel 850MV |
Video Cards | Asus V8640, 300/325MHz; Asus Ti4200, 225/250MHz; Matrox Parhelia, 220/275MHz |
Hard Drive | Maxtor 5T030H3 |
Miscellaneous | Toshiba 32x CDROM; ElanVital 300W PSU; 3Com 3C905B-TX ethernet |
Software | Windows XP build 2600; NVIDIA DetonatorXP v30.82; Parhelia Driver 1.0.4.231; DirectX 8.1 build 820 |
First we use 3D Mark 2000 for DX7 performance and 3D Mark 2001SE for DX8.1 performance. Both values are results at standard settings. As you can see, the GeForce 4 is far faster than Parhelia in both benchmarks. Although the core and RAM frequencies of Parhelia and the Ti4200 are close, the Ti4200 is 40% faster in 3D Mark 2000 and 36% faster in 3D Mark 2001SE.
In the High Polygon test of 3D Mark 2001SE, the Ti4600 draws twice as many polygons as Parhelia with one light source, and the Ti4200 is 80% faster. But with 8 light sources the gap to the Ti4600 shrinks to 26%, and Parhelia proves to be faster than the Ti4200. I think this is due to its vertex shader and T&L support.
Next is the Code Creatures benchmark, whose surface count is so high that even the GeForce4 Ti4600 slows down. While the Ti4600 draws 10 million polygons at 1024x768x32, Parhelia draws 6.7 million. The Ti4600 also has a 60% higher frame rate.
We use Village Mark v1.19 to test Parhelia’s memory controller and to see how Parhelia performs without occlusion culling (the process of skipping objects that are not visible during rendering). As you can see, although Parhelia has no occlusion culling, it is not too slow. The gap to the Ti4200 is only 22%.
All cards struggle in the Comanche 4 benchmark, which uses all DX8 features and shaders. Even the Ti4600 manages only 48.4 FPS, while Parhelia reaches 29 FPS. The difference to the Ti4200 is 65%. Parhelia was expected to do better here, since it has greater pixel processing potential and a more powerful vertex shader.
The results are easier to interpret with a more industrial benchmark, SPEC Viewperf 7.0. The Unigraphics test measures shaded, shaded-with-transparency and wireframe performance; unfortunately, Parhelia could not run it. Even the Ti4600 manages only about 3.6 frames in these tough tests. The Pro/Engineer test measures the rendering performance of a 3D model in hidden-line removal and wireframe modes. As you can see, Parhelia is about 9 times slower here. Of course this is mainly due to the lack of an “Occlusion Culling” unit. In most of the other tests the performance difference is much smaller.
SPEC Viewperf v7.0 | | | |
Test | Parhelia | Ti4200 | Ti4600 |
3ds max-01 | 5.430 | 7.361 | 8.410 |
Design Review-08 | 8.722 | 37.61 | 37.980 |
Data Explorer-07 | 23.640 | 27.150 | 27.350 |
Lightscape-05 | 6.139 | 10.380 | 10.410 |
Pro/Engineer-01 | 0.932 | 8.125 | 8.227 |
Unigraphics-01 | — | 3.045 | 3.599 |
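The table is easier to read as ratios. The sketch below recomputes how many times faster the Ti4600 is than Parhelia in each test, using only the scores listed above; the dictionary layout is just one convenient way to hold them, and the missing Unigraphics result is kept as None.

```python
# (Parhelia, Ti4200, Ti4600) scores from the SPEC Viewperf v7.0 table.
scores = {
    "3ds max-01":       (5.430, 7.361, 8.410),
    "Design Review-08": (8.722, 37.61, 37.980),
    "Data Explorer-07": (23.640, 27.150, 27.350),
    "Lightscape-05":    (6.139, 10.380, 10.410),
    "Pro/Engineer-01":  (0.932, 8.125, 8.227),
    "Unigraphics-01":   (None, 3.045, 3.599),   # Parhelia could not run it
}

ratios = {}
for test, (parhelia, ti4200, ti4600) in scores.items():
    # Skip tests without a Parhelia score instead of guessing a value.
    ratios[test] = None if parhelia is None else ti4600 / parhelia

for test, ratio in ratios.items():
    text = "n/a" if ratio is None else f"{ratio:.1f}x"
    print(f"{test:<18} Ti4600 vs Parhelia: {text}")
```

Pro/Engineer comes out at close to 9x, and Design Review at a little over 4x; the remaining tests stay below 2x.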
Finally, we look at the Quake 3 Arena performance of the three cards. Besides demo001, we use NVIDIA’s NV15 demo, which is designed with high polygon counts. In the classic demo001 the GeForce4 cards are about twice as fast as Parhelia, but in NV15 there is not much difference; the gap to the Ti4600 shrinks to 20%. When the scene becomes more complex, Parhelia’s powerful vertex shader and T&L unit make things easier.
In summary, Parhelia is good enough to satisfy the 3D needs of Matrox fans. It is great news that Matrox has entered the 3D world without giving up its analog image quality and product quality. Also, no other GPU currently offers Triple-Head, Glyph AA, Hardware Displacement Mapping, GigaColor and 16X Fragment AA as Parhelia does.
In addition to its 2D quality, Parhelia has the best 3D image quality, with the best anisotropic filtering results I have ever seen. The speed may not satisfy everyone, but considering the image quality it is highly acceptable. 3D performance will probably increase as new drivers and optimized games are released. I think current applications cannot make good use of Parhelia’s four 128-bit vertex shaders. But even a driver update alone may increase performance by 5-20%, as we have seen with other graphics cards.
Although Triple-Head requires two more monitors, Flight Simulator 2002 or NASCAR 2002 fans may find the money for the experience. GigaColor and Glyph Text AA will also attract graphic artists and desktop publishers.
Still, the market is waiting for ATI’s R300 and NVIDIA’s NV30. NVIDIA is expected to raise image quality to a higher level with NV30. I wonder how Matrox will counterattack at that point; it is possible that NV30 and R300 will overwhelm Parhelia.
All in all, I really liked Parhelia and plan to use it in my office computer. If you also spend hours at a computer with more than one monitor, then I recommend Parhelia. However, like every good product, Parhelia has a disadvantage: the price. It currently costs about 399, but that will drop in time. The bulk models, which have 64MB RAM and lower core and RAM frequencies, may also be more affordable.
Those who plan to change their graphics card should wait for the new year, but if you must buy one right now, I think you should consider Parhelia. The price is really high, but the quality and the features make it a product worthy of the “bEST fOR hARDWAREMANIACS” award.