It has been nearly 18 months since Nvidia released the TNT graphics chip, and with these chips Nvidia certainly impressed the market. In OpenGL performance especially, Nvidia's chips began to challenge chips that cost 5-6 times as much. After the TNT, Nvidia continued the series with the TNT2 and TNT2 Ultra, speeding up its processor by raising the clock frequency. But the truly revolutionary change came with the release of the GeForce256 (NV10) in August of last year. Unlike the previous series, it had a strong internal Transform and Lighting (T&L) engine and a four-way rendering pipeline. (Read here if you want more detailed information.) Thanks to these differences, the GF256 could reach a very high fill rate as well as high polygon counts. Since it was the first of its kind, with an internal engine that could process 2D/3D data by itself, a new term was coined for it: GPU (Graphics Processing Unit). Meanwhile Nvidia made another announcement: it would release its next processor six months later, and that would be the standard cycle for new products. Nvidia largely kept its promise and released NV15 8 months after NV10.
GeForce2 GTS (NV15) is distinguished from NV10 mainly by its manufacturing process. Unlike NV10, NV15 is produced on a 0.18-micron process. (NV10 was 0.22 micron; the 0.18-micron process allows a higher operating frequency and naturally reduces heat output.) In operation NV15 consumes 10 W of power, with 25 million transistors inside. (The GF256 drew about twice as much power.) Let us look at what the extra 2 million transistors deliver. First of all, the T&L unit has been improved, and NV15 can now draw 25 million polygons per second. (It was 15 million in NV10.) NV15 also has a four-way pipeline, but this time each pipeline can process two textures. This means NV15 has twice the fill rate at the same frequency. If we extend the formula used for the TNT series, “Clock speed x 2 (rendering pipelines) = fill rate”, we get:
200 MHz (clock speed) x 4 (rendering pipelines) x 2 (textures per pipeline) = 1600 Mpixel
This fill rate of 1.6 Gigapixel is more than three times NV10’s 480 Mpixel, which is a seriously respectable ratio. NV15 also runs at 200 MHz, a clock frequency about 67% higher than NV10’s. Nvidia reached this speed using the advantages of 0.18-micron technology, and even higher speeds are possible. NV15 works with 166 MHz DDR RAM, which provides 5.3 GB/s of bandwidth. However, mathematically a bandwidth of 7.5 GB/s is needed to feed a processor with a 1.6 Gigapixel fill rate, so NV15 may run into trouble in the future because of its memory transfer rate.
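To make the arithmetic above easier to check, here is a minimal Python sketch of the same calculation. The function names are my own, and the 128-bit memory bus width is an assumption that the figures above do not state; with it, 166 MHz DDR works out to the quoted 5.3 GB/s. The 7.5 GB/s requirement is simply taken from the text, not derived here.

```python
def fill_rate_mpixel(core_mhz, pipelines, textures_per_pipeline):
    """Theoretical fill rate in Mpixel (clock x pipelines x textures)."""
    return core_mhz * pipelines * textures_per_pipeline


def memory_bandwidth_gb_s(mem_mhz, transfers_per_clock, bus_bits):
    """Peak memory bandwidth in GB/s, counting 1 GB as 1000 MB."""
    bytes_per_transfer = bus_bits / 8
    return mem_mhz * transfers_per_clock * bytes_per_transfer / 1000


# NV15: 200 MHz core, 4 pipelines, 2 textures per pipeline.
nv15_fill = fill_rate_mpixel(200, 4, 2)
# NV10: 120 MHz core, 4 pipelines, 1 texture per pipeline.
nv10_fill = fill_rate_mpixel(120, 4, 1)

fill_ratio = nv15_fill / nv10_fill           # how many times NV10's fill rate
clock_gain_pct = (200 - 120) / 120 * 100     # core clock increase in percent

# DDR moves data twice per clock; the 128-bit bus is an assumption.
nv15_bandwidth = memory_bandwidth_gb_s(166, 2, 128)
required_bandwidth = 7.5                     # GB/s, figure quoted in the text
bandwidth_share = nv15_bandwidth / required_bandwidth

print(f"NV15 fill rate: {nv15_fill} Mpixel, NV10: {nv10_fill} Mpixel")
print(f"Fill rate ratio: {fill_ratio:.2f}x, clock gain: {clock_gain_pct:.1f}%")
print(f"Bandwidth: {nv15_bandwidth:.2f} GB/s "
      f"({bandwidth_share:.0%} of the {required_bandwidth} GB/s quoted)")
```

Under these assumptions the memory delivers roughly 70% of the bandwidth the quoted requirement asks for, which is where the worry about the RAM transfer rate comes from.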
The abbreviation GTS at the end of the GeForce2 name stands for “Giga Texel Shader”. GigaTexel refers to the fill rate of 1.6 billion texels per second, and the Shader part is an entirely new technology. With this technique, called NSR (Nvidia Shading Rasterizer), NV15 can apply 7 separate effects such as “shadow maps, bump mapping (EMBM, Dot Product 3, and embossed), shadow volumes, volumetric explosion, elevation maps, vertex blending, waves, refraction and specular lighting” at once and with per-pixel precision. The greatest breakthrough is per-pixel rendering, which aims to improve image quality. Textures are calculated for each pixel, and the mappings become more realistic. When the hardware-supported effects above are combined and applied with per-pixel precision, NV15 produces image quality like that in the pictures below.
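To give a feel for what "per pixel" means for one of the effects listed above, here is a minimal Python sketch of Dot Product 3 bump mapping as it is generally described: each texel of a normal map stores a surface direction, and the brightness of the pixel is the dot product of that direction with the light direction. This illustrates the general technique only, not how NSR is implemented in the chip. The function names and the 8-bit RGB encoding of the normals are assumptions of mine.

```python
import math


def normalize(vector):
    """Scale a 3-component vector to unit length."""
    length = math.sqrt(sum(c * c for c in vector))
    return tuple(c / length for c in vector)


def decode_normal(rgb):
    """Map 0..255 colour channels to a unit normal (assumed encoding)."""
    return normalize(tuple(c / 127.5 - 1.0 for c in rgb))


def dot3_diffuse(normal_map, light_dir):
    """Return per-pixel brightness (0..1) for a grid of RGB normal texels."""
    light = normalize(light_dir)
    lit = []
    for row in normal_map:
        lit_row = []
        for texel in row:
            normal = decode_normal(texel)
            intensity = sum(n * l for n, l in zip(normal, light))
            lit_row.append(max(0.0, intensity))  # surfaces facing away stay dark
        lit.append(lit_row)
    return lit


# Two texels: one facing the viewer (+Z), one tilted sideways (+X).
normal_map = [[(128, 128, 255), (255, 128, 128)]]
brightness = dot3_diffuse(normal_map, light_dir=(0.0, 0.0, 1.0))
print(brightness)
```

With the light pointing straight at the surface, the first texel comes out almost fully lit and the second almost dark, even though both belong to the same flat polygon. That per-texel variation is what makes the mapping look more realistic.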
Another improvement made in pursuit of image quality is FSAA (Full Screen Anti-aliasing). The best way to understand anti-aliasing is to look at the pictures below. The frame on the left is taken from the NV15 demo without FSAA. On the right you see the same OpenGL scene with FSAA on. Nvidia chose the easiest and fastest way to implement FSAA. The image to be antialiased is first rendered at a higher resolution than the original. (The ratio can be x1, x1.5, x2, or x4.) It is then scaled back down by the same ratio, so you get antialiasing in proportion to the x-y interpolation.
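The render-large-then-shrink approach described above can be sketched in a few lines of Python. This is only an illustration of the general supersampling idea, not Nvidia's actual driver or hardware path: `render_scene` is a hypothetical stand-in that draws a hard diagonal edge, and the shrink step is a plain box average that handles whole-number ratios only (a x1.5 ratio would need a proper resampling filter).

```python
def render_scene(width, height):
    """Hypothetical renderer: 1.0 (white) above the diagonal, 0.0 elsewhere."""
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            # Compare pixel centres using integers to avoid rounding issues.
            above_diagonal = (2 * y + 1) * width < (2 * x + 1) * height
            row.append(1.0 if above_diagonal else 0.0)
        image.append(row)
    return image


def downsample(image, factor):
    """Shrink an image by averaging each factor x factor block of pixels."""
    out_height = len(image) // factor
    out_width = len(image[0]) // factor
    result = []
    for y in range(out_height):
        row = []
        for x in range(out_width):
            total = 0.0
            for dy in range(factor):
                for dx in range(factor):
                    total += image[y * factor + dy][x * factor + dx]
            row.append(total / (factor * factor))
        result.append(row)
    return result


def render_with_fsaa(width, height, factor):
    """Render at factor times the resolution per axis, then scale back down."""
    oversized = render_scene(width * factor, height * factor)
    return downsample(oversized, factor)


plain = render_scene(4, 4)
smoothed = render_with_fsaa(4, 4, 2)
for plain_row, smooth_row in zip(plain, smoothed):
    print(plain_row, smooth_row)
```

With a ratio of 2 the pixels along the diagonal come out at 0.25 instead of 0.0, which is the softened edge, at the price of rendering four times as many pixels.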
FSAA is similar to the Image Size tool in Photoshop: Nvidia resizes the image with FSAA the same way Photoshop does. Of course, it is system performance that suffers from these calculations. As you can see in the graphs below, while GeForce 2 is twice as fast as NV10 at 16 bit, 800×600 with FSAA on, this advantage shrinks considerably at 32 bit. NV15 supports FSAA in hardware, but I suspect the drivers have not matured yet, just as with the release of the Detonator series and the difference between the 3.xx and 5.xx drivers for the GeForce256. The new generation of drivers will surely bring improvements.
Although the drivers are not yet good enough, NV15 is faster than its predecessors in the classic Nvidia TreeMark demo. You can see from the graph below that it is 30-40% faster. (This OpenGL demo contains a scene of 35820 polygons and 4 light sources.) Although TreeMark, which mainly tests the T&L engine and OpenGL, is not optimised for NV15, the result is acceptable. A more complex version of TreeMark for NV15 will be released in the near future.
NV15 is about 15-20% faster in 3D Mark 2000, which runs more tests and weighs more specifications to produce its score; in no test does the lead exceed 30%. In short, this is far below what I expected, but I think it will improve in the future, since this is a reference product and the drivers are not optimised.
We face a different situation in Quake III Arena. You don’t see much difference in the standard Demo001 at 32 bit, but at 16 bit and high resolutions the gap widens rapidly. Things get somewhat odd in Nvidia’s NV15Demo. This is a special level prepared for NV15, using high polygon counts, large numbers of textures and high resolutions. The scene is impressive, but if you try to compare NV10 and NV15 you will be surprised.
The results are similar at all resolutions. As you can see above, the three cards’ results differ by only 1-2 frames, which under normal circumstances makes no sense. Maybe the processor cannot feed NV15 fast enough. Although my processor is a PIII 550E running at 733 MHz, it still causes problems. Even OpenGL does not run independently of the CPU, and DirectX is highly dependent on the CPU for some calculations. This may be the reason for these similar results.
NV15 operates at 200 MHz with 166 MHz DDR memory. Its core clock is much higher than that of the NV10, which runs at 120 MHz. You may wonder what would happen if NV10 ran at a 200 MHz core clock with 166 MHz DDR RAM. The most I could test NV10 at was 150/345 MHz, and naturally the results came very close to NV15. I am sure that if I could set it to 200 MHz, I would get nearly the same results. However, even if NV10 gave the same frame rate, its image quality would be far worse than NV15’s. With the release of new applications that support per-pixel rendering and NSR, this quality will get much better. Performance will also improve with newer drivers and DirectX 8.0. I think a new quality standard will be set with the release of DirectX 8.0. But this is only the first step, so there is no need to rush to sell your GeForce256 to buy a GeForce2 GTS. If you are about to buy a GeForce, though, I recommend the GeForce2 GTS, since both cost the same. It makes more sense to buy a GeForce2 GTS than a 64 MB GeForce256. Finally, I must thank Leadtek Turkey for supplying the GeForce2 samples.
Leadtek Turkey
WinFast GeForce 2 GTS: $349 – $399 (TV)
WinFast GeForce256 DDR 64MB: $399
PIII 550E (733Mhz O/C)
ABIT BE6-II (QY BIOS)
WinFast GeForce 2 GTS (BIOS 4.17.2000, v4.12.01.0516, 200/333mhz)
WinFast GeForce 256 DDR 64MB (BIOS 2.29.2000, v4.12.01.0516, 120/301mhz)
Asus AGP-V6800 (BIOS 2.10.02sba, v4.12.01.0516, 120/301Mhz)
Kingston KVR PC133 128Mb
IBM DJNA 341350 UDMA66 HDD
TOSHIBA SD-M1212 DVD 1r22
SB Live! (Lw3.01)
Windows 98 SE (4.10.2222A)
DirectX 7.0 (v4.07.00.716)
3D Mark 2000 (build 335)
Quake III Arena v1.16n
All tests were run at 32-bit color depth with V-sync off. Quake III Arena was tested at High settings with demo1.dm3 and Nv15demo, changing only the resolution.