Timeline / Graphics Processing Unit


    1970s 
  • Fujitsu MB14241 (1975): This was the first example of a video shifter chip, used by Taito and Midway to handle graphical tasks in early Arcade Games such as Gun Fight (1975) and Space Invaders (1978). Gun Fight was one of the first games to use sprites, while Space Invaders was the first to render up to 64 sprites on screen.
  • Motorola 6845 (1977): Not actually a GPU on its own, but an important building block of one. The 6845 was a CRT controller, a chip that generated timing signals for the display and video memory. It was originally meant for dumb terminals (and has a few features that would have only made sense on a 1970s terminal, such as light-pen support), but since it didn't impose any restrictions on memory or resolution other than the size of its counters, it was very, very tweakable. This chip ended up being the basis of all of IBM's display adapters and their follow-ons and clones, and still exists in modern VGA cards in one form or another. It was also used by a few non-IBM machines such as the Commodore CBM range, Amstrad CPC and BBC Micro.
  • Motorola 6847 (1978): Not based on the 6845, the 6847 was a "Video Display Generator" that could produce low-resolution graphics in up to 4 colors. Used in the TRS-80 Color Computer (which it may have originally been designed for), Acorn Atom and NEC PC-6001/NEC Trek. The 6847's interlaced output was designed for NTSC televisions, which for British companies like Acorn meant the inconvenience of adding a NTSC-to-PAL converter that produced graphical artifacts.
  • Namco Galaxian (1979): The Namco Galaxian GPU chipset was produced by Namco for the popular Arcade Game Galaxian (1979). It introduced and popularized the use of tile-based rendering, and it was the first graphics chipset to use full RGB color graphics, display 32 colors on screen, and render up to 64 multi-colored sprites on screen. From 1979 to 1982, the Namco Galaxian chipset was widely used in arcade games from Namco, Midway, Konami, Sega and Taito.
  • Nintendo Classic (1979): An Arcade Game video board designed by Nintendo, originally for Radar Scope in 1979, and later better known for its use in the original Donkey Kong in 1981. It introduced a large RGB color palette, able to display up to 256 colors on screen out of a total palette of up to 768 colors. It also featured basic sprite-scaling capabilities. This was utilized to full effect in Radar Scope, which featured a pseudo-3D third-person perspective, with enemies zooming in as they approach the player, and had a background with a smooth gradient effect.
  • NEC µPD7220 GDC (1979): The NEC µPD7220 GDC (Graphic Display Controller), designed by NEC, was one of the first implementations of a GPU as a single Large Scale Integration (LSI) integrated circuit chip. Ahead of its time, it became popular in the early '80s as the basis for early high-end computer graphics boards. It had its own instruction set and direct access to system memory, like a blitter, and was capable of advanced graphical features such as horizontal scrolling and integer zooming. It also featured high display resolutions of up to 1024x1024 pixels, and its color palette was large for its time, able to display 16 colors on screen out of a total palette of up to 4096 colors. The GDC was first commercially released for NEC's own PC-98 computer in 1982, before becoming available for other computer platforms such as the Epson QX-10 and IBM PC. Intel licensed the NEC µPD7220 design and called it the 82720 graphics display controller, which was the first of what would become a long line of Intel GPUs.
  • Texas Instruments 9918 (1979) and 9928 (1981): The first home GPUs to implement tile-based rendering, the 9918 and 9928 were originally designed for and introduced with TI's 99/4 home computer in 1979. It used a 16-color YPbPr palette and could handle up to 32 sprites. While the 99/4 was a flop, the 9918 and 9928 were far more successful, being used in the Colecovision and Sega SG-1000 consoles and the MSX, Sord M5 and Tomy Tutor computers; the GPUs in the Sega Master System and Sega Genesis consoles are improved versions of the 99x8. Also, the NES's PPU is an improved (but incompatible) 9918 workalike, with support for a 256-color palette (some of which are duplicates, making the usable number on NTSC somewhere around 55) and more sprites available at once (though only at one size, 8×16).

The 9918 had its own VRAM bus, and the NES in particular wired the VRAM bus to the cartridge slot, making it so that the tiles were always available for use. The NES had only 2 kilobytes of VRAM; since the tiles were usually in the game cartridge's CHR-ROM, the VRAM was only needed for tile maps and sprite maps. Other consoles (including Sega's machines and the SNES) stuck with the more traditional 9918-esque setup and required all access to the GPU to go through the main bus; however, the SNES and the Genesis can use DMA for this, something the NES didn't have.

  • Atari ANTIC & CTIA (1979) and GTIA (1981): The first programmable home-computer GPU. ANTIC was ahead of its time; it was a full microprocessor with its own instruction set and direct access to system memory, much like the blitter in the Amiga 6 years later (which, not coincidentally, was designed by the same person). By tweaking its "display list" or instruction queue, some very wild special effects were possible, including smooth animation and 3D effects. CTIA and GTIA provided color palettes up to 128 or 256 colors, respectively, a huge number for home systems at the time; the CTIA was initially able to display up to 5 colors on screen, while the GTIA was later able to display up to 16 colors on screen.

    1980s 
  • Namco Pac-Man (1980): A follow-up to the Namco Galaxian chipset, Namco introduced the Namco Pac-Man GPU chipset for the Arcade Games Pac-Man and Rally X. Consisting of two Namco NVC 293 Custom Video Shifter chips and a Namco VRAM Addresser, the Namco Pac-Man chipset introduced a palette of up to 512 colors, along with hardware support for multi-directional scrolling and a second tilemap layer.
  • MOS Video Interface Chip (1980): Originally produced by MOS for medical and scientific equipment, the chip ended up as the basis for Commodore's VIC-20 home computer after MOS couldn't find a single buyer in its originally-intended market. In terms of graphical capability it was actually pretty limited even for the era, with a resolution of only 174 by 186 pixels and no true support for bitmapped graphics (it was possible to run a pseudo-bitmap mode, but this required a RAM expansion, as the stock VIC-20 didn't have enough RAM to use this mode). However, its low cost helped make the VIC-20 a very affordable system, and marked Commodore's breakthrough into home computing.
  • IBM Monochrome Display Adapter and Color Graphics Adapter (1981): These were the GPUs offered with the original IBM PC. The MDA quickly gained a very good reputation for its crisp, high resolution text display, but lacked the ability to render graphics of any sort. For that you needed the CGA adapter, which only supported four simultaneous colors in its low-resolution mode, drawn from two freaky, unnatural-looking color palettes (one of which didn't even have a true white), and couldn't display color at all in its "high-resolution" mode. It could manage 16 colors in text mode, but the text mode was nothing to write home about either—at the same maximum resolution of 640x200 as the graphics, and with only an 8x8 character cell, text on the CGA was jagged and chunky when compared to the MDA.

Due to the lack of colors available, programmers worked out hacks that would allow the card to generate more colors than it was officially capable of. The first involved CGA's text mode. While it wasn't very good for text, with some tweaking it could be shrunk down to a pseudo-graphics mode with an effective resolution of 160x100. The best-known game of the era to use this mode was Round 42, though a lesser-known shareware Break Out clone from the '80s called Bricks used it as well, and another developer used it to create a Pac-Man clone called Paku Paku in 2012. Not all clones of the CGA card support this mode, however, and they will either show garbage on the screen or not draw the screen properly (i.e. the screen is still treated as 80x25 text mode, which means only the first 25 lines are visible) when this mode is invoked.
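
For the curious, the pixel-plotting half of the trick looks something like the sketch below: a minimal, Borland-style DOS C illustration that assumes the 6845 CRTC has already been reprogrammed to shrink each text row to two scanlines (the exact register values vary between write-ups and aren't shown here). Each 80-column text cell then stands in for two horizontal pixels, with the half-block character 0xDE letting the attribute byte's foreground and background nibbles color them independently.

    /* A minimal sketch of how the 160x100 "text mode" trick plots pixels,
       assuming the 6845 CRTC has already been reprogrammed so each text row
       is only two scanlines tall (register values vary by source and are
       not shown). Borland-style DOS C; names are illustrative. */
    #include <dos.h>

    #define COLS 80   /* 80 text columns = 160 half-block pixels across */

    void plot(int x, int y, unsigned char color)   /* x: 0-159, y: 0-99 */
    {
        unsigned char far *vram = (unsigned char far *)MK_FP(0xB800, 0);

        /* Each character cell holds two horizontal pixels. Character 0xDE
           (right half block) shows the foreground color on the right and
           the background color on the left. */
        unsigned int cell = (y * COLS + (x >> 1)) * 2;  /* char byte offset */
        unsigned char attr = vram[cell + 1];

        vram[cell] = 0xDE;                     /* half-block character */
        if (x & 1)                             /* right pixel = foreground */
            attr = (attr & 0xF0) | (color & 0x0F);
        else                                   /* left pixel = background  */
            attr = (attr & 0x0F) | ((color & 0x0F) << 4);
        vram[cell + 1] = attr;

        /* To get all 16 background colors, bit 5 of the CGA mode control
           register (port 0x3D8) must be cleared to disable blinking. */
    }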

The second trick was discovered soon after the card launched, but it only works with CGA cards hooked up to an NTSC color TV. In this mode, the program takes advantage of artifacting to generate colors: by careful use of dithering in either CGA graphics mode, the pixel patterns interfere with the NTSC color burst signal and produce extra colors. Notably, it was used by several Sierra adventure games like King's Quest. Another method, which works similarly to the Amiga's Hold-and-Modify mode, was discovered soon afterward: with precise timing, it is possible to switch palettes during the drawing of each line, allowing each individual line on the screen to have its own set of three colors. California Games from Epyx is one game known to use this trick. However, both of these methods have flaws. The former only works with NTSC TVs and relies on the CGA card having a composite video port, meaning that if you have a PAL TV or an RGB monitor instead, or if your CGA card is a clone lacking the composite video port, the trick won't work; the latter relies on the precise timing of the CPU, so if a faster CPU than the original Intel 8088 is used, the method falls flat on its face and you get a flickering mess instead. The 160x100 hack, by contrast, only works reliably on a genuine IBM CGA or a later card that faithfully emulates it, but unlike the other two hacks it doesn't require an unusual configuration (almost all CGA users had digital RGB monitors) and will work properly on an AT or another faster machine; it has also been documented to work on several upgraded XT clones. However, as mentioned before, certain clone cards (particularly those integrated into portables with LCD displays) do not support it.

CGA's shortcomings ensured that early PCs had a pretty dismal reputation for graphics work, and were a major factor in the Macintosh and the Amiga taking over that market. The CGA was superseded by the Enhanced Graphics Adapter (EGA) in 1985; while EGA had some significant improvements over the CGA, including real 16-color support in 200-line modes, it was still quite limited (64 colors maximum, not all of which were usable at once, and which required a 350-line monitor and extra memory—both of which were expensive in 1985). It wouldn't be until 1987 that the PC finally got an affordable, world-class color GPU.

On a side note: an IBM Monochrome Display Adapter and a Color Graphics Adapter can co-exist in the same PC if you work out the conflict (which is not really difficult, but it involves messing with the ANSI.SYS driver). DOS will boot into the monochrome display, but then switch to the CGA display when a program requiring graphics is run. Most development, engineering and desktop publishing programs can make use of this setup. It requires two displays, an IBM Monochrome Monitor and either an IBM RGB Monitor or a color TV, making it one of the earliest occurrences of a multi-display setup. The fact that little software supported the configuration (and some programs even refused to run if two graphics cards were found in the system) ensured that it was only used in very niche markets.

  • Namco Pole Position (1982): The Namco Pole Position video board's GPU chipset consists of the following graphics chips: three Namco 02xx Custom Shifter chips, two Namco 03xx Playfield Data Buffer chips, a Namco 04xx Address Buffer, and a Namco 09xx Sprite RAM Buffer. The video board also utilized an additional 16-bit Zilog Z8002 microprocessor dedicated to handling graphical tasks. Created for Namco's Pole Position Arcade Games, it was the most advanced GPU chipset of its time, capable of multi-directional scrolling, pseudo-3D sprite-scaling, two tilemap layers, and 3840 colors on screen.
  • MOS Video Interface Chip-II (1982): The GPU used by the massively successful Commodore 64. Despite the similar name, this was a major upgrade over the original VIC, featuring bitmapped graphics, smooth background scrolling, and full sprite support, with a feature set that very few GPUs outside of arcade machines and high-end personal computers could match. Its capabilities wouldn't be matched on the IBM-compatible side (at least, not at consumer level) until the release of the EGA standard three years later, and wouldn't be convincingly surpassed (by VGA) for another two years after that. An updated version, the VIC-IIe, was later employed in the Commodore 128, though it only differed on the graphics side by virtue of a few minor upgrades that were of little benefit to most users and even developers.
  • Hercules Graphics Card (1982): Early PC users had something of a dilemma with their GPU card choices—either the "excellent text display but no real graphics" of the MDA or the "barely adequate at both text and graphics" of the CGA. And setting up a PC with both an MDA and a CGA card installed could be a daunting task, not to mention that this setup takes up more space on your desk and leaves you with one less expansion slot in the PC. Hercules decided to Take a Third Option and produce their own graphics card which emulated the MDA's text mode, and provided a high-resolution (720x348) monochrome mode as well, making it suitable for basic graphics work. This made the Hercules a hugely popular card, and the most popular solution for PC users for about five years. Hercules also later released an add-on CGA-compatible Hercules Color Card, which included circuitry that allowed it to coexist with the HGC.

Like the CGA and EGA, the Hercules was also widely cloned by third parties, and a few high-spec EGA and VGA cards (usually the same ones that could do faithful CGA emulation) could also emulate a Hercules card.

  • IBM Video Gate Array/Tandy Graphics Adapter (1984): An upgraded version of IBM's earlier CGA adapter, introduced with their PCjr line. It offered the same resolutions as its predecessor, but with an expanded and more versatile 16-color palette, as opposed to the two 4-color palettes on CGA. The PCjr ended up bombing due to being too expensive and making too many design compromises, and IBM never used the Video Gate Array again, but the standard was co-opted by Tandy in their Tandy 1000 line of computers, which ended up being a much bigger hit than the PCjr. Despite being originally created by IBM, the standard is usually referred to as the Tandy Graphics Adapter (TGA) to avoid confusion with IBM's later Video Graphics Array card.
  • MOS Text Editing Device (1984): A low-cost chip that served as the GPU for the Commodore Plus/4 and its derivatives. In terms of capabilities it was nothing special — in fact, it actually cut out the VIC-II's sprite support, though the bitmap mode could use an impressive-for-the-time 121 colors — though where it stood out was in the fact that it integrated essentially everything except for the CPU and I/O controller, making it a very early ancestor of the modern System-On-a-Chip.
  • Nintendo Punch-Out (1984): Nintendo's second and last major arcade chipset. While their earlier chipset had included rudimentary sprite scaling support, this one gained much more advanced scaling and rotation (albeit the latter was never used by any game) support, as well as the rare-for-the-time ability to drive two separate monitors, each with access to the system's full graphical capabilities. Only three games would make use of the system — Punch-Out!!, its Updated Re-release Super Punch-Out!!, and Arm Wrestling — before Nintendo gave up on dedicated arcade systems in favor of adapting their home consoles into arcade cabinets (save for the "Ultra 64" arcade system, which bore no real resemblance to what would eventually become the Nintendo 64).
  • Sega Super Scaler (1985): Sega's Super Scaler series of Arcade Game graphics boards, and their custom GPU chipsets, were capable of the most advanced pseudo-3D sprite-scaling graphics of the era, not rivalled by any home systems up until the mid-90's. These allowed Sega to develop pseudo-3D third-person games that worked in a similar manner to textured polygonal 3D graphics, laying the foundations for the game industry's transition to 3D.

The initial Super Scaler chipset for Hang-On in 1985 was able to display 6144 colors on screen out of a 32,768 color palette, and render 128 sprites on screen. Later that year, Space Harrier improved it further, with shadow & highlight capabilities tripling the palette to 98,304 colors, and enough graphical processing power to scale thousands of sprites per second. The Sega X Board for After Burner later added rotation capabilities, and could display up to 24,576 colors on screen. The last major Super Scaler graphics board for the Sega System Multi 32 in 1992 could render up to 8192 sprites on screen, and display up to 196,608 colors on screen.

  • Yamaha V9938/V9958 AKA MSX Video (1985): An improved follow-on of the aforementioned TI 99x8, this GPU (actually Yamaha called it a VDP, or Video Display Processor) was probably the pinnacle of hybrid 2D chips that handled both tile and framebuffer graphics. Allowing the high (at the time) 512x212/424 resolution and up to 256 colors (indexed mode supported 16 out of 512 colors and the V9958 allowed up to 19268 colors in some circumstances), it also had hardware scrolling capabilities, a programmable blitter, and accelerated line draw/area fill, giving the MSX 2 computer, for which it was developed, the prettiest graphics of all 8-bit machines, superior to VGA-based PCs at the time and rivalling the Amiga. Unfortunately, it wasn't much used outside of the MSX platform and faded away with its demise in the early '90s.
  • IBM Enhanced Graphics Adapter (1985): Not wanting to be outdone by the Tandy 1000's graphics capabilities—least of all with a graphics standard that IBM themselves had created—IBM released an updated graphics standard for PC/AT class computers. The EGA easily outgunned the old CGA standard, offering nearly double the resolution and a massively expanded color range (64 colors/4 grays, though only 16 colors accessible at a time). Aside from the color improvements, EGA also added quite a few improvements for office users, such as downloadable text fonts (before this, the font burned into the card's character ROM was all you had unless you ran in graphics mode), and a 43-row mode, which let text editors and spreadsheets fit more data onto the screen, albeit with the same 8x8 character cells CGA used. Most EGA hardware was mostly backwards compatible with CGA and MDA modes; in particular, the hacked 160x100 mode mentioned above won't work on an EGA, though the EGA's proper support for 16 colors in 320x200 mode makes this easy to work around. EGA also officially dropped support for composite video out, even though two RCA jacks are present on most EGA boards; on IBM boards and "100% IBM-compatible" clones, these jacks are wired to an expansion connector on the board, meant for an optional video overlay board IBM apparently planned but never introduced. Several third-party boards, particularly ATI's EGA Wonder range, have working composite video-out (which works more like the video-out on a modern VGA, including scaling down high-resolution modes to NTSC or PAL resolution).

Third-party clones of the EGA were often quite a bit more capable than the original IBM board, adding support for the then-new "multiscan" monitors (some late boards could run at 800x600), the aforementioned "smart" video out, and in the case of boards like the Paradise Auto Switch, automatic board detection and emulation (i.e. if a program insisted on doing something that would only work on a CGA, software included with the board would notice it and switch itself to CGA emulation mode).

While EGA was a definite improvement over CGA, the competition had moved on quickly, with the EGA's color capabilities being surpassed first by the Atari ST, and then by the Amiga. EGA was also intentionally crippled by IBM at first; many early boards only came with 64k of RAM, making them useful only for monochrome or 200-line color modes. Later IBM boards and practically all clone boards come with the full complement of 256k RAM, and can do 350-line modes with no problems.

  • IBM Professional Graphics Controller (1985): It was more similar to a modern-day GPU rather than just a display adapter, supporting full 2D hardware acceleration and high display resolutions (up to 640x480 pixels, and up to 256 colors on screen out of a 4096-color palette) that wouldn't be equaled on an IBM-compatible PC until the arrival of the Hitachi HD63484 ACRTC the following year and the Super VGA standard four years later. In addition, it effectively included a full computer on the card (including an Intel 8088 CPU), making it fully programmable in a way that most other cards wouldn't be until the turn of the century. Unfortunately, there was one very big flaw — the combined cost of the card and the proprietary IBM monitor that you had to use was $7,000, almost four times the cost of the Commodore Amiga that debuted a couple of months later. You can probably guess which of the two became the must-have solution for graphics professionals.
  • Commodore Agnus and Denise (OCS)/Fat Agnus and Super Denise (ECS) (1985): The GPUs of Commodore's Amiga chipsets, designed by the late Jay Miner (who oversaw the design of the ANTIC, CTIA and GTIA of the Atari computer graphics subsystem mentioned above). Capable of anything from 320x240@p60 or 320x256@p50 up to 640x480@i60 and 640x512@i50, with up to a whopping 4096 colors via Hold-And-Modify mode, the OCS chipset was good value for money and offered high quality graphics at a relatively cheap price for its time. The Agnus (actually a main part of the main Amiga chipset, but with heavy emphasis on the Amiga's graphical capabilities) handled the blitting, i.e. compositing the graphics and pushing them onto the screen, and was also responsible for the Amiga's genlock function, while the Denise handled the actual sprite/graphics drawing.

The Denise was used in the Amiga 500, 1000, 2000 and CDTV console.

In 1990, the OCS was given a minor upgrade called ECS: Super Denise replaced the regular Denise, while Fat Agnus replaced the Agnus chip (which was retroactively renamed Thin Agnus). Super Denise allowed for 640x480@p50 or p60 output (otherwise known as VGA) as well as Super-Hi-Res 1280x200@p60, 1280x256@p50, 1280x400@i60 and 1280x512@i50 resolutions, albeit with only 4 colors. Unlike OCS (in which there were separate parts that only supported 50Hz or 60Hz resolutions), ECS supported both PAL and NTSC resolutions. It also had a mode that allowed for user-definable resolutions.

The Super Denise was used in the Amiga 500+, Amiga 600, the revised Amiga 2000, and Amiga 3000.

  • Hitachi HD63484 ACRTC (1986): The Hitachi HD63484 ACRTC (Advanced CRT Controller) was the most advanced GPU available for an IBM-compatible PC in its time. Its display resolution went up to 1024x768 pixels, and it could display up to 256 colors on screen out of a palette of 16,777,216 colors. It also had advanced features such as a blitter, hardware scrolling in horizontal and vertical directions, and integer zooming.
  • Sharp-Hudson X68000 Chipset (1987): The Sharp X68000 GPU chipset consists of four graphics chips developed by Sharp and Hudson Soft: Cynthia (sprite controller), Vinas (CRT controller), Reserve (video data selector), and VSOP (video controller). It was the most advanced GPU available on a home system in the 80's, and remained the most powerful home computer GPU well into the early 90's. Its resolution could go up to 1024x1024 pixels, and it could display up to 65,536 colors on screen. It could also render 128 sprites on screen, and display 2 tilemap layers and up to 4 bitmap layers. Because of its power, the X68000 received almost arcade-quality ports of many Arcade Games, and it even served as a development machine for Capcom's CPS arcade system in the late 80's to early 90's.
  • IBM Video Graphics Array (1987): Introduced with the IBM PS/2 line, finally offering the IBM-compatible PC graphics abilities that outshone all its immediate competitors. Many noted that it would probably have been a Killer App for the PS/2, had it not been for the Micro-Channel Architecture fiasco (more on that elsewhere). In any case, this established most of the standards that are still used for graphics cards to this day, offering a then-massive 640x480 resolution with 16 colors, or 320x200 with an also-massive 256 colors. Subsequent cards included the Super VGA and XGA cards, which extended the maximum resolutions to 800x600 and 1024x768 respectively, but they are generally considered to fall under the VGA umbrella. The VGA also had a cheaper sister card, the Multi-Color Graphics Array, which was heavily stripped down and much nearer EGA in terms of specs, but compatible with VGA monitors.
  • IBM 8514/A (1987): For all intents and purposes this was the successor of the Professional Graphics Controller, though without the absurd price tag and requirement to use a proprietary monitor. Specs-wise it was mostly equivalent to the VGA, but added in 256-color modes at 640x480 and 1024x768, though the latter mode was heavily compromised, requiring a special monitor and only running at an interlaced 43 Hz. Most importantly, the card retained the PGC's 2D hardware acceleration, and while it was still much more expensive than a VGA card, it made 2D acceleration affordable enough that graphic design apps began making widespread use of it. Since IBM's version was limited to their short-lived MCA bus, several other manufacturers began producing low-cost clones that used the ISA bus; in particular, one of ATI's first big successes was a cheaper incarnation of this card.
  • Texas Instruments TMS34010 (1987): The first fully programmable graphics accelerator chip, the TMS34010 (followed in 1988 by the TMS34020) was featured in some workstations, the "TIGA" (Texas Instruments Graphics Architecture) PC cards of the early 1990s, and all of Atari and Microprose's early 3D Arcade Games; Midway Games actually used it as a CPU in many of their arcade games.

    1990s 

S3 86C (1991):

  • Used in: S3 Vision, Trio and Aurora series
One of the most popular 2D accelerators, and the one that made them matter. Originally, graphics chips on PCs were limited to a higher-resolution text mode or a lower-resolution graphics mode; 2D acceleration, by contrast, offered a graphics mode in which any arbitrary 2D image could be manipulated in hardware. The chip also introduced blitting (to be precise, bit-blitting) to the PC graphics platform, as well as double-buffering. The first chip was the 86C911, named after the Porsche 911, and it delivered on its claims so well that by the mid-'90s every PC made had a 2D accelerator. This chipset also received many revisions, seeing an upgrade from VLB to PCI with the 86C928p, and lastly AGP via the Trio3D. The Vision series introduced built-in MPEG acceleration, shipping with a specialized version of Xing MPEG Player. The most famous and widespread of this chip family is the Trio series, with the Trio3D bearing a stripped-down version of the ViRGE 3D core. There is also a mobile version, the Aurora series, used in laptops. It's also so well documented that many PC emulators at least emulate this chipset for 2D video, and it is the base chipset from which the ViRGE and Savage3D chips' 2D cores were derived.
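
The double-buffering idea this chip family helped popularize boils down to the pattern sketched below; this is a software-only illustration (the buffer names and dummy drawing are made up), whereas the accelerator performed the final copy as a single hardware bit-blit.

    /* A software-only sketch of double buffering, which 2D accelerators
       like the 86C911 performed in hardware: draw into an off-screen
       buffer, then blit the finished frame to the visible one in a single
       fast copy. Buffer names and sizes are illustrative. */
    #include <string.h>

    #define WIDTH  640
    #define HEIGHT 480

    static unsigned char front[WIDTH * HEIGHT];  /* what the monitor shows */
    static unsigned char back[WIDTH * HEIGHT];   /* what the program draws */

    static void draw_frame(void)
    {
        /* Render into the back buffer only; the user never sees a
           half-drawn frame. */
        for (int y = 0; y < HEIGHT; y++)
            for (int x = 0; x < WIDTH; x++)
                back[y * WIDTH + x] = (unsigned char)((x ^ y) & 0xFF);
    }

    static void flip(void)
    {
        /* The "blit": copy the completed back buffer to the front buffer.
           On a real accelerator this is one hardware bit-blit command
           rather than a CPU memcpy. */
        memcpy(front, back, sizeof back);
    }

    int main(void)
    {
        draw_frame();
        flip();
        return 0;
    }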

Commodore Alice and Lisa (AGA) (1992):

The Lisa GPU replaced Super Denise, while Alice took the place of Agnus, and the chipset marks the start of Commodore's technological slip. While it offered much improved resolutions (up to 1440x560@i50 with 18-bit color via HAM8 mode, drawn from a full 16.7 million color palette), it was slow, as it only ran on a 16-bit bus; it lacked chunky graphics mode support, which slowed down software that relied on chunky pixels; and the only modes that supported progressive scan were the 240-line, 256-line and 480-line modes. While Commodore was developing several chipsets to address the shortcomings of the AGA, it went bankrupt before any of the ideas could be put into production.

This chipset was used in the Amiga 1200, 4000 and the ill-fated CD32 console.

Fujitsu TGP MB86234 (1993):

The Sega Model 2 Arcade Game system was powered by the Fujitsu TGP MB86234 chipset. This gave Sega's Model 2 the most advanced 3D graphics of its time. It was the first gaming system to feature filtered, texture-mapped polygons, and it also introduced other advanced features such as anti-aliasing as well as T&L effects such as diffuse lighting and specular reflection. With all effects turned on, the Model 2 was capable of rendering 300,000 quad polygons per second, or 600,000 triangular polygons per second. Its 3D graphical capabilities would not be surpassed by any gaming system until the release of its own arcade successor, the Sega Model 3, in 1996.

NVIDIA NV1 (1995):

One of the earliest 3D accelerators on the PC, manufactured and sold as the Diamond Edge 3D expansion card. It was actually more than just a graphics card - it also provided sound, as well as a port for a Sega Saturn controller. It was revolutionary at the time, but very difficult to program due to its use of quadrilateral rendering (which the Sega Saturn's GPU also used) instead of the triangles (polygons) used today. It was also very expensive and had poor sound quality and compatibility. It was eventually killed when Microsoft released DirectX and standardized triangle-based polygon rendering. It was supposed to be followed by the NV2, which never made it past the design stage.

S3 ViRGE (1995)

The first 3D accelerator chip to achieve any real popularity. While it could easily outmatch CPU-based renderers in simple scenes, usage of things like texture filtering and lighting effects would cause its performance to plummet to the point where it delivered far worse performance than the CPU alone would manage. This led to it being scornfully referred to as a "3D decelerator," though it did help get the market going.

Rendition Vérité V1000 (1995)

This chip included hardware geometry setup years before the GeForce. While hardware geometry setup was its marketing point against 3dfx's Voodoo chip, its performance was fairly abysmal when it came to games. It was still the only 3D accelerator supported by a few games released at the time, like Indy Car Racing II (though its original release required a patch). It was used mostly in workstation computers.

NEC-VideoLogic PowerVR (1996)

The PowerVR GPU chipset developed by NEC and VideoLogic was the most advanced PC GPU when it was revealed in early 1996, surpassing the PlayStation and approaching arcade-quality 3D graphics. It was showcased with a demo of Namco's 1995 Arcade Game Rave Racer, a port of which was running on a PowerVR-powered PC with graphics approaching the quality of the arcade original (though at half the frame rate). It would not be rivalled by any other PC graphics card until the release of the 3dfx Voodoo later in the year.

The PowerVR would be superseded by the PowerVR Series 2 in 1998.

Real3D PRO-1000 (1996)

The graphics chipset that powered the Sega Model 3 Arcade Game system, released in 1996, this was the most powerful GPU for several years. Its advanced features included trilinear filtering, trilinear mipmapping, high-specular Gouraud shading, multi-layered anti-aliasing, alpha blending, perspective texture mapping, trilinear interpolation, micro texture shading, and T&L lighting effects. With all effects turned on, it was able to render over 1 million quad polygons/sec, over 2 million triangular polygons/sec, and 60 million pixels/sec.

A cheaper consumer version called the Real3D 100 was planned for PC, and was the most powerful PC GPU when it was demonstrated in late 1996. However, after many delays, it was cancelled. Later in 1998, a cheaper stripped-down version was released as the Intel i740 (see below).

Matrox Mystique (1996)

Matrox's first venture into the 3D graphics market, notable for being bundled with a version of MechWarrior 2: 31st Century Combat that utilized the card. However, it did not perform well against the 3dfx Voodoo and lacked a number of features the competition had. Matrox tried again with the slightly more powerful Mystique 220, but to no avail. The product line was soon after labelled the "Matrox Mystake" due to its dismal performance.

3dfx SST1 and SST96 (1996):

  • Used in: Voodoo series and Voodoo Rush

The first consumer 3D accelerator to really take off. The card itself used a two-chip solution to handle compositing the image and texture-mapping the polygons. The card also featured basic texture filtering so textures wouldn't look pixelated, fog effects, and could even perform some color blending based on what was already in its frame buffer. Its world-class performance at a decent price helped ensure that it ascended to market dominance. Helping it further was that it released around the same time as Quake, which soon gained a render path using 3dfx's Glide API, creating the Killer App for the card.

However, it lacked a 2D graphics chip, meaning one had to have a separate card for essentially any non-video game task. Fortunately most users already had a 2D graphics card of some kind so this wasn't a major issue. 3dfx cleverly turned this to their advantage by saying the Voodoo wouldn't require users to throw their old graphics cards away in their marketing campaign. 3dfx did produce a single-card version called the Voodoo Rush, but used a rather cheap and nasty 2D GPU which had mediocre visual quality and actually prevented the Voodoo chipset from working optimally. As a result, they stuck to making 3D-only boards at the high-end until 1999's Voodoo 3.

SGI Reality Co-Processor (RCP) (1996)

Originally developed for the Sega Saturn, the project was scrapped due to Sega of Japan's Executive Meddling as a result of Sega of America and Sega of Japan's infighting, and at a loss to SGI to boot. When SGI asked Sega of America what it should do with the remnants of the project, Sega of America told them to shop around. Nintendo eventually answered the call and the GPU found its home in the Nintendo 64. The main standout feature of the RCP was its ability to process textures much better than its competitors, with mipmapping (a way of saving memory by using lower resolution versions of textures when a texture only covers a small portion of the screen), bilinear filtering (which avoids the "pixelated" look), and perspective correction (so textures look correct regardless of the angle they're viewed from). However, it was much more powerful than Nintendo let on, being capable of effects that other home systems either wouldn't be capable of, or wouldn't use, for years.

Fujitsu TGPx4 MB86235 (1996) and FXG-1 Pinolite MB86242 (1997):

The Fujitsu TGPx4 MB86235 was used in Sega's Model 2C CRX Arcade Game system in 1996, introducing true T&L geometry processing. It was then adapted for the PC as the FXG-1 Pinolite MB86242 geometry processor in 1997, pioneering consumer hardware support for T&L and making near arcade-quality 3D graphics possible on a PC.

Rendition soon utilized the Fujitsu FXG-1 for their Hercules Thriller Conspiracy, which was to be the first consumer GPU graphics card featuring T&L, but its release was eventually cancelled. This was years before the GeForce 256 released in 1999 and popularized T&L among PC graphics cards.

NVIDIA Riva 128 (1997):

NVIDIA's first real major foray into the 3D graphics market, and the product that helped establish NVIDIA as a major 3D chipmaker. It was one of the first GPUs to fully utilize DirectX. The Riva 128 performed fairly well, but not well enough to really challenge the 3dfx Voodoo. An improved version, the Riva 128 ZX, released the following year, had more onboard memory, a faster memory clock, and full OpenGL support. This line eventually culminated in the Riva TNT (or TwiN Texel, named for its ability to do two texture lookups at once), released in 1998 to compete with the Voodoo2.

3dfx SST2 (1998):

  • Used in: Voodoo2 series and Voodoo Banshee

The main improvement the Voodoo2 brought was adding another texture mapping unit to the graphics chipset, allowing it to blend two textures in a single pass. But what set it apart from its competition was its ability to pair up with another Voodoo2 to increase overall 3D acceleration performance. Dubbed Scan-Line Interleave, or SLI, the scheme had the cards alternate which line of the screen each rendered (hence the name), using a ribbon cable to synchronize information between the two so they'd form a coherent image. One manufacturer even made a single-slot solution by stacking one Voodoo2 on top of another. After purchasing 3dfx, NVIDIA would later repurpose the idea and the initialism, renaming it Scalable Link Interface.
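
Conceptually, Scan-Line Interleave splits the frame as in the rough sketch below; the buffers and rendering function are illustrative stand-ins, since the real cards did the interleaving in hardware over the ribbon cable.

    /* A conceptual sketch of Scan-Line Interleave: each "GPU" renders only
       every other line of the frame, and the results are interleaved into
       the final image. Real Voodoo2 SLI does this in hardware; the
       functions and buffers here are illustrative. */
    #include <string.h>

    #define WIDTH  640
    #define HEIGHT 480

    static unsigned char gpu_buf[2][WIDTH * HEIGHT / 2]; /* half a frame each */
    static unsigned char frame[WIDTH * HEIGHT];

    /* Stand-in for one card rendering a single scanline. */
    static void render_line(int gpu, int y, unsigned char *dst)
    {
        for (int x = 0; x < WIDTH; x++)
            dst[x] = (unsigned char)(gpu ? x : y);   /* dummy pixel values */
    }

    int main(void)
    {
        /* GPU 0 takes the even lines, GPU 1 the odd lines. */
        for (int y = 0; y < HEIGHT; y++) {
            int gpu = y & 1;
            render_line(gpu, y, &gpu_buf[gpu][(y / 2) * WIDTH]);
        }

        /* Interleave the two half-frames back into one coherent image. */
        for (int y = 0; y < HEIGHT; y++)
            memcpy(&frame[y * WIDTH], &gpu_buf[y & 1][(y / 2) * WIDTH], WIDTH);

        return 0;
    }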

Like the first Voodoo, 3dfx created a single card solution, derived from the Voodoo 2 and known as the Banshee. This one had the 2D and 3D chips combined into one unit and was actually somewhat successful in the OEM market; while it didn't catch on in the high-end, it provided the basis for the following year's Voodoo 3.

Intel i740 (1998):

First introduced in order to promote its new Accelerated Graphics Port (AGP) slot. It was based on the 1996 chipset Real3D 100 (see above), but was stripped-down considerably. The Intel i740's 3D graphics performance was weaker than the competition, and it was quickly shelved months later. Oddly enough its 2D performance was actually very good, albeit not up to the standard of the Matrox G200, the 2D market leader at the time. It did, however, give Intel enough experience to incorporate it into its chipsets as integrated graphics, notably the GMA series.

NEC-VideoLogic PowerVR Series 2 (1998)

The GPU that drove the Sega Dreamcast. What set it apart from the others was its use of PowerVR's tile-based rendering and the way it did pixel shading. Tile-based rendering works on only a small subset of the 3D scene at a time, which lets it get by with narrower, slower memory buses. It also sorted polygons by depth first and then colored them, rather than coloring them and figuring out the depth later. Though the Dreamcast failed, and PowerVR stopped making PC cards as well, this card is notable in that it showed PowerVR that the true strength of their tile rendering hardware was in embedded systems, which led to their Series 4 and 5 lines (see below). One notable feature it had thanks to its tile-based renderer was Order-Independent Transparency, which allows transparent objects to move in front of each other and still look realistic; DirectX 11, released in 2009, made this feature a standard.
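
The "sort the depth first, then color" approach can be sketched roughly as below; the surface list and shading function are made-up stand-ins, but the structure shows why a tile-based deferred renderer never wastes texturing work on pixels that would have been overdrawn anyway.

    /* A toy sketch of the PowerVR-style "depth first, color later" idea:
       for each small tile, work out which surface is visible at every
       pixel using depth alone, then shade each pixel exactly once. The
       surface list and shading function are illustrative stand-ins. */
    #define TILE      32
    #define SURFACES  3

    /* Depth of surface s at tile pixel (x, y); smaller = closer. Dummy math. */
    static float surface_depth(int s, int x, int y)
    {
        return (float)(s * 10 + ((x + y) & 7));
    }

    /* Color of surface s; in real hardware this is the expensive step. */
    static unsigned int shade(int s)
    {
        return 0x00202020u * (unsigned int)(s + 1);
    }

    static unsigned int tile_pixels[TILE * TILE];

    void render_tile(void)
    {
        for (int y = 0; y < TILE; y++) {
            for (int x = 0; x < TILE; x++) {
                int   winner = 0;
                float best   = surface_depth(0, x, y);

                /* Hidden-surface removal first: find the closest surface. */
                for (int s = 1; s < SURFACES; s++) {
                    float d = surface_depth(s, x, y);
                    if (d < best) { best = d; winner = s; }
                }

                /* Shade only the winner, so no texturing work is wasted on
                   pixels that would have been overdrawn anyway. */
                tile_pixels[y * TILE + x] = shade(winner);
            }
        }
    }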

ATI Rage 128 (1998)

While ATI were very successful in the OEM market for most of the 1990s, most enthusiasts didn't give them a second thought. The Rage 128 started to change things, by being the first chip to support 32-bit color in both desktop and games, along with hardware DVD decoding. On top of that, ATI offered a variant called the "All-in-Wonder" which integrated a TV tuner. The Rage 128 quickly became popular as a good all-round solution for multimedia-centric users, but its rather average gaming performance (and ATI's then-notoriously bad drivers) meant most high-end gamers looked elsewhere.

S3 Savage (1998)

S3's major attempt at breaking into the 3D graphics market. This chip introduced the forgotten MeTal API, but is mostly remembered for the now industry standard S3 Texture Compression (S3TC) algorithm that allowed very large and highly detailed textures to be rendered even with relatively little Video RAM on board. The Savage itself, however, suffered from poor chip yields and buggy drivers, and thus never really took off.
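
S3TC's best-known format (later standardized as DXT1/BC1) squeezes every 4x4 block of pixels into 8 bytes: two 16-bit RGB565 endpoint colors plus a 2-bit index per pixel choosing one of four colors interpolated between them. A minimal decode sketch, covering only the opaque-block case and with an illustrative data layout, is shown below.

    /* A minimal sketch of S3TC (DXT1/BC1) decompression. Each 4x4 block is
       8 bytes: two RGB565 endpoint colors followed by 32 bits of 2-bit
       per-pixel indices. Opaque-block case only; layout is illustrative. */
    #include <stdint.h>

    static void rgb565_to_rgb888(uint16_t c, uint8_t out[3])
    {
        out[0] = (uint8_t)(((c >> 11) & 0x1F) * 255 / 31);  /* red   */
        out[1] = (uint8_t)(((c >>  5) & 0x3F) * 255 / 63);  /* green */
        out[2] = (uint8_t)(( c        & 0x1F) * 255 / 31);  /* blue  */
    }

    /* block: 8 compressed bytes. out: 4x4 pixels, 3 bytes (RGB) each. */
    void decode_bc1_block(const uint8_t block[8], uint8_t out[16][3])
    {
        uint16_t c0 = (uint16_t)(block[0] | (block[1] << 8));
        uint16_t c1 = (uint16_t)(block[2] | (block[3] << 8));
        uint32_t idx = (uint32_t)block[4] | ((uint32_t)block[5] << 8) |
                       ((uint32_t)block[6] << 16) | ((uint32_t)block[7] << 24);

        uint8_t palette[4][3];
        rgb565_to_rgb888(c0, palette[0]);
        rgb565_to_rgb888(c1, palette[1]);
        for (int ch = 0; ch < 3; ch++) {
            /* Two interpolated colors between the endpoints. */
            palette[2][ch] = (uint8_t)((2 * palette[0][ch] + palette[1][ch]) / 3);
            palette[3][ch] = (uint8_t)((palette[0][ch] + 2 * palette[1][ch]) / 3);
        }

        for (int i = 0; i < 16; i++) {
            int sel = (idx >> (2 * i)) & 3;    /* 2-bit selector per pixel */
            out[i][0] = palette[sel][0];
            out[i][1] = palette[sel][1];
            out[i][2] = palette[sel][2];
        }
    }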

3dfx Avenger (1999)

  • Used in: Voodoo3
3dfx's first "performance" graphics card to contain a 2D graphics core, and actually more the successor to the Banshee than the Voodoo2. Despite shaky DirectX support and a lack of 32-bit color, it was still the fastest card around (the slowest version being as fast as two Voodoo2s in SLI) when it was released. However, this coincided with 3dfx making what many consider to be the mistake that sent the company down in flames: buying out board manufacturer STB and making all their Voodoo-based graphics cards in-house. Combined with ATI and Matrox doing the same and most of the other GPU makers going bust at this time, this left NVIDIA as practically the only GPU supplier for third-party board makers and allowed them to utterly dominate the market for the next four years.

Matrox G400 (1999)

The first graphics chip which could natively drive two monitors. On top of that the G400 MAX version was the most powerful graphics card around on its release, meaning you could game across two monitors with just one card, something that neither ATi nor NVIDIA came up with for several more years. Unfortunately, management wrangles meant that Matrox's subsequent product releases were botched, and nothing ever came from the G400's strong positioning.

NVIDIA Celsius (1999)

  • Used in: GeForce 256, GeForce 2, GeForce 2 MX, and GeForce 4 MX

The first graphics processing unit to be marketed with the term "GPU", and arguably the most influential design in modern 3D graphics processing. Celsius added hardware transform and lighting, also called hardware T&L. Transform involves turning the polygons from one frame of reference, the world reference, to another, typically the player camera reference; this step is necessary primarily because it tells the renderer just how much actually needs to be rendered. Lighting is figuring out what lights there are in the world and how much they affect the final pixel color; this helps, for example, add shading in real time without the need for something hacky like baking it into a texture and using it in specific places. The CPU normally handled both, but the two operations were becoming increasingly taxing for the CPU to do alone. Indeed, when the GeForce 256 first launched, NVIDIA's rival 3dfx scoffed that a fast enough CPU could keep up, and benchmarks showed that a fast CPU with a 3dfx graphics card could. The flip side was that a slow CPU, such as one found in older or budget computers, could perform just as well when coupled with a GeForce 256. Partly because 3dfx never implemented hardware T&L in their subsequent GPU designs, among other poor business decisions, they went out of business shortly after.
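
As a rough illustration of what a fixed-function T&L unit computes per vertex, the sketch below transforms a position by a world-to-camera matrix and works out a simple diffuse (Lambert) lighting term; the data layout and light model are simplified assumptions, not NVIDIA's actual pipeline.

    /* A rough sketch of per-vertex T&L work: transform each vertex by a
       world-to-camera matrix, then compute a simple diffuse lighting term.
       Simplified illustration only. */
    #include <math.h>

    typedef struct { float x, y, z; } Vec3;

    /* Column-major 4x4 matrix times a point (w assumed to be 1). */
    static Vec3 transform(const float m[16], Vec3 v)
    {
        Vec3 r;
        r.x = m[0] * v.x + m[4] * v.y + m[8]  * v.z + m[12];
        r.y = m[1] * v.x + m[5] * v.y + m[9]  * v.z + m[13];
        r.z = m[2] * v.x + m[6] * v.y + m[10] * v.z + m[14];
        return r;
    }

    static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

    static Vec3 normalize(Vec3 v)
    {
        float len = sqrtf(dot(v, v));
        Vec3 r = { v.x / len, v.y / len, v.z / len };
        return r;
    }

    /* "T&L" for one vertex: position into camera space, plus a diffuse
       intensity in [0, 1] based on how directly the surface faces the light. */
    void transform_and_light(const float world_to_camera[16],
                             Vec3 position, Vec3 normal, Vec3 light_dir,
                             Vec3 *out_position, float *out_diffuse)
    {
        *out_position = transform(world_to_camera, position);

        float n_dot_l = dot(normalize(normal), normalize(light_dir));
        *out_diffuse  = n_dot_l > 0.0f ? n_dot_l : 0.0f;   /* clamp backfacing */
    }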

The GeForce 2 GTS version of the GPU, released a year later, gave each of its four pixel pipelines two texture mapping units, doubling its predecessor's texturing throughput. Along with a bump in clock speed, this allowed it to easily defeat anything else on the market at the time. Even higher-clocked versions, the GeForce 2 Pro and Ultra, were introduced later in the year. However, easily the most influential model from this family was the GeForce 2 MX, which offered not only respectable performance and full T&L functionality, but dual-monitor support and even better multimedia functionality than the GTS and its sister chips. This made the MX a massive success and one of the best-selling video card lineups. This chip also saw use in the first Quadro card from NVIDIA, targeted at professionals. Unlike their GeForce counterparts, Quadro cards prioritize quality and accuracy over performance, meaning that they perform poorly in games but render graphics at a consistent quality, an important feature in the animation and graphic design industries. However, the only real difference is in the drivers and firmware; the base GPU is practically the same.

When the GeForce 4 lineup was launched in 2002, it also included a budget series in the form of the GeForce 4 MX. However, NVIDIA opted to use the Celsius design rather than the later Kelvin design, with some features from Kelvin ported over, such as its more efficient memory bandwidth and fill rate as well as hardware-accelerated MPEG-2 decoding. Despite many criticizing this move, the GeForce 4 MX was still a market success, though the backlash likely prompted NVIDIA to avoid doing this again (at least until the Turing GPU).

    2000s 

3dfx VSA-100 (2000)

  • Used in: Voodoo4 4500 and Voodoo5 5500
The last hurrah of 3dfx's time as a GPU designer. The most notable thing about it was that it was designed to be scalable, i.e. able to hook up multiple GPUs together much like the earlier Voodoo2 SLI. The first graphics card launched with this GPU, the Voodoo5 5500, had two of them on a single card. This almost culminated in four with the Voodoo 5 6000, but the company folded and was acquired by NVIDIA before that happened. One major factor that contributed to this demise was a sudden rise in the price of RAM chips, and the Voodoo 5 needed LOTS of it (because each GPU needed its own bank of RAM; they couldn't share it), which would have driven production costs through the roof. On a more positive note, the Voodoo 5 was the first PC GPU to support anti-aliasing, though enabling it resulted in a major performance hit. A single-GPU version called the Voodoo 4 was also released, but got steamrollered by the equally-priced, far more powerful GeForce 2 MX that came out at the same time.

ATi R100 (2000)

  • Used in: Radeon 7000 series
The Radeon was ATI's first real contender for a well-performing 3D graphics chip, and basically started ATI down the road to taking 3dfx's place as NVIDIA's main rival. Notably, it was the first GPU to compress all textures to make better use of memory bandwidth, and it also optimized its rendering speed by only rendering pixels that were visible to the player. Despite being much more technologically elegant than the GeForce 2, it lacked the brute force of its rival, and only outperformed it at high resolutions where neither chip really delivered acceptable performance.

ATi Flipper (2001)

This was the GPU for the Nintendo GameCube. It started life at a company called ArtX, spun off from the team at SGI that did work on the Nintendo 64's Reality Co-Processor. ATi later bought them out, seeing that what they were working on was promising. While looking similar to other GPUs at the time, including a vertex unit, pixel pipelines, and a texture mapping unit, it was a little ahead of its time when released. The texturing stage of the hardware also had something called the Texture Environment Unit, or TEV, a programmable color blender. This allowed the Flipper to pull off rather more advanced techniques than its architecture would suggest. ATi would later have the team work on its next-generation GPU, the R300. The GPU design itself was later upgraded for the Nintendo Wii.

NVIDIA Kelvin (2001)

  • Used in: GeForce 3 series, GeForce 4 Ti series, and the Xbox

The first GPU to comply with the DirectX 8 standard, which required fully programmable shaders. It also included an updated memory subsystem to reduce overdraw, compress buffers, and improve VRAM management, as well as more efficient anti-aliasing routines that made use of edge sampling and blur filters to make the performance loss a little less crippling.

The first product to use this GPU was the GeForce 3 series. There was no real budget version of this family, as the GeForce 2 MX continued to fill that role; the later-released Ti200 model eventually served this market segment as a budget midrange card. As it was merely clocked lower than the original model but retained all its functionality, it proved very popular among enthusiasts. An even faster version, the Ti500, was released at the same time.

Later, an upgraded version of the GPU found its way into the Xbox, serving as a sort of preview of the upcoming GeForce 4 Ti lineup. The upgraded version included further updates to the memory subsystem and vertex shader, plus a hardware anti-aliasing unit. The multi-monitor feature from the GeForce 2 MX lineup was also brought over.

NVIDIA nForce (2001)

While not exactly a GPU, the nForce was a brand of motherboard chipsets, with one of NVIDIA's GPUs integrated in it. This meant that motherboards with integrated graphics no longer had abysmal performance in 3D applications. People who wanted to build entry-level computers with some graphical capabilities could now do it without a discrete graphics card. The nForce was exclusive to AMD processors until about 2005, when they started making Intel versions as well.

Eventually the nForce became a motherboard chipset without graphics, used to host NVIDIA-exclusive features such as SLI. When they brought a GPU back into the chipset, they also included a new feature known as Hybrid Power, which let the computer switch to the integrated graphics when not doing a 3D-intensive task to save power. Eventually, NVIDIA dropped out of the chipset business (Intel withdrew permission for NVIDIA to make chipsets for their CPUs, while AMD bought out rivals ATI and co-opted their chipset line, resulting in NVIDIA quickly being abandoned by AMD enthusiasts), but most of their features were licensed or revamped. SLI is now a standard feature in Intel's Z68 chipset and Hybrid Power is seen in laptops as Optimus, though rumor is that NVIDIA is making a desktop version called Synergy.

ATi R200 (2001)

  • Used in: Radeon 8000 series, budget Radeon 9000 cards
The first product by ATi which made a real stab at the performance lead, which it actually held for a few weeks before the much pricier GeForce 3 Ti500 arrived on the scene. It extended the Direct3D functionality to version 8.1, boasted better image quality and texture filtering, and improved multi-monitor support. Only a few weeks after it launched, however, it was discovered that the chip was using a driver cheat to post better benchmarks than the GeForce 3 in Quake III: Arena (ATi's performance in OpenGL-based games had historically lagged quite a bit behind NVIDIA's), which turned enthusiasts against the chip and ruined any chance of it outselling the GeForce 3 Ti200. Once the initial internet backlash had calmed down, however, it did gain some popularity thanks to its reasonable price and improving drivers, and a cut-down version of the core was later released as a cheaper Radeon 9000 card.

Matrox Parhelia (2002)

Matrox's last major design. Like the G400 before it, the Parhelia included a lot of new features, including a 256-bit memory bus (which allows anti-aliasing with a much smaller performance loss), quad-texturing (which was nice in theory, but pointless in execution since games were now using pixel shaders to simulate the effect of multi-layered textures) and support for gaming across three displays. While the Parhelia should have been more than a match for the competing GeForce 4 Ti series, Matrox made a serious blunder and neglected to include any sort of memory optimization system, crippling the Parhelia to the point where it struggled to match the GeForce 3. ATi and NVIDIA's subsequent efforts would implement the Parhelia's features more competently, and Matrox quickly dropped away to being a fringe player in the market. They released a few modified versions of the Parhelia (adding in support for PCI-Express and a wide range of display connectivity) over the next decade, before announcing in 2014 that they were pulling the plug on their GPU line and would become a graphics card manufacturer instead. While they still make their own GPUs, these chips no longer emphasize 3D performance, instead putting an emphasis on multi-display output and 2D color accuracy; they are also based on the dated G400 chipset, and the main clientele are server motherboard manufacturers, where weak graphical abilities aren't important since the server is often supplemented with dedicated GPU cards. Most of the cards Matrox sells to professionals nowadays are instead based on AMD/ATI chips.

ATi R300 (2002)

  • Used in: Radeon 9000 series, Radeon X300/X600
What made the headlines about this GPU and the first video card to use it (the Radeon 9700) was that it had DirectX 9.0 support before it was officially released. And due to NVIDIA making a critical error (see below), it was a Curb-Stomp Battle against the GeForce FX in any game using DirectX 9.0. In particular, Valve made the highly anticipated Half-Life 2 the killer app for the R300 based cards, practically ensuring gamers would flock to them.

Moreover, it still offered exceptional performance in older games, thanks to ATi ditching the until-then popular multitexture designs in favor of a large array of single-texturing processors, which generally offered better performance and remains the standard approach for GPUs to this day. Sweetening the deal was a much more competent execution of the 256-bit memory bus that Matrox had attempted with the Parhelia, finally making anti-aliasing a commonly used feature. This cemented ATi as a serious competitor in graphics processors. A slightly revised version, the Radeon 9800, was released the following year, and ATi also started selling their graphics chips to third-party board makers, ending NVIDIA's monopoly on that front.

ATi Imageon (2002) / Qualcomm Adreno (2009 onwards)

One of the most popular embedded graphics chipsets to enter the market, widely used by higher-end cellphones of the era. It supported hardware-level MPEG-1/2/4 and JPEG encoding/decoding and OpenGL ES 1.1, and even had a built-in driver for image sensors, alleviating the need for a separate camera controller chip. The line was eventually sold to Qualcomm in 2009, who rebranded it as Adreno (an anagram of Radeon) and continued to improve on it; the latest version supports OpenGL ES 3.1 and 4K output to an external display via HDMI. Adreno silicon can be found in most Qualcomm ARM-based SoCs since 2009.

NVIDIA Rankine (2003)

  • Used in: GeForce FX series
NVIDIA's first major blunder since becoming the GPU market leader. A series of unfortunate events and questionable decisions practically sealed its second-place status during the time it was launched.

The first issue was that while the GPU was intended to target DirectX 9.0, it was optimized around using 16-bit values rather than the API's 24-bit minimum standard. The only way to process 24-bit values was to have the GPU process 32-bit values, which severely hampered performance. NVIDIA's hope was that programmers would optimize for the 16-bit code path, but that never came to fruition. While performance with previous DirectX-based games improved with the later FX 5900 release, the FX series could never match the R300 in DirectX 9.0 games. The second issue was that NVIDIA wanted to leverage the upcoming 130 nm process to allow for better clock speeds, efficiency, and production yields; however, TSMC, which manufactured the GPU, had issues getting the process working.

The initial video card released, the FX 5800, also introduced the idea of GPU coolers which took up a whole expansion slot all by themselves, which is now standard on anything higher than an entry-level card. Unfortunately, NVIDIA got the execution of that wrong as well, using an undersized fan which constantly ran at full speed and made the card ridiculously loud, earning the cooler the nickname of "the leaf-blower". This eventually gave way to a more reasonable cooler in the FX 5900, and some fondly remembered Self-Deprecation videos from NVIDIA. Allegedly, and in a bit of irony, the GeForce FX was developed by the team that came from 3dfx, which NVIDIA had bought a few years earlier.

XGI Volari (2004)

Notable mostly for being the last GPU which made a serious attempt to take on ATi and NVIDIA at the high end, it followed in the footsteps of the Voodoo 5 by offering two graphics chips on one board. Their top-end SKU was targeted at the GeForce FX 5900 and Radeon 9800, with the same range of Direct3D 9 functionality. Unfortunately, any chance XGI had of establishing themselves as serious competitors was dashed within a few weeks, when it ran into its own version of the driver scandal that had engulfed the Radeon 8500. With XGI's driver hacks, the chip was somewhat competitive with the FX 5900 and Radeon 9800 but had noticeably worse image quality; with those hacks disabled, performance was abysmal.

NVIDIA Curie (2004)

  • Used in: GeForce 6 series and GeForce 7 series
Learning from its mistakes, NVIDIA designed Curie for the latest DirectX 9.0c standard, ensuring that it was more than capable of meeting it. This also turned the tables on ATi for a while, as the GeForce 6800 Ultra had double the performance of the Radeon 9800 XT at the same $500 price point, while ATi's next-generation GPU targeted DirectX 9.0b, which (much like 9.0a) few developers specifically targeted. Also by this time, TSMC's 130 nm production woes were resolved, and later models of the GPU used their 110 and 90 nm processes.

This GPU also marked the return of the idea that GPUs could be linked up, much like 3dfx's SLI, this time as the Scalable Link Interface, enabled by the new PCI Express interface that displaced the existing Accelerated Graphics Port. ATi countered with CrossFire. While CrossFire offered more flexibility than SLI (which only worked with identical GPUs at the time), it was clunky to set up, as one needed to buy a master CrossFire card and a funky dongle to connect it to the other card, and it only offered a very limited set of resolutions (they subsequently implemented CrossFire much better in the following year's Radeon X1900 family).

On the other hand, setting up SLI was clunky at the time as well: it required an nForce chipset motherboard with two PCI Express slots. In a bid to plug their own chipset, NVIDIA crippled the driver to ensure that the SLI option would not be offered if the motherboard wasn't using an nForce chipset. ATi used this restriction to their marketing advantage by not crippling their drivers and claiming that CrossFire would work on any motherboard with two PCI Express slots (albeit still needing a master card and a slave card, something they later abolished as well). NVIDIA, still adamant on plugging their own chipset, responded by merely easing the restriction, allowing other chipset manufacturers to license SLI support, which resulted in poor SLI uptake in the early years compared to CrossFire. The restriction was only completely removed when NVIDIA left the chipset business at the turn of the decade.

SLI did bring about some interesting cards as well. One manufacturer released a video card with two GeForce 6600 GTs on it, acting like an all-in-one SLI solution. NVIDIA later released official designs in the form of the GeForce 7900 GX2 and 7950 GX2, which could also be used in SLI. Dual-GPU cards continued up through the GeForce GTX 690, before SLI itself died a slow death throughout the 2010s, finally ceasing to be offered in any form or fashion in 2022.

S3 Chrome (2004)

Basically S3 waking up after being sold to VIA and sleeping for four years, looking around, and realizing they've slipped to last place; they released this evolution of the Savage line in 2004 to the sound of crickets chirping. It was not very popular outside of the budget PC market. But damn, does S3 care. This was followed the next year by cards supporting Multichrome, S3's take on SLI and CrossFire, but requiring no special chipset or master and slave cards: you could take any two or more identical Chrome cards, put them in any motherboard with two or more PCI Express slots, and Multichrome would just work. While this may seem like an advantage at first, S3's cards still performed poorly in 3D compared to the ATi and NVIDIA cards of the time, and putting them into Multichrome mode barely helped. And when ATi cards gained the same ability with the Radeon X1000 series, the one advantage S3 had was completely gone.

S3 also introduced AcceleRAM, their take on NVIDIA's TurboCache and ATi's HyperMemory, which let budget cards borrow system RAM over PCI Express to supplement a small amount of on-board VRAM. TurboCache and HyperMemory hadn't gained much traction, since cards that used them were simply cost-reduced versions with less VRAM than their non-TurboCache/HyperMemory counterparts, so this didn't win S3 any favors either.

ATi Xenos (2005)

The GPU that powers the Xbox 360. The standout feature of this GPU design was the unified shader architecture. Up until this point, vertex and pixel shading work was done on separate, fixed pipelines. Unified shading allows the GPU to assign any amount of pipelines to do vertex or pixel shading work. Thus if one part of a scene needs more pixel output, the GPU allocates more pipelines to it and vice versa for vertex work. Essentially, it was able to use the same number of transistors more efficiently.
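
A crude way to see the benefit is to compare a fixed split of shading hardware against one shared pool on the same lopsided frame. The numbers below are entirely made up for illustration and don't reflect the Xenos' actual pipeline counts:

    #include <algorithm>
    #include <cstdio>

    int main() {
        // Made-up workload: a frame with light vertex work but heavy pixel work.
        const double vertex_work = 20.0, pixel_work = 100.0;   // arbitrary "work units"

        // Fixed pipelines: each pool can only chew through its own kind of work,
        // so the frame takes as long as the busier pool.
        double fixed_time = std::max(vertex_work / 16.0, pixel_work / 32.0);

        // Unified shaders: all 48 units drain the combined workload together.
        double unified_time = (vertex_work + pixel_work) / 48.0;

        std::printf("fixed 16+32 split : %.2f time units\n", fixed_time);
        std::printf("unified pool of 48: %.2f time units\n", unified_time);
    }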

NVIDIA Tesla and ATi R600 (2006)

  • Used in
    • Tesla: GeForce 8 series, GeForce 9 series, some GeForce 200 series
    • R600: Radeon HD 2000 series, Radeon HD 3000 series, Radeon HD 4000 series
The idea of the unified shader pipeline introduced with ATi's Xenos took off. However, these two graphics processors took another approach: instead of using vector units, they incorporated scalar units. This meant that instead of working on multiple pieces of the output at the same time, only one piece is processed per unit. While this sounds like a step back, scalar units are smaller, so many more can be packed into the same area as a vector unit, and they can be ganged together to act like vector units anyway. In addition, vector units tend to be hard to keep anywhere near 100% utilization, whereas doing so with scalar units is much easier.

Since the scalar units were also designed to do whatever work was needed, this led to another kind of shader, the compute shader, which handles work that doesn't make sense to do in the vertex, geometry, or pixel shader: things like physics simulations and the like. This in turn led to the rise of the General Purpose GPU (GPGPU), with both NVIDIA and ATi making graphics cards aimed purely at computational work. GPUs can now do so much computation that they've displaced CPUs for the heavy lifting in the supercomputer market, to the point where, at the time, one could build a supercomputer-class machine for around $2,000.
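
To make the idea concrete, here's a minimal CUDA sketch of the sort of non-graphics job that gets run on those unified shader cores (a simple "multiply and add" over a big array). Note how each thread handles exactly one scalar element, which is also why keeping armies of scalar units busy is so much easier than feeding wide vector units:

    #include <cuda_runtime.h>
    #include <cstdio>

    // Classic SAXPY: y = a*x + y, one scalar element per GPU thread.
    __global__ void saxpy(int n, float a, const float* x, float* y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = a * x[i] + y[i];
    }

    int main() {
        const int n = 1 << 20;
        float *x, *y;
        cudaMallocManaged(&x, n * sizeof(float));   // unified memory keeps the example short
        cudaMallocManaged(&y, n * sizeof(float));
        for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

        saxpy<<<(n + 255) / 256, 256>>>(n, 3.0f, x, y);   // launch roughly a million threads
        cudaDeviceSynchronize();

        std::printf("y[0] = %.1f (expected 5.0)\n", y[0]);
        cudaFree(x);
        cudaFree(y);
    }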

A standout card in the GeForce 8 lineup was the GeForce 8800 GT. Released out of the blue as a refresh to the GeForce 8 series, it performed almost as well as the high-end 8800 GTX while costing half as much, $250 at the time. And in a rarity for high-end cards of the era, it occupied only one expansion slot. In a one-two punch, an upgraded version was released for $300 a few weeks later. This started the trend of the "sweet spot" card, where $300 would get you really good performance. AMD even countered with the Radeon HD 3850, which had all the charms of the 8800 GT. It also started a shift in how AMD designed their GPUs, going with "smaller, scalable" cores over "big, monolithic" ones. Much of this design philosophy persists to this day.

The 8800 GT was such a good GPU design, at least in NVIDIA's eyes, that its chip was reused across three generations of graphics cards, and it remains one of the most fondly remembered video cards of its era.

Intel Larrabee

In 2008, Intel announced they would try their hand at the dedicated graphics market once more, with a radical approach. Traditionally, lighting is approximated using shading techniques run on each pixel; Intel's approach was to use ray tracing, a hugely more computationally expensive operation. Intel's hardware design was to take the old Pentium architecture, shrink it down to a modern process, modify it with graphics-related instructions, and pack many such cores onto one chip. A special version of Enemy Territory: Quake Wars was used to demonstrate it. It was axed in late 2009. However, in 2012, Intel took what they had made for Larrabee and turned it into a many-core compute accelerator to compete against NVIDIA and AMD on the GPGPU front, the result being the Xeon Phi.

The idea of real-time ray tracing wouldn't be dropped, though. NVIDIA later tried their hand at "real time" ray tracing on the GeForce GTX 480 using a proprietary API, before adding dedicated hardware in the GeForce 20 series. Several other, smaller companies also developed hardware ray-tracing acceleration cards or GPUs, before their technology was either swallowed up by the bigger players or ended up being mostly vaporware.
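
As a back-of-the-envelope illustration of why this was (and still is) such a heavy lift, count the rays: even a single primary ray per pixel plus a couple of secondary rays at 1080p and 60 fps lands in the hundreds of millions of ray-scene intersection tests per second, before any attempt at film-quality sample counts. The numbers below are purely illustrative:

    #include <cstdio>

    int main() {
        const long long width = 1920, height = 1080, fps = 60;
        const long long primary_rays_per_pixel = 1;
        const long long secondary_rays = 2;   // e.g. one shadow ray + one reflection ray
        long long rays_per_second =
            width * height * fps * primary_rays_per_pixel * (1 + secondary_rays);
        std::printf("~%lld million rays per second\n", rays_per_second / 1000000);
    }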

S3 Chrome 430 (2008)

Still struggling to stay relevant, S3 embraced the unified shader model two years after NVIDIA and ATi. Again, to the sound of crickets, and only embraced by budget card manufacturers. S3 would eventually be resold to cellphone maker HTC in 2011, but continues to make 3D chipsets for budget motherboard and video card manufacturers to this day.

ATi Evergreen (2009)

  • Used in: Radeon HD 5000 series and Radeon HD 6000 series
This GPU family finally allowed ATi to retake the GPU performance lead, having been continually behind NVIDIA to various degrees for the previous five years. More notably, however, it was the first to implement full DirectX 11 and Shader Model 5 support, and it also included gaming across three displays (Eyefinity) as a standard feature. The special Eyefinity edition cards took things further by supporting gaming across six displays in a 3x2 configuration.

As an aside, this was the last graphics card family to be released under the ATi brand, which was retired by owners AMD immediately afterwards, with all future products being released under their own name.

PowerVR Series 5 (2005-2010)

After several generations of failed attempts to re-enter the PC graphics market, PowerVR entered the embedded systems market instead. The Series 5 brought exceptional performance for mobile devices, including Apple's iPhone 4 and many Android-powered tablets and phones. The second generation, SGXMP, is arguably the first multi-core GPU: a dual-core variant powers the iPad 2, while a quad-core version powers Sony's PS Vita. The core can also be found in many off-brand ARM-based SoCs, like those from AllWinner, due to its low licensing costs and excellent documentation, and thus in many off-brand Android devices as well.

    2010s 

Intel Gen6 and AMD Northern Islands (2010)

  • Used in:
    • Gen6: Intel HD 2000 and Intel HD 3000
    • Northern Islands: Radeon HD 6000D/G and Radeon HD 7000G
After AMD acquired ATi, they announced plans to integrate GPU cores onto the CPU to help accelerate various workloads, if not act as a basic GPU for computers that didn't need a full-blown graphics card. They dubbed these types of processors "Accelerated Processing Units", or APUs.

Intel, however, beat AMD to the punch with the release of the 2nd Gen Core processors, which included the HD 2000 or HD 3000 graphics core across the lineup. They even implemented ways to use it, most famously QuickSync, which allowed video encoding/decoding to be offloaded to the GPU so the CPU was free to do other things. AMD eventually launched their APUs, but they were paired with lower-end CPU cores.

NVIDIA Fermi (2010)

  • Used in: GeForce 400 and 500 series, Quadro series 4000 through 6000

NVIDIA's first real misfire since the GeForce FX. Fermi was designed for high computational performance, anticipating the rise of GPU-assisted computing. Unfortunately, the resulting product was very late to market, six months behind the Radeon 5870, and suffered from terrible power efficiency and a poor set of initial drivers, which meant it was, at best, only equal to the 5870. Unlike the GeForce FX, Fermi actually did improve somewhat with age, as driver improvements allowed it to overtake the 5870, and the improved 500 series further increased performance while also bringing down power consumption. It also helped that GPU computing really started to take off at this point, and Fermi's design and NVIDIA's programming tools really helped it outshine the contemporary Radeons. However, it was obvious that the Fermi design was an evolutionary dead end, forcing the company to think differently for their next major release.

AMD Graphics Core Next (GCN) 1st Gen (Southern Islands) (2012)

  • Used in: AMD Radeon HD 7000G/8000 series, Radeon RX 200, Radeon RX 300

AMD's Southern Islands represents a radical change from previous architectures, based upon a new family of technology AMD calls Graphics Core Next. The strategy this time was to increase compute performance. Unlike the earlier TeraScale GPU design, which used VLIW-based shaders, the Graphics Core Next family uses RISC-based shaders. While this increases transistor requirements, RISC-based shaders help increase GPGPU performance. Also, to help with distributing workloads, scheduling on the shaders is done through what AMD later called Asynchronous Compute Engines, or ACEs. This allows the GPU to behave something like a CPU, in that shaders that are done with their work can be handed more work at any time, rather than waiting for a whole batch of tasks to finish.

Other major features include an even lower power mode for when the GPU has been idling for a long time, and support for unified memory (either through Unified Virtual Memory or the Heterogeneous System Architecture). AMD's Mantle API was also introduced later, in 2013. GCN received two updates: version 1.1, added on later chips and APUs, introduced an audio processing core called TrueAudio, while version 1.2 included incremental updates to improve efficiency. AMD intends to continue using the Graphics Core Next name and is set to carry it over into the next GPU family, Polaris.
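
As a loose, host-side analogy for what those independent work queues buy you (using CUDA streams here purely for illustration, not AMD's actual ACE hardware interface), two unrelated kernels can be queued separately, so idle execution units can pick up work from either queue instead of waiting for the other to drain:

    #include <cuda_runtime.h>

    // Two independent batches of work, each fed through its own queue (stream).
    __global__ void busy_work(float* data, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] = data[i] * 2.0f + 1.0f;
    }

    int main() {
        const int n = 1 << 20;
        float *a, *b;
        cudaMalloc(&a, n * sizeof(float));
        cudaMalloc(&b, n * sizeof(float));

        cudaStream_t s1, s2;
        cudaStreamCreate(&s1);
        cudaStreamCreate(&s2);

        // If one kernel leaves execution units idle, work queued on the other
        // stream can be scheduled onto them without waiting.
        busy_work<<<(n + 255) / 256, 256, 0, s1>>>(a, n);
        busy_work<<<(n + 255) / 256, 256, 0, s2>>>(b, n);
        cudaDeviceSynchronize();

        cudaStreamDestroy(s1);
        cudaStreamDestroy(s2);
        cudaFree(a);
        cudaFree(b);
    }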

NVIDIA Kepler (2012)

  • Used in: GeForce 600 series/700 series (except 750)/TITAN, Tegra K1, Quadro K series (except Quadro K620, K1200, K2200), NVS 510, Tesla K series

Kepler represented another revolutionary design from NVIDIA, which had realized that the previous Fermi and Tesla design philosophies were becoming inefficient in terms of power and performance. Originally, the design approach was to have a smaller number of shaders clocked at twice the base clock. Running at a faster clock speed is a huge power hog, so NVIDIA made Kepler's shaders run at the base clock speed instead. But now they had to at least double the shader count to break even on performance, which posed a problem with area on the chip. Fortunately, transistors had shrunk enough that they could pack twice the shaders into about the same area as the previous generation.

Then they went further to find things to cut. In previous designs, the instruction scheduler was in charge of seeing which threads were waiting, keeping track of what resources they needed through a complex piece of hardware. However, for graphics workloads, NVIDIA found that operations had reliably predictable execution times (e.g., a fused multiply-add operation always took 9 cycles). So instead of having the hardware keep track of when threads could run, the compiler could do this, providing hints to the hardware as to when each thread could be expected to be ready. This simplified the scheduler to only needing to keep track of which threads are in the pipeline and how many cycles they have to wait before they can run.
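
Here's a toy illustration of the concept (not NVIDIA's actual instruction encoding): because the compiler knows a dependent operation's latency in advance, it can attach a "wait this many cycles" count to each instruction, and the hardware just counts down instead of maintaining a dependency-tracking scoreboard.

    #include <cstdio>
    #include <vector>

    struct Instr {
        const char* text;
        int stall_cycles;   // computed by the compiler, not discovered by hardware
    };

    int main() {
        // d = a*b + c, then e = d*2: the second op must wait out the first's latency.
        std::vector<Instr> program = {
            { "FMA d, a, b, c", 0 },   // no prior dependency, issue immediately
            { "MUL e, d, 2.0",  9 },   // compiler knows the FMA takes ~9 cycles
        };

        int cycle = 0;
        for (const Instr& ins : program) {
            cycle += ins.stall_cycles;   // the hardware simply counts down the hint
            std::printf("cycle %2d: issue %s\n", cycle, ins.text);
            ++cycle;
        }
    }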

One other thing that was notable, though not really a huge problem, was that the GPUs meant for consumer video cards had their compute performance cut to focus more on graphics performance. It wasn't seen as a problem since consumer-facing applications rarely used the compute pipeline of the GPU. This effectively meant 64-bit floating-point math ran at 1/24th the rate of 32-bit floating-point math, whereas before it ran at 1/8th the rate.

Like previous designs (as well as AMD's GCN architecture), Kepler is designed to be modular and scalable. So much so that its implementations are found not only in the highest-performing supercomputers, but well down into the mobile market: NVIDIA claims a 192-shader Kepler GPU for mobile consumes less than 2W, yet performs as well as the Xbox 360 or PlayStation 3.

Intel Gen7.5 (2013)

  • Used in: HD Graphics 4000, HD Graphics 5000, Iris Pro Graphics 5200

This GPU represented Intel getting serious about competing in the integrated graphics market, in particular with Iris Pro Graphics. By utilizing a pool of eDRAM instead of only piggybacking on main memory, Iris Pro performs well enough to compete with even lower-end discrete graphics cards. The other area where it's supposed to help, as with the Gen6 series, is compute performance: Iris Pro allows a processor to perform about as well as higher-end processors or discrete graphics solutions while drawing roughly 30-40% less power.

Every subsequent generation of Intel's GPUs has included some form of the Iris Pro lineup, representing Intel's "top end" integrated graphics. It eventually culminated in Intel trying again in the discrete video card market with the Arc "Alchemist" video cards.

NVIDIA Maxwell (2013)

  • Used in: GeForce GTX 750 and GeForce 900 series, Quadro K620/K1200/K2200, Tegra X1, TITAN X, GRID M6/M10/M30/M40

Refining the principles of Kepler, Maxwell's aim was efficiency, designing for mobile first. Despite being on the same 28nm process as Kepler and GCN, some clever engineering allowed Maxwell to perform up to twice as well as the GTX 680 with virtually the same image quality. A much-touted feature was real-time global illumination, which NVIDIA showed off by debunking several conspiracy theories about the Moon landing (particularly about its lighting). In an odd move, NVIDIA released a lower-end version first, presumably to get something out the door as a "beta" product and make refinements later.

Though this GPU wouldn't be without its quirks and controversies.

The first quirk noticed was that the flagship video card, the GeForce GTX 980, had a 256-bit memory bus. This stood in contrast to previous generations of high-end GPUs using 384-bit or 512-bit buses, or AMD's upcoming HBM-based GPU with a 4096-bit memory bus. The only explanation offered at the time was that NVIDIA had developed lossless compression techniques for the intermediate rendering data, which they claimed saved about 20% of the bandwidth requirement. It was later found out that NVIDIA had also implemented a different type of rasterization (the step that figures out which polygons actually cover which pixels) that worked on smaller chunks of the screen at a time, so bandwidth was saved there as well. The technique wasn't new, however, having been used in PowerVR designs since their inception.
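
For the curious, here's a rough sketch of the binning idea behind that kind of tile-based rasterization (real hardware is vastly more involved; the tile size, screen size, and triangle list below are made up for illustration): triangles are first sorted into small screen tiles, then each tile is shaded in fast on-chip memory and written out to VRAM only once.

    #include <algorithm>
    #include <cstdio>
    #include <vector>

    struct Tri { float minx, miny, maxx, maxy; int id; };   // a bounding box is enough for binning

    int main() {
        const int TILE = 16, W = 64, H = 64;                 // tiny "screen" for the example
        const int tiles_x = W / TILE, tiles_y = H / TILE;
        std::vector<Tri> tris = { {2, 2, 30, 20, 0}, {40, 40, 60, 60, 1}, {10, 50, 50, 62, 2} };
        std::vector<std::vector<int>> bins(tiles_x * tiles_y);

        // Pass 1: bin each triangle into every tile its bounding box touches.
        for (const Tri& t : tris) {
            int ty0 = std::max(0, (int)t.miny / TILE), ty1 = std::min((int)t.maxy / TILE, tiles_y - 1);
            int tx0 = std::max(0, (int)t.minx / TILE), tx1 = std::min((int)t.maxx / TILE, tiles_x - 1);
            for (int ty = ty0; ty <= ty1; ++ty)
                for (int tx = tx0; tx <= tx1; ++tx)
                    bins[ty * tiles_x + tx].push_back(t.id);
        }

        // Pass 2: each non-empty tile would be rasterized and shaded entirely
        // on-chip, then flushed to memory in one go.
        for (size_t i = 0; i < bins.size(); ++i)
            if (!bins[i].empty())
                std::printf("tile %zu: %zu triangle(s)\n", i, bins[i].size());
    }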

Then, shortly after launch, a user noticed that on the GTX 970, if more than 3.5 GB of VRAM was allocated, performance would start to drop considerably. NVIDIA later explained that the GPU only has enough full-speed memory controller resources for 7 of the 8 VRAM chips on the card, and the two chips sharing a controller are what causes the slowdown. This led to a lawsuit over misleading specs, with the claim that the card effectively had only 3.5 GB of VRAM, since that was the only portion that performed at full speed. Likely because of this, NVIDIA (and, for that matter, AMD) never used this setup again.

Another minor controversy was whether or not Maxwell was compatible with DirectX 12. People noticed that performance barely differed between the DirectX 11 and DirectX 12 render paths and claimed that this was proof it wasn't compatible. Some also added that Maxwell couldn't perform graphics and compute workloads at the same time. Without going into too much detail, there wasn't much that DirectX 12 required on the hardware side; it was mostly about how applications interact with the API to potentially lower CPU overhead. The reason Maxwell didn't see any performance difference was that, unlike AMD's GCN-based GPUs, once work was scheduled on the GPU, it had to finish it before taking on more work.

AMD GCN 3rd Gen (2015)

  • Used in: Higher-end Radeon 200 series, R9 Fury series

While the GPU received incremental upgrades over the previous design, the standout was the Fiji version of the chip. Implemented in the R9 Fury series of cards, this was the first GPU design to incorporate high-bandwidth memory, or HBM. By this point, the GDDR5 memory commonly used on graphics cards was becoming increasingly power-hungry and needed huge amounts of PCB space due to the memory bus width. HBM clocks the memory slowly, but connects it over an incredibly wide bus on a substrate known as the interposer, which it shares with the GPU. This helped reduce both the area footprint and the amount of power needed. The end product wasn't quite the slam dunk that many had hoped for, often being noticeably slower than the competing GeForce 980 Ti at 1080p, though the HBM did come into its own and help it mostly draw level at 4K. Other issues arose in manufacturing as well: AMD contracted two different manufacturers for the package, and one of them didn't keep the height of all the pieces within spec, leading to cooling issues on some cards. Ultimately, HBM also proved too expensive for the consumer space and most of the professional space, leaving it to high-end professional cards, but the R9 Fury served as a useful test bed to work out the kinks of the tech.
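
The "slow but very wide" math works out roughly like this: peak bandwidth is the bus width in bytes times the effective transfer rate. The figures below are the commonly quoted ones for the Fury X's HBM and a contemporary 384-bit GDDR5 card, used purely for illustration:

    #include <cstdio>

    // Peak bandwidth in GB/s = (bus width in bits / 8) * effective giga-transfers per second.
    double peak_gb_per_s(double bus_bits, double giga_transfers) {
        return (bus_bits / 8.0) * giga_transfers;
    }

    int main() {
        std::printf("HBM  : 4096-bit @ 1.0 GT/s -> %.0f GB/s\n", peak_gb_per_s(4096, 1.0));
        std::printf("GDDR5:  384-bit @ 7.0 GT/s -> %.0f GB/s\n", peak_gb_per_s(384, 7.0));
    }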

NVIDIA Pascal (2016)

  • Used in: Tesla P100, GP100, GeForce 10 series, TITAN X with Pascal.

Launched in May 2016, Pascal represented more of an evolution of Maxwell, with a move to 16nm that amounted to a two-generation jump in manufacturing. This provided a trifecta of perks: more shaders, smaller die sizes, and faster clock speeds. It allowed the flagship GTX 1080 to perform as well as, if not better than, the GTX 980 Ti while using 75% of the power. Later, in a move almost reminiscent of the 8800 GT, NVIDIA released a refresh in the form of the GTX 1080 Ti, performing as well as the top-end Titan cards at a $700 price point compared to $1,200 for the Titans, while the GTX 1080 dropped from $650 to $500. Overall, this generation was one of the last times a decent-performing video card could be had for under $150, and as of 2024 it marks the last time NVIDIA shipped a budget video card (the roughly $80 MSRP GT 1030).

This also represented a new venture for NVIDIA: making consumer video cards themselves. Typically, NVIDIA sells GPUs to other manufacturers (called Add-In Board, or AIB, manufacturers), who design a board and buy the other components to go with it. However, NVIDIA wanted that slice of the pie, so they made their own video cards, dubbed "Founders Edition" cards. These also carried a premium over the paper MSRP (said AIB manufacturers rarely sold at MSRP anyway). Before then, NVIDIA did make video cards themselves, but only the higher-end workstation and supercomputer-oriented cards, with a few AIBs handling the lower-end offerings.

As for other improvements to the GPU, the scheduler was updated to solve Maxwell's issue with simultaneous compute and graphics workloads, allowing the GPU to re-allocate resources to finish the remaining task as quickly as possible if one finished early. Other changes targeted the emerging VR market, letting the video card render frames with as little latency as possible, in addition to rendering in projections that make sense for a VR headset.

AMD GCN 4th gen (AKA Polaris) (2016)

  • Used in: Radeon RX 400 series, Radeon Pro 400 series, Radeon RX 500 series, Radeon Pro 500 series, Radeon Pro WX series.

Launched during Computex Taipei 2016, GCN 4th Gen was announced as the next step for AMD after Southern Islands and Fiji. Starting with this family, the codenames for the GPUs themselves would be based on stars. These cards were aimed at the mainstream market, with AMD planning to serve the enthusiast and high-end market with the upcoming Vega chips instead (which, unlike the Polaris cards, would continue to use HBM).

Polaris comes in two families, Polaris 10 and Polaris 11, the former focusing on performance and the latter on low power consumption; both use the fourth generation of Graphics Core Next. Because Polaris GPUs were targeted at the entry to mid-range markets, AMD announced that it would revert to GDDR5 memory instead of sticking with still very expensive HBM. The Polaris 10 part in the Radeon RX 480 not only consumed much less power than its predecessors, but also had enough grunt to pass Valve's SteamVR benchmark. AMD further shocked the attendees of the exhibition by announcing that the RX 480 would sell for an affordable US$200, making it the cheapest card to be certified VR-Ready. AMD later announced that the 480 would be joined by its lesser siblings, the 470 and 460, the latter being the first Polaris 11 card and meant for the budget market where performance isn't a big concern. The three cards became generally available by the end of July 2016.

Apple later launched the 2016 MacBook Pro with Polaris 11 graphics chips inside; these Apple-customized parts are called the Radeon Pro 450, 455 and 460.

In mid-April 2017, on the anniversary of the launch of the Polaris series of silicon, AMD launched the RX 500 series, with a new addition to the family: Polaris 12. Additionally, Polaris 10 and 11 were rebranded to Polaris 20 and 21, along with minor improvements to the silicon, to indicate that these were second-generation Polaris chips. Polaris 20 would be used in the RX 580 and 570, Polaris 21 in the RX 560, and the new Polaris 12 in the RX 550. A little while later, the Radeon Pro 500 line was announced at Apple's WWDC expo.

AMD GCN 5th gen (AKA Vega) (2017)

  • Used in: Radeon Instinct MI25, Vega 56/64, Radeon VII, Radeon Pro Vega II and Radeon Pro Vega II Duo cards for the Mac Pro, Ryzen and Ryzen Embedded APUs with Vega graphics, Intel Core G-series CPUs

Unlike Polaris, the first of the Vega cards had a quiet launch on February 12th, 2017, with AMD announcing the Radeon Instinct line of GPGPU cards, meant to take on NVIDIA's professional-grade Pascal cards, nowhere but on its website. Unlike Polaris, Vega uses HBM2 memory, and is thus more of a descendant of the Fiji GPU. Like Polaris, though, the design is notably energy-efficient in this role, with the Instinct cards consuming a little less than 300W and being passively cooled. In this scenario, however, the cards are used for computation and AI research, and thus lack any video outputs.

Later that year, a home version of the Vega card was announced. It would come in two variants: the full-featured Vega 64 and the slightly cut-down Vega 56. Notably, the design could be run efficiently enough that Acer actually managed to put the desktop Vega 56 GPU into a laptop, and despite being severely underclocked and undervolted, the GPU in said laptop still packed a serious punch, running many new games at high to maximum settings even three years after release.

The final discrete iteration of Vega came in the form of the Radeon VII, a 7nm refresh of the design with far more memory bandwidth that comfortably outperformed both the Vega 56 and the Vega 64.

After that, Vega saw use as the integrated graphics in several Zen 2 and Zen 3 based APUs, as well as Ryzen Embedded parts. It even managed to turn heads when AMD announced that several Intel CPUs would ship with Vega graphics cores instead of Iris cores. It was eventually retired in favor of Navi cores.

The graphics chip was also chosen by Apple to power the latest line of "cheese grater" Mac Pros.

NVIDIA Turing (2018)

  • Used in: GeForce GTX 16 series, GeForce RTX 20 series, Quadro RTX series, Tesla T4

Arguably, this represents another revolutionary change in NVIDIA's GPUs, though one focused more on packing in features than on re-engineering the lower-level guts of the GPU itself.

The most standout features, though only available on the GeForce RTX 20 series, are the inclusion of hardware-accelerated ray tracing through dedicated ray-tracing cores and AI acceleration through tensor cores. The former was seen as a push toward the so-called "holy grail" of real-time graphics rendering, as ray tracing is the most accurate way to light and shade 3D scenes. The only problem is that ray tracing is prohibitively expensive to compute. So instead of attempting something on the level of Pixar CGI, the approach NVIDIA took was to have ray tracing augment traditional rendering, tracing only enough rays that AI-based processing could take over and fill in the gaps. The initial implementations, while they showed promise, had a lot of detractors for several reasons. The first was that ray tracing caused a huge performance hit, often 50% or more compared to leaving it off. Another was that the AI-based processing that was supposed to alleviate the performance hit only worked in specific circumstances, such as at limited resolutions and only if a trained model existed for the game. NVIDIA continued development though, and within a few years the restrictions mostly went away, but there was still the performance hit (which is mostly inevitable, given how computationally heavy ray tracing is).
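
The overall shape of that hybrid approach looks roughly like the sketch below. Every function here is an empty placeholder rather than any real API; the point is simply the ordering: rasterize most of the frame as usual, spend a small ray budget on the effects that need it, and hand the sparse, noisy result to a denoising/AI pass to fill in the gaps.

    #include <cstdio>

    struct Frame {};   // stand-in for an image / G-buffer; real data omitted

    Frame rasterize_gbuffer()                   { return {}; }   // traditional raster pass
    Frame trace_rays(const Frame&, int /*spp*/) { return {}; }   // ~1-2 rays per pixel, not a full path trace
    Frame denoise(const Frame&)                 { return {}; }   // AI/temporal reconstruction pass
    void  present(const Frame&)                 { std::printf("frame presented\n"); }

    int main() {
        Frame gbuffer = rasterize_gbuffer();
        Frame noisy   = trace_rays(gbuffer, /*samples per pixel=*/1);   // tiny ray budget
        present(denoise(noisy));                                        // AI fills in the rest
    }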

A few other additions that went under the radar were dedicated integer execution units and mesh shaders. The former came about because integer operations were becoming increasingly common in games (NVIDIA claimed up to a 1.5x performance improvement from it), and the latter was yet another means of efficiently processing polygons by allowing objects to be processed concurrently.

However, in a move not seen since the GeForce 4 MX days, the lower-end offerings lacked both the ray tracing and the AI acceleration, but retained the integer units. These GPUs were used in the GeForce GTX 16 series, to clearly differentiate them from the ray tracing and AI acceleration enabled RTX cards.

AMD Navi 1&2 (2019)

  • Used in: AMD Radeon and Radeon Pro 5000 series, AMD Radeon and Radeon Pro 6000 series

AMD's attempt to fight back against NVIDIA's Turing chips. The first generation of Navi cards made the leap onto the then-new PCIe 4 specification. With the second generation (the RX 6000 series), AMD also introduced support for Resizable BAR, branding it Smart Access Memory. This feature lets the CPU read from and write to the GPU's entire memory space directly, rather than through a small window, giving a notable performance boost in some games since data no longer has to be shuffled through that window in chunks while waiting on the GPU.

However, critics mocked AMD's move to PCIe 4, claiming that the PCIe 3 bus was still underutilized and that the move was unnecessary and mostly a gimmick. This changed within a year, as games that pushed the limits of the current generation of GPUs started appearing and it quickly became apparent that the PCIe 3 bus was hitting its limits.

AMD was also not prepared to counter NVIDIA's RT cores, and thus the first generation of Navi cards was ridiculed for not being able to handle ray tracing tasks as well as NVIDIA's offerings. The next generation of Navi cards finally introduced Ray Accelerators, special units dedicated to ray tracing not unlike NVIDIA's RT cores. Unfortunately, around this time the COVID-19 Pandemic hit, causing factories and silicon fabs to shut down. Couple this with the surging demand for GPUs in the cryptocurrency space, the silicon shortage worsened by the breakdown of the US-China relationship, and scalpers, and the entire situation sent the GPU market into a frenzy, resulting in a shortage that lasted almost two years. This stymied innovation, and the next batch of Navi cards were simply refreshes of the 6000 series with AMD having nothing new to offer.

    2020s 

NVIDIA Ampere (2020)

  • Used in: GeForce RTX 30 series, NVidia RTX Axxxx series

Ampere builds upon Turing by improving the efficiency of the ray tracing parts of the hardware, allowing for up to twice the ray tracing performance. In addition, the cards now use PCI Express 4.0 and add Resizable BAR support for better utilization of the memory interface. Regarding the internal organization of the GPU, the integer cores are now combined with floating-point ones, as floating-point operations still dominate the kind of math being done. Lastly, the entire Ampere lineup is feature-complete, rather than having a subset with features cut for the lower-end market.

Its launch, however, couldn't have come at a worse time for consumers (but definitely a good one for NVIDIA). It launched during the peak of the COVID-19 pandemic, and with more people getting into PC gaming, scalpers were keen on slurping up all of the video cards on the open market using bots and scripts. This resulted in very few people being able to buy a card anywhere near MSRP, with markups running up to two or three times the list price. Worse yet, a cryptocurrency revival caused even more people to buy troves of video cards just to mine with. NVIDIA tried to alleviate the problem with cards that throttled certain mining algorithms, but hackers found ways around the limiter. Another attempt was to sell "crypto only" cards that lacked display outputs, but that did little to stop miners from buying regular cards anyway.

Another major controversy was NVIDIA's decision to use a new power delivery connector, dubbed the 12-Volt High Power (12VHPWR) connector and sometimes called the PCIe 5.0 connector, on cards with an expected power consumption of 300W or higher. There were later reports, mostly involving the subsequent RTX 40 series, of the connector overheating and melting. An investigation by NVIDIA and those involved in the PCI-SIG group (who maintain the PCIe standard) blamed it mostly on user error, stating that the plug hadn't been inserted properly. Further muddling things were the adapters provided: a lot of high-end cards shipped with a 4-to-1 converter, and it confused people as to how many plugs on the 4-plug side they really needed to connect. In addition, both the cards and the power supplies designed for the new connector rely on special sense lines to make sure the correct amount of power is delivered, but these sense lines could easily be bypassed to trick the card into thinking everything is fine. Also not helping was that higher-end video cards had a tendency toward power spikes, with the RTX 3090 sometimes reaching 500-600W for short durations, well above its official rating.

The main criticism of the RTX 30 cards beyond that was the lack of VRAM: some hardware sites found that, within two years of the series' release, the VRAM on several models was already insufficient for running some games smoothly at 1440p on maximum details.

Intel Gen12

  • Used in: Intel Xe iGPUs, DG1 and Arc GPUs.

Intel's Gen12 represents another stab at the video card market, following the failures of the i740 and the Larrabee project in the decades before; third time's the charm. The project started in 2018 and had been trialed as an integrated graphics solution, branded Xe Graphics, as early as the second half of 2020, but the first discrete card to come out was the DG1, an experimental card that only worked with specific Intel boards. The first retail card using the chipset was the Arc A380, which was initially only sold in the mainland Chinese market and had extremely spotty performance owing to badly written drivers. After the driver situation improved, Intel started making the card available in many more markets worldwide, with plans to release several more cards by the holiday season of 2022. Keeping to their word, Intel then proceeded to launch the Arc A770 and A750 at the beginning of that holiday season.

Intel further refined the drivers in the meantime, drawing on translation technology used by Wine on Linux to get their drivers up to snuff. Before long, the Arc A770 was actually going toe-to-toe with AMD's Radeon cards.

Some speculated that these cards might be Intel's only discrete GPU offerings to hit the market, with rumors shortly after the launch that the project would be discontinued as Intel announced it was axing unprofitable divisions within the organization. However, Intel has come out to say that this is not the case and that it plans to stay a player in the discrete GPU market, confirming that it is working on a successor called Battlemage.

NVIDIA Ada Lovelace (2022)

  • Used in: GeForce RTX 40 series

Once more building on top of Ampere and Turing before it, Ada Lovelace improves ray tracing performance by nearly double and adds a hardware optical flow accelerator to enable a new DLSS mode: DLSS frame generation (typically called DLSS 3). This allows an AI model to generate intermediate frames from previously rendered ones, which the GPU inserts between the frames the game actually tells it to render. Another thing many hardware reviewers took notice of was just how much faster the halo video card, the RTX 4090, was over the previous generation, topping out at well over twice the performance (at a time when gains of maybe 1.25x to 1.5x were expected).
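
Conceptually, what frame generation adds to the stream of displayed frames looks like the toy sketch below (everything here is a placeholder; on real hardware the in-between frame is produced from motion vectors and optical-flow data by a neural network). Note that each generated frame needs the next real frame to exist first, which is exactly why frame generation pairs with latency-reduction tech, as mentioned below.

    #include <cstdio>
    #include <string>
    #include <vector>

    int main() {
        // Frames the game actually rendered.
        std::vector<std::string> rendered = { "F0", "F1", "F2", "F3" };
        std::vector<std::string> displayed;

        for (size_t i = 0; i + 1 < rendered.size(); ++i) {
            displayed.push_back(rendered[i]);
            // Placeholder for the generated in-between frame: a real implementation
            // warps/blends F(i) and F(i+1) using optical flow and a neural network.
            displayed.push_back("gen(" + rendered[i] + "," + rendered[i + 1] + ")");
        }
        displayed.push_back(rendered.back());

        for (const std::string& f : displayed)
            std::printf("%s ", f.c_str());   // F0 gen(F0,F1) F1 gen(F1,F2) F2 gen(F2,F3) F3
        std::printf("\n");
    }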

However, the series has led to a number of controversies. NVIDIA's business strategy leaked before the release of the cards, and after release many could not fathom NVIDIA's decision to price the flagship card a dollar shy of US$1,600, at a time when cryptocurrency mining on GPUs had died off (Ethereum having finally switched to proof-of-stake), silicon shortages had been resolved, factories were back to running at full capacity with the pandemic winding down, and GPU prices were returning to normal.

NVIDIA's decision to make DLSS 3 frame generation exclusive to the series angered owners of the older cards, as DLSS 2 had worked on them. Not to mention that AMD eventually released its own competing version (Fluid Motion Frames, as part of FSR 3) that doesn't require an optical flow accelerator, though with the caveat that frame generation works best with some sort of latency reduction, and FSR 3 may not work well with NVIDIA's low-latency features.

And to top it all off, it turned out that the 4080 not only came in two variants, but the variant with less RAM also had fewer shaders, making the naming scheme misleading. Many called NVIDIA out, saying that the lesser 4080 should have been branded as a 4070 instead, as well as voicing their grievances against NVIDIA's artificial inflation of pricing. NVIDIA did relent on that front, rebranding the lower-end 4080 as the 4070 Ti instead.

NVIDIA also continued to use the same 12VHPWR connector as in the previous generation, despite the PCI-SIG eventually revising the connector to make it more robust. This means the RTX 4080 and 4090 cards may still be susceptible to catastrophic connector failures, though the power spikes in this generation aren't as bad as in the previous one.

Moore Threads' MUSA (2022)

  • Used in: The Moore Threads MTT S80, Moore Threads MTT S3000

Proudly touted as the first PCIe 5.0 card on the market and launched to fanfare and limited supply even within China, it was soon found that the card underperformed so badly that the NVIDIA GT 1030, a low-end budget card from 2017, could run circles around it in gaming workloads. And even then, the only games that could run on it were DirectX 9 titles; anything that used OpenGL, Vulkan or newer versions of DirectX would crash spectacularly.

More mysteries were solved once western tech reviewers started getting their hands on the card: its 3D engine is based on a customized PowerVR ISA and uses technology derived from that GPU family, debunking rumors that the card used stolen NVIDIA technology. This also marks the first time in over a decade that PowerVR has been seen on the PC platform. Unfortunately, the return was met with crickets chirping and disappointment instead of jubilation.

AMD Navi 3 (2023)

  • Used in: Radeon RX 7000 series

AMD adopted the chiplet topology it has used in its CPUs since Zen 2 for its GPUs. However, instead of splitting up the compute portion and tying everything together with an I/O die, only the memory controllers and last-level cache blocks were broken out into chiplets. AMD's reasoning was that cache and GDDR6 interface transistors don't scale down nearly as well as the main compute logic, so rather than use an expensive cutting-edge manufacturing process for parts that wouldn't really benefit from it, those pieces are built separately on a less costly process.

However, the main draws of the lineup are that the video cards are typically cheaper than NVIDIA's and include more VRAM. While the price/performance ratio is better than "similarly" priced NVIDIA cards, the VRAM advantage has been shown in testing to matter mostly in games that eat up a lot of VRAM, especially when playing at 4K.

Later, it was discovered that a batch of their reference-model cards shipped with defective coolers, causing the cards to run worryingly hotter than usual. While AMD initially downplayed the issue, they quickly reversed course after several large tech influencer YouTube channels brought it to light, eventually pinning the problem on defective vapor chambers in the reference heatsink and issuing a replacement program. Third-party cards from manufacturers using their own cooling solutions were not affected.
