Hattix.co.uk PC Hardware Collection
     
 


Intel 80286-12 - 1982
Though you can't read it very well, this is an old 286, the curse of the PC's existence until Windows 2000 and XP. The 286's segmented addressing modes were famously described as "brain damaged" by Bill Gates and were the direct cause of the extended/expanded memory problems of '80s and '90s PCs.

Architecture failings aside, the 286 was a very successful processor and this model in particular, the 12MHz, sold millions. In most 286 systems sold in Europe, however, the CPU was an AMD model: Intel had cornered the shorter shipping distances of the North American market from their assembly plant in Costa Rica, while AMD had to export from their production facilities in the southern USA.

This particular part has a story behind it, like most other examples in this virtual museum. Sometime in the late 80s, a businessman was given an AMD 386DX-40 (see below) to work at home with. He used this machine quite well until 2000, when he came to me asking if it could be upgraded so his children could do homework on the Internet. It was a design seldom used today, where the entire I/O interfacing was on a pair of bridged ISA slots leading to a single, very large, daughterboard which had two IDE, parallel and serial ports. In a socket on this daughterboard was its controller, this 286-12.

The slower 286s (8, 10 and 12 MHz parts) were usually ones to avoid. Not because they were bad, they weren't. It's because if we're not at least looking at a 16 MHz part, we're cutting corners. The "go to" in those days, late 1980s, very early 1990s, was a 16 MHz 286, 1 MB RAM (usually SIPPs), and a 40 MB HDD from Miniscribe or Seagate. If we wanted to play games, a 256 kB VGA card. If we were word processing or spreadsheeting, a 64 kB EGA card. Deviations from that were rare and usually meant something was wrong somewhere. As you can see, a pin is broken off the PLCC package rendering this chip useless. As though it was any use to begin with, that is.

The 286 almost always sat next to a 287 socket for the FPU and, almost as often, that socket was empty. In all my years of dealing with PCs, I never found a 287 in the wild. The highest speed grade Intel made the 286 with appears to have been 16 MHz, but AMD and Harris made 20 MHz parts (which compared very favourably to the 386SX) and Harris made a 25 MHz 286 (which was extremely speedy!).
Process | Clock | Feature Count | Platform
? nm | 12 MHz | 134,000 | QFP-68 or SMT
L2 Cache | Speed | Width | Bandwidth
None | - | - | -

AMD Am386DX40 - 1991
Intel had tried to prevent AMD from producing chips with the name "286" and "386" with a lengthy lawsuit. The courts finally ruled that numbers could not be trademarked; this is why Pentium was not called P5 or 586, but was given a name so that it could be trademarked. AMD was forbidden via injunction from producing or selling 386s until the legal shenanigans were settled. At the time it was a massive victory for Intel: they had successfully delayed their competitor to market by years, even as the case was quietly settled out of court. Such legal stalling tactics are, and have always been, very common.

Historically, the 386 made the PC platform what it is today. IBM had long had manufacturing rights to the 8088, 80186 and 80286; further, they had mandated second-sourcing agreements, which meant that they could shop around for whoever was making the cheaper processors. With Intel seeking to prevent AMD from marketing 386s and the chip being very expensive, IBM didn't believe it the right time to release a 386 based machine. They were partly right. Nobody was to use the 32 bit modes of the 386 for almost ten years (the 386 debuted in 1986) and faster 286s could almost keep up with the slowest 386s. AMD, for their part, won the court battle and were finally allowed to start marketing their own 386s in 1991 after being held back for five years by Intel's legal blackmail.

It was Compaq who broke IBM's deadlock on 286-only PCs. With a clean reverse-engineered BIOS (some say with substantial assistance from Intel) and a 386 machine, they opened the door to legitimate PC clones and heavy competition in the PC market. IBM lost their grip overnight, though machines were still branded 'IBM-Compatible' until the late '90s largely to hide the fact that they weren't.

AMD's 386DX40 was a legend; those building computers at the time will be smiling, nodding and basking in the nostalgia. The introduction of a new platform tends to be a bit of a rough ride: early 486 motherboards were less than reliable and the processors themselves got remarkably hot, to the point where many would fit small fans to them. Many preferred the older 386 platform, which used reliable, proven motherboards from the likes of Soyo or DFI, but even these couldn't deny the performance of even a 20MHz 486SX. The scarily expensive 25MHz 486s were faster still.

Imagine Intel's chagrin when AMD produced the 386DX40, a processor capable of matching the FOUR TIMES as expensive 486SX20 bit for bit. To say that AMD's processor sold was rather like saying that water was wet. The 486 was not to put enough distance between itself and AMD's DX40 until it hit 66MHz with the DX2/66.

Where Intel's 486s had on chip caches, pipelined ALUs and faster MMUs to make them roughly double the 386 on a clock-for-clock basis, Intel initially had difficulty reaching the 33MHz bin which would put the 486 safely beyond the capabilities of any 386. This meant that the 20MHz and 25MHz parts were able to be challenged, matched and, due to the finely tuned mature motherboards of the 386, even exceeded by AMD's 40MHz 386. While the 486 did have internal cache, it also had much slower access to RAM thanks to the 20/25 MHz system bus. AMD's 40 MHz system bus meant the cacheless 386 could keep up with and even exceed the 486s.

The Am386DX40 probably took the PC platform to more people than any other single component and, in our view, is the greatest x86 CPU ever made. From 1991 to 1995, four long years, the Am386DX40 with 8 MB RAM, probably a 200 MB hard drive

A small system builder near me (now sadly out of business due to some stupid decisions by the retired owner's son[1]) was selling very small slimline AMD 386DX40 systems with 16MB of memory and a 540MB hard disk even when the 486DX2/66 was the in-thing and he even put Windows95 on some of them (this was unwise). The 486DX2/66s were fast, very fast, but also very expensive. Most came only with 8MB of memory and perhaps a 340MB hard disk so in actual use a 386DX40 with 16MB of memory, still cheaper, could actually be the faster system!

This particular chip was made in week 11 of 1993 (first three numbers of the line "D 311M9B6") and is mask nine, stepping B6. Note how the chip has no (C) mark, instead just the (M) mark. This is because the processor was made under patent license from Intel but partly designed and implemented by AMD. Also notice the odd packaging, the PLCC package mounted on a small PCB. The PCB was a socket pin adapter[2], able to mate the SMT-intended processor into a common socketed motherboard.
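The reading of the date line above can be sketched in Python. This is only an illustration: it assumes the layout is exactly as described for this chip (a year digit, two week digits, "M" plus the mask number, then the stepping) and that the decade is the 1990s. The function name and dictionary keys are made up for the example, and other AMD markings may not follow this pattern.

```python
import re

def decode_am386_marking(line: str, decade: int = 1990) -> dict:
    """Decode a marking such as 'D 311M9B6' using the layout described above."""
    # Assumed layout: one letter, a year digit, two week digits, 'M' plus the
    # mask number, then a letter-and-digit stepping.
    match = re.fullmatch(r"[A-Z]\s*(\d)(\d{2})M(\d+)([A-Z]\d)", line.strip())
    if match is None:
        raise ValueError(f"unrecognised marking: {line!r}")
    year_digit, week, mask, stepping = match.groups()
    return {
        "year": decade + int(year_digit),  # the decade is an assumption
        "week": int(week),
        "mask": int(mask),
        "stepping": stepping,
    }

print(decode_am386_marking("D 311M9B6"))
```

Anything that does not match the assumed layout raises a ValueError rather than being guessed at.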

[1] When you're known for your excellent service, your supporting a built and sold system FOR LIFE, your knowledgeable staff and your general high quality, what do you do? Turn into a brand-name oriented shop which gives no support after the first year and refuses to stock AMD because "someone said they're unreliable". Then you wonder why business goes down the toilet and independent techs, like me, refuse to deal with you and explain to your father on the phone that the reason why is because you're a complete idiot.</rant>
[2]Anyone with an actual ceramic Am386DXL40 is invited to submit it.
Process | Clock | Feature Count | Platform
? nm | 40 MHz | 275,000 | QFT-208 or PGA
L2 Cache | Speed | Width | Bandwidth
None | - | - | -

MIPS R3000A - 1991

On the right, the gold-coloured PGA packages are NEC manufactured MIPS R3000 CPU chips. The larger one is the R3000A CPU, the smaller is the R3010 FPU, both clocked to 33 MHz. These implemented the MIPS I instruction set. The R3000A was a minor revision to reduce power use and enable a 40 MHz clock. The R3000A is also used as the CPU of the PlayStation and as a secondary processor in the PlayStation 2.

MIPS began as a project to make extremely highly clocked CPUs using very deep pipelines, rather like the much later Pentium4. To do this, complex instructions had to be removed as they took more than one clock cycle in an ALU, so the processor required interlocks to indicate when the pipeline was busy. MIPS was to remove interlocks by making EVERY operation take just one cycle, hence the name "Microprocessor without Interlocked Pipeline Stages".

Each instruction on MIPS is fixed-length: every instruction word is 32 bits. The opcode (which tells the CPU which operation to do) is 6 bits, and in the register format it is followed by four 5 bit fields (three register identifiers and a shift amount) and a 6 bit function code. The other formats were a 6 bit opcode and a 26 bit jump address (for a jump instruction, the CPU's "go to", used for branches) or the 6 bit opcode, a 16 bit data value and two 5 bit register identifiers.
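The field layout described above can be sketched as a few shifts and masks. This is a minimal illustration, not a disassembler: the function name is made up, and the field names (rs, rt, rd, shamt, funct) follow common MIPS documentation rather than anything on this page. In the register format the four 5 bit fields are rs, rt, rd and shamt, and the last 6 bits are a function code.

```python
def decode_mips_word(word: int) -> dict:
    """Split a 32-bit MIPS instruction word into the fields of all three formats."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("instruction words are exactly 32 bits")
    return {
        "opcode": (word >> 26) & 0x3F,    # top 6 bits, present in every format
        "rs": (word >> 21) & 0x1F,        # 5-bit register field
        "rt": (word >> 16) & 0x1F,        # 5-bit register field
        "rd": (word >> 11) & 0x1F,        # 5-bit register field (register format)
        "shamt": (word >> 6) & 0x1F,      # 5-bit shift amount (register format)
        "funct": word & 0x3F,             # 6-bit function code (register format)
        "imm16": word & 0xFFFF,           # 16-bit data value (immediate format)
        "target26": word & 0x03FFFFFF,    # 26-bit jump address (jump format)
    }

# 0x012A4020 is the usual encoding of "add $t0, $t1, $t2".
fields = decode_mips_word(0x012A4020)
print(fields["opcode"], fields["rs"], fields["rt"], fields["rd"], hex(fields["funct"]))
```

Every field is extracted regardless of format; which ones are meaningful depends on the opcode, which this sketch does not interpret.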

The actual commercial CPUs, such as this R3000A, did have hardware interlocks and did have multiply and divide (doing MUL or DIV was cheaper on bandwidth than being issued ADD or SUB over and over again, this plagued early PowerPC performance) and gained fame by powering SGI machines such as this one. The CPU's performance was really nothing special, but for their performance level, they were small, cheap and used very little power. By 1997, almost 50 million MIPS CPUs had been sold, finally taking the volume crown from the Motorola 68k series and in 2009, MIPS CPUs are still a high volume design, shipping more or less the same volume as x86. The competing British ARM design, however, out-ships absolutely everything else by as much as four to one, it being the processor dominating the cellphone, PDA and embedded appliance (e.g. satnav, robotics) markets.

MIPS R3000 has seen use in aerospace as the Mongoose-V, a radiation hardened version used in the New Horizons mission and the failed CONTOUR probe; its first use was in the Earth Observing-1 satellite. The Mongoose-V is manufactured by Synova and contains the R3010 on-die in a 256 pin QLCC package.

In this SGI Indigo, the MIPS R3000A's largest competition was from the Amiga 3000, released two years earlier. This sported a 25 MHz Motorola 68030 but could not achieve the 30 VAX MIPS of the R3000A. The R3000A was a scalar processor, executing one instruction every clock, where the 68030 could peak at that, but many of its instructions took two or more clocks. The result was a measured 10 VAX MIPS for the 25 MHz 68030: the 33 MHz R3000A was three times faster outright and, allowing for the clock difference, still more than twice as fast clock for clock.

On a clock for clock basis, the R3000A's IPC (instructions per clock) was very nearly 1.0, the 68k series would not exceed this until the 1.2 of the 68040 (released in 1990, but extremely expensive and power hungry).

Performance in 1991
CPU | VAX MIPS
R3000A 33 MHz | 30
68030 33 MHz | 12
68040 33 MHz | 36
386DX 33 MHz | 11
486DX 33 MHz | 27
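To compare these parts clock for clock, the table above can be turned into VAX MIPS per MHz. This is a rough sketch: the figures are copied from the table, the names are made up for the example, and VAX MIPS per MHz is only a stand-in for instructions per clock, since a VAX MIPS is a benchmark rating rather than a count of native instructions.

```python
# (clock in MHz, VAX MIPS) copied from the 1991 performance table above.
PERFORMANCE_1991 = {
    "R3000A": (33, 30),
    "68030": (33, 12),
    "68040": (33, 36),
    "386DX": (33, 11),
    "486DX": (33, 27),
}

def mips_per_mhz(clock_mhz: float, vax_mips: float) -> float:
    """Return VAX MIPS delivered per MHz of clock."""
    return vax_mips / clock_mhz

for name, (clock, mips) in PERFORMANCE_1991.items():
    print(f"{name:7s} {mips_per_mhz(clock, mips):.2f} VAX MIPS per MHz")
```

On these numbers the R3000A and 486DX sit a little under one VAX MIPS per MHz, the 68040 a little over, and the 68030 and 386DX well under half.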


In this SGI, the QFP part labelled "HM62A2012CP-17" is one of the SRAMs used as L1 cache. The very presence of fast SRAM cache was necessary for the R3000A to be able to maintain its phenomenal performance. While complex processors such as the 68030 or 68040 could work straight from DRAM to near-maximum performance and needed only small L1 caches, the very simple instructions of RISC processors meant a lot of them were needed. Fetching them used large amounts of bandwidth, which would cripple any DRAM system, hence the requirement for expensive SRAM cache. The R3000 supports split I/D caches of 64-256 kB each.
Process | Clock | Feature Count | Platform
1.2 µm (1200 nm) | 33 MHz | 115,000 | PGA
L2 Cache | Speed | Width | Bandwidth
No L2 in most systems | - | - | -

Contributed by Evan


Cyrix 486DX2/66 - 1993

Cyrix's 486s hold a special place in many a PC technician's heart. They were typically a few percentage points slower than AMD or Intel, maybe a Cyrix 66 would run like an AMD 60. However, they would run on really janky motherboards. They were known to be very unfussy.

L1 cache on the Cyrix Cx486DX was 8 kB and the chip ran from 5V, although 3.3/3.45V parts were available as the "Cx486DX2-V66". The earlier Cx486S, intended to compete with Intel's 486SX, had no FPU and only 2kB L1 cache. The FPU was a good 40% of the entire die on the 486DX versions, so leaving it out made a considerable cost saving. Cyrix's units did, however, use write-back caches instead of the more common write-through caches. Write-back caches are slightly faster. Cache design was one of Cyrix's specialities and Cyrix chips usually ran their caches that little better than everyone else.

The DX2/66, as seen here, was the mainstream part. Cyrix did not make a 60 MHz version, and the 50 MHz version was hobbled by a 25 MHz bus, making it quite unpopular. From introduction in 1993 well into the middle of 1998, a Cyrix 486DX2/66 remained a reasonable processor, though slow toward the end of its useful life.

Cyrix was, and would always be, fabless, so had fabs belonging to IBM, SGS-Thomson (ST) and Texas Instruments manufacture its chips. Part of the deal was that the manufacturers could sell some stock under their own name. Cyrix's processors ran better power management (i.e. they actually had some at all) than both AMD and Intel, so typically ran cooler and needed less power to do so.

One of the rarer Cyrix 486s was the Cx486DX2-V80, which ran with a lower voltage, and with a 40 MHz bus to make an 80 MHz core clock from introduction in late 1994. They were surprisingly fast, even fast enough to run Windows 98, handle light Internet duties and play MP3s in 1999, but they were hellishly dated by then. Just playing an MP3 in Winamp was over 90% CPU utilisation on a 66 MHz 486 of anyone's manufacture.

The code on the back is a manufacturing code. "A6FT530A" decodes as follows:
A - Manufacturer (Unknown)
6 - Process (650 nm)
F - Foundry (Unknown)
T - Mask Revision
5 - Year (1995)
30 - Week (Week 30)
A - Die lot code


Manufacturers were G for IBM, F for Texas Instruments, so A is likely ST. Process is set per-manufacturer, so SGS process 6 is likely 650 nm. The foundry, in this case F, is again not well documented for SGS. IBM used the first letter of the foundry location, which was one of Burlington, Corbeil, Fishkill or Bromont.
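The decoding above maps neatly onto fixed character positions, which can be sketched as follows. This is an illustration under assumptions: the function and key names are made up, the decade is assumed to be the 1990s, and only the manufacturer letters named in the text are listed, with "A" deliberately left as a guess.

```python
# Manufacturer letters taken from the text above; "A" is only a guess there too.
MANUFACTURERS = {"G": "IBM", "F": "Texas Instruments", "A": "ST (likely)"}

def decode_cyrix_code(code: str, decade: int = 1990) -> dict:
    """Split an 8-character code such as 'A6FT530A' into its fields."""
    if len(code) != 8 or not code[4:7].isdigit():
        raise ValueError(f"unexpected code format: {code!r}")
    return {
        "manufacturer": MANUFACTURERS.get(code[0], "unknown"),
        "process": code[1],                # meaning is set per manufacturer
        "foundry": code[2],
        "mask_revision": code[3],
        "year": decade + int(code[4]),     # the decade is an assumption
        "week": int(code[5:7]),
        "die_lot": code[7],
    }

print(decode_cyrix_code("A6FT530A"))
```

The process and foundry fields are returned as raw characters because, as noted above, their meaning depends on the manufacturer and is not well documented.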
Process | Clock | Feature Count | Platform
650 nm | 66 MHz | 1,400,000 | 168 pin PGA
L2 Cache | Speed | Width | Bandwidth
64-128 kB typ. | 33 MHz | 32 bit | 133 MB/s


Texas Instruments 486DX2-66 - 1993
A Cyrix 486DX2-66 marketed by TI. Cyrix licensed out the design and allowed its partners (Cyrix itself was fabless) to sell some under their own names. Cyrix took retail sales, while IBM, SGS-Thomson and TI sold their chips to OEM channels only. It was available with the Windows 3 style logo on it or the Windows 95 style logo on it.

Texas Instruments produced these on its then-new 650 nm manufacturing process, down from the 800 nm earlier 486s were made on. This, in 1993, allowed a lower 3.45V operating voltage and peak power below five watts. TI would scale the DX2 from 66MHz, through the sometimes-iffy 80 MHz (due to a 40 MHz bus, which poor motherboards would not handle, and these guys tended to get put in poor motherboards) to a very quick 100 MHz with a 33 MHz bus cleanly multiplied 3x.

Faster TI chips were very rare: I never saw one in the wild, of the hundreds of CPUs which passed through in the late 1990s and early 2000s. By the time 66 and 80 MHz parts were around, Cyrix was pushing most of its output via IBM's manufacturing.

They generally performed well, on par with an Intel or AMD 486DX2-66 or a little slower, due to Cyrix having to reverse engineer the microcode. The trouble is that they were too cheap for their own good. They'd go in extremely low cost motherboards, often with a very low amount of (or none, or even fake) L2 cache, hardly ever in a quality OPTi or ULI motherboard. They'd then be paired with a low amount of RAM (4 MB or 8 MB), slow hard disks, and you've got 1995's $999 cheap PC.

So many corners cut that things suffered, a tale we see again and again. The CPU was chosen because it was lower cost, but then because the entire system is going for low cost, it's poor across the board. All this gave Cyrix CPUs a bad reputation, entirely undeserved, and they were well loved among independent system builders. So long as a quality motherboard, fast hard drive, and enough RAM (16 MB) were fitted, a Cyrix 486 made a saving of around $150-200, ran cooler and was unnoticeably slower. It was sometimes said that a 66 MHz Cyrix ran like a 60 MHz Intel, but it was usually even closer than that: perhaps 5%.
Process | Clock | Feature Count | Platform
650 nm | 66 MHz | 1,400,000 | 168 pin PGA
L2 Cache | Speed | Width | Bandwidth
64-128 kB typ. | 33 MHz | 32 bit | 133 MB/s

Cyrix 486 DX4/100 - 1995
A Cyrix 486 DX4/100 was a lot of things. Fast, cheap and late are among them. Cyrix had always made CPUs not by a licensing agreement, but by meticulous clean-room reverse engineering. Eventually Intel started to wave the patent stick and Cyrix fought back. On the verge of a Cyrix victory (which would have severely harmed Intel), a cross-licensing deal was worked out and the two parties settled. Many date the beginning of Cyrix's decline to this point: their failure to push the litigious advantage they had, which could have gained Cyrix much more than a mere cross-licensing deal.

Back in 1995, when this CPU was new, the home PC (being sold as "multimedia") was starting its boom which would lead to the rise of the Internet. Intel's Pentiums were frighteningly expensive, so many preferred the 486s instead. 486s from all vendors still sold briskly even to the early part of 1997, but AMD and Cyrix had a sort of unwritten policy to sell their parts at the same price as Intel's one below. A DX4/100 from Cyrix or AMD would cost about the same as Intel's 66 or 80. As the chips themselves performed within a hair of each other, it took some very creative marketing from Intel: they usually spread FUD about how AMD and Cyrix were incompatible or not reliable. Utter bullshit, but it deterred some buyers.

Sure, it wasn't as fast as a Pentium but with 8 or 16 MB of memory (rarely 4 MB) they made a capable Windows 95 machine. Cyrix's DX4/100 was rated to a 33 MHz bus, a 3.0x multiplier and a 3.45V supply. It was actually more commonly used on 25 MHz motherboards with a 4.0x multiplier. Cyrix sold largely to upgraders, who valued Cyrix's compatibility, low prices and their ability to work, more or less, with very creaky motherboards. Where an Intel 486 DX4/100 wouldn't even boot, Cyrix's models would usually run with just a few stumbles.

Fully static designs, Cyrix's 486 DX CPUs consumed less power than Intel's models and featured much more advanced power management and a form of clock gating. It wasn't all good. While AMD and Intel made identical parts due to a licensing agreement, Cyrix had to reverse engineer both the CPU architecture and its microcode. So while Cyrix made more compatible and lower power models, they were slightly slower.

Cyrix were liberal with their licenses, entering into deals with IBM, Texas Instruments and SGS-Thomson for use of their production facilities; in return, Cyrix allowed the aforementioned partners to sell CPUs under their own names. IBM in particular sold a lot of Cyrix's later 6x86 offerings.
Process | Clock | Feature Count | Platform
600 nm(?) | 100 MHz | Unknown | 168 pin PGA
L2 Cache | Speed | Width | Bandwidth
128-256 kB | 50 MHz | 32 bit | 200 MB/s


Intel Pentium 90 - 1995
This is when the Pentium became a viable alternative to the fast 486s. Before then, Pentiums had run very hot, were buggy and quite unstable. They did not offer a justifiable (or often any) performance improvement over the 486 DX90/100/120s of the time.

The first Pentiums were, believe it or not, released as samples as long ago as 1992, the 60 and 66MHz parts. The 50MHz sample is very rare and, strangely enough, all samples sported 36bit physical addressing in the form of PAE. They were released to full manufacturing in early 1993, buggy, hot and far too immature. This was the P5 core with a maximum power of 16.0W at 66MHz, three times more than anything else. The pricing was also wince-inducing, costing around the same as an entire system made with a processor 60% as fast.

Today, we're used to processors hitting 100 watts and more with huge coolers, but back then all we had were small heatsinks, rarely with little 40 mm fans sitting atop them, designed for the five watts of a fast 486. Sixteen watts was three times more than what was normal for the time!

They were succeeded in 1994 by the P54C, with the BiCMOS 800nm process refined to 500nm, the huge 294mm² die reduced to 148mm² and the huge 16W power requirement reduced to a mere 3.5W (for the 90MHz part; the 100 MHz was released at the same time). That was what I meant by "viable alternative".

I'd be very surprised if any P60s or P66s survive to this day, but this P90 still runs just fine and was running an OpenBSD router until late in 2005.

It was curious that Intel priced the 90 MHz part so high. Everyone was used to paying a huge premium for Intel's most powerful, but not for the second best, and the 90 MHz was the second best. By 1996, the Pentium to get was either the 90 or the 133. There was a 120, but this was very spendy for its performance level: the 60 MHz bus did it no favours.

The Pentium got most of its performance from a superscalar pair of ALUs (the FPU was not superscalar but it was sort-of pipelined) which enabled it to double the speed of a 486 at the same clock on software specifically designed for it, or just luckily suited. Other enhancements were memory being twice as wide, 64 bits as opposed to 32 bits and more CPU cache on the Pentium.

This one was manufactured in January (week 3) 1995 as demand for the Pentium (and the PC in general) took off as a result of Windows95.
Process | Clock | Feature Count | Platform
500 nm | 90 MHz | 3,200,000 | Socket 5
L2 Cache | Speed | Width | Bandwidth
256-512 kB | 60 MHz | 64 bit | 480 MB/s
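The bandwidth column in these tables appears to be a simple peak figure: bus clock multiplied by bus width, divided by eight to get bytes. A minimal sketch of that arithmetic follows; the function name is made up, and treating the "33 MHz" and "66 MHz" buses as 33.33 and 66.67 MHz is an assumption made so the results match the rounded table values.

```python
def peak_bandwidth_mb_s(bus_mhz: float, width_bits: int) -> float:
    """Peak transfer rate in MB/s: one bus-width transfer per clock."""
    return bus_mhz * width_bits / 8

print(peak_bandwidth_mb_s(60, 64))                 # Pentium 90, 60 MHz x 64 bit
print(round(peak_bandwidth_mb_s(100 / 3, 32)))     # 486 on a nominal 33 MHz x 32 bit bus
print(round(peak_bandwidth_mb_s(200 / 3, 64)))     # Socket 7 on a nominal 66 MHz x 64 bit bus
```

These are theoretical peaks only; real cache and memory subsystems of the era needed wait states and delivered less.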


Intel PentiumII 266MHz - 1997
Of stepping code SL2HE, this is a Klamath cored 266MHz part released in May 1997. The Klamath ran rather warm and only made it as far as 300MHz before the .25 micron die shrink to Deschutes took over. This particular part, as can be seen by a close inspection of the image, has had the B21 trace isolated in order to modify the CPU to request a 100MHz FSB. As this sample is also unlocked, it can be made to run at 250MHz or 300MHz, both of which are a massive jump in performance over 266MHz thanks to the 100MHz bus speed and lower multiplier.

The core has had its nickel plating lapped off to facilitate better contact with the heatsink, which helped it to reach 300MHz when overclocked. On either side of the core, one can see two SRAM chips, the processor has four of these rated for 133MHz and 128kB each. They do work when overclocked to 150MHz and they are clocked at half of CPU core speed.
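The clock relationships described here are simple enough to check with a few lines. This sketch assumes the core clock is bus speed times multiplier and that the L2 SRAMs run at half the core clock, as stated above; the function name is made up, and the stock setting of a nominal 66 MHz bus (66.67 MHz) with a 4x multiplier is the standard configuration for a 266MHz PentiumII rather than something read off this chip.

```python
def pentium_ii_clocks(fsb_mhz: float, multiplier: float) -> tuple[float, float]:
    """Return (core clock, L2 cache clock) in MHz; the L2 runs at half core speed."""
    core = fsb_mhz * multiplier
    return core, core / 2

for fsb, mult in [(200 / 3, 4.0), (100, 2.5), (100, 3.0)]:
    core, l2 = pentium_ii_clocks(fsb, mult)
    print(f"{fsb:6.2f} MHz x {mult} -> core {core:.0f} MHz, L2 {l2:.0f} MHz")
```

At 100 MHz x 3 the L2 lands on 150 MHz, which is why SRAMs rated for 133MHz count as overclocked at that setting.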

The reverse of the SECC (single edge contact cartridge) shows the L2 tag RAM in the centre and the other two L2 SRAMs on either side. Despite what certain inaccurate publications will tell you, the P6 core did not have internal L2 tag, nor did it have internal L2 cache (internal = on die) until the Mendocino and, later, the Coppermine. The Pentium Pro did not have on-die L2 cache or tag, so let's put this to rest: the L2 tag and SRAM were housed on the same ceramic substrate (it was an MCM, multi-chip module, like Intel's Core2 Quads) as the P6 die.

Process | Clock | Feature Count | Platform
350 nm | 266 MHz | 7,500,000 | Slot 1
L2 Cache | Speed | Width | Bandwidth
512 kB | 133 MHz | 64 bit | 1,064 MB/s


Intel Pentium 200 MMX SL27J - 1997
For release details, see the 233MHz part below.

For this one, we'll have a look at why Intel disabled certain multipliers on the P55c (Pentium-MMX) series. For this, we need the codes on the bottom of the CPU package. The first on this one is "FV80503200". "FV" means organic substrate, "8050" is Pentium's model number, "3" is the code for the MMX Pentium and, finally, "200" is the rated clock frequency. It's nothing we don't already know. The kicker is the bottom code, "C7281004". The first letter is the plant code: C for Costa Rica, L for Malaysia (the 233 below is "L8160676"). The next digit is the year of manufacture, 7 being 1997. After that is the week of manufacture, week 28 in this case. Finally we have the FPO (testing lot) number at "1004". The next four after the hyphen are unimportant serial numbers.
We were interested in the year and week. Before week 27, 1997, Pentium-MMX processors would recognise the 3x and 3.5x multipliers. For Pentium's 66MHz bus, this is 200MHz and 233MHz. After week 27, they would recognise their own multiplier and no higher. This one would understand 1.5, 2, 2.5 and 3, but would not understand 3.5, so it could not run at 233MHz cleanly. Of course by running 75x3, we'd get 225MHz, which was often faster than 66x3.5 because of the faster bus and L2 cache, but re-markers couldn't do that. Re-marking was a serious problem for Intel: unscrupulous retailers would re-mark the slower 166 and 200 parts and sell them as 233 parts! The chips didn't care, everything Intel was making would hit 233MHz, but Intel had to sell some of them as slower, cheaper parts so as not to cause a glut of the higher parts.
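The bottom code can be decoded mechanically, and the week 27 cut-off applied to the result. This is a sketch under stated assumptions: the function names are made up, only the two plant letters mentioned above are known to it, the decade is assumed to be the 1990s, and "after week 27" is read strictly (week 27 itself is treated as not yet restricted), which the text does not settle.

```python
PLANTS = {"C": "Costa Rica", "L": "Malaysia"}  # only the two letters given above

def decode_fpo(code: str, decade: int = 1990) -> dict:
    """Decode a code such as 'C7281004': plant, year digit, week, lot."""
    if len(code) != 8 or not code[1:].isdigit():
        raise ValueError(f"unexpected code format: {code!r}")
    return {
        "plant": PLANTS.get(code[0], "unknown"),
        "year": decade + int(code[1]),   # the decade is an assumption
        "week": int(code[2:4]),
        "lot": code[4:],
    }

def multiplier_restricted(info: dict) -> bool:
    """True if the part dates from after week 27 of 1997 (strict reading)."""
    return (info["year"], info["week"]) > (1997, 27)

this_chip = decode_fpo("C7281004")
print(this_chip, multiplier_restricted(this_chip))
```

A date check like this is only a first guess; the part code matters too, and the only real test is whether a given chip accepts the 3.5x setting.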

Chips of part codes later than SL27H would almost always be multiplier restricted. Some rarer SL27K parts weren't limited but all past that were locked down.
Process | Clock | Feature Count | Platform
350 nm | 200 MHz | 4,500,000 | Socket 7
L2 Cache | Speed | Width | Bandwidth
256-512 kB | 66 MHz | 64 bit | 533 MB/s


Intel Pentium 233 MMX SL27S - 1997
Amid a huge blaze of marketing in January 1997, Intel released their first MMX-equipped Pentiums, the 166, 200, a very rare 150MHz part and six months later a 233MHz part (after the 233MHz PentiumII). To believe the marketing, MMX was the best thing ever, but what really was it?

Just after Pentium's production, Intel engineers wanted to add a SIMD (single instruction multiple data) extension to the x86 instruction set which would greatly accelerate anything that performs the same few operations on a great amount of data, such as most media tasks. Intel management, however, were a more conservative bunch and refused the request as it would require a new processor mode. They did allow a very limited subset of ALU-SIMD to be added, MMX. Not much actually used ALU-SIMD (media is FPU heavy) so MMX itself gave perhaps a 2% improvement on most code recompiled for it. The full SIMD extension would later come with Katmai as Katmai New Instructions (KNI) or its marketing/Internet friendly "Streaming SIMD Extensions", which of course "enhanced the online experience". Yep, the processor was claimed to make your Internet faster.

What the new Pentium-MMX also had, however, was not quite so performance-agnostic. The original Pentium was originally made on the 800nm process node, but by the time of Pentium-MMX, Intel had a 350nm process available, meaning that less silicon was needed for the same parts, so yield was higher and manufacturing was cheaper.

The P55c didn't just add MMX, it also doubled the size of the L1 cache from 2x 8kB to 2x 16kB which gave it a 5-10% boost on existing software across the board. P-MMX still could not keep up with the Cyrix 6x86L in office work environments and was very little faster than Pentium in media environments. It gained the inglorious distinction of the first ever Intel CPU to be defeated by a competitor: Cyrix's 6x86L-200 was faster than Pentium-MMX 200.

The 1997 release of the Pentium-MMX was seen as too little and too late, it did not differentiate itself from Pentium, it was expensive and hyped up by Intel so much that consumers were expecting something better than a refresh of a five year old part. Part of the problem was that, at the 200 MHz grade (and certainly at the 233 MHz one), the 66 MHz bus speed just wasn't enough. AMD and Cyrix had already pushed 75 MHz bus speeds with success, and Cyrix was talking up an 83 MHz bus. The extreme bottleneck of the 66 MHz bus would be why Intel used it for its Celerons well into the 800 MHz range!

Process | Clock | Feature Count | Platform
350 nm | 233 MHz | 4,500,000 | Socket 7
L2 Cache | Speed | Width | Bandwidth
256-512 kB | 66 MHz | 64 bit | 533 MB/s


Intel Mobile Pentium II 233 - 1997
Intel's mobile "Tonga" Pentium II was really just Deschutes, the desktop Pentium II, and was pretty much identical but for form factor. The MMC-1 cartridge, seen here, packed the northbridge on the cartridge, so it was impossible to use cheaper VIA or SIS chipsets with the mobile Pentium II, which meant more sales for Intel. In this case, the northbridge is from the well-regarded 440BX chipset.

Process | Clock | Feature Count | Platform
250 nm | 233 MHz | 7,500,000 | MMC-1
L2 Cache | Speed | Width | Bandwidth
512 kB | 117 MHz | 64 bit | 936 MB/s
Feature count does not include L2 cache and tag, both of which are external.
Intel Celeron 300MHz - 1998
The bastard offspring of Intel's price premium craze in the PentiumII days, this was the CPU that was, in June 1998, to take market share in the low end, where AMD and Cyrix were having their own party. The lack of L2 cache left the P6 core choked for data, the 66MHz FSB didn't help either. Coupled with the low end motherboards, this CPU's performance was dismal.

Usually, a lousy CPU had an excuse. It had a reason. The AMD K5 was delayed by unforeseen problems in development. The Intel Pentium was just too transistor heavy for 0.8 micron BiCMOS. The VIA C3 was never intended to compete in the performance market.

The Celeron had none of these excuses. It was actually designed to stink. It had no L2 cache whatsoever and performance was typically lower than the elderly Pentium-MMX 233 it was meant to replace. AMD's K6 went unchallenged and gained popularity dramatically as a result.

This is the "Covington" core, which is absolutely nothing more than a very slightly tweaked Deschutes, and shares its overclockability, hitting 450MHz with only a modest heatsink. The CPU is partly lapped to facilitate a better heatsink fit, but didn't make it past 464MHz. The PentiumII 266 running at 300MHz still outperformed it substantially, such is the importance of L2 cache. The redesign of this CPU, to incorporate 128kB L2, was known as Celeron A or "Mendocino", the famous Celeron 300A being one example.

They weren't all bad, since games of the time rarely needed much in the way of L2 cache and showed only a marginal drop in performance on the cache-less Covington. An overclocked Celeron with a Voodoo3 was a cheap, but powerful, Quake3 machine. It was just utterly awful at anything else! The 300 MHz cacheless Celeron would be outperformed by any half-way decent system with a Pentium-MMX at 200 MHz or so.
Process | Clock | Feature Count | Platform
250 nm | 300 MHz | 7,500,000 | Slot 1
L2 Cache | Speed | Width | Bandwidth
None! | - | - | -



AMD K6-2 450AFX - 1999
The 450MHz and 500MHz K6-2s were everywhere in 1999 and 2000 after their release in February 1999. The 500MHz parts were hard to get hold of, it seemed everyone wanted one, but the 450 part was king of the price/performance curve. In a world where Intel was selling the 450MHz PentiumII for £190 and the 500MHz PentiumIII for £280, the £54 that you could get a K6-2 450MHz for was an absolute steal. That in mind, the K6-2 would often run rings around the Pentiums in daily usage, as long as your daily usage wasn't games.

You very much did not want a K6-2 for games. Back then, games were very FPU intensive (the GPU does this these days) as part of their lighting and geometry transformation workloads, and the K6-2 plain did not have a powerful FPU. In something like Quake 3 Arena, a K6-2 450 would run around the same as a Pentium II 266. Ever since Cyrix had become the first x86 maker to claim the performance throne with the 6x86-200 in 1996, Intel had seemed weak. The Slot-1 platform was indeed performant, but it was 100% Intel-only and lacked that competitive drive. Intel took back its rightful crown in 1997, only to lose it months later to AMD's K6-233.

AMD continued producing K6-2 450s for over two years, they were still common in 2001. By this time they were typically 2.2 volt and, sometimes, really K6-III chips which had defective L2 cache. Either way, they made excellent, and quite cheap, little machines for running Windows 2000 on. Once you'd found a motherboard you liked (usually a Gigabyte or FIC with the ALI Aladdin V chipset), they worked first time, every time, and went well with a few hundred MB of PC-100 SDRAM.

This part was paired with the GA5-AX motherboard in the motherboards section and 192MB of PC100 SDRAM (three 64MB DIMMs); such a combination would have been quite formidable in 1998. In 1998 and 1999, the big seller everywhere was the K6-2. It was cheap, fast, and reliable.

The K6-2 was pretty poorly named: it was the K6-3D, a K6 with a SIMD instruction set known as '3DNow!' added, much like SSE but a bit less flexible and a bit more streamlined. This made the K6-2 much faster in supporting games, a field where it had traditionally been quite poor. It was not, however, a new processor. Intel were to copy this misleading naming shenaniganry with the Pentium-III, which was nothing at all more than 'PentiumII-SSE' until Coppermine.
Process | Clock | Feature Count | Platform
250 nm | 450 MHz | 9,300,000 | Super 7
L2 Cache | Speed | Width | Bandwidth
256-1024 kB | 100 MHz | 64 bit | 800 MB/s

AMD K6-2 500AFX - 1999
These things sold like hotcakes in 1999 (after their release in August) to 2000. The psychological "half gigahertz" barrier could be broken by a Pentium III, fiendishly expensive and widely understood to be no better and little different to the aged Pentium II, or by a K6-2. The Pentium III was three to four times the price of the K6-2 and performed slightly worse in most tasks other than games. For many, that wasn't a compromise worth taking and the higher K6-2s were AMD's most successful product since the amazing little Am386DX-40.

There was also a K6-2 550 but these were always in short supply (which drove up the price and made them less attractive) and represented about as far as the K6-2 could go on a 250nm process. I never really saw a K6-2 550 which was happy at 550. Most K6-2s by this time were marked to 2.2V (some slower ones were 2.4V) and the 550s were rated at 2.3V, so overvolted out of the factory, and not happy about it either. You'd normally have to knock them down to around 525 (105 MHz bus) to make them stable. By the time AMD was shipping 550s, the 400 MHz K6-III was around the same price and very, very fast. Later, the quietly announced K6-2+ (with 128 kB of L2 cache onboard) appeared which was also quite scarily fast, but hard to find. The K6-2+ was a later revision of the K6-III where a failure in part of the L2 cache could be walled off into a 128 kB chunk, disabling only half the cache.
Process | Clock | Feature Count | Platform
250 nm | 500 MHz | 9,300,000 | Socket 7
L2 Cache | Speed | Width | Bandwidth
None | - | - | -
No image yet
AMD K6-III 450 - 1999
Never actually came across one of these, but this area felt a bit bare without it. AMD's K6-III was one of the all-time greats of the industry. It was the undisputed, unchallenged king of business computing for 1999 and 2000, while Athlon was still a little too expensive for buying 50 of them to outfit an office and Durons were yet to take over.

K6 was already a good performing core design, adding full speed L2 cache to it really helped it along and they routinely sold for less than £100. They were just under twice the price of the K6-2 500, about 25% faster, so didn't make sense compared to the K6-2 500, but compared to the Pentium-II, they were almost a complete no-brainer. A third of the price, faster in almost everything (other than games) and not on the very pricey Slot-1 platform: K6-IIIs used the same proven and mature Super 7 motherboards as the K6-2 did. They flew on the 512 kB L2 cache of a Gigabyte GA5-AX, and positively screamed along on the rarer 1 MB cache boards.

Intel's competition at a similar price was the Socket 370 Celeron, 433 and 466 ratings. They were good for low-cost gaming, but for the money you were better off with a K6-III and a more powerful video card. K6-III would not be matched by any Celeron in business or productivity computing until it was discontinued.

Oddly enough, a K6-III 450 was 5-20% faster than a Pentium II Xeon 450 with 512 kB L2 cache at Microsoft SQL Server 7.0 and held itself proud against the immensely expensive Pentium II Xeon 450 2 MB. With the same RAID-5 (hardware, on an Adaptec PCI SCSI-III controller) array, same 128 MB PC100 memory, the cheap little K6-III 450 ran around the same in a high end database server as a CPU fifteen times the cost!

For those in the know, the K6-III was a frighteningly effective processor.

Later, AMD moved the K6-III to a 180 nm (.18 micron) process and sold it at the same clocks as a low power K6-III+. They were nothing special, until you overclocked them. They'd hit 600 MHz. You'd up the FSB to about 110 MHz (the fastest most Super 7 boards could go, thanks to the onboard cache) and hope for 605 MHz, maybe with a 2.1V core voltage (they were rated to 2.0). If you didn't make it, you'd drop the multiplier to 5 and 550 MHz worked with nearly all of them. At this speed, they were keeping up with Durons. Fast Durons.

K6-III was one of those parts which never really got around enough to be appreciated as much as it should have been. It was the performance leader for almost a year.
Process | Clock | Feature Count | Platform
250 nm | 450 MHz | 21,400,000 | Socket 7
L2 Cache | Speed | Width | Bandwidth
256 kB | 100 MHz | 64 bit | 1,600 MB/s

Intel PentiumIII 600E SL3H6 - 1999
A micro-BGA package slapped onto a quick slot adapter and sold mostly as an upgrade for existing Slot-1 owners from October 1999. They rapidly became popular with OEMs who could use their already qualified Slot-1 designs with the new Pentium IIIE processor. The E denotes the presence of a Coppermine with onboard L2 cache, the 133MHz one was EB, the B meaning the 133MHz bus. Confused yet? P3 went through no less than THREE different CPU cores or, indeed, five if you include the Xeon derivatives. To make it even more confusing, I'm told that some slower mobile P3s were actually the PentiumII Dixon and didn't support SSE and that some mobile PentiumII parts were actually P3s with SSE disabled!

This wasn't even faster than where some enthusiasts had pushed the P3 Katmai (external L2): I had a pair of P3 450s which individually would pass 650MHz and even in SMP would easily run at 600MHz on a 133MHz bus. The faster bus made all the difference and as a pair they were scarily fast.
Process | Clock | Feature Count | Platform
180 nm | 600 MHz | 28,100,000 | Slot 1
L2 Cache | Speed | Width | Bandwidth
256 kB | 600 MHz | 256 bit | 18.75 GB/s
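
The bandwidth figures in these tables appear to follow a simple rule: cache clock multiplied by bus width in bytes gives MB/s, and dividing by 1024 gives GB/s. A minimal Python sketch, assuming that rule and one transfer per clock (the function name is made up for illustration):

```python
def l2_bandwidth_gb_s(clock_mhz: float, width_bits: int) -> float:
    """Peak L2 bandwidth in GB/s, assuming one transfer per clock."""
    bytes_per_transfer = width_bits / 8
    mb_per_second = clock_mhz * bytes_per_transfer  # MHz x bytes = MB/s
    return mb_per_second / 1024  # assumed MB/s to GB/s divisor


# Pentium III 600E: 600 MHz, 256 bit wide cache
print(l2_bandwidth_gb_s(600, 256))  # 18.75
```

The same rule gives about 6.64 GB/s for an 850 MHz cache on a 64 bit bus, which shows how much a narrow cache bus costs even at a higher clock.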


Intel Pentium III 733EB SL3SB - 1999
By this time (it was released with the introduction of its Coppermine core, October 25th 1999, same as the 600 MHz part above), Pentium III was still well behind Athlon. P3 wouldn't begin to catch up until the 900 MHz parts arrived, due to Athlon's L2 cache being limited to 300MHz while P3 had an on-die, full speed L2 cache four times as wide.
Process | Clock | Feature Count | Platform
180 nm | 733 MHz | 28,100,000 | Slot 1
L2 Cache | Speed | Width | Bandwidth
256 kB | 733 MHz | 256 bit | 22.9 GB/s

AMD Athlon 850MHz ADFA - 2000
Now this brings back some memories. I found this image in a long forgotten corner of my storage HD, it's the very first Athlon I ever bought, to replace a K6-2 450. It was placed in the KT7 (in the motherboards section) which was voltage modded and overclocked to just over 1050MHz. A GHz class machine! Might seem nothing nowadays, but back then it was right on the cutting edge of performance. In late 2000 as I was building the system, nothing could keep up with it. The fastest CPU in the world was the Athlon 1200 and, due to the bus speed of 214MHz on this overclocked 850, the PC133 SDRAM was running at 154MHz, making this CPU in some tasks faster than the 1200. One of the SDRAMs in there (a 64 MB) was actually rated for 66 MHz and was running almost three times faster than it should have been, but this it did without fail, at 2-3-2-8 timings. It was a seriously fast machine, and all made possible thanks to the paranormal stability of VIA's KT133 chipset and Abit's KT7 motherboard.

I never really tapped its full potential at the time, it was mostly to handle TV tuning with a BT878 card. The video card was a Voodoo3, back then not too bad, but it was much happier with a Geforce. I was to later (much later) get a Geforce 4MX but this CPU was much too early for that.

The 'ADFA' stepping wasn't anything special, just another aluminium interconnect 180nm part from AMD's Austin fab. It was pushing the limits of what it was capable of at 1000MHz with 1.85V (standard was 1.75V) and even at 2.05V, it would only manage around 1050MHz. 1070 (107 was the highest FSB a KT7 could handle) was just too much.
Process | Clock | Feature Count | Platform
180 nm | 850 MHz | 37,000,000 | Socket A
L2 Cache | Speed | Width | Bandwidth
256 kB | 850 MHz | 64 bit | 6.64 GB/s

AMD Athlon 1100MHz ASHHA - 2000
The Thunderbird core sported 256kB of on-die L2 cache, but only on a 64bit bus opposed to the 256bit bus of the PentiumIII "Coppermine", the slower cache on the Thunderbird allowing the Coppermine to keep up (just) with a by far more advanced CPU core. This specimen is a dead one, probably burnt or otherwise incapacitated. The Thunderbird was the CPU of choice in 2000-2001, still plenty powerful enough for even light gaming five years later. Though released in August 2000, this particular specimen was manufactured in week 31, 2001 but according to site policy, parts are dated and ordered by their first release in the form they're presented.

This Athlon is an early model from the then-new Fab30 in Dresden, using copper interconnects. It overclocked well, hitting 1480MHz on moderately good cooling. Austin parts had a four digit code (E.g. AXDA, my first 800MHz Thunderbird, which ran happily at 1020MHz) and weren't terribly happy over 1GHz.

Notably the Thunderbird derivative Mustang, which had 1MB of L2 cache, performed next to identically with Thunderbird, so AMD canned Mustang and brought forward the Palomino project, which was originally what Thunderbird was meant to be, but the Slot-A K7 ran out of steam too soon. The 'Thunderbird' project was to add the rest of SSE (integer SSE had already been added to the first Athlons), improve the translation look-aside buffers and use hardware prefetching to improve cache performance. AMD knew that K7 would be limited by its cache performance but were also limited to a 64 bit wide cache. Short of double-clocking the cache (which would have exceeded the possibilities of AMD's 180nm process), an expensive and lengthy redesign to 128 bit would be necessary, instead AMD made the cache smarter rather than faster. However, the GHz race from 500MHz to 1000MHz was much faster than anyone had predicted and AMD had been forced to take the Slot-A Athlon to 1GHz sooner than they'd have liked. This meant that the Thunderbird project was nowhere near ready when the market was calling for it.

Instead, AMD renamed the Thunderbird project to Palomino and rushed out a 180nm Athlon with nothing at all new but 256kB of cache integrated into the die, a rather trivial change. This took the Thunderbird name and was able to beat back Pentium4 long enough for Palomino to reach completion.

On release, Palomino was generally 10-20% faster than Thunderbird at the same clock due to its streamlined cache handling. Given that a 1.4GHz Thunderbird was already faster than everything but the scorching hot 2.0GHz Pentium4 (Tbird 1.4 wasn't exactly cool running either) the initial 1.47GHz release of Palomino made Pentium4 a rather unattractive product. Palomino eventually reached 1800MHz.
Process | Clock | Feature Count | Platform
180 nm | 1100 MHz | 37,000,000 | Socket A
L2 Cache | Speed | Width | Bandwidth
256 kB | 1100 MHz | 64 bit | 8.59 GB/s

AMD Duron 900MHz ANDA - 2001
AMD's Duron (the 900MHz model released in April 2001) was their response to opening a new fab in Dresden. Their older Austin fab was then no longer producing many Athlons, instead AMD cut down the L2 cache on Thunderbird to a mere 64kB and named it 'Duron'. Duron was then sold for ridiculously low prices, one could pick up a 750MHz Duron for around £25, which was just as fast as the Pentium III 733 which cost £70 and only a very tiny amount slower than the 750MHz Athlon (Thunderbird) which was £60. The older Athlon 'Classic' was about the same speed as Duron.

The K7 core just never really cared much for cache size: Duron performed within 5% to 10% of the 256kB Thunderbird in almost any test. Up against Intel's very slow Celeron, Duron wiped the floor with it across the board. This 900MHz Duron, paired with PC133 memory, would most likely outperform the 1GHz Celeron below despite a 100MHz clock disadvantage.

Strategically, Duron was to win AMD market share and get the market used to AMD as a viable alternative to Intel. It was also cheaper to manufacture than Thunderbird (slightly) and allowed AMD's Dresden facilities to get on with making the high margin Athlon parts, which Austin couldn't do. Duron was a great success, perhaps even greater than the K6-2...But better was yet to come for AMD.
Process | Clock | Feature Count | Platform
180 nm | 900 MHz | 25,000,000 | Socket A
L2 Cache | Speed | Width | Bandwidth
64 kB | 900 MHz | 64 bit | 7.03 GB/s


Intel Celeron 1000 SL5XT - 2001
The FCPGA 370 Celeron was never something someone would willingly buy. For less money, one could buy a higher clocked Duron when, even at the same clock, a Duron spanked a Celeron silly and was dangerously close to doing the same to the Pentium IIIE.

Celeron's problem was one of scaling and, simply, that it didn't. An 800MHz Celeron was about 70% faster than a 400MHz Celeron, but a 1000MHz Celeron from August 2001 was only perhaps 5-10% faster than the 800MHz. Celeron didn't just lose half the L2 cache, but it was also crippled with a dog slow 100MHz bus. The Celeron-T above 1000MHz would rectify this with a 256k cache, but was still stuck at 100MHz. Celerons were, then, nothing more than cheap upgrades for older systems which were stuck on FCPGA 370 (or Slot-1 with a 'slotket' adapter).

They were almost always in a cheap VIA or SiS based motherboard, which limited performance even further, with slow RAM and no video card or a very slow one. It was strongly suspected that Intel deliberately limited performance on non-Intel chipsets, either by inadequate bus documentation, or active detection.

It was not at all difficult to find puke-boxes from the likes of eMachines using 1000MHz Celerons which were significantly slower than a well built 700MHz Duron. The Duron machine would likely be cheaper too!
Process | Clock | Feature Count | Platform
180 nm | 1000 MHz | 28,100,000 | FCPGA 370
L2 Cache | Speed | Width | Bandwidth
128 kB | 1000 MHz | 256 bit | 31.25 GB/s

AMD AthlonXP 1800+ AGOIA - 2001
The code on this one is AX1800MT3C which tells us it's a 266MHz bus processor at 1.75V and runs at 1.53GHz. That's not important on this model, what is important on this one is the next line down which has the stepping code. The code "AGOIA" may just be random characters to most people, but to overclockers it read "gold". The AGOIA stepping was used in 1600+ to 2200+ processors and practically all of them would clock higher than 1.8GHz. Some made it to 1.9GHz and even 2.0GHz was not unheard of.

At the time (October 2001) this was the fastest available AthlonXP and the fastest available x86 processor, period. For this particular processor, everything about the model codes yells "great overclocker": it has the AGOIA code, it is a "9" (third line) and it's a week 13 part. Should it be put back into use, I don't doubt it'd approach and maybe even pass 1.9GHz.

Update! This CPU has been put back into use on a motherboard that never supported it, an Abit KT7-RAID, at 1150MHz to replace an old 900MHz Thunderbird. With the stock AMD cooler for the Thunderbird, this CPU hits 53C under full load. The AthlonXP's SSE doesn't work though.
Process | Clock | Feature Count | Platform
180 nm | 1533 MHz | 37,500,000 | Socket A
L2 Cache | Speed | Width | Bandwidth
256 kB | 1533 MHz | 64 bit | 12.0 GB/s
AMD AthlonXP 1800+ AGOIA - 2001
Same as above, manufactured in the 36th week of 2002, but in green. The organic substrate AMD used for the AthlonXP was sometimes in brown and sometimes in green, it was pretty random. Both Palomino, both 2002 manufacture (remember, we list by release date, not manufacture date), both overclockable like mad.
Process | Clock | Feature Count | Platform
180 nm | 1533 MHz | 37,500,000 | Socket A
L2 Cache | Speed | Width | Bandwidth
256 kB | 1533 MHz | 64 bit | 12.0 GB/s

Intel Pentium 4 1.8A SL63X - 2002
Pentium 4's 130 nm shrink was Northwood, which also doubled the L2 cache size to 512 kB. SL63X was the original Malaysia production code for the 1.8A, but a later SL68Q code was rumoured to be cooler and more overclockable.

Northwood was the first Pentium 4 which was worth buying. Willamette ran very hot, had poor chipsets, poor RDRAM, poor reliability, poor everything. Athlon XP beat it. Athlon beat it. In some cases, Pentium III beat it. Northwood was the first Pentium 4 which pulled ahead of its own legacy, if not ahead of Athlon XP.

At this point in history, Intel was openly hostile to PC enthusiasts. Overclocking was banned, multipliers locked, and chipsets even started to not permit bus speed changes. The 1.8A was rated for a 100 MHz bus (quad-pumped to 400 MT/s) but could be configured with a 133 MHz bus (533 MT/s) which would take its 18x multiplier and give 2.4 GHz. Some boards allowed a 166 MHz bus, which would clock Northwood to 3.0 GHz... Going this far needed voltage boosted above 1.7V and was rare. The Abit BD7II was popular for overclocking and a CPU like this could be reasonably expected to reach a 150 MHz bus and 2.7 GHz.
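
The overclocking arithmetic above is just bus clock times the locked multiplier, with the quad-pumped bus moving four transfers per clock. A small sketch in Python, assuming the nominal 133 and 166 MHz buses are really 133.33 and 166.67 MHz (the function names are illustrative, not from any tool):

```python
def core_clock_mhz(bus_mhz: float, multiplier: float) -> float:
    """Core clock is the bus clock times the CPU multiplier."""
    return bus_mhz * multiplier


def quad_pumped_mt_s(bus_mhz: float) -> float:
    """Effective transfer rate of a quad-pumped bus."""
    return bus_mhz * 4


MULTIPLIER = 18  # locked multiplier of the 1.8A, as stated in the text

for bus in (100, 133.33, 150, 166.67):
    clock_ghz = core_clock_mhz(bus, MULTIPLIER) / 1000
    print(f"{bus:.0f} MHz bus = {quad_pumped_mt_s(bus):.0f} MT/s -> {clock_ghz:.1f} GHz")
```

With the multiplier locked, the bus clock is the only lever left, which is why chipsets that refused bus speed changes were such a problem for overclockers.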


Process | Clock | Feature Count | Platform
130 nm | 1800 MHz | 55,000,000 | mPGA478
L2 Cache | Speed | Width | Bandwidth
512 kB | 1800 MHz | 256 bit | 56.4 GB/s
AMD AthlonXP 2000+ AGOIA - 2002
Week 20, 2002, AGOIA 9 code...pretty much identical to the above two only for the 2000+ (1.67GHz) rating. Except whoever bought it was a complete idiot and made a hell of a mess of the thermal compound.

Don't do this, people. Ever.

This one was manufactured not long after its January 2002 release, week 20 working out to be mid-May.
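
Turning a date code into a calendar date is easy to check. This sketch assumes the week numbers on the package follow ISO 8601 week numbering, which is an assumption rather than anything documented here; the helper name is made up:

```python
from datetime import date


def week_to_date(year: int, week: int) -> date:
    """Monday of the given week, assuming ISO 8601 week numbering."""
    return date.fromisocalendar(year, week, 1)


print(week_to_date(2002, 20))  # 2002-05-13
```

Under that assumption, week 20 of 2002 starts on 13 May, which matches "mid-May".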
Process | Clock | Feature Count | Platform
180 nm | 1667 MHz | 37,500,000 | Socket A
L2 Cache | Speed | Width | Bandwidth
256 kB | 1667 MHz | 64 bit | 13.0 GB/s
AMD AthlonXP 2000+ KIXJB - 2003
The 130nm node for AMD on the K7 core was Barton (below) and Thoroughbred. Barton had 512kB L2 cache, Thoroughbred had 256kB (a "Thorton" version was a Barton core with half the cache turned off, the name is a combination of the two). Most Thoroughbreds were rated to 2400+ and above (1.8GHz) but this one is a rarity, a 2000 rated Thoroughbred.

The KIXJB stepping code tells us that it's the Thoroughbred-B core (CPUID 681) redesigned for increased scaling...So why would a Thoroughbred-B be only rated for 1.67GHz? Probably market pressure. It was cheaper to make Thoroughbred-B than it was to make Palomino and the power use of Thoroughbred-B at this kind of clock would be quite low, it was rated for a typical power of merely 54.7W at 1.65V.

Thoroughbred-B AthlonXPs at 1700+ - 2100+ were quite rare but by no means impossible to find, especially for mobile applications.

Process | Clock | Feature Count | Platform
130 nm | 1667 MHz | 37,200,000 | Socket A
L2 Cache | Speed | Width | Bandwidth
256 kB | 1667 MHz | 64 bit | 13.0 GB/s

AMD AthlonXP 2500+ AQYHA - 2003
Although not a 2500+ Mobile (with the unlocked multiplier) "Barton" part, it shares the same stepping code as the mobiles and was produced in the same year and week (week 14, 2004) as many mobile parts. For overclockers this chip was about as good as it got, able to have its 333MHz FSB flipped to 400MHz and the core clock changed from 1833MHz to 2200MHz; it became an AthlonXP 3200+ in every way but the printing on the substrate.


That's exactly why I bought it. This chip served in the main PC, paired with first an Abit KX7-333 (which it was never happy with) and finally an Asus A7N8X-X. When I moved to Socket 939 and an Opteron 146, this pairing remained as a secondary machine. It served well for four years, always with a Radeon 9700 video card, until a heatsink failure finally destroyed it in October 2008. (Not permanently, it seems, it is now on a motherboard and will boot!)

There were two 2500+s. The common one was a 333MHz bus and an 11x multiplier, but a rarer 266MHz, 14x part existed running at 1867 MHz. AMD's in-house 760 chipset would not run faster than 266 MHz for the FSB, so AMD produced almost all of its AthlonXP speed grades in 266 MHz versions!

It was my main machine from 2004 to 2006 always running at 2200MHz and replaced the 1000MHz Celeron in the secondary machine immediately after it was retired from main use. Between its removal from the main desktop and placing as secondary, it was involved in a project to build a PC with absolutely no moving parts.

The AQYHA stepping is the highest stepping the AthlonXP was ever produced with. When AMD released them, people soon took note that every last one of them would clock to at least 2200MHz (I got just over 2.3 GHz with this). These parts continued in production until 2007 as Semprons on Socket A, the final part being the Sempron 3300+ which was identical in every way to the AthlonXP 3200+ - A little bit of ratings fudging from AMD, perhaps.
I'm aware of very small quantities of a 2,333 MHz Athlon XP 3200+, which ran on a 333 MHz bus with a 14x multiplier. I've never seen one in the flesh, but these would represent the fastest K7s ever made.
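
The clocks quoted for these Barton parts can be checked from the bus rating and multiplier. Socket A bus ratings are double-pumped, so a "333 MHz" bus is a clock of about 166.67 MHz. A sketch in Python, assuming the nominal ratings are rounded from 333.33 and 266.67 (the function name is hypothetical):

```python
def athlon_core_clock_mhz(bus_rating_mhz: float, multiplier: float) -> float:
    """Core clock from a double-pumped bus rating and a multiplier."""
    real_bus_clock = bus_rating_mhz / 2  # 333 rating -> ~166.67 MHz clock
    return real_bus_clock * multiplier


# Nominal ratings are rounded: 333 is taken as 1000/3 and 266 as 800/3.
parts = {
    "2500+ (333 bus, 11x)": (1000 / 3, 11),
    "2500+ (266 bus, 14x)": (800 / 3, 14),
    "2500+ pushed to 400 bus, 11x": (400, 11),
    "rare 3200+ (333 bus, 14x)": (1000 / 3, 14),
}

for name, (bus, mult) in parts.items():
    print(f"{name}: {athlon_core_clock_mhz(bus, mult):.0f} MHz")
```

This reproduces the 1833, 1867, 2200 and 2333 MHz figures given in the text.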
Process | Clock | Feature Count | Platform
130 nm | 1833 MHz | 54,300,000 | Socket A
L2 Cache | Speed | Width | Bandwidth
512 kB | 1833 MHz | 64 bit | 14.3 GB/s

AMD AthlonXP 2800+ AQYHA - 2003
The 2800+ represented the sweet spot for people not willing to overclock. The code "AXDA2800DKV4D" tells us the processor is a 130nm AthlonXP (AXDA), it's a 2.0GHz part with a 166MHz bus (2800), the packaging is OPGA (D), it is rated for 1.65V (K), 85C temperature (V), has 512kB L2 cache (4) and the bus speed is 333MHz (D).
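
The field-by-field reading above maps neatly onto a lookup. This is only a sketch: the tables hold just the values quoted in the text, the field positions are assumed from this one example, and `decode_opn` is a made-up name rather than any official tool.

```python
# Lookup tables contain only the codes explained in the text above.
FAMILY = {"AXDA": "130nm AthlonXP"}
PACKAGE = {"D": "OPGA"}
VOLTAGE = {"K": "1.65V"}
MAX_TEMP = {"V": "85C"}
L2_CACHE = {"4": "512kB"}
BUS = {"D": "333MHz"}


def decode_opn(code: str) -> dict:
    """Split a 13 character code such as AXDA2800DKV4D into its fields."""
    if len(code) != 13:
        raise ValueError("expected a 13 character code")
    return {
        "family": FAMILY.get(code[0:4], "unknown"),
        "rating": code[4:8] + "+",
        "package": PACKAGE.get(code[8], "unknown"),
        "voltage": VOLTAGE.get(code[9], "unknown"),
        "max_temp": MAX_TEMP.get(code[10], "unknown"),
        "l2_cache": L2_CACHE.get(code[11], "unknown"),
        "bus": BUS.get(code[12], "unknown"),
    }


for field, value in decode_opn("AXDA2800DKV4D").items():
    print(f"{field}: {value}")
```

Any code letter not listed here comes back as "unknown" rather than being guessed, and shorter codes such as the older AX1800MT3C use a different layout and are rejected.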

Model codes aside, the 2800+ was a nippy little processor on the Barton core, which was as advanced as AthlonXPs ever got. Most enthusiasts forwent the 2800+ for the 2500+, which was also on the 333MHz bus, but when pushed to 400MHz it became a 3200+ and almost all 2500+s made this jump; I say "almost", but I've never seen one which wouldn't. A 2800+ had a 12x multiplier (Socket A provided for a maximum multiplier of 12.5) and would end up at 2.4 GHz if just bumped to a 400 MHz bus. Very few K7 processors were happy that fast.

The double size cache on the Bartons didn't really do much for their performance. Ever since Thunderbird, the first socketed Athlons, they were held back by the cache being only 64 bits wide, so being quite slow. Athlon64 would later remedy this with a 128bit cache, but AthlonXP was too early for that.
Process | Clock | Feature Count | Platform
130 nm | 2000 MHz | 54,300,000 | Socket A
L2 Cache | Speed | Width | Bandwidth
512 kB | 2000 MHz | 64 bit | 15.6 GB/s

Intel Celeron 2.6GHz SL6VV - 2003

This FCPGA 478 Celeron is based on the Pentium4 Northwood core, the 130nm part. It's fairly uninspiring, maybe about as fast as an AthlonXP 2000+ (1667 MHz) in most tests. The FSB is the same 400MHz bus that the P4 debuted with, but the 512kB L2 cache on Northwood has been axed to 128kB (three quarters of it is disabled), see the 2800 MHz Celeron below for more specifics.

What's remarkable about this one is not what it is, but where it came from. It came out of a Hewlett Packard Pavilion ze5600. Not tripping any alarms? That's a laptop! This desktop CPU came out of a laptop! It does not support any power management (Intel's Speedstep) which is a dead giveaway. Look up the S-spec if you want more proof.

Everything about the laptop was, as its former owner informed me, "crummy". The display was a 15.4" TFT at 1024x768, the kind of panel you find in cheap desktop displays (the laptop equivalent is usually 1280x1024 or higher) and it used the ATI RS200M chipset, perhaps the crappiest P4 chipset ever made other than the i845.

I noticed the laptop was thermal throttling under any kind of load, so I took it apart (Service Manual for HP Pavilion ze4x00/5x00, Compaq nx9000 series, Compaq Evo N1050V/1010V, Compaq Presario 2500/2100/1100) and cleaned it up. Taking the heatsink off to clean it I noticed the familiar FCPGA 478 socket, lever and all, for a standard desktop CPU, which is exactly what was populating it, this CPU. I removed it, put the 2.8GHz model in (below), removed the crappy thermal pad (foil and graphite based) and replaced it with thermal paste. On re-assembly, despite the CPU being 4W higher in TDP, it ran between 3 and 8 degrees celsius cooler and didn't thermal throttle at all. One benchmark, GLExcess1.1, went from a score of 938 to a score of 1624 and several games became playable.
Process | Clock | Feature Count | Platform
130 nm | 2600 MHz | 55,000,000 | Socket 478
L2 Cache | Speed | Width | Bandwidth
128 kB | 2600 MHz | 256 bit | 81.2 GB/s

Intel Celeron 2.8GHz SL77T - 2003

Intel's Pentium4 based Celerons were even weaker than their PentiumIII ancestors and they were soundly spanked silly by Durons two thirds of their price. The Pentium4 core was quite weak on a clock for clock basis to begin with and it was very sensitive to cache and memory bandwidth. The Celeron had a quarter of the cache of its Northwood big brother and ran on the 400MHz bus (as opposed to the 533MHz or 667MHz of the full Northwood) that the very first Pentium4, the entirely unlovely Willamette, had used on introduction two years earlier. To make matters even worse, they were paired with a single channel of DDR memory, usually PC2100 or PC2700. To say it was slow was like saying water was wet. The Athlon XP 2800+ (above) wiped the floor with it for about the same money.

It just wasn't sensible to buy such a part. The equivalent performance Pentium4 was around 400 to 600MHz lower in frequency and still available through both OEM and retail channels - For about the same price too! If cost cutting was the order of the day, a cheap motherboard, a stick of PC3200 and an AthlonXP 3200+ was a far better system, cost less and used less power.

This one came out of a dead eMachines. No surprise there, then. It's been put back into use, replacing the 2.6GHz part just above.
Process | Clock | Feature Count | Platform
130 nm | 2800 MHz | 55,000,000 | Socket 478
L2 Cache | Speed | Width | Bandwidth
128 kB | 2800 MHz | 256 bit | 87.5 GB/s
Intel Pentium 4 HT 3.0 GHz SL6WK - 2003

This P4 (later re-released as "Intel® Pentium® 4 Processor 745 supporting HT Technology") was about the fastest thing you could put on an Intel motherboard for some time. The 3.2 was faster, but scarily expensive.

Northwood was the 130 nm shrink of Willamette, and doubled the L2 cache. Simultaneous multi-threading (SMT, also known as "HyperThreading") was present in Willamette, but never enabled. By 2004, almost all Northwoods had SMT enabled as AMD was, to a word, kicking Intel's ass. SMT on the Pentium 4 was a mixed bag. Some applications saw quite a boost from it (LAME-MT), while others actually saw a performance hit.

SMT allowed the CPU to pretend to be two CPUs, so run two threads at once. This placed more demand on caches but theoretically allowed the CPU to remain busy doing useful work if it was waiting on memory for one thread. In practice, the two threads fought each other for cache and the P4's narrow and deep pipelines didn't offer enough instruction-level parallelism (ILP) to allow two to run concurrently.

Many speculated at the time that SMT would probably run better on AMD's K8s with their 1 MB L2 cache and much wider execution units. Short of a single in-house prototype of an SMT-supporting Sledgehammer, which I'm aware was made, we never got to see if this was true. At this time, AMD thought SMT was better done with a multiplication of ALUs, as DEC did with the Alpha 21264. It is not surprising that AMD took a lot of inspiration from DEC, as the chief architect of the Alpha was Dirk Meyer, who was the chief architect of K7 (Athlon) and eventually became CEO of AMD. This thinking would eventually result in Bulldozer.

Pentium 4 was, in general, an acceptable CPU. It used a lot of power and ran very slowly in most FP-heavy workloads, but software supporting Intel's SSE2 was much faster on the P4 than anything else. Its major downside was that it was extremely power hungry. This P4 was electrically rated to 112 watts, while AMD's AthlonXP 3200+ (about the same performance) was just 61 watts maximum.

With the AthlonXP 3200+ being so much cheaper, as well as on a more mature platform and lacking P4's extreme dislike of some workloads, AMD gained a large amount of marketshare while Pentium 4 was Intel's flagship. With Intel on the back foot, it actually released some Xeons as "Pentium 4 Extreme Edition". The Northwood's "Xeon-relative" was Gallatin, which was identical to the Northwood but had 2 MB of L3 cache. They were electrically rated as high as 140 watts. Oh, and they cost $999. That's right, one thousand US dollars. And still just a tiny bit faster than a $125 Athlon XP 3200+. Choose the workload carefully, such as some video encoders (XviD was one), and the Athlon XP could be as much as 40% faster than the thousand dollar space-heater.
Process | Clock | Feature Count | Platform
130 nm | 3000 MHz | 55,000,000 | Socket 478
L2 Cache | Speed | Width | Bandwidth
512 kB | 3000 MHz | 256 bit | 96 GB/s

AMD Duron 1400 MHz MIXIB - 2004

The Appaloosa Duron part had been cancelled (it was to be a Thoroughbred with just 64 kB L2 cache, as with Spitfire) since it was cheaper to just make more Thoroughbred and turn off some of the L2 cache. Hence this Duron, the last, became known as "Applebred" or, earlier, "Appalbred".

Amusingly enough, while the model numbers of the Athlon XPs were all a sort of performance rating, the Durons were marked with their actual clock. AMD officially stated that the rating was relative to Athlon-C (Thunderbird) but we all knew it was being measured against Pentium4, especially as benchmark results showed similarly clocked Thunderbirds running quite a bit faster than the "rating" would indicate - the 1.46 GHz XP was rated to "XP 1700+" but a Thunderbird at just 1.55 GHz (the fastest I ever took one) was about the same. A Willamette P4 at 1.7, however, was more or less the same. Odd, that.
Process | Clock | Feature Count | Platform
130 nm | 1400 MHz | 37,200,000 | Socket 462
L2 Cache | Speed | Width | Bandwidth
64 kB | 1400 MHz | 64 bit | 11.2 GB/s

Intel Celeron-D 325 2.53GHz SL7C5 - 2004

After the release of Intel's power hungry Prescott core, which didn't exactly set the world on fire (being 10% slower than its predecessor, Northwood), Intel were quick to disable some cache on them and release their Celerons. In this case, the cache was disabled to 256kB from 1MB, exactly one quarter - it was all there, just 3/4 turned off. This part was later renamed the 325; it was initially just the "Celeron 2.53D".

The very same microarchitectural modifications which made Prescott slower than Northwood also made Prescott's Celerons faster than their Northwood brothers. The L1 cache is doubled in size, the L2 is doubled in size and all Celeron-D parts ran on a 533MHz FSB, up from the 400MHz of their predecessors. This would make it more or less the same speed as the older 2.8GHz part just above. The 2.8GHz part was rated by Intel's rather baroque "TDP" system for 65.4W and electrically specified for a maximum of 87W. The 2.53GHz Celeron-D changed this to a TDP of 73W and a maximum of 94W; a common complaint against Prescott was that it used more power even at lower clocks. Indeed, the full Prescott based Pentium 4 3.6GHz was electrically rated to 151W! That kind of power from a 112 mm² die made it hotter per unit area than a nuclear reactor.
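
To put a number on the per-area claim, the power density can be worked out from the two figures quoted above. A rough sketch in Python, assuming all 151W is dissipated evenly across the 112 mm² die (the function name is illustrative):

```python
def power_density_w_per_cm2(power_w: float, die_area_mm2: float) -> float:
    """Average power density, assuming heat is spread evenly over the die."""
    die_area_cm2 = die_area_mm2 / 100  # 100 mm^2 in one cm^2
    return power_w / die_area_cm2


print(round(power_density_w_per_cm2(151, 112), 1))  # 134.8
```

That is roughly 135 W per square centimetre as an average; real dies have hot spots well above the average.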

Process | Clock | Feature Count | Platform
90 nm | 2533 MHz | 125,000,000 | Socket 478
L2 Cache | Speed | Width | Bandwidth
256 kB | 2533 MHz | 256 bit | 79.2 GB/s

AMD Opteron 146 CACJE 2.0GHz - 2005
At a stock clock of 2.0GHz, the Opteron 146 wasn't really that much faster than the AthlonXP 3200+. It did have faster L2 cache and twice the memory bandwidth, but it was 200MHz behind in core clock. That is, however, where the similarity ends. While the fastest anyone could push an AthlonXP to was around 2.4GHz, the 90nm Venus/San Diego core of this Opteron didn't need anything special to hit 2.7GHz at either 270MHz x 10 or 300MHz x 9, the latter being preferable as it allowed the memory to be cleanly clocked to 200MHz. At those kinds of clocks, it was not at all different to the Athlon64 FX: same dual channel DDR, same 1MB cache, same 90nm E4 revision core. I wasn't even pushing this one. This particular stepping and date has a very good record for passing 3.0GHz. It simply hit the limits of what my air cooler, a six heatpipe Coolermaster Hyper 6+, could handle.
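
The clock arithmetic here is just reference clock times multiplier. A small Python sketch, with hypothetical names, covering the two 2.7GHz settings mentioned above. The 200MHz stock reference clock is my assumption, inferred from the 2.0GHz stock clock and from the 5x, 1GHz idle state described further down.

```python
def core_clock_mhz(reference_mhz: float, multiplier: float) -> float:
    """Core clock as reference clock times CPU multiplier."""
    return reference_mhz * multiplier


# (reference clock in MHz, multiplier). The 200 MHz stock value is an assumption.
settings = {
    "stock": (200, 10),
    "overclock, high multiplier": (270, 10),
    "overclock, high reference": (300, 9),
    "idle (5x)": (200, 5),
}

for name, (reference, multiplier) in settings.items():
    clock = core_clock_mhz(reference, multiplier)
    print(f"{name}: {reference} MHz x {multiplier} = {clock:.0f} MHz")
```

Both overclocked settings land on the same core clock, so the choice between them comes down to what the memory and motherboard prefer.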

The fastest single core processor was either the Pentium 4 Extreme Edition 3.73 GHz or the AMD Athlon 64 FX 57 at 2.8 GHz. Usually, you wanted the Athlon 64 FX 57. The Athlon 64 FX series were the same silicon as the Socket 939 Opterons, like this Opteron 146, even the same binnings. If anything, the Opterons were better. I ran this guy at 2.72 GHz most of the time: It was neck and neck with the fastest single core CPU in the world. AMD had added SSE2 to the CPU over the SSE supported by AthlonXP, but SSE2 on the Athlon64 or Opteron was little faster than standard MMX or even x87 code. Intel's first Core processor had the same issue: It could run SSE2, but was usually even slower at it.

Compared to the earlier Sledgehammer core Opteron 146 (130nm, 2003 release), the new Venus core, released in early August 2005, was 90nm and labelled 'High Efficiency' by AMD because it used much less power than the 130nm part did. It was also for Socket 939!

Opteron was, of course, underclocked by AMD for reliability and heat. The 2.0GHz AthlonXP used 57.3W under load according to AMD yet, with twice the cache, the Opteron 146 was rated for 89W, the exact same as its 2.2, 2.4 and 2.6 brothers, the 148, 150 and 152. It doesn't take a genius to work out that AMD were being very conservative with the Opteron 146: actual power draw measurements under load give numbers around 46-55W. Of course, pumped up from 1.40V to 1.50V and clocked to 2.8GHz, it could easily pass 90-100W.

An amazingly flexible little chip which, if allowed to do its Cool-n-Quiet thing and drop its multiplier to 5x (and voltage to 1.1V) when it had no load, would happily idle at 1GHz at perhaps two degrees above the system's ambient temperature. Great for a firebreathing gaming machine with the clocks up, great for a silent living room PC at standard. There wasn't much the Opteron 146 couldn't do...except keep up with the inevitable march of technology.

This CPU had a short stint powering this server running at 2.7GHz on an MSI K8N Neo2 Platinum without breaking a sweat. It was then sold... Then bought back... Then sold again! It was still in use in 2012.

People using Tech Power Up's comprehensive CPU database: It is often inaccurate for CPUs older than around 2013. This E4 Opteron 146 has the wrong TDP (it's really 67 W instead of 89 W) and falsely claims the multiplier was locked. No Socket 939 Opteron had a locked multiplier. CPU World's database is typically more accurate, but often incomplete.
Process | Clock | Feature Count | Platform
90 nm | 2000 MHz | 114,000,000 | Socket 939
L2 Cache | Speed | Width | Bandwidth
1024 kB | 2000 MHz | 128 bit | 31.25 GB/s

[Back exactly as the 146 above]
AMD Opteron 165 LCBQE 1.8GHz - 2005
The best of the best Toledo cores, usually from the very middle of the wafer, were selected and underwent additional testing before being labelled as the Denmark, AMD's 1P dual core Opteron. Most of them were sold on Socket 940, usually as upgrades from earlier single core Opterons such as the Sledgehammer or Venus (The Opteron 146 above is a Venus and, believe it or not, released only two days before this dual core part).

LCBQE stepping code was later than the CCBWE code, seen from late 2006 through to the last few Opteron 2P HE parts in early 2009. It was Toledo (Consumer), Denmark (Opteron 1P), Egypt (Opteron 8P), and Italy (Opteron 2P). All were 90 nm and JH-E6 stepping.

The CCBWE stepping code began in mid-2005 with San Diego and Toledo, but was also seen on Denmark, Italy and Egypt, as well as Roma, the mobile Sempron.

Decoding this was simple. In order, the letters meant:

Revision/Production: Early Venice was A, most production was C, later was L
Configuration: A is 1 core, 1 MB L2 cache, B is 1 core 0.5 MB, C is 2 core 1 MB, D is 2 core 0.5 MB. Even if features are disabled, they'll still be coded here. You wanted codes A or C!
IMC Revision: AMD's DDR controller and "uncore", so the system bus interface and HyperTransport, went through numerous revisions. BQ is quite late. The counting works by the 4th letter starting at A and going to Z, then it wraps back to A and increments the 3rd letter.
Die Model: Very early Sledgehammer prototypes were seen as B here, but all release chips seem to have been C or above. C was Clawhammer, D was Winchester, E was San Diego, Toledo, etc. and F was Windsor.
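
The decoding above can be sketched in a few lines of Python. This is only an illustration of the letter positions as described here, not an official AMD decoder: the function and table names are made up, unlisted letters simply decode as "unknown", and counting the IMC revision from AA as zero is my assumption.

```python
# Letter meanings as listed above. Anything not listed decodes as "unknown".
REVISION = {"A": "early", "C": "most production", "L": "later"}
CONFIGURATION = {
    "A": "1 core, 1 MB L2",
    "B": "1 core, 0.5 MB L2",
    "C": "2 core, 1 MB L2",
    "D": "2 core, 0.5 MB L2",
}
DIE_MODEL = {
    "B": "early Sledgehammer prototype",
    "C": "Clawhammer",
    "D": "Winchester",
    "E": "San Diego, Toledo, etc.",
    "F": "Windsor",
}


def imc_ordinal(pair: str) -> int:
    """Position of a two-letter IMC revision, counting AA as 0 (assumption).

    The 4th letter runs A to Z, then wraps and bumps the 3rd letter.
    """
    third, fourth = pair
    return (ord(third) - ord("A")) * 26 + (ord(fourth) - ord("A"))


def decode_stepping(code: str) -> dict:
    """Split a five-letter stepping code such as LCBQE into its fields."""
    code = code.strip().upper()
    if len(code) != 5 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"expected five letters, got {code!r}")
    return {
        "revision": REVISION.get(code[0], "unknown"),
        "configuration": CONFIGURATION.get(code[1], "unknown"),
        "imc_revision": code[2:4],
        "imc_ordinal": imc_ordinal(code[2:4]),
        "die_model": DIE_MODEL.get(code[4], "unknown"),
    }


print(decode_stepping("LCBQE"))
```

For LCBQE this gives a later revision, a 2 core 1 MB configuration, IMC revision BQ and an E die model, which agrees with the Opteron 165 described in this entry.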

By the time we got to the "L" codes in Opteron, they were all high end Opteron X2 processors on Socket 940. Some, however, were released on Socket 939 where they represented a high quality, low clock, low heat alternative to the Athlon 64 X2. At heart, all Opterons on Socket 939 were really Athlon 64 FX processors, just underclocked and undervolted so they ran very cool.

That's not why we bought them.

We bought them because they were usually even higher quality parts than the FX series so their maximum clock was usually very high. Around one in five would pass 3GHz and almost all of them would easily pass 2.7GHz. With the Athlon X2s, most overclocking efforts were limited by one core which would run much hotter than the other, a result of being slightly flawed on the substrate (but not faulty). Opterons were selected from parts of the substrate highly unlikely to carry flaws, so both cores were cool-running. Without the handicap of a core being less scalable than the other, Opterons would hit higher clocks. The 90nm AM2 Windsors (JH-F2) were never very happy above 2.8 GHz, yet their Opteron counterparts almost all were capable of 3.0 GHz, regardless of what factory marking they carried.

This particular part was manufactured in week 12 2007 on AMD's 90nm process in Dresden, Germany. It ran my main PC on the Asus A8N-SLI Deluxe in the Motherboards section for some time. The stepping code, LCBQE, was first seen in the Opteron 290, a 2.8GHz dual core Socket 940 part. Where most of the Opteron 1xx series were little more than 'overqualified Athlons', the later revisions of them (identified by the code beginning with L rather than C) were genuine Opteron 2xxs which were marked down to 1xx to meet demand (the 2xx series were the same silicon too). They were utter gold. This one was under a by-no-means incredible aircooler (Coolermaster Hyper6+) and was very happy at a mere 2.7GHz. Alas, the motherboard conspired to hold it back since the cooler on the nForce4SLI chip wasn't very good. So at the bus speeds (300MHz+) required for a decent clock, the motherboard itself became unstable.

In all, the exact same thing as the 146 above, just E6 (vs E4) revision and dual core. The Athlon X2 equivalent was the 4400+ model.

At 2.88 GHz (maximum, the CPU was usually lower due to power management), this CPU had a two year stint powering the server you're viewing this page from, before being replaced by a Brisbane Athlon X2 4200+ (at 2.52 GHz). It would boot quite readily at 3.1GHz but wasn't quite stable there. The fastest Toledo AMD released was the 2.6 GHz Athlon 64 FX-60, so this CPU was 280 MHz faster.

It was sold and finally became a secondary machine in 2016 when a series of power outages made the machine lose its BIOS settings. The user, thinking it had broken (it lost its boot order), replaced it.

People using Tech Power Up's comprehensive CPU database: It is often inaccurate for CPUs older than around 2013. This E6 Opteron 165 does not have a locked multiplier.
Process | Clock | Feature Count | Platform
90 nm | 1800 MHz | 233,200,000 | Socket 939
L2 Cache | Speed | Width | Bandwidth
1024 kB x2 | 1800 MHz | 128 bit | 28.13 GB/s

Intel Celeron-M 420 SL8VZ - 2006
Celeron-M 420 was around the slowest thing Intel would sell you. It had one Yonah-class core, 1 MB L2 cache, ran a 533 MHz bus and was built on 65 nm. Not that interesting, so the tech writer has to improvise to hold his reader's attention. That's you, good sir.

Celeron-M 420 had an "SL8" S-spec, and Celeron-M 430 was an "SL9", so there was clearly some difference. What was it? The core stepping had moved from C0 to D0 with SL9 specs, but was the same Yonah-1024 single core, 1 MB die. The stepping doesn't appear to have changed much. Intel also changed the capacitor layout on the rear of the CPU from C0 to D0.

In mobile, Yonah-1024 replaced Celeron-D and Mobile Celeron parts, which were based on the Pentium4 architecture, power hungry, inefficient, low performance. The 1.6 GHz Celeron-M here would run software at around the same speed as the 2.53 GHz Celeron-D 325 just above.
Process | Clock | Feature Count | Platform
65 nm | 1600 MHz | 151,000,000 | Socket-M (mPGA478)
L2 Cache | Speed | Width | Bandwidth
1 MB | 1600 MHz | 256 bit | 51.2 GB/s

Intel Celeron-M 430 SL9KV - 2006
In January 2006, Intel dropped the unassuming Core Duo and Core Solo parts on the market. They were mobile only, like all the Banias derivatives. April saw the first three Celeron-Ms drop, the 410, 420 and this 430.

Intel made two silicon variants of it, Yonah and Yonah-1024. The former was dual core and 2MB L2 cache, the latter was half of both. From the size of the die here, around the same as Merom-L, this appears to be the full Yonah with a core and half the cache disabled, instead of not present at all. It's hard to say for certain without good images of both.

At 27 watts TDP, Celeron-M 430 was very good for a mainstream mobile part and wiped the floor with Intel's Pentium4 Mobile.
Process | Clock | Feature Count | Platform
65 nm | 1733 MHz | 151,000,000 | Socket-M (mPGA478)
L2 Cache | Speed | Width | Bandwidth
1 MB | 1733 MHz | 256 bit | 55.5 GB/s

AMD Athlon 64 X2 5200+ CCBIF - 2006

AMD's Windsor core, at 227 million transistors (see note), was the introductory part for AMD's AM2 platform, supporting DDR2 memory. The performance difference was, in a word, zero. All the initial AM2 parts were to the "FX" spec, having the full 1 MB L2 cache per core, though later 512 kB parts arrived, either as Windsor with some cache disabled or Brisbane, the 65 nm shrink.

Confusing matters was that there were two different 2,600 MHz parts, the 5000+ Brisbane or the 5200+ Windsor, and two different 5200+ parts, the 2.6 GHz Windsor and the 2.7 GHz Brisbane. AMD thought that half the L2 cache was about the same as 100 MHz core clock, with some justification.

Confusing matters more was that some Windsors were marked as Brisbane, ones which had 512 kB cache natively and none disabled but were still the 90 nm part. This part had no "codename" and was also called Windsor. A weird 3600+ 2 GHz model had just 256 kB L2 per core and was variously made using 1024 kB or 512 kB silicon. In June 2006, AMD announced it would no longer make any non-FX processors with the full 1 MB cache, although they still trickled through until 2008. This one is marked as being produced week 50 2007 and was bought mid-2008.

The dies seemed to exist as just two units: A very large almost square one and a smaller rectangular die. The larger one was the 1024 kB L2 version, the smaller one was the 512 kB variant.

The actual silicon being used was named per its featureset, not what was being manufactured, except when it wasn't. It was bizarre, confusing, and the table below is not given with full confidence. AMD used the same Athlon 64 X2 xxxx naming for all these products, except some of the later 65 nm parts, which dropped the "64" to become just Athlon X2.

Name | Core | L2 | Features | Process | Socket
Athlon 64 X2 | Manchester | 512 kB | ~150,000,000 | 90 nm | Socket 939
Athlon 64 X2 | Toledo | 1024 kB | 233,200,000 | 90 nm | Socket 939
Athlon 64 X2 | Windsor | 1024 kB | 227,400,000 | 90 nm | Socket AM2
Athlon 64 X2 | Windsor | 512 kB | 153,800,000 | 90 nm | Socket AM2
Athlon 64 X2 | Brisbane | 512 kB | 153,800,000 | 65 nm | Socket AM2


Manchester was an oddball. It was a revision E4 - same as San Diego and Venus - but dual core. The probable explanation would be that Manchester was not "true" dual-core, but half-cache San Diegos on the same substrate, but a de-capped Manchester was a monolithic die. Additionally, some E6 (Toledo) CPUs were marked as Manchester with half their cache disabled. Most Manchesters were likely actually Windsor silicon.
Quite a lot of 512 kB Windsors were actually the 227 million, 1024 kB part with half the cache turned off. For some reason, AMD had a lot of yield problems with Windsor, which it did not have with Toledo on the same 90 nm SOI process. Toledo would almost always clock higher and run cooler, but it was Windsor which was in the highest bins.

Performance was per clock identical to the Socket 939 Opteron 165 above. While the DDR2 AM2 socket had more bandwidth available, it was higher latency and AMD's DDR2 controller was really not very good.

It wasn't an upgrade for me, though. I was replacing the Opteron 165 above (I broke the Socket 939 motherboard), which ran at 2.8 GHz. This Windsor would just run at 2.8, but needed a huge core voltage boost to do so, and ran very hot. The F2 revision was never produced faster than 2.6 GHz and F3 only made it to 3.2 GHz. Oddly, the 65 nm Brisbane was never released that fast, its highest bin being 3.1 GHz for the fantastically rare 6000+. My own Brisbane (a 4000+, 2.1 GHz) will not clock past 2.7 GHz no matter how hard it is pushed. Usually a die shrink means higher clocks, but AMD's 65 nm process or its DDR2 K8s seemed to just not work very well. Most likely, the critical path optimisation needed to get more clock out of K8 just wasn't worth it.

After AMD's success on the 90 nm fabrication process, and with the same designs moving to the 65 nm process, observers expected great things. Surely if most of AMD's chips on 90 nm would approach 2.8 GHz, and some hit 3.0 GHz, then 65 nm would mean AMD would be able to push 3.5 GHz and more. It did not work out that way. 65 nm did not appreciably reduce power and did not appreciably increase clock headroom. In fact, the majority of 65 nm Athlon X2 chips would not clock as high as their 90 nm brothers, ran hotter and were never quite as happy when overclocking. The Brisbane 4000+ that replaced this 5200+ wouldn't go much beyond 2.6 GHz, for example. The likeliest explanation is that AMD's 65 nm silicon-on-insulator process node was just bad.

Process | Clock | Feature Count | Platform
90 nm | 2600 MHz | 227,400,000 | Socket AM2
L2 Cache | Speed | Width | Bandwidth
1024 kB x2 | 2600 MHz | 128 bit | 41.6 GB/s
For some reason, AnandTech seems to think Windsor has 154 million features, which is accurate for Brisbane (and possibly Manchester), and then goes on about how the die size isn't as small as it should be, since Anand was thinking that Windsor had many fewer transistors than it did! Brisbane has fewer as it has only half the L2 cache that Windsor does.

AMD Turion 64 X2 LDBDF - 2006
Turion was the brief branding AMD did for some late K8 and the mobile/embedded K10s. The part code TMDTL50HAX4CT tells us it is the TL50 model with the core name "Taylor". There was also the overlapping name "Trinidad".

It is seen here nestled in Socket S1, where it ran at 1.6 GHz on an 800 MHz HyperTransport bus. It used 90 nm manufacturing technology and sported 153.8 million transistors. L2 cache was 256 kB per core. On the desktop, this exact silicon was called "Windsor" and usually branded Athlon X2. Desktop variants had all 512 kB per core L2 cache enabled, but only the "Trinidad" mobile versions had all the cache turned on. "Taylor" was cut back quite a lot.

The 638 pin Socket S1 was AMD's first ever mobile-only socket. Before then AMD had used Socket 754, Socket A, and even Super 7. This was not unusual: Intel had also used Socket 370 and the semi-compatible FCPGA 370 on laptops and, at the time, was using the desktop Socket 478 in laptops.

Socket S1 went through four generations: S1g1 was equivalent to Socket AM2, S1g2 was equivalent to AM2+, S1g3 equivalent to AM2+, and S1g4 was comparable to AM3. The Turion 64 X2 was a quite reasonable way to get the extra responsiveness of a dual core on a mobile machine, particularly when high CPU performance was not needed. Thermal design power was 31 watts, which was high for a mobile part of this performance.
Process | Clock | Feature Count | Platform
90 nm | 1600 MHz | 153,800,000 | Socket S1
L2 Cache | Speed | Width | Bandwidth
256 kB x2 | 1600 MHz | 128 bit | 25.6 GB/s

Intel Celeron-M 540 SLA2F - 2006
Intel's Merom was the mobile name for Conroe, the desktop dual core Core 2, or Woodcrest, the server Core 2. Merom, Merom-L and Merom-2M existed. Merom-L had one core and 1 MB L2 cache. Merom had two cores and 4 MB L2 cache. Merom-2M had two cores and 2 MB L2 cache. All examples used the existing Yonah CPU architecture, developed by Intel in Israel. Yonah used shared L2 cache between the two cores.

This was Merom-L, one enabled core, 1 MB L2 cache, 533 MHz bus. Enhanced C1E power management and SpeedStep were disabled, so Celeron-M used much more power than it really needed to. This was to prevent its use in lighter machines with longer battery lives, as Celeron and Intel's Core 2-ULV were dramatically overlapped.

Merom was Intel's first ever "Tock" in the "Tick-Tock" paradigm, where a "Tock" was a new architecture on an existing process, and a "Tick" was the same architecture on a new process. Merom's corresponding "Tick" was Penryn on 45 nm.

Merom had many silicon variants, so three configurations were all "Merom": dual core 4 MB, dual core 2 MB, and single core 1 MB, which were "Merom", "Merom-2M" and "Merom-L" respectively. All of these were also called Conroe or Woodcrest if they were aimed at the corresponding segment... Intel was more of a marketing company than a manufacturing one, it seemed! This Celeron-M 540, for example, used both Merom-2M (Allendale) and Merom-L silicon.
Process | Clock | Feature Count | Platform
65 nm | 1866 MHz | 167,000,000 | Socket-P (mPGA479)
L2 Cache | Speed | Width | Bandwidth
1 MB | 1866 MHz | 256 bit | 59.7 GB/s

Intel Celeron-M 570 SLA2C - 2006
This Merom-L product ran at 2.26 GHz and made a decent entry-level laptop, but if we're on a Celeron-M, we have probably made many sacrifices already and the CPU won't be the limiting factor. This came out of a Fujitsu-Siemens with 2 GB RAM (2x1GB, so at least in dual channel) with a horribly slow Seagate Momentus hard drive.

It's a speed grade of the one just above. If you want details, they're there.
Process | Clock | Feature Count | Platform
65 nm | 2266 MHz | 167,000,000 | Socket-P (mPGA479)
L2 Cache | Speed | Width | Bandwidth
1 MB | 2266 MHz | 256 bit | 72.5 GB/s


Intel Core 2 Duo T7200 SL9SF - 2006
Another Merom processor, this time the full on dual core, 4 MB L2 cache model. It's plain that different silicon is in use here, highlighting the difference between Merom and Merom-L.

The 4 MB L2 cache Core 2 Duos were worryingly powerful on release, not only catching up to AMD's Athlon 64 FX and Athlon X2 processors, but soundly pulling ahead of them. While AMD had faster, lower latency memory thanks to the integrated on-die memory controller, Intel just threw a huge big dumb L2 cache at the problem. Pentium 4 was wholly unable to keep up with AMD. Core 2 didn't just keep up, it pulled ahead.

This sample was pulled from an elderly Dell Latitude D620 where, as can almost be seen in the background, it got to play with an Nvidia Quadro NVS-110M... A slightly knobbled GeForce Go 7300, which in turn was a castrated GeForce 7200 GS. It was better than the IGP and, well, that's about it.

It feels wrong to end the piece on a negative about an unrelated product, so we'll instead mention that the performance of the T7200 was sufficient that this laptop was a casual user's daily driver until well into 2015, with 2 GB RAM and Windows 7. At that point, 2 GB just wasn't enough, the Dell D620's keyboard was beginning to fail, and the Intel 945PM chipset was picky about its RAM support, so an upgrade to 4 GB was passed over in favour of a replacement device.
Process | Clock | Feature Count | Platform
65 nm | 2000 MHz | 291,000,000 | Socket-M (mPGA478)
L2 Cache | Speed | Width | Bandwidth
4 MB | 2000 MHz | 256 bit | 64.0 GB/s
[PENDING] Core 2 Quad Q6600 - 2007
Q6600 was the high end CPU to have in 2007. It was a multi-chip-module (MCM) based around two Conroe dies: Conroes intended for embedding into a quad were known as "Kentsfield". It was built on Intel's 65 nm process and was rated to a 105 watt TDP. The two Conroe dies had 4 MB L2 cache each and the CPU ran at 2.4 GHz from a 1066 MHz (4x266) FSB.

That's all a bit dry. Conroe was an overclocking beast. Most of them would trivially pass 3.0 GHz. Some samples were able to approach 4.0 GHz. Some passed 4.0 GHz. Intel back then did not permit overclocking on anything, so all multipliers were locked, and overclocking was done by raising the FSB speed.
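
With a locked multiplier, the FSB needed for a given core clock is simple division. A minimal Python sketch, using the 1066 MHz (4x266) FSB and 2.4 GHz figures from above. The names are hypothetical, and the multiplier of 9 is not stated in the text: it is derived here from those two numbers.

```python
QUAD_PUMP = 4          # the FSB rating is four times the base clock (4x266)
STOCK_FSB_MHZ = 1066   # effective FSB from the text
STOCK_CORE_MHZ = 2400  # stock core clock from the text


def base_clock_mhz(effective_fsb_mhz: float) -> float:
    """Base clock behind a quad-pumped FSB rating."""
    return effective_fsb_mhz / QUAD_PUMP


def effective_fsb_mhz(base_mhz: float) -> float:
    """Quad-pumped FSB rating for a given base clock."""
    return base_mhz * QUAD_PUMP


# Derived, not quoted: 2400 MHz over a 266.5 MHz base clock rounds to 9.
MULTIPLIER = round(STOCK_CORE_MHZ / base_clock_mhz(STOCK_FSB_MHZ))

for target_mhz in (3000, 4000):
    base = target_mhz / MULTIPLIER
    print(f"{target_mhz} MHz needs a {base:.0f} MHz base clock, "
          f"FSB {effective_fsb_mhz(base):.0f}")
```

The jump in base clock needed for 4.0 GHz is why the motherboard and memory mattered as much as the CPU sample.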

This one was tied with an Asus P5E motherboard (a slightly less featureful Maximus Formula) and 8 GB DDR2-800 RAM. Even in 2010, that was a worthwhile machine... Particularly with some overclocking!

Kentsfield and Conroe were regarded as legendary CPUs and firmly took back the performance crown from AMD, which had had it for nearly a decade.


Intel Core 2 Quad Q8300 SLGUR - 2008
Like other Core 2 Quads, these slightly reduced parts were made from either two Wolfdale dies or a single Yorkfield die. With the reduced L2 cache and slightly lower clocks, gamers typically preferred faster Core 2 Duos, such as the C2D E8400. In either case, the two dual-core modules are wholly independent and connected only via the front side bus. AMD made much of this, saying they were not "true" quad cores. AMD would have more of a point if its own 2008 offerings of Barcelona were any faster.

AMD was vindicated, however, as all multi-core processors today have a large shared L3 cache, which AMD introduced with the ill-fated Barcelona.
Process | Clock | Feature Count | Platform
45 nm | 2500 MHz | 456,000,000 | LGA-775
L2 Cache | Speed | Width | Bandwidth
2048 kB | 2500 MHz | 256 bit | 80.0 GB/s


Intel Core 2 Duo P8700 SLGFE
Socket P was Intel's mobile socket, where the surface mounted FC-BGA478 package wasn't used. With a 25 watt TDP, Penryn-3M was a tiny 82 mm^2 die (Full Penryn was 6 MB L2 cache and 107 mm^2) and actually smaller than the chipset which supported it.

Penryn was a die shrink of Merom, with Merom's 2/4 MB L2 cache options changed to 3/6 MB. The desktop Wolfdale was the same 107 mm^2 silicon as Penryn. Penryn-3M was Wolfdale-3M on the desktop. They were identical.

We see it here mounted in a Dell Latitude E6400, still working at 11 years old in 2020, slow but functional in Windows 10. At the time, these were the mobile CPUs you wanted. Laptops were still a generation or two from having reasonable video performance, even as some of them sprouted power-hungry discrete GPUs.

Intel's mobile platform was awful, but everyone's was. The Cantiga-GM (82GM45) chipset was rated for 12 watt TDP with the ICH9M adding 5 watts to that. Intel was in the business to make money, not good laptop chipsets, so the chipset was a desktop part (Cantiga was Eagle Lake on the desktop) run at a slightly lower voltage and clock rate to drop the power from 22 watts to 12 watts, with the 1333 MHz FSB cut and the GPU de-clocked. It was also manufactured on older lithography, likely 90 nm in this case. This enabled the production of products in older, paid-for, foundries. Of course they used twice the power they needed to, but where else were you going to go?

The P8700 itself was rated at 25 watts. Flat out, this is almost 45 watts just on the CPU and chipset. If we go back to 2008, when this was all new, 45 watts mobile was tolerated. It was okay for a performance or mainstream mobile platform.

Nobody liked it, and it was the reason why laptops were only expected to run on battery for 2-4 hours, but this Dell and its 45 watt-hour battery would run for those 2-4 hours. The Latitude E6400 was serious kit, built for easy and rapid service, the keyboard drains into drain channels for liquid spills, and is a field-replaceable unit (FRU). The chipset is cooled by an impregnated pad, the CPU by direct copper contact to the heatpipe, and the blower (very quiet up to 3,200 RPM) ran through a very small heatsink, little more than a few dozen small vanes.
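
The power figures in this entry can be added up directly. A rough Python sketch, with hypothetical names, using only the numbers quoted here: 25 watts for the CPU, 12 for the chipset, 5 for the ICH9M and a 45 watt-hour battery. It ignores the screen, drive and everything else, so treat it as back-of-envelope only.

```python
CPU_TDP_W = 25      # P8700
CHIPSET_TDP_W = 12  # Cantiga-GM
ICH9M_TDP_W = 5
BATTERY_WH = 45


def runtime_hours(battery_wh: float, draw_w: float) -> float:
    """Hours of runtime at a constant power draw."""
    return battery_wh / draw_w


def average_draw_w(battery_wh: float, hours: float) -> float:
    """Average draw implied by a given runtime."""
    return battery_wh / hours


flat_out_w = CPU_TDP_W + CHIPSET_TDP_W + ICH9M_TDP_W
print(f"CPU and chipset flat out: {flat_out_w} W")
print(f"Runtime at that draw: {runtime_hours(BATTERY_WH, flat_out_w):.2f} h")
for hours in (2, 4):
    print(f"{hours} h of battery implies about "
          f"{average_draw_w(BATTERY_WH, hours):.1f} W average for the whole machine")
```

The 2-4 hour figure therefore only works because the machine spends most of its time well below the flat out number.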

This picture was taken during refurbishment work in 2020, as the blower was running at 4,400 RPM (maximum) under moderate load. Dell had cleverly tied the blower throttle to the chipset temperature, as the chipset was furthest from the heatsink. The logic was that if the chipset was getting warm (55 Celsius was enough for full throttle) then the cooling system was under extreme stress, so it was appropriate to max out the fan.

After removing the heatsink, cleaning off the thermal interface and replacing it, and cleaning around a decade of dust and fluff out of the blower and heatsink, it was behaving much better. A 77C CPU under moderate load became 60C.
Process | Clock | Feature Count | Platform
45 nm | 2533 MHz | 230,000,000 | Socket-P (mPGA478)
L2 Cache | Speed | Width | Bandwidth
3072 kB | 2533 MHz | 256 bit | 81 GB/s


Intel Core 2 Quad Q9400 SLB6B - 2008
The L2 cache situation on Core 2 Quad was a little unusual.
Q6x00 had 2x 4MB
Q8x00 had 2x 2MB
Q9x00 had 2x 3MB
Q9x50 had 2x 6MB
Process | Clock | Feature Count | Platform
45 nm | 2667 MHz | 2x 230,000,000 | LGA-775
L2 Cache | Speed | Width | Bandwidth
3072 kB | 2667 MHz | 2x 256 bit | 2x 85.3 GB/s


Qualcomm MSM7227 - 2008
The CPU in here, the ARM11, wasn't the biggest and best even for the time. ARM11 was announced as available by ARM in 2002, as the ARM1136. The 1156 (adds the Thumb2 16 bit architecture) and 1176 (adds security like NX) would follow. ARM11 was ARM's workhorse for embedded performance for some time, particularly when ARM9's chops just weren't meaty enough.

The ARM11 core has a nominal 8 stages and can run in ARMv6 or Thumb mode. Thumb mode uses implied operands and 16 bit addressing, for low storage, low pincount, embedded designs. ARMv6 in general is designed for high instruction level parallelism, and indeed has significant parallelism within instructions. ARM11 introduces proper branch prediction (ARM9 did static "never taken" speculative execution) and has static "take if negative offset, never take if positive offset" prediction, so it prefers to branch backwards but not forwards. L1 cache is Harvard, 4-way set associative, and can be configured from 4 kB to 64 kB.
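
The static prediction rule described above is simple enough to write down. A toy Python sketch, not ARM's implementation: the function names and the example trace are invented, and treating a zero offset as "not taken" is my assumption since the rule only covers negative and positive offsets.

```python
def predict_taken(branch_offset: int) -> bool:
    """Static rule: backward branches (negative offset) are predicted taken.

    Forward branches are predicted not taken. A zero offset is treated as
    not taken here, which is an assumption.
    """
    return branch_offset < 0


def accuracy(trace) -> float:
    """Fraction of (offset, actually_taken) pairs the static rule gets right."""
    hits = sum(1 for offset, taken in trace if predict_taken(offset) == taken)
    return hits / len(trace)


# A loop that runs ten times: its backward branch is taken nine times,
# then falls through once on exit.
loop_branch = [(-16, True)] * 9 + [(-16, False)]
print(accuracy(loop_branch))
```

Loops are why the rule prefers backward branches: the branch at the bottom of a loop is taken on every pass except the last.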

Configured as the ARM1136EJ-S, the ARM11 core here is about as basic as it comes. The "E" means it has the enhanced DSP instructions (basic DSP SIMD), the "J" means it has Jazelle DBX, and the -S means the core is supplied in synthesizable form.

The CPU ran at 600 MHz, the GPU at 266 MHz, and the "baseband" at 400 MHz. Baseband was the modem and modem DSP for handling cellular telephony and data, in previous generations the entire hardware of a cellular telephone. In this generation, the two entire computers (baseband and application processor) had merged into the same silicon, but were architecturally still two distinct computer systems.

As implemented in the MSM7227, it was given 2x 16 kB L1 caches and a 256 kB L2 cache and had a single 16 bit LPDDR memory controller able to run up to 166 MHz (1.33 GB/s). It had the Adreno 200 GPU at 266 MHz, QDSPv5 "Hexagon" (JPEG, MPEG, MP3/WMA assist) in its very earliest iteration, and an "image signal processor" (ISP) (hardware de-Bayer) able to handle a camera up to 8 megapixels.

The GPU is its own little story. Qualcomm had no GPU other than a rasteriser in the earlier MSM7225 and MSM7625, but had recently bought Imageon from AMD. AMD had no designs on the embedded market, and ATi had developed Imageon for exactly that. Qualcomm had already licensed the Imageon Z430, as Adreno 200, to deliver OpenGL ES 2.0 support. The ISP was also part of the Z430. When AMD bought ATI, Qualcomm cheekily bought Imageon from AMD.

MSM7227 was everywhere in the Android 2.x generation, particularly at the entry level. It, almost single-handedly, drove Qualcomm's dominance in the Android space. Here we see it in a ZTE Blade, but it was also in devices such as the HTC Wildfire, the Samsung Galaxy Ace, and many, many, many others.

Here in the ZTE Blade, it had 512 MB RAM, 512 MB ROM (juuuust enough) and came as standard with a 2 GB micro-SD. The screen, an 800x480 3.5" IPS at 240 PPI was by far the highlight of the device.

This was the first wave of the smartphone revolution, 2009-2012. It was when iOS (which had yet to take that name) and Android both became mature systems.


AMD Phenom II X4 955 Black Edition CACYC - 2009
When AMD launched the "K10" or "K8L" Phenom in 2007, it was found to have a rather embarrassing bug: The translation look-aside buffers could sometimes corrupt data. This TLB bug was blown out of all proportion (for most people it would and could never happen) but as far as CPU errata go, it was a pretty big deal. AMD fixed it by slaughtering performance of the Phenom chip, until the B3 revision which had fixed silicon. Without the "TLB fix", a Phenom was clock-for-clock about 10% faster than an Intel Core 2 Quad in some tests, 10% slower in others: In general, it was a wash. With the fix, it was as much as 15% slower across the board with the occasional dip to 30%. Additionally, the top speed grade on launch was 2.5 GHz and AMD's top end Phenom X4 9950 Black Edition only hit 2.6 GHz, and would overclock to around 2.8 only with extreme persuasion. Intel was firing with a 3.0 GHz Core 2 Quad at this time. While Phenom had somewhat of a per-clock advantage, Intel just had more clock and so the performance leadership remained in the blue corner.

AMD released Phenom II in 2009 to go against Intel's Nehalem chips, the first generation Core i5 and i7s. In summary, the L3 cache increased to 6 MB from 2 MB and the TLB bug was fixed. Also, per-core power management was disabled. Transistor count increased from 463 to 758 million, the chip size reduced from 283 square millimetres to 258, L1 and L2 were made slightly smarter, prefetching was made more aggressive... Boring stuff, really. Clock for clock, AMD's new Phenom II was more or less identical to the older one... but it launched at 3.0 GHz and eventually reached 3.7 GHz as the Phenom II 980 Black Edition.

Phenom II was really just a very minor update to the original Phenom, the main difference was that it had three times as much L3 cache and, due to the quite good 45 nm process and some critical path optimisation, could clock very high. Even original Phenom IIs on release were hitting 3.4 and 3.6 GHz.

AMD's stepping code system didn't change for this era, but it did, of course, reset. Phenom II continued the stepping codes Phenom did, and it's not clear what any of these codes meant. "CACYC AC" was used for all manner of parts, but all based off the Deneb silicon. Some partial decoding was made:
Second letter A means quad core or dual core silicon. C means six-core.
Third letter C or higher means 45 nm "K10.1" based except when second letter is C.
Fifth letter B means a 65 nm K10, C means a 45 nm K10.1, D or E mean a 45 nm six-core K10.1

The core stepping was different. For K10.1 on 45 nm, there were C2 (around 3.8 GHz max), C3 (3.8-4.0) and D0/D1 (6-core). While rumours of an E-code for an eight core were out there, nobody ever saw one.

Later, a C3 stepping was released, which clocked much better and ran on less voltage. Most C2s topped out at 3.8 GHz no matter how hard they were pushed. C3s would hover around 4.0 GHz.

Phenom II disabled the original Phenom's per-core power states and instead ramped the entire four cores up and down as needed. This helped performance, but severely hurt power use. Phenom's on-die thermal sensor was also modified to only decline at a certain maximum rate, so on returning to idle after a load, it would not show true temperature but a higher temperature. This was actually very good for watercoolers which would otherwise spin down their fans too rapidly to take coolant temperature down, but this side-effect was unintentional, as it was there to prevent fans from rapidly changing pitch.
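
The sensor behaviour described above amounts to a rate limit on falling readings. A toy Python sketch of the idea, not AMD's actual logic: the function names, the temperatures and the 5 degree step are all invented for illustration, and letting rises pass straight through is my assumption.

```python
def reported_temperature(previous: float, actual: float, max_drop: float) -> float:
    """Next reported reading when falls are limited to `max_drop` per step.

    Rises are passed straight through (assumption). Falls are clamped so
    the reported value never drops by more than `max_drop` in one step.
    """
    if actual >= previous:
        return actual
    return max(actual, previous - max_drop)


def report_series(actual_readings, max_drop):
    """Apply the rate limit across a sequence of true temperatures."""
    reported = []
    previous = None
    for actual in actual_readings:
        if previous is None:
            previous = actual
        else:
            previous = reported_temperature(previous, actual, max_drop)
        reported.append(previous)
    return reported


# Illustrative numbers only: load ends after the second sample.
actual = [60, 60, 35, 35, 35, 35, 35, 35]
print(report_series(actual, max_drop=5))
```

A fan controller following the reported value keeps spinning for several samples after the load ends, which is the effect on watercooling described above.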

Phenom II was also available, soon after launch, on the AM3 platform, which ran on DDR3 memory. All Phenom IIs will work in AM2+ motherboards with DDR2 memory, as the CPU supports both. They'll even boot and run on an elderly AM2 motherboard, but the 125 W models will lock the multiplier to 4x (800 MHz) as AM2 lacks dual power planes.
It was, per clock, about 5-20% faster than Phenom. Clock for clock, Nehalem (as Intel's Core i7 965), was as much as 50% faster than Phenom II, though in some tests - games especially - Phenom II was Core i7's equal (and sometimes sneaked a small win). Clock for clock, however, is a poor metric across different architectures, or we'd all still be using Pentium 4. Price is a better one, and Phenom II usually beat Nehalem at the same price.

For the money, the Phenom II was quite acceptable and on the apex of the price/performance curve, right at the point where, if we go cheaper, the performance tanks and if we go faster, the price rockets. At launch, an AMD Phenom II X4 940 Black Edition would cost $240, a Core 2 Quad Q6600 was $190 and a Core i7 920 was $300 (and needed a $240 motherboard, while the Phenom II was happy with a $135 one).

Of the three, AMD's was almost exactly half way between the uber-expensive i7 and the older Core 2 Quad. Should one have saved the money and gone with a Core 2 Quad? It depended. Mostly, the Core 2 Quad was about equal to the Phenom II.

After a few releases, prices dropped. In 2009, it was Phenom II turf, all the performance of a fast Core 2 Quad Q9550 for cheaper, 8% cheaper in fact: The Phenom II 955 BE was introduced at $245, while a Core 2 Quad Q9550 was $266 - the AMD system used about 20% less power too. Add in that AMD's 790FX motherboards were around $40 cheaper than their Intel equivalents and you then had enough money left to bump the GPU a notch or two. If money wasn't an object, Intel's Core i7 could buy as much as 40% extra performance.

By 2012, AMD was still selling the then-old Phenom II X4, including this 955 model, for the same price as Intel's Pentium G2120. The Phenom II was around 25% faster except in solely single threaded tasks.

As of June 2014, the X4 955 Black Edition was going for about £55 on eBay. The king of the Phenom IIs, the X6 1100T BE, was over double that. The six core "Thuban" was really just Deneb with some minor revisions made and two more cores bolted on the side. It did have "turbo core" which was a really rough first generation implementation. The CPU would run three of its cores faster so long as it wasn't overloaded with power (125 W TDP) or temperature (95 celsius). This is because Windows, at the time, would do really dumb things with scheduling and would take a task off a 3.6 GHz core to give it to an 800 MHz powered down core.

By 2014, however, it was time to put the K10.1 generation away. Being a new parent at the time, that wasn't an option here! By 2015, they were definitely showing their age, but still quite useful even for moderate gaming. This Phenom II powered a primary gaming machine until February 2017, when it was really quite archaic, yet it performed quite well with a Radeon HD 7970 3 GB. It had given five years of reliable service and was bought as a pre-owned part to begin with. The manufacturing code on this chip is "0932" so it was manufactured in the 32nd week of 2009, and had therefore been retired from my service at the age of 8. It was donated to a friend, where it lived out its remaining days, around another year of service.
Process | Clock | Feature Count | Platform
45 nm | 3200 MHz | 758,000,000 | Socket AM3
L2 Cache | Speed | Width | Bandwidth
512 kB x4 | 3200 MHz | 128 bit | 51.2 GB/s
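
The bandwidth figure in the table above appears to be the clock multiplied by the interface width in bytes. A minimal Python sketch of that arithmetic, assuming one transfer of the full width per clock (the function name is ours, purely for illustration):

```python
def l2_bandwidth_gb_s(clock_mhz: float, width_bits: int) -> float:
    """Peak bandwidth in GB/s, assuming one full-width transfer per clock."""
    bytes_per_transfer = width_bits / 8          # 128 bit -> 16 bytes
    bytes_per_second = clock_mhz * 1e6 * bytes_per_transfer
    return bytes_per_second / 1e9


# The Phenom II X4 955 figures from the table: 3200 MHz, 128 bit
print(round(l2_bandwidth_gb_s(3200, 128), 1))
```

This is a peak figure only; it says nothing about latency or what real code sustains.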


AMD Phenom II X4 955 Black Edition CACYC - 2009
Almost identical to the one above! It has the same CACYC code and the only real difference is the production code: This one is marked with 0930, the one above is 0932. They were made two weeks apart in 2009.
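
The production codes can be read mechanically. A minimal sketch, assuming the four digits are YYWW (two-digit year, then week), that the century is 2000, and that the week numbers behave like ISO weeks; none of this is AMD documentation, and the function names are ours:

```python
from datetime import date
from typing import Tuple


def parse_date_code(code: str) -> Tuple[int, int]:
    """Split a four-digit YYWW production code into (year, week)."""
    if len(code) != 4 or not code.isdigit():
        raise ValueError("expected four digits, e.g. '0932'")
    year = 2000 + int(code[:2])   # assumption: all parts here are post-2000
    week = int(code[2:])
    if not 1 <= week <= 53:
        raise ValueError("week out of range")
    return year, week


def weeks_apart(code_a: str, code_b: str) -> int:
    """Whole weeks between two production codes (ISO weeks assumed)."""
    year_a, week_a = parse_date_code(code_a)
    year_b, week_b = parse_date_code(code_b)
    monday_a = date.fromisocalendar(year_a, week_a, 1)
    monday_b = date.fromisocalendar(year_b, week_b, 1)
    return abs((monday_b - monday_a).days) // 7


print(parse_date_code("0932"))
print(weeks_apart("0930", "0932"))
```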

One of the cool features of AMD's AGESA platform in K8 and K10 was "Clock Calibration". When AMD made K8, it could vary its clock on-die by 12.5% either way, but this was never really used. Windows was really bad at scheduling tasks relative to power states, even as Windows 7 (and up) had full control of both.

AMD K10 increased this to each core able to vary clock individually and called it Advanced Clock Calibration, and again never really used it: It was for overclockers, so that weaker cores could be backed down and a higher peak clock reached on the good cores. However, it did have a bug, or at least an oversight.

If we specified clock calibration on a CPU core which was sold disabled (e.g. the disabled one or two cores on a Phenom II X3 or X2, respectively), then the CPU's onboard firmware actually enabled that core! Maybe it worked. Maybe it didn't. It could be clocked down by up to -12.5% if it had a scaling defect, meaning that cheap X2 and X3 parts could have that bit of extra performance.

The particulars are covered, we'll go into the history of this particular CPU. I bought it second-hand with an Asus Crosshair III Formula motherboard. The board turned out to be extremely flaky, and I got that refunded. No harm, no foul, but I still lacked a working motherboard. The AM2 board I had, the Gigabyte GA-MA69G-S3H, didn't implement dual power planes (AM2s didn't have to, although they could), so the Phenom II locked itself to 800 MHz. It worked, but was quite slow. HT bus overclocking got it up to about 1200 MHz, enough for basic use. I eventually got a proper AM3 board and, until February 2017, long after it should have been put out to pasture, it was my daily driver.
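
The 800 MHz lock and the 1200 MHz workaround both follow from core clock = reference clock x multiplier. A small sketch, assuming the stock 200 MHz reference clock; the 300 MHz value is simply what a locked 4x multiplier implies for 1200 MHz, not a setting recorded above:

```python
def core_clock_mhz(reference_mhz: int, multiplier: int) -> int:
    """Core clock is the reference (HT) clock times the CPU multiplier."""
    return reference_mhz * multiplier


print(core_clock_mhz(200, 4))    # multiplier locked to 4x on an AM2 board
print(core_clock_mhz(300, 4))    # reference clock raised, multiplier still 4x
print(core_clock_mhz(200, 16))   # stock 3200 MHz on a proper AM3 board
```

With the multiplier locked, the reference clock is the only lever left, which is why the gain was limited by what the board and memory would tolerate.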

As the Black Edition, this CPU has an unlocked multiplier so can be set to any clock the user desires. Most would end up between 3.6 and 3.8 GHz on air, this one is happy at around 3.7 GHz on air and 3.8 on water. To go higher requires cooling I plain do not have! At 3.8, it manages to hit 65C even with a Corsair H80 watercooler. High voltage (1.625V) tests at 4.0 GHz did boot, but hit almost 80C, FAR too hot.

AMD's top bin was the 980 Black Edition, clocked at 3.7 GHz which almost universally would go no faster. It wasn't a great overclocker and 4 GHz was beyond the reach of most mortal men - They simply overheated rather than running into any silicon limit. Earlier Phenom IIs (such as the 940 intro model) would rarely pass 3.5 GHz. The C2 Deneb stepping generally ran hotter and didn't clock as well as the later C3 stepping. This is a C2, C3 appeared in November 2009. C3 also fixed issues with high speed RAM (4x 1333 MHz sticks could be iffy on C2) and ran at a lower supply voltage most of the time, so could be rated for 95 watts instead of 125 watts. C3 also implemented hardware C1E support. Most C3s would also overclock to around 4 GHz, but C2s ran out of puff around 3.8.

Over time, the board or CPU or both started becoming a bit flaky. After changing from dual GPUs to a single GPU, the motherboard wouldn't come out of PCIe x8/x8 mode, so I tried resetting BIOS (clearing CMOS). This made one of the RAM slots stop working, and crash on boot (if it booted at all) with anything in that slot. Bizarre, I know. The RAM also wouldn't run right at 1333 MHz and had to be dropped down to 1067 MHz. Clearly, the board or chip weren't long for this world. This was in late 2015 or early 2016. I got another year out of it before BANG. It froze up while playing Kerbal Space Program and then refused to boot at all.

I diagnosed it as a bad motherboard or CPU, given that they had been on their way out anyway, and got replacements out of an old Dell Optiplex 790. This turned out to be a Core i5 2500, which was actually a small upgrade from the Phenom II. To cut a long story short, the video card, a Radeon HD 7970, had died. The motherboard was flaky anyway, a RAM slot was unstable and the PCIe switch was locked into 8x/8x mode for no good reason, but it went back into use with a friend in Newcastle for around a year.
Process | Clock | Feature Count | Platform
45 nm | 3200 MHz | 758,000,000 | Socket AM3
L2 Cache | Speed | Width | Bandwidth
512 kB x4 | 3200 MHz | 128 bit | 51.2 GB/s



AMD Athlon II Neo N36L NAEGC AE - 2009
AMD's core codenames did not actually attach to silicon in this era. As the AMD Athlon II, this was Sargas. As Athlon II X2, it was Regor, as Phenom II Dual-Core Mobile, Turion II Dual-Core Mobile, V Series for Notebook PCs, and Turion II Neo Dual-Core Mobile, and Athlon II Dual-Core Mobile it was Champlain. As either Athlon II Neo or Turion II Neo for embedded, it was Geneva.

AMD was actually copying what Intel did with Merom and Penryn. Merom on the desktop was Conroe. Conroe mobile with some cache disabled was Merom-L. It wasn't the silicon which carried the name, but instead the intended market and featureset, so Regor was quite often the same silicon as Deneb with cache disabled, but Regor could also be the same silicon as Champlain.

To help straighten this out, we can sort them by die area in mm^2, which does not change between segments.
Die Size (mm^2) | Cores | L2/core (kB) | L3 (MB) | Common Name(s)
117 | 2 | 1024 | 0 | Regor/Champlain
169 | 4 | 512 | 0 | Propus/Champlain
258 | 4 | 512 | 6 | Deneb
346 | 6 | 512 | 6 | Thuban
Note how the "maximum" configuration of 4 cores, 1024 kB L2 and 6MB L3 was never made. AMD was once rumoured to be working on a revision D, "Hydra" which had that configuration and maybe 8 cores but likely was a distorted telling of Thuban: It would have been huge on 45 nm.
So this was named Geneva, but the actual silicon was the 117 mm^2 dual core "Regor" design without L3 cache and with 1024 kB L2 cache per core. This one has the product code AEN36LLAV23GM and is engraved with "AMD AthlonTM II Neo" and is mounted on an FC-BGA package for surface mounting. The particular codes here only ever seem to have been used in the HP ProLiant Microserver TL36, which is where this came from.
AMD had got the 1024 kB L2 cache on Regor for free. The die size could only get so small before redesigning the HyperTransport and DDR2/3 memory controller, which fit around the outside, was needed. Deneb, Propus and Regor all use the same layouts for these, and the closest they can go together controls how small the die can get. Regor would have had blank space just under the memory controllers, almost exactly the size of an extra 512 kB L2 cache per core... so that's what AMD did.

The feature count below is taken from Tech Power Up's CPU database as of 2020/12 and almost certainly wrong. It gives the 117 mm^2 Regor 410 M while the 169 mm^2 Propus is given 300 M. These figures may be inaccurate or simply swapped. If we find better data, we will use it and remove this notice.
Process | Clock | Feature Count | Platform
45 nm | 1300 MHz | 410,000,000 | FC-BGA 812
L2 Cache | Speed | Width | Bandwidth
1024 kB | 1300 MHz | 128 bit | 20.8 GB/s


Intel Celeron T3500 SLGJV - 2010

The Penryn refresh series of Celerons for the mobile market were released in 2010 and 2011, when most Penryns were released in 2008. All Penryns ran on an 800 MHz FSB, but not all Celerons had two cores enabled. As a dual core CPU running at 2.1 GHz in 2010, this was quite reasonably performant, something few Celerons ever achieved.

It didn't exactly go in the top laptops of the day, so would usually be found with 2 GB RAM in one slot, single channel. The T3500 had a TDP of 35 watts as its power management was mostly disabled. The very similar Core 2 Duo SL9600 (2.13 GHz, 6 MB cache) had a TDP of 17 watts and the identical Pentium T4300 (2.1 GHz, 1 MB cache) was 35 watts. All were based on the same 410 million feature Penryn-3M die.
Process | Clock | Feature Count | Platform
45 nm | 2100 MHz | 410,000,000 | PGA478
L2 Cache | Speed | Width | Bandwidth
1024 kB | 2100 MHz | 256 bit | 67.2 GB/s


Nvidia Tegra T30L - 2012
Nvidia names its system-on-chip products after comicbook characters, so this one, named Kal-El (the name of Superman) was introduced with quite some expectation. It was aimed at the mainstream, used ARM's popular Cortex A9 core design (at 1.3 GHz), and had Nvidia's GeForce ULP GPU onboard. Would it be the SoC to end the Qualcomm dominance?

Oh hell no. Before we get into why, let's look at what. Nvidia outfitted four A9s and an extra A9 fabricated to run at lower clocks and lower powers. The cache per core was 2x 32 kB and the whole thing had 1 MB L2 cache. It was given a 416 MHz GeForce ULP MP12 using the VLIW Vec4 architecture, which has four vertex shaders, eight pixel shaders, and probably two ROPs. OpenGL ES 2.0 support was a given, but it did not have unified shaders as it was based on the GeForce 7000 generation from many years before. It was built at TSMC on 40 nm.

Tegra 3's design wins would shape Nvidia's future direction. It was used in a few phones, but the package was too large for that, so it tended to go in tablets. Asus, a long time Nvidia partner, used it in the Transformer series and the Asus-made Google Nexus 7. Tesla selected it for the Roadster 2013-2014 information display and the Model S instrument cluster. Microsoft used it in the original Surface.

Its NVRAM/Flash controller was terrible, its WiFi performance lacking and CPU intensive, the GeForce ULP lacking in features, functional units, and performance relative to the PowerVR and Qualcomm competition. I/O on the SoC was generally awful. In a larger device, which can heatsink and cool the Tegra 3 properly, it works well.

Its success with Tesla had Nvidia focus its SoCs on automotive, then Tesla abandoned Nvidia for its own designs. As of 2019, Nvidia had abandoned SoCs for the smartphone/tablet market completely. Xavier instead runs at 10-30 watts (2-3x more than a tablet and 8-10x more than a phone) and makes fewer performance concessions for lower power as it's aimed at automotive, specifically driver assistance.


Intel Core i5 2500 SR00T - 2011
This Sandy Bridge HE-4 processor was Intel's mainstream along with the i5 2300 in 2011-12.

Sandy Bridge's launch was... Uneasy. Intel had forced everyone else out of the chipset market by strong-arming, patent lawsuits, and closing the platform down by not documenting it. The last third party chipset vendor was Nvidia, still barely clinging on at the end of the LGA775 era, but had already failed to negotiate a license for Intel's new bus.

Intel's Cougar Point chipset (marketed as P67, H67, 6-series, Q65, etc., all the same chip) had a severe SATA controller issue. Intel didn't recall anything as this was unthinkable, it would leave Intel without the ability to sell any Sandy Bridge CPUs. The issue was that the SATA ports would degrade over time, slowly reducing throughput. Intel quoted a 6% performance drop over three years, but given Intel's previous statements on flaws, that was probably extremely optimistic. A single transistor was being overdriven, enough to degrade much more rapidly than intended and cause a steadily rising SATA error rate.

In time, the issues were fixed and Sandy Bridge was an exceptional performer. Indeed, Sandy Bridge is rightly regarded as one of the all-time greats. Its predecessor, Nehalem, ran hot, didn't clock well and AMD's Phenom II was able to keep up by throwing clock at the problem: A Core i7 920 ran at 2.67 GHz, while a Phenom II X4 980 ran at 3.7 GHz. The i7 920 would be around 10-30% faster.

Running at 3.0 to 3.4 GHz, the top end Sandy Bridge processors were 10-30% faster still. Announced at Intel Developer Forum (IDF) 2010, even Intel seemed to be taken aback at just how powerful Sandy Bridge was.

Sandy Bridge, in some rare cases, doubled the speed of Nehalem. That level of performance leap had last been seen with AMD's Athlon in 1999.

The processor graphics, now on the CPU die, was immensely improved over Intel's previous chipset graphics. It wasn't just some stripped back basic video solution intended to just make sure basic 3D and video decoding worked, it was finally serious hardware. It benefited from having direct access to L3 cache and greatly improved media processing ability. The GPU ISA was tightly coupled with DirectX 10, like a high performance gaming GPU is. At the same clock, it was at least twice as fast as the previous generation.

In all, Sandy Bridge was so far ahead of Nehalem that it took until Kaby Lake in 2017 for Intel's lineup to progress as far from Sandy Bridge as Sandy Bridge had from Nehalem, its immediate predecessor.

This particular unit is the Sandy Bridge HE-4 processor, which has four HyperThread(tm) (SMT) cores, 12 GPU execution units (EUs) and 8 MB of L3 cache. Intel is Intel, however, and a good deal of that is disabled. In the i5 2500, 2 MB of L3 cache is disabled, half the GPU (6 EUs) are turned off, HyperThreading is turned off and the CPU is clock locked to 3.3 GHz with opportunistic boost to 3.7 for very limited periods. Of the 1.16 billion transistors on the chip, about two thirds are enabled.

Notably, Intel split the GPU provision into "Graphics Tier 1" and "Graphics Tier 2", or GT1 and GT2. The i5 2500 was GT1, so had only six of its GPU units enabled of the twelve on die. The GPU was also de-clocked, but all Intel Sandy and Ivy Bridge GPUs weren't actually clock locked and could be freely adjusted by Intel's own tweaking utility.

This is known as "configurability" and was a major design goal for Sandy Bridge. The quad core, 8MB L3, 12 EU design had "chop zones" where the mask could be simply ended at that point. Chop zones allowed for the GPU to be chopped completely off or in half, for the IA cores to be chopped in half from 4 to 2 and for L3 cache to be chopped to 1.5 MB. Mixing and matching these chopped zones allowed for multiple dies to be produced from the same design. Intel built three in the client space:
1: 4 CPU Cores, 4 L3 Slices (8 MiB), GPU with 12 EUs. 1.16 billion transistors in 216 mm²
2: 2 CPU Cores, 2 L3 Slices (4 MiB), GPU with 12 EUs. 624 million transistors in 149 mm²
3: 2 CPU Cores, 2 L3 Slices (3 MiB), GPU with 6 EUs. 504 million transistors in 131 mm²


As with all Intel products, it is heavily fusible, a post-production configuration. L3 cache was made of "slices", of which each enabled core had 2 MB worth. It could be fused off in 512 kB eighths. Each fusible segment was 2 associative sets, so dropping from 2 MB to 1.5 MB dropped associativity from 16 way to 12 way. All slices had to have the same amount of L3, but not necessarily the same physical segments enabled, allowing Intel to fuse off defects as well as reduce performance for lower grades.

Other fusible features were HyperThreading, virtualisation, multiplier lock, Turbo Boost functionality, individual cores, whether L3 (or L2) cache could even be used at all, whether the onboard GPU could use 12 or 6 EUs, and more.

For $11 (on release) more, you could get the clock unlocked as the i5 2500K, which would often overclock as far as 4.4 GHz easily. Such a chip would still be competitive years later. By 2019, quite a lot of i5 2500K and i7 2600K processors running north of 4 GHz were still in daily use. Some approached and even hit 5.0 GHz with exotic cooling and modified motherboards.

Sandy Bridge was one of the all-time greats produced by Intel, so much so that Intel produced 17 different variations on this same HE-4 silicon in the "Core i" line and several variants in the Xeon E5 (e.g E5-2637) and all the Xeon E3 line. HE-4 also powered all of Intel's quad core mobile chips of this era, where it normally ran between 2.0 and 2.5 GHz.

The GPU in Sandy Bridge was very ambitious, at least by Intel's standards. It had long been held that a discrete GPU, no matter how weak, would always be greatly superior to any integrated graphics processor (IGP). With 6 EUs at a peak of 1100 MHz, this i5 2500 was between 50% and 66% of the performance of a contemporary Radeon HD 5450. This was epic. It was within spitting distance of a discrete GPU. A tremendously weak entry level GPU, but still a discrete GPU! The 12 EU part could deliver similar performance to the HD 5450, but was only available on few, seemingly random, i3, i5 and i7 parts.

Sandy Bridge was so effective that, six years later in 2017, the i5 2500 (not the unlocked "2500K"!) was still selling for around £50 on eBay. The i7 2600, a cheap and moderate upgrade, was just over double that.
Process | Clock | Feature Count | Platform
32 nm | 3300 MHz | 1,160,000,000 | LGA-1155
L2 Cache | Speed | Width | Bandwidth
256 kB x4 | 3300 MHz | 256 bit | 105.6 GB/s

Intel Core i5 3470 SR0T8 - 2012
Ivy Bridge was the 22nm shrink of the 32nm Sandy Bridge. The on-die GPU was updated, but the CPU was not. This Ivy Bridge was of stepping revision N0, identifying it as an Ivy Bridge-HM-4, a 133 mm^2 die and 1.008 billion features. It had 4 cores and 6 GPU execution units (EUs). L3 cache was 6 MB, although some sources claim it was 8 MB and disabled to 6 MB like Sandy Bridge was: I find this unlikely, given that the Ivy Bridge based mainstream quad-core had fewer die features (1.01bn vs 1.16bn) than the Sandy Bridge.

The 133 mm^2 die had a MSRP of $184. Intel's operating margin at this point was around 70% for CPU products, which shows: AMD's Cape Verde GPU was on a similar 28 nm process, at a similar 123 mm^2 size, had a significantly denser and more complex layout at 1.5 billion features, and sold in video cards priced at half the price of the i5 3470 as just a CPU.
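
The "significantly denser" claim can be checked with a quick division of feature count by die area. A rough sketch using only the figures quoted above (the function name is ours); it ignores that the two chips are very different kinds of logic, so treat it as a back-of-envelope comparison rather than a process benchmark:

```python
def density_m_per_mm2(features_millions: float, area_mm2: float) -> float:
    """Average feature density in millions of features per square millimetre."""
    return features_millions / area_mm2


ivy_bridge = density_m_per_mm2(1008, 133)   # Ivy Bridge HM-4, 22 nm
cape_verde = density_m_per_mm2(1500, 123)   # AMD Cape Verde, 28 nm

print(f"Ivy Bridge HM-4: {ivy_bridge:.1f} M/mm^2")
print(f"Cape Verde:      {cape_verde:.1f} M/mm^2")
print(f"Ratio:           {cape_verde / ivy_bridge:.2f}x")
```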

Ivy Bridge in general was a bit of a disappointment. The leading Core i7 3770K was launched in mid-2012 and replaced the Sandy Bridge Core i7 2700K, which came a little later in Sandy Bridge's life. It had the same base clock, the same turbo clock, and only added DDR3-1600 support, which Sandy was doing anyway on some motherboards. The "recommended customer price" was also the same. 3770K also had a significantly worse GPU!

Intel made much of Ivy Bridge's superior GPU, and even claimed that AAA titles would be playable at medium to high settings, at 1280x720. Not on the 6 execution units of the i5 3470 they weren't. With "Intel HD Graphics 2500", Ivy Bridge had 6 execution units, each of which had 8 shader cores, for a total of 48 "GPU-style cores". It clocked to a maximum of 1150 MHz, delivering 110.4 GFLOPS of pixel crunching power. This compared moderately well with a Radeon HD 5450 from two years before. The much more powerful Intel HD Graphics 4000 had 16 execution units, so 128 shader cores, and was allowed to clock to 1300 MHz, giving 332.8 GFLOPS, three times the performance.
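
The GFLOPS figures above can be reproduced from the execution unit counts and clocks. A minimal sketch, assuming each shader core retires one multiply-add (counted as 2 floating point operations) per clock; that factor of 2 is our assumption, chosen because it matches the quoted numbers:

```python
def peak_gflops(execution_units: int, cores_per_eu: int, clock_mhz: int,
                flops_per_core_per_clock: int = 2) -> float:
    """Theoretical peak GFLOPS; 2 FLOPs per core per clock is an assumption."""
    shader_cores = execution_units * cores_per_eu
    return shader_cores * flops_per_core_per_clock * clock_mhz / 1000


hd2500 = peak_gflops(6, 8, 1150)    # Intel HD Graphics 2500
hd4000 = peak_gflops(16, 8, 1300)   # Intel HD Graphics 4000

print(round(hd2500, 1), round(hd4000, 1), round(hd4000 / hd2500, 2))
```

These are theoretical peaks; actual game performance also depends on memory bandwidth, which the GPU shares with the CPU.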

This was indeed able to run some games acceptably. So, knowing we were on the "GT1" level of performance, I tried a very graphically light game, Stardew Valley, at 1280x720. It was playable, but juddery at times, particularly when the weather became rainy, or a screen effect came on. A regular I've used for years, Aquamark 3, gave a score of 65,000. Oddly, the CPU score was very low and the GPU score very high - Higher than a GTX 680. I can only conclude that the 2003-vintage Aquamark 3 has finally ended its usefulness.

Most lower Ivy Bridge parts boosted clocks by around 100-200 MHz for the same price as their Sandy Bridge predecessors and they all used much less power to reach those clocks. When pushed hard, however, Ivy Bridge tended to overheat at lower clocks than Sandy Bridge did. While a Core i7 2600K would quite readily hit 4.4 GHz on a decent cooler with temperatures below 70C, a Core i7 3770K would usually run out of steam at 4.2 GHz, despite a lower package power, and it'd hit 80-90C. Intel had changed from high quality, expensive solder thermal junctions to cheap thermal paste. Just what you want on your $330 CPU. It shows again that, in tech, price does not always (or usually) relate to quality.

As the i5-3470, it had a maximum multiplier of 36, to give a peak clock of 3.6 GHz, and a base clock of 3.2 GHz. HyperThreading(tm) was disabled, and the thermal design power (TDP) was 77 watts. This particular part replaced an i5 2500, which was almost identical. The i5 2500 had a 3.3 GHz base clock and a 3.7 GHz peak clock, but in reality it hovered around 3.5 GHz. The 2500 was rated to 95 watt thermal design power, although I had trouble getting more than 70 watts out of it, even in utterly unreasonable stress tests. The same Prime95 AVX small-FFT test that got 76 watts out of a Core i5 2500 could manage only 43 watts out of this i5 3470. The Sandy Bridge has 100 MHz more clock, but that's about it.

As stated, the Core i5-2500 had a 100 MHz clock advantage, as well as a faster onboard GPU, but in practice the Ivy Bridge also allowed DDR3-1600 over the 1333 on the Sandy Bridge, so improving maximum memory bandwidth from 21 GB/s to 25.6 GB/s.
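
The 21 GB/s and 25.6 GB/s figures fall out of the DDR3 transfer rate. A minimal sketch, assuming the usual dual-channel configuration with a 64-bit bus per channel (the function name and defaults are ours):

```python
def ddr_bandwidth_gb_s(transfer_rate_mt_s: int, channels: int = 2,
                       bus_width_bits: int = 64) -> float:
    """Peak memory bandwidth in GB/s for a given DDR transfer rate."""
    bytes_per_transfer = bus_width_bits // 8   # 64 bit -> 8 bytes per channel
    return transfer_rate_mt_s * channels * bytes_per_transfer / 1000


print(round(ddr_bandwidth_gb_s(1333), 1))   # DDR3-1333, dual channel
print(round(ddr_bandwidth_gb_s(1600), 1))   # DDR3-1600, dual channel
```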

It was a wash. There was less than 5% either way between the two CPUs. When the i5 3470 was new, it was best to spend £15 more and get an i5 3570K. It ran 200 MHz faster across the board and used the silicon with much more onboard GPU.

The turbo table for this was 36, 36, 35, 34.

Process | Clock | Feature Count | Platform
22 nm | 3200 MHz | 1,008,000,000 | FCLGA1155
L2 Cache | Speed | Width | Bandwidth
256 kB x4 | 3200 MHz | 256 bit | 102.4 GB/s

Intel Core i5 3570K SR0PM - 2012
The i5-3570K replaced the i5-2500K in the market as a mainstream/high end overclockable CPU. Overclockers were expecting great things from the 22 nm shrink of the very well clocking Sandy Bridge. An i5-2500K would usually get to 4.0 GHz and more wasn't unheard of.

Ivy Bridge ran hot. Keep it cool and a 3570K would run between 4.2 and 4.3 GHz all day, but keeping it cool was a problem. Intel had changed the integrated heatspreader to being attached to the die with cheap thermal interface material instead of the metal solder of Sandy Bridge, Nehalem, and all previous processors as far back as Willamette!

With a Corsair H80 closed loop liquid cooler, this 3570K hit 92C at 4.2 GHz all-core load and that was considered good!

You'd think i5-3570K was the "overclockable version" of i5-3570, and you'd be wrong. i5-3570 was much less popular than i5-3470, and was made of the same HM-4 silicon as 3470. 3570K was made of the HE-4 silicon... We'd better explain that.

Ivy Bridge had 2 core and 4 core levels, and each of those has 6 GPU and 16 GPU levels, for four total die models. M-2 was 2/6, H-2 was 2/16, HM-4 was 4/6 and HE-4 was 4/16. 3570K had the 16 execution unit GPU enabled, but 2 MB of the 8 MB L3 cache was disabled and SMT was disabled.

Bizarrely, the H-2 silicon was only ever used in Core i3-3245 and Core i3-3225.

The turbo table for this was 38, 38, 37, 36.

Process | Clock | Feature Count | Platform
22 nm | 3400 MHz | 1,400,000,000 | FCLGA1155
L2 Cache | Speed | Width | Bandwidth
256 kB x4 | 3400 MHz | 256 bit | 108.8 GB/s

Intel Xeon E5-2603 SR0LB - 2012
In 2012, Intel released the Sandy Bridge-E processors and differentiated them by socket. 1P LGA 2011 got "E", 1P/2P LGA 1356 got "EN" and 1P/2P/4P LGA 2011 got Sandy Bridge-EP.

This one is an EP, meaning it supports Direct Media Interface (DMI, the connect to the chipset), quad-channel memory, and 2x Quick Path Interconnect (QPI, inter-processor communication). Sandy Bridge-E also added official DDR3-1600 support.

The main change to the "E" series was that they were made up of "slices". Each slice was two cores and 5 MB of L3 cache. L2 cache was a static 256 kB per core. On the die, a slice was a core above, 2.5 MB of L3, crossing the middle of the die, then another 2.5 MB L3, then the core at the bottom. A ring bus went through all the L3 caches, as can be seen.

Sandy Bridge-E floorplan
Image copyright Intel

At best here, then, we have eight cores, 20 MB L3 cache, four DDR3 channels, two QPI links. The diagram is the "four slice" design. These have the "C1" or "C2" stepping code. Our part here, with sSpec of SR0LB, was not C1 or C2: It was M1, meaning it was a two-slice die.

The Xeon E5-2603 was not the best. In fact, it was the worst, or at least the worst with four cores. It used either binned (disabled) 8-core chips (these are C1 or C2 and very rare to see in four core configuration), or the smaller Sandy Bridge-E which had only two slices on the silicon.

It was limited to 1.8 GHz, TurboBoost was disabled, HyperThreading was disabled, QPI was limited to 6.4 GT/s (7.2 and 8.0 were the usual speeds) and the memory interface had 1333 and 1600 speeds disabled. It was, however, very cheap, just $198 on release.

This one came in a Dell Precision T5600 workstation which had been configured more or less to its minimum spec. Of the two processors supported, one plain wasn't fitted at all, and the one that was fitted was this slow Xeon. It did have 16 GB RAM, an awful lot for the time, which was both registered and ECC protected.
Process | Clock | Feature Count | Platform
32 nm | 1800 MHz | 1,008,000,000 | FCLGA2011
L2 Cache | Speed | Width | Bandwidth
256 kB x4 | 1800 MHz | 256 bit | 57.6 GB/s

Intel Xeon E5 2640 SR0KR - 2012
This one was part of a pair which both had capacitors knocked off the back. On this one we can see a failed repair attempt. The LGA-2011 socket took Sandy Bridge EP and some Ivy Bridge chips. Some more lowly Sandy Bridges were put on LGA-2011 also.

The Sandy Bridge E series supported quad-channel memory, for over 40 GB/s of bandwidth. With their very large L3 caches, they made excellent performers, particularly on large data sets such as SQL Server. Compared to the E5-2603, the three-slice, six core E5-2640 had 15 MB L3 available to it, ran at a significantly faster 2.5 GHz, enabled HyperThreading, and could turbo at 3/3/4/4/5/5. This notation is how much more the multiplier will rise from the standard (which is 25x on this 2.5 GHz processor from a 100 MHz base clock (BCLK)). So, at maximum turbo, it will go to 28/28/29/29/30/30. 3.0 GHz isn't fantastic for Sandy Bridge, but it's certainly not poor.
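
The turbo bin notation is just addition on top of the base multiplier. A minimal sketch of that arithmetic, with function names of our own choosing; it deliberately does not say which entry applies to how many active cores, as that isn't stated above:

```python
from typing import List


def turbo_multipliers(base_multiplier: int, bins: List[int]) -> List[int]:
    """Add each turbo bin to the base multiplier."""
    return [base_multiplier + b for b in bins]


def to_mhz(multipliers: List[int], bclk_mhz: int = 100) -> List[int]:
    """Convert multipliers to clock speeds using the base clock (BCLK)."""
    return [m * bclk_mhz for m in multipliers]


e5_2640 = turbo_multipliers(25, [3, 3, 4, 4, 5, 5])
print(e5_2640)
print(max(to_mhz(e5_2640)))
```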

Intel was reluctant to clock the Sandy Bridge Xeons very high, the E5-1620 was 3.6 GHz (3.8 GHz turbo), but only uniprocessor and the highest clocked of the generation.

Six cores, twelve threads, and DDR3-1333 support meant the E5-2640 was eminently competent. It should have been, given that it was over $880 RRP!
Process | Clock | Feature Count | Platform
32 nm | 2500 MHz | 2,270,000,000 | FCLGA 2011
L2 Cache | Speed | Width | Bandwidth
256 kB x6 | 2500 MHz | 256 bit | 80 GB/s

AMD Ryzen 5 5600X - 2020

AMD's Ryzen 5000 series were "Zen 3". The Zen generation was the Ryzen 1000 ("Zen"), Ryzen 2000 ("Zen+"), Ryzen 3000 ("Zen 2") and this, Zen 3. The numbering skipped "4000" to align with mobile, which had begun its numbering a generation before Zen was released.

This entry will not cover AMD's chiplet design, nor its use of multiple foundries and a global supply chain. It will focus on this single CPU SoC (that's system-on-chip).

In 2020, I was about ready for a main system upgrade, a Core i5-3570K was really, really, really long in the tooth so I set a budget and started designing the replacement system. Some parts could be replaced piecemeal, such as the case (Lian-Li LanCool II) and the video card (RX 570 8GB), and they were in 2019 and early 2020. The design settled on was an Asus X570-PRIME motherboard and a Ryzen 7 3700X.

In 2020, AMD decided to bump prices. The "7" position in the market was more or less abandoned, so the Matisse refresh launch in July had a Ryzen 5 3600XT, a Ryzen 7 3800XT and a Ryzen 9 3900XT with a huge price difference between the 3600XT and 3800XT, more than enough to park a 3700XT in, but deliberately left empty. Apple's iPhone 6 was another example of this strategy: The barely capable 16 GB model was much, much lower in price than the 128 and 256 GB models. It encourages people to move up to the next band: More people who would have bought a "7" will go to the "8" or "9" than will go to the "6". I went to the "6"!

The unified CCX (a CCX in Zen/+/2 was two blocks of four cores) brought performance benefits, and many incremental improvements across the die also brought benefits. The 8 core, 32 MB L3 cache CC chiplet, two cores disabled in this 5600X, was a generational step above anything else on the market. It was a new architecture on the same process, featuring larger integer scheduler, physical register file, wider issue, larger reorder buffer, 50% wider FPU dispatch and FMA operations one cycle faster. It's a similar generational improvement as Haswell to Skylake was: "More of the same".

The launch of Vermeer was not a happy one. Early models, including this one, appear to have been faulty, but as of time of writing this is still developing. AGESA updates made some settings more or less reliable, mostly more, but this 5600X could not run with Core Performance Boost (analogous to Intel's Turbo Boost), Memory XMP, or Precision Boost Overdrive enabled in AGESA 1.1.0.0 D or earlier... Everything points to a fault with the Infinity Fabric or the wider I/O chiplet.

When AGESA 1.2.0.0, released in January 2021 and meant to fix this issue for most people, didn't fix it for this CPU, it appeared this CPU was simply worse affected than most others. Production codes later than around 2047 (week 47, 2020) were unaffected completely. The CPU was RMAed, the retailer confirmed it faulty, the replacement arrived and worked first time.
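As a rough illustration of reading those production codes, here is a minimal sketch. It assumes the four-digit code is a two-digit year followed by a two-digit week, which is how "2047" is read above. The function names and the hard cutoff are hypothetical (the text only says "around 2047"), and this is not an AMD tool.

```python
from typing import Tuple


def parse_date_code(code: str) -> Tuple[int, int]:
    """Split a YYWW production code into (year, week). Assumes years 2000-2099."""
    if len(code) != 4 or not code.isdigit():
        raise ValueError(f"expected four digits, got {code!r}")
    year = 2000 + int(code[:2])
    week = int(code[2:])
    if not 1 <= week <= 53:
        raise ValueError(f"week out of range in {code!r}")
    return year, week


# Assumed cutoff: the text says parts made later than around week 47, 2020 were fine.
CUTOFF = parse_date_code("2047")


def possibly_affected(code: str) -> bool:
    """True if the part was made in or before the cutoff week."""
    return parse_date_code(code) <= CUTOFF


print(parse_date_code("2047"))    # (2020, 47)
print(possibly_affected("2045"))  # True
print(possibly_affected("2103"))  # False
```

Tuples compare year first and week second, so no date library is needed for the comparison.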

Process | Clock | Feature Count | Platform
7 nm | 4,600 MHz | ? | AM4
L2 Cache | Speed | Width | Bandwidth
512 kB x6 | 4,600 MHz | 256 bit | 184 GB/s

AMD Ryzen 5 5600X - 2020

An RMA replacement for the one above. That one was the very first CPU I've ever had dead on arrival (DOA). I've had DOA motherboards, HDDs, RAM, you name it, but until now, never a CPU.

Before about week 47 2020, AMD Vermeer (Zen 3 chiplets) had some manner of issue with either the Infinity Fabric or the I/O chiplet, where if Core Performance Boost or Precision Boost Overdrive was enabled, the system was unstable. You'd think "Aha, no, that's the core complex chiplet, not the I/O!" However, enabling XMP to run RAM at 3200 MHz was an instant crash.

Some people reported adjusting various I/O chiplet and Infinity Fabric related voltages alleviated the issues, others found that AGESA 1.2.0.0 fixed it. Still others could not achieve stability in any circumstances, such as the one above.

This one had what appeared to be a poorer quality core complex die. Ryzen Master showed a "silver star" core (second best for clocking) but not a "gold star" (best for clocking) core, and PBO wouldn't usually hold much over 4.6 GHz as an all-core load, despite a +200 offset applied, TDP relaxed, thermals improved, EDC and TDC raised, etc.

It wasn't a dog by any means. As all-core workloads, Cinebench R20 scored 4,478 and R23 scored 11,724, for example, but this is about what it should score with just default tuning. That it needs encouragement to go that far isn't great for this CPU, but it is great in general: Precision Boost Overdrive will get the CPU's peak performance, or very close to it, straight out of the box. If you have a worse CPU than normal, it'll still get you as much as it can. Only if you have a far better CPU than normal (you don't, they all went into the Ryzen 9s!) will PBO leave performance on the table.

Single threaded performance on the Ryzen 5 5600X was, in a word, extreme. It not only beat everything Intel could throw at it, but also everything else in the Vermeer-based stack, even the frighteningly expensive Ryzen 9 5950X. Cinebench R23 has the 5600X here at 1,581 single core... CPU power was being reported as, CCD + SoC, 13 watts. To underperform that, an Intel Core i5 10600K would use triple the power.
Process | Clock | Feature Count | Platform
7 nm | 4,600 MHz | ? | AM4
L2 Cache | Speed | Width | Bandwidth
512 kB x6 | 4,600 MHz | 256 bit | 184 GB/s
Overclocking? What's that? In the olden days, CPUs were run quite simply as fast as they would pass validation. All 386s needed a 5V power supply, so the voltage wasn't a variable, and they just were binned as they ran.

Later, Intel started to down-bin chips. By the late 486 era, Intel had got very good at making 486s and everything it churned out would hit 66 MHz easily, but this would have caused a glut of 66 MHz parts: The market still demanded slower and cheaper chips, as well as faster and more expensive ones. If Intel was to simply sell everything at 66 MHz, then Cyrix and AMD would be more than happy to undercut Intel dramatically with 50, 40 and 33 MHz processors, subsidised by faster and more expensive 66 and 80 MHz chips.

Similarly, by the Pentium-MMX days (about 1996 or so), everything Intel sold was capable of running 200-250 MHz, but Intel was still selling parts rated for 150, 166 and 180 MHz. It was nothing to buy a Pentium 166-MMX and run it at 233 MHz.

Many overclockers cut their teeth on a Celeron 300A, in 1998, which would overclock really easily to 450 MHz, giving massive performance improvements.

The pattern repeated itself time and again: low clocked CPUs could very often be overclocked explosively. Pentium4 1.6A processors would often hit 2.8-3.0 GHz. AMD Opteron 146 and 165 were popular: while sold at 2.0 and 1.8 GHz respectively, it didn't take much to get them going at 2.8 GHz. Intel's Core 2 Duos were almost universally capable of 3.2 GHz. Later, Intel's Core i5 2500K became legendary for being able to hit very high clocks: from a base clock of 3.3 GHz it would often pass 4.0 with ease. Most of them found a nice happy place between 4.2 and 4.4 GHz.
Cancelled Maybes
10 GHz by 2004
Intel's 2001 promise to hit 10 GHz in the next few years (this was based on critical-path optimisation) evaporated quite spectacularly with Prescott in 2004. A successor, Cedar Mill (65 nm) appeared, didn't run much faster (it was a direct die-shrink) and was still a 31 stage pipeline. Tejas, the successor to Cedar Mill, was taped out and tested by the end of 2003. It had a 40-50 stage pipeline and ran as high as 7 GHz. The testing parts ran at 2.8 GHz and 150 watts. A full-on 5 GHz Tejas would have needed nearly 300 watts. Intel insiders told me at the time that they had LN2 cooled Tejas parts running at 7.2 GHz quite stable. Trouble was, they used 320 watts and were about 30% faster than the already terrifyingly fast Athlon64 FX-51 (which used about 70 watts). In May 2004, Intel announced that Tejas was to be cancelled.

Athlon "Mustang"
In 1999, AMD revealed they were kicking around different K7 SKUs: the names "Athlon Professional" and "Athlon Select" were mentioned for high-end and entry-level parts. "Select" became the Duron, while "Professional" was never released, but what was it? The MHz race to 1,000 MHz went a bit faster than anyone, even AMD, had anticipated. AMD was preparing a 50-60 million transistor successor to the original K7, by integrating L2 cache, expanding SSE support to full, and adding a hardware prefetch system. This was called "Thunderbird" and it was going to go "Xeon-like" with a workstation/server class variant with 1 MB L2 cache named "Mustang". As things worked out, however, "Thunderbird" was not ready but Athlon could not scale past 1 GHz due to its off-die L2 cache being limited to just 350 MHz. "Thunderbird" became "Palomino" and a very easy validation of the unchanged Athlon with 256 kB of on-die L2 cache was released as "Thunderbird". "Palomino" debuted as "Athlon4" in small quantities as mobile parts, later as "Athlon XP", but AMD's original intention was to go straight from the cartridge-based Athlon to what eventually was released as Athlon XP. This would have been an awesome bump in performance and would have probably buried the Pentium4 before it had ever had a chance.
So what of Mustang? A 1 GHz sample of a Thunderbird (no enhancements) with 1 MB cache which leaked out turned out to be maybe 10% faster than an equivalent Thunderbird - Palomino would have been faster than Mustang. As we found later with Barton, which was maybe 5% faster with 512 kB cache instead of 256 kB over Athlon XP, the K7 core wanted faster cache, not more of it. Palomino's prefetcher made the cache look faster, so gave the core a large performance boost.

The Athlon Backup Plan
What, though, if Athlon had failed or been delayed? AMD had another trick up its sleeve, another revision of the venerable K6. At 180 nm, the AMD K6-2+ had 128 kB of L2 cache onboard and the K6-III had 256 kB. While they had short pipelines and couldn't clock highly, they were somewhat present in the retail channel and enthusiasts reported success in hitting as much as 650 MHz with them. They were highly competitive with Intel's Celerons, but unnecessary: AMD's Durons were utterly destroying Intel's low-end.

SIMD comes to x86
SIMD allows a single instruction to work on two, four or eight pieces of data at the same time. The instruction and data are loaded normally, but the data is in a packed format: A 128 bit SSEx value is actually four 32 bit values concatenated (MMX values are 64 bits wide). The same instruction works on all four values at once. This is very cheap to implement, as the instruction setup and transit logic can be shared, while only the ALUs or FPUs need duplication and they're small without their supporting logic.
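To make the packed format concrete, here is a minimal sketch in Python. It models a 128 bit register as a plain integer holding four 32 bit lanes, and performs one lane-wise add. The function names are hypothetical and the lane order (lane 0 in the lowest bits) is an assumption made for illustration. Real hardware does all four additions in parallel, which the loop here only imitates.

```python
LANES = 4
LANE_BITS = 32
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Concatenate four 32 bit values into one 128 bit integer, lane 0 lowest."""
    if len(values) != LANES:
        raise ValueError("expected exactly four values")
    word = 0
    for i, v in enumerate(values):
        word |= (v & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 128 bit integer back into its four 32 bit lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def packed_add(a, b):
    """One 'instruction': add each lane independently, wrapping at 32 bits."""
    return pack([(x + y) & LANE_MASK for x, y in zip(unpack(a), unpack(b))])


a = pack([1, 2, 3, 4])
b = pack([10, 20, 30, 0xFFFFFFFF])
print(unpack(packed_add(a, b)))  # [11, 22, 33, 3]
```

The last lane wraps around rather than carrying into its neighbour, which is the property that lets the lanes share one instruction but keep separate arithmetic units.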
By 1995, Intel engineers and architectural designers wanted to add SIMD to the next CPU core, P6. Intel management was more conservative, however, as adding a new instruction set and register aliasing meant a new CPU mode. Could it be done by modifying the existing P5, and in a way it wouldn't be so intrusive? It could be, and a very limited form of SIMD which worked on integer data formats only was added to P5 and named "MMX". It did very little and had extremely limited use: It could be used to accelerate the inverse discrete cosine transform stage of DVD decoding (and the discrete cosine transform step of MPEG1/2/4 encoding), but very little else showed any benefit. What was originally proposed eventually became SSE, released with the PentiumIII in 1999. Had Intel been that little more adventurous, MMX could easily have been something like AMD's "3DNow!" which added SIMD as an extension to MMX itself, making MMX much more useful for real-world applications and allowing it to do floating point add/sub/mul/reciprocal and so on. 3DNow! implementations of code which even a fast P6 FPU would choke on were very fast.

3DNow! ultimately failed. Some game engines supported it, as did DirectX, and it was present on Athlon and AthlonXP. AthlonXP added in 19 new instructions which were part of SSE working on integer data. As 3DNow! was aliased onto the x87 registers, it could be saved and restored using conventional FSAVE and FRSTOR instructions, meaning the OS need not support 3DNow! at all for code to be able to use it. AMD continued 3DNow! support until the Bulldozer architecture on laptop, desktop and server, and no AMD ultra-mobile core has ever supported it (Bobcat didn't have it). 3DNow! prefetch instructions, which remain a really good way to prefetch in certain situations, remain supported. The most recent APU supporting 3DNow! is Llano, which is based on the Phenom-II era "Stars" core and has a VLIW-5 based TeraScale2 (Radeon HD 5000 series) GPU integrated.

Process Nodes and Fabrication
A process node is the "fineness" of the fabrication process, or how small a feature it can etch into the silicon, usually measured by "feature size", but not always. Gate length is another measure, and, by around 2013, the process node was simply a number and bore little relation to what was actually being fabricated.

We started measuring them in microns (µm, micrometers, one millionth of a metre), so we had Pentium MMX fabricated ("fabbed") on a 0.35 micron process. The next process was a 0.25 µm (K6-2, Pentium-II), the one after was 0.18 µm, then 0.13 µm, 0.09 µm, 0.065 µm... we started to use nanometers then, so 130 nm, 90 nm, 65 nm, 45 nm, 22 nm, 20 nm (at TSMC), then 14 and 16 nm in 2016. Samsung was shipping 7 nm in 2018, GlobalFoundries expects volume in early 2019, but Intel, always one step ahead of everyone else, royally screwed up its 10 nm process and was still stuck on 14 nm.

There's a clear path to at least 4 nanometers according to Intel, but at that point there are too few atoms across a gate to be able to actually hold a proper band-gap. The 14 nm node has just thirty silicon atoms across the gate (silicon's lattice constant is 0.543 nm). The semiconductor properties of silicon depend on its small band-gap, which depends on molecular orbitals, which require a large number of contributing atomic orbitals. Fewer than 30 and there aren't enough atomic orbitals to contribute, so the band-gap between bonding and antibonding orbitals is poorly defined.
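The order of magnitude can be checked with a little arithmetic. This minimal sketch divides a feature length by silicon's lattice constant (0.543 nm, as quoted above). Counting lattice constants is only a crude stand-in for counting atoms, so treat the results as ballpark figures. The function name is hypothetical, and the 25 nm gate is the figure this section gives for the 65 nm bulk process.

```python
SI_LATTICE_NM = 0.543  # silicon lattice constant, as quoted in the text


def lattice_spacings(length_nm, lattice_nm=SI_LATTICE_NM):
    """How many lattice constants fit across a feature of the given length."""
    return length_nm / lattice_nm


for label, nm in (("14 nm node", 14.0), ("25 nm gate", 25.0)):
    print(f"{label}: about {lattice_spacings(nm):.0f} lattice constants")
```

Both results land in the tens, which matches the "around thirty" and "fewer than 100" scale this section describes.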

The width of the gate, that is, how much insulating material is between the two ends, was just 25 nm on the 65 nm bulk silicon process. As there were fewer than 100 atoms across the gate, leakage via quantum tunneling was significant, leading to higher power use. A typical contract fabrication would have two processes, low power (LP) and high performance (HP). The LP process would have a wide gate, maybe 50 nm, but poorer density as a result. As the distance between transistors affects how fast they can be switched (clock frequency), LP fabrication means the chip doesn't run as fast. HP has a narrower gate, so runs faster, but uses far more power.

The gate's thickness was down to as little as 1.5 nm (Intel), meaning around three atoms of oxide insulated the switching part of the transistor: This causes massive leakage. By 22 nm, in 2012, the gate oxide layer was just 0.5 nm, about twice the diameter of a silicon atom, and could not get any smaller.

Intel delayed 14 nm from "the end of 2013" to the last quarter of 2014, a nearly-unprecedented year-long delay to a process.

Volume availability of 14 nm was not until the middle of 2015, nearly two years after estimates, and Broadwell was just a die shrink of the earlier Haswell, from 22 to 14 nm, a "tick" in the "tick-tock" model. Intel considered the 22 to 14 nm switch to be the most difficult it had ever done. Because Broadwell was so late, Skylake was ready to launch just a few months later.

And that's where Intel's story ends for now! 14 nm was refined many times, to a "14+++" for Comet Lake, but 10 nm was always slightly out of reach, not ready for prime time. Intel sampled 10 nm in 2018, but had not achieved mainstream availability even by 2020: In 2020, Intel's mainstream was still 14 nm. The rest of the industry was pushing 7 nm and even ramping up 5 nm. To be fair on Intel, its 14 nm class process had the best geometry in the industry, but did not match its leading geometry with leading density, which is the entire point!

Intel, however, was still struggling to progress beyond 14 nm. Some very limited run 10 nm products had seen the light of day. Intel stretched the 14 nm Skylake core through several "generations" of thoroughly similar chips: Skylake, Kaby Lake, Coffee Lake, Cannon Lake, Whiskey Lake and Comet Lake, all of which used the same central architecture.

Now, Intel was the worst victim of this, and fell the furthest, but the struggle below 20 nm was industry-wide. GlobalFoundries completely failed to make a working 14 nm, so it licensed Samsung's 14LPP. TSMC initially sold its process, a little less tight than Samsung or Intel's 14 nm, as a 16 nm. For the first time in living memory, Intel was not first to a major node, Samsung was. GlobalFoundries ceased R&D and is not pursuing nodes below its optimised 14 nm, which it calls 12 nm. TSMC came out of it a winner.

On 10 nm, Intel hit it out the park to begin with: A year after TSMC and Samsung, but almost twice their density! TSMC and Samsung hovered between 50 and 60 million transistors per square millimeter, Intel released 10 nm Cannon Lake with 100.8! Then it failed to make it to mass production. It was just too low yielding. Intel scrapped that and tried again. Intel had actually made a 7 nm class process but lacked the technology to deliver it well. Instead of extreme ultra-violet, Intel used self-aligned quad patterning (so did TSMC) but just didn't get it working well enough for prime time.

As of 2020, TSMC has entered volume production of 5 nm (density 173) and Samsung is close behind with an early (this means "improvable") 5 nm process at a density of 127. Intel is skipping 10 nm entirely and aiming for a 7 nm class process, which is supposedly aiming for 200-250 million transistors per square millimeter: Or fitting an entire Sandy Bridge Core i7 in four square millimeters... About a grain of rice. It is likely Intel will consider this to be 5 nm-class.

Now Intel has lost its foundry advantage, there is talk of Intel going fabless and spinning off its manufacturing business like AMD did. The future also sees TSMC and Samsung aiming for either 3.5 nm or 3 nm class nodes. At TSMC, it will not be a refinement of the 5 nm (which was a refinement of 7 nm which was a refinement of abandoned 10 nm) but a new process. TSMC's 3 nm is expected to hit 200 MT/mm^2 in logic, less in SRAM (CPUs use a lot of SRAM, around 70% of the die). Both TSMC and Samsung have prototype production running, and believe they could enter risk production (an industry term meaning the yield is very low). Some engineers have been referring to 3.5 nm and 3 nm as "35 angstrom" and "30 angstrom" respectively. Samsung is using gate-all-around field effect transistors (GAAFETs), while TSMC is retaining FinFETs for a proposed 3.5 nm and may use GAAFETs for 3 nm.

In the distant future, of 2025-2030, it is felt that scaling cannot go beyond 250-300 MT/mm^2 (a refined "end point" 3 nm). This would be enough to fit Nvidia's enormous TU102 (GeForce RTX TITAN) in just 62 mm^2. A GPU built on this "end point" would be able to clock at around 3 GHz and have 50,000 "cores", for a peak floating point performance of around 300 TFLOPS. The fastest 2020 GPU comes in at 35 TFLOPS.
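The "end point" figures above follow from simple arithmetic, sketched here. Two inputs are assumptions not stated in the text: TU102's transistor count is taken as 18.6 billion, and each "core" is assumed to retire one fused multiply-add (counted as two floating point operations) per clock. The function names are hypothetical.

```python
def die_area_mm2(transistors_millions, density_mt_per_mm2):
    """Die area needed for a transistor budget at a given density (MT/mm^2)."""
    return transistors_millions / density_mt_per_mm2


def peak_tflops(cores, clock_ghz, flops_per_clock=2):
    """Peak throughput, assuming one FMA (2 FLOPs) per core per clock."""
    return cores * clock_ghz * flops_per_clock / 1000.0


TU102_MT = 18_600  # assumed: 18.6 billion transistors, in millions

print(die_area_mm2(TU102_MT, 300))  # 62.0 mm^2 at the top of the 250-300 range
print(peak_tflops(50_000, 3.0))     # 300.0 TFLOPS
```

At the lower bound of 250 MT/mm^2 the same chip would need a somewhat larger die, so the 62 mm^2 figure corresponds to the optimistic end of the quoted range.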
Intel S-Specs
While AMD binning codes (e.g. CACJE on the Phenom II X4 955s here) are similar to Intel's, Intel ran a more tightly controlled operation for longer, so they will be discussed here.
Intel's S-spec today gives a very particular set of configurations its own code. So if we have S-spec SR0LB, we know it's made from a Sandy Bridge-EP, has a maximum multiplier of 18, Turbo is disabled, HyperThreading is disabled, four cores and 10 MB L3 cache are disabled, memory controller's 1333 and 1600 MHz capabilities are disabled, and it is configured to a TDP of 80 watts and a throttle temperature of 95C.
The S-spec doesn't tell anyone all this, it is instead a unique identifier, like a catalog number, which enables it to be looked up.

Today an S-spec is a very specific processor SKU, so all SR0LB parts are Xeon E5-2603. However, the marketing name may be given to multiple S-specs, particularly if multiple die types are used for the same SKU. E5-2603 only ever used SR0LB in retail (it had some pre-qualification QB-codes, we don't count those) but if we look at the Xeon E5-2640 it came in both C1 and C2 steppings, meaning different silicon was used. Slightly different, but still different. This means E5-2640 used S-spec SR0H5 and SR0KR.
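The catalog-number idea can be sketched as a lookup table. This is a toy illustration using only the three S-specs named above; the dictionary and function names are hypothetical and it is not a real Intel database. Which of SR0H5 and SR0KR belongs to which stepping is not stated here, so the sketch leaves steppings out.

```python
# S-spec -> marketing name, limited to the examples discussed in this section.
S_SPECS = {
    "SR0LB": "Xeon E5-2603",
    "SR0H5": "Xeon E5-2640",
    "SR0KR": "Xeon E5-2640",
}


def sku_for(s_spec):
    """An S-spec identifies exactly one SKU."""
    return S_SPECS[s_spec.upper()]


def s_specs_for(sku):
    """A SKU may have been sold under several S-specs."""
    return sorted(code for code, name in S_SPECS.items() if name == sku)


print(sku_for("SR0LB"))             # Xeon E5-2603
print(s_specs_for("Xeon E5-2640"))  # ['SR0H5', 'SR0KR']
```

The lookup is one-to-one in one direction and one-to-many in the other, which is the asymmetry the paragraph above describes.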

Intel says that identical SKUs will always have the same base performance. In these days of high temperatures and variable maximum clocks, a later stepping often runs cooler than an earlier one, increasing the maximum performance. In our example's case, the C2 stepping fixed an "erratum" (a bug) in VT-d.

If we drop back a few years to the Intel Pentium 4A, like our example of SL63X, we find Intel had three released steppings, B0, C1 and D1. B0 had four S-specs, C1 had four S-specs and D1 had two for the 1.8A model! Our SL63X was the second chronologically, and a B0 stepping. Earlier steppings ran at 1.525V, later ones at 1.5V.
     
Script by dutches   © 2002-2021 Hattix.co.uk. All rocks deserved. Hattix hardware images are licensed under a Creative Commons Attribution 2.0 UK: England & Wales License except where otherwise stated.