Top of the Class 478

So I have that old Intel D865PERL board, which is a Socket 478/AGP board. There’s a 3.2 GHz Northwood in it but of course I was wondering, what’s the fastest CPU this board supports? And it turns out to be a question which is not so easy to answer.

The fastest clock speed supported by the board is easy to figure out: 3.4 GHz, with 800 MHz FSB speed. But there are three different 3.4 GHz CPUs that the board supports: Intel calls them Pentium 4 3.40 GHz, Pentium 4 Extreme Edition 3.40 GHz, and Pentium 4 3.40E GHz. These are also known as Northwood, Gallatin, and Prescott, respectively—with the minor caveat that Intel’s ark calls the Extreme Edition variant Northwood while others call it Gallatin.

3.4 GHz/800 MHz FSB Northwood/Gallatin/Prescott

What’s not in question is that the first (Northwood) has 8 KB L1 cache and 512 KB L2 cache; the second (Gallatin) has 8 KB L1 cache, 512 KB L2 cache, and 2 MB L3 cache; and the third (Prescott) has 16 KB L1 cache and 1 MB L2 cache.

The Gallatin variant is essentially a Northwood with an additional 2 MB of L3 cache (a Xeon in a desktop package); both are manufactured on the 130nm process. The Prescott is an updated design with double the L1/L2 cache and the added SSE3 instruction set, manufactured on the 90nm process. Note that Northwood/Gallatin CPUs display a 2001 copyright label on the heat spreader while Prescott has a 2004 copyright. All support Hyper-Threading, MMX, SSE, and SSE2.
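To keep the three parts straight, the feature list above can be written down as a small lookup table. This is only a sketch in Python: the names `CoreSpec`, `CORES`, and `identify_core` are made up for illustration, and the table covers just the three 3.4 GHz CPUs discussed here, using the cache sizes quoted in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreSpec:
    l1_kb: int       # L1 cache size in KB, as quoted in the post
    l2_kb: int       # L2 cache size in KB
    l3_kb: int       # L3 cache size in KB (0 if the core has none)
    sse3: bool       # True if the core adds SSE3
    process_nm: int  # manufacturing process in nanometers


# Hypothetical lookup table; values come from the text, nothing else is assumed.
CORES = {
    "Northwood": CoreSpec(8, 512, 0, False, 130),
    "Gallatin": CoreSpec(8, 512, 2048, False, 130),
    "Prescott": CoreSpec(16, 1024, 0, True, 90),
}


def identify_core(l2_kb: int, l3_kb: int, sse3: bool) -> str:
    """Return the name of the core matching the given features, or 'unknown'."""
    for name, spec in CORES.items():
        if (spec.l2_kb, spec.l3_kb, spec.sse3) == (l2_kb, l3_kb, sse3):
            return name
    return "unknown"
```

For example, `identify_core(512, 2048, False)` picks out the Gallatin, since the L3 cache is the only thing that separates it from the Northwood in this table.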

Okay, so knowing the features and lineage, the Prescott and Gallatin should both be better than the Northwood. And in my tests, they indeed are. But the comparison between Prescott and Gallatin is not at all straightforward.

I used two simple benchmarks for comparing the processors: CPU-Z 1.77 and 3DMark 2001 SE (with a fast graphics card, 3DMark is rather sensitive to CPU performance). The board I used for comparisons was not the D865PERL (Rock Lake) but instead a rather similar Intel D875PBZ (Bonanza) with a Radeon HD 4670 running Windows XP.

In CPU-Z, the results were straightforward:

              Northwood   Gallatin   Prescott
Single/Multi  171/109     172/109    182/132

The Northwood and Gallatin have more or less identical results because they use the same core and the test clearly doesn’t benefit from the L3 cache. The Prescott has bigger L1/L2 caches, and the multi-threaded score in particular shows a significant improvement (132 vs. 109, or about 21%). In other words, newer is better.

Now 3DMark 2001 SE (rounded averages from several runs):

          Northwood   Gallatin   Prescott
3DMarks   20,350      23,300     21,100

Hmm… the Northwood comes in last again, but the Gallatin beats the Prescott by a decent margin. That L3 cache really does something, after all.

So in the simple CPU-Z benchmark, the Prescott has about 6% better single-threaded performance than the Gallatin (182 vs. 172), but in 3DMark (and that’s a 3D benchmark, not a pure CPU benchmark) the Gallatin is about 10% faster than the Prescott (23,300 vs. 21,100).
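The percentages are easy to recompute from the two result tables. Here is a minimal sketch in Python; the helper `percent_faster` and the dictionary names are my own, and the scores are simply copied from the tables above.

```python
def percent_faster(a: float, b: float) -> float:
    """Return by how many percent score a is higher than score b."""
    return (a / b - 1.0) * 100.0


# Scores copied from the tables in this post.
CPUZ_SINGLE = {"Northwood": 171, "Gallatin": 172, "Prescott": 182}
CPUZ_MULTI = {"Northwood": 109, "Gallatin": 109, "Prescott": 132}
MARKS_3D = {"Northwood": 20350, "Gallatin": 23300, "Prescott": 21100}

# Prescott vs. Gallatin, CPU-Z single-threaded
cpuz_single = percent_faster(CPUZ_SINGLE["Prescott"], CPUZ_SINGLE["Gallatin"])
# Prescott vs. Gallatin, CPU-Z multi-threaded
cpuz_multi = percent_faster(CPUZ_MULTI["Prescott"], CPUZ_MULTI["Gallatin"])
# Gallatin vs. Prescott, 3DMark 2001 SE
marks_3d = percent_faster(MARKS_3D["Gallatin"], MARKS_3D["Prescott"])

print(f"CPU-Z single: Prescott ahead by {cpuz_single:.1f}%")  # 5.8%
print(f"CPU-Z multi:  Prescott ahead by {cpuz_multi:.1f}%")   # 21.1%
print(f"3DMark:       Gallatin ahead by {marks_3d:.1f}%")     # 10.4%
```

Keep in mind that the 3DMark numbers are rounded averages, so the last digit of any percentage derived from them should not be taken too seriously.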

In other words, the answer is (as it so often is) “it depends”. The Northwood has no advantage over either of the others (other than a slightly lower TDP, see below), but depending on the workload, the Prescott can benefit from the improved architecture and doubled L1/L2 caches, or the Gallatin can take advantage of the relatively big L3 cache and easily beat the Prescott.

The 3.4 GHz Northwood has a slightly lower TDP (89W) than the other two (103W and 102.9W), but it’s not a big difference. The Gallatin is nowadays considerably harder to find than the Prescott or Northwood, and may also be considerably more expensive, but doesn’t have to be. Back in the day, the 3.4 GHz Socket 478 Gallatin’s list price was $999 while the Prescott was listed at “only” $417.

This simple and unscientific benchmark illustrates that Prescott was not a particularly successful design. Despite the process shrink, it put out more heat than a Northwood at the same clock speed, presumably due to the larger caches. That earned Prescott the “PresHot” nickname. Intel was also quite unsuccessful in scaling the clock speed. While Northwood went from 1.6 GHz all the way to 3.4 GHz, Intel hit a wall with the Prescott and only released CPUs with up to 3.8 GHz clock speed. While the Prescott saw the addition of 64-bit architecture (AMD64 compatible) and virtualization (VT-x), it was not a great performer, only a great Watt-gobbler.

Summary: The coolest (in a very strictly metaphorical sense) Socket 478 CPU is the 3.4 GHz Extreme Edition Gallatin, but the 3.4 GHz Prescott is much easier to find and may be either slower or faster depending on the application.


8 Responses to Top of the Class 478

  1. Haru Jayasekara says:

    Hi, could you tell me what model the big/tall capacitor near the speaker is on your D865PERL? I was given this board but it’s missing that particular capacitor.

  2. Michal Necasek says:

    I could… but not quickly. I don’t have the D865PERL with me physically and I’ll only be able to look at it over Easter (mid-April). But if you’re still interested, let me know. I don’t know offhand how hard it will be to see when the board is inside a case.

  3. Haru Jayasekara says:

    If I haven’t found out by then, I will let you know 🙂

  4. Why did NetBurst processors run so hot?

  5. Michal Necasek says:

    Lots of transistors, high clock frequency? Beyond that, I don’t really know.

  6. Prototyped says:

    Replay. There was an in-depth article on iXBT (later xbit labs) about it.

    https://web.archive.org/web/20050608023824/http://www.xbitlabs.com/articles/cpu/display/netburst-2.html

    Essentially the super long pipelines in NetBurst processors’ execution units would lose a lot of performance due to pipeline hazards such as needing to access data from cache or memory (which was many times slower than the ALUs and FPUs). So when that happened, the execution units would circulate the in-flight instructions until the hazards were resolved. This was very wasteful as, instead of power-gating the silicon or introducing pipeline bubbles, the ALUs and FPUs were doing throwaway work repeatedly.

    This was also why these processors were so slow considering their clock rates (and e.g. why Willamette was usually outperformed by the much lower-clocked Tualatin Pentium III, and why the mobile Pentium 4-Ms were handily beaten by the faster and cooler-running Pentium M Banias and Dothan, descendants of the Tualatin Pentium III).

    I wouldn’t call Prescott/Cedar Mill (65 nm shrink) necessarily “improved” over Northwood—their pipelines were much longer to enable higher clock rates, and Prescott/Prescott-2M/Cedar Mill (and their Pentium D versions Smithfield/Presler) were a substantial departure from the Willamette/Northwood architectures.

    In many ways the Pentium 4 series was a blind alley where the design team was chasing clock speed at the expense of everything else, including actual performance. There’s a reason that earlier in this era, AMD’s Socket A Athlons (Thunderbird, Palomino, Thoroughbred) were able to compete so well with the Pentium 4 and, later, as the Athlon 64s and Athlon X2s came down in price, they were able to easily dominate the Pentium 4 and D. (There’s also a good reason why the Pentium D was so inexpensive compared to the Athlon 64 X2 competitors.)

  7. Prototyped says:

    xbitlabs also had a specific deep dive into Replay as a followup:

    https://web.archive.org/web/20051201064319/http://www.xbitlabs.com/articles/cpu/print/replay.html

  8. Michal Necasek says:

    The premise of the NetBurst architecture was that it would scale to (circa) 10 GHz. Which sounds absolutely bonkers today, but must have sounded far less crazy after Intel had gone from 1 to 10 to 100 MHz and then to 1 GHz. The P6 arch basically increased the frequency tenfold, and enough people at Intel clearly thought NetBurst could too. But then they hit a really hard wall around 4 GHz and the whole thing fell apart. So yes, a blind alley.

    I do wonder how many (if any) people at Intel said right at the beginning “this is never gonna work” vs how many thought it would.
