Overclocking Extravaganza: Radeon HD 4890 To The Max
by Derek Wilson on April 29, 2009 12:01 AM EST - Posted in GPUs
Combined Memory and Core Overclocking: The Sweet Spot
In this round of tests, we combine our previous maximum overclocks. This is a compromise: we show the maximum potential of combined core and memory overclocking rather than the effect of memory overclocking at each core clock speed we tested. While the latter option would be more complete, these tests show enough for readers to find the sweet spot themselves.
We theorized that, at an extreme core clock speed, memory might become a performance bottleneck at some point. Even though increasing the memory clock without increasing the core clock didn't do much at all, we could see a benefit beyond what one might expect from our initial memory overclocking results once the core was pushed higher.
Earlier we looked at a varied memory clock with a stock core clock, and a varied core clock with a stock memory clock. Let's revisit both of those, but with a twist: we will also look at the percent increase in performance when overclocking memory with a 1GHz core clock, and when overclocking the core with a 1.2GHz memory clock.
[Graphs: percent performance increase from memory overclocking at stock and 1GHz core clocks, and from core overclocking at stock and 1.2GHz memory clocks -- selectable at 1680x1050, 1920x1200, and 2560x1600]
Note that in both cases we see a much bigger boost in performance. This means that while applications tend to be very heavily compute limited, at higher core clock speeds on AMD hardware memory bandwidth increasingly becomes a bottleneck. Now let's take a look at what we get when going from a completely stock part to a maximally overclocked part at 1GHz/1.2GHz (core/memory).
[Graph: percent performance increase from stock to the 1GHz/1.2GHz overclock -- selectable at 1680x1050, 1920x1200, and 2560x1600]
To get a basic idea of what's going on, here's an example with two programs. Remember that this isn't a real-world scenario; it's just to illustrate the concept.
The first application is completely compute bound and the second is 50% compute bound and 50% memory bandwidth bound. Both tests generate 100 frames per second on a stock Radeon HD 4890. If we increase core clock speed 10%, the first application will generate 110 frames per second, while the second one would only generate 105. This is because we only see the 10% benefit while doing half of the work. If we look at only boosting memory performance 10%, the first program delivers only 100 fps while the second hits 105 again. Pushing both memory and core clock speed up 10% each gives us 110 frames per second from both applications. Basically.
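For anyone who wants to play with the numbers, here is a minimal sketch of that toy model in Python. The estimated_fps function and the strictly serial compute/memory split are illustrative assumptions for this example only, not a description of how a real GPU pipeline behaves.

```python
# Toy model: frame time is split between a compute-bound portion and a
# memory-bandwidth-bound portion, and each portion scales with its clock.
# The numbers below are illustrative, not measured data.

def estimated_fps(base_fps, compute_fraction, core_boost, mem_boost):
    """Estimate FPS after overclocking, given the fraction of frame time
    that is compute bound (the rest is memory bandwidth bound)."""
    base_frame_time = 1.0 / base_fps
    compute_time = base_frame_time * compute_fraction / (1.0 + core_boost)
    memory_time = base_frame_time * (1.0 - compute_fraction) / (1.0 + mem_boost)
    return 1.0 / (compute_time + memory_time)

# Application 1: fully compute bound; Application 2: 50/50 split.
print(estimated_fps(100, 1.0, 0.10, 0.00))  # ~110 fps
print(estimated_fps(100, 0.5, 0.10, 0.00))  # ~105 fps
print(estimated_fps(100, 0.5, 0.00, 0.10))  # ~105 fps
print(estimated_fps(100, 0.5, 0.10, 0.10))  # ~110 fps
```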
No real application is that cleanly divided, but the important thing to remember is that different applications make varying use of different resources, and balancing those resources is important to ensuring the best performance in the most efficient package.
So, to find the sweet spot for your overclock, you will want to increase core clock speed as much as you can. Then bump up the memory clock and see how high you can get it and remain stable. Use a real-world application to test performance at each point, and then use a binary-search-like algorithm to find the sweet spot in a small number of tests. And there you have it. We didn't do this for you, but what's better practice than a little hands-on experience, right? Besides, it gives readers the opportunity to compare notes in the comments on what the optimal memory clock for a 1GHz core clock on the 4890 would be. Have fun!
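For the curious, here is a rough sketch of what that binary-search-style tuning could look like in Python. The set_memory_clock and benchmark_fps helpers are hypothetical stand-ins for whatever overclocking utility and real-world benchmark you actually use, and the search assumes gains from extra memory clock taper off monotonically.

```python
# Rough sketch of binary-search-style memory clock tuning at a fixed core
# clock. set_memory_clock() and benchmark_fps() are hypothetical helpers,
# not a real API.

def find_memory_sweet_spot(low_mhz, high_mhz, set_memory_clock, benchmark_fps,
                           tolerance_mhz=10, min_gain_fps=0.5):
    """Narrow down the lowest memory clock past which a real-world benchmark
    stops gaining meaningful performance at your chosen core clock."""
    set_memory_clock(low_mhz)
    best_fps = benchmark_fps()
    while high_mhz - low_mhz > tolerance_mhz:
        mid = (low_mhz + high_mhz) // 2
        set_memory_clock(mid)
        fps = benchmark_fps()
        if fps - best_fps >= min_gain_fps:
            # Still seeing a worthwhile gain: the sweet spot is at or above mid.
            low_mhz, best_fps = mid, fps
        else:
            # Gains have flattened out (or the card is unstable): look lower.
            high_mhz = mid
    return low_mhz
```

Over a few hundred MHz of range with a 10MHz tolerance, a search like this converges in five or six benchmark runs instead of testing every step.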
61 Comments
walp - Thursday, April 30, 2009 - link
I have a gallery showing how to fit the Accelero S1 to the 4890 (in Swedish though):
http://www.sweclockers.com/album/?id=3916
Ah, here's the translated version: =)
http://translate.google.se/translate?js=n&prev...
You can change the voltage on every 4890 card without BIOS modding since they are all the same piece of hardware:
http://vr-zone.com/articles/increasing-voltages--e...
It's fortunate that it's so easy, because ASUS SmartDoctor sucks ass; it doesn't work on my computer anymore.
(CCCP:Crappy-Christmas-Chinese-Programmers...no pun intended ;)
\walp
kmmatney - Thursday, April 30, 2009 - link
Cool - thanks for the guide. I ordered the Accelero S1 yesterday. Nice how you got heatsinks on all the power circuitry.
balancedthinking - Wednesday, April 29, 2009 - link
Nice, Derek is still able to write decent articles. Bad for the somewhat stripped-down 4770 review but good to see it does not stay that way.
DerekWilson - Wednesday, April 29, 2009 - link
Thanks :-) I suppose I just thought the 4770 article was straightforward enough to be stripped down -- that I said the 4770 was the part to buy and that the numbers backed that up enough that I didn't need to dwell on it.
But I do appreciate all the feedback I've been getting and I'll certainly keep that in mind in the future. More in depth and more enthusiastic when something is a clear leader are on my agenda for similar situations in the future.
JanO - Wednesday, April 29, 2009 - link
Hello there, I really like the fact that you only present us with one graph at a time and let us choose the resolution we want to see in this article...
Now if we only could specify what resolution matters to us once and have Anandtech remember so it presents it to us by default every time we come back, now wouldn't that be great?
Thanks & keep up that great work!
greylica - Wednesday, April 29, 2009 - link
Sorry for AMD, but even with a super powerful card in DirectX, their OpenGL implementation is still bad, and Nvidia rocks in professional applications running on Linux. We saw the truth when we put a Radeon 4870 in front of a GTX 280. The GTX 280 rocks in redraw mode, in interactive rendering, and in OpenGL composition. Nvidia is a clear winner in OpenGL apps. Maybe it's because of the extra transistor count, which allows the hardware to outperform any Radeon in OpenGL, whereas AMD still has driver problems (a bunch of them) on both Linux and Mac. But Windows gamers are the market niche AMD cards are targeting...
RagingDragon - Wednesday, May 13, 2009 - link
WTF? Windows gamers aren't a niche market, they're the majority market for high-end graphics cards. Professional OpenGL users are buying Quadro and FireGL cards, not GeForces and Radeons. Hobbyists and students using professional GL applications on non-certified GeForce and Radeon cards are a tiny niche, and it's doubtful anyone targets that market. Nvidia's advantage in that niche is probably an extension of their advantage in professional GL cards (Quadro vs. FireGL), essentially a side effect of Nvidia putting more money/effort into their professional GL cards than AMD does.
ltcommanderdata - Wednesday, April 29, 2009 - link
I don't think nVidia having a better OpenGL implementation is necessarily true anymore, at least on the Mac.
http://www.barefeats.com/harper22.html
For example, in Call of Duty 4, the 8800GT performs significantly worse in OS X than in Windows. And you can tell the problem is specific to nVidia's OS X drivers rather than the Mac port since ATI's HD3870 performs similarly whether in OS X or Windows.
http://www.barefeats.com/harper21.html
Another example is Core Image GPU acceleration. The HD3870 is still noticeably faster than the 8800GT with the latest 10.5.6 drivers, even though the 8800GT is theoretically more powerful. The situation was even worse when the 8800GT was first introduced with the drivers in 10.5.4, where even the HD2600XT outperformed the 8800GT in Core Image apps.
Supposedly, nVidia has been doing a lot of work on new Mac drivers coming in 10.5.7 now that nVidia GPUs are standard on the iMac and Mac Pro too. So perhaps the situation will change. But right now, nVidia's OpenGL drivers on OS X aren't all they are made out to be.
CrystalBay - Wednesday, April 29, 2009 - link
I'd like to see some benches of highly clocked 4770's XFired.
entrecote - Wednesday, April 29, 2009 - link
I can't read graphs where multiple GPU solutions are included. Since this article mostly talks about single GPU solutions, I actually processed the images and still remember what I just read. I have an X58/Core i7 system and I looked at the CrossFire/SLI support as a negative feature (cost without benefit).