AMD's Radeon HD 5870: Bringing About the Next Generation Of GPUs
by Ryan Smith on September 23, 2009 9:00 AM EST - Posted in GPUs
DirectCompute, OpenCL, and the Future of CAL
For a journalist, GPGPU is one of the more frustrating subjects to cover. The concept is great, but the execution makes it difficult to cover accurately, a problem exacerbated by the fact that until now AMD and NVIDIA each had separate APIs. OpenCL and DirectCompute will unify things, but software will be slow to arrive.
As it stands, neither AMD nor NVIDIA has a complete OpenCL implementation that's shipping to end-users for Windows or Linux. NVIDIA has OpenCL working on the 8-series and later on Mac OS X Snow Leopard, and AMD has it working under the same OS for the 4800 series, but for obvious reasons we can’t test a 5870 in a Mac. As such it won’t be until later this year that we see either side get OpenCL up and running under Windows. Both NVIDIA and AMD have development versions that they're letting developers play with, and both have submitted implementations to Khronos, so hopefully we’ll have something soon.
It’s also worth noting that OpenCL is based around DirectX 10 hardware, so even after someone finally ships an implementation we’re likely to see a new version in short order. AMD is already talking about OpenCL 1.1, which would add support for the hardware features that they have from DirectX 11, such as append/consume buffers and atomic operations.
DirectCompute is in comparatively better shape. NVIDIA already supports it on their DX10 hardware, and the beta drivers we’re using for the 5870 support it on the 5000 series. The missing link at this point is AMD’s DX10 hardware; even the beta drivers we’re using don’t support it on the 2000, 3000, or 4000 series. From what we hear the final Catalyst 9.10 drivers will deliver this feature.
Going forward, one specific issue for DirectCompute development will be that there are three levels of DirectCompute, derived from DX10 (4.0), DX10.1 (4.1), and DX11 (5.0) hardware. The higher the version the more advanced the features, with DirectCompute 5.0 in particular being a big jump as it’s the first hardware generation designed with DirectCompute in mind. Among other notable differences, it’s the first version to offer double precision floating point support and atomic operations.
AMD is convinced that developers should and will target DirectCompute 5.0 due to its feature set, but we’re not sold on the idea. To say that there’s a “lot” of DX10 hardware out there is a gross understatement, and all of that hardware is capable of supporting at a minimum DirectCompute 4.0. Certainly DirectCompute 5.0 is the better API to use, but the first developers testing the waters may end up starting with DirectCompute 4.0. Releasing something written in DirectCompute 5.0 right now won’t do developers much good at the moment due to the low quantity of hardware out there that can support it.
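The targeting trade-off described above can be sketched in a few lines of Python. This is only an illustration, not a real Direct3D API: the `ComputeLevel` enum, the feature names, and the `minimum_level` and `can_run` helpers are all hypothetical, and the feature table encodes only the two differences named in the text (double precision and atomic operations arriving with 5.0).

```python
from enum import Enum


class ComputeLevel(Enum):
    """The three DirectCompute levels, keyed by (major, minor) version."""
    CS_4_0 = (4, 0)  # DX10-class hardware
    CS_4_1 = (4, 1)  # DX10.1-class hardware
    CS_5_0 = (5, 0)  # DX11-class hardware


# Hypothetical feature names; per the article these first appear in 5.0.
FEATURES_REQUIRING_5_0 = {"double_precision", "atomic_operations"}


def minimum_level(required_features):
    """Return the lowest level that offers every requested feature."""
    if FEATURES_REQUIRING_5_0 & set(required_features):
        return ComputeLevel.CS_5_0
    return ComputeLevel.CS_4_0


def can_run(hardware_level, required_features):
    """True if hardware at `hardware_level` can run the program."""
    needed = minimum_level(required_features)
    # Tuples compare element by element, so (4, 1) < (5, 0).
    return hardware_level.value >= needed.value
```

Under this model, a program that avoids the 5.0-only features runs on all three hardware generations, while one that uses atomics is limited to DX11 parts. That is the reach-versus-features choice early developers face.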
With that in mind, there’s not much of a software situation to speak about when it comes to DirectCompute right now. Cyberlink demoed a version of PowerDirector using DirectCompute for rendering effects, but it’s the same story as most DX11 games: later this year. For AMD there isn’t as much of an incentive to push non-game software as fast or as hard as DX11 games, so we’re expecting any non-game software utilizing DirectCompute to be slow to materialize.
Given that DirectCompute is the only common GPGPU API that is currently working on both vendors’ cards, we wanted to try to use it as the basis of a proper GPGPU comparison. We did get something that would accomplish the task; unfortunately, it was an NVIDIA tech demo. We have decided to run it anyhow as it’s quite literally the only thing we have right now that uses DirectCompute, but please take an appropriately sized quantity of salt – it’s not really a fair test.
NVIDIA’s ocean demo is a fairly simple proof of concept program that uses DirectCompute to run Fast Fourier transforms directly on the GPU for better performance. The FFTs in turn are used to generate the wave data, forming the wave action seen on screen as part of the ocean. This is a DirectCompute 4.0 program, as it’s intended to run on NVIDIA’s DX10 hardware.
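To make the idea concrete, here is a minimal CPU-side sketch of FFT-based wave synthesis in plain Python. It is not NVIDIA's code: the demo runs on the GPU over a 2D grid, while this sketch is one-dimensional, and the `1/k` amplitude falloff, the function names, and the parameters are assumptions chosen purely for illustration. The principle is to describe the waves as a spectrum of frequencies and let an inverse FFT turn that spectrum into surface heights.

```python
import cmath
import math
import random


def fft(values, inverse=False):
    """Recursive radix-2 Cooley-Tukey FFT (unnormalized).

    len(values) must be a power of two.
    """
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def wave_heights(n=64, seed=0):
    """Synthesize n real-valued wave heights from a random spectrum."""
    rng = random.Random(seed)
    spectrum = [0j] * n
    for k in range(1, n // 2):
        amplitude = 1.0 / k  # assumed falloff, not a physical ocean model
        phase = rng.uniform(0.0, 2.0 * math.pi)
        coeff = amplitude * cmath.exp(1j * phase)
        spectrum[k] = coeff
        # Mirror with the conjugate so the inverse transform is real.
        spectrum[n - k] = coeff.conjugate()
    heights = fft(spectrum, inverse=True)
    return [h.real / n for h in heights]
```

Animating the surface would mean advancing each coefficient's phase every frame and repeating the inverse transform, which is the kind of repeated, highly parallel work that suits a GPU.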
The 5870 has no problem running the program, and in spite of whatever home field advantage may exist for NVIDIA, it easily outperforms the GTX 285. Things get a little crazier once we start using SLI/Crossfire; the 5870 picks up speed, but the GTX 295 ends up being slower than the GTX 285. As it’s only a tech demo this shouldn’t be dwelt on too much beyond the fact that it’s proof that DirectCompute is indeed working on the 5800 series.
Wrapping things up, one of the last GPGPU projects AMD presented at their press event was a GPU implementation of Bullet Physics, an open source physics simulation library. Although they’ll never admit it, AMD is probably getting tired of being beaten over the head by NVIDIA and PhysX; Bullet Physics is AMD’s proof that they can do physics too. However, we don’t expect it to go anywhere given its very low penetration in existing games and the amount of trouble NVIDIA has had in getting developers to use anything besides Havok. Our expectations for GPGPU physics remain the same: the unification will come from a middleware vendor selling a commercial physics package. If it’s not Havok, then it will be someone else.
Finally, while AMD is hitting the ground running for OpenCL and DirectCompute, their older APIs are being left behind as AMD has chosen to focus all future efforts on OpenCL and DirectCompute. Brook+, AMD’s high level language, has been put out to pasture as a Sourceforge project. Compute Abstraction Layer (CAL) lives on since it’s what AMD’s OpenCL support is built upon; however, it’s not going to see any further public development, with the interface frozen at the current 1.4 standard. AMD is discouraging any CAL development in favor of OpenCL, although it’s likely the High Performance Computing (HPC) crowd will continue to use it in conjunction with AMD’s FireStream cards to squeeze every bit of performance out of AMD’s hardware.
327 Comments
erple2 - Tuesday, September 29, 2009 - link
What the heck are you talking about? Are you saying that electricity consumed by a device divided by the "volume" of the device is the only way to measure the heat output of the device? Every single Engineering class I took tells me that's wrong, and I'm right. I think you need to take some basic courses in Electrical Engineering and/or Thermodynamics.
(simplified)
power consumed = work + waste
You're looking for the waste heat generated by the device. If something can completely convert every watt of electricity that passes through it to do some type of work (light a light bulb, turn a motor, make some calculation on a GPU etc), then it's not going to heat up. As a result, you HAVE to take into consideration how inefficient the particular device is before you can make any claim about how much the device heats up.
I'll bet that if you put a Liquid Nitrogen cooler on every ATI card, and used the standard air coolers on every NVidia card, that the ATI cards are going to run crazy cooler than the NVidia cards.
Ultimately the temperature of the GPUs depends a significant amount on the efficiency of the cooler, and how much heat the GPU is generating as waste. My point is that we don't have enough data to determine whether the ATI die runs hot because the coolers are less than ideal, Nvidia ones are closer to ideal, the die is smaller, or whatever you have. You have to look at a combination of the efficiency of the die (how well it converts input power to "work done"), the efficiency of the cooler (how well it removes heat from its heat source), and the combination of the two.
I'd posit that the ATI card is more efficient than the NVidia card (at least in WoW, the only thing we have actual numbers of the "work done" and "input power consumed").
Now, if you look at the measured temperature of the core as a means of comparing the worthiness of one GPU over another, I think you're making just as meaningful a comparison as comparing the worthiness of the GPU based on the size of the retail box that it comes in.
SiliconDoc - Friday, September 25, 2009 - link
You simply repeated my claim about watts, and replaced core size, with fps, and created a framerate per watt chart, that has near nothing to do with actual heat inside the die, since the SIZE of the die, vs the power traversing through it is the determining factor, affected by fan quality (ram size as well).
Your argument is "framerate power efficiency", as in watts per framerate, and has nothing to do with core temperature (modified by fan cooling of course to some degree), that the article indeed posts except for the two failed ati cards.
The problem with your flawed "science" that turns it into hokum, is that no matter what outputs on the screen, the HEAT generated by the power consumption of the card itself, remains in the card, and is not "pumped through the videoport to the screen".
If you'd like to claim "wattage vs framerate" efficiency for 5870, fine I've got no problem, but claiming that proves core temps are not dependent on power consumption vs die size ( modified by the rest of the card *mem size/power usage/ and the fan heatsink* ) is RIDICULOUS.
---
The cards are generally equivalent manufacturing and component additions, so you take the wattage consumed (by the core) and divide by core size, for heat density.
Hence, ATI cards, smaller cores and similar power consumption, wind up hotter.
That's what the charts show, that's what should be stated, that is the rule, and that's the way it plays in the real world, too.
---
The only modification to that is heatsink fan efficiency, and I don't find you fellas claiming stock NVIDIA fans and heatsinks are way better than the ATI versions, hence 66C for NVIDIA, 75C, 85C, etc, and only higher for ATI, in all their cards listed.
Would you like to try that one on for size ? Should I just make it known that NVIDIA fans and heatsinks are superior to ATI ?
What is true is a larger surface area (die side squared) dissipates the same amount of heat easier, and that of course is what is going on.
ATI dies are smaller ( by a marked surface area as has so often been pointed out), and have similar power consumption, and a higher DENSITY of heat generation, and therefore run hotter.
erple2 - Friday, September 25, 2009 - link
Oops, "milliwatt" should be "kilowatt". I got the decimal place mixed up - I used kilowatt since I thought it was easier to see than 0.247, 0.140, 0.137, 0.181...
SiliconDoc - Wednesday, September 23, 2009 - link
Let's take that LOAD TEMP chart and the article's comments. Right above it, it is stated a good cooler includes the 4850 that IDLE TEMPs in at around 40C (it's actually 42C the highest of those mentioned).
"The floor for a good cooler looks to be about 40C, with the GTS 250(39C), 3870(41C), and 4850 all turning in temperatures around here"
OK, so the 4850 has a good cooler, as well as the 3870... then right below is the LOAD TEMP.. and the 4850 is @ 90C -OBVIOUSLY that good cooler isn't up to keeping that tiny hammered core cool...
3870 is at 89C, 4870 is at 88C, 5870 is at 89C ALL ati....
but then, nvidia...
250, 216, 285, 275 all come in much lower at 66C to 85C.... but "temps are all over the place".
NOT only that crap, BUT the 4890 and 4870x2 are LISTED but with no temps - and take the "coolest position" on the chart!
Well we KNOW they are in the 90C range or higher...
So, you NEVER MENTION why 4870x2 and 4890 are "no load temp shown in the chart" - you give them the WINNING SPOTS anyway, you fail to mention the 260's 65C lowest LOAD WIN and instead mention GTX275 at 75C...LOL
The bias is SO THICK it's difficult to imagine how anyone came up with that CRAP, frankly.
So the superhot 4890 and 4870x2 are given #1 and #2 spots respectively, a free ride, the other Nvidia cards KICK BUTT in lower load temps EXCEPT the 295, but it makes sure to mention the 8800GT while leaving the 4890 and 4870x2 LOAD TEMP spots blank ?
roflmao
---
What were you saying about "why" ? If why the 8800GT was included is TRUE, then comment on the gigantic LOAD TEMP bias... tell me WHY.
SiliconDoc - Wednesday, September 23, 2009 - link
AND, you don't take temps from WOW to use for those two, which no doubt even though it is NOT gpu stressing much, will yield the 90C for those two cards 4870x2 and 4890, anyway.
So they FAIL the OCCT, but you have NOTHING on them, which would if listed put EVERY SINGLE ATI CARD @ near 90C LOAD, PERIOD...
---
And we just CANNOT have that stark FACT revealed, can we ? I mean I've seen this for well over a year here now.
LET's FINALLY SAY IT.
---
LOAD TEMPS ON THE ATI CARDS ARE ALL, EVERY SINGLE ONE NEAR 90c, much higher than almost ALL of the Nvidia cards.
pksta - Thursday, September 24, 2009 - link
I just want to know...With this much zeal about videocards and more specifically the bias that you see, doesn't it make you sound biased too? Can you say that you have owned the cards you are bashing and seen the differences firsthand? I can say I did. I had an 8800 GT and it was running in the upper 80s under load. I switched to my 4850 with the worst cooler I think I've ever seen mind you, and it stays in the mid to upper 60s under load. The cooler on the 8800 gt was the dual-slot design that was the original reference design. The 4850 had the most pathetic fan I've ever seen. It was similar to the fan and heatsink Intel used on the first Core2 stuff. It was the really cheap aluminum with a tiny copper circle that made contact with the die itself. Now, don't get me wrong I love ATI...But I also love nVidia...Anything that keeps games getting better and prices getting better. I honestly don't think, though, that the article is too biased. I think maybe a little for ATI but nothing to rage on and on about. Besides...Calm down. You know nVidia will have a response for this.
SiliconDoc - Sunday, September 27, 2009 - link
1. Who cares what you think about how you perceive me ? Unless you have a fact to refute, who cares ? What is biased ? There has been quite a DISSSS on PhysX for quite some time here, but the haters have no equal alternative - NOTHING that even comes close. Just ASK THEM. DEAD SILENCE. So, strangely, the very best there is, is BAD.
Now ask yourself again who is biased, won't you? Ask yourself who is spewing out the endless strings... Do yourself a favor and figure it out. Most of them have NEVER tried PhysX ! They slip up and let it be known, when they are slamming away. Then out comes their PC hate the greedy green rage, and more, because they have to, to fit in the web PC code, instead of thinking for themselves.
2. Yes, I own more cards currently than you will in your entire life. I started retail computer well over a decade ago.
3. And now, the standard red rooster tale. It sounds like you were running in 2d clocks 100% of the time, probably on a brand board like a DELL. Happens a lot with red cards. Users have no idea.
4850 with The worst fan in the World ! ( quick call Keith Olbermann) and it was ice cold, a degree colder than anything else in the review. ROFLMAO
Once again, the red shorts Pinocchio tale. Forgive me while I laugh, again !
ROFLMAO
Did you ever put your finger on the HS under load ? You should have. Did you check your 3D mhz..
http://forums.anandtech.com/messageview.aspx?catid...
Not like 90C is offbase, not like I made up that forum thread.
4. I could care less if nvidia has a response or not. Point is, STOP LYING. Or don't. I certainly have noticed many of the lies I've complained about over a year or so have gone dead silent, they won't pull it anymore, and in at least one case, used in reverse for more red bias, unfortunately, before it became the accepted idea.
So, I do a service, at the very least people are going to think, and be helped, even if they hate me.
SiliconDoc - Wednesday, September 23, 2009 - link
Well of course that's the excuse, but I'll keep my conclusion considering how the last 15 reviews on the top videocards were done, along with the TEXT that is pathetically biased for ati, that I pointed out. (Even though Derek was often the author).
--
You want to tell me how it is that ONLY the GTX295 is near or at 90C, but ALL the ati cards ARE, and we're told "temperatures are all over the place" ?
Can you really explain that, sir ?
529th - Wednesday, September 23, 2009 - link
holy shit, a full review is up already!
bill3 - Wednesday, September 23, 2009 - link
Does the article keep referring to Cypress as "too big"? If Cypress is too big, what the hell is GT200 at 480mm^2 or whatever it was? Are you guys serious with that crap?
I've heard that the "sweet spot" talk from AMD was a bit of a misdirection from the start anyway. IMO if AMD is going to compete for the performance crown or come reasonably close (and frankly, performance is all video card buyers really care about, as we see with all the forum posts only mentioning that GT300 will supposedly be faster than 58XX and not anything else about it) then they're going to need slightly bigger dies. So Cypress being bigger is a great thing. If anything it's too small. Imagine the performance a 480mm^2 Cypress would have! Yes, Cypress is far too small, period.
Personally it's wonderful to see AMD engineer two chips this time, a bigger performance one and smaller lower end one. This works out far better all around.
The price is also great. People expecting 299 are on crack.