Just as I don’t see perfect scaling on a GPU (Radeon cards have many more stream processors than GeForce cards, yet perform similarly), I don’t see anywhere near 4x scaling on a quad core in real-world applications. How is a 512-core CPU different from a 512-core GPU? Sure, the architecture and memory model are different, but they share the same problems, don’t they? Synchronization, etc.
My point with GPUs being faster is that most things that take time are threadable, and if they aren’t, you’re possibly doing it wrong. Therefore the “limited” set of problems you can apply them to is exactly the set of things that are actually time consuming. GPUs are excellent at problems that can be solved in parallel, like graphics. Each vertex and fragment can be calculated pretty much independently of all other vertices or fragments, so we get pretty good scaling with core count. Plenty of other programs have work that could be done in parallel, but people don’t bother to implement multithreading. It simply isn’t worth it for the light computation most programs actually do. Most games today could theoretically use all your CPU cores if they were written for it. I’m not saying 4 cores gives you 4 times the performance, but you’d still be able to squeeze more speed out of them.
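To illustrate the “each fragment is independent” point, here’s a minimal sketch in Python. The `shade()` function is a made-up stand-in for real per-fragment work; the point is that because it is a pure function of its own inputs, the pixels can be farmed out to a worker pool with zero synchronization and the result is identical to the serial version. (Real speedup would need work that releases the GIL, or processes instead of threads; this only demonstrates the independence.)

```python
from concurrent.futures import ThreadPoolExecutor

def shade(pixel):
    # Hypothetical per-fragment work: a pure function of its own inputs only,
    # so no fragment ever needs to see another fragment's state.
    x, y = pixel
    return (x * 31 + y * 17) % 256

pixels = [(x, y) for y in range(64) for x in range(64)]

# Serial version.
serial = [shade(p) for p in pixels]

# Parallel version: no locks, no shared state, no synchronization needed.
with ThreadPoolExecutor(max_workers=4) as ex:
    parallel = list(ex.map(shade, pixels))

assert serial == parallel  # same answer regardless of how the work was split
```

That independence is exactly why GPUs scale so well on graphics, and why CPU-side code only scales when someone actually structures it this way.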
What we need to realize is that multi-core processors and multi-threading are the future. Nothing can change the fact that replacing a single core with two slightly slower cores is much more energy (and therefore heat) efficient. Doubling the clock rate to match that (theoretical) performance increase is impossible. Just think about the fact that we have 6-core CPUs running at over 3 GHz. Can you run a single-core processor at 18 GHz? Heck, we have graphics cards with 3072 stream processors running at 830 MHz. Let’s make a 2.5-terahertz stream processor!
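The energy argument can be made concrete with a back-of-the-envelope calculation. Assume the standard first-order model for dynamic power, P = C·V²·f, and assume supply voltage has to scale roughly linearly with frequency (both are simplifying assumptions, but they capture the trend):

```python
def dynamic_power(cores, freq, volt, cap=1.0):
    # First-order CMOS dynamic power model: P = C * V^2 * f, summed per core.
    return cores * cap * volt**2 * freq

# Two cores at base clock vs. one core at double the clock:
# same peak throughput in theory, very different power draw.
two_slow = dynamic_power(cores=2, freq=1.0, volt=1.0)  # 2 * 1^2 * 1 = 2.0
one_fast = dynamic_power(cores=1, freq=2.0, volt=2.0)  # 1 * 2^2 * 2 = 8.0

ratio = one_fast / two_slow  # the fast single core burns 4x the power
```

Under this model, doubling the clock of one core costs roughly 4x the power of adding a second core at the original clock, which is the whole reason vendors went wide instead of fast.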
PS: These numbers are just for fun; don’t bash the actual performance you’d get from 3072×830 MHz vs 1×2,549,760 MHz… xDDD