At this point in the thread, I want to give the same caution I gave in the SGS2 and Sensation forums where claims of optimized or deficient benchmarks arise:
The benchmarks attempt to measure how hardware responds to a specific set of app calls into an OpenGL software library, usually made in some deliberately stressful way (if the benchmark is worth anything).
It's tempting to explain away unfavorable results, but in truth, if some app you need or want is coded anything like the benchmark in question, then that app is likely to run less well on your phone.
In the end, looking at all benchmarks is a good idea - but the best use of the graphics benchmarks is for app developers, to choose which OpenGL calls to make to serve their audience - because there's more than one way to do just about anything in graphics programming.
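To make the "more than one way" point concrete, here's a toy Python sketch of the classic example: submitting the same geometry as one draw call per triangle versus one batched call. This is not real OpenGL - the overhead numbers are invented purely for illustration - but the trade-off it models is the kind of choice the benchmarks exercise.

```python
# Toy model of draw-call overhead. The cost numbers below are
# hypothetical, invented for illustration -- not measured from
# any real GPU or driver.
PER_CALL_OVERHEAD_US = 10.0   # assumed driver cost per draw call (microseconds)
PER_TRIANGLE_COST_US = 0.01   # assumed GPU cost per triangle (microseconds)

def draw_one_call_per_triangle(n_triangles):
    """Naive path: issue a separate draw call for every triangle."""
    return n_triangles * (PER_CALL_OVERHEAD_US + PER_TRIANGLE_COST_US)

def draw_batched(n_triangles):
    """Batched path: all triangles in one buffer, a single draw call."""
    return PER_CALL_OVERHEAD_US + n_triangles * PER_TRIANGLE_COST_US

n = 100_000
print(f"naive:   {draw_one_call_per_triangle(n) / 1000:.1f} ms")
print(f"batched: {draw_batched(n) / 1000:.1f} ms")
```

Two apps that draw the exact same scene can land on opposite sides of that gap, which is why a benchmark only tells you about apps coded the way the benchmark is coded.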
The way not to use the benchmarks is as the results of a horse race.
There is no mystery whatsoever as to what the hardware can do. Sign a non-disclosure agreement with an SoC maker as a recognized member of the hardware industry with a need to know, and you can get the raw chip benchmarks straight from the horse's mouth. I absolutely promise that Qualcomm and Samsung and TI know precisely the performance of their graphics cores measured on bare metal.
At one time not long ago, they usefully published that in the open on the web. My favorite was the blog-published benchmark claiming that Hummingbird could push more millions of triangles per second than Samsung itself measured and spec'd - by a wide margin. IOW - what the blogs reported was flatly impossible for one particular measurement by one particular benchmark.
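You can apply that same skepticism yourself: check a reported figure against the vendor's own spec. A minimal sketch - the spec and reported numbers below are hypothetical placeholders, not Samsung's or anyone's actual figures:

```python
def sanity_check(reported_mtri_per_s, vendor_spec_mtri_per_s):
    """Flag benchmark results that exceed what the vendor says the
    bare silicon can do -- a real app can't beat bare metal."""
    if reported_mtri_per_s > vendor_spec_mtri_per_s:
        return "impossible: exceeds vendor spec, distrust this benchmark"
    return "plausible"

# Hypothetical numbers for illustration only.
print(sanity_check(reported_mtri_per_s=120, vendor_spec_mtri_per_s=90))
print(sanity_check(reported_mtri_per_s=60, vendor_spec_mtri_per_s=90))
```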
So - yep - it's a fine line. Look for benchmarks that exaggerate and throw them out - but consider unfavorable benchmarks carefully, because you might get an unfavorable app some day.
This whole rant goes back to my common claim - benchmarks have to correlate to the real world - and that ain't easy when you think about it.
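"Correlate" isn't hand-waving, either - it's something you can compute if you have benchmark scores and real app frame rates for the same set of devices. A sketch with entirely made-up numbers (the five phones and their figures below are invented for illustration):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical data: benchmark scores and measured in-game fps
# for five imaginary phones.
bench_scores = [1200, 1500, 1800, 2400, 3000]
game_fps     = [  28,   31,   35,   30,   33]

r = pearson(bench_scores, game_fps)
print(f"correlation: {r:.2f}")  # near 1.0 means the benchmark predicts the app
```

With numbers like these the correlation comes out middling - exactly the situation where a benchmark win tells you very little about how your actual game will run.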
Anyways - I promise if I had the answers, I'd tell you.