Monday, September 27, 2010

An unbeatable benchmark for Web browsers

"I'm pleased to announce the timeless version of HomoSapienTest, an eons-old browser benchmark. More than Kraken, Sunspider, V8, and Dromaeo, HomoSapienTest focuses on only the real workloads. I believe that the benchmarks used in HomoSapienTest are the best in terms of reflecting realistic workloads, and give every other artificial benchmark a kick in the groin." - A parody of text from Release The Kraken, Mozilla Blog

There are many performance tests available for Web browsers. Unfortunately, none of them appears to honestly indicate the real-world performance of contemporary Web browsers. Vendors like to report the performance of their respective browsers based on the results of only select benchmarks - ones that make a vendor's browser appear snappier (for example, Apple uses a relatively unknown test called iBench, because it projects Safari 5 as the fastest). Worse, all of these tests merely try to be representative of reality.

But who decides what this reality is? Humans, of course.

No matter what these tests say, a human can decide almost instantly which browser is the faster one, and whether or not the level of performance is acceptable. Let Mozilla blow the Firefox-is-fast trumpet a million times. Let Microsoft blow the IE9-is-the-fastest trumpet a billion times. The fact remains - and a human can feel this in mere seconds - that IE and even Firefox can be unacceptably sluggish on many machines that are just a few years old, whereas Opera and especially Chrome can provide acceptable performance on such systems.

I'm not sure if there have been research studies that compare the numbers produced by these artificial benchmarks to thumbs-up and thumbs-down ratings given by human users to different browsers. If a benchmark's results can match the ratings given by humans, then it can probably be called a good benchmark.
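One rough way to check whether a benchmark "matches" human ratings is to compare the two rankings. The sketch below computes a Spearman rank correlation between benchmark scores and the share of thumbs-up votes each browser received. Everything in it is an assumption made for illustration: the function names, the browser labels, and above all the numbers, which are invented and are not measurements of any real browser.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation; assumes neither list is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


if __name__ == "__main__":
    # Hypothetical data: benchmark score (higher = faster) and the fraction
    # of human raters who gave a thumbs-up. Invented for illustration only.
    browsers = ["browser_a", "browser_b", "browser_c", "browser_d"]
    benchmark_score = [410.0, 380.0, 520.0, 300.0]
    thumbs_up_share = [0.55, 0.70, 0.80, 0.35]

    rho = spearman(benchmark_score, thumbs_up_share)
    print(f"Spearman correlation across {len(browsers)} browsers: {rho:.2f}")
```

A value near 1 would suggest the benchmark orders browsers the way people do; a value near 0 or below would suggest it measures something humans don't feel. With only a handful of browsers the number is very noisy, so this is a sanity check rather than a verdict.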

Update (20-Mar-11): Perhaps it's time to design a benchmark for Web browsers that relies on objective ratings given by human raters [with the raters, of course, unaware of which browser they are rating].