James Antill (illiterat) wrote,

Lies, damn lies, and benchmarks

Or why I hate "quick benchmarks"...

Recently I've started to see a lot more of what I'd call "quick benchmarks". Often they involve yum, but I think that's mainly because those tend to get sent to me by multiple methods. So I thought I'd try to write down why I react so negatively/dismissively to them, how people can spot the underlying problems that annoy me and, even better, some advice on how you can go about doing some real benchmarks if that kind of thing interests you (but it's much more work than quick benchmarks).

The summary of the problem is that quick software benchmarking often involves taking a huge number of differences between two applications and reducing them to a single number. Then you compare just the numbers and come to a conclusion. So X gets 3 and Y gets 5 for problem ABCD ... therefore Y is roughly 67% worse than X at ABCD. Except that might be a highly misleading (or worse) conclusion, for a number of reasons.
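
As a small worked example of how much the framing of a single-number comparison matters, the sketch below computes the relative difference between the scores 3 and 5 both ways round. The function name `percent_change` and the reading of a higher score as "worse" are assumptions made for illustration; they do not come from any real benchmarking tool.

```python
def percent_change(value, baseline):
    """Return how far `value` is from `baseline`, as a percentage of `baseline`."""
    return (value - baseline) / baseline * 100.0


# Hypothetical scores from the example: X gets 3, Y gets 5 (higher is assumed worse).
x_score = 3.0
y_score = 5.0

y_vs_x = percent_change(y_score, x_score)  # Y measured against X as the baseline
x_vs_y = percent_change(x_score, y_score)  # X measured against Y as the baseline

print(f"Y relative to X: {y_vs_x:+.1f}%")
print(f"X relative to Y: {x_vs_y:+.1f}%")
```

The same pair of numbers can be reported as "Y is about 67% worse than X" or as "X is 40% better than Y", depending only on which one is picked as the baseline, and that is before asking what the two numbers actually measured.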

Tags: benchmarking, yum

Anonymous

August 1 2009, 23:04:11 UTC

Well, one point you missed here is this:

app X does foo slower than app Y does foo.

For the user this means that X is worse than Y; the implementation details that cause this are irrelevant here (Y uses a cache while X does not, or X performs additional steps).

Such benchmarks are not really helpful in finding and fixing issues, but the fact that X is slower when doing foo _is_ a problem (whether it comes from inefficient code or a bad/worse design decision does not matter to the user).

Take this as an example: app A starts 20% faster than app B, so people will complain that B is worse here. B might just be doing more tasks on startup, but the user does not care about this; the end result is that B _is_ slower at startup.

(Well, OK, startup is not really a "task", but you should get the point.)
