James Antill (illiterat) wrote,

Lies, damn lies, and benchmarks

Or why I hate "quick benchmarks"...

Recently I've started to see a lot more of what I'd call "quick benchmarks". Often they're to do with yum, but mainly I think that's because those tend to get sent to me by multiple methods. So I thought I'd try to write down why I react so negatively/dismissively to them, how people can spot the underlying problems that annoy me and, even better, some advice on how you can go about doing some real benchmarks if that kind of thing interests you (but it's much more work than quick benchmarks).

The summary of the problem is that quick software benchmarking often involves taking a huge number of differences between two applications and reducing them to a single number. Then you compare just the numbers, and come to a conclusion. So X gets 3 and Y gets 5 for problem ABCD ... therefore Y is 66% worse than X at ABCD. Except that might be a highly misleading (or worse) conclusion, for a number of reasons:

Tags: benchmarking, yum


August 2 2009, 04:50:31 UTC


I agree with you

app X does foo slower than app Y does foo. For the user this means that X is worse than Y; the details of the implementation that cause this are irrelevant here (Y uses a cache while X does not, or X performs these additional steps).

If I said that, I didn't mean to. By all means complain if "doing operation FOO" is significantly different. When I spoke about "yum update", the point was that taking a small subset of the overall operation, measuring that, and then complaining about the full operation based on that small difference is misleading at best.

For example say a full operation of "yum update" takes 26 seconds, and using the same data (and getting the same result) "foo update" takes 22 seconds ... now let's say that the depsolving part of each takes 6 seconds and 2 seconds respectively. Now it's fair to say that the difference in time is likely due to the depsolver, and that yum is slower, but those 4 seconds are worth significantly less when you include the 20 other seconds needed for the operation.
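To put rough numbers on that, here's a minimal sketch in Python. The timings are just the made-up ones from the example above, not real measurements, and the helper name `slowdown_percent` is purely illustrative. It shows how the same 4 second gap looks very different depending on whether you divide by the depsolving time or by the full operation.

```python
def slowdown_percent(slow: float, fast: float) -> float:
    """Return how much longer `slow` takes than `fast`, as a percentage of `fast`."""
    return (slow - fast) / fast * 100.0


# Hypothetical timings in seconds, taken from the example above.
yum_total, foo_total = 26.0, 22.0
yum_depsolve, foo_depsolve = 6.0, 2.0

# Everything that isn't depsolving: the same 20 seconds for both tools.
yum_other = yum_total - yum_depsolve
foo_other = foo_total - foo_depsolve

depsolve_only = slowdown_percent(yum_depsolve, foo_depsolve)
full_operation = slowdown_percent(yum_total, foo_total)

print(f"depsolve only:  yum is {depsolve_only:.0f}% slower")   # 200% slower
print(f"full operation: yum is {full_operation:.0f}% slower")  # 18% slower
```

So a quick benchmark of just the depsolver would report yum as taking three times as long, while a user running the whole update would see roughly an 18% difference.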

Related to that, if you use something with manual synchronization like apt but don't include the "apt-get update" part, then that is "unfair" and shouldn't be passed off as merely implementation, because real users need to synchronize before doing an operation.

