Wednesday, July 28, 2010

Are graphs really that hard?

I don't understand how both Nike and Garmin Connect can get their running pace graphs so wrong. Both the Nike+iPod and GPS data are noisy. Nike just picks random points along your run, while Garmin seems to have chosen to just show all the noise. The first option is incorrect, and the second isn't all that useful.

Since I've switched to using the Garmin 405 and the Garmin Connect site, I've found a way to see how each of the three sites I've mentioned plots the same run data. This is made possible by this fantastic site that can convert and upload a Garmin run to the Nike Running site. Then, slowgeek.com pulls the data from Nike and generates its own graphs. So let's take a look at today's sloppy attempt at a heart rate fartlek using the graphs from all three.

Nike+iPod

First, let's look at the Nike graphs. The thing to note about these two graphs is that they are the same run. The exact same data. The only thing that differs is that one view is in kilometers and one is in miles.




How is it that these graphs are the same run? These images underscore just how broken the Nike graphs are. It seems they pick points at a regular interval out of their noisy data and plot them as if they were real. When you change the settings from kilometers to miles, the interval changes, so it picks new points in different sections and plots those instead. There is no averaging or smoothing of the noise in the data, which is crazy considering what the raw data looks like.
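To make that guess concrete, here is a minimal sketch of what plain point-picking does to noisy data. It is not Nike's actual code, which I have never seen; the function names `sample_every` and `bin_average` and the pace numbers are made up for illustration. The pace samples wobble around 6 min/km, and depending on which points happen to get picked the "same run" looks like a 5 min/km run or a 7 min/km run, while averaging each interval recovers 6.

```python
def sample_every(values, step, offset=0):
    """Keep every `step`-th raw sample starting at `offset` (no averaging)."""
    return values[offset::step]


def bin_average(values, step):
    """Average each consecutive block of `step` samples instead of picking one."""
    blocks = [values[i:i + step] for i in range(0, len(values), step)]
    return [sum(block) / len(block) for block in blocks]


# Hypothetical noisy pace readings in min/km: the runner holds 6.0,
# but individual samples bounce between 5.0 and 7.0.
pace = [6.0, 5.0, 7.0, 6.0] * 10

fast_looking = sample_every(pace, 4, offset=1)  # only the 5.0 readings
slow_looking = sample_every(pace, 4, offset=2)  # only the 7.0 readings
averaged = bin_average(pace, 4)                 # 6.0 for every block

print(fast_looking[:3], slow_looking[:3], averaged[:3])
```

Real sensor noise is not this regular, but the effect is the same: a different interval or starting point selects different noise, so the plotted line changes even though the run did not.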

Garmin Connect


The Garmin site takes a different approach. As near as I can tell they don't interpret the data, but try to plot all of the raw data. Here is the same run as Garmin presents it.


This is starting to look a little more like my run, and I'm sure this is an accurate representation of the raw data coming off the watch. You can see what was going on, somewhat. The first and last kilometer or so of this run include some messy data. I start from the middle of downtown Vancouver, right in the middle of tall buildings, so the GPS signal is all over the place for a bit. But then you see some somewhat regular alternations of pace. My goal for this run was to alternate running hard until my heart rate hit 160bpm, slack off until it dropped to 140bpm, and then crank it back up. You can see that. Somewhat.

Slowgeek

The Slowgeek site presents a very different-looking graph.


Now this is how my run felt. After the initial static of running in tall buildings (I assure you I did not run at 3min/kilometer at any point) it looks exactly like it felt. I alternate between running hard and backing off, until I get dog tired and everything falls apart at the end. Now that was my run. How much was that my run? Compare it to my heart rate graph from the Garmin site.

Look at the resemblance between the Slowgeek pace graph and the Garmin heart rate data. Uncanny. The Slowgeek representation of pace perfectly matches the effort exerted based on heart rate. It isn't that the Garmin graph is wrong, it is just that all that noise doesn't match reality as well as the Slowgeek interpretation.

What gives?

How is it that Slowgeek, a hobby site created by one guy, gets it right while both Garmin and Nike get it wrong? This is their business! Worse, Rasmus (who created both slowgeek and the PHP programming language) has contacted Nike a number of times and told them how to fix this. It isn't magic. From the slowgeek forums:
"The math involved tries to do its best using something called a LOWESS curve. It uses locally weighted polynomial regression where each point is derived by weighted least squares regression over the local span for that point. Basically it means that it tries to pick out the trend in the data. Noisy peaks or valleys will be smoothed out in the process."
I sure wish the professionals cared as much as some random geek. I mostly love the Garmin equipment and site. But I plan to continue to use slowgeek for its superior pace graphs, and better graphs for historical data.
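
For the curious, the LOWESS idea quoted above can be sketched in a few lines of Python. This is a simplified illustration, not slowgeek's actual code: it fits a weighted straight line around each point using tricube weights, and it leaves out the robustness passes that full LOWESS implementations usually add. The function name `lowess`, the `frac` parameter, and the sample data are my own assumptions.

```python
import math


def lowess(xs, ys, frac=0.3):
    """Smooth ys over xs with locally weighted linear regression.

    Simplified sketch: one pass, tricube weights, no robustness iterations.
    `frac` is the share of all points used in each local fit.
    """
    n = len(xs)
    k = max(2, math.ceil(frac * n))  # number of points in each local span
    smoothed = []
    for x0 in xs:
        dists = [abs(x - x0) for x in xs]
        # Distance to the k-th nearest point sets the width of the local span.
        h = sorted(dists)[k - 1] or 1e-12
        # Tricube weights: near points count most, points at or past h count zero.
        w = [(1 - (d / h) ** 3) ** 3 if d < h else 0.0 for d in dists]
        sw = sum(w)
        swx = sum(wi * x for wi, x in zip(w, xs))
        swy = sum(wi * y for wi, y in zip(w, ys))
        swxx = sum(wi * x * x for wi, x in zip(w, xs))
        swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
        denom = sw * swxx - swx * swx
        if abs(denom) < 1e-12:
            # Too few distinct points for a line; fall back to a weighted mean.
            smoothed.append(swy / sw)
        else:
            slope = (sw * swxy - swx * swy) / denom
            intercept = (swy - slope * swx) / sw
            smoothed.append(intercept + slope * x0)
    return smoothed


# Hypothetical run: a steady 6.0 min/km with one bogus 3.0 min/km GPS reading.
distance = list(range(30))
pace = [6.0] * 30
pace[15] = 3.0
smooth_pace = lowess(distance, pace, frac=0.3)
print(round(smooth_pace[15], 2))
```

The bogus 3 min/km sample gets pulled most of the way back toward the surrounding pace instead of being plotted as if I had actually run it, which is the whole point.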

3 comments:

  1. Nike+ allows you to alter the granularity of your graphs. It looks like you are looking at one of the least granular graphs that they provide. I get something similar to the Slowgeek site graph when I use the granularity 2 ticks away from their most granular (Settings->Site Preferences)

  2. What Nike is doing with their slider is picking more or fewer points and plotting those. That will never give them the smooth line in the middle. It will give them a smooth graph that is all over the place when you choose fewer points, or a noisy graph that is up and down all over the place if you pick more. The truth is neither of those options. For instance, the dip at the end of the second Nike graph is pure fiction. There was some noise in the data that it chose to plot. With the slider cranked up to 11, it would plot that point along with all the other points. With it cranked down it may choose to plot that point (as it did in the second graph) or it may not (as it did in the first). The truth is, I never ran that slow. It is pure fiction based on an outlier in noisy data. The Slowgeek graph matches actual reality, while the Nike graph is wrong, no matter where you set the slider.
    http://en.wikipedia.org/wiki/Linear_regression

  3. This is why I use runningahead.com. It lets me do smoothing to an arbitrary set of GPS points, while also having the option to let me see the raw data plotted.

    Here's an example pace run:
    http://grab.by/5Gnt
