Showing posts with label data. Show all posts

Wednesday, July 28, 2010

Are graphs really that hard?

I don't understand how both Nike and Garmin Connect can get their running pace graphs so wrong. Both the Nike+iPod and GPS data are noisy. Nike just picks random points along your run, while Garmin seems to have chosen to just show all the noise. The first option is incorrect, and the second isn't all that useful.

Since I've switched to using the Garmin 405 and the Garmin Connect site, I've found a way to see how each of the three sites I've mentioned plots the same run data. This is made possible by this fantastic site that can convert and upload a Garmin run to the Nike Running site. Then, slowgeek.com pulls the data from Nike and generates its own graphs. So let's take a look at today's sloppy attempt at a heart rate fartlek using the graphs for all three.

Nike+iPod

First, let's look at the Nike graphs. The thing to note for these two graphs is that they are the same run. The exact same data. The only thing that differs is that one view is in kilometers and one is in miles.




How is it that these graphs are the same run? These images underscore just how broken the Nike graphs are. It seems they pick points at a regular interval out of their noisy data and plot them as if they were real. When you change the settings from kilometers to miles, it picks new points, in different sections, and plots those. There is no rounding or cleaning up of the noise in the data, which is crazy considering what the raw data looks like.

Garmin Connect


The Garmin site takes a different approach. As near as I can tell they don't interpret the data, but try to plot all of the raw data. Here is the same run as Garmin presents it.


This is starting to look a little more like my run, and I'm sure this is an accurate representation of the raw data coming off the watch. You can see what was going on, somewhat. The first and last kilometer or so of this run include some messy data. I start from the middle of downtown Vancouver, right in the middle of tall buildings, so the GPS signal is all over the place for a bit. But then you see some somewhat regular alternations of pace. My goal for this run was to alternate running hard until my heart rate hit 160bpm, slack off until it dropped to 140bpm, and then crank it back up. You can see that. Somewhat.

Slowgeek

The Slowgeek site presents a very different-looking graph.


Now this is how my run felt. After the initial static of running in tall buildings (I assure you I did not run at 3min/kilometer at any point) it looks exactly like it felt. I alternate between running hard and backing off, until I get dog tired and everything falls apart at the end. Now that was my run. How much was that my run? Compare it to my heart rate graph from the Garmin site.

Look at the resemblance between the Slowgeek pace graph and the Garmin heart rate data. Uncanny. The Slowgeek representation of pace perfectly matches the effort exerted based on heart rate. It isn't that the Garmin graph is wrong; it is just that all that noise doesn't match reality as well as the Slowgeek interpretation.

What gives?

How is it that Slowgeek, a hobby site created by one guy, gets it right while both Garmin and Nike get it wrong? This is their business! Worse, Rasmus (who created both slowgeek and the PHP programming language) has contacted Nike a number of times and told them how to fix this. It isn't magic. From the slowgeek forums:
The math involved tries to do its best using something called a LOWESS curve. It uses locally weighted polynomial regression where each point is derived by weighted least squares regression over the local span for that point. Basically it means that it tries to pick out the trend in the data. Noisy peaks or valleys will be smoothed out in the process.
I sure wish the professionals cared as much as some random geek. I mostly love the Garmin equipment and site. But I plan to continue to use slowgeek for its superior pace graphs, and better graphs for historical data.
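
For the curious, here is a rough sketch of the kind of smoothing that quote describes. This is not Slowgeek's actual code, which I haven't seen. It is a minimal LOWESS-style smoother written from the description: for each point, fit a weighted straight line to its nearest neighbours and keep the fitted value. The function names, the tricube weighting, and the `frac` default are my own assumptions.

```python
def tricube(u):
    """Tricube kernel: weight falls from 1 at u = 0 to 0 at |u| >= 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def lowess(xs, ys, frac=0.3):
    """Smooth ys over xs with locally weighted linear regression.

    frac is the share of all points used in each local fit
    (an assumed default, not a value taken from Slowgeek).
    """
    n = len(xs)
    k = max(2, int(round(frac * n)))  # neighbours per local fit
    smoothed = []
    for x0 in xs:
        # The distance to the k-th nearest point sets the local span.
        dists = sorted(abs(x - x0) for x in xs)
        span = dists[k - 1] or 1e-12
        w = [tricube((x - x0) / span) for x in xs]
        # Weighted least squares for the line y = a + b * (x - x0).
        sw = sum(w)
        sx = sum(wi * (x - x0) for wi, x in zip(w, xs))
        sy = sum(wi * y for wi, y in zip(w, ys))
        sxx = sum(wi * (x - x0) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - x0) * y for wi, x, y in zip(w, xs, ys))
        denom = sw * sxx - sx * sx
        if abs(denom) < 1e-12:
            smoothed.append(sy / sw)  # too few points: weighted mean
        else:
            b = (sw * sxy - sx * sy) / denom
            a = (sy - b * sx) / sw
            smoothed.append(a)  # fitted value at x0
    return smoothed


if __name__ == "__main__":
    # Made-up pace samples in seconds per km, one per 100 m of a run.
    distance_m = [i * 100 for i in range(12)]
    pace_s = [300, 180, 320, 295, 310, 270, 265, 330, 275, 268, 325, 335]
    for d, p in zip(distance_m, lowess(distance_m, pace_s, frac=0.5)):
        print(d, round(p))
```

A larger `frac` gives a smoother curve that follows the overall effort; a smaller one keeps more of the wiggle. Picking that value is where the judgment comes in.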

Thursday, July 22, 2010

Garmin 405 accuracy.

GPS Accuracy

I've read much about how the Garmin 405 has a highly sensitive antenna, and is great even in heavy tree canopy or tall buildings. I'm mostly impressed with it in my limited sample size of runs so far, but today's run left a little bit to be desired.

If you really care about route data and you will be running in a city, zoom in on the map above and pay attention to the first and last kilometer of the run. Those do not match reality. For both the out and back my route took me up Pender, to Burrard and down Cordova. At no point did I do crazy parkour over skyscrapers and on the living roof of the new convention center, as the map would suggest.

That said, the total distance is close to being correct, and everything after I hit the seawall is close enough to make me happy. I still think it is more accurate than the Nike+iPod gadget overall (OK, maybe not for *this* run). I believe the problem is tall buildings. Around the lake is under trees, and that seems fine. But, from my work to the seawall is all in skyscrapers. Also, my habitual route keeps me on the south side of the street when heading west, and the east side of the street when heading north. As most of the GPS satellites are in a southerly position in the sky from up here in the Great White North, that puts me in just the wrong position to get a direct signal. My next time out I plan to keep as clear of a line of sight to the southern hemisphere as I can and see if my accuracy improves.

Foot pod?
Another option is getting the Garmin accelerometer foot pod to complement the GPS data. For years while running with the Nike+iPod accelerometer I've had this recurring thought: an accelerometer foot pod, coupled with a GPS (and maybe some fuzzy logic algorithms) could provide almost perfect distance and pace data. I found that the Nike+iPod system is very accurate if two conditions are met:

 1) You have calibrated your foot pod on a track
 2) You run a very consistent pace

The Garmin I find very accurate unless you are in tall buildings, like the run above. But, if you could properly pair the two systems you could make it almost perfect. When you have a clear line to the satellites you could be constantly calibrating the accuracy of the accelerometer. When you lose sight of the satellites you have a recently and perfectly calibrated foot pod to cover the gaps. You would know your pace before the GPS went dark, and just after. You would have historical data for the different paces you run at, and how that matches the data coming off the foot pod. You could reconstruct the missing GPS data almost perfectly (at least in terms of pace and distance). Plotting this on a map, you could draw the route in a different color wherever the GPS signal is not good enough, to indicate that you probably didn't run that exact path, but the pace and distance feedback would still be correct on the watch and in the online data.
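
To make the idea concrete, here is a minimal sketch of that pairing. It is purely my own illustration and has nothing to do with how Garmin's firmware actually works. Everything in it is hypothetical: the `fuse_distance` name, the `(gps_delta_m, pod_delta_m, gps_ok)` sample layout, and the `alpha` smoothing weight. It also assumes something else has already decided whether each GPS reading can be trusted, which is the hard part.

```python
def fuse_distance(samples, initial_factor=1.0, alpha=0.2):
    """Combine GPS and foot pod distance, one sample per time interval.

    samples: iterable of (gps_delta_m, pod_delta_m, gps_ok) tuples, where
    gps_ok is a hypothetical flag saying the GPS fix looked trustworthy.
    Returns (total_distance_m, sources, calibration_factor).
    """
    factor = initial_factor  # multiplier that corrects raw foot pod distance
    total = 0.0
    sources = []  # "gps" or "pod" per sample, e.g. to color the map
    for gps_d, pod_d, gps_ok in samples:
        if gps_ok and pod_d > 0:
            # Good signal: trust the GPS, and nudge the foot pod
            # calibration toward the observed GPS/pod ratio.
            factor = (1 - alpha) * factor + alpha * (gps_d / pod_d)
            total += gps_d
            sources.append("gps")
        else:
            # Bad or missing signal: fall back to the calibrated foot pod.
            total += pod_d * factor
            sources.append("pod")
    return total, sources, factor


if __name__ == "__main__":
    run = [
        (100.0, 110.0, True),   # open sky; the pod reads about 10% long
        (101.0, 111.0, True),
        (480.0, 109.0, False),  # among tall buildings; GPS jumps around
        (99.0, 108.0, True),
    ]
    total, sources, factor = fuse_distance(run)
    print(round(total, 1), sources, round(factor, 3))
```

The `sources` list is what would drive the different route color on the map: anything marked "pod" is a stretch where the distance is believable but the plotted path is not.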

Is that how it works?
I don't know if this is how the Garmin 405 works when you pair it with their foot pod, but I'm going to bet it doesn't. There is a big technical hurdle that I don't think Garmin could overcome, related to processing power on the watch. If you look at the above map, I don't think I ever actually lost the GPS satellites. What happened (I assume) is that I was in the satellite shadow of a big building, and the watch was picking up the reflection of the signals off of a building across the street. How would the watch know if it was getting bad data compared to good data? Well, it would have to look at pace and location data and know how to do the right thing. An aggressive algorithm could mess things up, smoothing out speed work laps as errors and such. I think it would be tricky to get right, and might be more than a watch can handle (Or maybe not. I think we landed people on the Moon with significantly less power than this watch has).

In the absence of that knowledge, I'm hesitant to fork out the cash for a foot pod just yet. Sure, I'll get one at some point, if only so I can gather distance info if I'm running on a treadmill in the winter. But I'm not going to get too excited about it. All the manual says about it is:
Your Forerunner is compatible with the foot pod. You can use the foot pod to send data for your Forerunner when training indoors or when your GPS signal is weak or you lose satellite signals.
As a technical writer, I appreciate the minimalism.  This covers the basics. I'm sure it is true. It leaves enough ambiguity around "weak signal" to allow developers to totally change how it works without reprinting the manual. Perfect. The ambiguity isn't going to keep me from buying the product and there is plenty there to satisfy the incurious. That is what I would have done as a tech writer. As a geeky consumer that blogs about running data, I want to know more. How would the foot pod have changed my run data on the run above?

At some point I will get the foot pod. When I do, I should have several, maybe dozens, of runs on this exact route logged. That should give me plenty of information to see if the foot pod improves accuracy for this type of run.

Why not?


So here is my proposal. Garmin, why don't you send me a foot pod for testing? I'll give it a glowing review and suggest everybody buy one (assuming it improves accuracy). Heck, I love the 405 so much I think everyone should get one anyway. Buy a foot pod too in case you are on a treadmill. If it improves accuracy in the run above, I'll shout it from the rooftops.

So what do you think? Thanks Garmin. I'll be watching my mail. Also, I'm sure the accuracy would be improved significantly if you threw in an extra ANT+ USB stick. It couldn't hurt, right?

;)

Saturday, July 17, 2010

Garmin 405

So I bought a new toy, the Garmin Forerunner 405. My data addiction can continue! It is a GPS watch targeted at runners. The reviews sound mostly positive, and a coworker has been raving about his. The Nike+iPod system has served me well, but I think it is time to try something different. The device itself is one of the first GPS watches that is small enough to wear as a regular watch. It wirelessly connects to a heart rate monitor, and comes with a USB key for syncing data to your computer, and then to Garmin Connect online. You can optionally connect it to other devices, such as a foot pod that will allow you to run on a treadmill and still gather pace and distance data and that will fill in the gaps if you lose satellite connection in trees, tall buildings or a tunnel. Adding the foot pod will also provide information on how many steps you take per minute, which is a useful metric as well if you want to improve your cadence.

You can even buy a crazy expensive scale that claims it can record "weight, body fat percentage, and hydration levels." Once you have this scale, it records all these metrics, uploads them to the watch, and the watch then transmits them to Garmin Connect. So in addition to tracking your mileage, pace, route, elevation and heart rate, you can also watch your weight and muscle to fat ratio over time. It is a data addict's dream come true. Still, I'm not going to run out and buy the scale just yet. But, I think I will start weighing myself often on the scale I have and entering that info so I can track gains and losses over time. That could help keep me focused.

So how is it? Seems great so far. The night I bought it we went to New Brighton Park for dinner with a friend and her kids. We enjoyed the beautiful afternoon, had a cocktail and ate takeout sushi. Afterwards, as the ladies conversed on a blanket and the kids blew off steam at the playground, I ducked out for a very brief jog to see what the watch could do. My first impression is that the immediate pace feedback is far better than what the Nike+iPod system provides. I set up the workout screen to show pace, distance and heart rate. A gentle acceleration or deceleration showed instant results on the pace readout. The main reason I wanted to jog around though was so that I would have some data to sync when I got the software set up. Check it out:


So the Garmin Connect site is pretty neat. You can pull lots of information out of there. I'm really looking forward to getting more familiar with this system. I did another easy run this morning. I ran an easy 4km in to work. I'm dying to get out and do a longer run, but tomorrow morning I do the Summerfast 10k around Stanley Park. So an easy 4k was the most I could justify.

Saturday, June 5, 2010

Pace, Phish, Evolution and Data Addiction

Considering how slow I am, some will find it comical how much I look at the data around my pace and distance. Some of it is from a genuine desire to know when I'm improving, and to be able to predict how well I could do in a race. That certainly is motivating, at least when the numbers show I'm getting faster. But, the painful truth is I might have a bit of a data addiction. As a case in point, I'll allow myself a bit of a digression from what has so far been a barefoot running blog (where are all the Non Sequiturs, anyway?).

Anyone who knows me well knows I like live music. While I like all kinds of music, from electronica to country to hardcore, the band I've seen the most is Phish. How many times have I seen Phish? Funny you should ask... I've spent the last couple of weeks obsessing over just that. I had kept a record of all the shows I had seen. I had an old copy of a book called the Pharmer's Almanac that listed all shows and their related setlists up through the Spring of '98. I'd gone through and marked the shows I had been at so I could thumb through and reminisce. As I kept seeing shows after the date where the book left off, I kept count but did not keep a record. Until recently I was completely convinced I had seen 90 Phish concerts. Since I will be seeing a couple shows this July (for the first time in 6 years) I thought I'd figure out exactly which shows I've seen. After spending a little time at the excellent phish.net site, I came up with this list. I think there is one more show I haven't accounted for, as I don't think I would have counted the 8/14/1998 soundcheck as a show. It bugs me that I can't find the missing show. Was it Chula Vista, in 2003? That may just be it. But really, why would I care? Because it is fun to play with the data. For instance, the song I've seen the most is Maze. I've seen it 30 times. It is a good song, but I could never figure out why they play it so much. Turns out they don't. If you look at the Overplayed/Underplayed statistics on my own personal Phish Stats, you can see it is an anomaly. In the 89 shows I have listed, I should have only seen that song 18 times. Strange. I can also see that there are 52 songs that I saw the very first time Phish played them, including some classics they play all the time.

It is fun to think about, and http://phish.net makes it easy to play with, but it is probably a pointless addiction to plow through all that data.

Which brings me back to running, and trying to interpret the data from my Nikeplus iPod attachment. While I've been encouraged by my recent pace improvements, a run last week made me think that, maybe, the calibration is more off than I would like. According to the data I ran 7.75 km at 4'37"/km pace (4.82 mi @ 7'27"/mi). Looking back, I think this is the fastest run I've ever logged since I started using the iPod to track pace in 2006. This was certainly a fast run for me. No doubt. And I've run a certified 10k at a faster pace than this in the past, so it isn't completely out of the realm of possibility. But outliers like this make me nervous about the calibration of the equipment. Can it really be my fastest run in years? Is barefoot and minimalist running driving that much of an improvement? Really?

If the pace is off the distance will be off, right? So, I went out and mapped my lunch run on the mapmyrun.com site.  It comes out to 7.34k, while my logged run reads 7.75k, roughly 95% accurate. So, plugging in the numbers at an online pace calculator, my pace may well have been 4:53/km rather than 4:37/km. It is still a good pace for me, but not as good as I had thought: 16 seconds per kilometer slower, or more than two and a half minutes over the course of a 10k. It is not the fastest run I've ever logged, as I had thought. The good news is, looking back at the data, it is still impressive. The last time I matched that pace was during a short run on May 17th, 2007.
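
For anyone who wants to check the arithmetic without an online calculator, here is a small sketch. The function names are my own invention. It assumes the iPod's elapsed time is right and only the distance is off, so the corrected pace is simply the same time spread over the mapped distance.

```python
def parse_pace(text):
    """Turn a pace like '4:37' into seconds per kilometer."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def format_pace(seconds):
    """Turn seconds per kilometer back into 'm:ss'."""
    seconds = int(round(seconds))
    return f"{seconds // 60}:{seconds % 60:02d}"


def corrected_pace(logged_km, logged_pace_s, mapped_km):
    """Re-spread the logged elapsed time over the mapped distance."""
    elapsed_s = logged_km * logged_pace_s  # the time is assumed correct
    return elapsed_s / mapped_km


if __name__ == "__main__":
    logged_km, mapped_km = 7.75, 7.34
    pace_s = corrected_pace(logged_km, parse_pace("4:37"), mapped_km)
    print("accuracy:", round(mapped_km / logged_km, 3))
    print("corrected pace:", format_pace(pace_s), "per km")
    print("extra time over 10k (s):", round((pace_s - parse_pace("4:37")) * 10))
```

Starting from the rounded 4:37 pace this lands within a second of the 4:53/km figure above; the small difference comes down to rounding in the logged pace.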

My seat of the pants feelings about my runs are correct. I'm seeing improvements. Now to see if I can feed that data back into the iPod to improve accuracy. When I complete a run, the iPod offers a calibrate option. That way you can set a completed run to a known distance. I did that for the Sun Run 10k. It is not a well documented feature. I don't know if it simply calibrates that one run, or if it feeds that data back for future runs. I guess we'll have to find out. Next time I do the Lost Lagoon run I'll stick strictly to my mapped route and calibrate it to 7.34 km after the fact.

So I'm obsessing about details of both my runs and the concerts I've seen. I'm not really OCD, but I do like to pore over all this data. And just how does this data addiction relate to running in general? Does it? Well, I just finished reading Born to Run by Christopher McDougall (a fantastic book everyone should read) and came across an interesting idea. Stick with me here... (Wait, WTF?!? You are still here?) A major premise of the book, besides being an interesting story about an obscure ultra-marathon that was staged in the Mexican wilderness, is that humans evolved to run. We are better distance runners than any other animal. Our build allows us to conserve energy while running steadily, while our hairless body covered with sweat glands helps us cool and recover on the go. No other animal in the world can beat us at a marathon or longer, not even a horse. We evolved that way for persistence hunting: chasing and tracking animals until they overheat and die. Obviously running is a big part of that, but when researchers attempted it they failed. The animals would disappear, fold themselves back into the herd and the hunters would wind up chasing fresh animals. But a South African man named Louis Liebenberg found the answer. He became interested in the origin of logic and scientific thought in human prehistory so he dropped out of society to go live with the Kalahari Bushmen, who were as prehistoric a culture as still exists. During his time with the Bushmen, Louis learned persistence hunting. Running was only half the equation; it turns out it takes a lot of brains as well as running.
"When tracking an animal, one attempts to think like an animal in order to predict where it is going," Louis says. "Looking at its tracks, one visualizes the motion of the animal and feels that motion in one's own body. You go into a trance like state, the concentration is so intense. It's actually quite dangerous, because you become numb to your own body and can keep pushing yourself until you collapse."
Visualization... empathy... abstract thinking and forward projection: aside from the keeling-over part, isn't that exactly the mental engineering we now use for science, medicine, the creative arts? "When you track, you're creating causal connections in your mind, because you didn't actually see what the animal did," Louis realized. "That's the essence of physics." With speculative hunting, early human hunters had gone beyond connecting the dots; they were now connecting dots that existed only in their minds.
Speculative tracking and persistence hunting probably drove our evolution; they made us who we are by rewarding efficient running bodies and the ability to decipher almost random scratches in the dirt. While running was a huge part of why we survived, the other half of the equation was the ability to collect and collate data.

So not only are my running and my data addiction related, they are at the core of who we are as a species. Better than any other land animal on the planet, we can settle in to a nice comfortable run and cover huge amounts of ground. Similarly, we can take disparate information from multiple sources and see patterns, connections and causalities. We can take two seemingly unrelated points of data, non sequiturs in the conversation between us and our environment, and fill in the blanks and find causal connections. You see what I did there?