The Manlius Formation

Should I get in the pool? On the bike? Hit the running trail?

During the not-short drive home from the North Bay tri, Dan remarked that he wanted to work on his swimming so he could improve his swim time. I’ve known from doing tris for a few years that it’s better to spend training time improving the run or the bike, but I wanted to do some math to show why. As I often do, I started with the complicated approach and then simplified.

I wanted to see if an athlete’s swim, bike, or run place could be used as a predictor for his/her overall finish place. In other words: If an athlete comes out of the water in Xth place and finishes in Yth place, how close are X and Y?

Here’s what the data looks like for the North Bay results:

[Figure: North Bay scatter plot of swim, bike, and run rank vs. finish place]

This is a plot of each athlete’s actual swim, bike, and run ranking (the vertical axis) vs. his/her actual finish place. As an example, the lowest reddish box off by itself near the lower left is an athlete who had the third-fastest swim time, but who finished 33rd overall. As such, her swim finish was not a good indicator of her overall finish place. [1]

Note that in the middle of the pack there appears to be more scatter for the swim results than for the run and bike. The run and bike seem to follow a similar trend to each other.

I only had 115 points from this small race so I wanted to look at a bigger race. I had the results from the Tupper Lake Tinman 2007 handy, so I looked at that data:

[Figure: Tupper Lake scatter plot of swim, bike, and run rank vs. finish place]

The data for Tupper Lake shows a similar trend to North Bay – widely scattered swim data but tighter run and bike data.

Next I wanted to see mathematically how the three predictors (swim rank, bike rank, run rank) compare against each other. I suspected the swim predictor would have more error than the bike and run, but didn’t know how the bike and run would compare with each other.

For each predictor, there is an error in the prediction vs. the actual result. In the example above, trying to predict finish place (33rd) using the swim rank (3rd) has an error of 30. Taking the standard deviation of the entire set of swim predictor errors gives a measure of how good this method is. Do the same for the bike and run and you can compare how “scattered” the data is for each method. Here’s how it looks:

| Leg  | North Bay (113 samples) | Tupper Lake (701 samples) |
|------|-------------------------|---------------------------|
| Swim | 20.7                    | 155.9                     |
| Bike | 11.6                    | 76.4                      |
| Run  | 14.1                    | 78.0                      |
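For anyone who wants to repeat this on another race, here is a minimal sketch of the error measure described above. It assumes the results are available as plain lists of leg times and total times (one entry per athlete), that the error is taken as finish place minus leg rank, and that a population standard deviation is used; the post does not say which form of standard deviation was actually computed. The function names and the five-athlete data set are made up for illustration.

```python
from statistics import pstdev


def ranks(times):
    """Return 1-based ranks (1 = fastest) for a list of times.

    Ties get consecutive ranks in input order (an assumption; real
    race results may break ties differently).
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    result = [0] * len(times)
    for place, i in enumerate(order, start=1):
        result[i] = place
    return result


def prediction_errors(leg_times, total_times):
    """Error of using leg rank as a predictor: finish place minus leg rank."""
    leg_rank = ranks(leg_times)
    finish_rank = ranks(total_times)
    return [f - l for l, f in zip(leg_rank, finish_rank)]


def predictor_spread(leg_times, total_times):
    """Standard deviation of the prediction errors (population form, assumed)."""
    return pstdev(prediction_errors(leg_times, total_times))


# Hypothetical five-athlete race, times in minutes (invented data).
swim_times = [10, 12, 11, 15, 14]
total_times = [100, 90, 95, 120, 110]

print(prediction_errors(swim_times, total_times))
print(predictor_spread(swim_times, total_times))
```

A smaller spread means the leg rank tracks the finish place more closely. Running the same function on the bike and run times gives the other two rows of the table.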

The results for the two races look pretty similar and are not that surprising. The standard deviation of the swim predictor is almost twice as large as that of the bike or the run. The bike predictor is slightly better than the run predictor. This makes sense since athletes spend the largest share of their time on the bike.

Here are plots showing the fraction of time each athlete spent on each leg of the tri:

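[Figure: North Bay, fraction of time spent on each leg vs. finish place]

[Figure: Tupper Lake, fraction of time spent on each leg vs. finish place]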

Both plots show that athletes spend 10-20% of their time swimming, 45-55% of their time biking, and 30-40% of their time running. Almost half of their time biking! What amazes me is the consistency of the time splits even in the face of a pretty large spread in finish times. The plots below show the large spread in finish times:

[Figure: North Bay, spread in finish times]

[Figure: Tupper Lake, spread in finish times]

Looking at this from the perspective of distance yields the same conclusion – you’ll be on the bike a lot. Here are the fractions of the total distance for each event:

| Leg  | North Bay | Tupper Lake |
|------|-----------|-------------|
| Swim | 2.3%      | 1.7%        |
| Bike | 78.3%     | 79.7%       |
| Run  | 19.3%     | 18.6%       |
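The distance fractions are just each leg's distance divided by the total. As a quick check, the sketch below assumes the standard half-iron (70.3) distances of 1.2, 56, and 13.1 miles; the post does not state the course distances, so treat those inputs as an assumption. The function name is hypothetical.

```python
def distance_fractions(swim, bike, run):
    """Return each leg's share of the total distance, in percent."""
    total = swim + bike + run
    legs = (("swim", swim), ("bike", bike), ("run", run))
    return {leg: 100 * dist / total for leg, dist in legs}


# Assumed standard 70.3 distances in miles (not confirmed by the post).
fractions = distance_fractions(1.2, 56.0, 13.1)
for leg, pct in fractions.items():
    print(f"{leg}: {pct:.1f}%")
# prints:
# swim: 1.7%
# bike: 79.7%
# run: 18.6%
```

Under that assumption the output lines up with the Tupper Lake column above.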

So here are the conclusions I draw from this:

If you’re a normal, middle-of-the-pack age-grouper (like me) who lives in the “middle” of the scatter plots above, get your swim to the point where you leave the water in decent shape, and put in enough effort to maintain that. Any extra training effort should be put into the bike as a first priority, then the run. Dan, get out on that bike!

If you’re an elite athlete near the lower left of the scatter plots above, balance appears to be important. Then again, if you’re an elite athlete you shouldn’t be taking training advice from me.

  1. Incidentally, she had the 39th fastest bike time and the 57th fastest run. The bike was a much better predictor of her final position, with the run a little poorer as a predictor.

4 Comments

  1. Pogo says:

    Two points: 1) Not all triathlons will have the same division of distance. Maybe Dan’s next triathlon the swim will be three times as long as the previous one and the biking will be shorter. 2) Since the predictor is so close for bike v run, and the time spent on each is appreciably long, might not Dan consider which event he might get the greatest pure improvement? It might be that with the bike being a machine, etc., his form is ‘good’ and his endurance is what it is. In that circumstance, he might only be able to improve the biking by improving his endurance. However, his running form might be terrible, so he could get a greater improvement working on that event because he could fix both his endurance and his form, etc.

  2. Chris says:

    When you look at the standard deviations of position with the three skills, it might be useful to normalize them by the total number of competitors in the event. That way the standard deviations for Tupper Lake and North Bay might be more comparable.
    The previous commenter has a decent point, that if your ‘form’ is off it might be worth improving, but I think the point of the analysis is that if it’s not obvious what you should train for, then the bike would be a good place to start. Even an incremental improvement in bike time has a big impact on the finish (no matter how clumsy your running is).

  3. Bill Ruhsam says:

    I find the most interesting plot in this analysis to be the Finish Place vs. Fraction of Time for Tupper Lake. At the tail end of the age groupers you can see the bike fractions trending down while the run times are trending up. At slow speeds, bikes are better for you! At least, consistently faster for people who haven’t trained adequately or aren’t “really ready” for a 70.3 triathlon. (I’m assuming this data came from the 70.3)

    Great analysis!

  4. Pogo says:

    In regards to Bill Ruhsam’s comment: Great catch! I didn’t notice the trending on the Finish Place vs. Fraction of Time for Tupper Lake chart. I would think that if it means you are a poor placer, you’d be better served to work on running. I thought of it this way: If I was a bad placer I would spend a higher percentage of my time on running than biking (swimming held about constant) than someone who was a good placer. What do you work on to decrease your percentage of run-time and increase your percentage of bike-time? You work to improve running (or de-prove biking.) This line of thinking would also mean that a good placer should work on biking first.
