Wednesday, September 11, 2013

On Statistics and Half Ironman: Going from a DNF to a 15 minute PR at the half distance

Flash back to last year: I got a stomach bug and DNS’d Cedar Point.

Then, this July, the Route 66 HIM happened. I had a ‘just ok’ swim, but I started to feel bad with about 10 miles left on the bike. The run was awful. I cramped the whole way, and both calves finally locked up around mile 11, sending me face-first into the dirt and qualifying me for a free ambulance ride. Game Over.

I was seriously considering that my efforts at the 70.3 distance were somehow cursed (or at least statistically improbable).

But, just the same, Cedar Point 2013 was on my calendar, and I wanted a bit of redemption.   So, how do you come back from your first DNF?  

To paraphrase Seth, I think first you have to acknowledge that, if you do this triathlon thing long enough, you'll DNF.   It's almost certain.  So, it happens.   How do you get back up?

For me, I'd first have to not cramp up and fall over...but I just didn't have time to really dig into that. I've got a new job and I’m training for Ironman; there just wasn't time. So, the bigger problem was time management.

Time Management

My two ‘biggest’ rocks outside of family life were work and Ironman training.   These two time commitments were about 60-70 hours a week, combined.   Additionally, I was spending another 1-2 hours a week figuring out which workouts I should be doing, and how I should be training.  

So, when I heard that longtime friend, exercise physiologist, PhD student, and super triathlete Laura Wheatley was starting a coaching business, it seemed like a good idea to solicit her help.

I've worked with a lot of coaches in the past. Most of them want to talk about what an ‘art’ coaching is, and I’ll concede that it somewhat is. Few will answer my questions when I ask why. Fewer still have good reasons when they do answer those questions. And while I’m not an exercise physiologist, I’m a scientist just the same, and the scientific process doesn't change. Said another way, I’m an evidence based, research based, pessimistic math guy who won’t do something just because that’s what your n~=50 coaching experience says works. I’m always going to ask hard questions and expect proof, and Laura is one of the few people who have answered those questions in a reasonable way.

So, I have a coach. Poof, 2-3 hours free per week, more confidence that I’m doing the right kind of work, and a lot of experience I can call on as needed. I can focus on the work, and not the planning. Additionally, my run has been bad for a long time, and I needed a new approach to make it less bad. Stick to your core competencies, as the business guys say. But first things first: I now had the opportunity to invest those hours in fixing my cramping issue, hopefully for good…

On Fixing Cramping

No one really knows why cramping happens in a specific instance.  Lots of things can cause it.   It’s a multifactorial problem.   It could be overexertion, glycogen depletion, inadequate hydration, an electrolyte problem, or something yet undiscovered, and there are decent arguments around each.    It’s something I've struggled with in the past, but usually only after a race or towards the end.   A DNF based on cramping was a whole new thing.  

So, I had a complicated multifactorial problem and about 4 weeks to solve it. The way I wanted to solve the problem was to manipulate each individual factor and evaluate. But that wasn't going to work, for three reasons. First, there wasn't time. Second, how would I know that two factors weren't dependent on one another, or both on a third? Third, I lacked a testing methodology, because the issues I experienced in racing weren't showing up in training, for reasons I could only partly identify or guess at.

I was really left in a situation where the only reasonable option was a shotgun approach. Or, to quote Ripley from "Aliens," ‘I say we take off and nuke the entire site from orbit. It's the only way to be sure.’

So, I’d have to be ok with not knowing why. I could build any number of models related to why I was cramping. But that’s the thing about models. My hero statistician is George E.P. Box (What? Everyone has a hero statistician, right?). He says “Since all models are wrong the scientist cannot obtain a "correct" one by excessive elaboration. On the contrary following William of Occam he should seek an economical description of natural phenomena. Just as the ability to devise simple but evocative models is the signature of the great scientist so overelaboration and overparameterization is often the mark of mediocrity.”

George also says ‘All models are wrong, some are useful’ or something like that…

I think George would have been down with Ripley.   And I had to be ok with not knowing ‘why’.  

So, I overhauled my entire nutrition plan.   This time I hired yet another expert, friend and Coach Kevin McCarthy, to review my nutrition from the Route 66 half and make recommendations.   Kevin was the first to see me after the Route 66 half, and he probably had a better gauge of my physical and mental state than I did.  Laura was of course doing the same thing, giving me great and practical advice on nutrition.   She was also making some changes to my training that I felt would help quite a bit.   I also did an exhaustive amount of research on my own.   Lastly, I talked to almost every experienced age grouper I trusted, including many of my fellow Trisharks.  

Once I had a lot of recommendations from many sources, I consolidated them very, very deliberately, and with great rigor, into what would become my nutrition plan version 2.0. This is an approach I’m very comfortable with as a data scientist. It is a proxy for a statistical technique called ensemble learning. If you need to develop some rules, or generalized learning, and you can’t dig deep on the why because a problem is too complex or you lack time, ensemble learning is where it’s at. Said simply, you use the ‘vote’ of an ensemble of learners to obtain better predictive performance than you could from a single constituent learner. (If you’re a statistician reading this, also consider that the decisions of the trees in my little live action roleplay version of a random forest were, from talking to me and from their own personal experience, subject to bootstrap aggregation and perhaps boosting as well. :P )
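For the curious, the ‘vote’ idea can be sketched in a few lines of Python. This is only a toy illustration of majority voting, not the actual process I followed: the function name, the topics, and the advice values below are all made up for the example.

```python
from collections import Counter

def majority_vote(recommendations):
    """Return the most common recommendation and its share of the votes.

    Ties go to whichever option was seen first.
    """
    counts = Counter(recommendations)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(recommendations)

# Hypothetical advice from five sources (coach, nutrition expert,
# my own reading, two fellow age groupers). These values are invented
# purely for illustration.
advice = {
    "sodium_per_hour": ["more", "more", "same", "more", "same"],
    "fluid_per_hour": ["more", "same", "same", "same", "less"],
}

# One 'vote' per topic: keep the recommendation most sources agree on.
plan = {topic: majority_vote(votes) for topic, votes in advice.items()}

for topic, (choice, share) in plan.items():
    print(f"{topic}: {choice} ({share:.0%} of sources agree)")
```

The point of the sketch is that no single source decides anything; a recommendation only makes it into the plan if most of the ensemble agrees on it.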

And then I tested, and tested again, on long training days, to make sure it would work, or at least do no harm. 

This is not to say that the concept of ‘phone a friend’ is especially clever in our sport. It’s not. But there is a trap we age groupers sometimes fall into. There is danger in reading one paper, speaking to one respected friend or coach, or even reading one pro’s nutrition plan…and then doing what they do. My solution to cramping was using formal methodology to avoid this trap, simple as that.

So, how’d it work?

At Cedar Point this year, despite a continued string of misfortune (race report to follow), I managed a 15 minute PR at the half distance.   More importantly, I did it without a single cramp, at approximately the same effort level I had previously raced at.  I’ll never really know what went wrong at Route 66, and that does bug me on some level, but truth be told I’d never be 100% certain, even if I had an infinite number of identical races in which I could isolate and manipulate variables.   The real world is never the lab.

Perhaps even more importantly, I stopped trying to ‘do it all’ myself and gained a team, which, for a busy part time long course age group athlete, is really invaluable. If you only get to race a few times a year, you don’t have much opportunity to experiment in race conditions or train sub optimally.

Big thanks to Laura, Kevin, and all the local athletes I spoke to, who got me this far. Also thanks to all the professional and age group athletes writing blogs like this one. You can be certain I've data mined you all. :)
