You Set Out to Actually Improve Sports Performance: Did You?

We wouldn’t train if we couldn’t improve. Our athletes would never suffer through the tasks we put them through if they weren’t sure that the training would actually improve their sports performance.

In some aspects of performance improvement, it is really easy to measure how we are progressing. The “inches, kilos, seconds” sports provide a really easy way to measure how performance changes over time. Did I put the shot farther in this competition than I did in the previous competition? Did I run 5k faster than I did last time? Did I set a PR in the squat today?

In other sports, namely team sports, it is much more difficult to assess how performance improves over time. There are many different extraneous factors that can make or break an individual rugby match, or an American football game for instance, like tactics, referees, and a horde of players on the field. Overall, the combination of these factors in these types of sports makes it very difficult to measure improvement in a factor besides just a win or a loss (although devices like Catapult – a player-worn GPS system – are changing this). For those of us who are trying to focus on the physical development of our athletes, we have a bit of an easier time in assessing whether or not our athletes are getting better. We get to use performance tests.

We are of course concerned about wins and losses, but ultimately the only thing we can do is to optimally develop an athlete physically and mentally. When they step out on game night, it is largely outside our control, so all we can do is recognize that we did our best to get our athletes stronger, faster, better conditioned, etc.


How do we know we are doing our job as coaches and actually improving performance?

One of our major responsibilities is to figure out the specific performance metrics that we are concerned about improving. What improvements in physical performance are going to give us the biggest bang-for-our-buck at game time? Which aspects of performance will give us the biggest carryover to maximizing the chances that our athlete and/or team can succeed?

For the “inches, kilos, seconds” athletes, what specific performance improvements will likely result in higher jumps? Or faster 100m sprints? Likewise, for a tennis player, what specific improvements will increase their chances of success on the court? What are the specific skills and abilities they need, and which test(s) most appropriately measure those aspects?

This is a question of specificity of the tests to the athlete’s sport. How well did the prescribed and executed training transfer to the sport?

Scientific literature and coaching literature are a great starting point for this information. There are some great literature reviews out there that can point you in the right direction for the specific testing batteries that you should be concerned about.

Here are a few that might get you started (you’re gonna have to find the rest!).
Miscellaneous Rotational Sports
100m Sprinter

Okay, so we have a good starting point. We have an idea of the specific performance tests to use as indicators for getting better in the ways that matter.

Now how the heck do we evaluate an athlete’s progress? We have data from the performance tests. Is it as simple as seeing if the new test results are better than the older test results?

Yes, and no.

There are a couple of important aspects of testing that you should be concerned about here:

Number one, did they actually get better?
Number two, if they did get better, how much did they get better?

We’re going to spend the rest of this article talking about those two factors.


Did your athlete actually get better?

Okay, so you saw that your athlete jumped higher today than they did a month ago. They jumped higher, but does that mean they actually got better?


There are two huge factors that can affect your interpretation of your athlete’s changes over time. Those two factors are 1) individual athlete variability, and 2) test variability (technically I am talking about reliability). These two factors are incredibly important for understanding whether the changes you saw in your athlete indicate an actual change or just normal variability within the athlete or the test.

Individual Athlete Variability

Variability in biological systems is normal. It is something that we know is an inherent part of being alive. Unfortunately for us, this means that there is also variability in performance.

First, there is normal variability in performance that exists because athletes are living, breathing organisms. For example, athletes are affected by diurnal rhythms, which can result in varying degrees of alertness and performance throughout the day (reference).

Second, there is variability because they are human beings with hopes, dreams, fears, boyfriends, girlfriends, parents, grades, iPhones etc. (Side note: Yes, their cell phone can cause variability in performance. Want to see somebody not focused on their testing? Have them come in right after they’ve left their brand new iPhone 6 on the bus, and not been able to get it back. Do you think they will be 100% into what you are trying to get them to do? I think not.)

Third, there are also other aspects of variability related to environmental concerns. Is it hot or cold outside? Is the music louder or quieter than normal? Is the sport coach standing there during the test? Are the athlete’s teammates standing nearby?

Ultimately, all of these things can result in individual variability in a given person, which can then affect your test. Unfortunately, many of these things can be hard to control for. It’s not like you can ask your softball player’s boyfriend or girlfriend not to break up with them until after max day. Not gonna happen.

What you can do is try to control for as many outside factors as possible. Try to keep as many conditions as possible similar between individual tests, such as temperature, time of day, hydration, etc. Here are a few aspects that you should probably consider. Note that the ones with links are the areas where there is, or probably is, a measurable change in performance when the factor is manipulated.

By doing your best to make sure that these factors are consistent for each testing period, you are doing your due diligence in minimizing outside sources of variability and error in their performance. Unfortunately, there are going to be things outside your control, and really all you can do is be flexible. Maybe you can move testing to the next day, give the athlete a couple of hours to get hydrated, etc. For the factors that are harder to control, you can always ask your athlete what is going on. A simple question like “Why do you think your performance went so well/badly today?” might give some important context. Then, once you have figured out what may have affected their performance, RECORD IT with the results of your test. You should be recording all sorts of other environmental information as well, not necessarily limited to the list above. This information is critical to interpreting your past results later on.
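One way to keep that context attached to the result is a small record structure. The sketch below is only an illustration in Python: the class name, the field names, and the example values are all hypothetical, and a spreadsheet with the same columns would do the same job.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class SessionRecord:
    """One test result plus the context needed to interpret it later."""
    athlete: str
    test_name: str
    result: float
    units: str
    test_date: date
    time_of_day: str = ""
    conditions: dict = field(default_factory=dict)  # e.g. temperature, audience
    athlete_comment: str = ""  # the athlete's own explanation, if any


# Made-up example entry; none of these values come from real testing.
record = SessionRecord(
    athlete="Athlete A",
    test_name="vertical jump",
    result=52.0,
    units="cm",
    test_date=date(2015, 3, 2),
    time_of_day="15:30",
    conditions={"temperature_c": 21, "sport_coach_present": True},
    athlete_comment="Slept badly, lost phone on the bus.",
)
print(record)
```

The point is simply that the number and its circumstances get stored together, so a strange result six months from now still comes with an explanation.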

In reality, we can try to control as best we can, but ultimately there will be some level of variability in our athletes. As far as the magnitude of variability that exists, well, we will get to that a little bit later.

It can be really tough to quantify this amount of variability. Your best bet is to cut down on the possible sources of variability by keeping your environment and athlete conditions as consistent as possible.

Test Variability (reliability)

All tests have some degree of error built into them. No test will be perfectly representative of “reality”. The measurements you make will always be less accurate (accuracy being the closeness of a measurement to the “true” value) than you might like.

Even fancy schmancy technology has error in it, despite the fact that the magical computer box spits out seemingly perfect numbers. Other tests, which can be affected by human error, are sometimes problematic too.

For example, the methods used for body composition can result in a wide range of values and degrees of accuracy depending on which method is being used. If you are evaluating body composition using a DEXA scanner, your accuracy will be WAAAAYYY better than even the most talented researcher using skinfolds. Furthermore, if you aren’t really skilled with body composition testing, that window of error is going to be even wider for the skinfold testing. So when you take skinfold measurements as a brand-spanking-new trainer, you could see huge swings in an athlete’s body composition when measuring on the same day! Your athlete’s body composition is obviously not what is changing within a day, so it must be the measurements. Furthermore, body composition estimations (% fat and otherwise) are based on big groups of similar-bodied(ish) folks. We all have variation in tissue density, body water, etc., so there is some degree of error in the test anyway, even before accounting for tester/caliper/equipment-related error.

Usually, the amount of error in most tests has been estimated to some degree. Your job as a “tester” is to make sure you understand what the typical percentage of error is in a test, then make sure that your improvement is a good amount larger than the error. This will give you reasonable confidence that your athlete has truly gotten better. If there is an estimated 5% error in the test you are using, make sure that you have more than a 5% change before you start yelling from the mountaintop “You’ve improved, you’ve improved!”
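To make that comparison concrete, here is a minimal sketch in Python. The function names, the jump heights, and the 5% error figure are illustrative assumptions, not values from any particular test; look up the published error for the test you actually use.

```python
def percent_change(old, new):
    """Percent change from the old result to the new result."""
    return (new - old) / old * 100.0


def exceeds_test_error(old, new, error_pct):
    """True if the size of the change is larger than the test's estimated error.

    error_pct is the test's estimated error as a percentage (assumed to be
    known from the literature or from your own repeat testing).
    """
    return abs(percent_change(old, new)) > error_pct


# Hypothetical vertical jump results in centimeters, one month apart.
before_cm = 50.0

# A 4% change does not clear an assumed 5% error.
print(exceeds_test_error(before_cm, 52.0, error_pct=5.0))

# An 8% change does.
print(exceeds_test_error(before_cm, 54.0, error_pct=5.0))
```

Note that this only checks the size of the change, not its direction. For timed tests, where a lower number is better, you still have to look at the sign yourself.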

Even when the improvement is slightly larger than the error in the measurement, you should still be (somewhat) cautious about concluding that there was a real change. The further the change exceeds the error of the test, the more confident you can be that there was a true improvement.

Luckily, relative newbies will make big changes, very quickly, so it is easy (or easier) to be confident that they are improving. Newer athletes are primed for making gigantic strides in all sorts of areas that are important for their performance. But (there is always a but) with newbies, and with athletes new to a given test, there is an aspect of learning involved if there is any aspect of skill inherent to the test. For example, if you are using the 5-10-5 pro agility test, for the first few testing periods, athletes will get better at the test just because they have practiced the test. This doesn’t necessarily mean they got better as athletes; rather, they got better at the test. You can try to iron out the learning effect (to isolate your evaluation of a training effect) by ensuring that athletes get a lot of practice and time to get really good at the test before you “test” them.

Things get much more difficult with elite-level athletes, who may slave away for months or years for an improvement of a percentage point or two. If you’re interested, you can check out a good discussion by Will Hopkins about this (smallest worthwhile change). For higher-level athletes, where the margin of improvement in performance is much smaller, the statistics that Hopkins talks about become substantially more important.
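As a rough illustration of that kind of statistic, the Python sketch below estimates a test’s typical error from two baseline trials and compares an observed change against it and against a smallest worthwhile change. It assumes two conventions commonly attributed to Hopkins: typical error as the standard deviation of the difference scores divided by the square root of 2, and a default smallest worthwhile change of 0.2 times the between-athlete standard deviation. The data are made up and the function names are hypothetical, so check the original discussion before relying on these defaults.

```python
import math
import statistics


def typical_error(trial1, trial2):
    """SD of the trial-to-trial differences divided by sqrt(2)."""
    diffs = [b - a for a, b in zip(trial1, trial2)]
    return statistics.stdev(diffs) / math.sqrt(2)


def smallest_worthwhile_change(scores, fraction=0.2):
    """A fraction (assumed default 0.2) of the between-athlete SD."""
    return fraction * statistics.stdev(scores)


# Made-up jump heights (cm) for five athletes tested twice in one week,
# with no training in between, so the differences reflect noise only.
trial1 = [48.0, 52.5, 55.0, 60.0, 45.5]
trial2 = [49.0, 51.5, 56.0, 59.0, 46.5]

te = typical_error(trial1, trial2)
swc = smallest_worthwhile_change(trial1)

observed_change = 3.0  # hypothetical improvement for one athlete, in cm
print(f"typical error: {te:.2f} cm, smallest worthwhile change: {swc:.2f} cm")
print("change exceeds both:", observed_change > te and observed_change > swc)
```

With elite athletes the observed change will often be much closer to these two numbers than in this example, which is exactly why the statistics matter more at that level.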

Regardless of the athlete’s level though, it is certainly your responsibility to do your due diligence and check out what the expected error will be in whatever test you choose to use.

[Image: happy children at a computer. Caption: They just realized they improved more than the amount of error in the test!]



If there is anything that we can draw from this discussion, it is that being confident of real change actually occurring is more difficult than most people think. When we test a single athlete before and after a training period, we have to consider possible sources of error and noise that may influence what each measurement says. With more variable tests, we generally want to see large improvements in performance before we are confident that real change occurred. With less variable tests, we may not have to see such large improvements before we can be sure of real change.

We also have to do our best to minimize (we will never truly eliminate) sources of variability in the athlete’s performance. By keeping environmental conditions like temperature and audience consistent, we can hopefully minimize outside sources of variability. By ensuring an athlete has similar internal conditions, such as a similar warmup and similar hydration, we can minimize or keep consistent some other sources of variability that could affect our ability to assess the possibility of a true change.

We can also ensure that we are using a variety of tests, some of which might be redundant. The advantage of a battery of tests, rather than a single test, is that it gives us a more complete picture of an athlete’s development. A single jump measured on a Vertec is probably not enough to give the whole picture, and if the results are affected somehow by environmental concerns or a couple of wacky jumps, your overall ability to assess that athlete is shot. If you combine your tests into a battery that contains, say, vertical jumps (loaded and unloaded), an agility test, and a sprint test, the chances of all of your tests being shot are lower. If all of those tests point the same way, whether that is improvement, stagnation, or regression, then you have a clearer idea of what happened.
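A battery like that can be summarized with a few lines of Python. This is a sketch under stated assumptions: the test names, results, and per-test error percentages are all invented for illustration, and the classification rule (a change counts only if it is larger than that test’s error) is the simple rule from this article, not a formal statistical test.

```python
def classify_change(old, new, error_pct, lower_is_better=False):
    """Label a change as improved, regressed, or within the test's error."""
    change_pct = (new - old) / old * 100.0
    if lower_is_better:
        # For timed tests, a drop in the number is an improvement.
        change_pct = -change_pct
    if change_pct > error_pct:
        return "improved"
    if change_pct < -error_pct:
        return "regressed"
    return "within error"


# (test name, old result, new result, assumed error %, lower is better)
# Every number here is hypothetical.
battery = [
    ("vertical jump (cm)", 50.0, 54.0, 5.0, False),
    ("loaded jump (cm)", 40.0, 43.0, 5.0, False),
    ("5-10-5 agility (s)", 4.80, 4.50, 3.0, True),
    ("sprint (s)", 3.10, 3.00, 2.0, True),
]

results = {
    name: classify_change(old, new, err, lower)
    for name, old, new, err, lower in battery
}

for name, verdict in results.items():
    print(f"{name}: {verdict}")

# If every test lands on the same verdict, the picture is much clearer.
print("all tests agree:", len(set(results.values())) == 1)
```

When the verdicts disagree, that is a cue to go back to your recorded conditions and look for something that affected one test but not the others.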

At the end of the day, by being deliberate and careful about the tests we choose, the quality of our assessments, and the environments we conduct the tests in, we can hopefully be much more sure that what we are doing is, in fact, actually improving the performance of our athletes.


Take Home Points

  • Before you begin testing your athletes, make sure that the tests you use are actually representative of important aspects of their sport. A 5k time is not going to tell you anything useful about your offensive lineman. Look for literature reviews or books (that are based on research, not opinion) to help you out here.
  • An athlete’s performance will always be variable, and there isn’t a lot we can do about it. What we can do is to reduce outside and inside factors that might increase it, or sway their performance in one direction or another. Consistency between sessions is key.
  • All tests have error. Make sure you know what the expected error for a particular test is. If the change you see in your athlete is larger than the error of the test, you can be more confident there was actually a real improvement. The larger the improvement is relative to the error of the test, the more confident you can be.


So what do you think? Any suggestions for improving this process?


Huge thanks to Chris Sole for feedback.