On Parachutes and Evidence


Smith and Pell’s “Parachute use to prevent death and major trauma related to gravitational challenge: systematic review of randomised controlled trials” has been shared often enough as an argument against evidence based practice that I feel compelled to write a blog post about it. For those unfamiliar, Smith and Pell wrote a satirical systematic review that found no randomized controlled trials of parachutes, and proposed one in which one group jumps from a plane with parachutes and the other group uses a placebo parachute. You can guess what the results might be. The problem with the article is that, while comical, it is not actually an effective criticism of evidence based practice and its preference for randomized controlled trials over other forms of evidence (such as observational studies).

Smith and Pell’s criticism falls short because, for interventions like parachutes, the effect is so blatantly obvious that there is no need to conduct a randomized controlled trial. This means you can end up with a truly fantastic intervention without any robust trial evidence to support it. Jeremy Howick calls this the “paradox of effectiveness”: some of the most effective interventions available do not actually have controlled clinical trials that illustrate their benefits.

Howick provides other examples of the “paradox of effectiveness”, such as emergency appendectomies, the Heimlich maneuver, defibrillator use and anesthesia. These are interventions that are remarkably plausible, fill an acute and usually life threatening need and produce dramatic and routinely observable effects. The results of these interventions are so large that they effectively rule out any potential confounding variables. You do not survive jumping from a plane with a parachute because you expect it to work, appendectomies do not prevent complications from a ruptured appendix because of selection bias, anesthesia does not produce unconsciousness due to issues with blinding and the Heimlich maneuver does not resolve choking due to regression to the mean. Therefore, conducting a double blind randomized controlled trial to demonstrate the worth of parachutes would be a waste of resources. Nor would doing so be necessary to adhere to the philosophy of evidence based practice. Where it would be helpful to perform a controlled trial is in comparing different parachute types, open vs. laparoscopic appendectomies, different defibrillator settings and pad placement, or different drugs used in anesthesia. These differences would be far more subtle, and consequently the confidence that any observed difference is not caused by confounding variables would be much smaller.

Unfortunately, interventions that produce dramatic, observable and repeatable effects are few and far between. Most interventions produce smaller and more inconsistent outcomes, which necessitates studying them under more controlled conditions to rule out competing hypotheses. These modest effects are more characteristic of the interventions available to physical therapists. Therefore, using the Smith and Pell article to argue that interventions in physical therapy do not need well controlled evidence for us to be certain of their benefit is a flimsy argument. To be clear: I am not saying that all interventions need controlled trials in order to justify their use. I am saying that to be confident in the efficacy and effectiveness of physical therapy interventions, you need solid data. For example, the effect of manual therapy in patients with neck pain tends to be small, far from the effect of parachute use in preventing death in skydivers. Therefore, in the absence of evidence from robust controlled trials, one cannot confidently say that manual therapy for neck pain:

  1. has a meaningful, specific, repeatable effect
  2. works via a specific mechanism (e.g. resolving a “stuck” facet joint)
  3. produces outcomes that are not the result of factors separate from manual therapy, such as natural history, regression to the mean, expectation, beliefs, reward, etc.

Randomized controlled trials, despite their limitations, are still the best tool available to tease out the effects of the large majority of interventions. Notable exceptions do exist, where the effects are so large that randomized controlled trials are not necessary. These exceptions create a paradox of effectiveness; however, they are not a reason for overconfidence in interventions that lack well controlled supportive data. Nor is the Smith and Pell satire an effective criticism of evidence based practice or a reasonable way to justify physical therapy interventions that have not been scientifically vetted.

Photo credit to flickr user m_ragazzon

8 thoughts on “On Parachutes and Evidence”

  1. Excellently explained Kenny. You take into account a very reasoned view of practice and research, thanks for the clarity.
    You mention it too, but I always thought the parachute argument interesting in that there is an assumption that if parachutes were not studied in an RCT, they were not studied at all. False. They are indeed, and obviously, tested. Many advancements have been made in their capabilities, etc., because of that testing.
    Like you said, it’s a whole different question, and one of the first things we do when appraising research is find out if they’re asking the right question for the corresponding research method.
    Thanks for the post!
    -Matt D

  2. Kyle – great post, as always clear and lucid points. As presented in your post I agree with you. Smith and Pell have not provided an effective argument against Evidence Based Practice or against the superiority of RCTs for isolating a particular causal mechanism from amongst alternative explanations.

    There are aspects to the satire that do warrant pause and consideration. From the Smith and Pell article: “Advocates of evidence based medicine have criticised the adoption of interventions evaluated by using only observational data.”

    Based on this sentence it does not seem that Smith and Pell are attempting to provide an argument against EBP in total, merely against an extreme form of EBP that does not seem to be held by you or Howick. This extreme form may be a straw man set up by Smith and Pell in order to knock down something associated with EBP (in other words, perhaps no one holds this belief of criticizing “the adoption of interventions evaluated by using only observational data”). There are systematic reviews that only include RCTs. I have personally always believed this is to compare apples to apples in the examination of evidence. And if there are enough RCTs to support a rigorous systematic review, then it is appropriate to use this highly controlled set of evidence. But if there are few RCTs due to the nature of the underlying system being studied (i.e. not amenable to an experimental set up as a closed system, but observable in a structured way as an open system); or due to researchers not having had time to study the question; or due to obviously observable mechanisms and remarkable effects; then yes, observational studies should be (and are) included in systematic reviews.

    But if there are people who believe that interventions based only on observational trials should not be considered (should be excluded from practice), then the Smith and Pell paper demonstrates that there are instances in which we can obtain knowledge without RCTs.

    Essentially, Smith and Pell have provided a counterexample to the claim that the only source of knowledge is a randomized controlled trial. If anyone was ever trying to make that claim, then Smith and Pell’s counterexample is effective, as a single counterexample refutes a universal.

    If no one was making that claim, if that claim is a straw man, then at best Smith and Pell provide an opportunity for us to consider what factors do lead to knowledge about the world based on observational studies (which you and Howick have done very effectively). And yes, in most instances we do tend to need at least some parts of an open system to have been isolated with strict experimental control to understand it. We must always consider that eventually we need to plug that knowledge back into the open system and test what we know about it, and in those instances a return to an observational trial may be the most effective means to knowledge. More on this below.

    Underlying my response I am making use of two fundamental aspects of Roy Bhaskar’s Transcendental Realism, more commonly referred to as the Critical Realist Philosophy of Science (most widely accepted and adopted by the social sciences, but applicable to all sciences). One is the belief that ontology determines epistemology, meaning the way things are determines the way things are known. The other is the stratification of reality for scientific purposes into the closed and the open. Open refers to the real world in its fully interacting state (what an observational trial attempts to apply some control to, not by manipulating it but through structured observations); closed refers to the isolation of the real world based on the system you want to know, manipulating it, poking it, prodding it to see what happens. RCTs isolate and create a closed system, and that is a very useful tool for identifying causal relations by isolating a system and ruling out alternative explanations. And the ontology of many things warrants this epistemological approach. But the Smith and Pell satire raises awareness that the nature of some things allows another approach to knowledge.

    I have two more thoughts to share (should they be of any use, I am not sure). First, you and Howick have proposed a set of considerations about the ontology of systems that might be justifiably known (epistemology) through observational trials that are similar to the case of parachutes. Let’s call them ontological features. These ontological features (“interventions that are remarkably plausible, fill an acute and usually life threatening need and produce dramatic and routinely observable effects. The results of these interventions are so large that it effectively rules out any potential confounding variables.”) are not binary. We really cannot answer these as black and white, cut and dry, yes or no, ontological features. They are on a spectrum. Therefore, it is on a spectrum that we must evaluate the claims being proposed as knowledge from observational trials. In some extreme examples, as you have identified, there is little debate. In other, less extreme cases the debate gives academics something to discuss 🙂

    The second thought is that there are two ways to consider the usefulness of observational studies. Thus far we have considered them as if their benefit comes from generating knowledge that RCTs are clearly better at generating (isolated cause – effect) in a closed system. In this regard, they are not as effective. That is clear. But what about considering the ontology of the open system? Here we find that RCTs are inferior to observational studies in that, by their nature, they isolate and close the system. I would ask whether we need to consider the usefulness of well designed, structured observational studies to provide insights and knowledge about the causal structure of systems that are not so isolated or cannot be so isolated, that are open and exist as they do in the full messiness of reality. We do sacrifice some control, but we also may gain in applicability.

    My dissertation was on job stress and its impact on ECG measured variables of cardiac control (through heart rate variability). The way this system is (ontology) determined how we were to know it (epistemology), and only an open approach with an observational design was appropriate. There is no way to simulate the long term, chronic, real life stress associated with a job in an experimental study.

    So there is a tension – a necessary back and forth – that we must work with as we struggle through what information (data) we need, how we collect it and how we analyze it to generate knowledge for the profession, for practice. I believe causal structures, as represented by causal models, provide us with a framework for working through this tension between closed and open systems, classifying the ontology of the system we are attempting to know, and using both RCTs and observational trials appropriately in the service of knowledge. Rather than “either / or” I suggest a “both / and” approach to clinical epistemology based on its complex ontology.

    My comments (above), with links to references and some added commentary on how this all relates to knowledge based practice (generally) and causal models (specifically), are available at:


    As always thank you for such a well written, cogent post!

  3. Hey Sean,

    I always appreciate when you share your thoughts on my blog. Your comments are always insightful and get me thinking quite a bit, so thank you. I think you’re mixing me up with Kyle Ridgeway though 😉 My name is Kenny.

    As Matt D says above, we have to match the question to the methodology. So we are in agreement that sometimes it is an observational trial that is actually most appropriate. With this in mind, I really enjoy this figure from Howick’s Philosophy of Evidence Based Medicine (which coincidentally involves parachutes) — http://i.imgur.com/jS0oqg6.png

    In regards to the ontological features being on a spectrum, your comment echoes what Roger Kerry shared on my Facebook page. I have copy and pasted his original comment and my reply below and would love to hear more of your thoughts on it:


    Roger Kerry: Again, a great [piece] Kenny, and you have framed this argument really well. The philosophical question, which neither EBM nor Howick has responded to suitably as yet, is who decides when an anticipated effect is not large enough to warrant testing under trial conditions? Parachutes, etc., are obvious and extreme examples of large effects, but nil-to-large is a continuum, so where is the “we need RCTs from this point downwards” line drawn? And who decides this?

    My reply: Your comment reminds me of the Sorites Paradox, Roger. Noson Yanofsky does a nice job talking about this (and the apparent “failure” of modus ponens in this regard) in his book “The Outer Limits of Reason.” We “know” when someone is bald, but at what number of hairs does one become bald?

    Yanofsky writes — “If a man with n hairs is bald, then with n + 1 hairs he is also bald. If a man with three hairs is bald, then at four hairs he is also bald. Pressing on with our analysis, we can come to the conclusion that a man with 100,000 hairs or even 10 million hairs is still bald. But this is simply not true.”

    And the opposite (this time in terms of grains and heaps) “If n grains are a heap, then n − 1 grains are also a heap. Using this rule and applying the modus ponens rule many times, we arrive at an obviously false conclusion that a collection of 1 grain is also a heap.”

    Yanofsky goes on to state that it isn’t a problem with modus ponens, but an issue with vagueness in language. You bring up some great points and questions. Nil-to-large is a vague continuum with no clear demarcation (as is the case with baldness and heaps). Except in obvious cases where we just “know”, such as parachutes and Moby’s bald head. I certainly don’t have a good answer as to how one should delineate, but would love it if someone smarter than me did! What are your thoughts?

    Your comments on open and closed systems provide a clear conceptualization of the applications of RCTs v Observational studies, so I really appreciate you sharing that.

    I’m looking forward to giving your latest post a read today. I am finding your blog to be one of the most consistently interesting and informative resources out there in our PT social media sphere.

  4. Hi Kenny – so sorry about the name mix up. You must excuse my aging mind in the context of fast thinking social media, particularly with the depth of thought you and Kyle both provide out here!

    Yes, Roger’s post about the continuum resonates with what I was saying, and your response is spot on (great use of Yanofsky’s book there): where is the line, and who decides? There is, of course, no need to actually answer that question, as everyone agrees that both methodological approaches have their place and contribute to the dialogue towards knowledge, particularly for justifying the existence (and acceptance) of a cause effect relationship. As for my thoughts on the line – at the risk of sounding like an empiricist, I am going to say: continuing to find data that systematically contributes to what we believe is the correct causal structure. I think even small effects, when mechanistically plausible and consistent within and between well designed observational studies, really start to look like causal relations at a level where we can say we know about them. Of course, having some experiments to verify components of the total picture goes a long way, so I think I have just run myself into a circle 🙂 To this end I am planning a series of posts about the ontology of cause – effect relationships, something that goes beyond what I have already done on direct and possible causes, something that addresses causes within the stratification of reality and the degree of interaction within a causal network. Once I do that I might have more to say.

    One way to look at it is to not consider observational studies and RCTs all that different, but to assess them both with the same set of considerations for risk of bias that impact internal and external validity. As for making knowledge statements about causal relationships, we can then look towards Hill’s criteria for causation (I continue to find this epidemiological relic useful any time I am reviewing an article as a reviewer or an editor).


    With these criteria, what you see is that the “experiment” criterion is basically a set of the other criteria (temporality, control of alternative explanations).

    I am glad you find the use of Bhaskar’s closed and open system a clear conceptualization, I do think his system (critical realism) has a lot to offer our field given the stratification of reality we deal with daily.

    Thanks for your all too kind comments about my blog! Your blog posts are equally inspiring to me and keep me well grounded in the realities of our need for empirical data. While I propose critical realism as my foundation, I have a propensity (for better or worse) towards rationalism, so I need to be reminded of the need for a realist balance between empiricism and rationalism. I am very glad to have connected with you all out here in the social media world and look forward to more dialogue to come!

  5. I think what I got out of Smith and Pell’s article is different from what a lot of others did. I don’t think they were saying that EBP is not worthwhile, but they were provoking thought on the idea that we can’t have evidence-only practice. Sometimes common sense, or clinical decision making, may trump evidence, especially if there isn’t any. For me, I look at evidence as a guiding factor, and if something has sound evidence for or against it, I’m more likely to use/not use that intervention.
    For instance, most of the time we cannot make a 100% accurate diagnosis of the pain generator for LBP (look at how many studies show asymptomatic disc herniations), so how can we say that “MT doesn’t work for LBP”? How can a study really negate an intervention when we don’t truly know the pain generator/mechanism?
    I think Smith and Pell were trying to be a bit over the top (it is satire) to get practitioners’ minds flowing. Looks like they did.
