Single-case experimental design vs. N-of-1 designs: What’s the difference?
Last updated: Aug 15, 2021
I don’t know about everyone else, but I often find myself confused about how the various small-sample study designs differ. In particular, I find that single-case experimental designs (SCEDs) and N-of-1 designs are often conflated. Psychologists (like me) are typically accustomed to SCEDs, whereas other disciplines, such as biostatistics, typically rely on N-of-1 designs. So are they really different? And if they are, what are the differences between them? My goal in this post is to discuss each of these designs, highlight their similarities and differences, and bridge the linguistic gap between disciplines so we can choose the design that works best for our purposes!
Both of these designs exist in contrast to large randomized controlled trials (RCTs), which are typically considered the gold standard clinical trial design in medical settings. However, RCTs have weaknesses that bear acknowledgment. Most importantly, they rely on group-based “nomothetic” data analytic methods that use average scores from each treatment condition taken over groups of individuals (e.g., testing whether the average score of group A differs from that of group B at post-treatment). Therefore, the results of these studies address the “average” patient but are not always applicable to a unique individual. Indeed, growing interest in precision medicine highlights the fact that the “average” patient does not exist and that it is almost impossible to create a single intervention that is effective for everyone with a given condition (e.g., Davidson et al., 2018). RCTs are also blunt data collection designs in the sense that data are typically collected only pre- and post-intervention, creating a black box with regard to what happens during the intervention to produce change. Further, moderators of treatment effects are rarely examined, making it difficult to answer the classic question: What intervention works for whom, and under what circumstances?
Small-sample studies have the potential to address these limitations. Because they rely on frequent measurement throughout the intervention, it is possible to pinpoint when, and often why, change occurs. Further, because only a few subjects are studied, effects are examined in a highly personalized context, and moderators of treatment response can be systematically assessed and explored. Ready to look at our two designs? Let’s check them out!
First, SCEDs. These are designs that test the effect of an intervention using a small number of subjects from whom frequent observations (e.g., daily, weekly) are recorded. The treatment is usually a psychological intervention or therapy. SCED is actually an umbrella term that encompasses several types of design: phase change, multiple baseline, and alternating treatment designs. These designs can be quasi-experimental (e.g., A-B, where “A” and “B” denote different experimental conditions or phases, like “A = baseline phase” and “B = treatment phase”) or include randomization (e.g., the random sequence A-B-A-B-B). Interested readers are referred to Barlow, Nock, and Hersen’s (2009) excellent book on this topic. While a detailed discussion of each design is beyond the scope of this post, all of them allow the researcher to isolate the effects of an intervention compared to an alternative condition. SCEDs can include one subject but can also include more. For SCEDs with multiple conditions (e.g., differing lengths of baseline), it is recommended that at least three patients complete each condition. In SCEDs, randomization can occur within or between subjects. Thus, SCEDs allow for both between- and within-patient comparisons. Because of the frequent data collection, SCEDs inherently have strong internal validity. A SCED composed of one subject has relatively weak external validity; however, adding more subjects and replicating effects across individuals strengthens external validity.
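To make the between-subject randomization concrete, here is a minimal Python sketch of how subjects in a multiple baseline design might be randomly assigned to baseline lengths while respecting the "at least three patients per condition" recommendation. The function name, the subject labels, and the baseline lengths are all hypothetical choices for illustration, not part of any standard SCED software.

```python
import random
from collections import Counter

def assign_baselines(subjects, baseline_lengths, min_per_condition=3, seed=None):
    """Randomly assign each subject to one baseline length (hypothetical helper).

    Subjects are shuffled and then dealt out round-robin, so the conditions
    stay balanced and each one receives at least `min_per_condition` subjects.
    """
    if len(subjects) < min_per_condition * len(baseline_lengths):
        raise ValueError("Not enough subjects for the requested minimum per condition.")
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    return {
        subject: baseline_lengths[i % len(baseline_lengths)]
        for i, subject in enumerate(shuffled)
    }

# Illustrative use: nine made-up subjects, three made-up baseline lengths (in days).
subjects = [f"S{i}" for i in range(1, 10)]
assignment = assign_baselines(subjects, baseline_lengths=[5, 8, 11], seed=42)
per_condition = Counter(assignment.values())
```

Here `per_condition` counts how many subjects landed in each baseline length. Dealing out a shuffled list is just one simple way to keep the conditions balanced; a real study might use a different randomization scheme.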
Now, N-of-1 designs. The treatment in these cases is usually a medical intervention or drug. As the name suggests, these designs include a single subject. Here lies the first difference from SCEDs, some of which include multiple subjects (that being said, the N-of-1 world does include “series of N-of-1 trials” and “N-of-1 meta-analysis,” which can also include multiple subjects). N-of-1 trials are randomized controlled trials in which all the randomization takes place within the subject. For example, a patient might be randomized to the days on which they do and do not take an allergy medication and report their symptoms each day. N-of-1 trials usually run for longer study periods than SCEDs (e.g., a few weeks of days randomized to either A or B). Because all the randomization is within-subject, the results allow for clear causal inferences about the average effect of the medication on allergy symptoms for that particular subject.
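The allergy example can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: days are treated as independent (no carryover or autocorrelation), half the days are assigned to each condition, and the effect is tested with a simple randomization (permutation) test, which is one common choice rather than the only way to analyze an N-of-1 trial. The function names and symptom scores are made up for the example.

```python
import random
from statistics import mean

def randomize_days(n_days, seed=None):
    """Randomly assign half of the days to 'A' (medication) and the rest to 'B' (no medication)."""
    rng = random.Random(seed)
    schedule = ["A"] * (n_days // 2) + ["B"] * (n_days - n_days // 2)
    rng.shuffle(schedule)
    return schedule

def mean_difference(schedule, scores):
    """Mean symptom score on 'A' days minus mean symptom score on 'B' days."""
    a_scores = [s for cond, s in zip(schedule, scores) if cond == "A"]
    b_scores = [s for cond, s in zip(schedule, scores) if cond == "B"]
    return mean(a_scores) - mean(b_scores)

def randomization_test(schedule, scores, n_perm=5000, seed=None):
    """Two-sided randomization test: how often does a reshuffled schedule
    give a difference at least as large as the observed one?"""
    rng = random.Random(seed)
    observed = mean_difference(schedule, scores)
    shuffled = list(schedule)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(mean_difference(shuffled, scores)) >= abs(observed):
            extreme += 1
    # Add 1 to numerator and denominator so the p-value is never exactly zero.
    return observed, (extreme + 1) / (n_perm + 1)

# Illustrative use with invented daily symptom ratings (0-10) for one patient.
schedule = randomize_days(14, seed=1)
scores = [3 if cond == "A" else 6 for cond in schedule]  # made-up data, not real results
effect, p_value = randomization_test(schedule, scores, n_perm=2000, seed=1)
```

Note that the estimate applies only to this one patient, which is the point of the design. With real data, carryover between days would need to be considered before trusting a test like this.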
I see a lot of similarities between N-of-1 designs and SCEDs. Indeed, both focus on identifying causality and providing strong internal validity. A major difference is that SCEDs can include randomization across subjects within the same design, whereas N-of-1 designs are single-subject by definition. Additionally, N-of-1 designs always appear to include randomization between treatment conditions, whereas SCEDs can include quasi-experimental A-B designs without randomization.
In writing this, I ended up concluding that there are more similarities between these designs than I realized! Perhaps part of the problem rests in each field using the terms it is most accustomed to. I grew up using SCED terminology. However, I can easily see the value of N-of-1 designs in psychotherapy research. Therefore, I hope we can continue to bridge linguistic gaps together in our hunt for effective research designs!