According to conventional wisdom, concurrent multiple baselines are superior because they allow for across-tier comparisons that can rule out coincidental events. A study may be at heightened risk of coincidental events if the target behavior is particularly sensitive to events in the environment that are uncontrolled by the experimenter. Addressing the second question requires data analysis that is informed by the specifics of the study. Thus, the additional temporal separation that is possible in a nonconcurrent design is a strength rather than a weakness in controlling for coincidental events. Hayes argued that, fortunately, the logic of the strategy does not really require an across-tier comparison (p. 206), because the within-tier comparison rules out these threats. "Of course, the major problem with this [nonconcurrent multiple baseline] strategy is that the control for history (i.e., the ability to assess subjects concurrently) is greatly diminished" (p. 365). As a result, concurrent and nonconcurrent designs are virtually identical in their control for maturation threats. This assumption was initially identified by Kazdin and Kopel in 1975, but its implications for the rigor of the across-tier comparison have rarely been discussed since that time. In the case of multiple baseline designs, a stable baseline supports a strong prediction that the data path would continue on the same trajectory in the absence of an effective treatment; these predictions are said to be verified by observing no change in the trajectories of data in other tiers that are not subjected to treatment; and replication is demonstrated when a treatment effect is seen in multiple tiers. For example, for a child who is on the cusp of walking, a month of exposure to maturational variables may result in a significant improvement in walking, but much less change in fine motor skills. For example, instrumentation is addressed primarily through observer training, calibration, and interobserver agreement (IOA).
For example, knowing the date of session 10 in tier 1 tells us nothing about the date of session 10 in tier 2. Smith (2012) found that SCD was reported in 143 different journals that span a variety of fields such as behavior analysis, psychology, education, speech, and pain management; across these fields, multiple baselines account for 69% of SCDs. Poor execution can certainly worsen these problems, but good execution cannot eliminate them. The across-tier comparison is an additional basis for evaluating alternative explanations. This insensitivity is not due to poor experimental design or implementation; it is built into the nature of multiple baseline designs across participants. Nonconcurrent multiple baseline designs are those in which tiers are not synchronized in real time. Further, it is impossible to know how many events are missed by an across-tier comparison, which events they are, or how severe they are. Thus, both of the articles introducing nonconcurrent multiple baselines made explicit arguments that replicated within-tier comparisons are sufficient to address the threat of coincidental events. In a concurrent multiple baseline that involves a single participant across settings, behaviors, antecedent stimuli, etc., this kind of event would be expected to contact all tiers. In this design, behavior is measured across multiple individuals, behaviors, or settings.
The current SCD methodological literature and most SCD textbooks claim that because the tiers of nonconcurrent multiple baselines are not synchronized in real time, they have a diminished capacity to control for extraneous variables, in particular coincidental events (e.g., Carr, 2005; Gast et al., 2018; Harvey et al., 2004; Johnston et al., 2020). If each tier of a multiple baseline represents a different participant in a different environment (e.g., school versus clinic) located in a different city, this would further reduce the chance that any single event or pattern of events could have contacted the participants coincident with the phase changes. In this case, the across-tier comparison would give the false appearance of strong internal validity. The authors argue that, like the concurrent multiple baseline design, the nonconcurrent form can rule out coincidental events (i.e., history) as a threat to internal validity, and that experimental control can be established by the replication of the within-tier comparison with phase changes offset relative to the beginning of baseline. Longer lags and more isolated tiers can reduce the number of tiers necessary to render extraneous variables implausible explanations of results. For example, in a multiple baseline across settings, the settings could present somewhat different demands. It is possible that a coincidental event may be present for all tiers but have different effects on different tiers. Still, for a given study, the results influence the number of tiers required in a rigorous multiple baseline design. By nature, undetected events are unknown. These variables share the key characteristic that their impact would be expected to accumulate as a function of the number of experimental sessions.
In particular, within-tier comparisons may be strengthened by isolating tiers from one another in ways that reduce the chance that any single coincidental event could coincide with a phase change in more than one tier (e.g., temporal separation). Watson and Workman (1981) noted that the requirement that observations be taken concurrently clearly poses problems for researchers in applied settings (e.g., schools, mental health centers), since clients with the same target behavior may only infrequently be referred at the same point in time (p. 257). The process begins with a simple baseline-treatment (AB) comparison: a change from baseline to treatment within a single tier. Any alternative explanation of this pattern of results would have to posit an alternative set of causes that could plausibly result in changes in the dependent variable in this specific pattern across the multiple tiers. If the baseline phase provides sufficiently stable data to support a strong prediction of the subsequent data path, and that prediction is contradicted by the actual data after the introduction of the independent variable, this provides some suggestion that the independent variable may have been the cause of the change, that is, a potential treatment effect. Coincidental events might be expected to be more variable in their effect than interventions that are designed to have consistent effects. In both within- and across-tier comparisons, the dates on which the sessions took place are not relevant to the effects of testing and session experience. (Similar arguments can be made for comparisons across settings, persons, and other variables that might define tiers.) (1968) who emphasized the replicated within-tier comparison.
On the other hand, if we see a change in a treated tier and no change in untreated tiers, does this constitute strong evidence to rule out threats to internal validity? Having identified the criticisms of nonconcurrent multiple baseline designs, we now turn to a detailed analysis of threats to internal validity and features that can control these threats. A researcher who puts great confidence in the across-tier comparison could falsely reject the idea that coincidental events were the cause of observed effects. It is clear that we cannot claim that these assumptions are always valid for multiple baseline designs. DOI: https://doi.org/10.1007/s40614-022-00343-0. The key characteristic that maturational processes share is that they may produce behavioral changes that would be expected to accumulate as a function of elapsed time in the absence of participation in research. In order to control for maturation, we must attend to the passage of time, typically measured in calendar days. This skepticism of nonconcurrent designs stems from an emphasis on the importance of across-tier comparisons and the relatively low importance placed on replicated within-tier comparisons for addressing threats to internal validity and establishing experimental control. Finally, we make recommendations for more rigorous use, reporting, and evaluation of multiple baseline designs.
Peer reviewers and editors who serve as gatekeepers for the scientific literature must also have a deep understanding of these issues so that they can distinguish between stronger and weaker research, ensure that information critical to evaluating internal validity is included in research reports, and assess the appropriateness of discussion and interpretation of results. The point is that although the across-tier comparison may reveal a maturation effect, there are also circumstances in which it may fail to do so. This argument rests on the assumptions that any extraneous variable that affects one tier will (1) contact all tiers and (2) have a similar effect on all tiers. Third, patterns of results influence the number of tiers needed to yield definitive conclusions. As Kazdin and Kopel (1975) pointed out, multiple baseline designs require that the independent variable have tier-specific effects, yet the across-tier analysis requires that extraneous variables not have tier-specific effects. The within-tier comparison may be further strengthened by increasing the independence of the tiers in other dimensions. Volume 45, pages 619-638 (2022). A broad and general impression such as "these designs are relatively strong" is not sufficient to guide experimental design decisions or to evaluate particular variations of multiple baseline designs. For example, it is implausible that the effects of maturation would coincide with a phase change after 5 days in one tier, after 10 days in a second tier, and after 15 days in a third. This is a significant problem for the across-tier comparison because its logic is dependent on these two assumptions.
This has been the topic of important recent methodological research, including studies of the interobserver reliability of expert judgements of changes seen in published multiple baseline designs (Wolfe et al., 2016) and the use of simulated data to test Type I and Type II error rates when judgements of experimental control are made based on different numbers of tiers (Lanovaz & Turgeon, 2020). Other design features that contribute to the isolation of tiers, such that any single extraneous variable is unlikely to contact multiple tiers, can also strengthen the independence of tiers. This provides clear information about the number of sessions that precede the phase change in each tier, and therefore constitutes a strong basis for controlling the threat of testing and session experience. Thus, to demonstrate experimental control, the effects of the independent variable must not generalize; and to detect an extraneous variable through the across-tier comparison, the effects of that extraneous variable must generalize. The authors discuss two designs commonly used to demonstrate reliable control of an important behavior change (p. 94). Multiple baseline designs are intended to evaluate whether there is a functional (causal) relation between the introduction of the independent variable and changes in the dependent variable. Therefore, we believe that these features should be explicitly included in the definition of multiple baseline designs. Hersen and Barlow's (1976) textbook appears to be the first complete description of the multiple baseline design with many of the ideas about experimental control that are current to this day. In a review of the SCD literature, Shadish and Sullivan (2011) found multiple baseline designs making up 79% of the SCD literature (54% multiple baseline alone, 25% mixed/combined designs).
When determining whether a multiple baseline study demonstrates experimental control, researchers examine the data within and across tiers and also consider the extent to which alternative explanations (e.g., extraneous variables or confounds) could plausibly account for the obtained data patterns. The consensus in recent textbooks and methodological papers is that nonconcurrent designs are less rigorous than concurrent designs because of their presumed limited ability to address the threat of coincidental events (i.e., history). This would draw attention to the relationship between the prediction from baseline and the (possible) contradiction of that prediction by the obtained treatment-phase data, and the replication of this prediction-contradiction pair in subsequent tiers. A coincidental event may contact a single unit of analysis (e.g., one of four participants) or multiple units (e.g., all participants). When conditions are less ideal, additional tiers may be necessary. Data analysis issues concern two closely related questions: (1) Was there a change in data patterns after the phase change? We use "function of elapsed time" descriptively rather than causally. As we argued above, the observation of no change in an untreated tier is not strong evidence against a coincidental event affecting the treated tier.
Likewise, setting-level coincidental events are those that contact a single setting. The first is the reversal design, and the authors describe the important applied limitation of this design: situations in which reversals are not possible or feasible in applied settings. The multiple baseline family of designs includes multiple baseline and multiple probe designs. Because a non-concurrent design does not allow any AB comparisons across baselines, it omits the opportunity to see if responding under the control condition changes when the treatment condition is implemented in the other baseline. If a nonconcurrent multiple baseline has a long lag in real time between phase changes (e.g., weeks or months), this may provide stronger control than a design with a lag of one or several days. (Our specification of phase change offset in terms of real time, days in baseline, and sessions in baseline is unusual.) We examine how these comparisons address maturation, testing and session experience, and coincidental events. Second, we briefly summarize historical methodological writing and current textbook treatment of these designs. Textbook authors, editors, and readers of research should consider nonconcurrent multiple baseline designs to be capable of supporting conclusions every bit as strong as those from concurrent designs.
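The three ways of specifying a phase-change offset mentioned above (real time, days in baseline, and sessions in baseline) can be made concrete with a small sketch. The function name, the input format (a list of session dates per tier), and the example dates are all hypothetical illustrations, not part of the original article.

```python
from datetime import date

def phase_change_offsets(session_dates, first_treatment_session):
    """Describe one tier's phase change in three ways.

    session_dates: session dates in session order (hypothetical input format).
    first_treatment_session: 1-based number of the first treatment session.
    """
    change_date = session_dates[first_treatment_session - 1]
    return {
        # Relevant to testing and session experience.
        "sessions_in_baseline": first_treatment_session - 1,
        # Relevant to maturation (elapsed calendar days since baseline began).
        "days_in_baseline": (change_date - session_dates[0]).days,
        # Relevant to coincidental events (tied to particular dates).
        "phase_change_date": change_date,
    }

# Made-up example tier: five sessions, treatment begins at session 4.
tier_1 = [date(2024, 1, 1), date(2024, 1, 3), date(2024, 1, 5),
          date(2024, 1, 8), date(2024, 1, 10)]
offsets = phase_change_offsets(tier_1, first_treatment_session=4)
```

Computing the same three values for every tier and comparing them shows how far apart the phase changes are on each dimension, which is the information a reader would need to judge the lag between tiers.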
In addition, multiple baseline designs are increasingly used in literatures that are not explicitly behavior analytic. The across-tier comparison is valuable primarily when it suggests the presence of a threat by showing a change in an untreated tier at approximately the same time (i.e., days, sessions, or dates) as a potential treatment effect. A functional relation can be inferred if the pattern of data demonstrates experimental control, that is, the experimenter's ability to produce a change in the dependent variable in a precise and reliable fashion (Sidman, 1960). In this section, we examine how within- and across-tier comparisons may support (or fail to support) internal validity in concurrent and nonconcurrent multiple baseline designs. These coincidental events would contact all tiers of a multiple baseline that include this individual participant, but not tiers that do not involve this participant. Thus, although the across-tier analysis does provide a test of the maturation threat, a lack of change in untreated tiers cannot definitively rule it out. That is, it is not strong evidence verifying the prediction of no change in the initial tier in the absence of an intervention. Given that multiple baseline designs make up such a large proportion of the existing SCD literature and current research activity, it is critical that SCD researchers thoroughly understand the specific ways that multiple baseline designs address potential threats to internal validity so that they can make experimental design decisions that optimize internal validity and accurately evaluate, discuss, and interpret the results of their research.
An important drawback of pre-experimental designs is that they are subject to numerous threats to their validity. If factors other than the experimenter's manipulation of the independent variable could plausibly account for the obtained data patterns, experimental control has not been demonstrated and functional relations cannot be inferred. We are not pointing to flaws in execution of the design; we are pointing to inherent weaknesses. A straightforward AB design includes a baseline phase (A) and an intervention phase (B). With control for coincidental events in multiple baseline designs resting squarely on replicated within-tier comparisons, there is no basis for claiming that, in general, concurrent designs are methodologically stronger than nonconcurrent designs. However, an across-tier comparison is not definitive because testing or session experience could affect the tiers differently. Correspondence to Timothy A. Slocum. Coincidental events share the characteristic that their behavioral impact is expected to be a function of particular dates. The lag between phase changes must be long enough that maturation over any single amount of time cannot explain the results in multiple tiers. However, critics of nonconcurrent designs have rarely (1) made a thorough and critical analysis of the potential weaknesses of across-tier comparisons in concurrent multiple baselines, or (2) evaluated the degree of experimental control that can be demonstrated by replicated within-tier comparisons.
Additional replications further reduce the plausibility of extraneous variables causing change at approximately the same time that the independent variable is applied to each tier. Third, we explore how concurrent and nonconcurrent multiple baselines address each of the main threats to internal validity. First, the design assumes that treatment effects will be tier-specific and not spread to untreated tiers. They state, "the nonconcurrent multiple baseline across participants design is inherently weaker than other multiple baseline design variations." The logic of replicated within-tier analysis applies equally to concurrent and nonconcurrent designs. Elapsed time does not directly cause maturational changes in behavior.
In both forms of multiple baseline designs, a potential treatment effect in the first tier would be vulnerable to the threat that the changes in data could be a result of testing or session experience. Research methodologists have identified numerous potential alternative explanations that are threats to internal validity (e.g., Campbell & Stanley, 1963; Cooper et al., 2020; Kazdin, 2021; Shadish et al., 2002). Independently of Watson and Workman (1981), Hayes (1981) published a lengthy article introducing SCDs to clinical psychologists and made the point that these designs are well-suited to conducting research in clinical practice. As Kazdin and Kopel point out, it is clearly possible for treatments to have broad effects on multiple tiers and for extraneous variables to have narrow effects on a specific tier. However, we can never ensure that any two contexts or any two session times are not subject to unique events during the study. in their classic 1968 article that defined applied behavior analysis. This is consistent with the judgements made by numerous existing standards and recommendations (e.g., Gast et al., 2018; Horner et al., 2005; Kazdin, 2021; Kratochwill et al., 2013). Although the across-tier comparison may detect some coincidental events, it cannot be assumed to detect them all. Textbooks commonly describe and characterize the design without clearly defining it.
Multiple baseline designs can rigorously control these threats to internal validity. For example, physical growth and experiences with the environment can accumulate and result in relatively sudden behavioral changes when a toddler begins to walk. The nature of control for coincidental events (i.e., history) provided by the within-tier comparison in both concurrent and nonconcurrent multiple baseline designs is relatively straightforward. This would align the definition with the critical features required to demonstrate experimental control and thereby allow strong causal statements based on multiple baseline designs. However, this kind of support is not necessary: lagged replications of baseline predictions being contradicted by data in the treatment phase provide strong control for all of these threats to internal validity. One area that has, in the past, been particularly controversial is the experimental rigor of concurrent versus nonconcurrent multiple baseline designs; that is, the degree to which each can rule out threats to internal validity. However, it does not rule out maturation as an alternative explanation of the change in behavior. If an extraneous variable were to have a tier-specific effect, it would be falsely interpreted as a treatment effect. Thus, to the degree that nonconcurrent designs support longer lags between phase changes than concurrent designs, they may support stronger control of the threat of coincidental events through replicated within-tier comparisons. By "synchronized" we mean that session 1 in all tiers takes place before session 2 in any tier, and this ordinal invariance of session number across tiers is true for all sessions.
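The definition of synchronization given above (session 1 in all tiers precedes session 2 in any tier, and so on for every session) can be checked mechanically. The following is a minimal sketch under stated assumptions: the function name, the dictionary-of-dates input format, and the example data are hypothetical, and two sessions falling on the same date are treated as not strictly ordered.

```python
from datetime import date

def is_synchronized(tiers):
    """Return True if session k in every tier precedes session k + 1 in any tier.

    tiers: dict mapping a tier label to its session dates in session order
    (a hypothetical input format chosen for this illustration).
    """
    longest = max(len(dates) for dates in tiers.values())
    for k in range(longest - 1):
        # Dates of session k (0-based) in every tier that has that session.
        current = [d[k] for d in tiers.values() if len(d) > k]
        # Dates of the next session in every tier that has it.
        following = [d[k + 1] for d in tiers.values() if len(d) > k + 1]
        # Assumption: a tie in dates counts as a violation of strict ordering.
        if max(current) >= min(following):
            return False
    return True

# Made-up concurrent design: both tiers are observed on the same days.
concurrent = {
    "tier 1": [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 3)],
    "tier 2": [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 3)],
}
# Made-up nonconcurrent design: tier 2 begins a month after tier 1.
nonconcurrent = {
    "tier 1": [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 3)],
    "tier 2": [date(2024, 2, 1), date(2024, 2, 2), date(2024, 2, 3)],
}
```

In this example `is_synchronized(concurrent)` is true and `is_synchronized(nonconcurrent)` is false, because session 1 of the second tier takes place after session 2 of the first.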
To offer some guidance, we believe that under ideal conditions (adequate lags between phase changes, circumstances that do not suggest that threats are particularly likely, and clear results across tiers), three tiers in a multiple baseline can provide strong control against threats to internal validity. This pattern seriously weakens the argument that the independent variable was responsible for the change in the treated tier. For example, in a multiple baseline across participants, all the residents of a group home may contact peanut butter and jelly sandwiches for lunch, but this change may disrupt the behavior of residents with a mild peanut allergy and not that of other residents. If the pattern of change shortly after implementation of the treatment is replicated in the other tiers after differing lengths of time in baseline (i.e., different amounts of maturation), maturation becomes increasingly implausible as an alternative explanation. Watson and Workman described a nonconcurrent multiple baseline design in which participants could begin a study as they became known to the researcher. Each replication requires an assumption of a separate event coinciding with a distinct phase change. If this requirement is not met and a single extraneous event could explain the pattern of data in multiple tiers, then replications of the within-tier comparison do not rule out threats to internal validity as strongly.
The issue of concurrence of tiers should be considered along with many other design variations that can be manipulated to create a design that fits the experimental challenges of a particular study. First, studies differ with respect to the experimental challenges imposed by the phenomena under study. the effects of the treatment variable are inferred from the untreated behaviors (p. 227).