Open Science in Education Sciences
What Is Open Science?
The Open Science movement aims to increase trust in research results and to open access to all elements of a research project to the public. To advance these goals, Open Science has promoted five critical elements: Open Data, Open Analysis, Open Materials, Preregistration, and Open Access. All Open Science elements can be thought of as extensions to the traditional way of achieving openness in science, which has been scientific publication of research outcomes in journals or books (Nosek et al., 2012). First, Open Data refers to the practice of making all raw data used in an analysis publicly available (Nosek et al., 2012; Nuijten, 2019), as opposed to presenting only summary data in a scientific publication, usually in the form of means and standard deviations. In this article, we consider Open Data to comprise only data collected during a research study and to exclude materials related to intervention and data analysis (as these are covered by different elements, Open Analysis and Open Materials). Open Data is one of the elements of Open Science to which grant funding agencies attribute the most value. Providing access to data from federally funded research stems from the idea that data collected with public funds ultimately belong to the public, with research institutions serving as the stewards of the data. Sharing data openly gives other researchers the chance to combine these data sets for subsequent analysis. For example, several longitudinal studies examining students with LD and other comorbid conditions might be combined into one pooled sample spanning early elementary through high school. With this pooled sample, longer trajectories of symptomatology could be modeled using high-powered growth models (Curran & Hussong, 2009), giving insight into the development of the comorbidity across all levels of education without having to follow one cohort for over 12 years.
Besides combining data sets that only include individuals with LD, LD researchers could also reuse data sets not specifically about individuals with LD. Many classroom intervention projects, for example, include students with LD. Data from these projects might be reanalyzed to show disaggregated intervention effects, for example, a difference in the expected time it would take students to reach proficiency (T. L. Johnson & Hancock, 2019).
The purpose of the second element, Open Analysis, is to provide consumers of research with a detailed task analysis of the steps researchers took to obtain final statistical results, starting at the raw data (Klein et al., 2018), as opposed to the shortened version usually specified in a scientific publication. This task analysis will likely include procedures used to clean raw data (i.e., correcting errors in data and making sure formatting is consistent) and to transform variables, as well as details about statistical procedures. In addition, it should provide detailed documentation of the software used for the analysis, including the packages (in case of open source) or add-ons (with commercial software) and the specific version used. Open Data and Open Analysis are often considered under one umbrella, and some journals that encourage this practice have adopted a badge system whereby researchers receive an Open Data badge that is published with their article when the authors include access to their data and code. Other institutions, such as the National Institutes of Health (NIH) and the Institute of Education Sciences (IES), consider Open Data to be different from Open Analysis. Open Analysis can benefit research in LD, for example, because it might provide the exact criteria used in a study for classifying students as LD. Other researchers can then use the same criteria in their study to make results more comparable across research studies.
With adherence to the third element, Open Materials, researchers make sure anyone can reproduce a study’s procedures by providing all materials used in the study (Klein et al., 2018). The type of materials will differ according to the type of study, but might include all researcher-created assessments, questionnaires, intervention protocols, and implementation fidelity checklists. Replication is important for research in LD to understand under which circumstances and for which population a specific intervention works. To directly replicate a study, however, the intervention should follow the original as closely as possible. Often, published manuscripts only provide a snapshot or description of a subset of the materials used in the study, limiting the opportunity for replication and evaluation. Besides facilitating replications, sharing study materials openly gives practitioners the opportunity to access tools that could benefit individuals with LD.
The fourth element of Open Science is Preregistration. In a preregistration, researchers delineate the parameters of their study by clearly describing their hypotheses, methods for data collection, and data analysis plan in a study protocol before executing a study (Nosek & Errington, 2019; van’t Veer & Giner-Sorolla, 2016). Preregistration is complete when the study protocol is uploaded to an online registry and available to the public to download. Some journals accept the submission of study protocols as Registered Reports. Registered Reports include an introduction section and undergo the same peer-review process as regular manuscripts. In the write-up of a study, researchers can refer back to the study protocol to indicate deviations from the original plan and, by doing so, indicate which results are confirmatory and differentiate them from those that are exploratory. In this way, we can be sure of the veracity of the results and make informed decisions about future research or policy benefiting individuals with LD.
The last element, Open Access, may be the most fundamental element of Open Science. By adhering to Open Access, researchers do their best to make sure the results of science are available to anyone, not just the individuals who have a subscription to a specific journal (Klein et al., 2018; Norris et al., 2008). To do this, researchers can publish in Open Access journals, pay extra fees to traditional journals to let an article become Open Access, or provide pre- and postprints of articles on preprint servers. Open Access benefits the LD community in general by ensuring all stakeholders (i.e., researchers, practitioners, family members, and policy makers) have access to the latest research.
Progress Toward Open Science
Due to the recent influx of commentary on Open Science related to high-impact failures to replicate original research, it may seem that Open Science is a buzzword associated with a relatively new phenomenon that is mainly an issue in psychology. The so-called replication crisis in psychology has had extensive coverage in scientific literature and the media since the late 2000s. It was dubbed the replication crisis after researchers found only 36% of statistically significant results published in prominent psychology journals could be replicated and a large majority of this 36% resulted in effects smaller than the original effects (Open Science Collaboration & Others, 2015). However, the roots of the Open Science movement go back much further than the crisis. The current Open Science movement is the result of multiple decades of concerns about how science is conducted and how researchers engage the public through the results of their research. For example, selective outcome reporting and HARKing (hypothesizing after results are known) were flagged as problematic in the 1960s for psychologists (Meehl, 1967) and in the 1990s for epidemiologists (Taubes & Mann, 1995).
A central tenet of Open Science, increasing public access to data, analyses, and results, has a long history. In fact, David (1994) states that Open Science likely started in the 17th century during the Scientific Revolution. Scientific publishing was established under the assumption that printed versions of results would open science to the general public, allowing full disclosure of knowledge and public replicability (National Research Council, 2003). Beyond opening science through summary data in publications, making the full raw data collected as part of a research project open to the public became the norm in the late 1940s, through the start of repositories or archives for social sciences research data (Bisco, 1966). As such, sharing full raw data is also not a new phenomenon in science.
In medicine, the lack of reproducibility of outcomes and transparency of reports of clinical trials (Pocock et al., 1987) led to the creation of the Consolidated Standards of Reporting Trials (CONSORT) in 1996 with the intention to increase transparency of research methods and results (Begg et al., 1996). Adoption of the CONSORT guidelines has led to an increase in quality of medical clinical trial reporting (Plint et al., 2006). In addition, many grant funding agencies in medicine soon required researchers to preregister clinical trials (www.clinicaltrials.gov), a result of the FDA Modernization Act of 1997. Preregistration in medicine requires specifying methods, recruitment, and primary outcomes before a trial is conducted. As a result, the number of reported positive findings has decreased while the number of published null results has increased (Kaplan & Irvin, 2015), expanding the knowledge of the true impact of new treatments. This example shows the gradual shift in a major research area over the last four decades to incorporate more and more details on data, design, and analysis with the objective to increase trust in published research results. In sum, Open Science practices are not new.
Open Science in Educational Sciences
Open Science practices are also not new in education science. As in other sciences, researchers in education science voiced concerns about research practices throughout the 20th and 21st centuries. For example, in the early 1980s, Peterson and colleagues (1982) appealed to the applied behavior analysis field to provide more details in descriptions of independent variables. They argued that better descriptions might reduce failures to replicate, because without them replicators may not implement an intervention as intended (Peterson et al., 1982). More recently, as a response to the lack of transparency and the limited ability to evaluate the merits of research from published reports, special education researchers came together in the early 2000s to establish indicators of research quality for correlational, group design, single case, and qualitative research (Brantlinger et al., 2005; Gersten et al., 2005; Horner et al., 2005; Thompson et al., 2005). Around the same time, the IES started the What Works Clearinghouse. This initiative sets standards for what can be considered high-quality research and then reviews this work to provide publicly available information about educational programs, products, and practices, with the ultimate goal of helping educators provide evidence-based instruction and interventions that work (U.S. Department of Education, What Works Clearinghouse, n.d.).
More recently, federal grant funding institutions that support research in education and LD, such as the NIH, the National Science Foundation (NSF), and the IES, have set forth recommendations and requirements that align with Open Science practices. Since 2013, data collected during a project funded by any federal grant institution have been required to be open and accessible to the public (Executive Order No. 13,642, 2013). All three grant funding agencies require applicants to include a data management plan, that is, a plan detailing how the final research data will be shared with the public. These funding agencies also house articles written about their funded projects in specific public access repositories (i.e., ERIC for IES, PubMed Central for NIH, and NSF-PAR for NSF).
In addition to requiring a data management plan and public access to articles, IES adopted the Standards for Excellence in Educational Research (SEER) Principles in their 2019 grant application cycle (U.S. Department of Education, Institute of Education Sciences, 2018). Adherence to the SEER Principles is not mandatory for securing funding, but, ultimately, research projects will receive indications of excellence according to their adherence levels (U.S. Department of Education, Institute of Education Sciences, 2018). The SEER Principles cover both the period leading up to and the period following a research project and include practices intended to increase transparency, such as Preregistration, Open Analysis, and Open Data. In addition, the SEER Principles support the scaling up and generalization of results through providing Open Materials.
By focusing on producing high-quality research and adhering to mandates from funding agencies, many individual researchers are, perhaps unintentionally, moving toward adhering to the practices heralded by Open Science. Recent papers advocating the adoption of Open Science practices in education argue these practices are a safeguard against questionable research actions such as HARKing, selective outcome reporting, and p-hacking (e.g., Cook, 2016; van der Zee & Reich, 2018). Particularly in special education, providing evidence of these questionable actions is seen as an argument to stimulate a shift in academic culture, in particular, a shift toward Open Science practices (e.g., Cook, 2016). Open Science in education sciences, however, has the potential to be much more than a safeguard against questionable research. Open Science in education science provides opportunities to (a) increase the transparency and therefore replicability of research and (b) develop and answer research questions about our population of interest (i.e., individuals with LD and learning difficulties) that were previously impossible to answer due to complexities in data analysis methods.
One important aspect of research on, and interventions for, students with LD is understanding the parameters within which findings hold. These parameters could be related to various characteristics, such as participants, the environment, and the interventions (Coyne et al., 2016). For example, a particular reading intervention may have evidence of being useful for students in Grades K-2, but that does not mean it is equally effective for students in Grades 3 to 5. Similarly, some patterns may be present in children in an urban environment, but not in those living in rural settings. To find out where these parameters lie, research should go through several phases, from piloting a study, to direct replication, and finally to conceptual replications (Coyne et al., 2016). Direct replications are a way to ensure effects found in a study are robust, and not due to “error, bias, or chance” (Coyne et al., 2016, p. 250). Direct replications are essentially duplicates of the original study and are difficult to realize in applied educational research. In fact, in a comprehensive review of 36 special education journals, Makel and colleagues (2016) found only 90 direct replications in 45,490 articles. Conceptual replications help define the parameters of an effect (Nosek & Errington, 2019). These replications can be closely aligned to the original study, with only a few dimensions differing from the original study (Coyne et al., 2016), for example, examining the effect of an intervention using the same intervention materials and training, grades, and population, but conducting the study in a different geographical area (e.g., Gersten et al., 2015), or extending the length of the intervention (e.g., Toste et al., 2019). When more dimensions change, a replication is considered distal and will speak to the generalizability of the effect, for example, when changing both the group size of an intervention and the geographical area (e.g., Doabler et al., 2019).
Ideally, researchers move systematically through the different phases of replication.
The importance of this phase structure for defining these parameters is not limited to Open Science. In fact, it is reflected in the grant funding structures of both the NSF and IES. Moreover, IES mentions these parameters in its mission, and this emphasis is reflected in its recent commitment to funding replications of interventions with previous positive results. Specifically, IES holds “finding out what works for whom under what conditions” (U.S. Department of Education, Institute of Education Sciences, 2018) as a central goal. To achieve this goal, IES funds projects under the themes of Exploration, Development and Innovation, and Initial Efficacy and Follow-Up. Similarly, NSF funds projects as Exploratory, Design and Development, Impact, and Implementation and Improvement. For conceptual replications, IES has added a different funding competition specifically focused on reading and mathematics interventions that have previously been shown to be effective (CFDA 84.305R and CFDA 84.324R).
Besides funding agencies, replication is also prominent in the discussion of finding evidence-based practices in single case design research (Horner et al., 2005). The proposed criteria by Horner and colleagues (2005) for determining if an intervention has sufficient evidence base include five different direct or distal, high-quality replications across three different researchers in three different geographical areas.
To be able to execute direct or conceptual replications, researchers will need access to the original materials, including intervention materials, assessments, and data analysis plans. Traditionally, researchers would have to reach out to other researchers to request materials and more information. Requesting information from authors may not have a high success rate for various reasons (Manca et al., 2018). For example, researchers may have changed institutions and the listed contact email address may therefore no longer be valid. By adhering to the principles of Open Science and having materials and analysis plans publicly available in a central repository, such problems could be avoided, and more replications may be performed, increasing our knowledge of the boundaries of intervention effects.
Developing and Answering Novel Research Questions
A second benefit of Open Science methods is that they can serve as a catalyst for research. While data are often collected with a particular hypothesis in mind, it is likely other research questions could be answered using different models with the same data. For example, many researchers have answered a large number of questions using data freely available from the Early Childhood Longitudinal Studies (ECLS) and the Schools and Staffing Survey (SASS). These large-scale data sets may seem fundamentally different from research project data. However, data sets from multiple projects can be combined with each other. The combination of data sets in common repositories can be a powerful tool to foster creativity and stimulate novel research questions. As an example, the combination of about 230 individual data sets on children’s language in the Child Language Data Exchange System (CHILDES), an open data repository, has led to the publication of over 5,000 articles. While it is not impossible that the original researchers might have reached this number of publications, it is more likely the repository gave other researchers a chance to explore a new question or theory.
Besides opening up data to allow others to ask novel questions, combining data can help answer questions that researchers were not able to answer in their own sample due to low numbers of participants or low rates of behavioral occurrence (Bainter & Curran, 2015; Curran & Hussong, 2009). This is especially important in research on LD. Students with LD often comprise the left tail of a normal distribution. Thus, within any given sample of students, the number of students with LD will be low. It is often time consuming or extremely costly to collect enough data on our population of interest to be able to run analyses with sufficient power. Other research groups, however, may have very similar data. Combined, these data may generate a sample of students with LD that is large enough to provide the power needed to answer novel questions.
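The pooling idea described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the three studies, the variable names (student, ld, reading), and all values are invented, and real pooled analyses would also require harmonizing measures across studies. The sketch stacks the records and keeps a study identifier so that later models can account for between-study differences.

```python
def pool_studies(studies):
    """Stack records from several studies, tagging each row with its source study."""
    pooled = []
    for study_name, records in studies.items():
        for record in records:
            row = dict(record)         # copy so the original data stay untouched
            row["study"] = study_name  # keep the source for between-study modeling
            pooled.append(row)
    return pooled


def count_ld(records):
    """Count the records flagged as LD."""
    return sum(1 for record in records if record["ld"])


# Hypothetical data: three small studies, each with few students with LD.
studies = {
    "study_a": [
        {"student": "a1", "ld": True, "reading": 81},
        {"student": "a2", "ld": False, "reading": 102},
        {"student": "a3", "ld": False, "reading": 97},
    ],
    "study_b": [
        {"student": "b1", "ld": True, "reading": 78},
        {"student": "b2", "ld": True, "reading": 84},
        {"student": "b3", "ld": False, "reading": 105},
    ],
    "study_c": [
        {"student": "c1", "ld": False, "reading": 99},
        {"student": "c2", "ld": True, "reading": 80},
        {"student": "c3", "ld": False, "reading": 110},
    ],
}

pooled = pool_studies(studies)
ld_per_study = {name: count_ld(records) for name, records in studies.items()}
ld_pooled = count_ld(pooled)

print("Students with LD per study:", ld_per_study)
print("Students with LD in pooled sample:", ld_pooled)
```

No single hypothetical study contains more than two students with LD, whereas the pooled sample contains all of them, which is the source of the gain in power.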
For example, to examine whether students’ scores on executive functioning measures predicted reading disabilities, Daucourt and colleagues (2018) combined eight different data sets. The original data sets included student achievement data from reading intervention studies. However, the original research did not include measures of executive functioning. Daucourt and colleagues sent the parents of all students an additional questionnaire that included items related to executive functioning. Their final sample included only those students from the original studies whose parents returned the additional questionnaire and consisted of about 10% of the original sample (i.e., 420 students). Around 30% (i.e., 139) of these students were considered to have a reading disability. The authors were then able to show that lower executive functioning was related to reading disability.
In this particular example, Daucourt and colleagues (2018) were able to capitalize on existing data to run a relatively low-cost study. By using the already collected reading achievement data from the original intervention studies, they avoided spending resources unnecessarily on assessments and were able to supplement these data with the parent questionnaire. In other cases, collecting additional data may not be necessary. With the same eight data sets, researchers could examine the impact of the original reading interventions on only the subset of students with reading disabilities. Whether or not additional data collection is needed, researchers will need to be able to find and access the data sets containing their variables and populations of interest.
Open Science in Practice
Adopting and adhering to Open Science practices is a crucial step toward improving lives of individuals with LD through sound research. We will now provide overviews of the main tenets of Open Science (i.e., Open Data, Open Analysis, Open Materials, Preregistration, and Open Access) and resources on best practices for each of the tenets. A guide to the actions and decisions researchers will need to make to follow Open Science practices is presented in Figure 1. In this visual overview, we present options for both projects that are already in progress and those that are still in the design phase.
In most publications, authors present summary data on the variables of interest. To increase transparency and spur reuse of these data, Open Science urges researchers to make all data, at the individual level, publicly available (Nosek et al., 2012; Nuijten, 2019). Providing Open Data involves uploading a raw, yet curated data set to an online, public repository. Many funding agencies consider publicly available data sets permanent products of the grant. Once deposited in a public repository, a data set obtains a digital object identifier (DOI) and thereby becomes a citable, permanent product of a research project. The DOI guarantees the data set remains accessible with the same identifier over longer periods of time. Researchers can include this product on their CVs and show the impact of their work beyond publications through the number of times their data set was used in secondary analyses. Recognizing the amount of time and resources that are associated with curating and archiving a data set, funding agencies such as the NIH allow part of a budget to be allocated specifically for this purpose.
There are several online repositories available for archiving data sets. Some repositories contain data from a wide range of disciplines, such as the Open Science Framework (www.osf.io), Figshare (http://www.figshare.com), and the Inter-university Consortium for Political and Social Research (ICPSR) (http://www.icpsr.umich.edu/). Increasingly, discipline-specific repositories are being developed, such as LDbase (http://www.ldbase.org/), a repository for data specific to research focused on learning differences and LD, and Databrary (https://nyu.databrary.org/), a repository for developmental video data. In addition, grant funding agencies may have their own repository. For example, the NIH supports DASH, a repository for data and specimens collected by NICHD grantees (https://dash.nichd.nih.gov/) and the National Database for Autism Research (NDAR; https://nda.nih.gov/). The NIH is currently exploring other archiving options for data sets that are not fully aligned with domain-specific repositories. A first attempt to this end is a specific NIH instance of Figshare (https://nih.figshare.com/). Finally, there is a specific repository for qualitative social science research (https://qdr.syr.edu/). Each repository may have specific requirements for depositing and storing data (e.g., allowing embargoes, allowing researchers to self-deposit, types of data files supported). Table 1 provides an overview of several such requirements for the repositories listed above.
Preparing data to be deposited in an online repository requires more than uploading files. Researchers should take several steps before data are ready to be shared. First, a raw data set needs to be curated, that is, cleaned and deidentified, before it is made available. Curating a data set involves removing all identifiable information about participants (such as names and birth dates), checking for out-of-range values, and ensuring consistency across variables (Klein et al., 2018). Second, researchers should decide which data to share. In line with the IES SEER Principles, it is not necessary for a researcher to share all data that were collected, but at a minimum all data that were used in any publications should be shared. Finally, researchers should make decisions about access options. Besides making the data available immediately, many repositories can restrict access to a data set during an embargo period or release data only to researchers who request access.
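Two of the curation steps named above, removing direct identifiers and checking for out-of-range values, can be sketched as follows. This is a minimal Python illustration, not a complete curation procedure: the column names, valid ranges, and records are hypothetical, and real deidentification usually requires more than dropping a few columns (e.g., considering combinations of variables that could identify a participant).

```python
# Hypothetical column names and valid ranges; adapt to the actual data set.
DIRECT_IDENTIFIERS = {"name", "birth_date"}
VALID_RANGES = {"grade": (0, 12), "reading_score": (0, 100)}


def deidentify(records, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the records without the direct-identifier fields."""
    return [
        {key: value for key, value in record.items() if key not in identifiers}
        for record in records
    ]


def out_of_range(records, ranges=VALID_RANGES):
    """List (row index, variable, value) for every value outside its valid range."""
    problems = []
    for index, record in enumerate(records):
        for variable, (low, high) in ranges.items():
            value = record.get(variable)
            if value is not None and not (low <= value <= high):
                problems.append((index, variable, value))
    return problems


# Invented raw records; the second row contains a data-entry error (880).
raw = [
    {"participant": "P001", "name": "Example Student A",
     "birth_date": "2010-03-14", "grade": 3, "reading_score": 88},
    {"participant": "P002", "name": "Example Student B",
     "birth_date": "2009-11-02", "grade": 4, "reading_score": 880},
    {"participant": "P003", "name": "Example Student C",
     "birth_date": "2010-07-30", "grade": 3, "reading_score": None},
]

shared = deidentify(raw)
flagged = out_of_range(shared)

print(shared[0])
print("Values to check before sharing:", flagged)
```

Flagged values are reported rather than silently changed, so that each correction can be documented as part of the analysis record.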
In addition to a raw data file, it is also imperative to include metadata. Metadata can be described as information that can support the “discovery, understanding, and stewardship” of other data (Day, 2005, p. 10). In other words, metadata can help researchers locate data sets that possibly contain information they are interested in and evaluate whether the data can be used to answer their question. For educational research, the information provided in metadata will relate mostly to the context and procedures of data collection and storage and is often made available through a codebook. A codebook contains the names of the variables, their labels, the specific text of a questionnaire, the values of each variable and their labels, how missing data are indicated for each variable, and scoring rules. If variables have been transformed before analysis, additional information about the methods used to do so might be included (“What is a Codebook?,” n.d.). Several commercial software programs exist that can generate codebooks based on survey data (e.g., StatPac). In addition, the Document, Discover, and Interoperate alliance (DDI) provides an online tool to generate interactive codebooks that can handle postprocessing and ongoing data collection (https://ddialliance.org). For R users, several packages exist that can add the metadata to data sets, such as codebook (Arslan, 2018).
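To make the contents of a codebook concrete, the sketch below builds a small one from a data set and writes it out as comma-separated text. It is a minimal Python illustration under stated assumptions and does not reproduce any of the tools named above: the variable names, labels, value labels, and the missing-data code of -99 are all hypothetical.

```python
import csv
import io

MISSING = -99  # hypothetical code used to indicate missing data

# Hypothetical variable labels and value labels supplied by the researcher.
LABELS = {
    "grade": "Grade level at pretest",
    "ld_status": "School-identified learning disability",
    "reading_score": "Reading comprehension raw score",
}
VALUE_LABELS = {"ld_status": {0: "No LD", 1: "LD"}}


def build_codebook(records, labels, value_labels, missing_code=MISSING):
    """Describe each variable: label, values or observed range, and missingness."""
    codebook = []
    for variable in records[0]:
        observed = [r[variable] for r in records if r[variable] != missing_code]
        if variable in value_labels:
            pairs = sorted(value_labels[variable].items())
            values = "; ".join(f"{code} = {text}" for code, text in pairs)
        elif observed:
            values = f"{min(observed)} to {max(observed)}"
        else:
            values = ""
        codebook.append({
            "variable": variable,
            "label": labels.get(variable, ""),
            "values": values,
            "missing_code": missing_code,
            "n_missing": len(records) - len(observed),
        })
    return codebook


def to_csv(codebook):
    """Serialize the codebook as CSV text that can be deposited with the data."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(codebook[0]))
    writer.writeheader()
    writer.writerows(codebook)
    return buffer.getvalue()


# Invented records; one reading score is missing.
records = [
    {"grade": 3, "ld_status": 1, "reading_score": 71},
    {"grade": 4, "ld_status": 0, "reading_score": -99},
    {"grade": 3, "ld_status": 0, "reading_score": 95},
]

codebook = build_codebook(records, LABELS, VALUE_LABELS)
print(to_csv(codebook))
```

Questionnaire text and scoring rules, which a full codebook would also contain, would need to be added by hand because they cannot be derived from the data file itself.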
Adhering to Open Data may seem a daunting task. The Digital Curation Centre (DCC) has helpful resources on all aspects of data management and curation, including which data to share, the best way to organize data, and how to write a good data management plan (www.dcc.ac.uk). Many academic librarians are also well versed in data curation and can be valuable resources when preparing data to be shared. Working collaboratively with experts in data management and curation will help researchers make their data findable, accessible, interoperable, and reusable.
With Open Analysis, researchers provide a detailed account of all steps taken in the statistical analysis, beginning at the raw data and ending with the final statistical results (Klein et al., 2018). Providing the complete steps of an analysis is important for several reasons. First, during any analysis, a researcher has the freedom to make choices on how to run the analysis. This freedom is sometimes called researcher degrees of freedom (Simmons et al., 2011). Different analytical decisions can lead to differences in analysis outcomes. Carefully annotating the decisions, in addition to commenting on the analysis itself, will provide the necessary details to understand the analysis for the study. In addition, this annotated workflow provides researchers who are new to a statistical analysis with an overview of the decisions that need to be made, and they may gain a deeper understanding of the specific statistical techniques. It is also possible that preparing documentation of data analysis can lead to the discovery of errors in code (Epskamp, 2019), giving authors the opportunity to rectify results.
Second, statistical software packages have different default settings for certain operations (Epskamp, 2019). This may seem problematic only for more complex and advanced statistical methods, such as structural equation models; however, even more commonly used statistical analyses are handled slightly differently in different software. Running an unbalanced analysis of variance (ANOVA) model with the default options in SPSS can yield different estimates of parameters than the default in R, because each program calculates the differences between groups based on a different combination of components, called Type I, II, or III sums of squares (see Navarro, 2017, for a detailed explanation of the different types and how they influence parameter estimates). While these differences may be small, they could lead to erroneous conclusions about irreproducibility. By providing details about the software, its version, and possible additional packages used in their workflow and write-up of the study, researchers can avoid confusion about their results.
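Why the types of sums of squares can disagree in an unbalanced design can be illustrated without any statistical package. The Python sketch below uses invented scores in a 2 x 2 design (the factor names and all values are hypothetical) and computes the difference between group means in two ways. As commonly described, when the group factor is entered first, Type I sums of squares compare group means weighted by cell size, whereas Type III sums of squares compare unweighted averages of the cell means. This is a sketch of the comparisons behind the tests, not a full ANOVA.

```python
from statistics import mean

# Invented scores for an unbalanced 2 x 2 design: (group, condition) -> scores.
cells = {
    ("LD", "control"): [70, 72, 74, 76],
    ("LD", "intervention"): [90],
    ("non_LD", "control"): [80],
    ("non_LD", "intervention"): [84, 86, 88, 90],
}


def weighted_group_mean(cells, group):
    """Mean of all scores in a group, so larger cells count more."""
    scores = [s for (g, _), values in cells.items() if g == group for s in values]
    return mean(scores)


def unweighted_group_mean(cells, group):
    """Mean of the cell means in a group, so every cell counts equally."""
    cell_means = [mean(values) for (g, _), values in cells.items() if g == group]
    return mean(cell_means)


weighted_diff = weighted_group_mean(cells, "LD") - weighted_group_mean(cells, "non_LD")
unweighted_diff = (
    unweighted_group_mean(cells, "LD") - unweighted_group_mean(cells, "non_LD")
)

print(f"Difference in weighted group means:   {weighted_diff:.1f}")
print(f"Difference in unweighted group means: {unweighted_diff:.1f}")
```

With equal cell sizes the two differences would coincide; here they do not, which is one reason the same data can yield different results under different software defaults.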
Open Analysis documentation will look different for each project. Many (commercial) statistical software programs will allow the researcher to save the syntax (e.g., Mplus, SPSS, and SAS), sometimes including annotations. Open-source statistical software, such as R, Python, and JASP, always allows a researcher to save the complete workflow with comments. It is likely data analysis will not always be shared perfectly. Researchers may be unable to share a complete workflow, or the workflow may only work on certain systems, leaving users of other systems to evaluate the analysis based on code alone (Klein et al., 2018). Rather than letting this be a deterrent to sharing altogether, it is preferable to share any part or version of a workflow that is available. If it is impossible to share syntax, for example, for data analysis performed in a spreadsheet program, researchers could share screenshots of the flow of menu options used to perform an analysis or a step-by-step description of decisions made (Epskamp, 2019; Klein et al., 2018). By sharing what is available, even if it seems scant, researchers can demystify their analysis and increase transparency.
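One small part of an Open Analysis record, the software and package versions, can be captured automatically at the end of an analysis script. The Python sketch below is one possible way to do this using only the standard library; the package names passed to the function are placeholders, and R users often rely on sessionInfo() for a similar purpose.

```python
import json
import platform
from importlib import metadata


def describe_environment(packages):
    """Collect interpreter, operating system, and package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "operating_system": platform.platform(),
        "packages": versions,
    }


# Placeholder names; list whatever packages the analysis actually imports.
environment = describe_environment(["numpy", "a-package-that-does-not-exist"])
report = json.dumps(environment, indent=2, sort_keys=True)
print(report)
```

Saving this report next to the shared syntax gives readers the version information they need to interpret any differences when rerunning the analysis.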
In many journal articles, researchers include small sample items of a measure or limited examples of study protocols, such as intervention steps or implementation fidelity checklists. However, these materials are seldom sufficient for replication of a research project. Previous page restrictions in journals may influence the limited sharing of research materials, but with the advent of repositories and cloud-based storage, it is possible for researchers to share all details of their study with other researchers (Grahe, 2018). By adhering to Open Materials, researchers add to the overall transparency of their project and give independent researchers the opportunity to carefully control the differences between their project and the original project (Grahe, 2018; Klein et al., 2018).
When sharing research materials, it is best to be as exhaustive as possible. At a minimum, all study protocols, assessments, and stimuli needed to successfully run a replication study should be uploaded (Grahe, 2018). It is likely, however, that there is a need to add specific walk-throughs or instructions for parts of the project. In the case of intervention materials, for example, it will be helpful to note the degree of flexibility an interventionist has in going off script. Additional important materials include blank informed consent forms (Lewis, 2020). If sharing materials infringes on copyright, for example, for commercialized assessments and intervention materials, these materials do not have to be provided by the researcher, given that they are openly available already (Grahe, 2018).
Providing Open Materials, particularly the most essential materials, is likely the least complicated and time-consuming of the Open Science practices. In many cases, materials have already been created and are likely stored in the project’s digital location. Most of the data repositories mentioned in the Open Data section allow researchers to add materials to their data sets for easy access. As with Open Data, repositories can assign DOIs to the materials, making them citable products of a project.
In a preregistration, researchers delineate the parameters of their study by clearly describing their hypotheses, methods for data collection, and data analysis plan in a study protocol before data analysis is conducted (van’t Veer & Giner-Sorolla, 2016). The ultimate goal of preregistration is to provide transparency on the research process. Transparency through preregistration does not imply a plan cannot be changed. On the contrary, preregistration can be an iterative process allowing researchers to specify how they responded to unforeseen challenges during research design, data collection, and analysis (Gehlbach & Robinson, 2018). For example, many researchers are currently forced to adapt research protocols due to COVID-19. In this case, an original preregistration protocol of a study examining the relation between independent reading, motivation, and LD may have included three waves of in-person data collection. Due to restrictions on face-to-face contact, the researchers changed the setting for the last wave of data collection to video conferencing. The updated protocol should specify this change and address potential implications for interpreting the outcomes of the last wave given the change in setting.
In addition, some analyses may be difficult to list specifically. For example, researchers may have a set of predictor variables to include as random variables in a hierarchical linear model based on their substantive theory. During the model building process, however, some of these variables do not appear to vary in their slope across clusters and adding the random slope does not increase the fit of the model. The researcher decides to drop these variables. The final model depends on outcomes of intermediate tests of significance. In this case, the data analysis section should consist of a clear decision-making process for the inclusion or exclusion of variables. Researchers may also list contingencies to the original analysis plan (Gehlbach & Robinson, 2018). When uploaded to a registry, preregistrations are assigned an ID number and each iteration of a preregistration receives a specific time-stamp so that the history and appropriateness of the changes can be assessed by others.
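One way to make such a decision-making process unambiguous is to write it down as an explicit rule. The sketch below assumes a likelihood ratio test between nested models with and without a random slope, 2 additional parameters (the slope variance and its covariance with the intercept), and a .05 threshold. The test, the degrees of freedom, the threshold, and the function name `keep_random_slope` are illustrative assumptions, not choices prescribed here, and the log-likelihood values are made up.

```python
import math


def keep_random_slope(loglik_without, loglik_with, alpha=0.05):
    """Hypothetical preregistered rule: retain a random slope only if the
    likelihood ratio test against the simpler model is significant.

    Assumes the larger model adds 2 parameters, so the chi-square survival
    function with 2 degrees of freedom, exp(-x / 2), gives the p value.
    """
    lr_statistic = 2.0 * (loglik_with - loglik_without)
    if lr_statistic <= 0:
        return False  # the larger model does not fit better at all
    p_value = math.exp(-lr_statistic / 2.0)
    return p_value < alpha


# Made-up log-likelihoods, for illustration only.
print(keep_random_slope(-1000.0, -990.0))  # large improvement in fit: True
print(keep_random_slope(-1000.0, -999.5))  # negligible improvement: False
```

Stating the rule in this form lets readers of the preregistration check that each variable was kept or dropped for the reason given in advance.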
Several of these registries exist, some with a wide range of topics, and others more specific. For systematic reviews and meta-analyses, for example, protocols are typically uploaded to Cochrane (https://us.cochrane.org/) or PROSPERO (https://edtechbooks.org/-vwSg). Both organizations provide extensive documentation on their sites guiding researchers through the protocol and registration process with specific templates to follow. Specifically for intervention research, registries are hosted by the Society for Research on Educational Effectiveness (SREE) (https://sreereg.icpsr.umich.edu/), OSF (www.osf.io), and AsPredicted (www.aspredicted.org). Most of these registries mainly support experimental and quasi-experimental group design studies and provide templates with guiding questions. Recently, the field of special education has also called for preregistration of single case research (A. H. Johnson & Cook, 2019) and it is certainly also possible to preregister qualitative studies.
For Registered Reports, the full introduction and methods sections of a manuscript are submitted to a journal and then go through the typical peer-review process. This process gives outside experts the opportunity to provide feedback on the design of the study, potentially signaling flaws or suggesting improvements and expansions. After this peer-review process, the journal may give a “provisional acceptance,” which means the journal will publish the study when executed according to plan regardless of the findings (Nosek et al., 2019; van’t Veer & Giner-Sorolla, 2016). With respect to education and research with LD populations in particular, several journals have specific guidelines for submitting registered reports including Exceptional Children and Scientific Studies of Reading. The Center for Open Science (COS) provides lists of other journals accepting registered reports and journals that have published special issues with registered reports (https://cos.io/rr/).
Preregistration is the most prominently featured aspect of Open Science in the SEER Principles. The SEER principles focus on the comparison between what was proposed and what was eventually done and reported. Besides promoting transparency, making protocols available before the start of a research project helps to distinguish between outcomes that were hypothesized before a study began (i.e., confirmatory results) and exploratory results that arose from unexpected patterns in the data. The exploratory results might warrant subsequent confirmatory research especially designed to test the new hypothesis. This distinction between confirmatory and exploratory outcomes is the main benefit of preregistration and registered reports (Cook et al., 2018; van’t Veer & Giner-Sorolla, 2016). This does not mean that exploratory analyses are precluded from research. On the contrary, Open Science values exploratory analyses as a means to find unexpected results. These analyses and results should merely be noted as exploratory.
Open Access refers to making research reports publicly available without a subscription barrier (Klein et al., 2018; Norris et al., 2008). For many researchers, reading about a certain method, data set, or intervention in a paper is the stimulus to examine the issues more carefully and possibly to conduct direct or conceptual replications (Kraker et al., 2011). When research is presented Open Access, more researchers will have the opportunity to engage with the research.
Grant funding agencies already expect research articles to become available to the public and have their own outlets. In the case of research sponsored by IES, it is expected that papers are made available to the public through ERIC; NIH grantees use PubMed Central, and NSF uses its own public access repository, NSF-PAR. Research in several different areas has shown that articles published Open Access (either through the journal or through self-archiving) get cited more often than articles behind a paywall (e.g., Eysenbach, 2006; Metcalfe, 2006; Norris et al., 2008). In general, there are two ways to share manuscripts with the public. Using the Green way, researchers post their work on preprint archives; using the Gold way, researchers either publish in a fully Open Access journal, or pay additional fees to the publishing journal to make the manuscript Open Access (Harnad et al., 2004). These fees differ per journal and can be as high as US$3,000, with an average cost of about US$900 (Solomon & Björk, 2012). To help researchers with the cost of making research Open Access, many universities now have grant programs specific to this purpose. In addition, costs for Gold access in journals can be written into budget justifications of major grants.
Many of the journals in which research on LD is published allow researchers to post preprints and postprints (e.g., Exceptional Children, Exceptionality, Learning Disability Quarterly, Journal of Learning Disabilities, The Journal of Special Education). The website hosted by SHERPA/ROMEO (https://edtechbooks.org/-DDfk) has information on the archiving policies for most journals related to education science and LD, as well as their access options. Preprints can take the form of a near-final version of a manuscript that has been submitted, or the final version accepted for publication. Some journals require a preprint to be the unformatted version of the manuscript. Several archives exclusively hosting preprints exist. EdArXiv is a recently established archive for educational preprints associated with the OSF repository, and authors can link preprints hosted on EdArXiv to their OSF projects. There are several benefits of posting a preprint to an online archive. First, all papers that are archived receive a DOI and thus can be cited and referenced before the lengthy peer-review process begins. This speeds up the impact our science can have. Relatedly, the archives also track the number of downloads and citations of these papers. More importantly, the archives allow researchers to protect their work legally by assigning it a license, such as a Creative Commons license (https://edtechbooks.org/-sihS). Even if papers are theoretical, purely exploratory, or were not written with Open Science in mind from the start, authors can make sure their work is accessible to all by posting preprints. See Fleming (2020) for a useful flowchart on the decisions on posting preprints.
It may seem that the benefits of adhering to Open Science practices are limited mostly to grant-funded research. The present focus on grant requirements served as a narrative thread to show how education science is adapting toward more Open Science practices. In fact, it is equally important and beneficial for unfunded research to become more open. For example, it is likely that these projects are conducted with smaller sample sizes. Studies with small sample sizes are more prone to Type I error, that is, reporting a statistically significant effect that occurred by chance (Simmons et al., 2011). Preregistering a study with a small sample provides transparency on the hypothesized relation and data analysis, making it easier for others to interpret the reliability of the results. In addition, data sets from several unfunded studies can be combined into a larger, unified data set.
How can LD researchers without current projects adhere to and promote Open Science practices? First, it is never too late to share data and materials from previous projects, regardless of whether they were used in a publication. Even if a specific intervention did not yield statistically significant increases in students’ abilities, the data still contain valuable information about the student population that might be of interest to others and that could potentially be combined with other existing data. In addition, researchers conducting meta-analyses may be interested in using unpublished studies to combat outcomes skewed through publication bias (Rothstein & Hopewell, 2009). Increased precision in meta-analytic effect sizes will provide better estimates of the potential of an intervention, which in turn may limit the implementation of interventions that do not benefit students with LD. Similar to data, sharing materials from studies that have concluded can be valuable. This can provide opportunities for early career researchers or researchers with less access to funding to conduct small replication studies without having to spend resources on developing already existing materials. This can increase the research output in the LD field, hopefully resulting in more robust knowledge on interventions and their generalizability in less time.
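To illustrate why including unpublished studies can change both the size and the precision of a meta-analytic estimate, the sketch below pools effect sizes with fixed-effect inverse-variance weights. The weighting scheme is an assumption chosen for simplicity (many meta-analyses use random-effects models instead), the function name `pooled_effect` is hypothetical, and all effect sizes and variances are made-up numbers.

```python
import math


def pooled_effect(effects, variances):
    """Fixed-effect (inverse-variance) pooled estimate and its standard error."""
    weights = [1.0 / v for v in variances]
    estimate = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    standard_error = math.sqrt(1.0 / sum(weights))
    return estimate, standard_error


# Hypothetical standardized effects and sampling variances, for illustration only.
published_effects, published_variances = [0.5, 0.4], [0.04, 0.04]
unpublished_effects, unpublished_variances = [0.0, 0.1], [0.04, 0.04]

# Published studies only.
print(pooled_effect(published_effects, published_variances))

# Published and unpublished studies together.
print(pooled_effect(
    published_effects + unpublished_effects,
    published_variances + unpublished_variances,
))
```

With these made-up numbers, adding the unpublished studies lowers the pooled estimate and shrinks its standard error, because every additional study adds weight to the denominator.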
Second, researchers can actively promote the culture shift toward Open Science practices. One way to encourage new norms is by talking about them in conversations with colleagues. For example, when collaboratively planning a new study, researchers can raise the possibility of preregistration or even propose replication research with openly available materials. Moreover, researchers can advocate for including discussion of these practices in research methodology courses offered to graduate students (Gehlbach & Robinson, 2018).
Finally, the review processes for both grant proposals and manuscripts submitted for publication offer further opportunities to promote Open Science practices. Reviewers of manuscripts can ask to see data and analyses (Davis et al., 2018), attempt to rerun the provided analyses to see to what degree the results are reproducible (Kraker et al., 2011), check previous studies or studies that are highly similar to compare outcomes (Kraker et al., 2011), and check if preprints or preregistration files are available to compare the proposed analyses with those reported. In the case of grant proposals, reviewers can check how investigators plan to share data, outcomes, and materials after termination of their project.
“The goal of intervention research in special education is to identify effective practices for students with disabilities and accumulate rigorous and trustworthy evidence about the conditions under which these practices are more or less effective” (Coyne et al., 2016, pp. 251–252). By embracing the central tenets of Open Science (Open Data, Open Analysis, Open Materials, Preregistration, and Open Access), researchers in LD can create an environment more conducive to this goal. Open Data makes it possible to combine data sets and answer hitherto impossible questions; Open Analysis helps other researchers rerun analyses to verify outcomes and learn to program complex models; Open Materials lets other researchers replicate studies with more precision; Preregistration allows for improvements in design before a study is executed, increasing the overall quality of the work and transparency about research decisions; and Open Access provides a larger audience for important work. Together, the tenets of Open Science can give an impetus to a more collaborative effort that will ultimately benefit the education and lives of individuals with LD.
Views expressed herein are those of the authors and have neither been reviewed nor approved by the granting agencies.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by Eunice Kennedy Shriver National Institute of Child Health & Human Development Grants HD052120 and HD095193.
Wilhelmina van Dijk https://edtechbooks.org/-zGX
Arslan, R. C. (2018). How to automatically generate rich codebooks from study metadata. PsyArXiv. https://doi.org/10.31234/osf.io/5qc6h
Bainter, S. A., Curran, P. J. (2015). Advantages of integrative data analysis for developmental research. Journal of Cognition and Development, 16(1), 1–10. https://doi.org/10.1080/15248372.2013.871721
Begg, C., Cho, M., Eastwood, S., Horton, R., Moher, D., Olkin, I., … Stroup, D. F. (1996). Improving the quality of reporting of randomized controlled trials: The CONSORT statement. The Journal of the American Medical Association, 276(8), 637–639. https://doi.org/10.1001/jama.1996.03540080059030
Bisco, R. L. (1966). Social science data archives: A review of developments. The American Political Science Review, 60(1), 93–109. https://doi.org/10.2307/1953810
Brantlinger, E., Jimenez, R., Klingner, J., Pugach, M., Richardson, V. (2005). Qualitative studies in special education. Exceptional Children, 71(2), 195–207. https://doi.org/10.1177/001440290507100205
Cook, B. G. (2016). Reforms in academic publishing: Should behavioral disorders and special education journals embrace them? Behavioral Disorders, 41(3), 161–172. https://doi.org/10.17988/0198-7429-41.3.161
Cook, B. G., Lloyd, J. W., Mellor, D., Nosek, B. A., Therrien, W. J. (2018). Promoting Open Science to increase the trustworthiness of evidence in special education. Exceptional Children, 85(1), 104–118. https://doi.org/10.1177/0014402918793138
Coyne, M. D., Cook, B. G., Therrien, W. J. (2016). Recommendations for replication research in special education: A framework of systematic, conceptual replications. Remedial and Special Education, 37(4), 244–253. https://doi.org/10.1177/0741932516648463
Curran, P. J., Hussong, A. M. (2009). Integrative data analysis: The simultaneous analysis of multiple data sets. Psychological Methods, 14(2), 81–100. https://doi.org/10.1037/a0015914
Daucourt, M. C., Schatschneider, C., Connor, C. M., Al Otaiba, S., Hart, S. A. (2018). Inhibition, updating working memory, and shifting predict reading disability symptoms in a hybrid model: Project KIDS. Frontiers in Psychology, 9, Article 238. https://doi.org/10.3389/fpsyg.2018.00238
David, P. A. (1994). Positive feedbacks and research productivity in science: Reopening another black box. In Granstrand, O. (Ed.), Economics of Technology (pp. 65–89). Elsevier.
Davis, W. E., Giner-Sorolla, R., Lindsay, D. S., Lougheed, J. P., Makel, M. C., Meier, M. E., Sun, J., Vaughn, L. A., Zelenski, J. M. (2018). Peer-review guidelines promoting replicability and transparency in psychological science. Advances in Methods and Practices in Psychological Science, 1(4), 556–573. https://doi.org/10.1177/2515245918806489
Day, M. (2005). Metadata. In Ross, S., Day, M. (Eds.), DCC Digital Curation Manual. http://www.dcc.ac.uk/resources/curation-reference-manual/completed-chapters/metadata
Doabler, C. T., Clarke, B., Kosty, D., Kurtz-Nelson, E., Fien, H., Smolkowski, K., Baker, S. K. (2019). Examining the impact of group size on the treatment intensity of a tier 2 mathematics intervention within a systematic framework of replication. Journal of Learning Disabilities, 52(2), 168–180. https://doi.org/10.1177/0022219418789376
Epskamp, S. (2019). Reproducibility and replicability in a fast-paced methodological world. Advances in Methods and Practices in Psychological Science, 2(2), 145–155. https://doi.org/10.1177/2515245919847421
Eysenbach, G. (2006). Citation advantage of Open Access articles. PLoS Biology, 4(5), e157. https://doi.org/10.1371/journal.pbio.0040157
Fleming, J. I. (2020, April 30). How to post a preprint flowchart. EdArXiv. https://doi.org/10.35542/osf.io/2jr68
Gehlbach, H., Robinson, C. D. (2018). Mitigating illusory results through preregistration in education. Journal of Research on Educational Effectiveness, 11(2), 296–315. https://doi.org/10.1080/19345747.2017.1387950
Gersten, R., Fuchs, L. S., Compton, D., Coyne, M. D., Greenwood, C., Innocenti, M. S. (2005). Quality indicators for group experimental and quasi-experimental research in special education. Exceptional Children, 71(2), 149–164. https://doi.org/10.1177/001440290507100202
Gersten, R., Rolfhus, E., Clarke, B., Decker, L. E., Wilkins, C., Dimino, J. (2015). Intervention for first graders with limited number knowledge: Large-scale replication of a randomized controlled trial. American Educational Research Journal, 52(3), 516–546. https://doi.org/10.3102/0002831214565787
Grahe, J. (2018). Another step towards scientific transparency: Requiring research materials for publication. The Journal of Social Psychology, 158(1), 1–6. https://doi.org/10.1080/00224545.2018.1416272
Harnad, S., Brody, T., Vallières, F., Carr, L., Hitchcock, S., Gingras, Y., Oppenheim, C., Stamerjohanns, H., Hilf, E. R. (2004). The access/impact problem and the green and gold roads to open access. Serials Review, 30(4), 310–314.
Horner, R. H., Carr, E. G., Halle, J., McGee, G., Odom, S., Wolery, M. (2005). The use of single-subject research to identify evidence-based practice in special education. Exceptional Children, 71(2), 165–179. https://doi.org/10.1177/001440290507100203
Johnson, A. H., Cook, B. G. (2019, July 8). Preregistration in single-case design research. EdArXiv. [Preprint]. https://doi.org/10.35542/osf.io/rmvgc
Johnson, T. L., Hancock, G. R. (2019). Time to criterion latent growth models. Psychological Methods, 24(6), 690–707.
Kaplan, R. M., Irvin, V. L. (2015). Likelihood of null effects of large NHLBI clinical trials has increased over time. PLOS ONE, 10(8), Article e0132382. https://doi.org/10.1371/journal.pone.0132382
Klein, O., Hardwicke, T. E., Aust, F., Breuer, J., Danielsson, H., Mohr, A. H., Ijzerman, H., Nilsonne, G., Vanpaemel, W., Frank, M. C. (2018). A practical guide for transparency in psychological science. Collabra: Psychology, 4(1), Article 20. https://doi.org/10.1525/collabra.158
Kraker, P., Leony, D., Reinhardt, W., Beham, G. (2011). The case for an open science in technology enhanced learning. International Journal of Technology Enhanced Learning, 3(6), 643–654. https://doi.org/10.1504/IJTEL.2011.045454
Lewis, N. A. (2020). Open communication science: A primer on why and some recommendations for how. Communication Methods and Measures, 14(2), 71–82. https://doi.org/10.1080/19312458.2019.1685660
Makel, M. C., Plucker, J. A., Freeman, J., Lombardi, A., Simonsen, B., Coyne, M. (2016). Replication of special education research: Necessary but far too rare. Remedial and Special Education, 37(4), 205–212. https://doi.org/10.1177/0741932516646083
Manca, A., Cugusi, L., Dvir, Z., Deriu, F. (2018). Non-corresponding authors in the era of meta-analyses. Journal of Clinical Epidemiology, 98, 159–161.
Meehl, P. E. (1967). Theory-testing in psychology and physics: A methodological paradox. Philosophy of Science, 34(2), 103–115. https://doi.org/10.1086/288135
Metcalfe, T. S. (2006). The citation impact of digital preprint archives for Solar Physics papers. Solar Physics, 239, 549–553. https://doi.org/10.1007/s11207-006-0262-7
National Research Council. (2003). The purpose of publication and responsibilities for sharing. In Sharing publication-related data and materials: Responsibilities of authorship in the life sciences. https://doi.org/10.17226/10613
Navarro, D. (2017). Learning statistics with R. https://learningstatisticswithr.com/lsr-0.6.pdf
Norris, M., Oppenheim, C., Rowland, F. (2008). The citation advantage of open-access articles. Journal of the American Society for Information Science and Technology, 59(12), 1963–1972. https://doi.org/10.1002/asi.20898
Nosek, B. A., Beck, E. D., Campbell, L., Flake, J. K., Hardwicke, T. E., Mellor, D. T., van ’t Veer, A. E., Vazire, S. (2019, August 14). Preregistration is hard, and worthwhile. PsyArXiv. [Preprint]. https://doi.org/10.31234/osf.io/wu3vs
Nosek, B. A., Errington, T. M. (2019, September 10). What is replication? MetaArXiv. [Preprint]. https://doi.org/10.31222/osf.io/u4g6t
Nosek, B. A., Spies, J. R., Motyl, M. (2012). Scientific utopia: II. Restructuring incentives and practices to promote truth over publishability. Perspectives on Psychological Science, 7(6), 615–631.
Nuijten, M. B. (2019). Practical tools and strategies for researchers to increase replicability. Developmental Medicine & Child Neurology, 61(5), 535–539. https://doi.org/10.1111/dmcn.14054
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), Article aac4716.
Peterson, L., Homer, A. L., Wonderlich, S. A. (1982). The integrity of independent variables in behavior analysis. Journal of Applied Behavior Analysis, 15(4), 477–492. https://doi.org/10.1901/jaba.1982.15-477
Plint, A. C., Moher, D., Morrison, A., Schulz, K., Altman, D. G., Hill, C., Gaboury, I. (2006). Does the CONSORT checklist improve the quality of reports of randomised controlled trials? A systematic review. Medical Journal of Australia, 185(5), 263–267. https://doi.org/10.5694/j.1326-5377.2006.tb00557.x
Pocock, S. J., Hughes, M. D., Lee, R. J. (1987). Statistical problems in the reporting of clinical trials. The New England Journal of Medicine, 317(7), 426–432. http://dx.doi.org/10.1056/NEJM198708133170706
Rothstein, H. R., Hopewell, S. (2009). Grey literature. In Cooper, H., Hedges, L. V., Valentine, J. C. (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 103–125). Russell Sage Foundation.
Simmons, J. P., Nelson, L. D., Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359–1366. https://doi.org/10.1177/0956797611417632
Solomon, D. J., Björk, B.-C. (2012). A study of open access journals using article processing charges. Journal of the American Society for Information Science and Technology, 63(8), 1485–1495. https://doi.org/10.1002/asi.22673
Taubes, G., Mann, C. C. (1995). Epidemiology faces its limits. Science, 269(5221), 164–169.
Thompson, B., Diamond, K. E., McWilliam, R., Snyder, P., Snyder, S. W. (2005). Evaluating the quality of evidence from correlational research for evidence-based practice. Exceptional Children, 71(2), 181–194. https://doi.org/10.1177/001440290507100204
Toste, J. R., Capin, P., Williams, K. J., Cho, E., Vaughn, S. (2019). Replication of an experimental study investigating the efficacy of a multisyllabic word reading intervention with and without motivational beliefs training for struggling readers. Journal of Learning Disabilities, 52(1), 45–58. https://doi.org/10.1177/0022219418775114
U.S. Department of Education, Institute of Education Sciences. (2018). Standards for excellence in education research. https://ies.ed.gov/seer.asp
U.S. Department of Education, What Works Clearinghouse. (n.d.). Find what works! https://ies.ed.gov/ncee/wwc/FWW
van der Zee, T., Reich, J. (2018). Open education science. AERA Open, 4(3), 1–15. https://doi.org/10.1177/2332858418787466
van’t Veer, A. E., Giner-Sorolla, R. (2016). Pre-registration in social psychology—A discussion and suggested template. Journal of Experimental Social Psychology, 67, 2–12. https://doi.org/10.1016/j.jesp.2016.03.004
What is a Codebook? (n.d.). ICPSR. https://www.icpsr.umich.edu/web/ICPSR/cms/1983
Suggested Citation: (2021). Open Science in Education Sciences. In An Introduction to Open Education. EdTech Books. https://edtechbooks.org/open_education/open_science_in_educ
CC BY-NC: This work is released under a CC BY-NC license, which means that you are free to do with it as you please as long as you (1) properly attribute it and (2) do not use it for commercial gain.