One of the cornerstones of modern science is the criterion of replicability. Replicability is the idea that if a phenomenon is real, it should be possible to demonstrate it repeatedly and on demand. If a team of researchers conducts an experiment to test a hypothesis and finds a positive result, then other researchers should get similar results if they conduct experiments on the same topic. This is the basis of almost any scientific field that uses experiments to test hypotheses. For instance, we can be reasonably sure that gravity exists because many studies on the orbits of planets, gravity’s relationship to weight, and other aspects of the phenomenon have generally produced consistent observations and data.
Replication is essential to the process of science because it allows scientists to separate good hypotheses from bad ones. Individual studies may yield inaccurate results for a number of reasons. Sometimes a study is biased, skewing the results and making them appear more or less significant than they actually are. Sometimes the participants don’t reflect the overall population, leading to results that can’t easily be applied to all people. Sometimes the study simply finds an illusory correlation that doesn’t exist in the real world. Replication of original studies helps researchers confirm that an effect is real and further examine how it works. If an idea is accurate, many replications will produce consistent evidence of its existence. If not, replications will show weak or conflicting information, leading researchers to drop the original idea and explore other ones.
However, within roughly the last decade and a half, the concept of replication has caused quite a bit of turmoil in the field of psychology. Researchers have looked back on the history of psychology research and realized that many studies, including some that underpin major ideas in the field, have either not been successfully replicated or failed to replicate as often as expected (Shrout and Rodgers 2018). In some cases, replication studies were performed but produced results that conflicted with the original findings. In others, the replications didn’t follow the methods of the original study, leaving open the possibility that the original results were artifacts of experimental design. Finally, for a few concepts, there have been no replication attempts at all! How did this situation come to be?
History
Lack of replication has been noted for decades in psychological research (and in some other fields as well). However, several events in the early 2010s put psychology in the spotlight and made the problem impossible to ignore.
The first event to make headlines was the academic response to a 2011 study on precognition, the idea that some people have the ability to predict the future (Wiggins and Christopherson 2019). The study’s author claimed to have produced clear evidence that precognition exists, a claim other researchers found questionable both because of its outlandish nature and because of apparent errors in how the experiment was conducted. A few replication studies were carried out and found no evidence for precognition. However, several scientific journals rejected the replications as unimportant, even though they clarified the likelihood of a controversial idea. Anyone looking through the psychology literature on precognition at that time would have seen only the poorly conducted initial study in favor of precognition, not the more numerous replications that rejected its findings.
Later that year, Diederik Stapel, a scientist who had authored many psychology studies, was proven to have committed fraud in most of them, including outright fabrication of data. Although not directly related to replication, this event shocked the psychology community and demonstrated how easily bad data could become part of the scientific literature (Wiggins and Christopherson 2019). How had Stapel been able to lie for so long? Where had research and publication protocols failed? Perhaps if someone had checked his research more thoroughly, such as by trying to replicate it, the fraud would have been discovered as soon as it started.
Finally, several researchers realized that many psychology studies suffered from problematic research practices, including analytic choices that made null results appear significant. They took it upon themselves to expose these problems by publishing an article in Psychological Science appropriately titled “False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant.” In the article, they used real experiments and accepted statistical methods to “prove” that a person can de-age themselves by listening to a Beatles song (Wiggins and Christopherson 2019). Despite the ridiculous conclusion, the studies had been conducted according to the research norms of the time, and every statistical method used was common, accepted practice. The article demonstrated how easily ordinary research practices could be misused.
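To see why this flexibility is so dangerous, consider a minimal simulation (an illustrative sketch with made-up numbers, not one of the paper’s actual analyses). A hypothetical researcher compares two groups on four outcome measures even though no true difference exists, then reports whichever comparison happens to reach significance:

```python
# Sketch of how undisclosed analytic flexibility inflates false positives.
# Hypothetical setup: no true group difference, four measured outcomes,
# and the "best" p-value is the one reported.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
N_SIMS, N_PER_GROUP, N_OUTCOMES, ALPHA = 10_000, 20, 4, 0.05

false_positives = 0
for _ in range(N_SIMS):
    # Both groups come from the same distribution: the null is true.
    a = rng.normal(size=(N_PER_GROUP, N_OUTCOMES))
    b = rng.normal(size=(N_PER_GROUP, N_OUTCOMES))
    pvals = stats.ttest_ind(a, b).pvalue  # one t-test per outcome
    if pvals.min() < ALPHA:               # cherry-pick the best outcome
        false_positives += 1

print(f"Nominal false-positive rate: {ALPHA:.2f}")
print(f"Actual false-positive rate:  {false_positives / N_SIMS:.2f}")  # ~0.19
```

Each individual test keeps its nominal 5% error rate, but the freedom to choose among four outcomes after the fact nearly quadruples the chance of reporting a spurious “finding.”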
These events opened the floodgates to a wave of doubt about the validity of psychology research, and the more people looked at the problem, the more issues they found. One report estimated the field’s replication rate at somewhere between 36% and 47% (Wiggins and Christopherson 2019).
One area of psychology that was hit particularly hard was the concept of priming. Priming is the idea that exposure to certain sights or events makes people more likely to respond in ways that relate to what they just experienced. For instance, if a person saw a picture of a couch and was then asked to fill in the blank letters in “S_ _ a,” they would likely say “sofa” instead of “soda,” “saga,” or other words that also fit the letters given. Basic effects like this are relatively uncontroversial, but many proponents of priming have claimed to find much more extensive influences. One study published in the 1990s claimed that participants who were exposed to words associated with old age walked more slowly afterwards, as if affected by the stereotype that old people are slow (Lunbeck 2025). Academics were quick to conclude that people are so heavily influenced by their environment that even simple actions like walking are predetermined and not under our control! However, follow-up experiments found nothing of the sort.
One more point is worth noting: although the most famous examples of failed replication involve hypotheses that are particularly extreme or unlikely, more ordinary research can also fall afoul of reproducibility norms. In fact, one of the most worrying implications of the replication crisis is that mundane inaccuracies are flying under the scientific method’s radar, and researchers don’t realize the need to weed them out. It’s easy to doubt that humans are mindless robots programmed by the words they read, but harder to reconsider ideas that seem intuitive and logical.
Potential Solutions
So how can this issue be fixed? There is no one solution, but a number of measures can greatly improve the reliability of psychological research and end the replication crisis.
- Preregistration of studies: Some academic journals now require researchers to “register” their experiments before actually conducting them. The researchers have to share their topic, methods (the protocols they will use for the study), and hypotheses, and stick to them during the testing period. If they change any of the details, especially in their methods, the journal may reject the study. This measure encourages transparency in how data is collected and discourages unscrupulous authors from changing the methods partway through an experiment to manipulate results. In the strongest version of this system, known as registered reports, journals decide whether to accept a study before its results are known, which also cuts back on the practice of publishing studies simply because they support popular ideas.
- Better measurement and statistical hygiene: This reform is related to replication only indirectly, but it is nonetheless extremely important. Using more precise measurements, collecting samples large enough to detect the effects under study, and analyzing the resulting data honestly will cut down on errors and yield more realistic conclusions (see the simulation after this list). These measures produce more honest and accurate results, which go hand in hand with replicability.
- Sharing of specific methods: A major critique of psychological research papers is that they don’t always include enough details about their methods to allow for replication. As a result, follow-up studies may not be close enough to the original to accurately test the phenomenon under study. A simple fix for this issue is to require more specific reporting of data collection procedures, allowing any lab to perform exact replications.
- Publishing more replications and rewarding researchers for replicating significant studies: At the institutional level, there needs to be a significant change in what scientific journals consider to be worth publishing. Currently, many journals are much more willing to accept studies that show new and interesting findings than replications of those findings, and research that disproves or fails to find evidence for particular hypotheses is often rejected even if the methods are sound (a phenomenon called the “file drawer effect”). This issue is severe enough that the replication crisis probably cannot be solved without a new commitment to publishing replication studies, including ones with negative results. Proposed fixes include rewarding researchers for transparency in their methods (allowing for replication), relaxing publication standards to include more replications and negative studies, and even requiring journals to publish replications of studies that they previously accepted.
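To make the statistical-hygiene point above concrete, here is a minimal power simulation (the effect size, sample sizes, and setup are illustrative assumptions, not figures from the cited sources). It shows that underpowered studies not only miss real effects most of the time but also exaggerate the effects they do detect:

```python
# Sketch of why sample size matters: a true but modest effect (d = 0.4)
# exists, yet small studies rarely detect it, and the "significant" runs
# overestimate it. All numbers here are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
TRUE_D, ALPHA, N_SIMS = 0.4, 0.05, 10_000

for n in (20, 100, 300):  # participants per group
    significant_effects = []
    for _ in range(N_SIMS):
        treatment = rng.normal(TRUE_D, 1.0, n)
        control = rng.normal(0.0, 1.0, n)
        if stats.ttest_ind(treatment, control).pvalue < ALPHA:
            significant_effects.append(treatment.mean() - control.mean())
    power = len(significant_effects) / N_SIMS
    print(f"n={n:3d}: power={power:.2f}, "
          f"mean significant effect={np.mean(significant_effects):.2f} "
          f"(true effect 0.40)")
```

Under these assumptions, the small studies that do reach significance report effects nearly twice the true size, exactly the kind of inflated result that a larger replication would then “fail” to reproduce.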
It’s also worth noting that some observers argue that the replication crisis was never very serious in the first place, or point to the positive changes it has prompted; this is why it is sometimes called a “credibility revolution” instead (Korbmacher et al. 2023). Although the situation may look bleak, researchers have begun implementing reforms: encouraging open-access research, incentivizing high-quality studies over sheer quantity, creating new guidelines for the journal editors who decide whether to accept or reject research, producing better review articles that summarize many studies, and more. There is still a long way to go, but these measures can ameliorate many of the problems that plague psychology and other fields and create a better future for science.
Ultimately, the replication crisis in psychology demonstrates many of the challenges that science faces today and points toward ways it can be improved. There are serious issues in academia’s publication practices and its ability to prevent fraud, but these problems are not unfixable. Better research practices and a unified effort by scientific journals can end the replication crisis and help ensure a psychological science that is more replicable, accurate, and open for all.
Sources:
Korbmacher, Max, et al. “The Replication Crisis Has Led to Positive Structural, Procedural, and Community Changes.” Communications Psychology, vol. 1, article 3, 25 July 2023, https://doi.org/10.1038/s44271-023-00003-2.
Lunbeck, Elizabeth. “Failure to Replicate.” Review of Anatomy of a Train Wreck: The Rise and Fall of Priming Research, by Ruth Leys (University of Chicago Press, 2024), Science, vol. 387, no. 6730, 9 Jan. 2025, p. 145, https://doi.org/10.1126/science.adu0370.
Shrout, Patrick E., and Joseph L. Rodgers. “Psychology, Science, and Knowledge Construction: Broadening Perspectives from the Replication Crisis.” Annual Review of Psychology, vol. 69, no. 1, 4 Jan. 2018, pp. 487–510, https://doi.org/10.1146/annurev-psych-122216-011845.
Wiggins, Bradford J., and Cody D. Christopherson. “The Replication Crisis in Psychology: An Overview for Theoretical and Philosophical Psychology.” Journal of Theoretical and Philosophical Psychology, vol. 39, no. 4, Nov. 2019, pp. 202–217, https://doi.org/10.1037/teo0000137.


