
Guide to Experimental Design | Overview, 5 steps & Examples

Published on December 3, 2019 by Rebecca Bevans. Revised on June 21, 2023.

Experiments are used to study causal relationships. You manipulate one or more independent variables and measure their effect on one or more dependent variables.

Experimental design creates a set of procedures to systematically test a hypothesis. A good experimental design requires a strong understanding of the system you are studying.

There are five key steps in designing an experiment:

  • Consider your variables and how they are related
  • Write a specific, testable hypothesis
  • Design experimental treatments to manipulate your independent variable
  • Assign subjects to groups, either between-subjects or within-subjects
  • Plan how you will measure your dependent variable

For valid conclusions, you also need to select a representative sample and control any extraneous variables that might influence your results. If random assignment of participants to control and treatment groups is impossible, unethical, or highly difficult, consider an observational study instead. This minimizes several types of research bias, particularly sampling bias, survivorship bias, and attrition bias over time.

Table of contents

  • Step 1: Define your variables
  • Step 2: Write your hypothesis
  • Step 3: Design your experimental treatments
  • Step 4: Assign your subjects to treatment groups
  • Step 5: Measure your dependent variable
  • Other interesting articles
  • Frequently asked questions about experiments

Step 1: Define your variables

You should begin with a specific research question. We will work with two research question examples, one from health sciences and one from ecology:

To translate your research question into an experimental hypothesis, you need to define the main variables and make predictions about how they are related.

Start by simply listing the independent and dependent variables.

| Research question | Independent variable | Dependent variable |
| --- | --- | --- |
| Phone use and sleep | Minutes of phone use before sleep | Hours of sleep per night |
| Temperature and soil respiration | Air temperature just above the soil surface | CO₂ respired from soil |

Then you need to think about possible extraneous and confounding variables and consider how you might control them in your experiment.

| Research question | Extraneous variable | How to control |
| --- | --- | --- |
| Phone use and sleep | Natural variation in sleep patterns among individuals | Measure the average difference between sleep with phone use and sleep without phone use rather than the average amount of sleep per treatment group. |
| Temperature and soil respiration | Soil moisture also affects respiration, and moisture can decrease with increasing temperature | Monitor soil moisture and add water to make sure that soil moisture is consistent across all treatment plots. |

Finally, you can put these variables together into a diagram. Use arrows to show the possible relationships between variables and include signs to show the expected direction of the relationships.

Diagram of the relationship between variables in a sleep experiment

Here we predict that increasing temperature will increase soil respiration and decrease soil moisture, while decreasing soil moisture will lead to decreased soil respiration.


Step 2: Write your hypothesis

Now that you have a strong conceptual understanding of the system you are studying, you should be able to write a specific, testable hypothesis that addresses your research question.

| Research question | Null hypothesis (H₀) | Alternate hypothesis (Hₐ) |
| --- | --- | --- |
| Phone use and sleep | Phone use before sleep does not correlate with the amount of sleep a person gets. | Increasing phone use before sleep leads to a decrease in sleep. |
| Temperature and soil respiration | Air temperature does not correlate with soil respiration. | Increased air temperature leads to increased soil respiration. |

The next steps will describe how to design a controlled experiment. In a controlled experiment, you must be able to:

  • Systematically and precisely manipulate the independent variable(s).
  • Precisely measure the dependent variable(s).
  • Control any potential confounding variables.

If your study system doesn’t match these criteria, there are other types of research you can use to answer your research question.

Step 3: Design your experimental treatments

How you manipulate the independent variable can affect the experiment’s external validity – that is, the extent to which the results can be generalized and applied to the broader world.

First, you may need to decide how widely to vary your independent variable. In the soil respiration experiment, for example, you could increase the air temperature:

  • just slightly above the natural range for your study region.
  • over a wider range of temperatures to mimic future warming.
  • over an extreme range that is beyond any possible natural variation.

Second, you may need to choose how finely to vary your independent variable. Sometimes this choice is made for you by your experimental system, but often you will need to decide, and this will affect how much you can infer from your results.

In the phone use experiment, for example, you could treat phone use as:

  • a categorical variable: either as binary (yes/no) or as levels of a factor (no phone use, low phone use, high phone use).
  • a continuous variable (minutes of phone use measured every night).
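To make the distinction concrete, here is a minimal sketch in Python (pandas assumed; the data values and minute cutoffs are hypothetical, not from the article) showing the same nightly measurements coded both ways:

```python
import pandas as pd

# Hypothetical nightly measurements: minutes of phone use before sleep.
log = pd.DataFrame({"minutes_phone_use": [0, 12, 25, 47, 65, 90, 120]})

# Continuous coding: keep the raw minutes as-is.
log["phone_use_minutes"] = log["minutes_phone_use"]

# Categorical coding: bin the same measurements into factor levels.
# The 0- and 60-minute cutoffs are illustrative assumptions only.
log["phone_use_level"] = pd.cut(
    log["minutes_phone_use"],
    bins=[-1, 0, 60, float("inf")],
    labels=["no phone use", "low phone use", "high phone use"],
)
print(log)
```

Which coding you choose constrains the later analysis: factor levels suit group comparisons such as ANOVA, while raw minutes suit correlation or regression.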

Step 4: Assign your subjects to treatment groups

How you apply your experimental treatments to your test subjects is crucial for obtaining valid and reliable results.

First, you need to consider the study size: how many individuals will be included in the experiment? In general, the more subjects you include, the greater your experiment’s statistical power, which determines how much confidence you can have in your results.
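For example, a conventional power analysis can estimate how many subjects each group needs before you collect any data. This sketch uses the statsmodels library; the effect size, significance level, and target power are standard planning assumptions rather than values from the article:

```python
from statsmodels.stats.power import TTestIndPower

# Assumed planning values: a medium effect (Cohen's d = 0.5),
# significance level alpha = 0.05, and a target power of 0.80.
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(f"Subjects needed per group: {n_per_group:.0f}")  # about 64
```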

Then you need to randomly assign your subjects to treatment groups. Each group receives a different level of the treatment (e.g. no phone use, low phone use, high phone use).

You should also include a control group, which receives no treatment. The control group tells us what would have happened to your test subjects without any experimental intervention.

When assigning your subjects to groups, there are two main choices you need to make:

  • A completely randomized design vs. a randomized block design.
  • A between-subjects design vs. a within-subjects design.

Randomization

An experiment can be completely randomized or randomized within blocks (aka strata):

  • In a completely randomized design, every subject is assigned to a treatment group at random.
  • In a randomized block design (aka stratified random design), subjects are first grouped according to a characteristic they share, and then randomly assigned to treatments within those groups.
| Research question | Completely randomized design | Randomized block design |
| --- | --- | --- |
| Phone use and sleep | Subjects are all randomly assigned a level of phone use using a random number generator. | Subjects are first grouped by age, and then phone use treatments are randomly assigned within these groups. |
| Temperature and soil respiration | Warming treatments are assigned to soil plots at random by using a number generator to generate map coordinates within the study area. | Soils are first grouped by average rainfall, and then treatment plots are randomly assigned within these groups. |
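The two assignment schemes are easy to express in code. The following sketch (plain Python; the subjects, age groups, and treatment labels are hypothetical) assigns subjects first completely at random and then within age blocks:

```python
import random

subjects = [
    {"id": 1, "age_group": "18-29"}, {"id": 2, "age_group": "18-29"},
    {"id": 3, "age_group": "30-49"}, {"id": 4, "age_group": "30-49"},
    {"id": 5, "age_group": "50+"},   {"id": 6, "age_group": "50+"},
]
treatments = ["no phone use", "low phone use", "high phone use"]

def completely_randomized(subjects, treatments, seed=0):
    """Assign every subject to a treatment purely at random."""
    rng = random.Random(seed)
    return {s["id"]: rng.choice(treatments) for s in subjects}

def randomized_block(subjects, treatments, block_key, seed=0):
    """Group subjects by a shared characteristic, then randomize within each block."""
    rng = random.Random(seed)
    blocks, assignment = {}, {}
    for s in subjects:
        blocks.setdefault(s[block_key], []).append(s)
    for members in blocks.values():
        order = treatments[:]
        rng.shuffle(order)
        # Cycle through the shuffled treatments so each block stays balanced.
        for i, s in enumerate(members):
            assignment[s["id"]] = order[i % len(order)]
    return assignment

print(completely_randomized(subjects, treatments))
print(randomized_block(subjects, treatments, "age_group"))
```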

Sometimes randomization isn’t practical or ethical, so researchers create partially random or even non-random designs. An experimental design where treatments aren’t randomly assigned is called a quasi-experimental design.

Between-subjects vs. within-subjects

In a between-subjects design (also known as an independent measures design or classic ANOVA design), individuals receive only one of the possible levels of an experimental treatment.

In medical or social research, you might also use matched pairs within your between-subjects design to make sure that each treatment group contains the same variety of test subjects in the same proportions.

In a within-subjects design (also known as a repeated measures design), every individual receives each of the experimental treatments consecutively, and their responses to each treatment are measured.

Within-subjects or repeated measures can also refer to an experimental design where an effect emerges over time, and individual responses are measured over time in order to measure this effect as it emerges.

Counterbalancing (randomizing or reversing the order of treatments among subjects) is often used in within-subjects designs to ensure that the order of treatment application doesn’t influence the results of the experiment.

| Research question | Between-subjects (independent measures) design | Within-subjects (repeated measures) design |
| --- | --- | --- |
| Phone use and sleep | Subjects are randomly assigned a level of phone use (none, low, or high) and follow that level of phone use throughout the experiment. | Subjects are assigned consecutively to zero, low, and high levels of phone use throughout the experiment, and the order in which they follow these treatments is randomized. |
| Temperature and soil respiration | Warming treatments are assigned to soil plots at random and the soils are kept at this temperature throughout the experiment. | Every plot receives each warming treatment (1, 3, 5, 8, and 10°C above ambient temperatures) consecutively over the course of the experiment, and the order in which they receive these treatments is randomized. |
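As a small illustration of counterbalancing (not from the article; the subject IDs are hypothetical), you can enumerate every possible treatment order and spread the orders evenly across participants:

```python
from itertools import permutations
import random

treatments = ["none", "low", "high"]       # phone-use levels from the example above
subjects = [f"S{i}" for i in range(1, 7)]  # six hypothetical participants

# Full counterbalancing: all 3! = 6 possible orders, each used once here.
orders = list(permutations(treatments))
random.Random(0).shuffle(subjects)
schedule = {s: orders[i % len(orders)] for i, s in enumerate(subjects)}
for subject, order in schedule.items():
    print(subject, "->", " then ".join(order))
```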

Step 5: Measure your dependent variable

Finally, you need to decide how you’ll collect data on your dependent variable outcomes. You should aim for reliable and valid measurements that minimize research bias or error.

Some variables, like temperature, can be objectively measured with scientific instruments. Others may need to be operationalized to turn them into measurable observations.

To measure hours of sleep in the phone use experiment, for example, you could:

  • Ask participants to record what time they go to sleep and get up each day.
  • Ask participants to wear a sleep tracker.

How precisely you measure your dependent variable also affects the kinds of statistical analysis you can use on your data.

Experiments are always context-dependent, and a good experimental design will take into account all of the unique considerations of your study system to produce information that is both valid and relevant to your research question.

If you want to know more about statistics, methodology, or research bias, make sure to check out some of our other articles with explanations and examples.

  • Student’s t-distribution
  • Normal distribution
  • Null and Alternative Hypotheses
  • Chi square tests
  • Confidence interval
  • Cluster sampling
  • Stratified sampling
  • Data cleansing
  • Reproducibility vs Replicability
  • Peer review
  • Likert scale

Research bias

  • Implicit bias
  • Framing effect
  • Cognitive bias
  • Placebo effect
  • Hawthorne effect
  • Hindsight bias
  • Affect heuristic

Experimental design means planning a set of procedures to investigate a relationship between variables. To design a controlled experiment, you need:

  • A testable hypothesis
  • At least one independent variable that can be precisely manipulated
  • At least one dependent variable that can be precisely measured

When designing the experiment, you decide:

  • How you will manipulate the variable(s)
  • How you will control for any potential confounding variables
  • How many subjects or samples will be included in the study
  • How subjects will be assigned to treatment levels

Experimental design is essential to the internal and external validity of your experiment.

The key difference between observational studies and experimental designs is that a well-done observational study does not influence the responses of participants, while experiments do have some sort of treatment condition applied to at least some participants by random assignment.

A confounding variable, also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.

A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.

In your research design, it’s important to identify potential confounding variables and plan how you will reduce their impact.

In a between-subjects design, every participant experiences only one condition, and researchers assess group differences between participants in various conditions.

In a within-subjects design, each participant experiences all conditions, and researchers test the same participants repeatedly for differences between conditions.

The word “between” means that you’re comparing different conditions between groups, while the word “within” means you’re comparing different conditions within the same group.

An experimental group, also known as a treatment group, receives the treatment whose effect researchers wish to study, whereas a control group does not. They should be identical in all other ways.



Experimental Design – Types, Methods, Guide


Experimental Design

Experimental design is a process of planning and conducting scientific experiments to investigate a hypothesis or research question. It involves carefully designing an experiment that can test the hypothesis, and controlling for other variables that may influence the results.

Experimental design typically includes identifying the variables that will be manipulated or measured, defining the sample or population to be studied, selecting an appropriate method of sampling, choosing a method for data collection and analysis, and determining the appropriate statistical tests to use.

Types of Experimental Design

Here are the different types of experimental design:

Completely Randomized Design

In this design, participants are randomly assigned to one of two or more groups, and each group is exposed to a different treatment or condition.

Randomized Block Design

This design involves dividing participants into blocks based on a specific characteristic, such as age or gender, and then randomly assigning participants within each block to one of two or more treatment groups.

Factorial Design

In a factorial design, participants are randomly assigned to one of several groups, each of which receives a different combination of two or more independent variables.
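As a rough sketch (the factors and levels below are hypothetical, not from the source), the cells of a factorial design are simply the Cartesian product of the factor levels:

```python
from itertools import product

# Two assumed independent variables for a 2x3 factorial design.
phone_use = ["none", "high"]
caffeine = ["none", "low", "high"]

# Each combination of levels is one experimental condition.
for i, (phone, caf) in enumerate(product(phone_use, caffeine), start=1):
    print(f"Condition {i}: phone use = {phone}, caffeine = {caf}")
```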

Repeated Measures Design

In this design, each participant is exposed to all of the different treatments or conditions, either in a random order or in a predetermined order.

Crossover Design

This design involves randomly assigning participants to one of two or more treatment groups, with each group receiving one treatment during the first phase of the study and then switching to a different treatment during the second phase.

Split-plot Design

In this design, the researcher manipulates one or more variables at different levels and uses a randomized block design to control for other variables.

Nested Design

This design involves grouping participants within larger units, such as schools or households, and then randomly assigning these units to different treatment groups.

Laboratory Experiment

Laboratory experiments are conducted under controlled conditions, which allows for greater precision and accuracy. However, because laboratory conditions are not always representative of real-world conditions, the results of these experiments may not be generalizable to the population at large.

Field Experiment

Field experiments are conducted in naturalistic settings and allow for more realistic observations. However, because field experiments are not as controlled as laboratory experiments, they may be subject to more sources of error.

Experimental Design Methods

Experimental design methods refer to the techniques and procedures used to design and conduct experiments in scientific research. Here are some common experimental design methods:

Randomization

This involves randomly assigning participants to different groups or treatments to ensure that any observed differences between groups are due to the treatment and not to other factors.

Control Group

The use of a control group is an important experimental design method that involves having a group of participants that do not receive the treatment or intervention being studied. The control group is used as a baseline to compare the effects of the treatment group.

Blinding

Blinding involves keeping participants, researchers, or both unaware of which treatment group participants are in, in order to reduce the risk of bias in the results.

Counterbalancing

This involves systematically varying the order in which participants receive treatments or interventions in order to control for order effects.

Replication

Replication involves conducting the same experiment with different samples or under different conditions to increase the reliability and validity of the results.

Factorial Design

This experimental design method involves manipulating multiple independent variables simultaneously to investigate their combined effects on the dependent variable.

Blocking

This involves dividing participants into subgroups or blocks based on specific characteristics, such as age or gender, in order to reduce the risk of confounding variables.

Data Collection Method

Experimental design data collection methods are techniques and procedures used to collect data in experimental research. Here are some common experimental design data collection methods:

Direct Observation

This method involves observing and recording the behavior or phenomenon of interest in real time. It may involve the use of structured or unstructured observation, and may be conducted in a laboratory or naturalistic setting.

Self-report Measures

Self-report measures involve asking participants to report their thoughts, feelings, or behaviors using questionnaires, surveys, or interviews. These measures may be administered in person or online.

Behavioral Measures

Behavioral measures involve measuring participants’ behavior directly, such as through reaction time tasks or performance tests. These measures may be administered using specialized equipment or software.

Physiological Measures

Physiological measures involve measuring participants’ physiological responses, such as heart rate, blood pressure, or brain activity, using specialized equipment. These measures may be invasive or non-invasive, and may be administered in a laboratory or clinical setting.

Archival Data

Archival data involves using existing records or data, such as medical records, administrative records, or historical documents, as a source of information. These data may be collected from public or private sources.

Computerized Measures

Computerized measures involve using software or computer programs to collect data on participants’ behavior or responses. These measures may include reaction time tasks, cognitive tests, or other types of computer-based assessments.

Video Recording

Video recording involves recording participants’ behavior or interactions using cameras or other recording equipment. This method can be used to capture detailed information about participants’ behavior or to analyze social interactions.

Data Analysis Method

Experimental design data analysis methods refer to the statistical techniques and procedures used to analyze data collected in experimental research. Here are some common experimental design data analysis methods:

Descriptive Statistics

Descriptive statistics are used to summarize and describe the data collected in the study. This includes measures such as mean, median, mode, range, and standard deviation.
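A minimal sketch of these summaries in Python (pandas assumed; the sleep values are invented for illustration):

```python
import pandas as pd

# Hypothetical hours-of-sleep measurements for one treatment group.
sleep_hours = pd.Series([6.5, 7.0, 7.2, 5.8, 8.1, 6.9, 7.4])

print("mean:  ", sleep_hours.mean())
print("median:", sleep_hours.median())
print("mode:  ", list(sleep_hours.mode()))       # may contain several values
print("range: ", sleep_hours.max() - sleep_hours.min())
print("std:   ", sleep_hours.std())              # sample standard deviation
```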

Inferential Statistics

Inferential statistics are used to make inferences or generalizations about a larger population based on the data collected in the study. This includes hypothesis testing and estimation.

Analysis of Variance (ANOVA)

ANOVA is a statistical technique used to compare means across two or more groups in order to determine whether there are significant differences between the groups. There are several types of ANOVA, including one-way ANOVA, two-way ANOVA, and repeated measures ANOVA.
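For instance, a one-way ANOVA comparing hours of sleep across three phone-use groups might look like the following sketch (SciPy assumed; the data are invented for illustration):

```python
from scipy import stats

# Hypothetical hours of sleep for three phone-use groups.
no_phone = [7.8, 7.2, 8.0, 7.5, 7.9]
low_phone = [7.1, 6.8, 7.3, 7.0, 6.9]
high_phone = [6.2, 6.5, 5.9, 6.4, 6.1]

f_stat, p_value = stats.f_oneway(no_phone, low_phone, high_phone)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# A small p-value suggests that at least one group mean differs from the others.
```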

Regression Analysis

Regression analysis is used to model the relationship between two or more variables in order to determine the strength and direction of the relationship. There are several types of regression analysis, including linear regression, logistic regression, and multiple regression.
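A simple linear regression of the sleep example could be sketched as follows (statsmodels assumed; the data are invented for illustration):

```python
import statsmodels.api as sm

# Hypothetical data: minutes of phone use (predictor) vs. hours of sleep (outcome).
minutes = [0, 15, 30, 45, 60, 90, 120]
sleep = [8.0, 7.8, 7.4, 7.1, 6.9, 6.4, 6.0]

X = sm.add_constant(minutes)      # add an intercept term
model = sm.OLS(sleep, X).fit()    # ordinary least squares fit
print(model.params)               # intercept and slope
print(model.rsquared)             # strength of the linear relationship
```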

Factor Analysis

Factor analysis is used to identify underlying factors or dimensions in a set of variables. This can be used to reduce the complexity of the data and identify patterns in the data.

Structural Equation Modeling (SEM)

SEM is a statistical technique used to model complex relationships between variables. It can be used to test complex theories and models of causality.

Cluster Analysis

Cluster analysis is used to group similar cases or observations together based on similarities or differences in their characteristics.

Time Series Analysis

Time series analysis is used to analyze data collected over time in order to identify trends, patterns, or changes in the data.

Multilevel Modeling

Multilevel modeling is used to analyze data that is nested within multiple levels, such as students nested within schools or employees nested within companies.
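A random-intercept model for students nested within schools might be sketched like this (statsmodels assumed; the scores, treatment codes, and school labels are invented):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical nested data: students (rows) within schools (groups).
data = pd.DataFrame({
    "score":     [72, 75, 71, 80, 83, 79, 65, 68, 66, 74, 77, 73],
    "treatment": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
    "school":    ["A", "A", "A", "B", "B", "B", "C", "C", "C", "D", "D", "D"],
})

# A random intercept per school accounts for the nesting structure.
model = smf.mixedlm("score ~ treatment", data, groups=data["school"]).fit()
print(model.summary())
```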

Applications of Experimental Design 

Experimental design is a versatile research methodology that can be applied in many fields. Here are some applications of experimental design:

  • Medical Research: Experimental design is commonly used to test new treatments or medications for various medical conditions. This includes clinical trials to evaluate the safety and effectiveness of new drugs or medical devices.
  • Agriculture: Experimental design is used to test new crop varieties, fertilizers, and other agricultural practices. This includes randomized field trials to evaluate the effects of different treatments on crop yield, quality, and pest resistance.
  • Environmental science: Experimental design is used to study the effects of environmental factors, such as pollution or climate change, on ecosystems and wildlife. This includes controlled experiments to study the effects of pollutants on plant growth or animal behavior.
  • Psychology: Experimental design is used to study human behavior and cognitive processes. This includes experiments to test the effects of different interventions, such as therapy or medication, on mental health outcomes.
  • Engineering: Experimental design is used to test new materials, designs, and manufacturing processes in engineering applications. This includes laboratory experiments to test the strength and durability of new materials, or field experiments to test the performance of new technologies.
  • Education: Experimental design is used to evaluate the effectiveness of teaching methods, educational interventions, and programs. This includes randomized controlled trials to compare different teaching methods or evaluate the impact of educational programs on student outcomes.
  • Marketing: Experimental design is used to test the effectiveness of marketing campaigns, pricing strategies, and product designs. This includes experiments to test the impact of different marketing messages or pricing schemes on consumer behavior.

Examples of Experimental Design 

Here are some examples of experimental design in different fields:

  • Example in Medical research: A study that investigates the effectiveness of a new drug treatment for a particular condition. Patients are randomly assigned to either a treatment group or a control group, with the treatment group receiving the new drug and the control group receiving a placebo. The outcomes, such as improvement in symptoms or side effects, are measured and compared between the two groups.
  • Example in Education research: A study that examines the impact of a new teaching method on student learning outcomes. Students are randomly assigned to either a group that receives the new teaching method or a group that receives the traditional teaching method. Student achievement is measured before and after the intervention, and the results are compared between the two groups.
  • Example in Environmental science: A study that tests the effectiveness of a new method for reducing pollution in a river. Two sections of the river are selected, with one section treated with the new method and the other section left untreated. The water quality is measured before and after the intervention, and the results are compared between the two sections.
  • Example in Marketing research: A study that investigates the impact of a new advertising campaign on consumer behavior. Participants are randomly assigned to either a group that is exposed to the new campaign or a group that is not. Their behavior, such as purchasing or product awareness, is measured and compared between the two groups.
  • Example in Social psychology: A study that examines the effect of a new social intervention on reducing prejudice towards a marginalized group. Participants are randomly assigned to either a group that receives the intervention or a control group that does not. Their attitudes and behavior towards the marginalized group are measured before and after the intervention, and the results are compared between the two groups.

When to use Experimental Research Design 

Experimental research design should be used when a researcher wants to establish a cause-and-effect relationship between variables. It is particularly useful when studying the impact of an intervention or treatment on a particular outcome.

Here are some situations where experimental research design may be appropriate:

  • When studying the effects of a new drug or medical treatment: Experimental research design is commonly used in medical research to test the effectiveness and safety of new drugs or medical treatments. By randomly assigning patients to treatment and control groups, researchers can determine whether the treatment is effective in improving health outcomes.
  • When evaluating the effectiveness of an educational intervention: An experimental research design can be used to evaluate the impact of a new teaching method or educational program on student learning outcomes. By randomly assigning students to treatment and control groups, researchers can determine whether the intervention is effective in improving academic performance.
  • When testing the effectiveness of a marketing campaign: An experimental research design can be used to test the effectiveness of different marketing messages or strategies. By randomly assigning participants to treatment and control groups, researchers can determine whether the marketing campaign is effective in changing consumer behavior.
  • When studying the effects of an environmental intervention: Experimental research design can be used to study the impact of environmental interventions, such as pollution reduction programs or conservation efforts. By randomly assigning locations or areas to treatment and control groups, researchers can determine whether the intervention is effective in improving environmental outcomes.
  • When testing the effects of a new technology: An experimental research design can be used to test the effectiveness and safety of new technologies or engineering designs. By randomly assigning participants or locations to treatment and control groups, researchers can determine whether the new technology is effective in achieving its intended purpose.

How to Conduct Experimental Research

Here are the steps to conduct Experimental Research:

  • Identify a Research Question: Start by identifying a research question that you want to answer through the experiment. The question should be clear, specific, and testable.
  • Develop a Hypothesis: Based on your research question, develop a hypothesis that predicts the relationship between the independent and dependent variables. The hypothesis should be clear and testable.
  • Design the Experiment: Determine the type of experimental design you will use, such as a between-subjects design or a within-subjects design. Also, decide on the experimental conditions, such as the number of independent variables, the levels of the independent variable, and the dependent variable to be measured.
  • Select Participants: Select the participants who will take part in the experiment. They should be representative of the population you are interested in studying.
  • Randomly Assign Participants to Groups: If you are using a between-subjects design, randomly assign participants to groups to control for individual differences.
  • Conduct the Experiment: Conduct the experiment by manipulating the independent variable(s) and measuring the dependent variable(s) across the different conditions.
  • Analyze the Data: Analyze the data using appropriate statistical methods to determine if there is a significant effect of the independent variable(s) on the dependent variable(s).
  • Draw Conclusions: Based on the data analysis, draw conclusions about the relationship between the independent and dependent variables. If the results support the hypothesis, then it is accepted. If the results do not support the hypothesis, then it is rejected.
  • Communicate the Results: Finally, communicate the results of the experiment through a research report or presentation. Include the purpose of the study, the methods used, the results obtained, and the conclusions drawn.
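To tie the steps together, here is a self-contained simulation of the whole workflow (plain Python plus SciPy; every number below is invented, and the 0.6-hour treatment effect is a simulation assumption, not a real finding):

```python
import random
from scipy import stats

rng = random.Random(42)

# Steps 1-2: question and hypothesis -- does phone use before sleep reduce sleep?
# Steps 3-5: simulate 60 participants and randomly split them into two groups.
baseline = [rng.gauss(7.0, 0.5) for _ in range(60)]   # hours of sleep per night
rng.shuffle(baseline)
control, treated = baseline[:30], baseline[30:]

# Step 6: apply a simulated treatment effect (assumed ~0.6 h less sleep).
treated = [h - 0.6 + rng.gauss(0, 0.2) for h in treated]

# Step 7: analyze with an independent-samples t-test.
t_stat, p_value = stats.ttest_ind(control, treated)

# Step 8: draw a conclusion from the p-value.
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
print("Reject H0" if p_value < 0.05 else "Fail to reject H0")
```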

Purpose of Experimental Design 

The purpose of experimental design is to control and manipulate one or more independent variables to determine their effect on a dependent variable. Experimental design allows researchers to systematically investigate causal relationships between variables, and to establish cause-and-effect relationships between the independent and dependent variables. Through experimental design, researchers can test hypotheses and make inferences about the population from which the sample was drawn.

Experimental design provides a structured approach to designing and conducting experiments, ensuring that the results are reliable and valid. By carefully controlling for extraneous variables that may affect the outcome of the study, experimental design allows researchers to isolate the effect of the independent variable(s) on the dependent variable(s), and to minimize the influence of other factors that may confound the results.

Experimental design also allows researchers to generalize their findings to the larger population from which the sample was drawn. By randomly selecting participants and using statistical techniques to analyze the data, researchers can make inferences about the larger population with a high degree of confidence.

Overall, the purpose of experimental design is to provide a rigorous, systematic, and scientific method for testing hypotheses and establishing cause-and-effect relationships between variables. Experimental design is a powerful tool for advancing scientific knowledge and informing evidence-based practice in various fields, including psychology, biology, medicine, engineering, and social sciences.

Advantages of Experimental Design 

Experimental design offers several advantages in research. Here are some of the main advantages:

  • Control over extraneous variables: Experimental design allows researchers to control for extraneous variables that may affect the outcome of the study. By manipulating the independent variable and holding all other variables constant, researchers can isolate the effect of the independent variable on the dependent variable.
  • Establishing causality: Experimental design allows researchers to establish causality by manipulating the independent variable and observing its effect on the dependent variable. This allows researchers to determine whether changes in the independent variable cause changes in the dependent variable.
  • Replication: Experimental design allows researchers to replicate their experiments to ensure that the findings are consistent and reliable. Replication is important for establishing the validity and generalizability of the findings.
  • Random assignment: Experimental design often involves randomly assigning participants to conditions. This helps to ensure that individual differences between participants are evenly distributed across conditions, which increases the internal validity of the study.
  • Precision: Experimental design allows researchers to measure variables with precision, which can increase the accuracy and reliability of the data.
  • Generalizability: If the study is well-designed, experimental design can increase the generalizability of the findings. By controlling for extraneous variables and using random assignment, researchers can increase the likelihood that the findings will apply to other populations and contexts.

Limitations of Experimental Design

Experimental design has some limitations that researchers should be aware of. Here are some of the main limitations:

  • Artificiality: Experimental design often involves creating artificial situations that may not reflect real-world situations. This can limit the external validity of the findings, or the extent to which the findings can be generalized to real-world settings.
  • Ethical concerns: Some experimental designs may raise ethical concerns, particularly if they involve manipulating variables that could cause harm to participants or if they involve deception.
  • Participant bias: Participants in experimental studies may modify their behavior in response to the experiment, which can lead to participant bias.
  • Limited generalizability: The conditions of the experiment may not reflect the complexities of real-world situations. As a result, the findings may not be applicable to all populations and contexts.
  • Cost and time: Experimental design can be expensive and time-consuming, particularly if the experiment requires specialized equipment or if the sample size is large.
  • Researcher bias: Researchers may unintentionally bias the results of the experiment if they have expectations or preferences for certain outcomes.
  • Lack of feasibility: Experimental design may not be feasible in some cases, particularly if the research question involves variables that cannot be manipulated or controlled.



scientific method



Scientific method, mathematical and experimental technique employed in the sciences. More specifically, it is the technique used in the construction and testing of a scientific hypothesis.

The process of observing, asking questions, and seeking answers through tests and experiments is not unique to any one field of science. In fact, the scientific method is applied broadly in science, across many different fields. Many empirical sciences, especially the social sciences, use mathematical tools borrowed from probability theory and statistics, together with outgrowths of these, such as decision theory, game theory, utility theory, and operations research. Philosophers of science have addressed general methodological problems, such as the nature of scientific explanation and the justification of induction.


The scientific method is critical to the development of scientific theories, which explain empirical (experiential) laws in a scientifically rational manner. In a typical application of the scientific method, a researcher develops a hypothesis, tests it through various means, and then modifies the hypothesis on the basis of the outcome of the tests and experiments. The modified hypothesis is then retested, further modified, and tested again, until it becomes consistent with observed phenomena and testing outcomes. In this way, hypotheses serve as tools by which scientists gather data. From that data and the many different scientific investigations undertaken to explore hypotheses, scientists are able to develop broad general explanations, or scientific theories.

See also Mill’s methods; hypothetico-deductive method.


Scientific Method

Science is an enormously successful human enterprise. The study of scientific method is the attempt to discern the activities by which that success is achieved. Among the activities often identified as characteristic of science are systematic observation and experimentation, inductive and deductive reasoning, and the formation and testing of hypotheses and theories. How these are carried out in detail can vary greatly, but characteristics like these have been looked to as a way of demarcating scientific activity from non-science, where only enterprises which employ some canonical form of scientific method or methods should be considered science (see also the entry on science and pseudo-science). Others have questioned whether there is anything like a fixed toolkit of methods which is common across science and only science. Some reject privileging one view of method as part of rejecting broader views about the nature of science, such as naturalism (Dupré 2004); some reject any restriction in principle (pluralism).

Scientific method should be distinguished from the aims and products of science, such as knowledge, predictions, or control. Methods are the means by which those goals are achieved. Scientific method should also be distinguished from meta-methodology, which includes the values and justifications behind a particular characterization of scientific method (i.e., a methodology) — values such as objectivity, reproducibility, simplicity, or past successes. Methodological rules are proposed to govern method and it is a meta-methodological question whether methods obeying those rules satisfy given values. Finally, method is distinct, to some degree, from the detailed and contextual practices through which methods are implemented. The latter might range over: specific laboratory techniques; mathematical formalisms or other specialized languages used in descriptions and reasoning; technological or other material means; ways of communicating and sharing results, whether with other scientists or with the public at large; or the conventions, habits, enforced customs, and institutional controls over how and what science is carried out.

While it is important to recognize these distinctions, their boundaries are fuzzy. Hence, accounts of method cannot be entirely divorced from their methodological and meta-methodological motivations or justifications. Moreover, each aspect plays a crucial role in identifying methods. Disputes about method have therefore played out at the detail, rule, and meta-rule levels. Changes in beliefs about the certainty or fallibility of scientific knowledge, for instance (which is a meta-methodological consideration of what we can hope for methods to deliver), have meant different emphases on deductive and inductive reasoning, or on the relative importance attached to reasoning over observation (i.e., differences over particular methods). Beliefs about the role of science in society will affect the place one gives to values in scientific method.

The issue which has shaped debates over scientific method the most in the last half century is the question of how pluralist we need to be about method. Unificationists continue to hold out for one method essential to science; nihilism is a form of radical pluralism, which considers the effectiveness of any methodological prescription to be so context sensitive as to render it not explanatory on its own. Some middle degree of pluralism regarding the methods embodied in scientific practice seems appropriate. But the details of scientific practice vary with time and place, from institution to institution, across scientists and their subjects of investigation. How significant are the variations for understanding science and its success? How much can method be abstracted from practice? This entry describes some of the attempts to characterize scientific method or methods, as well as arguments for a more context-sensitive approach to methods embedded in actual scientific practices.

Table of Contents

  • 1. Overview and organizing themes
  • 2. Historical review: Aristotle to Mill
  • 3.1 Logical constructionism and operationalism
  • 3.2 H-D as a logic of confirmation
  • 3.3 Popper and falsificationism
  • 3.4 Meta-methodology and the end of method
  • 4. Statistical methods for hypothesis testing
  • 5.1 Creative and exploratory practices
  • 5.2 Computer methods and the ‘new ways’ of doing science
  • 6.1 “The scientific method” in science education and as seen by scientists
  • 6.2 Privileged methods and ‘gold standards’
  • 6.3 Scientific method in the court room
  • 6.4 Deviating practices
  • 7. Conclusion
  • Bibliography
  • Academic Tools
  • Other Internet Resources
  • Related Entries

1. Overview and organizing themes

This entry could have been given the title Scientific Methods and gone on to fill volumes, or it could have been extremely short, consisting of a brief summary rejection of the idea that there is any such thing as a unique Scientific Method at all. Both unhappy prospects are due to the fact that scientific activity varies so much across disciplines, times, places, and scientists that any account which manages to unify it all will either consist of overwhelming descriptive detail, or trivial generalizations.

The choice of scope for the present entry is more optimistic, taking a cue from the recent movement in philosophy of science toward a greater attention to practice: to what scientists actually do. This “turn to practice” can be seen as the latest form of studies of methods in science, insofar as it represents an attempt at understanding scientific activity, but through accounts that are neither meant to be universal and unified, nor singular and narrowly descriptive. To some extent, different scientists at different times and places can be said to be using the same method even though, in practice, the details are different.

Whether the context in which methods are carried out is relevant, or to what extent, will depend largely on what one takes the aims of science to be and what one’s own aims are. For most of the history of scientific methodology the assumption has been that the most important output of science is knowledge and so the aim of methodology should be to discover those methods by which scientific knowledge is generated.

Science was seen to embody the most successful form of reasoning (but which form?) to the most certain knowledge claims (but how certain?) on the basis of systematically collected evidence (but what counts as evidence, and should the evidence of the senses take precedence, or rational insight?) Section 2 surveys some of the history, pointing to two major themes. One theme is seeking the right balance between observation and reasoning (and the attendant forms of reasoning which employ them); the other is how certain scientific knowledge is or can be.

Section 3 turns to 20th century debates on scientific method. In the second half of the 20th century the epistemic privilege of science faced several challenges and many philosophers of science abandoned the reconstruction of the logic of scientific method. Views changed significantly regarding which functions of science ought to be captured and why. For some, the success of science was better identified with social or cultural features. Historical and sociological turns in the philosophy of science were made, with a demand that greater attention be paid to the non-epistemic aspects of science, such as sociological, institutional, material, and political factors. Even outside of those movements there was an increased specialization in the philosophy of science, with more and more focus on specific fields within science. The combined upshot was very few philosophers arguing any longer for a grand unified methodology of science. Sections 3 and 4 survey the main positions on scientific method in 20th century philosophy of science, focusing on where they differ in their preference for confirmation or falsification or for waiving the idea of a special scientific method altogether.

In recent decades, attention has primarily been paid to scientific activities traditionally falling under the rubric of method, such as experimental design and general laboratory practice, the use of statistics, the construction and use of models and diagrams, interdisciplinary collaboration, and science communication. Sections 4–6 attempt to construct a map of the current domains of the study of methods in science.

As these sections illustrate, the question of method is still central to the discourse about science. Scientific method remains a topic for education, for science policy, and for scientists. It arises in the public domain where the demarcation or status of science is at issue. Some philosophers have recently returned, therefore, to the question of what it is that makes science a unique cultural product. This entry will close with some of these recent attempts at discerning and encapsulating the activities by which scientific knowledge is achieved.

2. Historical review: Aristotle to Mill

Attempting a history of scientific method compounds the vast scope of the topic. This section briefly surveys the background to modern methodological debates. What can be called the classical view goes back to antiquity, and represents a point of departure for later divergences.[1]

We begin with a point made by Laudan (1968) in his historical survey of scientific method:

Perhaps the most serious inhibition to the emergence of the history of theories of scientific method as a respectable area of study has been the tendency to conflate it with the general history of epistemology, thereby assuming that the narrative categories and classificatory pigeon-holes applied to the latter are also basic to the former. (1968: 5)

To see knowledge about the natural world as falling under knowledge more generally is an understandable conflation. Histories of theories of method would naturally employ the same narrative categories and classificatory pigeon holes. An important theme of the history of epistemology, for example, is the unification of knowledge, a theme reflected in the question of the unification of method in science. Those who have identified differences in kinds of knowledge have often likewise identified different methods for achieving that kind of knowledge (see the entry on the unity of science ).

Different views on what is known, how it is known, and what can be known are connected. Plato distinguished the realms of things into the visible and the intelligible (The Republic, 510a, in Cooper 1997). Only the latter, the Forms, could be objects of knowledge. The intelligible truths could be known with the certainty of geometry and deductive reasoning. What could be observed of the material world, however, was by definition imperfect and deceptive, not ideal. The Platonic way of knowledge therefore emphasized reasoning as a method, downplaying the importance of observation. Aristotle disagreed, locating the Forms in the natural world as the fundamental principles to be discovered through the inquiry into nature (Metaphysics Z, in Barnes 1984).

Aristotle is recognized as giving the earliest systematic treatise on the nature of scientific inquiry in the western tradition, one which embraced observation and reasoning about the natural world. In the Prior and Posterior Analytics, Aristotle reflects first on the aims and then the methods of inquiry into nature. A number of features can be found which are still considered by most to be essential to science. For Aristotle, empiricism, careful observation (but passive observation, not controlled experiment), is the starting point. The aim is not merely recording of facts, though. For Aristotle, science (epistêmê) is a body of properly arranged knowledge or learning—the empirical facts, but also their ordering and display are of crucial importance. The aims of discovery, ordering, and display of facts partly determine the methods required of successful scientific inquiry. Also determinant is the nature of the knowledge being sought, and the explanatory causes proper to that kind of knowledge (see the discussion of the four causes in the entry on Aristotle on causality).

In addition to careful observation, then, scientific method requires a logic as a system of reasoning for properly arranging, but also inferring beyond, what is known by observation. Methods of reasoning may include induction, prediction, or analogy, among others. Aristotle’s system (along with his catalogue of fallacious reasoning) was collected under the title the Organon. This title would be echoed in later works on scientific reasoning, such as Novum Organum by Francis Bacon, and Novum Organon Renovatum by William Whewell (see below). In Aristotle’s Organon reasoning is divided primarily into two forms, a rough division which persists into modern times. The division, known most commonly today as deductive versus inductive method, appears in other eras and methodologies as analysis/synthesis, non-ampliative/ampliative, or even confirmation/verification. The basic idea is there are two “directions” to proceed in our methods of inquiry: one away from what is observed, to the more fundamental, general, and encompassing principles; the other, from the fundamental and general to instances or implications of principles.

The basic aim and method of inquiry identified here can be seen as a theme running throughout the next two millennia of reflection on the correct way to seek after knowledge: carefully observe nature and then seek rules or principles which explain or predict its operation. The Aristotelian corpus provided the framework for a commentary tradition on scientific method independent of science itself (cosmos versus physics). During the medieval period, figures such as Albertus Magnus (1206–1280), Thomas Aquinas (1225–1274), Robert Grosseteste (1175–1253), Roger Bacon (1214/1220–1292), William of Ockham (1287–1347), Andreas Vesalius (1514–1564), and Giacomo Zabarella (1533–1589) all worked to clarify the kind of knowledge obtainable by observation and induction, the source of justification of induction, and the best rules for its application.[2] Many of their contributions we now think of as essential to science (see also Laudan 1968). As Aristotle and Plato had employed a framework of reasoning either “to the forms” or “away from the forms”, medieval thinkers employed directions away from the phenomena or back to the phenomena. In analysis, a phenomenon was examined to discover its basic explanatory principles; in synthesis, explanations of a phenomenon were constructed from first principles.

During the Scientific Revolution these various strands of argument, experiment, and reason were forged into a dominant epistemic authority. The 16th–18th centuries were a period of not only dramatic advance in knowledge about the operation of the natural world—advances in mechanical, medical, biological, political, economic explanations—but also of self-awareness of the revolutionary changes taking place, and intense reflection on the source and legitimation of the method by which the advances were made. The struggle to establish the new authority included methodological moves. The Book of Nature, according to the metaphor of Galileo Galilei (1564–1642) or Francis Bacon (1561–1626), was written in the language of mathematics, of geometry and number. This motivated an emphasis on mathematical description and mechanical explanation as important aspects of scientific method. Through figures such as Henry More and Ralph Cudworth, a neo-Platonic emphasis on the importance of metaphysical reflection on nature behind appearances, particularly regarding the spiritual as a complement to the purely mechanical, remained an important methodological thread of the Scientific Revolution (see the entries on Cambridge platonists; Boyle; Henry More; Galileo).

In Novum Organum (1620), Bacon was critical of the Aristotelian method for leaping from particulars to universals too quickly. The syllogistic form of reasoning readily mixed those two types of propositions. Bacon aimed at the invention of new arts, principles, and directions. His method would be grounded in methodical collection of observations, coupled with correction of our senses (and particularly, directions for the avoidance of the Idols, as he called them, kinds of systematic errors to which naïve observers are prone). The community of scientists could then climb, by a careful, gradual and unbroken ascent, to reliable general claims.

Bacon’s method has been criticized as impractical and too inflexible for the practicing scientist. Whewell would later criticize Bacon in his System of Logic for paying too little attention to the practices of scientists. It is hard to find convincing examples of Bacon’s method being put into practice in the history of science, but there are a few who have been held up as real examples of 17th century scientific, inductive method, even if not in the rigid Baconian mold: figures such as Robert Boyle (1627–1691) and William Harvey (1578–1657) (see the entry on Bacon).

It is to Isaac Newton (1642–1727), however, that historians of science and methodologists have paid greatest attention. Given the enormous success of his Principia Mathematica and Opticks, this is understandable. The study of Newton’s method has had two main thrusts: the implicit method of the experiments and reasoning presented in the Opticks, and the explicit methodological rules given as the Rules for Philosophising (the Regulae) in Book III of the Principia.[3] Newton’s law of gravitation, the linchpin of his new cosmology, broke with explanatory conventions of natural philosophy, first for apparently proposing action at a distance, but more generally for not providing “true”, physical causes. The argument for his System of the World (Principia, Book III) was based on phenomena, not reasoned first principles. This was viewed (mainly on the continent) as insufficient for proper natural philosophy. The Regulae counter this objection, re-defining the aims of natural philosophy by re-defining the method natural philosophers should follow. (See the entry on Newton’s philosophy.)

To his list of methodological prescriptions should be added Newton’s famous phrase “hypotheses non fingo” (commonly translated as “I frame no hypotheses”). The scientist was not to invent systems but to infer explanations from observations, as Bacon had advocated. This would come to be known as inductivism. In the century after Newton, significant clarifications of the Newtonian method were made. Colin Maclaurin (1698–1746), for instance, reconstructed the essential structure of the method as having complementary analysis and synthesis phases, one proceeding away from the phenomena in generalization, the other proceeding from the general propositions to derive explanations of new phenomena. Denis Diderot (1713–1784) and the editors of the Encyclopédie did much to consolidate and popularize Newtonianism, as did Francesco Algarotti (1712–1764). The emphasis was often as much on the character of the scientist as on their process, a character which is still commonly assumed: the scientist is humble in the face of nature, not beholden to dogma, obeys only his eyes, and follows the truth wherever it leads. It was certainly Voltaire (1694–1778) and du Châtelet (1706–1749) who were most influential in propagating this vision of the scientist and their craft, with Newton as hero. Scientific method became a revolutionary force of the Enlightenment. (See also the entries on Newton, Leibniz, Descartes, Boyle, Hume, enlightenment, as well as Shank 2008 for a historical overview.)

Not all 18th century reflections on scientific method were so celebratory. Famous also are George Berkeley’s (1685–1753) attack on the mathematics of the new science, as well as on the over-emphasis of Newtonians on observation; and David Hume’s (1711–1776) undermining of the warrant offered for scientific claims by inductive justification (see the entries on George Berkeley; David Hume; Hume’s Newtonianism and Anti-Newtonianism). Hume’s problem of induction motivated Immanuel Kant (1724–1804) to seek new foundations for empirical method, though as an epistemic reconstruction, not as any set of practical guidelines for scientists. Both Hume and Kant influenced the methodological reflections of the next century, such as the debate between Mill and Whewell over the certainty of inductive inferences in science.

The debate between John Stuart Mill (1806–1873) and William Whewell (1794–1866) has become the canonical methodological debate of the 19th century. Although often characterized as a debate between inductivism and hypothetico-deductivism, the role of the two methods on each side is actually more complex. On the hypothetico-deductive account, scientists work to come up with hypotheses from which true observational consequences can be deduced—hence, hypothetico-deductive. Because Whewell emphasizes both hypotheses and deduction in his account of method, he can be seen as a convenient foil to the inductivism of Mill. However, equally if not more important to Whewell’s portrayal of scientific method is what he calls the “fundamental antithesis”. Knowledge is a product of the objective (what we see in the world around us) and the subjective (the contributions of our mind to how we perceive and understand what we experience, which he called the Fundamental Ideas). Both elements are essential according to Whewell, and he was therefore critical of Kant for too much focus on the subjective, and of John Locke (1632–1704) and Mill for too much focus on the senses. Whewell’s fundamental ideas can be discipline relative. An idea can be fundamental even if it is necessary for knowledge only within a given scientific discipline (e.g., chemical affinity for chemistry). This distinguishes fundamental ideas from the forms and categories of intuition of Kant. (See the entry on Whewell.)

Clarifying fundamental ideas would therefore be an essential part of scientific method and scientific progress. Whewell called this process “Discoverer’s Induction”. It was induction, following Bacon or Newton, but Whewell sought to revive Bacon’s account by emphasizing the role of ideas in the clear and careful formulation of inductive hypotheses. Whewell’s induction is not merely the collecting of objective facts. The subjective plays a role through what Whewell calls the Colligation of Facts, a creative act of the scientist, the invention of a theory. A theory is then confirmed by testing, where more facts are brought under the theory, called the Consilience of Inductions. Whewell felt that this was the method by which the true laws of nature could be discovered: clarification of fundamental concepts, clever invention of explanations, and careful testing. Mill, in his critique of Whewell, and others who have cast Whewell as a forerunner of the hypothetico-deductivist view, seem to have underestimated the importance of this discovery phase in Whewell’s understanding of method (Snyder 1997a,b, 1999). Downplaying the discovery phase would come to characterize methodology of the early 20th century (see section 3).

Mill, in his System of Logic, put forward a narrower view of induction as the essence of scientific method. For Mill, induction is first the search for regularities among events. Among those regularities, some will continue to hold for further observations, eventually gaining the status of laws. One can also look for regularities among the laws discovered in a domain, i.e., for a law of laws. Which “law of laws” will hold is time and discipline dependent and open to revision. One example is the Law of Universal Causation, and Mill put forward specific methods for identifying causes—now commonly known as Mill’s methods. These five methods look for circumstances which are common among the phenomena of interest, circumstances which are absent when the phenomena are absent, or circumstances for which both vary together. Mill’s methods are still seen as capturing basic intuitions about experimental methods for finding the relevant explanatory factors (System of Logic (1843), see the entry on Mill). The methods advocated by Whewell and Mill, in the end, look similar: both involve inductive generalization to covering laws. They differ dramatically, however, with respect to the necessity of the knowledge arrived at, that is, at the meta-methodological level (see the entries on Whewell and Mill).
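The flavor of Mill’s methods can be made concrete with a small sketch. What follows is a deliberately simplified, hypothetical rendering of the Method of Agreement (what is common to all cases where the phenomenon occurs) and the Method of Difference (what is present when the phenomenon occurs and absent when it does not) as set operations; the case data and circumstance names are invented, and Mill’s canonical formulations compare carefully matched cases rather than arbitrary collections.

```python
# A simplified, illustrative rendering of two of Mill's methods.
# Each case pairs a set of observed circumstances with whether the
# phenomenon of interest occurred. All data are invented.

cases = {
    "case1": ({"heat", "moisture", "light"}, True),
    "case2": ({"heat", "moisture"}, True),
    "case3": ({"moisture", "light"}, False),
    "case4": ({"heat", "light"}, True),
}

def method_of_agreement(cases):
    """Circumstances common to every case in which the phenomenon occurs."""
    positives = [circs for circs, occurred in cases.values() if occurred]
    return set.intersection(*positives) if positives else set()

def method_of_difference(cases):
    """Of those, the circumstances also absent from every negative case."""
    negatives = [circs for circs, occurred in cases.values() if not occurred]
    return {c for c in method_of_agreement(cases)
            if all(c not in circs for circs in negatives)}

print(method_of_agreement(cases))   # {'heat'}
print(method_of_difference(cases))  # {'heat'}
```

The set-theoretic core, isolating the circumstance that co-varies with the phenomenon, is what the intuitions about circumstances being “common” or “absent” amount to; the method of concomitant variation would additionally track degrees.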

3. Logic of method and critical responses

The quantum and relativistic revolutions in physics in the early 20th century had a profound effect on methodology. The conceptual foundations of both theories were taken to show the defeasibility of even the most seemingly secure intuitions about space, time and bodies. Certainty of knowledge about the natural world was therefore recognized as unattainable. Instead, a renewed empiricism was sought which rendered science fallible but still rationally justifiable.

Analyses of the reasoning of scientists emerged, according to which the aspects of scientific method which were of primary importance were the means of testing and confirming theories. A distinction in methodology was made between the contexts of discovery and justification. The distinction could be used as a wedge between the particularities of where and how theories or hypotheses are arrived at, on the one hand, and the underlying reasoning scientists use (whether or not they are aware of it) when assessing theories and judging their adequacy on the basis of the available evidence, on the other. By and large, for most of the 20th century, philosophy of science focused on the second context, although philosophers differed on whether to focus on confirmation or refutation as well as on the many details of how confirmation or refutation could or could not be brought about. By the mid-20th century these attempts at defining the method of justification, and the context distinction itself, came under pressure. During the same period, philosophy of science developed rapidly, and from section 4 this entry will therefore shift from a primarily historical treatment of the scientific method towards a primarily thematic one.

Advances in logic and probability held out the promise of elaborate reconstructions of scientific theories and empirical method, the best example being Rudolf Carnap’s The Logical Structure of the World (1928). Carnap attempted to show that a scientific theory could be reconstructed as a formal axiomatic system—that is, a logic. That system could refer to the world because some of its basic sentences could be interpreted as observations or operations which one could perform to test them. The rest of the theoretical system, including sentences using theoretical or unobservable terms (like electron or force), would then be meaningful either because they could be reduced to observations or because they had purely logical meanings (called analytic, like mathematical identities). This has been referred to as the verifiability criterion of meaning. According to the criterion, any statement neither analytic nor verifiable was strictly meaningless. Although the view was endorsed by Carnap in 1928, he would later come to see it as too restrictive (Carnap 1956). Another familiar version of this idea is the operationalism of Percy Williams Bridgman. In The Logic of Modern Physics (1927) Bridgman asserted that every physical concept could be defined in terms of the operations one would perform to verify the application of that concept. Making good on the operationalization of a concept even as simple as length, however, can easily become enormously complex (for measuring very small lengths, for instance) or impractical (for measuring large distances like light years).

Carl Hempel’s (1950, 1951) criticisms of the verifiability criterion of meaning had enormous influence. He pointed out that universal generalizations, such as most scientific laws, were not strictly meaningful on the criterion. Verifiability and operationalism both seemed too restrictive to capture standard scientific aims and practice. The tenuous connection between these reconstructions and actual scientific practice was criticized in another way. In both approaches, scientific methods are recast in methodological roles: measurements, for example, were looked to as ways of giving meanings to terms. The aim of the philosopher of science was not to understand the methods per se, but to use them to reconstruct theories, their meanings, and their relation to the world. When scientists perform these operations, however, they will not report that they are doing them to give meaning to terms in a formal axiomatic system. This disconnect between methodology and the details of actual scientific practice would seem to violate the empiricism the Logical Positivists and Bridgman were committed to. The view that methodology should correspond to practice (to some extent) has been called historicism, or intuitionism. We turn to these criticisms and responses in section 3.4. [4]

Positivism also had to contend with the recognition that a purely inductivist approach, along the lines of Bacon-Newton-Mill, was untenable. There was no pure observation, for starters. All observation was theory laden. Theory is required to make any observation; therefore not all theory can be derived from observation alone. (See the entry on theory and observation in science.) Even granting an observational basis, Hume had already pointed out that one could not deductively justify inductive conclusions without begging the question by presuming the success of the inductive method. Likewise, positivist attempts at analyzing how a generalization can be confirmed by observations of its instances were subject to a number of criticisms. Goodman (1965) and Hempel (1965) both point to paradoxes inherent in standard accounts of confirmation. Recent attempts at explaining how observations can serve to confirm a scientific theory are discussed in section 4 below.

The standard starting point for a non-inductive analysis of the logic of confirmation is known as the Hypothetico-Deductive (H-D) method. In its simplest form, a sentence of a theory which expresses some hypothesis is confirmed by its true consequences. As noted in section 2, this method had been advanced by Whewell in the 19th century, as well as by Nicod (1924) and others in the 20th century. Often, Hempel’s (1966) description of the H-D method, illustrated by the case of Semmelweis’ inferential procedures in establishing the cause of childbed fever, has been presented as a key account of H-D as well as a foil for criticism of the H-D account of confirmation (see, for example, Lipton’s (2004) discussion of inference to the best explanation; also the entry on confirmation). Hempel described Semmelweis’ procedure as examining various hypotheses explaining the cause of childbed fever. Some hypotheses conflicted with observable facts and could be rejected as false immediately. Others needed to be tested experimentally by deducing which observable events should follow if the hypothesis were true (what Hempel called the test implications of the hypothesis), then conducting an experiment and observing whether or not the test implications occurred. If the experiment showed a test implication to be false, the hypothesis could be rejected. If the experiment showed the test implications to be true, however, this did not prove the hypothesis true. The confirmation of a test implication does not verify a hypothesis, though Hempel did allow that “it provides at least some support, some corroboration or confirmation for it” (Hempel 1966: 8). The degree of this support then depends on the quantity, variety and precision of the supporting evidence.
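The logical asymmetry implicit in this account (and central to Popper’s view below) can be set out schematically. This is a standard textbook reconstruction, not Hempel’s own notation, with H a hypothesis and I a test implication deduced from it:

```latex
% Failed test: rejecting H is deductively valid (modus tollens).
\[
H \rightarrow I, \quad \neg I \;\; \therefore \;\; \neg H
\]
% Passed test: inferring H from I is deductively invalid
% (the fallacy of affirming the consequent); at best I confirms H.
\[
H \rightarrow I, \quad I \;\; \not\therefore \;\; H
\]
```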

Another approach that took off from the difficulties with inductive inference was Karl Popper’s critical rationalism or falsificationism (Popper 1959, 1963). Falsification is deductive and similar to H-D in that it involves scientists deducing observational consequences from the hypothesis under test. For Popper, however, the important point was not the degree of confirmation that successful prediction offered to a hypothesis. The crucial thing was the logical asymmetry between confirmation, based on inductive inference, and falsification, which can be based on a deductive inference. (This simple opposition was later questioned, by Lakatos, among others. See the entry on historicist theories of scientific rationality.)

Popper stressed that, regardless of the amount of confirming evidence, we can never be certain that a hypothesis is true without committing the fallacy of affirming the consequent. Instead, Popper introduced the notion of corroboration as a measure for how well a theory or hypothesis has survived previous testing—but without implying that this is also a measure for the probability that it is true.

Popper was also motivated by his doubts about the scientific status of theories like the Marxist theory of history or psycho-analysis, and so wanted to demarcate between science and pseudo-science. Popper saw this as an importantly different distinction from that between science and metaphysics; the latter demarcation was the primary concern of many logical empiricists. Popper used the idea of falsification to draw a line instead between pseudo and proper science. Science was science because its method involved subjecting theories to rigorous tests which offered a high probability of failing and thus refuting the theory.

A commitment to the risk of failure was important. Avoiding falsification could be done all too easily. If a consequence of a theory is inconsistent with observations, an exception can be added by introducing auxiliary hypotheses designed explicitly to save the theory, so-called ad hoc modifications. This Popper saw done in pseudo-science, where ad hoc theories appeared capable of explaining anything in their field of application. In contrast, science is risky: if observations showed the predictions from a theory to be wrong, the theory would be refuted. Hence, scientific hypotheses must be falsifiable. Not only must there exist some possible observation statement which could falsify the hypothesis or theory were it observed (Popper called these the hypothesis’ potential falsifiers); it is also crucial to the Popperian scientific method that such falsifications be sincerely attempted on a regular basis.

The more potential falsifiers of a hypothesis, the more falsifiable it would be, and the more the hypothesis claimed. Conversely, hypotheses without falsifiers claimed very little or nothing at all. Originally, Popper thought that this meant the introduction of ad hoc hypotheses only to save a theory should not be countenanced as good scientific method. These would undermine the falsifiability of a theory. However, Popper later came to recognize that the introduction of modifications (immunizations, he called them) was often an important part of scientific development. Responding to surprising or apparently falsifying observations often generated important new scientific insights. Popper’s own example was the observed motion of Uranus which originally did not agree with Newtonian predictions. The ad hoc hypothesis of an outer planet explained the disagreement and led to further falsifiable predictions. Popper sought to reconcile the view by blurring the distinction between the falsifiable and the not falsifiable, and speaking instead of degrees of testability (Popper 1985: 41f.).

From the 1960s on, sustained meta-methodological criticism emerged that drove philosophical focus away from scientific method. A brief look at those criticisms follows, with recommendations for further reading at the end of the entry.

Thomas Kuhn’s The Structure of Scientific Revolutions (1962) begins with a well-known shot across the bow for philosophers of science:

History, if viewed as a repository for more than anecdote or chronology, could produce a decisive transformation in the image of science by which we are now possessed. (1962: 1)

The image Kuhn thought needed transforming was the a-historical, rational reconstruction sought by many of the Logical Positivists, though Carnap and other positivists were actually quite sympathetic to Kuhn’s views. (See the entry on the Vienna Circle.) Kuhn shared with others of his contemporaries, such as Feyerabend and Lakatos, a commitment to a more empirical approach to philosophy of science. Namely, the history of science provides important data, and necessary checks, for philosophy of science, including any theory of scientific method.

The history of science reveals, according to Kuhn, that scientific development occurs in alternating phases. During normal science, the members of the scientific community adhere to the paradigm in place. Their commitment to the paradigm means a commitment to the puzzles to be solved and the acceptable ways of solving them. Confidence in the paradigm remains so long as steady progress is made in solving the shared puzzles. Method in this normal phase operates within a disciplinary matrix (Kuhn’s later concept of a paradigm) which includes standards for problem solving, and defines the range of problems to which the method should be applied. An important part of a disciplinary matrix is the set of values which provide the norms and aims for scientific method. The main values that Kuhn identifies are prediction, problem solving, simplicity, consistency, and plausibility.

An important by-product of normal science is the accumulation of puzzles which cannot be solved with the resources of the current paradigm. Once the accumulation of these anomalies has reached some critical mass, it can trigger a communal shift to a new paradigm and a new phase of normal science. Importantly, the values that provide the norms and aims for scientific method may have transformed in the meantime. Method may therefore be relative to discipline, time or place.

Feyerabend also identified the aim of science as progress, but argued that any methodological prescription would only stifle that progress (Feyerabend 1988). His arguments are grounded in re-examining accepted “myths” about the history of science. Heroes of science, like Galileo, are shown to be just as reliant on rhetoric and persuasion as they are on reason and demonstration. Others, like Aristotle, are shown to be far more reasonable and far-reaching in their outlooks than they are given credit for. As a consequence, the only rule that could provide what he took to be sufficient freedom was the vacuous “anything goes”. More generally, even the methodological restriction that science is the best way to pursue knowledge, and to increase knowledge, is too restrictive. Feyerabend suggested instead that science might, in fact, be a threat to a free society, because it and its myth had become so dominant (Feyerabend 1978).

An even more fundamental kind of criticism was offered by several sociologists of science from the 1970s onwards who rejected the methodology of providing philosophical accounts for the rational development of science and sociological accounts of the irrational mistakes. Instead, they adhered to a symmetry thesis on which any causal explanation of how scientific knowledge is established needs to be symmetrical, explaining truth and falsity, rationality and irrationality, success and mistakes, by the same causal factors (see, e.g., Barnes and Bloor 1982, Bloor 1991). Movements in the Sociology of Science, like the Strong Programme, or studies of the social dimensions and causes of knowledge more generally, led to extended and close examination of detailed case studies in contemporary science and its history. (See the entries on the social dimensions of scientific knowledge and social epistemology.) Well-known examinations by Latour and Woolgar (1979/1986), Knorr-Cetina (1981), Pickering (1984), and Shapin and Schaffer (1985) seem to bear out that it was social ideologies (on a macro-scale) or individual interactions and circumstances (on a micro-scale) which were the primary causal factors in determining which beliefs gained the status of scientific knowledge. As they saw it, therefore, explanatory appeals to scientific method were not empirically grounded.

A late, and largely unexpected, criticism of scientific method came from within science itself. Beginning in the early 2000s, a number of scientists attempting to replicate the results of published experiments found that they could not do so. There may be a close conceptual connection between reproducibility and method. For example, if reproducibility means that the same scientific methods ought to produce the same result, and all scientific results ought to be reproducible, then whatever it takes to reproduce a scientific result ought to be called scientific method. Space limits us to the observation that, insofar as reproducibility is a desired outcome of proper scientific method, it is not strictly a part of scientific method. (See the entry on reproducibility of scientific results.)

By the close of the 20th century the search for the scientific method was flagging. Nola and Sankey (2000b) could introduce their volume on method by remarking that “For some, the whole idea of a theory of scientific method is yester-year’s debate …”.

4. Statistical methods for hypothesis testing

Despite the many difficulties that philosophers encountered in trying to provide a clear methodology of confirmation (or refutation), important progress has nonetheless been made on understanding how observation can provide evidence for a given theory. Work in statistics has been crucial for understanding how theories can be tested empirically, and in recent decades a huge literature has developed that attempts to recast confirmation in Bayesian terms. Here these developments can be covered only briefly, and we refer to the entry on confirmation for further details and references.

Statistics has come to play an increasingly important role in the methodology of the experimental sciences from the 19th century onwards. At that time, statistics and probability theory took on a methodological role as an analysis of inductive inference, and attempts to ground the rationality of induction in the axioms of probability theory have continued throughout the 20th century and into the present. Developments in the theory of statistics itself, meanwhile, have had a direct and immense influence on the experimental method, including methods for measuring the uncertainty of observations such as the Method of Least Squares developed by Legendre and Gauss in the early 19th century, criteria for the rejection of outliers proposed by Peirce by the mid-19th century, and the significance tests developed by Gosset (a.k.a. “Student”), Fisher, Neyman & Pearson and others in the 1920s and 1930s (see, e.g., Swijtink 1987 for a brief historical overview; and also the entry on C.S. Peirce).
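As an illustration of the first of these developments, the Method of Least Squares chooses the parameters of a model so as to minimize the sum of squared residuals between model and observations. Below is a minimal sketch for a straight-line fit; the data are invented for the example, and numpy’s standard least-squares solver stands in for a hand-derived solution of the normal equations.

```python
import numpy as np

# Invented, noisy observations to be fitted by a line y = a + b*x,
# choosing a and b to minimize sum_i (y_i - a - b*x_i)**2.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Design matrix with a column of ones for the intercept term.
X = np.column_stack([np.ones_like(x), x])
(a, b), *_ = np.linalg.lstsq(X, y, rcond=None)
print(f"intercept = {a:.3f}, slope = {b:.3f}")

residuals = y - (a + b * x)
print("sum of squared residuals:", float(np.sum(residuals**2)))
```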

These developments within statistics then in turn led to a reflective discussion among both statisticians and philosophers of science on how to perceive the process of hypothesis testing: whether it was a rigorous statistical inference that could provide a numerical expression of the degree of confidence in the tested hypothesis, or whether it should be seen as a decision between different courses of action that also involved a value component. This led to a major controversy between Fisher on the one side and Neyman and Pearson on the other (see especially Fisher 1955, Neyman 1956 and Pearson 1955, and for analyses of the controversy, e.g., Howie 2002, Marks 2000, Lenhard 2006). On Fisher’s view, hypothesis testing was a methodology for when to accept or reject a statistical hypothesis, namely that a hypothesis should be rejected by evidence if this evidence would be unlikely relative to other possible outcomes, given that the hypothesis was true. In contrast, on Neyman and Pearson’s view, the consequence of error also had to play a role when deciding between hypotheses. Introducing the distinction between the error of rejecting a true hypothesis (type I error) and accepting a false hypothesis (type II error), they argued that the consequences of error determine whether it is more important to avoid rejecting a true hypothesis or to avoid accepting a false one. Hence, Fisher aimed for a theory of inductive inference that enabled a numerical expression of confidence in a hypothesis. To him, the important point was the search for truth, not utility. In contrast, the Neyman-Pearson approach provided a strategy of inductive behaviour for deciding between different courses of action. Here, the important point was not whether a hypothesis was true, but whether one should act as if it was.
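A toy binomial example, with invented numbers, makes the contrast concrete: the Fisher-style calculation reports how improbable data at least as extreme as those observed would be were the null hypothesis true, while the Neyman–Pearson-style procedure fixes the type I error rate in advance, issues an accept/reject decision, and can quote a type II error rate against a specific alternative.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, k = 20, 15        # invented data: 15 heads in 20 coin tosses
p0 = 0.5             # null hypothesis: the coin is fair

# Fisher-style: one-sided p-value under H0.
p_value = sum(binom_pmf(i, n, p0) for i in range(k, n + 1))
print(f"p-value = {p_value:.4f}")        # about 0.0207

# Neyman-Pearson-style: fix alpha, reject iff the observed count lies
# in the rejection region chosen before seeing the data.
alpha = 0.05
c = min(i for i in range(n + 1)
        if sum(binom_pmf(j, n, p0) for j in range(i, n + 1)) <= alpha)
print("reject H0" if k >= c else "accept H0")

# Type II error rate against one specific (invented) alternative p1:
# the probability of accepting H0 when the coin is in fact biased.
p1 = 0.7
beta = sum(binom_pmf(i, n, p1) for i in range(c))
print(f"type II error rate at p = {p1}: {beta:.3f}")
```

Nothing in the Fisher half mentions an alternative hypothesis or the cost of error; nothing in the Neyman–Pearson half reports a graded measure of confidence. That division of labor is the controversy in miniature.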

Similar discussions are found in the philosophical literature. On the one side, Churchman (1948) and Rudner (1953) argued that because scientific hypotheses can never be completely verified, a complete analysis of the methods of scientific inference includes ethical judgments, in which the scientists must decide whether the evidence is sufficiently strong, or the probability sufficiently high, to warrant the acceptance of the hypothesis; this again will depend on the importance of making a mistake in accepting or rejecting the hypothesis. Others, such as Jeffrey (1956) and Levi (1960), disagreed and instead defended a value-neutral view of science, on which scientists should bracket their attitudes, preferences, temperament, and values when assessing the correctness of their inferences. For more details on this value-free ideal in the philosophy of science and its historical development, see Douglas (2009) and Howard (2003). For a broad set of case studies examining the role of values in science, see e.g. Elliott & Richards 2017.

In recent decades, philosophical discussions of the evaluation of probabilistic hypotheses by statistical inference have largely focused on Bayesianism, which understands probability as a measure of a person’s degree of belief in an event, given the available information, and frequentism, which instead understands probability as a long-run frequency of a repeatable event. Hence, for Bayesians probabilities refer to a state of knowledge, whereas for frequentists probabilities refer to frequencies of events (see, e.g., Sober 2008, chapter 1 for a detailed introduction to Bayesianism and frequentism as well as to likelihoodism). Bayesianism aims at providing a quantifiable, algorithmic representation of belief revision, where belief revision is a function of prior beliefs (i.e., background knowledge) and incoming evidence. Bayesianism employs a rule based on Bayes’ theorem, a theorem of the probability calculus which relates conditional probabilities. The probability that a particular hypothesis is true is interpreted as a degree of belief, or credence, of the scientist. There will also be a probability and a degree of belief that a hypothesis will be true conditional on a piece of evidence (an observation, say) being true. Bayesianism prescribes that it is rational for the scientist to update their belief in the hypothesis to that conditional probability should it turn out that the evidence is, in fact, observed (see, e.g., Sprenger & Hartmann 2019 for a comprehensive treatment of Bayesian philosophy of science). Originating in the work of Neyman and Pearson, frequentism aims at providing tools for reducing long-run error rates, such as the error-statistical approach developed by Mayo (1996), which focuses on how experimenters can avoid both type I and type II errors by building up a repertoire of procedures that detect errors if and only if they are present. Both Bayesianism and frequentism have developed over time; they are interpreted in different ways by their various proponents, and their relations to previous criticisms of attempts at defining scientific method are seen differently by proponents and critics. The literature, surveys, reviews and criticism in this area are vast, and the reader is referred to the entries on Bayesian epistemology and confirmation.
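A minimal numerical sketch of Bayesian conditionalization may be helpful; all priors and likelihoods here are invented for illustration.

```python
def conditionalize(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' theorem: posterior credence in H after observing E."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e

# Invented numbers: a modest prior in H, and evidence E which H makes
# five times as likely as not-H does.
credence = 0.2
for _ in range(3):                 # three independent observations of E
    credence = conditionalize(credence, 0.5, 0.1)
    print(round(credence, 3))      # 0.556, 0.862, 0.969
```

Each observation of E raises the credence in H; evidence that H renders less likely than not-H does would lower it. The numbers are the scientist’s degrees of belief, not long-run frequencies, which is exactly where the frequentist demurs.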

5. Method in Practice

Attention to scientific practice, as we have seen, is not itself new. However, the recent turn to practice in the philosophy of science can be seen as a correction to the pessimism with respect to method in philosophy of science in the later parts of the 20th century, and as an attempted reconciliation between sociological and rationalist explanations of scientific knowledge. Much of this work sees method as detailed and context-specific problem-solving procedures, and methodological analyses as at once descriptive, critical and advisory (see Nickles 1987 for an exposition of this view). The following subsections survey some of these practice-focused approaches; from here on, the entry turns fully to topics rather than chronology.

5.1 Creative and exploratory practices

A problem with the distinction between the contexts of discovery and justification that figured so prominently in philosophy of science in the first half of the 20th century (see section 2) is that no such distinction can be clearly seen in scientific activity (see Arabatzis 2006). Thus, in recent decades, it has been recognized that the study of conceptual innovation and change should not be confined to psychology and sociology of science; these are also important aspects of scientific practice which philosophy of science should address (see also the entry on scientific discovery). Looking for the practices that drive conceptual innovation has led philosophers to examine both the reasoning practices of scientists and the wide realm of experimental practices that are not directed narrowly at testing hypotheses, that is, exploratory experimentation.

Examining the reasoning practices of historical and contemporary scientists, Nersessian (2008) has argued that new scientific concepts are constructed as solutions to specific problems by systematic reasoning, and that analogy, visual representation and thought-experimentation are among the important reasoning practices employed. These ubiquitous forms of reasoning are reliable—but also fallible—methods of conceptual development and change. On her account, model-based reasoning consists of cycles of construction, simulation, evaluation and adaptation of models that serve as interim interpretations of the target problem to be solved. Often, this process will lead to modifications or extensions of the model, and a new cycle of simulation and evaluation. However, Nersessian also emphasizes that

creative model-based reasoning cannot be applied as a simple recipe, is not always productive of solutions, and even its most exemplary usages can lead to incorrect solutions. (Nersessian 2008: 11)

Thus, although she agrees with many previous philosophers that there is no logic of discovery, discoveries can nonetheless derive from reasoned processes, such that a large and integral part of scientific practice is

the creation of concepts through which to comprehend, structure, and communicate about physical phenomena …. (Nersessian 1987: 11)

Similarly, work on heuristics for discovery and theory construction by scholars such as Darden (1991) and Bechtel & Richardson (1993) presents science as problem solving and investigates scientific problem solving as a special case of problem-solving in general. Drawing largely on cases from the biological sciences, much of their focus has been on reasoning strategies for the generation, evaluation, and revision of mechanistic explanations of complex systems.

Addressing another aspect of the context distinction, namely the traditional view that the primary role of experiments is to test theoretical hypotheses according to the H-D model, other philosophers of science have argued for additional roles that experiments can play. The notion of exploratory experimentation was introduced to describe experiments driven by the desire to obtain empirical regularities and to develop concepts and classifications in which these regularities can be described (Steinle 1997, 2002; Burian 1997; Waters 2007). However, the difference between theory-driven experimentation and exploratory experimentation should not be seen as a sharp distinction. Theory-driven experiments are not always directed at testing hypotheses, but may also be directed at various kinds of fact-gathering, such as determining numerical parameters. Vice versa, exploratory experiments are usually informed by theory in various ways and are therefore not theory-free. Instead, in exploratory experiments phenomena are investigated without first limiting the possible outcomes of the experiment on the basis of extant theory about the phenomena.

The development of high-throughput instrumentation in molecular biology and neighbouring fields has given rise to a special type of exploratory experimentation that collects and analyses very large amounts of data. These new ‘omics’ disciplines are often said to represent a break with the ideal of hypothesis-driven science (Burian 2007; Elliott 2007; Waters 2007; O’Malley 2007) and are instead described as data-driven research (Leonelli 2012; Strasser 2012), or as a special kind of “convenience experimentation” in which many experiments are done simply because they are extraordinarily convenient to perform (Krohs 2012).

5.2 Computer methods and ‘new ways’ of doing science

The field of omics just described is possible because of the ability of computers to process, in a reasonable amount of time, the huge quantities of data required. Computers allow for more elaborate experimentation (higher speed, better filtering, more variables, sophisticated coordination and control), but also, through modelling and simulations, might constitute a form of experimentation themselves. Here, too, we can pose a version of the general question of method versus practice: does the practice of using computers fundamentally change scientific method, or merely provide a more efficient means of implementing standard methods?

Because computers can be used to automate measurements, quantifications, calculations, and statistical analyses where, for practical reasons, these operations cannot be otherwise carried out, many of the steps involved in reaching a conclusion on the basis of an experiment are now made inside a “black box”, without the direct involvement or awareness of a human. This has epistemological implications, regarding what we can know, and how we can know it. To have confidence in the results, computer methods are therefore subjected to tests of verification and validation.

The distinction between verification and validation is easiest to characterize in the case of computer simulations. In a typical computer simulation scenario computers are used to numerically integrate differential equations for which no analytic solution is available. The equations are part of the model the scientist uses to represent a phenomenon or system under investigation. Verifying a computer simulation means checking that the equations of the model are being correctly approximated. Validating a simulation means checking that the equations of the model are adequate for the inferences one wants to make on the basis of that model.
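A toy verification check may help fix the distinction. Here the model equation dy/dt = -y has the known analytic solution e^(-t), so the numerical approximation can be tested directly; the integrator and step sizes are invented for illustration. Validation, by contrast, would require comparing the model’s output against measurements of the target system, which no amount of in-code checking can supply.

```python
import math

def euler(f, y0, t_end, n_steps):
    """Fixed-step forward-Euler integration of dy/dt = f(t, y) from t = 0."""
    t, y = 0.0, y0
    h = t_end / n_steps
    for _ in range(n_steps):
        y += h * f(t, y)
        t += h
    return y

f = lambda t, y: -y            # the model equation dy/dt = -y
exact = math.exp(-1.0)         # analytic solution y(1) when y(0) = 1

# Verification: the numerical result should converge to the analytic
# solution as the step size shrinks.
for n in (10, 100, 1000):
    approx = euler(f, 1.0, 1.0, n)
    print(f"{n:5d} steps: error = {abs(approx - exact):.6f}")
# The error shrinks roughly in proportion to the step size, as expected
# for a first-order method; this tests the approximation, not the model.
```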

A number of issues related to computer simulations have been raised. The identification of verification and validation as the testing methods has been criticized. Oreskes et al. (1994) raise concerns that “validation”, because it suggests deductive inference, might lead to over-confidence in the results of simulations. The distinction itself is probably too clean, since actual practice in the testing of simulations mixes and moves back and forth between the two (Weissart 1997; Parker 2008a; Winsberg 2010). Computer simulations do seem to have a non-inductive character, given that the principles by which they operate are built in by the programmers, and any results of the simulation follow from those in-built principles in such a way that those results could, in principle, be deduced from the program code and its inputs. The status of simulations as experiments has therefore been examined (Kaufmann and Smarr 1993; Humphreys 1995; Hughes 1999; Norton and Suppe 2001). This literature considers the epistemology of these experiments: what we can learn by simulation, and also the kinds of justifications which can be given in applying that knowledge to the “real” world (Mayo 1996; Parker 2008b). As pointed out above, part of the advantage of computer simulation derives from the fact that huge numbers of calculations can be carried out without requiring direct observation by the experimenter/simulator. At the same time, many of these calculations are approximations to the calculations which would be performed first-hand in an ideal situation. Both factors introduce uncertainties into the inferences drawn from what is observed in the simulation.

For many of the reasons described above, computer simulations do not seem to belong clearly to either the experimental or the theoretical domain. Rather, they seem to crucially involve aspects of both. This has led some authors, such as Fox Keller (2003: 200), to argue that we ought to consider computer simulation a “qualitatively different way of doing science”. The literature in general tends to follow Kaufmann and Smarr (1993) in referring to computer simulation as a “third way” for scientific methodology (theoretical reasoning and experimental practice are the first two ways). It should also be noted that the debates around these issues have tended to focus on the form of computer simulation typical in the physical sciences, where models are based on dynamical equations. Other forms of simulation might not have the same problems, or might have problems of their own (see the entry on computer simulations in science).

In recent years, the rapid development of machine learning techniques has prompted some scholars to suggest that the scientific method has become “obsolete” (Anderson 2008, Carrol and Goodstein 2009). This has resulted in an intense debate on the relative merits of data-driven and hypothesis-driven research (for samples, see e.g. Mazzocchi 2015 or Succi and Coveney 2018). For a detailed treatment of this topic, we refer to the entry on scientific research and big data.

6. Discourse on scientific method

Despite philosophical disagreements, the idea of the scientific method still figures prominently in contemporary discourse on many different topics, both within science and in society at large. Often, reference to scientific method is used in ways that either convey the legend of a single, universal method characteristic of all science, or grant a particular method or set of methods privileged status as a special ‘gold standard’, often with reference to particular philosophers to vindicate the claims. Discourse on scientific method also typically arises when there is a need to distinguish between science and other activities, or to justify the special status conveyed to science. In these areas, the philosophical attempts at identifying a set of methods characteristic of scientific endeavors are closely related to the philosophy of science’s classical problem of demarcation (see the entry on science and pseudo-science) and to the philosophical analysis of the social dimension of scientific knowledge and the role of science in democratic society.

One of the settings in which the legend of a single, universal scientific method has been particularly strong is science education (see, e.g., Bauer 1992; McComas 1996; Wivagg & Allchin 2002). [5] Often, ‘the scientific method’ is presented in textbooks and educational web pages as a fixed four- or five-step procedure, starting from observation and description of a phenomenon, progressing through the formulation of a hypothesis which explains the phenomenon, designing and conducting experiments to test the hypothesis, and analyzing the results, and ending with drawing a conclusion. Such references to a universal scientific method can be found in educational material at all levels of science education (Blachowicz 2009), and numerous studies have shown that the idea of a general and universal scientific method often forms part of both students’ and teachers’ conception of science (see, e.g., Aikenhead 1987; Osborne et al. 2003). In response, it has been argued that science education needs to focus more on teaching about the nature of science, although views have differed on whether this is best done through student-led investigations, contemporary cases, or historical cases (Allchin, Andersen & Nielsen 2014).

Although occasionally phrased with reference to the H-D method, important historical roots of the legend in science education of a single, universal scientific method are the American philosopher and psychologist John Dewey’s account of inquiry in How We Think (1910) and the British mathematician Karl Pearson’s account of science in The Grammar of Science (1892). On Dewey’s account, inquiry is divided into the five steps of

(i) a felt difficulty, (ii) its location and definition, (iii) suggestion of a possible solution, (iv) development by reasoning of the bearing of the suggestions, (v) further observation and experiment leading to its acceptance or rejection. (Dewey 1910: 72)

Similarly, on Pearson’s account, scientific investigations start with measurement of data and observation of their correlation and sequence, from which scientific laws can be discovered with the aid of creative imagination. These laws have to be subject to criticism, and their final acceptance will have equal validity for “all normally constituted minds”. Both Dewey’s and Pearson’s accounts should be seen as generalized abstractions of inquiry and not restricted to the realm of science—although both Dewey and Pearson referred to their respective accounts as ‘the scientific method’.

Occasionally, scientists make sweeping statements about a simple and distinct scientific method, as exemplified by Feynman’s simplified version of a conjectures-and-refutations method presented, for example, in the last of his 1964 Cornell Messenger lectures. [6] However, just as often scientists have come to the same conclusion as recent philosophy of science: that there is not any unique, easily described scientific method. For example, the physicist and Nobel laureate Steven Weinberg described in the paper “The Methods of Science … And Those By Which We Live” (1995) how

The fact that the standards of scientific success shift with time does not only make the philosophy of science difficult; it also raises problems for the public understanding of science. We do not have a fixed scientific method to rally around and defend. (1995: 8)

Interview studies with scientists on their conception of method show that scientists often find it hard to figure out whether available evidence confirms their hypothesis, and that there are no direct translations between general ideas about method and specific strategies to guide how research is conducted (Schickore & Hangel 2019, Hangel & Schickore 2017).

Reference to the scientific method has also often been used to argue for the scientific nature or special status of a particular activity. Philosophical positions that argue for a simple and unique scientific method as a criterion of demarcation, such as Popperian falsification, have often attracted practitioners who felt that they had a need to defend their domain of practice. For example, references to conjectures and refutation as the scientific method are abundant in much of the literature on complementary and alternative medicine (CAM)—alongside the competing position that CAM, as an alternative to conventional biomedicine, needs to develop its own methodology different from that of science.

Also within mainstream science, reference to the scientific method is used in arguments regarding the internal hierarchy of disciplines and domains. A frequently seen argument is that research based on the H-D method is superior to research based on induction from observations because in deductive inferences the conclusion follows necessarily from the premises. (See, e.g., Parascandola 1998 for an analysis of how this argument has been used to downgrade epidemiology compared to the laboratory sciences.) Similarly, based on an examination of the practices of major funding institutions such as the National Institutes of Health (NIH), the National Science Foundation (NSF) and the Biotechnology and Biological Sciences Research Council (BBSRC) in the UK, O’Malley et al. (2009) have argued that funding agencies seem to have a tendency to adhere to the view that the primary activity of science is to test hypotheses, while descriptive and exploratory research is seen as merely a preparatory activity that is valuable only insofar as it fuels hypothesis-driven research.

In some areas of science, scholarly publications are structured in a way that may convey the impression of a neat and linear process of inquiry, from stating a question, through devising the methods by which to answer it and collecting the data, to drawing a conclusion from the analysis of the data. For example, the codified format of publications in most biomedical journals, known as the IMRAD format (Introduction, Methods, Results, And Discussion), is explicitly described by the journal editors as “not an arbitrary publication format but rather a direct reflection of the process of scientific discovery” (see the so-called “Vancouver Recommendations”, ICMJE 2013: 11). However, scientific publications do not in general reflect the process by which the reported scientific results were produced. For example, under the provocative title “Is the scientific paper a fraud?”, Medawar argued that scientific papers generally misrepresent how the results were produced (Medawar 1963/1996). Similar views have been advanced by philosophers, historians and sociologists of science (Gilbert 1976; Holmes 1987; Knorr-Cetina 1981; Schickore 2008; Suppe 1998) who have argued that scientists’ experimental practices are messy and often do not follow any recognizable pattern. Publications of research results, they argue, are retrospective reconstructions of these activities that often do not preserve the temporal order or the logic of these activities, but are instead often constructed in order to screen off potential criticism (see Schickore 2008 for a review of this work).

Philosophical positions on the scientific method have also made it into the courtroom, especially in the US, where judges have drawn on philosophy of science in deciding when to confer special status on scientific expert testimony. A key case is Daubert v. Merrell Dow Pharmaceuticals (92–102, 509 U.S. 579, 1993). In this case, the Supreme Court argued in its 1993 ruling that trial judges must ensure that expert testimony is reliable, and that in doing this the court must look at the expert’s methodology to determine whether the proffered evidence is actually scientific knowledge. Further, referring to works of Popper and Hempel, the court stated that

ordinarily, a key question to be answered in determining whether a theory or technique is scientific knowledge … is whether it can be (and has been) tested. (Justice Blackmun, Daubert v. Merrell Dow Pharmaceuticals; see Other Internet Resources for a link to the opinion)

But as argued by Haack (2005a,b, 2010) and by Foster & Huber (1999), by equating the question of whether a piece of testimony is reliable with the question of whether it is scientific, as indicated by a special methodology, the court produced an inconsistent mixture of Popper’s and Hempel’s philosophies, and this has later led to considerable confusion in subsequent case rulings that drew on the Daubert case (see Haack 2010 for a detailed exposition).

The difficulties around identifying the methods of science are also reflected in the difficulties of identifying scientific misconduct in the form of improper application of the method or methods of science. One of the first and most influential attempts at defining misconduct in science was the US definition from 1989 that defined misconduct as

fabrication, falsification, plagiarism, or other practices that seriously deviate from those that are commonly accepted within the scientific community . (Code of Federal Regulations, part 50, subpart A., August 8, 1989, italics added)

However, the “other practices that seriously deviate” clause was heavily criticized because it could be used to suppress creative or novel science. For example, the National Academy of Sciences stated in its report Responsible Science (1992) that it

wishes to discourage the possibility that a misconduct complaint could be lodged against scientists based solely on their use of novel or unorthodox research methods. (NAS: 27)

This clause was therefore later removed from the definition. For an entry into the key philosophical literature on conduct in science, see Shamoo & Resnik (2009).

7. Conclusion

The question of the source of the success of science has been at the core of philosophy since the beginning of modern science. If viewed as a matter of epistemology more generally, scientific method is a part of the entire history of philosophy. Over that time, science and whatever methods its practitioners may employ have changed dramatically. Today, many philosophers have taken up the banners of pluralism or of practice to focus on what are, in effect, fine-grained and contextually limited examinations of scientific method. Others hope to shift perspectives in order to provide a renewed general account of what characterizes the activity we call science.

One such perspective has been offered recently by Hoyningen-Huene (2008, 2013), who argues from the history of philosophy of science that after three lengthy phases of characterizing science by its method, we are now in a phase where the belief in the existence of a positive scientific method has eroded and what has been left to characterize science is only its fallibility. First was a phase from Plato and Aristotle up until the 17th century where the specificity of scientific knowledge was seen in its absolute certainty established by proof from evident axioms; next was a phase up to the mid-19th century in which the means to establish the certainty of scientific knowledge had been generalized to include inductive procedures as well. In the third phase, which lasted until the last decades of the 20th century, it was recognized that empirical knowledge was fallible, but it was still granted a special status due to its distinctive mode of production. But now in the fourth phase, according to Hoyningen-Huene, historical and philosophical studies have shown how “scientific methods with the characteristics as posited in the second and third phase do not exist” (2008: 168) and there is no longer any consensus among philosophers and historians of science about the nature of science. For Hoyningen-Huene, this is too negative a stance, and he therefore urges that the question about the nature of science be raised anew. His own answer to this question is that “scientific knowledge differs from other kinds of knowledge, especially everyday knowledge, primarily by being more systematic” (Hoyningen-Huene 2013: 14). Systematicity can have several different dimensions: among them are more systematic descriptions, explanations, predictions, defense of knowledge claims, epistemic connectedness, ideal of completeness, knowledge generation, representation of knowledge and critical discourse. Hence, what characterizes science is the greater care in excluding possible alternative explanations, the more detailed elaboration with respect to data on which predictions are based, the greater care in detecting and eliminating sources of error, the more articulate connections to other pieces of knowledge, etc. On this position, what characterizes science is not that the methods employed are unique to science, but that the methods are more carefully employed.

Another, similar approach has been offered by Haack (2003). Like Hoyningen-Huene, she sets off from a dissatisfaction with the recent clash between what she calls Old Deferentialism and New Cynicism. The Old Deferentialist position is that science progressed inductively, by accumulating true theories confirmed by empirical evidence, or deductively, by testing conjectures against basic statements; the New Cynics’ position is that science has no epistemic authority and no uniquely rational method and is merely politics. Haack insists that, contrary to the views of the New Cynics, there are objective epistemic standards, and there is something epistemologically special about science, even though the Old Deferentialists pictured this in the wrong way. Instead, she offers a new Critical Commonsensist account on which standards of good, strong, supportive evidence and well-conducted, honest, thorough and imaginative inquiry are not exclusive to the sciences, but are the standards by which we judge all inquirers. In this sense, science does not differ in kind from other kinds of inquiry, but it may differ in the degree to which it requires broad and detailed background knowledge and a familiarity with a technical vocabulary that only specialists may possess.

  • Aikenhead, G.S., 1987, “High-school graduates’ beliefs about science-technology-society. III. Characteristics and limitations of scientific knowledge”, Science Education , 71(4): 459–487.
  • Allchin, D., H.M. Andersen and K. Nielsen, 2014, “Complementary Approaches to Teaching Nature of Science: Integrating Student Inquiry, Historical Cases, and Contemporary Cases in Classroom Practice”, Science Education , 98: 461–486.
  • Anderson, C., 2008, “The end of theory: The data deluge makes the scientific method obsolete”, Wired magazine, 16(7).
  • Arabatzis, T., 2006, “On the inextricability of the context of discovery and the context of justification”, in Revisiting Discovery and Justification , J. Schickore and F. Steinle (eds.), Dordrecht: Springer, pp. 215–230.
  • Barnes, J. (ed.), 1984, The Complete Works of Aristotle, Vols I and II , Princeton: Princeton University Press.
  • Barnes, B. and D. Bloor, 1982, “Relativism, Rationalism, and the Sociology of Knowledge”, in Rationality and Relativism , M. Hollis and S. Lukes (eds.), Cambridge: MIT Press, pp. 1–20.
  • Bauer, H.H., 1992, Scientific Literacy and the Myth of the Scientific Method , Urbana: University of Illinois Press.
  • Bechtel, W. and R.C. Richardson, 1993, Discovering complexity , Princeton, NJ: Princeton University Press.
  • Berkeley, G., 1734, The Analyst in De Motu and The Analyst: A Modern Edition with Introductions and Commentary , D. Jesseph (trans. and ed.), Dordrecht: Kluwer Academic Publishers, 1992.
  • Blachowicz, J., 2009, “How science textbooks treat scientific method: A philosopher’s perspective”, The British Journal for the Philosophy of Science , 60(2): 303–344.
  • Bloor, D., 1991, Knowledge and Social Imagery , Chicago: University of Chicago Press, 2nd edition.
  • Boyle, R., 1682, New experiments physico-mechanical, touching the air , Printed by Miles Flesher for Richard Davis, bookseller in Oxford.
  • Bridgman, P.W., 1927, The Logic of Modern Physics , New York: Macmillan.
  • –––, 1956, “The Methodological Character of Theoretical Concepts”, in The Foundations of Science and the Concepts of Science and Psychology , Herbert Feigl and Michael Scriven (eds.), Minneapolis: University of Minnesota Press, pp. 38–76.
  • Burian, R., 1997, “Exploratory Experimentation and the Role of Histochemical Techniques in the Work of Jean Brachet, 1938–1952”, History and Philosophy of the Life Sciences , 19(1): 27–45.
  • –––, 2007, “On microRNA and the need for exploratory experimentation in post-genomic molecular biology”, History and Philosophy of the Life Sciences , 29(3): 285–311.
  • Carnap, R., 1928, Der logische Aufbau der Welt , Berlin: Bernary, transl. by R.A. George, The Logical Structure of the World , Berkeley: University of California Press, 1967.
  • –––, 1956, “The methodological character of theoretical concepts”, Minnesota studies in the philosophy of science , 1: 38–76.
  • Carroll, S., and D. Goodstein, 2009, “Defining the scientific method”, Nature Methods , 6: 237.
  • Churchman, C.W., 1948, “Science, Pragmatics, Induction”, Philosophy of Science , 15(3): 249–268.
  • Cooper, J. (ed.), 1997, Plato: Complete Works , Indianapolis: Hackett.
  • Darden, L., 1991, Theory Change in Science: Strategies from Mendelian Genetics , Oxford: Oxford University Press.
  • Dewey, J., 1910, How we think , New York: Dover Publications (reprinted 1997).
  • Douglas, H., 2009, Science, Policy, and the Value-Free Ideal , Pittsburgh: University of Pittsburgh Press.
  • Dupré, J., 2004, “Miracle of Monism”, in Naturalism in Question , Mario De Caro and David Macarthur (eds.), Cambridge, MA: Harvard University Press, pp. 36–58.
  • Elliott, K.C., 2007, “Varieties of exploratory experimentation in nanotoxicology”, History and Philosophy of the Life Sciences , 29(3): 311–334.
  • Elliott, K. C., and T. Richards (eds.), 2017, Exploring inductive risk: Case studies of values in science , Oxford: Oxford University Press.
  • Falcon, Andrea, 2005, Aristotle and the science of nature: Unity without uniformity , Cambridge: Cambridge University Press.
  • Feyerabend, P., 1978, Science in a Free Society , London: New Left Books.
  • –––, 1988, Against Method , London: Verso, 2nd edition.
  • Fisher, R.A., 1955, “Statistical Methods and Scientific Induction”, Journal of The Royal Statistical Society. Series B (Methodological) , 17(1): 69–78.
  • Foster, K. and P.W. Huber, 1999, Judging Science. Scientific Knowledge and the Federal Courts , Cambridge: MIT Press.
  • Fox Keller, E., 2003, “Models, Simulation, and ‘computer experiments’”, in The Philosophy of Scientific Experimentation , H. Radder (ed.), Pittsburgh: Pittsburgh University Press, 198–215.
  • Gilbert, G., 1976, “The transformation of research findings into scientific knowledge”, Social Studies of Science , 6: 281–306.
  • Gimbel, S., 2011, Exploring the Scientific Method , Chicago: University of Chicago Press.
  • Goodman, N., 1965, Fact, Fiction, and Forecast , Indianapolis: Bobbs-Merrill.
  • Haack, S., 1995, “Science is neither sacred nor a confidence trick”, Foundations of Science , 1(3): 323–335.
  • –––, 2003, Defending science—within reason , Amherst: Prometheus.
  • –––, 2005a, “Disentangling Daubert: an epistemological study in theory and practice”, Journal of Philosophy, Science and Law , 5, available online. doi:10.5840/jpsl2005513
  • –––, 2005b, “Trial and error: The Supreme Court’s philosophy of science”, American Journal of Public Health , 95: S66–S73.
  • –––, 2010, “Federal Philosophy of Science: A Deconstruction-and a Reconstruction”, NYUJL & Liberty , 5: 394.
  • Hangel, N. and J. Schickore, 2017, “Scientists’ conceptions of good research practice”, Perspectives on Science , 25(6): 766–791.
  • Harper, W.L., 2011, Isaac Newton’s Scientific Method: Turning Data into Evidence about Gravity and Cosmology , Oxford: Oxford University Press.
  • Hempel, C., 1950, “Problems and Changes in the Empiricist Criterion of Meaning”, Revue Internationale de Philosophie , 41(11): 41–63.
  • –––, 1951, “The Concept of Cognitive Significance: A Reconsideration”, Proceedings of the American Academy of Arts and Sciences , 80(1): 61–77.
  • –––, 1965, Aspects of scientific explanation and other essays in the philosophy of science , New York–London: Free Press.
  • –––, 1966, Philosophy of Natural Science , Englewood Cliffs: Prentice-Hall.
  • Holmes, F.L., 1987, “Scientific writing and scientific discovery”, Isis , 78(2): 220–235.
  • Howard, D., 2003, “Two left turns make a right: On the curious political career of North American philosophy of science at midcentury”, in Logical Empiricism in North America , G.L. Hardcastle & A.W. Richardson (eds.), Minneapolis: University of Minnesota Press, pp. 25–93.
  • Hoyningen-Huene, P., 2008, “Systematicity: The nature of science”, Philosophia , 36(2): 167–180.
  • –––, 2013, Systematicity. The Nature of Science , Oxford: Oxford University Press.
  • Howie, D., 2002, Interpreting probability: Controversies and developments in the early twentieth century , Cambridge: Cambridge University Press.
  • Hughes, R., 1999, “The Ising Model, Computer Simulation, and Universal Physics”, in Models as Mediators , M. Morgan and M. Morrison (eds.), Cambridge: Cambridge University Press, pp. 97–145.
  • Hume, D., 1739, A Treatise of Human Nature , D. Fate Norton and M.J. Norton (eds.), Oxford: Oxford University Press, 2000.
  • Humphreys, P., 1995, “Computational science and scientific method”, Minds and Machines , 5(1): 499–512.
  • ICMJE, 2013, “Recommendations for the Conduct, Reporting, Editing, and Publication of Scholarly Work in Medical Journals”, International Committee of Medical Journal Editors, available online, accessed August 13, 2014.
  • Jeffrey, R.C., 1956, “Valuation and Acceptance of Scientific Hypotheses”, Philosophy of Science , 23(3): 237–246.
  • Kaufmann, W.J., and L.L. Smarr, 1993, Supercomputing and the Transformation of Science , New York: Scientific American Library.
  • Knorr-Cetina, K., 1981, The Manufacture of Knowledge , Oxford: Pergamon Press.
  • Krohs, U., 2012, “Convenience experimentation”, Studies in History and Philosophy of Biological and Biomedical Sciences , 43: 52–57.
  • Kuhn, T.S., 1962, The Structure of Scientific Revolutions , Chicago: University of Chicago Press.
  • Latour, B. and S. Woolgar, 1986, Laboratory Life: The Construction of Scientific Facts , Princeton: Princeton University Press, 2nd edition.
  • Laudan, L., 1968, “Theories of scientific method from Plato to Mach”, History of Science , 7(1): 1–63.
  • Lenhard, J., 2006, “Models and statistical inference: The controversy between Fisher and Neyman-Pearson”, The British Journal for the Philosophy of Science , 57(1): 69–91.
  • Leonelli, S., 2012, “Making Sense of Data-Driven Research in the Biological and the Biomedical Sciences”, Studies in the History and Philosophy of the Biological and Biomedical Sciences , 43(1): 1–3.
  • Levi, I., 1960, “Must the scientist make value judgments?”, The Journal of Philosophy , 57(11): 345–357.
  • Lindley, D., 1991, Theory Change in Science: Strategies from Mendelian Genetics , Oxford: Oxford University Press.
  • Lipton, P., 2004, Inference to the Best Explanation , London: Routledge, 2nd edition.
  • Marks, H.M., 2000, The progress of experiment: science and therapeutic reform in the United States, 1900–1990 , Cambridge: Cambridge University Press.
  • Mazzochi, F., 2015, “Could Big Data be the end of theory in science?”, EMBO reports , 16: 1250–1255.
  • Mayo, D.G., 1996, Error and the Growth of Experimental Knowledge , Chicago: University of Chicago Press.
  • McComas, W.F., 1996, “Ten myths of science: Reexamining what we think we know about the nature of science”, School Science and Mathematics , 96(1): 10–16.
  • Medawar, P.B., 1963/1996, “Is the scientific paper a fraud?”, in The Strange Case of the Spotted Mouse and Other Classic Essays on Science , Oxford: Oxford University Press, 33–39.
  • Mill, J.S., 1963, Collected Works of John Stuart Mill , J.M. Robson (ed.), Toronto: University of Toronto Press.
  • NAS, 1992, Responsible Science: Ensuring the integrity of the research process , Washington DC: National Academy Press.
  • Nersessian, N.J., 1987, “A cognitive-historical approach to meaning in scientific theories”, in The process of science , N. Nersessian (ed.), Berlin: Springer, pp. 161–177.
  • –––, 2008, Creating Scientific Concepts , Cambridge: MIT Press.
  • Newton, I., 1726, Philosophiae naturalis Principia Mathematica (3rd edition), in The Principia: Mathematical Principles of Natural Philosophy: A New Translation , I.B. Cohen and A. Whitman (trans.), Berkeley: University of California Press, 1999.
  • –––, 1704, Opticks or A Treatise of the Reflections, Refractions, Inflections & Colors of Light , New York: Dover Publications, 1952.
  • Neyman, J., 1956, “Note on an Article by Sir Ronald Fisher”, Journal of the Royal Statistical Society. Series B (Methodological) , 18: 288–294.
  • Nickles, T., 1987, “Methodology, heuristics, and rationality”, in Rational changes in science: Essays on Scientific Reasoning , J.C. Pitt (ed.), Berlin: Springer, pp. 103–132.
  • Nicod, J., 1924, Le problème logique de l’induction , Paris: Alcan. (Engl. transl. “The Logical Problem of Induction”, in Foundations of Geometry and Induction , London: Routledge, 2000.)
  • Nola, R. and H. Sankey, 2000a, “A selective survey of theories of scientific method”, in Nola and Sankey 2000b: 1–65.
  • –––, 2000b, After Popper, Kuhn and Feyerabend. Recent Issues in Theories of Scientific Method , London: Springer.
  • –––, 2007, Theories of Scientific Method , Stocksfield: Acumen.
  • Norton, S., and F. Suppe, 2001, “Why atmospheric modeling is good science”, in Changing the Atmosphere: Expert Knowledge and Environmental Governance , C. Miller and P. Edwards (eds.), Cambridge, MA: MIT Press, 88–133.
  • O’Malley, M., 2007, “Exploratory experimentation and scientific practice: Metagenomics and the proteorhodopsin case”, History and Philosophy of the Life Sciences , 29(3): 337–360.
  • O’Malley, M., C. Haufe, K. Elliot, and R. Burian, 2009, “Philosophies of Funding”, Cell , 138: 611–615.
  • Oreskes, N., K. Shrader-Frechette, and K. Belitz, 1994, “Verification, Validation and Confirmation of Numerical Models in the Earth Sciences”, Science , 263(5147): 641–646.
  • Osborne, J., S. Simon, and S. Collins, 2003, “Attitudes towards science: a review of the literature and its implications”, International Journal of Science Education , 25(9): 1049–1079.
  • Parascandola, M., 1998, “Epidemiology—2nd-Rate Science”, Public Health Reports , 113(4): 312–320.
  • Parker, W., 2008a, “Franklin, Holmes and the Epistemology of Computer Simulation”, International Studies in the Philosophy of Science , 22(2): 165–83.
  • –––, 2008b, “Computer Simulation through an Error-Statistical Lens”, Synthese , 163(3): 371–84.
  • Pearson, K., 1892, The Grammar of Science , London: J.M. Dent and Sons, 1951.
  • Pearson, E.S., 1955, “Statistical Concepts in Their Relation to Reality”, Journal of the Royal Statistical Society , B, 17: 204–207.
  • Pickering, A., 1984, Constructing Quarks: A Sociological History of Particle Physics , Edinburgh: Edinburgh University Press.
  • Popper, K.R., 1959, The Logic of Scientific Discovery , London: Routledge, 2002.
  • –––, 1963, Conjectures and Refutations , London: Routledge, 2002.
  • –––, 1985, Unended Quest: An Intellectual Autobiography , La Salle: Open Court Publishing Co.
  • Rudner, R., 1953, “The Scientist Qua Scientist Making Value Judgments”, Philosophy of Science , 20(1): 1–6.
  • Rudolph, J.L., 2005, “Epistemology for the masses: The origin of ‘The Scientific Method’ in American Schools”, History of Education Quarterly , 45(3): 341–376.
  • Schickore, J., 2008, “Doing science, writing science”, Philosophy of Science , 75: 323–343.
  • Schickore, J. and N. Hangel, 2019, “‘It might be this, it should be that…’ uncertainty and doubt in day-to-day science practice”, European Journal for Philosophy of Science , 9(2): 31. doi:10.1007/s13194-019-0253-9
  • Shamoo, A.E. and D.B. Resnik, 2009, Responsible Conduct of Research , Oxford: Oxford University Press.
  • Shank, J.B., 2008, The Newton Wars and the Beginning of the French Enlightenment , Chicago: The University of Chicago Press.
  • Shapin, S. and S. Schaffer, 1985, Leviathan and the air-pump , Princeton: Princeton University Press.
  • Smith, G.E., 2002, “The Methodology of the Principia”, in The Cambridge Companion to Newton , I.B. Cohen and G.E. Smith (eds.), Cambridge: Cambridge University Press, 138–173.
  • Snyder, L.J., 1997a, “Discoverers’ Induction”, Philosophy of Science , 64: 580–604.
  • –––, 1997b, “The Mill-Whewell Debate: Much Ado About Induction”, Perspectives on Science , 5: 159–198.
  • –––, 1999, “Renovating the Novum Organum: Bacon, Whewell and Induction”, Studies in History and Philosophy of Science , 30: 531–557.
  • Sober, E., 2008, Evidence and Evolution: The Logic Behind the Science , Cambridge: Cambridge University Press.
  • Sprenger, J. and S. Hartmann, 2019, Bayesian philosophy of science , Oxford: Oxford University Press.
  • Steinle, F., 1997, “Entering New Fields: Exploratory Uses of Experimentation”, Philosophy of Science (Proceedings), 64: S65–S74.
  • –––, 2002, “Experiments in History and Philosophy of Science”, Perspectives on Science , 10(4): 408–432.
  • Strasser, B.J., 2012, “Data-driven sciences: From wonder cabinets to electronic databases”, Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences , 43(1): 85–87.
  • Succi, S. and P.V. Coveney, 2018, “Big data: the end of the scientific method?”, Philosophical Transactions of the Royal Society A , 377: 20180145. doi:10.1098/rsta.2018.0145
  • Suppe, F., 1998, “The Structure of a Scientific Paper”, Philosophy of Science , 65(3): 381–405.
  • Swijtink, Z.G., 1987, “The objectification of observation: Measurement and statistical methods in the nineteenth century”, in The probabilistic revolution. Ideas in History, Vol. 1 , L. Kruger (ed.), Cambridge MA: MIT Press, pp. 261–285.
  • Waters, C.K., 2007, “The nature and context of exploratory experimentation: An introduction to three case studies of exploratory research”, History and Philosophy of the Life Sciences , 29(3): 275–284.
  • Weinberg, S., 1995, “The methods of science… and those by which we live”, Academic Questions , 8(2): 7–13.
  • Weissert, T., 1997, The Genesis of Simulation in Dynamics: Pursuing the Fermi-Pasta-Ulam Problem , New York: Springer Verlag.
  • William Harvey, 1628, Exercitatio Anatomica de Motu Cordis et Sanguinis in Animalibus , in On the Motion of the Heart and Blood in Animals , R. Willis (trans.), Buffalo: Prometheus Books, 1993.
  • Winsberg, E., 2010, Science in the Age of Computer Simulation , Chicago: University of Chicago Press.
  • Wivagg, D. & D. Allchin, 2002, “The Dogma of the Scientific Method”, The American Biology Teacher , 64(9): 645–646.
Copyright © 2021 by Brian Hepburn and Hanne Andersen.

Experimental Design: Definition and Types

By Jim Frost

What is Experimental Design?

An experimental design is a detailed plan for collecting and using data to identify causal relationships. Through careful planning, the design of experiments allows your data collection efforts to have a reasonable chance of detecting effects and testing hypotheses that answer your research questions.

An experiment is a data collection procedure that occurs in controlled conditions to identify and understand causal relationships between variables. Researchers can use many potential designs. The ultimate choice depends on their research question, resources, goals, and constraints. In some fields of study, researchers refer to experimental design as the design of experiments (DOE); the two terms are synonymous.


Ultimately, the design of experiments helps ensure that your procedures and data will evaluate your research question effectively. Without an experimental design, you might waste your efforts in a process that, for many potential reasons, can’t answer your research question. In short, it helps you trust your results.

Learn more about Independent and Dependent Variables .

Design of Experiments: Goals & Settings

Experiments occur in many settings, including psychology, the social sciences, medicine, physics, engineering, and the industrial and service sectors. Typically, experimental goals are to discover a previously unknown effect, confirm a known effect, or test a hypothesis.

Effects represent causal relationships between variables. For example, in a medical experiment, does the new medicine cause an improvement in health outcomes? If so, the medicine has a causal effect on the outcome.

An experimental design’s focus depends on the subject area and can include the following goals:

  • Understanding the relationships between variables.
  • Identifying the variables that have the largest impact on the outcomes.
  • Finding the input variable settings that produce an optimal result.

For example, psychologists have conducted experiments to understand how conformity affects decision-making. Sociologists have performed experiments to determine whether ethnicity affects the public reaction to staged bike thefts. These experiments map out the causal relationships between variables, and their primary goal is to understand the role of various factors.

Conversely, in a manufacturing environment, the researchers might use an experimental design to find the factors that most effectively improve their product’s strength, identify the optimal manufacturing settings, and do all that while accounting for various constraints. In short, a manufacturer’s goal is often to use experiments to improve their products cost-effectively.

In a medical experiment, the goal might be to quantify the medicine’s effect and find the optimum dosage.

Developing an Experimental Design

Developing an experimental design involves planning that maximizes the potential to collect data that is both trustworthy and able to detect causal relationships. Specifically, these studies aim to detect effects when they exist in the population the researchers are studying, to support causal rather than merely correlational conclusions, to isolate each factor’s true effect from potential confounders, and to produce conclusions that you can generalize to the real world.

To accomplish these goals, experimental designs carefully manage data validity and reliability , and internal and external experimental validity. When your experiment is valid and reliable, you can expect your procedures and data to produce trustworthy results.

An excellent experimental design involves the following:

  • Lots of preplanning.
  • Developing experimental treatments.
  • Determining how to assign subjects to treatment groups.

The remainder of this article focuses on how experimental designs incorporate these essential items to accomplish their research goals.

Learn more about Data Reliability vs. Validity and Internal and External Experimental Validity .

Preplanning, Defining, and Operationalizing for Design of Experiments

Preplanning begins with a literature review, which is crucial for the design of experiments.

This phase of the design of experiments helps you identify critical variables, know how to measure them while ensuring reliability and validity, and understand the relationships between them. The review can also help you find ways to reduce sources of variability, which increases your ability to detect treatment effects. Notably, the literature review allows you to learn how similar studies designed their experiments and the challenges they faced.

Operationalizing a study involves taking your research question, using the background information you gathered, and formulating an actionable plan.

This process should produce a specific and testable hypothesis using data that you can reasonably collect given the resources available to the experiment. For example, a study of a jumping exercise intervention and bone density might produce the following pair of hypotheses:

  • Null hypothesis : The jumping exercise intervention does not affect bone density.
  • Alternative hypothesis : The jumping exercise intervention affects bone density.
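
To see how such a null/alternative pair meets a statistical test, here is a minimal Python sketch. It simulates bone-density data and applies a two-sample t-test; all numbers and variable names are invented for illustration and are not taken from the study described here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated bone mineral density values (g/cm^2) for two groups.
# All numbers here are invented for illustration.
control = rng.normal(loc=1.00, scale=0.08, size=30)   # no intervention
jumping = rng.normal(loc=1.05, scale=0.08, size=30)   # jumping intervention

# Two-sample t-test of the null hypothesis that the intervention
# does not affect bone density (i.e., equal group means).
t_stat, p_value = stats.ttest_ind(jumping, control)

print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Reject the null hypothesis.")
else:
    print("Fail to reject the null hypothesis.")
```

With real data, you would load measurements for each group instead of simulating them; the test and decision rule stay the same.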

To learn more about this early phase, read Five Steps for Conducting Scientific Studies with Statistical Analyses .

Formulating Treatments in Experimental Designs

In an experimental design, treatments are variables that the researchers control. They are the primary independent variables of interest. Researchers administer the treatment to the subjects or items in the experiment and want to know whether it causes changes in the outcome.

As the name implies, a treatment can be medical in nature, such as a new medicine or vaccine. But it’s a general term that applies to other things such as training programs, manufacturing settings, teaching methods, and types of fertilizers. I helped run an experiment where the treatment was a jumping exercise intervention that we hoped would increase bone density. All these treatment examples are things that potentially influence a measurable outcome.

Even when you know your treatment generally, you must carefully consider the amount. How large of a dose? If you’re comparing three different temperatures in a manufacturing process, how far apart are they? For my bone mineral density study, we had to determine how frequently the exercise sessions would occur and how long each lasted.
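
When a treatment has several settings to pin down, such as dose and session frequency, the full set of experimental conditions is simply the Cartesian product of the chosen levels. A minimal sketch, with invented levels:

```python
from itertools import product

# Hypothetical treatment factors and levels (values are invented).
doses_mg = [10, 20, 40]          # how large a dose?
sessions_per_week = [2, 3]       # how often?

# A full factorial design runs every combination of levels.
conditions = list(product(doses_mg, sessions_per_week))
for dose, sessions in conditions:
    print(f"dose = {dose} mg, sessions = {sessions}/week")

print(f"{len(conditions)} experimental conditions in total")  # 3 x 2 = 6
```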

How you define the treatments in the design of experiments can affect your findings and the generalizability of your results.

Assigning Subjects to Experimental Groups

A crucial decision for all experimental designs is determining how researchers assign subjects to the experimental conditions—the treatment and control groups. The control group is often, but not always, the lack of a treatment. It serves as a basis for comparison by showing outcomes for subjects who don’t receive a treatment. Learn more about Control Groups .

How your experimental design assigns subjects to the groups affects how confident you can be that the findings represent true causal effects rather than mere correlation caused by confounders. Indeed, the assignment method influences how you control for confounding variables. This is the difference between correlation and causation .

Imagine a study finds that vitamin consumption correlates with better health outcomes. As a researcher, you want to be able to say that vitamin consumption causes the improvements. However, with the wrong experimental design, you might only be able to say there is an association. A confounder, and not the vitamins, might actually cause the health benefits.

Let’s explore some of the ways to assign subjects in design of experiments.

Completely Randomized Designs

A completely randomized experimental design randomly assigns all subjects to the treatment and control groups. You simply take each participant and use a random process to determine their group assignment. You can flip coins, roll a die, or use a computer. Randomized experiments must be prospective studies because they need to be able to control group assignment.

Random assignment in the design of experiments helps ensure that the groups are roughly equivalent at the beginning of the study. This equivalence at the start increases your confidence that any differences you see at the end were caused by the treatments. The randomization tends to equalize confounders between the experimental groups and, thereby, cancels out their effects, leaving only the treatment effects.

For example, in a vitamin study, the researchers can randomly assign participants to either the control or vitamin group. Because the groups are approximately equal when the experiment starts, if the health outcomes are different at the end of the study, the researchers can be confident that the vitamins caused those improvements.
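
As a rough sketch of what completely randomized assignment looks like in practice, the following Python snippet shuffles a participant list and splits it in half. The participant IDs and group names are hypothetical.

```python
import random

random.seed(1)  # fixed seed so the example is reproducible

# Twenty hypothetical participant IDs.
participants = [f"P{i:02d}" for i in range(1, 21)]

# Completely randomized assignment: shuffle, then split in half.
random.shuffle(participants)
half = len(participants) // 2
vitamin_group = participants[:half]
control_group = participants[half:]

print("Vitamin:", sorted(vitamin_group))
print("Control:", sorted(control_group))
```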

Statisticians consider randomized experimental designs to be the best for identifying causal relationships.

If you can’t randomly assign subjects but want to draw causal conclusions about an intervention, consider using a quasi-experimental design .

Learn more about Randomized Controlled Trials and Random Assignment in Experiments .

Randomized Block Designs

Nuisance factors are variables that can affect the outcome, but they are not the researcher’s primary interest. Unfortunately, they can hide or distort the treatment results. When experimenters know about specific nuisance factors, they can use a randomized block design to minimize their impact.

This experimental design takes subjects with a shared “nuisance” characteristic and groups them into blocks. The participants in each block are then randomly assigned to the experimental groups. This process allows the experiment to control for known nuisance factors.

Blocking in the design of experiments reduces the impact of nuisance factors on experimental error. The analysis assesses the effects of the treatment within each block, which removes the variability between blocks. The result is that blocked experimental designs can reduce the impact of nuisance variables, increasing the ability to detect treatment effects accurately.

Suppose you’re testing various teaching methods. Because grade level likely affects educational outcomes, you might use grade level as a blocking factor. To use a randomized block design for this scenario, divide the participants by grade level and then randomly assign the members of each grade level to the experimental groups.
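
A minimal sketch of that procedure in Python: group students into blocks by grade level, then randomly assign within each block. The student IDs, grades, and method names are invented.

```python
import random
from collections import defaultdict

random.seed(7)

# Hypothetical students tagged with grade level (the blocking factor).
students = [("S01", 3), ("S02", 3), ("S03", 3), ("S04", 3),
            ("S05", 4), ("S06", 4), ("S07", 4), ("S08", 4)]

# Step 1: divide the participants into blocks by grade level.
blocks = defaultdict(list)
for student, grade in students:
    blocks[grade].append(student)

# Step 2: randomly assign the members of each block to the groups.
assignment = {}
for grade, members in blocks.items():
    random.shuffle(members)
    half = len(members) // 2
    for s in members[:half]:
        assignment[s] = "method_A"
    for s in members[half:]:
        assignment[s] = "method_B"

print(assignment)
```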

A standard guideline for an experimental design is to “Block what you can, randomize what you cannot.” Use blocking for a few primary nuisance factors. Then use random assignment to distribute the unblocked nuisance factors equally between the experimental conditions.

You can also use covariates to control nuisance factors. Learn about Covariates: Definition and Uses .

Observational Studies

In some experimental designs, randomly assigning subjects to the experimental conditions is impossible or unethical. The researchers simply can’t assign participants to the experimental groups. However, they can observe them in their natural groupings, measure the essential variables, and look for correlations. These observational studies are also known as quasi-experimental designs. Retrospective studies must be observational in nature because they look back at past events.

Imagine you’re studying the effects of depression on an activity. Clearly, you can’t randomly assign participants to the depression and control groups. But you can observe participants with and without depression and see how their task performance differs.

Observational studies let you perform research when you can’t control the treatment. However, quasi-experimental designs increase the problem of confounding variables. For this design of experiments, correlation does not necessarily imply causation. While special procedures can help control confounders in an observational study, you’re ultimately less confident that the results represent causal findings.

Learn more about Observational Studies .

For a good comparison, learn about the differences and tradeoffs between Observational Studies and Randomized Experiments .

Between-Subjects vs. Within-Subjects Experimental Designs

When you think of the design of experiments, you probably picture a treatment and control group. Researchers assign participants to only one of these groups, so each group contains entirely different subjects than the other groups. Analysts compare the groups at the end of the experiment. Statisticians refer to this method as a between-subjects, or independent measures, experimental design.

In a between-subjects design , you can have more than one treatment group, but each subject is exposed to only one condition, the control group or one of the treatment groups.

A potential downside to this approach is that differences between groups at the beginning can affect the results at the end. As you’ve read earlier, random assignment can reduce those differences, but it is imperfect. There will always be some variability between the groups.

In a  within-subjects experimental design , also known as repeated measures, subjects experience all treatment conditions and are measured for each. Each subject acts as their own control, which reduces variability and increases the statistical power to detect effects.

In this experimental design, you minimize pre-existing differences between the experimental conditions because they all contain the same subjects. However, the order of treatments can affect the results. Beware of practice and fatigue effects. Learn more about Repeated Measures Designs .

| Between-Subjects Design | Within-Subjects Design |
| --- | --- |
| Subjects are assigned to one experimental condition | Subjects participate in all experimental conditions |
| Requires more subjects | Requires fewer subjects |
| Differences between subjects in the groups can affect the results | Uses the same subjects in all conditions |
| No order-of-treatment effects | Order of treatments can affect results |

Design of Experiments Examples

For example, a bone density study has three experimental groups—a control group, a stretching exercise group, and a jumping exercise group.

In a between-subjects experimental design, scientists randomly assign each participant to one of the three groups.

In a within-subjects design, all subjects experience the three conditions sequentially while the researchers measure bone density repeatedly. The procedure can switch the order of treatments for the participants to help reduce order effects.

Matched Pairs Experimental Design

A matched pairs experimental design is a between-subjects study that uses pairs of similar subjects. Researchers use this approach to reduce pre-existing differences between experimental groups. It’s yet another design of experiments method for reducing sources of variability.

Researchers identify variables likely to affect the outcome, such as demographics. When they pick a subject with a set of characteristics, they try to locate another participant with similar attributes to create a matched pair. Scientists randomly assign one member of a pair to the treatment group and the other to the control group.
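
As a sketch of one simple matching strategy (not necessarily how any particular study does it): sort subjects on the matching variable, pair off adjacent subjects, and randomly assign within each pair. The names and ages below are invented.

```python
import random

random.seed(3)

# Hypothetical subjects with the matching variable (age).
subjects = [("A", 21), ("B", 35), ("C", 22), ("D", 34), ("E", 28), ("F", 29)]

# Sort on the matching variable so adjacent subjects are most similar,
# then pair them off two at a time.
subjects.sort(key=lambda s: s[1])
pairs = [subjects[i:i + 2] for i in range(0, len(subjects), 2)]

# Randomly assign one member of each pair to treatment, the other to control.
for pair in pairs:
    treated, control = random.sample(pair, k=2)
    print(f"{treated[0]} (age {treated[1]}) -> treatment, "
          f"{control[0]} (age {control[1]}) -> control")
```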

On the plus side, this process creates two similar groups, and it doesn’t create treatment order effects. While matched pairs do not produce the perfectly matched groups of a within-subjects design (which uses the same subjects in all conditions), the approach aims to reduce variability between groups relative to a standard between-subjects study.

On the downside, finding matched pairs is very time-consuming. Additionally, if one member of a matched pair drops out, the other subject must leave the study too.

Learn more about Matched Pairs Design: Uses & Examples .

Another consideration is whether you’ll use a cross-sectional design (one point in time) or use a longitudinal study to track changes over time .

A case study is a research method that often serves as a precursor to a more rigorous experimental design by identifying research questions, variables, and hypotheses to test. Learn more about What is a Case Study? Definition & Examples .

In conclusion, the design of experiments is extremely sensitive to subject area concerns and the time and resources available to the researchers. Developing a suitable experimental design requires balancing a multitude of considerations. A successful design is necessary to obtain trustworthy answers to your research question and to have a reasonable chance of detecting treatment effects when they exist.



The Scientific Method by Science Made Simple

Understanding and using the scientific method.

The Scientific Method is a process used to design and perform experiments. It helps minimize experimental error and bias, and it increases confidence in the accuracy of your results.


In the previous sections, we talked about how to pick a good topic and specific question to investigate. Now we will discuss how to carry out your investigation.

Steps of the Scientific Method

  • Observation/Research
  • Hypothesis
  • Prediction
  • Experimentation
  • Conclusion

Now that you have settled on the question you want to ask, it's time to use the Scientific Method to design an experiment to answer that question.

If your experiment isn't designed well, you may not get the correct answer. You may not even get any definitive answer at all!

The Scientific Method is a logical and rational order of steps by which scientists come to conclusions about the world around them. The Scientific Method helps to organize thoughts and procedures so that scientists can be confident in the answers they find.

OBSERVATION is the first step, so that you know how you want to go about your research.

HYPOTHESIS is the answer you think you'll find.

PREDICTION is your specific belief about the scientific idea: If my hypothesis is true, then I predict we will discover this.

EXPERIMENT is the tool that you invent to answer the question, and

CONCLUSION is the answer that the experiment gives.

Don't worry, it isn't that complicated. Let's take a closer look at each one of these steps. Then you can understand the tools scientists use for their science experiments, and use them for your own.

OBSERVATION


This step could also be called "research." It is the first stage in understanding the problem.

After you decide on a topic, and narrow it down to a specific question, you will need to research everything that you can find about it. You can collect information from your own experiences, books, the internet, or even smaller "unofficial" experiments.

Let's continue the example of a science fair idea about tomatoes in the garden. You like to garden, and notice that some tomatoes are bigger than others and wonder why.

Because of this personal experience and an interest in the problem, you decide to learn more about what makes plants grow.

For this stage of the Scientific Method, it's important to use as many sources as you can find. The more information you have on your science fair topic, the better the design of your experiment is going to be, and the better your science fair project is going to be overall.

Also try to get information from your teachers or librarians, or professionals who know something about your science fair project. They can help to guide you to a solid experimental setup.

HYPOTHESIS

The next stage of the Scientific Method is known as the "hypothesis." This word basically means "a possible solution to a problem, based on knowledge and research."

The hypothesis is a simple statement that defines what you think the outcome of your experiment will be.

All of the first stage of the Scientific Method -- the observation, or research stage -- is designed to help you express a problem in a single question ("Does the amount of sunlight in a garden affect tomato size?") and propose an answer to the question based on what you know. The experiment that you will design is done to test the hypothesis.

Using the example of the tomato experiment, here is an example of a hypothesis:

TOPIC: "Does the amount of sunlight a tomato plant receives affect the size of the tomatoes?"

HYPOTHESIS: "I believe that the more sunlight a tomato plant receives, the larger the tomatoes will grow."

This hypothesis is based on:

(1) Tomato plants need sunshine to make food through photosynthesis, so, logically, more sun should mean more food; and

(2) Through informal, exploratory observations of plants in a garden, those with more sunlight appear to grow bigger.

PREDICTION

The hypothesis is your general statement of how you think the scientific phenomenon in question works.

Your prediction lets you get specific -- how will you demonstrate that your hypothesis is true? The experiment that you will design is done to test the prediction.

An important thing to remember during this stage of the scientific method is that once you develop a hypothesis and a prediction, you shouldn't change it, even if the results of your experiment show that you were wrong.

An incorrect prediction does NOT mean that you "failed." It just means that the experiment brought some new facts to light that maybe you hadn't thought about before.

Continuing our tomato plant example, a good prediction would be: Increasing the amount of sunlight tomato plants in my experiment receive will cause an increase in their size compared to identical plants that received the same care but less light.

EXPERIMENT

This is the part of the scientific method that tests your hypothesis. An experiment is a tool that you design to find out if your ideas about your topic are right or wrong.

It is absolutely necessary to design a science fair experiment that will accurately test your hypothesis. The experiment is the most important part of the scientific method. It's the logical process that lets scientists learn about the world.

On the next page, we'll discuss the ways that you can go about designing a science fair experiment idea.

CONCLUSION

The final step in the scientific method is the conclusion. This is a summary of the experiment's results, and how those results match up to your hypothesis.

You have two options for your conclusions: based on your results, either:

(1) YOU CAN REJECT the hypothesis, or

(2) YOU CAN NOT REJECT the hypothesis.

This is an important point!

You cannot PROVE the hypothesis with a single experiment, because there is a chance that you made an error somewhere along the way.

What you can say is that your results SUPPORT the original hypothesis.

If your original hypothesis didn't match up with the final results of your experiment, don't change the hypothesis.

Instead, try to explain what might have been wrong with your original hypothesis. What information were you missing when you made your prediction? What are the possible reasons the hypothesis and experimental results didn't match up?

Remember, a science fair experiment isn't a failure simply because it does not agree with your hypothesis. No one will take points off if your prediction wasn't accurate. Many important scientific discoveries were made as a result of experiments gone wrong!

A science fair experiment is only a failure if its design is flawed. A flawed experiment is one that (1) doesn't keep its variables under control, and (2) doesn't sufficiently answer the question that you asked of it.


Experimental Design

Ethics, Integrity, and the Scientific Method

Reference work entry by Jonathan Lewis (Institute of Ethics, Dublin City University). First online: 02 April 2020.

Experimental design is one aspect of a scientific method. A well-designed, properly conducted experiment aims to control variables in order to isolate and manipulate causal effects and thereby maximize internal validity, support causal inferences, and guarantee reliable results. Traditionally employed in the natural sciences, experimental design has become an important part of research in the social and behavioral sciences. Experimental methods are also endorsed as the most reliable guides to policy effectiveness. Through a discussion of some of the central concepts associated with experimental design, including controlled variation and randomization, this chapter will provide a summary of key ethical issues that tend to arise in experimental contexts. In addition, by exploring assumptions about the nature of causation and by analyzing features of causal relationships, systems, and inferences in social contexts, this chapter will summarize the ways in which experimental design can undermine the integrity of not only social and behavioral research but policies implemented on the basis of such research.




Experimental Design: Types, Examples & Methods

Saul McLeod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul McLeod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.


Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.


Experimental design refers to how participants are allocated to different groups in an experiment. Types of design include repeated measures, independent groups, and matched pairs designs.

Probably the most common way to design an experiment in psychology is to divide the participants into two groups, the experimental group and the control group, and then introduce a change to the experimental group, not the control group.

The researcher must decide how to allocate the sample to the different experimental groups. For example, if there are 10 participants, will all 10 participate in both conditions (e.g., repeated measures), or will the participants be split in half and take part in only one condition each?

Three types of experimental designs are commonly used:

1. Independent Measures

Independent measures design, also known as between-groups , is an experimental design where different participants are used in each condition of the independent variable.  This means that each condition of the experiment includes a different group of participants.

This should be done by random allocation, which ensures that each participant has an equal chance of being assigned to either group.

Independent measures involve using two separate groups of participants, one in each condition. For example:


  • Con : More people are needed than with the repeated measures design (i.e., more time-consuming).
  • Pro : Avoids order effects (such as practice or fatigue) as people participate in one condition only.  If a person is involved in several conditions, they may become bored, tired, and fed up by the time they come to the second condition or become wise to the requirements of the experiment!
  • Con : Differences between participants in the groups may affect results, for example, variations in age, gender, or social background.  These differences are known as participant variables (i.e., a type of extraneous variable ).
  • Control : After the participants have been recruited, they should be randomly assigned to their groups. This should ensure the groups are similar, on average (reducing participant variables).

2. Repeated Measures Design

Repeated Measures design is an experimental design where the same participants participate in each independent variable condition.  This means that each experiment condition includes the same group of participants.

Repeated Measures design is also known as within-groups or within-subjects design .

  • Pro : As the same participants are used in each condition, participant variables (i.e., individual differences) are reduced.
  • Con : There may be order effects. Order effects refer to the order of the conditions affecting the participants’ behavior.  Performance in the second condition may be better because the participants know what to do (i.e., practice effect).  Or their performance might be worse in the second condition because they are tired (i.e., fatigue effect). This limitation can be controlled using counterbalancing.
  • Pro : Fewer people are needed as they participate in all conditions (i.e., saves time).
  • Control : To combat order effects, the researcher counterbalances the order of the conditions across participants, alternating the order in which participants complete the different conditions of the experiment.

Counterbalancing

Suppose we used a repeated measures design in which all of the participants first learned words in “loud noise” and then learned them in “no noise.”

We expect the participants to learn better in “no noise” because of order effects, such as practice. However, a researcher can control for order effects using counterbalancing.

The sample would be split into two halves. For example, group 1 completes condition A ("loud noise") then condition B ("no noise"), while group 2 completes B then A. This distributes order effects evenly across the two conditions.

Although order effects occur for each participant, they balance each other out in the results because they occur equally in both groups.

[Figure: Counterbalancing]
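A minimal sketch of this kind of counterbalancing, assuming hypothetical participant IDs and labeling the conditions A ("loud noise") and B ("no noise"):

```python
import random

participants = [f"P{i:02d}" for i in range(1, 11)]
random.shuffle(participants)

# Alternate the condition order: half do A then B, half do B then A,
# so practice and fatigue effects occur equally often in each order.
orders = {}
for index, person in enumerate(participants):
    orders[person] = ("A", "B") if index % 2 == 0 else ("B", "A")

for person in sorted(orders):
    print(person, "->", " then ".join(orders[person]))
```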

3. Matched Pairs Design

A matched pairs design is an experimental design where pairs of participants are matched in terms of key variables, such as age or socioeconomic status. One member of each pair is then placed into the experimental group and the other member into the control group .

One member of each matched pair must be randomly assigned to the experimental group and the other to the control group.

[Figure: Matched pairs design]

  • Con : If one participant drops out, you lose the data of two participants (the whole pair).
  • Pro : Reduces participant variables because the researcher has tried to pair up the participants so that each condition has people with similar abilities and characteristics.
  • Con : Very time-consuming trying to find closely matched pairs.
  • Pro : It avoids order effects, so counterbalancing is not necessary.
  • Con : Impossible to match people exactly unless they are identical twins!
  • Control : Members of each pair should be randomly assigned to conditions (see the sketch below). However, this does not solve all of these problems.
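As a rough illustration, the sketch below pairs hypothetical participants on a single matching variable (an ability score) and then randomly assigns one member of each pair to each condition. Real matching would use several variables at once; the IDs and scores here are invented.

```python
import random

# Hypothetical (id, ability score) records used only for matching.
people = [("P01", 82), ("P02", 79), ("P03", 65), ("P04", 67),
          ("P05", 91), ("P06", 90), ("P07", 55), ("P08", 58)]

# Sort by the matching variable, then pair adjacent participants.
people.sort(key=lambda p: p[1])
pairs = [(people[i], people[i + 1]) for i in range(0, len(people), 2)]

experimental, control = [], []
for pair in pairs:
    chosen = random.sample(pair, 2)   # random order within the pair
    experimental.append(chosen[0][0])
    control.append(chosen[1][0])

print("Experimental:", experimental)
print("Control:     ", control)
```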

Experimental design refers to how participants are allocated to an experiment’s different conditions (or IV levels). There are three types:

1. Independent measures / between-groups: Different participants are used in each condition of the independent variable.

2. Repeated measures / within-groups: The same participants take part in each condition of the independent variable.

3. Matched pairs: Each condition uses different participants, but they are matched in terms of important characteristics, e.g., gender, age, intelligence, etc.

Learning Check

Read about each of the experiments below. For each experiment, identify (1) which experimental design was used; and (2) why the researcher might have used that design.

1. To compare the effectiveness of two different types of therapy for depression, depressed patients were assigned to receive either cognitive therapy or behavior therapy for a 12-week period.

The researchers attempted to ensure that the patients in the two groups had similar severity of depressed symptoms by administering a standardized test of depression to each participant, then pairing them according to the severity of their symptoms.

2. To assess the difference in reading comprehension between 7 and 9-year-olds, a researcher recruited each group from a local primary school. They were given the same passage of text to read and then asked a series of questions to assess their understanding.

3. To assess the effectiveness of two different ways of teaching reading, a group of 5-year-olds was recruited from a primary school. Their level of reading ability was assessed, and then they were taught using scheme one for 20 weeks.

At the end of this period, their reading was reassessed, and a reading improvement score was calculated. They were then taught using scheme two for a further 20 weeks, and another reading improvement score for this period was calculated. The reading improvement scores for each child were then compared.

4. To assess the effect of organization on recall, a researcher randomly assigned student volunteers to two conditions.

Condition one attempted to recall a list of words that were organized into meaningful categories; condition two attempted to recall the same words, randomly grouped on the page.

Experiment Terminology

Ecological validity

The degree to which an investigation represents real-life experiences.

Experimenter effects

These are the ways that the experimenter can accidentally influence the participant through their appearance or behavior.

Demand characteristics

The clues in an experiment that lead participants to think they know what the researcher is looking for (e.g., the experimenter’s body language).

Independent variable (IV)

The variable the experimenter manipulates (i.e., changes), which is assumed to have a direct effect on the dependent variable.

Dependent variable (DV)

Variable the experimenter measures. This is the outcome (i.e., the result) of a study.

Extraneous variables (EV)

All variables which are not independent variables but could affect the results (DV) of the experiment. Extraneous variables should be controlled where possible.

Confounding variables

Variable(s) that have affected the results (DV), apart from the IV. A confounding variable could be an extraneous variable that has not been controlled.

Random Allocation

Randomly allocating participants to independent variable conditions means that all participants should have an equal chance of taking part in each condition.

The principle of random allocation is to avoid bias in how the experiment is carried out and limit the effects of participant variables.

Order effects

Changes in participants’ performance due to their repeating the same or similar test more than once. Examples of order effects include:

(i) practice effect: an improvement in performance on a task due to repetition, for example, because of familiarity with the task;

(ii) fatigue effect: a decrease in performance of a task due to repetition, for example, because of boredom or tiredness.


Experimental Research

Experimental research is commonly used in sciences such as sociology, psychology, physics, chemistry, biology, and medicine.

It is a collection of research designs which use manipulation and controlled testing to understand causal processes. Generally, one or more variables are manipulated to determine their effect on a dependent variable.

The experimental method is a systematic and scientific approach to research in which the researcher manipulates one or more variables, and controls and measures any change in other variables.

Experimental Research is often used where:

  • There is time priority in a causal relationship (cause precedes effect)
  • There is consistency in a causal relationship (a cause will always lead to the same effect)
  • The magnitude of the correlation is large.

(Reference: en.wikipedia.org)

The term experimental research covers a range of definitions. In the strict sense, experimental research is what we call a true experiment .

This is an experiment where the researcher manipulates one variable and controls or randomizes the rest of the variables. It has a control group , the subjects have been randomly assigned between the groups, and the researcher only tests one effect at a time. It is also important to know what variable(s) you want to test and measure.

A very wide definition of experimental research, or a quasi-experiment , is research where the scientist actively influences something to observe the consequences. Most experiments tend to fall in between the strict and the wide definition.

A rule of thumb is that physical sciences, such as physics, chemistry and geology tend to define experiments more narrowly than social sciences, such as sociology and psychology, which conduct experiments closer to the wider definition.


Aims of Experimental Research

Experiments are conducted to be able to predict phenomena. Typically, an experiment is constructed to be able to explain some kind of causation . Experimental research is important to society - it helps us to improve our everyday lives.


Identifying the Research Problem

After deciding the topic of interest, the researcher tries to define the research problem . This helps the researcher focus on a narrower research area that can be studied appropriately. Defining the research problem helps you to formulate a research hypothesis , which is tested against the null hypothesis .

The research problem is often operationalized , defining how to measure it. The results will depend on the exact measurements the researcher chooses, and the problem may be operationalized differently in another study to test the main conclusions of the original study.

An ad hoc analysis is a hypothesis invented after testing is done, to try to explain contrary evidence. A poor ad hoc analysis may be seen as the researcher's inability to accept that his/her hypothesis is wrong, while a great ad hoc analysis may lead to more testing and possibly a significant discovery.

Constructing the Experiment

There are various aspects to remember when constructing an experiment. Planning ahead ensures that the experiment is carried out properly and that the results reflect the real world, in the best possible way.

Sampling Groups to Study

Sampling groups correctly is especially important when we have more than one condition in the experiment. One sample group often serves as a control group , whilst others are tested under the experimental conditions.

Sample groups can be selected using many different sampling techniques. A population sample may be chosen by a number of methods, such as randomization , "quasi-randomization" and pairing.

Reducing sampling errors is vital for getting valid results from experiments. Researchers often adjust the sample size to minimize chances of random errors .

Here are some common sampling techniques (a small sketch contrasting two of them follows the list):

  • probability sampling
  • non-probability sampling
  • simple random sampling
  • convenience sampling
  • stratified sampling
  • systematic sampling
  • cluster sampling
  • sequential sampling
  • disproportional sampling
  • judgmental sampling
  • snowball sampling
  • quota sampling
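To show how two of these differ in practice, here is a minimal sketch contrasting simple random sampling with stratified sampling. The population, its size, and the 30% characteristic are all made up for illustration.

```python
import random

# Hypothetical population of 100 people; 30 share the characteristic.
population = [{"id": i, "smoker": i < 30} for i in range(100)]

# Simple random sampling: every member has an equal chance of selection.
simple_sample = random.sample(population, 10)

# Stratified sampling: sample each stratum in proportion to its size,
# guaranteeing the sample mirrors the population on this variable.
smokers = [p for p in population if p["smoker"]]
others = [p for p in population if not p["smoker"]]
stratified_sample = random.sample(smokers, 3) + random.sample(others, 7)

print(sum(p["smoker"] for p in simple_sample), "smokers in simple sample")
print(sum(p["smoker"] for p in stratified_sample), "smokers in stratified sample")
```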

Creating the Design

The research design is chosen based on a range of factors. Important factors when choosing the design are feasibility, time, cost, ethics, measurement problems and what you would like to test. The design of the experiment is critical for the validity of the results.

Typical Designs and Features in Experimental Design

  • Pretest-Posttest Design : Checks whether the groups differ before the manipulation starts, as well as measuring the effect of the manipulation. Note that pretests can sometimes influence the effect.
  • Control Group : Control groups are designed to measure research bias and measurement effects, such as the Hawthorne Effect or the Placebo Effect . A control group is a group not receiving the same manipulation as the experimental group. Experiments frequently have two conditions, but rarely more than three conditions at the same time.
  • Randomized Controlled Trials : Random sampling, comparison between an experimental group and a control group, and strict control/randomization of all other variables.
  • Solomon Four-Group Design : Uses two control groups and two experimental groups. Half the groups take a pretest and half do not, to test both the effect itself and the effect of the pretest.
  • Between Subjects Design : Assigns different participants to different conditions.
  • Within Subject Design : Participants take part in all the different conditions. See also: Repeated Measures Design.
  • Counterbalanced Measures Design : Tests the effect of the order of treatments, used when no control group is available or ethical.
  • Matched Subjects Design : Matches participants to create similar experimental and control groups.
  • Double-Blind Experiment : Neither the researcher nor the participants know which group is the control group, since the results can be affected if either knows.
  • Bayesian Probability : Using Bayesian probability to "interact" with participants is a more advanced experimental design, useful in settings where there are many variables that are hard to isolate. The researcher starts with a set of prior beliefs and adjusts them as participants respond.

Pilot Study

It may be wise to first conduct a pilot-study or two before you do the real experiment. This ensures that the experiment measures what it should, and that everything is set up right.

Minor errors, which could potentially destroy the experiment, are often found during this process. With a pilot study, you can get information about errors and problems, and improve the design, before putting a lot of effort into the real experiment.

If the experiments involve humans, a common strategy is to first have a pilot study with someone involved in the research, but not too closely, and then arrange a pilot with a person who resembles the subject(s) . Those two different pilots are likely to give the researcher good information about any problems in the experiment.

Conducting the Experiment

An experiment is typically carried out by manipulating a variable, called the independent variable , affecting the experimental group. The effect that the researcher is interested in, the dependent variable(s) , is measured.

Identifying and controlling non-experimental factors that the researcher does not want to influence the results is crucial to drawing a valid conclusion. This is often done by controlling variables , if possible, or randomizing variables to minimize effects that can be traced back to third variables . Researchers only want to measure the effect of the independent variable(s) when conducting an experiment , allowing them to conclude that this was the reason for the effect.

Analysis and Conclusions

In quantitative research , the amount of data measured can be enormous. Data not prepared to be analyzed is called "raw data". The raw data is often summarized as something called "output data", which typically consists of one line per subject (or item). A cell of the output data is, for example, an average of an effect in many trials for a subject. The output data is used for statistical analysis, e.g. significance tests, to see if there really is an effect.
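As a minimal sketch of this raw-to-output step, assuming made-up reaction-time trials, the code below collapses many trials per subject into one line per subject:

```python
from statistics import mean

# Hypothetical raw data: one row per trial (subject, trial, reaction time ms).
raw_data = [
    ("S1", 1, 512), ("S1", 2, 498), ("S1", 3, 505),
    ("S2", 1, 610), ("S2", 2, 587), ("S2", 3, 601),
]

# Output data: one line per subject, averaging the effect across trials.
by_subject = {}
for subject, _trial, rt in raw_data:
    by_subject.setdefault(subject, []).append(rt)

output_data = {s: mean(rts) for s, rts in by_subject.items()}
print(output_data)  # this per-subject summary feeds the significance tests
```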

The aim of the analysis is to draw a conclusion, considered together with other observations. The researcher might generalize the results to a wider phenomenon if there is no indication of confounding variables "polluting" the results.

If the researcher suspects that the effect stems from a different variable than the independent variable, further investigation is needed to gauge the validity of the results. An experiment is often conducted because the scientist wants to know if the independent variable is having any effect upon the dependent variable. Correlation between variables is not proof of causation .

Experiments are more often quantitative than qualitative in nature, although qualitative experiments do occur.

Examples of Experiments

This website contains many examples of experiments. Some are not true experiments , but involve some kind of manipulation to investigate a phenomenon. Others fulfill most or all criteria of true experiments.

Here are some examples of scientific experiments:

Social Psychology

  • Stanley Milgram Experiment - Will people obey orders, even if clearly dangerous?
  • Asch Experiment - Will people conform to group behavior?
  • Stanford Prison Experiment - How do people react to roles? Will you behave differently?
  • Good Samaritan Experiment - Would You Help a Stranger? - Explaining Helping Behavior
Biology

  • Law Of Segregation - The Mendel Pea Plant Experiment
  • Transforming Principle - Griffith's Experiment about Genetics

Physics

  • Ben Franklin Kite Experiment - Struck by Lightning
  • J J Thomson Cathode Ray Experiment

Oskar Blakstad (Jul 10, 2008). Experimental Research. Retrieved Aug 14, 2024 from Explorable.com: https://explorable.com/experimental-research

The text in this article is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license; you are free to copy, share, and adapt it as long as you give appropriate credit and link back to the source.



Part 3: Using quantitative methods

13. Experimental design

Chapter outline.

  • What is an experiment and when should you use one? (8 minute read)
  • True experimental designs (7 minute read)
  • Quasi-experimental designs (8 minute read)
  • Non-experimental designs (5 minute read)
  • Critical, ethical, and cultural considerations (5 minute read)

Content warning : examples in this chapter contain references to non-consensual research in Western history, including experiments conducted during the Holocaust and on African Americans (section 13.6).

13.1 What is an experiment and when should you use one?

Learning objectives.

Learners will be able to…

  • Identify the characteristics of a basic experiment
  • Describe causality in experimental design
  • Discuss the relationship between dependent and independent variables in experiments
  • Explain the links between experiments and generalizability of results
  • Describe advantages and disadvantages of experimental designs

The basics of experiments

The first experiment I can remember using was for my fourth grade science fair. I wondered if latex- or oil-based paint would hold up to sunlight better. So, I went to the hardware store and got a few small cans of paint and two sets of wooden paint sticks. I painted one with oil-based paint and the other with latex-based paint of different colors and put them in a sunny spot in the back yard. My hypothesis was that the oil-based paint would fade the most and that more fading would happen the longer I left the paint sticks out. (I know, it’s obvious, but I was only 10.)

I checked in on the paint sticks every few days for a month and wrote down my observations. The first part of my hypothesis ended up being wrong—it was actually the latex-based paint that faded the most. But the second part was right, and the paint faded more and more over time. This is a simple example, of course—experiments get a heck of a lot more complex than this when we’re talking about real research.

Merriam-Webster defines an experiment   as “an operation or procedure carried out under controlled conditions in order to discover an unknown effect or law, to test or establish a hypothesis, or to illustrate a known law.” Each of these three components of the definition will come in handy as we go through the different types of experimental design in this chapter. Most of us probably think of the physical sciences when we think of experiments, and for good reason—these experiments can be pretty flashy! But social science and psychological research follow the same scientific methods, as we’ve discussed in this book.

Experiments can be used in the social sciences just as they can in the physical sciences. It makes sense to use an experiment when you want to determine the cause of a phenomenon with as much accuracy as possible. Some types of experimental designs do this more precisely than others, as we’ll see throughout the chapter. If you’ll remember back to Chapter 11  and the discussion of validity, experiments are the best way to ensure internal validity, or the extent to which a change in your independent variable causes a change in your dependent variable.

Experimental designs for research projects are most appropriate when trying to uncover or test a hypothesis about the cause of a phenomenon, so they are best for explanatory research questions. As we’ll learn throughout this chapter, different circumstances are appropriate for different types of experimental designs. Each type of experimental design has advantages and disadvantages, and some are better at controlling the effect of extraneous variables —those variables and characteristics that have an effect on your dependent variable, but aren’t the primary variable whose influence you’re interested in testing. For example, in a study that tries to determine whether aspirin lowers a person’s risk of a fatal heart attack, a person’s race would likely be an extraneous variable because you primarily want to know the effect of aspirin.

In practice, many types of experimental designs can be logistically challenging and resource-intensive. As practitioners, the likelihood that we will be involved in some of the types of experimental designs discussed in this chapter is fairly low. However, it’s important to learn about these methods, even if we might not ever use them, so that we can be thoughtful consumers of research that uses experimental designs.

While we might not use all of these types of experimental designs, many of us will engage in evidence-based practice during our time as social workers. A lot of research developing evidence-based practice, which has a strong emphasis on generalizability, will use experimental designs. You’ve undoubtedly seen one or two in your literature search so far.

The logic of experimental design

How do we know that one phenomenon causes another? The complexity of the social world in which we practice and conduct research means that causes of social problems are rarely cut and dry. Uncovering explanations for social problems is key to helping clients address them, and experimental research designs are one road to finding answers.

As you read about in Chapter 8 (and as we’ll discuss again in Chapter 15 ), just because two phenomena are related in some way doesn’t mean that one causes the other. Ice cream sales increase in the summer, and so does the rate of violent crime; does that mean that eating ice cream is going to make me murder someone? Obviously not, because ice cream is great. The reality of that relationship is far more complex—it could be that hot weather makes people more irritable and, at times, violent, while also making people want ice cream. More likely, though, there are other social factors not accounted for in the way we just described this relationship.

Experimental designs can help clear up at least some of this fog by allowing researchers to isolate the effect of interventions on dependent variables by controlling extraneous variables. In true experimental design (discussed in the next section) and some quasi-experimental designs, researchers accomplish this with the control group and the experimental group. (The experimental group is sometimes called the “treatment group,” but we will call it the experimental group in this chapter.) The control group does not receive the intervention you are testing (they may receive no intervention or what is known as “treatment as usual”), while the experimental group does. (You will hopefully remember our earlier discussion of control variables in Chapter 8 —conceptually, the use of the word “control” here is the same.)


In a well-designed experiment, your control group should look almost identical to your experimental group in terms of demographics and other relevant factors. What if we want to know the effect of CBT on social anxiety, but we have learned in prior research that men tend to have a more difficult time overcoming social anxiety? We would want our control and experimental groups to have a similar gender mix because it would limit the effect of gender on our results, since ostensibly, both groups’ results would be affected by gender in the same way. If your control group has 5 women, 6 men, and 4 non-binary people, then your experimental group should be made up of roughly the same gender balance to help control for the influence of gender on the outcome of your intervention. (In reality, the groups should be similar along other dimensions, as well, and your group will likely be much larger.) The researcher will use the same outcome measures for both groups and compare them, and assuming the experiment was designed correctly, get a pretty good answer about whether the intervention had an effect on social anxiety.

You will also hear people talk about comparison groups , which are similar to control groups. The primary difference between the two is that a control group is populated using random assignment, but a comparison group is not. Random assignment entails using a random process to decide which participants are put into the control or experimental group (which participants receive an intervention and which do not). By randomly assigning participants to a group, you can reduce the effect of extraneous variables on your research because there won’t be a systematic difference between the groups.

Do not confuse random assignment with random sampling. Random sampling is a method for selecting a sample from a population, and is rarely used in psychological research. Random assignment is a method for assigning participants in a sample to the different conditions, and it is an important element of all experimental research in psychology and other related fields. Random sampling also helps a great deal with generalizability , whereas random assignment increases internal validity .
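A small sketch of the distinction, with hypothetical names: random sampling picks who is studied, while random assignment decides which condition each sampled person receives.

```python
import random

population = [f"Person{i}" for i in range(1000)]

# Random sampling: selecting WHO takes part, drawn from the population.
sample = random.sample(population, 20)

# Random assignment: deciding WHICH CONDITION each sampled person gets.
random.shuffle(sample)
experimental_group = sample[:10]
control_group = sample[10:]

print(len(experimental_group), "experimental;", len(control_group), "control")
```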

We have already learned about internal validity in Chapter 11 . The use of an experimental design will bolster internal validity since it works to isolate causal relationships. As we will see in the coming sections, some types of experimental design do this more effectively than others. It’s also worth considering that true experiments, which most effectively show causality , are often difficult and expensive to implement. Although other experimental designs aren’t perfect, they still produce useful, valid evidence and may be more feasible to carry out.

Key Takeaways

  • Experimental designs are useful for establishing causality, but some types of experimental design do this better than others.
  • Experiments help researchers isolate the effect of the independent variable on the dependent variable by controlling for the effect of extraneous variables .
  • Experiments use a control/comparison group and an experimental group to test the effects of interventions. These groups should be as similar to each other as possible in terms of demographics and other relevant factors.
  • True experiments have control groups with randomly assigned participants, while other types of experiments have comparison groups to which participants are not randomly assigned.
Exercises

  • Think about the research project you’ve been designing so far. How might you use a basic experiment to answer your question? If your question isn’t explanatory, try to formulate a new explanatory question and consider the usefulness of an experiment.
  • Why is establishing a simple relationship between two variables not indicative of one causing the other?

13.2 True experimental design

Learning objectives

Learners will be able to…

  • Describe a true experimental design in social work research
  • Understand the different types of true experimental designs
  • Determine what kinds of research questions true experimental designs are suited for
  • Discuss advantages and disadvantages of true experimental designs

True experimental design , often considered the “gold standard” in research designs, is one of the most rigorous of all research designs. In this design, one or more independent variables are manipulated by the researcher (as treatments), subjects are randomly assigned to different treatment levels (random assignment), and the results of the treatments on outcomes (dependent variables) are observed. The unique strength of experimental research is its internal validity and its ability to establish causality through treatment manipulation, while controlling for the effects of extraneous variables. Sometimes the treatment level is no treatment, while other times it is simply a different treatment than the one we are trying to evaluate. For example, we might have a control group made up of people who will not receive any treatment for a particular condition. Or, a control group could consist of people who consent to treatment with DBT when we are testing the effectiveness of CBT.

As we discussed in the previous section, a true experiment has a control group with participants randomly assigned , and an experimental group . This is the most basic element of a true experiment. The next decision a researcher must make is when they need to gather data during their experiment. Do they take a baseline measurement and then a measurement after treatment, or just a measurement after treatment, or do they handle measurement another way? Below, we’ll discuss the three main types of true experimental designs. There are sub-types of each of these designs, but here, we just want to get you started with some of the basics.

Using a true experiment in social work research is often pretty difficult, since as I mentioned earlier, true experiments can be quite resource intensive. True experiments work best with relatively large sample sizes, and random assignment, a key criterion for a true experimental design, is hard (and unethical) to execute in practice when you have people in dire need of an intervention. Nonetheless, some of the strongest evidence bases are built on true experiments.

For the purposes of this section, let’s bring back the example of CBT for the treatment of social anxiety. We have a group of 500 individuals who have agreed to participate in our study, and we have randomly assigned them to the control and experimental groups. The folks in the experimental group will receive CBT, while the folks in the control group will receive more unstructured, basic talk therapy. These designs, as we talked about above, are best suited for explanatory research questions.

Before we get started, take a look at the table below. When explaining experimental research designs, we often use diagrams with abbreviations to visually represent the experiment. Table 13.1 starts us off by laying out what each of the abbreviations mean.

Table 13.1 Experimental research design notations

R — randomly assigned group (control/comparison or experimental)
O — observation/measurement taken of the dependent variable
X — intervention or treatment
Xe — experimental or new intervention
Xi — typical intervention (treatment as usual)
A, B, C, etc. — denotes different groups (control/comparison and experimental)

Pretest and post-test control group design

In pretest and post-test control group design , participants are given a pretest of some kind to measure their baseline state before their participation in an intervention. In our social anxiety experiment, we would have participants in both the experimental and control groups complete some measure of social anxiety—most likely an established scale and/or a structured interview—before they start their treatment. As part of the experiment, we would have a defined time period during which the treatment would take place (let’s say 12 weeks, just for illustration). At the end of 12 weeks, we would give both groups the same measure as a post-test .

[Figure 13.2: Pretest and post-test control group design]

In the diagram, RA (random assignment group A) is the experimental group and RB is the control group. O1 denotes the pretest, Xe denotes the experimental intervention, and O2 denotes the post-test. Let’s look at this diagram another way, using the example of CBT for social anxiety that we’ve been talking about.

[Figure: Pretest and post-test design applied to the CBT for social anxiety example]

In a situation where the control group received treatment as usual instead of no intervention, the diagram would look this way, with Xi denoting treatment as usual (Figure 13.3).

[Figure 13.3: Pretest and post-test control group design with treatment as usual]

Hopefully, these diagrams provide you a visualization of how this type of experiment establishes time order , a key component of a causal relationship. Did the change occur after the intervention? Assuming there is a change in the scores between the pretest and post-test, we would be able to say that yes, the change did occur after the intervention. Causality can’t exist if the change happened before the intervention—this would mean that something else led to the change, not our intervention.
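To see why the two measurement points matter, here is a simulated sketch of a pretest and post-test control group design. The anxiety scores, group size, and the 10-point treatment effect are invented purely for illustration.

```python
import random
from statistics import mean

random.seed(1)
n = 50

# Hypothetical pretest (O1) anxiety scores for two randomly assigned groups.
o1_exp = [random.gauss(60, 8) for _ in range(n)]
o1_ctl = [random.gauss(60, 8) for _ in range(n)]

# Post-test (O2): the experimental group receives Xe, assumed here to
# lower scores by 10 points on average; the control group does not.
o2_exp = [s - 10 + random.gauss(0, 5) for s in o1_exp]
o2_ctl = [s + random.gauss(0, 5) for s in o1_ctl]

change_exp = mean(b - a for a, b in zip(o1_exp, o2_exp))
change_ctl = mean(b - a for a, b in zip(o1_ctl, o2_ctl))
print(f"Change, experimental: {change_exp:.1f}")
print(f"Change, control:      {change_ctl:.1f}")
print(f"Estimated treatment effect: {change_exp - change_ctl:.1f} points")
```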

Post-test only control group design

Post-test only control group design involves only giving participants a post-test, just like it sounds (Figure 13.4).

[Figure 13.4: Post-test only control group design]

But why would you use this design instead of a pretest/post-test design? One reason could be the testing effect that can happen when research participants take a pretest. In research, the testing effect refers to “measurement error related to how a test is given; the conditions of the testing, including environmental conditions; and acclimation to the test itself” (Engel & Schutt, 2017, p. 444). [1] (When we say “measurement error,” all we mean is the accuracy of the way we measure the dependent variable.) Figure 13.4 is a visualization of this type of experiment. The testing effect isn’t always bad in practice—our initial assessments might help clients identify or put into words feelings or experiences they are having when they haven’t been able to do that before. In research, however, we might want to control its effects to isolate a cleaner causal relationship between intervention and outcome.

Going back to our CBT for social anxiety example, we might be concerned that participants would learn about social anxiety symptoms by virtue of taking a pretest. They might then identify that they have those symptoms on the post-test, even though they are not new symptoms for them. That could make our intervention look less effective than it actually is.

However, without a baseline measurement, establishing causality can be more difficult. If we don’t know someone’s state of mind before our intervention, how do we know our intervention did anything at all? Establishing time order is thus a little more difficult. You must balance this consideration with the benefits of this type of design.

Solomon four group design

One way we can possibly measure how much the testing effect might change the results of the experiment is with the Solomon four group design. Basically, as part of this experiment, you have two control groups and two experimental groups. The first pair of groups receives both a pretest and a post-test. The other pair of groups receives only a post-test (Figure 13.5). This design helps address the problem of establishing time order in post-test only control group designs.

[Figure 13.5: Solomon four group design]

For our CBT project, we would randomly assign people to four different groups instead of just two. Groups A and B would take our pretest measures and our post-test measures, and groups C and D would take only our post-test measures. We could then compare the results among these groups and see if they’re significantly different between the folks in A and B, and C and D. If they are, we may have identified some kind of testing effect, which enables us to put our results into full context. We don’t want to draw a strong causal conclusion about our intervention when we have major concerns about testing effects without trying to determine the extent of those effects.
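A minimal sketch of the group assignment for a Solomon four group design, using the 500 hypothetical participants from the running example; the comparisons that reveal the testing effect are noted in the comments.

```python
import random

random.seed(2)
participants = list(range(500))
random.shuffle(participants)
q = len(participants) // 4

group_a = participants[:q]           # pretest + intervention + post-test
group_b = participants[q:2*q]        # pretest + post-test (control)
group_c = participants[2*q:3*q]      # intervention + post-test (no pretest)
group_d = participants[3*q:]         # post-test only (control, no pretest)

# Comparing post-tests of A vs C (and B vs D) estimates how much merely
# taking the pretest changed the outcome -- the testing effect.
print([len(g) for g in (group_a, group_b, group_c, group_d)])
```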

Solomon four group designs are less common in social work research, primarily because of the logistics and resource needs involved. Nonetheless, this is an important experimental design to consider when we want to address major concerns about testing effects.

Key Takeaways

  • True experimental design is best suited for explanatory research questions.
  • True experiments require random assignment of participants to control and experimental groups.
  • Pretest/post-test research design involves two points of measurement—one pre-intervention and one post-intervention.
  • Post-test only research design involves only one point of measurement—post-intervention. It is a useful design to minimize the effect of testing effects on our results.
  • Solomon four group research design involves both of the above types of designs, using 2 pairs of control and experimental groups. One group receives both a pretest and a post-test, while the other receives only a post-test. This can help uncover the influence of testing effects.
Exercises

  • Think about a true experiment you might conduct for your research project. Which design would be best for your research, and why?
  • What challenges or limitations might make it unrealistic (or at least very complicated!) for you to carry your true experimental design in the real-world as a student researcher?
  • What hypothesis(es) would you test using this true experiment?

13.4 Quasi-experimental designs

Learning objectives

Learners will be able to…

  • Describe a quasi-experimental design in social work research
  • Understand the different types of quasi-experimental designs
  • Determine what kinds of research questions quasi-experimental designs are suited for
  • Discuss advantages and disadvantages of quasi-experimental designs

Quasi-experimental designs are a lot more common in social work research than true experimental designs. Although quasi-experiments don’t do as good a job of giving us robust proof of causality , they still allow us to establish time order , which is a key element of causality. The prefix quasi means “resembling,” so quasi-experimental research is research that resembles experimental research, but is not true experimental research. Nonetheless, given proper research design, quasi-experiments can still provide extremely rigorous and useful results.

There are a few key differences between true experimental and quasi-experimental research. The primary difference is that quasi-experimental research does not involve random assignment to control and experimental groups; we talk about comparison groups in quasi-experimental research instead. As a result, these types of experiments don’t control the effect of extraneous variables as well as a true experiment.

Quasi-experiments are most likely to be conducted in field settings in which random assignment is difficult or impossible. They are often conducted to evaluate the effectiveness of a treatment—perhaps a type of psychotherapy or an educational intervention. We’re able to eliminate some threats to internal validity, but we can’t do this as effectively as we can with a true experiment. Realistically, our CBT for social anxiety project is likely to be a quasi-experiment, based on the resources and participant pool we’re likely to have available.

It’s important to note that not all quasi-experimental designs have a comparison group.  There are many different kinds of quasi-experiments, but we will discuss the three main types below: nonequivalent comparison group designs, time series designs, and ex post facto comparison group designs.

Nonequivalent comparison group design

You will notice that this type of design looks extremely similar to the pretest/post-test design that we discussed in section 13.2. But instead of random assignment to control and experimental groups, researchers use other methods to construct their comparison and experimental groups. A diagram of this design will also look very similar to pretest/post-test design, but you’ll notice we’ve removed the “R” from our groups, since they are not randomly assigned (Figure 13.6).

[Figure 13.6: Nonequivalent comparison group design]

Researchers using this design select a comparison group that’s as close as possible based on relevant factors to their experimental group. Engel and Schutt (2017) [2] identify two different selection methods:

  • Individual matching : Researchers take the time to match individual cases in the experimental group to similar cases in the comparison group. It can be difficult, however, to match participants on all the variables you want to control for (a minimal sketch of this approach follows the list).
  • Aggregate matching : Instead of trying to match individual participants to each other, researchers try to match the population profile of the comparison and experimental groups. For example, researchers would try to match the groups on average age, gender balance, or median income. This is a less resource-intensive matching method, but researchers have to ensure that participants aren’t choosing which group (comparison or experimental) they are a part of.
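Here is a rough sketch of individual matching on a single variable (age). The client records are hypothetical, and real matching would use several relevant factors at once.

```python
# Hypothetical clients: (id, age). The experimental group already exists;
# we pick the closest-aged unmatched client as each person's comparison.
experimental = [("E1", 24), ("E2", 37), ("E3", 51)]
candidates = [("C1", 22), ("C2", 36), ("C3", 49), ("C4", 60), ("C5", 25)]

comparison = []
available = list(candidates)
for _, age in experimental:
    best = min(available, key=lambda c: abs(c[1] - age))  # nearest age
    comparison.append(best)
    available.remove(best)  # each comparison client can be used only once

print("Comparison group:", comparison)
```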

As we’ve already talked about, this kind of design provides weaker evidence that the intervention itself leads to a change in outcome. Nonetheless, we are still able to establish time order using this method, and can thereby show an association between the intervention and the outcome. Like true experimental designs, this type of quasi-experimental design is useful for explanatory research questions.

What might this look like in a practice setting? Let’s say you’re working at an agency that provides CBT and other types of interventions, and you have identified a group of clients who are seeking help for social anxiety, as in our earlier example. Once you’ve obtained consent from your clients, you can create a comparison group using one of the matching methods we just discussed. If the group is small, you might match using individual matching, but if it’s larger, you’ll probably sort people by demographics to try to get similar population profiles. (You can do aggregate matching more easily when your agency has some kind of electronic records or database, but it’s still possible to do manually.)

Time series design

Another type of quasi-experimental design is a time series design. Unlike other types of experimental design, time series designs do not have a comparison group. A time series is a set of measurements taken at intervals over a period of time (Figure 13.7). Proper time series design should include at least three pre- and post-intervention measurement points. While there are a few types of time series designs, we’re going to focus on the most common: interrupted time series design.

[Figure 13.7: Time series design]

But why use this method? Here’s an example. Let’s think about elementary student behavior throughout the school year. As anyone with children or who is a teacher knows, kids get very excited and animated around holidays, days off, or even just on a Friday afternoon. This fact might mean that around those times of year, there are more reports of disruptive behavior in classrooms. What if we took our one and only measurement in mid-December? It’s possible we’d see a higher-than-average rate of disruptive behavior reports, which could bias our results if our next measurement is around a time of year students are in a different, less excitable frame of mind. When we take multiple measurements throughout the first half of the school year, we can establish a more accurate baseline for the rate of these reports by looking at the trend over time.

We may want to test the effect of extended recess times in elementary school on reports of disruptive behavior in classrooms. When students come back after the winter break, the school extends recess by 10 minutes each day (the intervention), and the researchers start tracking the monthly reports of disruptive behavior again. These reports could be subject to the same fluctuations as the pre-intervention reports, and so we once again take multiple measurements over time to try to control for those fluctuations.

This method improves the extent to which we can establish causality because we are accounting for a major extraneous variable in the equation—the passage of time. On its own, it does not allow us to account for other extraneous variables, but it does establish time order and association between the intervention and the trend in reports of disruptive behavior. Finding a stable condition before the treatment that changes after the treatment is evidence for causality between treatment and outcome.
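A minimal sketch of the interrupted time series comparison, with invented monthly counts of disruptive-behavior reports (note the December spike in the pre-intervention series):

```python
from statistics import mean

# Hypothetical monthly report counts before the intervention (Aug-Dec);
# the December value spikes with holiday excitement.
pre_intervention = [15, 14, 16, 15, 22]

# Counts after recess was extended in January (Jan-Jun).
post_intervention = [12, 11, 10, 12, 9, 11]

# Averaging several measurements on each side absorbs seasonal swings
# that a single pre/post pair of observations would not.
print(f"Mean reports before: {mean(pre_intervention):.1f}")
print(f"Mean reports after:  {mean(post_intervention):.1f}")
```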

Ex post facto comparison group design

Ex post facto (Latin for “after the fact”) designs are extremely similar to nonequivalent comparison group designs. There are still comparison and experimental groups, pretest and post-test measurements, and an intervention. But in ex post facto designs, participants are assigned to the comparison and experimental groups once the intervention has already happened. This type of design often occurs when interventions are already up and running at an agency and the agency wants to assess effectiveness based on people who have already completed treatment.

In most clinical agency environments, social workers conduct both initial and exit assessments, so there are usually some kind of pretest and post-test measures available. We also typically collect demographic information about our clients, which could allow us to try to use some kind of matching to construct comparison and experimental groups.

In terms of internal validity and establishing causality, ex post facto designs are a bit of a mixed bag. The ability to establish causality depends partially on the ability to construct comparison and experimental groups that are demographically similar so we can control for these extraneous variables .

Quasi-experimental designs are common in social work intervention research because, when designed correctly, they balance the intense resource needs of true experiments with the realities of research in practice. They still offer researchers tools to gather robust evidence about whether interventions are having positive effects for clients.

Key Takeaways

  • Quasi-experimental designs are similar to true experiments, but do not require random assignment to experimental and control groups.
  • In quasi-experimental projects, the group not receiving the treatment is called the comparison group, not the control group.
  • Nonequivalent comparison group design is nearly identical to pretest/post-test experimental design, but participants are not randomly assigned to the experimental and control groups. As a result, this design provides slightly less robust evidence for causality.
  • Nonequivalent groups can be constructed by individual matching or aggregate matching .
  • Time series design does not have a control or experimental group, and instead compares the condition of participants before and after the intervention by measuring relevant factors at multiple points in time. This allows researchers to mitigate the error introduced by the passage of time.
  • Ex post facto comparison group designs are also similar to true experiments, but experimental and comparison groups are constructed after the intervention is over. This makes it more difficult to control for the effect of extraneous variables, but still provides useful evidence for causality because it maintains the time order of the experiment.
Exercises

  • Think back to the experiment you considered for your research project in Section 13.2. Now that you know more about quasi-experimental designs, do you still think it’s a true experiment? Why or why not?
  • What should you consider when deciding whether an experimental or quasi-experimental design would be more feasible or fit your research question better?

13.5 Non-experimental designs

Learning objectives

Learners will be able to…

  • Describe non-experimental designs in social work research
  • Discuss how non-experimental research differs from true and quasi-experimental research
  • Demonstrate an understanding of the different types of non-experimental designs
  • Determine what kinds of research questions non-experimental designs are suited for
  • Discuss advantages and disadvantages of non-experimental designs

The previous sections have laid out the basics of some rigorous approaches to establish that an intervention is responsible for changes we observe in research participants. This type of evidence is extremely important to build an evidence base for social work interventions, but it’s not the only type of evidence to consider. We will discuss qualitative methods, which provide us with rich, contextual information, in Part 4 of this text. The designs we’ll talk about in this section are sometimes used in qualitative research, but in keeping with our discussion of experimental design so far, we’re going to stay in the quantitative research realm for now. Non-experimental research is also often a stepping stone to more rigorous experimental designs in the future, as it can help test the feasibility of your research.

In general, non-experimental designs do not strongly support causality and don’t address threats to internal validity. However, that’s not really what they’re intended for. Non-experimental designs are useful for a few different types of research, including explanatory questions in program evaluation. Certain types of non-experimental design are also helpful for researchers when they are trying to develop a new assessment or scale. Other times, researchers or agency staff did not get a chance to gather any assessment information before an intervention began, so a pretest/post-test design is not possible.


A significant benefit of these types of designs is that they’re pretty easy to execute in a practice or agency setting. They don’t require a comparison or control group, and as Engel and Schutt (2017) [3] point out, they “flow from a typical practice model of assessment, intervention, and evaluating the impact of the intervention” (p. 177). Thus, these designs are fairly intuitive for social workers, even when they aren’t expert researchers. Below, we will go into some detail about the different types of non-experimental design.

One group pretest/post-test design

Also known as a before-after one-group design, this type of research design does not have a comparison group and everyone who participates in the research receives the intervention (Figure 13.8). This is a common type of design in program evaluation in the practice world. Controlling for extraneous variables is difficult or impossible in this design, but given that it is still possible to establish some measure of time order, it does provide weak support for causality.

[Figure 13.8: One group pretest/post-test design]

Imagine, for example, a researcher who is interested in the effectiveness of an anti-drug education program on elementary school students’ attitudes toward illegal drugs. The researcher could assess students’ attitudes about illegal drugs (O1), implement the anti-drug program (X), and then immediately after the program ends, the researcher could once again measure students’ attitudes toward illegal drugs (O2). You can see how this would be relatively simple to do in practice, and have probably been involved in this type of research design yourself, even if informally. But hopefully, you can also see that this design would not provide us with much evidence for causality because we have no way of controlling for the effect of extraneous variables. A lot of things could have affected any change in students’ attitudes—maybe girls already had different attitudes about illegal drugs than children of other genders, and when we look at the class’s results as a whole, we couldn’t account for that influence using this design.
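A sketch of how the O1/O2 comparison might be computed, assuming invented attitude scores for six students; remember that without a comparison group a positive change is only weak evidence.

```python
from statistics import mean

# Hypothetical attitude scores (higher = stronger anti-drug attitudes)
# for the same six students before (O1) and after (O2) the program (X).
o1 = [3.1, 2.8, 3.4, 2.9, 3.0, 3.3]
o2 = [3.6, 3.0, 3.5, 3.4, 3.2, 3.8]

changes = [after - before for before, after in zip(o1, o2)]
print(f"Mean change: {mean(changes):+.2f}")
# A positive mean change suggests attitudes shifted after the program,
# but extraneous variables could equally explain it in this design.
```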

All of that doesn’t mean these results aren’t useful, however. If we find that children’s attitudes didn’t change at all after the drug education program, then we need to think seriously about how to make it more effective or whether we should be using it at all. (This immediate, practical application of our results highlights a key difference between program evaluation and research, which we will discuss in Chapter 23 .)

After-only design

As the name suggests, this type of non-experimental design involves measurement only after an intervention. There is no comparison or control group, and everyone receives the intervention. I have seen this design repeatedly in my time as a program evaluation consultant for nonprofit organizations, because often these organizations realize too late that they would like to or need to have some sort of measure of what effect their programs are having.

Because there is no pretest and no comparison group, this design is not useful for supporting causality since we can’t establish the time order and we can’t control for extraneous variables. However, that doesn’t mean it’s not useful at all! Sometimes, agencies need to gather information about how their programs are functioning. A classic example of this design is satisfaction surveys—realistically, these can only be administered after a program or intervention. Questions regarding satisfaction, ease of use or engagement, or other questions that don’t involve comparisons are best suited for this type of design.

Static-group design

A final type of non-experimental research is the static-group design. In this type of research, there are both comparison and experimental groups, which are not randomly assigned. There is no pretest, only a post-test, and the comparison group has to be constructed by the researcher. Sometimes, researchers will use matching techniques to construct the groups, but often, the groups are constructed by convenience of who is being served at the agency.

Non-experimental research designs are easy to execute in practice, but we must be cautious about drawing causal conclusions from the results. A positive result may still suggest that we should continue using a particular intervention (and no result or a negative result should make us reconsider whether we should use that intervention at all). You have likely seen non-experimental research in your daily life or at your agency, and knowing the basics of how to structure such a project will help you ensure you are providing clients with the best care possible.

Key Takeaways

  • Non-experimental designs are useful for describing phenomena, but cannot demonstrate causality.
  • After-only designs are often used in agency and practice settings because practitioners are often not able to set up pre-test/post-test designs.
  • Non-experimental designs are useful for exploratory questions in program evaluation and are helpful for researchers when they are trying to develop a new assessment or scale.
  • Non-experimental designs are well-suited to qualitative methods.
  • If you were to use a non-experimental design for your research project, which would you choose? Why?
  • Have you conducted non-experimental research in your practice or professional life? Which type of non-experimental design was it?

13.6 Critical, ethical, and cultural considerations

  • Describe critiques of experimental design
  • Identify ethical issues in the design and execution of experiments
  • Identify cultural considerations in experimental design

As I said at the outset, experiments, and especially true experiments, have long been seen as the gold standard for gathering scientific evidence. When it comes to research in the biomedical field and the physical sciences, true experiments are subject to far less nuance than experiments in the social world. This doesn’t mean they are easier, just subject to different forces. However, as a society, we have placed the most value on quantitative evidence obtained through empirical observation, and especially experimentation.

Major critiques of experimental designs tend to focus on true experiments, especially randomized controlled trials (RCTs), but many of these critiques can be applied to quasi-experimental designs, too. Some researchers, even in the biomedical sciences, question the view that RCTs are inherently superior to other types of quantitative research designs. RCTs are far less flexible and have much more stringent requirements than other types of research. One seemingly small issue, like incorrect information about a research participant, can derail an entire RCT. RCTs also cost a great deal of money to implement and don’t reflect “real world” conditions. The cost of true experimental research or RCTs also means that some communities are unlikely to ever have access to these research methods. It is then easy for people to dismiss their research findings because their methods are seen as “not rigorous.”

Obviously, controlling outside influences is important for researchers to draw strong conclusions, but what if those outside influences are actually important to how an intervention works? Are we missing really important information by focusing solely on control in our research? Is a treatment going to work the same for white women as it does for Indigenous women? Given the myriad effects of our societal structures, you should be very careful about ever assuming this will be the case. This doesn’t mean that cultural differences will negate the effect of an intervention; instead, it means that you should remember to practice cultural humility when implementing all interventions, even when we “know” they work.

How we build evidence through experimental research reveals a lot about our values and biases, and historically, much experimental research has been conducted on white people, and especially white men. [4] This makes sense when we consider the extent to which the sciences and academia have historically been dominated by white patriarchy. This is especially important for marginalized groups that have long been ignored in the research literature, meaning they have also been ignored in the development of interventions and treatments that are accepted as “effective.” There are also examples of marginalized groups being experimented on without their consent, such as the Tuskegee Experiment or Nazi experiments on Jewish people during World War II. We cannot ignore the collective consciousness that situations like these can create around experimental research for marginalized groups.

None of this is to say that experimental research is inherently bad or that you shouldn’t use it. Quite the opposite: use it when you can, because there are a lot of benefits, as we learned throughout this chapter. As a social work researcher, you are uniquely positioned to conduct experimental research while applying social work values and ethics to the process, and to be a leader for others conducting research in the same framework. Not engaging in experimental research with our eyes wide open can conflict with our professional ethics, especially respect for persons and beneficence. We also have the benefit of a great deal of practice knowledge that researchers in other fields have not had the opportunity to gain. As with all your research, always be sure you are fully exploring the limitations of the research.

  • While true experimental research gathers strong evidence, it can also be inflexible, expensive, and overly simplistic about the important social forces that affect the results of the interventions we study.
  • Marginalized communities’ past experiences with experimental research can affect how they respond to research participation.
  • Social work researchers should use both their values and ethics, and their practice experiences, to inform research and push other researchers to do the same.
  • Think back to the true experiment you sketched out in the exercises for Section 13.3. Are there cultural or historical considerations you hadn’t thought of with your participant group? What are they? Does this change the type of experiment you would want to do?
  • How can you as a social work researcher encourage researchers in other fields to consider social work ethics and values in their experimental research?

  • Engel, R., & Schutt, R. (2016). The practice of research in social work. Thousand Oaks, CA: SAGE Publications, Inc.
  • Sullivan, G. M. (2011). Getting off the “gold standard”: Randomized controlled trials and education research. Journal of Graduate Medical Education, 3(3), 285-289.

Glossary

an operation or procedure carried out under controlled conditions in order to discover an unknown effect or law, to test or establish a hypothesis, or to illustrate a known law.

explains why particular phenomena work in the way that they do; answers “why” questions

variables and characteristics that have an effect on your outcome, but aren't the primary variable whose influence you're interested in testing.

the group of participants in our study who do not receive the intervention we are researching in experiments with random assignment

in experimental design, the group of participants in our study who do receive the intervention we are researching

the group of participants in our study who do not receive the intervention we are researching in experiments without random assignment

using a random process to decide which participants are tested in which conditions

The ability to apply research findings beyond the study sample to some broader population.

Ability to say that one variable "causes" something to happen to another variable. Very important to assess when thinking about studies that examine causation such as experimental or quasi-experimental designs.

the idea that one event, behavior, or belief will result in the occurrence of another, subsequent event, behavior, or belief

An experimental design in which one or more independent variables are manipulated by the researcher (as treatments), subjects are randomly assigned to different treatment levels (random assignment), and the results of the treatments on outcomes (dependent variables) are observed

a type of experimental design in which participants are randomly assigned to control and experimental groups, one group receives an intervention, and both groups receive pre- and post-test assessments

A measure of a participant's condition before they receive an intervention or treatment.

A measure of a participant's condition after an intervention or, if they are part of the control/comparison group, at the end of an experiment.

A demonstration that a change occurred after an intervention. An important criterion for establishing causality.

an experimental design in which participants are randomly assigned to control and treatment groups, one group receives an intervention, and both groups receive only a post-test assessment

The measurement error related to how a test is given; the conditions of the testing, including environmental conditions; and acclimation to the test itself

a subtype of experimental design that is similar to a true experiment, but does not have randomly assigned control and treatment groups

In nonequivalent comparison group designs, the process by which researchers match individual cases in the experimental group to similar cases in the comparison group.

In nonequivalent comparison group designs, the process in which researchers match the population profile of the comparison and experimental groups.

a set of measurements taken at intervals over a period of time

Research that involves the use of data that represents human expression through words, pictures, movies, performance and other artifacts.

Graduate research methods in social work Copyright © 2021 by Matthew DeCarlo, Cory Cummings, Kate Agnelli is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.



Understanding Science

How science REALLY works...

Frequently asked questions about how science works

The Understanding Science site is assembling an expanded list of FAQs for the site and you can contribute. Have a question about how science works, what science is, or what it’s like to be a scientist? Send it to  [email protected] !


What is the scientific method?

The “scientific method” is traditionally presented in the first chapter of science textbooks as a simple, linear, five- or six-step procedure for performing scientific investigations. Although the Scientific Method captures the core logic of science (testing ideas with evidence), it misrepresents many other aspects of the true process of science — the dynamic, nonlinear, and creative ways in which science is actually done. In fact, the Scientific Method more accurately describes how science is summarized  after the fact  — in textbooks and journal articles — than how scientific research is actually performed. Teachers may ask that students use the format of the scientific method to write up the results of their investigations (e.g., by reporting their  question, background information, hypothesis, study design, data analysis,  and  conclusion ), even though the process that students went through in their investigations may have involved many iterations of questioning, background research, data collection, and data analysis and even though the students’ “conclusions” will always be tentative ones. To learn more about how science really works and to see a more accurate representation of this process, visit  The  real  process of science .

Why do scientists often seem tentative about their explanations?

Scientists often seem tentative about their explanations because they are aware that those explanations could change if new evidence or perspectives come to light. When scientists write about their ideas in journal articles, they are expected to carefully analyze the evidence for and against their ideas and to be explicit about alternative explanations for what they are observing. Because they are trained to do this for their scientific writing, scientists often do the same thing when talking to the press or a broader audience about their ideas. Unfortunately, this means that they are sometimes misinterpreted as being wishy-washy or unsure of their ideas. Even worse, ideas supported by masses of evidence are sometimes discounted by the public or the press because scientists talk about those ideas in tentative terms. It’s important for the public to recognize that, while provisionality is a fundamental characteristic of scientific knowledge, scientific ideas supported by evidence are trustworthy. To learn more about provisionality in science, visit our page describing how science builds knowledge. To learn more about how this provisionality can be misinterpreted, visit a section of the Science toolkit.

Why is peer review useful?

Peer review helps assure the quality of published scientific work: that the authors haven’t ignored key ideas or lines of evidence, that the study was fairly designed, that the authors were objective in their assessment of their results, etc. This means that even if you are unfamiliar with the research presented in a particular peer-reviewed study, you can trust it to meet certain standards of scientific quality. This also saves scientists time in keeping up to date with advances in their fields by weeding out untrustworthy studies. Peer-reviewed work isn’t necessarily correct or conclusive, but it does meet the standards of science. To learn more, visit Scrutinizing science.

What is the difference between independent and dependent variables?

In an experiment, the independent variables are the factors that the experimenter manipulates. The dependent variable is the outcome of interest—the outcome that depends on the experimental set-up. Experiments are set up to learn more about how the independent variable does or does not affect the dependent variable. So, for example, if you were testing a new drug to treat Alzheimer’s disease, the independent variable might be whether or not the patient received the new drug, and the dependent variable might be how well participants perform on memory tests. On the other hand, to study how the temperature, volume, and pressure of a gas are related, you might set up an experiment in which you change the volume of a gas, while keeping the temperature constant, and see how this affects the gas’s pressure. In this case, the independent variable is the gas’s volume, and the dependent variable is the pressure of the gas. The temperature of the gas is a controlled variable. To learn more about experimental design, visit Fair tests: A do-it-yourself guide.
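
To make the gas example concrete, here is a minimal sketch using the ideal gas law PV = nRT with invented values; volume is the manipulated independent variable, pressure is the measured dependent variable, and temperature is held constant as a controlled variable.

```python
# Ideal gas sketch: vary volume (independent variable) at constant
# temperature (controlled variable) and compute pressure (dependent variable).
R = 8.314   # gas constant, J/(mol*K)
n = 1.0     # amount of gas in moles (assumed)
T = 298.0   # temperature in kelvin, held constant across trials

for volume in [0.010, 0.020, 0.030, 0.040]:  # m^3, values we choose to manipulate
    pressure = n * R * T / volume             # Pa, the outcome we observe
    print(f"V = {volume:.3f} m^3 -> P = {pressure / 1000:.1f} kPa")
```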

What is a control group?

In scientific testing, a control group is a group of individuals or cases that is treated in the same way as the experimental group, but that is not exposed to the experimental treatment or factor. Results from the experimental group and control group can be compared. If the control group is treated very similarly to the experimental group, it increases our confidence that any difference in outcome is caused by the presence of the experimental treatment in the experimental group. For an example, visit our side trip  Fair tests in the field of medicine .

What is the difference between a positive and a negative control group?

A negative control group is a control group that is not exposed to the experimental treatment or to any other treatment that is expected to have an effect. A positive control group is a control group that is not exposed to the experimental treatment but that is exposed to some other treatment that is known to produce the expected effect. These sorts of controls are particularly useful for validating the experimental procedure. For example, imagine that you wanted to know if some lettuce carried bacteria. You set up an experiment in which you wipe lettuce leaves with a swab, wipe the swab on a bacterial growth plate, incubate the plate, and see what grows on the plate. As a negative control, you might just wipe a sterile swab on the growth plate. You would not expect to see any bacterial growth on this plate, and if you do, it is an indication that your swabs, plates, or incubator are contaminated with bacteria that could interfere with the results of the experiment. As a positive control, you might swab an existing colony of bacteria and wipe it on the growth plate. In this case, you  would  expect to see bacterial growth on the plate, and if you do not, it is an indication that something in your experimental set-up is preventing the growth of bacteria. Perhaps the growth plates contain an antibiotic or the incubator is set to too high a temperature. If either the positive or negative control does not produce the expected result, it indicates that the investigator should reconsider his or her experimental procedure. To learn more about experimental design, visit  Fair tests: A do-it-yourself guide .

What is a correlational study, and how is it different from an experimental study?

In a correlational study, a scientist looks for associations between variables (e.g., are people who eat lots of vegetables less likely to suffer heart attacks than others?) without manipulating any variables (e.g., without asking a group of people to eat more or fewer vegetables than they usually would). In a correlational study, researchers may be interested in any sort of statistical association — a positive relationship among variables, a negative relationship among variables, or a more complex one. Correlational studies are used in many fields (e.g., ecology, epidemiology, astronomy, etc.), but the term is frequently associated with psychology. Correlational studies are often discussed in contrast to experimental studies. In experimental studies, researchers do manipulate a variable (e.g., by asking one group of people to eat more vegetables and asking a second group of people to eat as they usually do) and investigate the effect of that change. If an experimental study is well-designed, it can tell a researcher more about the cause of an association than a correlational study of the same system can. Despite this difference, correlational studies still generate important lines of evidence for testing ideas and often serve as the inspiration for new hypotheses. Both types of study are very important in science and rely on the same logic to relate evidence to ideas. To learn more about the basic logic of scientific arguments, visit  The core of science .
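
As a minimal sketch of the distinction, the snippet below computes a correlation on invented data; nothing here is real epidemiological data, and the variable names are illustrative only.

```python
# Correlational analysis sketch: measure an association without
# manipulating either variable.
from scipy import stats

veg_servings_per_day = [1, 2, 2, 3, 4, 4, 5, 6]  # observed, not assigned
heart_attack_risk = [0.12, 0.11, 0.10, 0.09, 0.08, 0.08, 0.06, 0.05]

r, p = stats.pearsonr(veg_servings_per_day, heart_attack_risk)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")  # negative r = inverse association

# Even a strong correlation cannot show that vegetables lower risk; an
# experimental study would randomly assign diets to probe causation.
```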

What is the difference between deductive and inductive reasoning?

Deductive reasoning involves logically extrapolating from a set of premises or hypotheses. You can think of this as logical “if-then” reasoning. For example, IF an asteroid strikes Earth, and IF iridium is more prevalent in asteroids than in Earth’s crust, and IF nothing else happens to the asteroid iridium afterwards, THEN there will be a spike in iridium levels at Earth’s surface. The THEN statement is the logical consequence of the IF statements. Another case of deductive reasoning involves reasoning from a general premise or hypothesis to a specific instance. For example, based on the idea that all living things are built from cells, we might  deduce  that a jellyfish (a specific example of a living thing) has cells. Inductive reasoning, on the other hand, involves making a generalization based on many individual observations. For example, a scientist who samples rock layers from the Cretaceous-Tertiary (KT) boundary in many different places all over the world and always observes a spike in iridium may  induce  that all KT boundary layers display an iridium spike. The logical leap from many individual observations to one all-inclusive statement isn’t always warranted. For example, it’s possible that, somewhere in the world, there is a KT boundary layer without the iridium spike. Nevertheless, many individual observations often make a strong case for a more general pattern. Deductive, inductive, and other modes of reasoning are all useful in science. It’s more important to understand the logic behind these different ways of reasoning than to worry about what they are called.

What is the difference between a theory and a hypothesis?

Scientific theories are broad explanations for a wide range of phenomena, whereas hypotheses are proposed explanations for a fairly narrow set of phenomena. The difference between the two is largely one of breadth. Theories have broader explanatory power than hypotheses do and often integrate and generalize many hypotheses. To be accepted by the scientific community, both theories and hypotheses must be supported by many different lines of evidence. However, both theories and hypotheses may be modified or overturned if warranted by new evidence and perspectives.

What is a null hypothesis?

A null hypothesis is usually a statement asserting that there is no difference or no association between variables. The null hypothesis is a tool that makes it possible to use certain statistical tests to figure out if another hypothesis of interest is likely to be accurate or not. For example, if you were testing the idea that sugar makes kids hyperactive, your null hypothesis might be that there is no difference in the amount of time that kids previously given a sugary drink and kids previously given a sugar-substitute drink are able to sit still. After making your observations, you would then perform a statistical test to determine whether or not there is a significant difference between the two groups of kids in time spent sitting still.
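
As a minimal sketch of this procedure, the snippet below runs a two-sample t-test on invented sitting-still times for the sugar example; the data and the 0.05 cutoff are illustrative assumptions, not part of the original text.

```python
# Null hypothesis test sketch: "no difference in time spent sitting still
# between the sugar group and the sugar-substitute group."
from scipy import stats

sugar_drink = [12.1, 10.4, 11.8, 9.9, 12.5, 10.8, 11.2, 10.1]        # minutes sitting still
substitute_drink = [11.9, 10.7, 12.2, 10.3, 11.6, 11.0, 12.4, 10.9]

# p is the probability of observing a difference at least this large
# if the null hypothesis were true.
t_stat, p_value = stats.ttest_ind(sugar_drink, substitute_drink)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
print("reject null" if p_value < 0.05 else "fail to reject null")
```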

What is Ockham's razor?

Ockham’s razor is an idea with a long philosophical history. Today, the term is frequently used to refer to the principle of parsimony — that, when two explanations fit the observations equally well, a simpler explanation should be preferred over a more convoluted and complex explanation. Stated another way, Ockham’s razor suggests that, all else being equal, a straightforward explanation should be preferred over an explanation requiring more assumptions and sub-hypotheses. Visit  Competing ideas: Other considerations  to read more about parsimony.

What does science have to say about ghosts, ESP, and astrology?

Rigorous and well-controlled scientific investigations 1 have examined these topics and have found no evidence supporting their usual interpretations as natural phenomena (i.e., ghosts as apparitions of the dead, ESP as the ability to read minds, and astrology as the influence of celestial bodies on human personalities and affairs) — although, of course, different people interpret these topics in different ways. Science can investigate such phenomena and explanations only if they are thought to be part of the natural world. To learn more about the differences between science and astrology, visit Astrology: Is it scientific? To learn more about the natural world and the sorts of questions and phenomena that science can investigate, visit What’s natural? To learn more about how science approaches the topic of ESP, visit ESP: What can science say?

Has science had any negative effects on people or the world in general?

Knowledge generated by science has had many effects that most would classify as positive (e.g., allowing humans to treat disease or communicate instantly with people half way around the world); it also has had some effects that are often considered negative (e.g., allowing humans to build nuclear weapons or pollute the environment with industrial processes). However, it’s important to remember that the process of science and scientific knowledge are distinct from the uses to which people put that knowledge. For example, through the process of science, we have learned a lot about deadly pathogens. That knowledge might be used to develop new medications for protecting people from those pathogens (which most would consider a positive outcome), or it might be used to build biological weapons (which many would consider a negative outcome). And sometimes, the same application of scientific knowledge can have effects that would be considered both positive and negative. For example, research in the first half of the 20th century allowed chemists to create pesticides and synthetic fertilizers. Supporters argue that the spread of these technologies prevented widespread famine. However, others argue that these technologies did more harm than good to global food security. Scientific knowledge itself is neither good nor bad; however, people can choose to use that knowledge in ways that have either positive or negative effects. Furthermore, different people may make different judgments about whether the overall impact of a particular piece of scientific knowledge is positive or negative. To learn more about the applications of scientific knowledge, visit  What has science done for you lately?

1 For examples, see:

  • Milton, J., and R. Wiseman. 1999. Does psi exist? Lack of replication of an anomalous process of information transfer.  Psychological Bulletin  125:387-391.
  • Carlson, S. 1985. A double-blind test of astrology.  Nature  318:419-425.
  • Arzy, S., M. Seeck, S. Ortigue, L. Spinelli, and O. Blanke. 2006. Induction of an illusory shadow person.  Nature  443:287.
  • Gassmann, G., and D. Glindemann. 1993. Phosphane (PH 3 ) in the biosphere.  Angewandte Chemie International Edition in English  32:761-763.



Controlled experiments

Introduction

How are hypotheses tested?

  • One pot of seeds gets watered every afternoon.
  • The other pot of seeds doesn't get any water at all.

Control and experimental groups

Independent and dependent variables

Variability and repetition

Controlled experiment case study: CO2 and coral bleaching

  • What your control and experimental groups would be
  • What your independent and dependent variables would be
  • What results you would predict in each group

Experimental setup

  • Some corals were grown in tanks of normal seawater, which is not very acidic (pH around 8.2). The corals in these tanks served as the control group.
  • Other corals were grown in tanks of seawater that were more acidic than usual due to the addition of CO2. One set of tanks was medium-acidity (pH about 7.9), while another set was high-acidity (pH about 7.65). Both the medium-acidity and high-acidity groups were experimental groups.
  • In this experiment, the independent variable was the acidity (pH) of the seawater. The dependent variable was the degree of bleaching of the corals.
  • The researchers used a large sample size and repeated their experiment. Each tank held 5 fragments of coral, and there were 5 identical tanks for each group (control, medium-acidity, and high-acidity). Note: none of these tanks was "acidic" on an absolute scale; that is, the pH values were all above the neutral pH of 7.0. However, the two groups of experimental tanks were moderately and highly acidic to the corals, that is, relative to their natural habitat of plain seawater.

Analyzing the results
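
As a minimal sketch of one way the groups in this setup could be compared, the snippet below runs a one-way ANOVA on invented bleaching scores; the actual measurements are reported in Anthony et al. (2008), cited below, and the numbers here are placeholders only.

```python
# Compare mean bleaching across the control, medium-acidity, and
# high-acidity groups (5 tanks per group, matching the setup above).
from scipy import stats

control = [8, 10, 7, 9, 11]            # hypothetical bleaching scores, pH ~8.2
medium_acidity = [22, 25, 19, 24, 21]  # pH ~7.9
high_acidity = [41, 38, 45, 40, 43]    # pH ~7.65

f_stat, p_value = stats.f_oneway(control, medium_acidity, high_acidity)
print(f"F = {f_stat:.1f}, p = {p_value:.4f}")
# A small p-value would indicate that mean bleaching differs among the pH
# groups, i.e., that the independent variable (acidity) affects the
# dependent variable (degree of bleaching).
```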

Non-experimental hypothesis tests

Case study: coral bleaching and temperature

Works cited:

  • Hoegh-Guldberg, O. (1999). Climate change, coral bleaching, and the future of the world's coral reefs. Mar. Freshwater Res., 50, 839-866. Retrieved from www.reef.edu.au/climate/Hoegh-Guldberg%201999.pdf.
  • Anthony, K. R. N., Kline, D. I., Diaz-Pulido, G., Dove, S., and Hoegh-Guldberg, O. (2008). Ocean acidification causes bleaching and productivity loss in coral reef builders. PNAS, 105(45), 17442-17446. http://dx.doi.org/10.1073/pnas.0804478105.
  • University of California Museum of Paleontology. (2016). Misconceptions about science. In Understanding science. Retrieved from http://undsci.berkeley.edu/teaching/misconceptions.php.
  • Hoegh-Guldberg, O. and Smith, G. J. (1989). The effect of sudden changes in temperature, light and salinity on the density and export of zooxanthellae from the reef corals Stylophora pistillata (Esper, 1797) and Seriatopora hystrix (Dana, 1846). J. Exp. Mar. Biol. Ecol., 129, 279-303. Retrieved from http://www.reef.edu.au/ohg/res-pic/HG%20papers/HG%20and%20Smith%201989%20BLEACH.pdf.



Open access | Published: 10 August 2024

The implications of healthcare professionals wearing jewelry on patient care biosafety: observational insights and experimental approaches

  • Isabela Fernanda Larios Fracarolli 1 ,
  • Evandro Watanabe 2 , 3 ,
  • Viviane de Cássia Oliveira 2 , 4 ,
  • Marinila Buzanelo Machado 2 ,
  • Felipe Lazarini Bim 2 ,
  • Lucas Lazarini Bim 2 ,
  • Denise de Andrade 2 , 5 &
  • Maria Helena Palucci Marziale 5  

Scientific Reports, volume 14, Article number: 18601 (2024)


Subjects: Disease prevention, Occupational health

The use of jewelry among healthcare professionals poses a risk of cross-contamination due to potential bacterial accumulation and spread. Through a mixed-method design, this study analyzed the implications of healthcare professionals wearing jewelry for patient care biosafety, as well as the residual bacterial load of hands and rings after hand hygiene. First, an observational prevalence study was carried out to verify whether nursing professionals wear personal accessories during healthcare assistance. Second, an experimental design involving intentional contamination and hygiene of the hands, with and without a ring, was conducted. The bacterial load of both hands and rings was measured by counting colony-forming units. The observational study showed that nursing workers frequently wear jewelry during healthcare assistance. Nonetheless, the experimental study did not indicate differences in bacterial contamination between hands with and without a ring, regardless of the hand hygiene procedure applied. In conclusion, many nursing workers wear jewelry in the workplace. Although hands with and without a ring exhibited similar microbial loads, rings appeared to be a potential source of bacterial contamination, reinforcing the need to remove jewelry during working hours. Hand hygiene using alcohol, or soap and water, significantly decreased the bacterial load on the participants’ hands, with handwashing proving the most efficient method for removing intentional contamination.


Introduction

The hands of healthcare workers are among the main routes of transmission of microorganisms responsible for causing healthcare-associated infections (HAIs)1. Besides hands, other environmental factors such as contaminated hospital surfaces, equipment, mobile phones, and workers’ clothing are associated with pathogen transmission2,3. Hence, it is important to consider that wearing personal accessories could be an additional factor contributing to the spread of microorganisms4,5.

Considering the critical status of ESKAPE pathogens (i.e., Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter spp.) and of multidrug-resistant microorganisms, the importance of hand hygiene in preventing HAI transmission has repeatedly been pointed out6. Many studies have linked the use of personal accessories with an increased risk of transferring microorganisms, and it is generally accepted that the skin covered by a ring remains contaminated even after hand hygiene7,8,9,10,11,12. Khodavaisy et al. reported that 73.1% of the hands and rings of health workers were contaminated by bacteria and fungi frequently associated with nosocomial infections5.

Wearing jewelry during healthcare assistance raises concerns about its impact on regular and appropriate hand hygiene13. This issue, added to health workers’ low adherence to hand hygiene guidelines, can be associated with pathogen transmission and serve as a source of HAIs4. Although the Brazilian guideline (Regulatory Standard 32)14 bans the wearing of personal accessories by healthcare workers, and international guidelines15,16 recommend their removal before beginning surgical hand preparation, their use is frequently tied to relationships (wedding rings), professional identity (graduation rings), and religious reasons. Since previous reports indicate that wearing jewelry can compromise hand hygiene10,12,13, attempts have been made to assess the quality of the hand hygiene technique among health workers wearing jewelry. However, most studies addressing this topic lack standardized methodological approaches, hindering the ability to draw definitive conclusions16. Hence, there is still a need to investigate the real impact of wearing a single plain ring on hand hygiene, in order to obtain evidence on whether it significantly influences and compromises the quality of the procedure.

Therefore, this study aimed to identify the prevalence of, and the reasons why, nursing workers fail to adhere to the Brazilian guideline (Regulatory Standard 32)14 and international guidelines15,16 that discourage personal adornments and accessories in the workplace. In parallel, an in situ microbiological assay evaluated whether wearing a ring alters the bacterial load of hands, and whether handwashing with soap and water is more effective than rubbing with 70% alcohol.

Study design

This study was conducted in two stages: (I) the first stage comprised an observational prevalence study that provided insights into the wearing of jewelry by healthcare workers; (II) the second stage was an in situ microbiological experiment in which the participants’ hands were intentionally contaminated in the presence or absence of a ring. The study was approved by the local Human Research Ethics Committee under protocol CAAE: 11213219.0.0000.5393, and all methods were carried out in accordance with relevant guidelines and regulations. The research protocol started only after written informed consent was secured from all participants and/or their legal guardian(s), confirming their informed and voluntary approval.

Observational prevalence study

To carry out the prevalence study, nursing workers’ adherence to the Brazilian guideline (Regulatory Standard 32)14 and international guidelines15 that discourage the use of jewelry during working hours in health facilities, and the barriers impeding their adherence, were assessed. The sampling method was “virtual snowball” sampling, started by sending invitations with a link to the electronic questionnaire (all questions required a response) through the WhatsApp social network or by email to nursing workers in various hospitals across Brazil, to collect information concerning social, cultural, and institutional aspects of wearing jewelry (Supplementary document 1). All professionals responded voluntarily, without any remuneration or compensation. Before the questionnaire was submitted to the study participants, content validation by experts according to the Delphi technique was carried out to ensure the clarity and objectivity of the questions. Data were collected from March 22nd to April 30th, 2021, from groups of nursing workers in Brazil (i.e., nurses, nursing auxiliaries, and technicians). The inclusion criteria were nursing workers providing direct care to hospitalized patients who accepted the invitation.

In situ microbiological experiment

To carry out the microbiological experiment, 60 volunteers (healthcare professionals and students from the nursing, medical, dentistry, and pharmacy schools) of both sexes were invited. Volunteers with skin lesions on their hands, undergoing treatment for fungal nail infections, taking immunosuppressant or antimicrobial agents (topical or cutaneous), or wearing artificial nails were excluded. Stainless steel wedding rings without external notches, grooves, or stones were selected according to the volunteers’ finger sizes.

A standard test method for evaluating the effectiveness of healthcare personnel handwash formulations was followed, with modifications, for the intentional contamination of participants’ hands, hand hygiene, and microbial recovery17. Briefly, the participants were trained to wash their hands with non-antimicrobial neutral liquid soap (Protex baby; Colgate-Palmolive, São Paulo, SP, Brazil) according to the standard technique recommended by the WHO to reduce dirt and microbial load1,15. The participants then dried their hands using six sheets of previously sterilized paper towels to avoid carrying microorganisms from another source. The volunteers received a sterile ring on the ring finger of the right hand, and both hands were artificially contaminated with 50 mL of Tryptic Soy Broth (BD Difco, Sparks, MD, USA) containing Lactobacillus casei (ATCC 334) [10^8 Colony Forming Units per milliliter (CFU/mL)]. Contamination occurred by immersing each of the participants’ hands into plastic bags with 50 mL of the culture medium containing the bacterial inoculum. Two researchers simultaneously massaged the participants’ palms, the backs of the hands, and the fingers for one minute; each researcher massaged one hand for 30 s and then swapped hands to prevent bias between the researchers. Subsequently, the hands were removed from the polyethylene bags, kept suspended for three minutes to dry, and then considered contaminated in a standard manner.
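
As a minimal sketch of the colony-count arithmetic behind reporting results as log10 CFU/mL, the snippet below converts colonies counted from a 50 μL aliquot back to a concentration; the colony count and dilution factor are invented examples, not values from the study.

```python
# Convert a plate count from a 50 uL (0.05 mL) aliquot to log10 CFU/mL:
# CFU/mL = colonies * dilution_factor / plated_volume_mL
import math

def log10_cfu_per_ml(colonies, plated_volume_ml=0.05, dilution_factor=1.0):
    cfu_per_ml = colonies * dilution_factor / plated_volume_ml
    return math.log10(cfu_per_ml)

# e.g. 120 colonies from an undiluted 50 uL aliquot:
print(f"{log10_cfu_per_ml(120):.2f} log10 CFU/mL")  # ~3.38
```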

The volunteers were randomly assigned to one of three groups, each with 20 members. The groups were based on the two standard hand hygiene techniques recommended by the WHO, plus a control: Group 1, hand hygiene with alcohol-based handrub (Start Quimica, RJ, Brazil); Group 2, handwashing with non-antimicrobial liquid soap; Group 3, control, no hand hygiene1.

To proceed with the collection of microorganisms after the hand hygiene procedure, the participants were instructed to individually immerse their hands into plastic bags containing 50 mL of Letheen Broth medium (BD Difco). Two researchers simultaneously massaged the participants’ hands as described previously. Afterward, each participant removed the ring using sterilized gauze and transferred it to a polystyrene conical tube (50 mL; TPP, Trasadingen, Switzerland) containing 15 mL of Letheen Broth. Aliquots of 50 μL from the hands (polyethylene bags) and rings (polystyrene conical tubes) were immediately seeded onto Rogosa Agar (BD Difco) and Mannitol Salt Agar (BD Difco) in Petri dishes (60 × 15 mm). Petri dishes and polystyrene conical tubes were incubated at 37 °C for 24 to 48 h or until turbidity was reached. Once turbidity was registered, the suspension was seeded onto Rogosa Agar and Mannitol Salt Agar in Petri dishes to verify the growth of L. casei and Staphylococcus spp. The number of CFU was registered and presented as log10 CFU/mL. An overview of the methodological design is presented in Fig. 1.

Figure 1. Methodological design overview, illustrating the step-by-step process and key components involved in the research methodology.

Statistical analysis

Data from the observational prevalence study were presented as absolute frequency values.

For the in situ microbiological experiment, the sample size was estimated by considering statistically significant changes in the bacterial counts (log10 CFU/mL). Based on the study of Trick et al., it was assumed that the hand wearing a ring would be associated with tenfold (1 log10) higher skin organism counts10. A pilot study with five participants indicated that the standard deviation of the bacterial counts of intentionally contaminated hands is 1.5 log10 CFU/mL. Thus, a total of 20 participants per group was considered sufficient to detect relevant differences (α = 0.05; β = 0.20)18.
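
The arithmetic behind this estimate can be approximated with a standard normal-approximation formula; the sketch below assumes a paired comparison (each participant contributing both a ringed and an unringed hand), which is one reading of the design, with Δ = 1 log10 and SD = 1.5 log10 CFU/mL as stated above.

```python
# Sample-size sketch for detecting a 1 log10 difference with SD 1.5,
# alpha = 0.05 (two-sided) and power = 0.80 (beta = 0.20).
from scipy import stats

alpha, power = 0.05, 0.80
delta, sd = 1.0, 1.5  # log10 CFU/mL

z_alpha = stats.norm.ppf(1 - alpha / 2)  # ~1.96
z_beta = stats.norm.ppf(power)           # ~0.84

n = ((z_alpha + z_beta) * sd / delta) ** 2  # paired-design approximation
print(f"n per group ≈ {n:.1f}")  # ~17.6, consistent with 20 per group
```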

The normality and homogeneity of variances of the data were verified using the Shapiro–Wilk test and Levene’s test, respectively. Due to the non-normal distribution, Brunner and Langer’s non-parametric analysis of longitudinal data in factorial experiments (nparLD package19) was used (The R Foundation for Statistical Computing, version 3.6.2). Turbidity in the polystyrene conical tubes was analyzed using Fisher’s exact test. The significance level was set at 0.05 for all analyses.
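
For orientation, the preliminary tests named above can be reproduced with scipy on placeholder data, as sketched below; the nparLD analysis itself is an R package with no direct scipy equivalent, and all values here are invented for illustration.

```python
# Univariate checks and Fisher's exact test on placeholder data.
from scipy import stats

group1 = [4.8, 4.1, 5.0, 4.6, 4.9]  # log10 CFU/mL, alcohol handrub (invented)
group2 = [3.2, 3.6, 2.9, 3.4, 3.1]  # log10 CFU/mL, soap and water (invented)

w1, p1 = stats.shapiro(group1)       # Shapiro-Wilk: null = data are normal
w2, p2 = stats.shapiro(group2)
stat, p_var = stats.levene(group1, group2)  # Levene: null = equal variances
print(f"Shapiro p: {p1:.3f}, {p2:.3f}; Levene p: {p_var:.3f}")

# Fisher's exact test on turbidity counts (turbid vs. clear tubes),
# e.g. 16/20 turbid in one group versus 2/20 in another:
odds_ratio, p_fisher = stats.fisher_exact([[16, 4], [2, 18]])
print(f"Fisher exact p: {p_fisher:.5f}")
```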

Results

A total of 133 nursing workers participated in the study: 113 (85.0%) were women and 20 (15.0%) were men. Nurses completed the questionnaire most frequently among all the professionals. Regarding the workplace of the nursing workers, critical wards appeared most frequently (55.0%). Regarding jewelry worn in the work environment, the following stood out: rings (14.1%), badges on strings (12.3%), and wristwatches (9.5%). Participants who did not wear any personal accessories presented the highest prevalence (35.3%). When asked which accessories they considered important to remove at the workplace, 14.4% reported rings, 12.1% wedding rings, and 12.1% wristwatches. Regarding training programs provided by their institutions, 66.1% of the participants reported having received training, while 24.8% reported that no training regarding the use of personal accessories was provided; 9.0% did not know. When asked whether their institutions supervised the use of personal accessories, 62.4% of participants reported supervision on the part of their employers, 31.5% reported no supervision, and 6.0% did not know. As for the health workers’ knowledge regarding the consequences of wearing personal accessories at work, most participants (88.7%) reported that these are related to an increased spread of microorganisms, while 9.8% reported that the use of accessories is not related to the spread of microorganisms in a hospital setting. Table 1 summarizes the variables of the observational prevalence study.

There were no statistically significant differences in the bacterial load of the hands with and without rings (L. casei: p = 0.465; Staphylococcus spp.: p = 0.985), and the ringed hand had no impact across the three hand hygiene protocols (L. casei: p = 0.126; Staphylococcus spp.: p = 0.667). Therefore, the null hypothesis could not be rejected, indicating that wearing a ring did not change the L. casei or Staphylococcus spp. load on hands (Table 2). Only the hand hygiene protocol, evaluated as an isolated factor, altered the bacterial load (p < 0.001). Handwashing with soap and water significantly reduced the load of L. casei (median: 3.60; 95% confidence interval of the mean: 2.74–3.55) in comparison with hand hygiene with 70% alcohol-based handrub (median: 4.72; 95% CI: 4.02–4.84) and no hand hygiene (median: 6.46; 95% CI: 5.91–6.42). However, no statistically significant differences (p = 0.094) were found among the groups when Staphylococcus spp. was analyzed (Table 2).

Table 3 shows that 80% (16), 10% (2), and 90% (18) of the polystyrene conical tubes with rings from Groups 1, 2, and 3, respectively, showed L. casei growth after 48 h of incubation. This finding indicates that the L. casei load on the rings decreased after handwashing with soap and water (p < 0.001). Regarding Staphylococcus spp., no statistically significant difference was found among the groups (p = 0.302).

Integration of results

The integration of results from the observational and in situ studies allowed us to assess the potential risk posed by nursing professionals wearing personal accessories during healthcare assistance. Nursing professionals do, in fact, wear jewelry in the work environment. Nonetheless, when the bacterial load of hands with and without rings was quantified, no difference was observed. The findings suggest that wearing one ring without decorative notches or gemstones is not associated with an elevated microbial load on hands. The results also underscore the importance of adhering to hand hygiene protocols to minimize the risk of microbial transmission in healthcare settings.

Personal accessories were prevalent among nursing workers, mainly wristwatches, badges hanging on strings, neckties, earrings, and rings. This was explored in 2011 by a study assessing the hygiene conditions of health workers’ nails and how frequently they wore jewelry: 49% of 344 health workers wore one or more pieces of jewelry, and more than half of the sample wore wedding rings and wristwatches20.

Removing jewelry at the workplace is an organizational policy change recommended decades ago in Brazil and supported by the Brazilian guideline (Regulatory Standard 32)14. Here, the evidence suggests that health workers do not fully adhere to this recommendation in their practice. This matter was recently discussed by Greenshield and collaborators in 2020, who highlighted that nurses should remove or move jewelry aside when performing antisepsis of the skin surfaces it covers13. Although the use of personal accessories has been discouraged by health regulations, it remains a cultural issue that demands attention.

The findings reveal that participants fail to remove personal accessories because they do not recognize the importance of doing so or do not consider the objects a source of transmission of microorganisms. Additionally, some participants reported a fear of removing jewelry in the workplace and losing it. A previous Brazilian study showed a relationship between the banning of personal accessories and professional self-concept: professional self-concept was influenced by the absence of jewelry, time since graduation, and position. The participants reported that they missed wearing jewelry at work, especially earrings, wristwatches, and wedding rings21. In this sense, institutions need to encourage, and require, workers to remove jewelry and personal accessories. Strong leadership is critical to success here: in one report, the number of nurses wearing jewelry decreased from 15% to 3%, mainly in the wards where a strategy was implemented by teams and leaders22.

Even though most nursing workers reported having received some training and being aware of regulatory guidelines, many reported partial or total ignorance of the standard recommending that workers not wear personal accessories in health facilities. These results corroborate a study conducted in 2012 in which nurses revealed unfamiliarity with regulatory guidelines: 80% of the professionals had never received any training addressing this topic, 67% did not know the purpose of the regulatory guideline, and 57% reported poor knowledge of the topic23.

The hands and rings of health workers are contaminated with various types of microorganisms, including fungi and bacteria such as Staphylococcus spp. Greenshield and collaborators in 2020 implemented a descriptive, comparative study assessing bacteria on the skin of intensive care unit nurses at baseline and 30 days after removing jewelry. The bacteria identified included Bacillus, Micrococcus, coagulase-negative Staphylococcus, diphtheroids, streptococci, and non-enteric gram-negative rods. The number of bacteria decreased significantly after the intervention, corroborating the need to remove or move jewelry when washing hands or applying hand sanitizer so as to clean the underlying skin13. Additionally, the perforation rate of gloves was demonstrated to be higher at the base of the ring finger when professionals wear wedding rings24.

This study, however, did not indicate a difference in the bacterial load of artificially contaminated hands with and without rings. Moreover, the presence of a ring did not influence hand hygiene by washing with soap and water or rubbing with 70% alcohol. In agreement with our findings, recent studies showed that the presence of a single plain ring was not associated with an increased hand bacterial load and did not impact the transmission of Staphylococcus aureus or glucose-nonfermenting gram-negative bacteria11,25. Nonetheless, healthcare workers should remove rings when wearing gloves, because glove leak rates were significantly higher when rings were worn26. Other studies also revealed no differences in the microbial load of hands with and without rings27,28. In contrast, some studies indicate a larger number of microorganisms on the hands of workers wearing rings than on those of workers who do not12,29,30. Trick et al., in a multivariate analysis of 564 hands, demonstrated that health workers wearing a ring were ten times more likely to harbor microorganisms (Staphylococcus aureus, gram-negative bacilli, or Candida spp.)10. Besides, Hoffman et al. evaluated 50 nurses in medical and surgical wards who wore rings permanently, showing that the microbial load was greater on the skin of the fingers under the rings than on fingers without rings31. Moreover, the pattern of isolation of gram-negative bacilli suggested that these bacteria were colonizers rather than transient contaminants, because the same strains were persistently isolated over several months.

It is widely accepted that hand hygiene is vital to preventing hospital-acquired infections, since it has been proven that hand hygiene considerably decreases the number of microorganisms on the skin surface32. Hautemeniere and collaborators indicated that wearing a ring was associated with decreased hand hygiene quality33. A literature review reports that wearing a ring while performing hand hygiene impairs the technique’s effectiveness, corroborating that wearing a ring when providing care to patients can considerably impact microbial contamination of hands33. Most studies performing microbiological analyses of hands and rings evaluated health workers such as physicians, nurses, and nursing technicians. Here, nursing professionals were included because these workers are directly and constantly in contact with patients during care delivery and, for this reason, are considered the primary agents of microbial transmission in health settings34.

A greater bacterial load was expected on the right hand (with the ring). In view of this inconsistency with preceding results, we suggest that a ring with a stone, notches, or grooves might have influenced the result differently from the plain ring chosen here. A previous study assessed the effect of wearing a ring, and of ring type, on the contamination of hands and on the effectiveness of alcohol-based sanitizer among nurses working in a pediatric intensive care unit. The results showed that the nurses randomly assigned to wear a plain ring or a ring with a stone presented higher amounts of gram-positive and gram-negative bacteria after rubbing with alcohol than those not wearing a ring; however, no differences were found between the plain ring and the ring with a stone29.

The findings of this study indicate that handwashing with soap and water considerably decreased contamination with L. casei. Breidablik et al. also suggested that handwashing with soap and water is effective in decreasing microbial load35. Overall, both handwashing with soap and water and rubbing with alcohol are considered efficient methods, even though both present limitations35. Different studies have pointed out that alcohol-based hand sanitizer can be used to perform hand hygiene before and after direct contact with patients receiving clinical care36,37,38. One deviation from this indication occurs when hands are visibly soiled or potentially contaminated with body fluids, in which case soap and water are recommended for hand hygiene, while acknowledging that in settings with inadequate infrastructure, hand hygiene with alcohol can be particularly beneficial36,37. According to Hillier, handwashing with soap and water is accompanied by a mechanical process that helps remove contaminants from the hands39. Our findings corroborate this statement, indicating that the mechanical action of handwashing may explain the greater reduction in microbial load in comparison with rubbing with alcohol.

Certainly, an advance on current knowledge was achieved; however, we are aware that our research has limitations. These include the small number of participants in the observational prevalence stage and the fact that the participants do not represent all health professions listed in the Brazilian Code of Occupations. Additionally, the use of a non-pathogenic food-borne bacterium (L. casei) to artificially contaminate the participants’ hands in the microbiological stage constitutes a limitation, because it does not represent the microbiota prevalent in HAIs. This species was nonetheless selected to avoid exposing volunteers to Serratia marcescens, a pathogenic bacterium recommended by ASTM as an alternative to E. coli17. It is also important to highlight that the scientific literature suggests that alcohol works by denaturing proteins and rendering cell membranes permeable, offering similar effects against gram-positive and gram-negative bacteria40.

Despite such limitations, our study provides new insight into the jewelry-wearing context among health workers in hospital settings and the reasons for low adherence to regulatory guidelines that discourage the use of jewelry. It also revealed the need to provide workers with opportunities to obtain information regarding regulatory guidelines, given their importance for occupational safety. In the microbiological experiment stage, two hand hygiene techniques were assessed against two different bacterial species, one artificially contaminating the participants’ hands and rings and one belonging to the volunteers’ own microbiota. Overall, both gram-positive bacteria were highly resistant to physical and chemical agents, enabling the inference that rings are a potential source of bacterial contamination even when hand hygiene decreases the hands’ bacterial load.

In conclusion, contrary to the recommendation of the Brazilian guideline (Regulatory Standard 32), many nursing workers wear jewelry in the workplace. Although hands with and without a ring exhibited similar microbial loads, rings appeared to be a potential source of bacterial contamination, reinforcing the need to remove jewelry during working hours. Hand hygiene using alcohol, or soap and water, significantly decreased the bacterial load on the participants’ hands, with handwashing proving the most efficient method for removing intentional contamination.

Data availability

The data that support the findings of this study are available on request from the corresponding author (E.W.).

References

1. World Health Organization (WHO). WHO Guidelines on Hand Hygiene in Health Care: First Global Patient Safety Challenge Clean Care is Safer Care (World Health Organization, Geneva, 2009).
2. Treakle, A. M. et al. Bacterial contamination of health care workers' white coats. Am. J. Infect. Control 37, 101–105 (2009).
3. Tusabe, F. et al. Bacterial contamination of healthcare worker's mobile phones: A case study at two referral hospitals in Uganda. Global Secur. Health Sci. Policy 7, 1–6 (2022).
4. Kanayama, K. A. et al. Cross-contamination of bacteria-colonized pierced earring holes and fingers in nurses is a potential source of health care-associated infections. Am. J. Infect. Control 47, 78–81 (2019).
5. Khodavaisy, S., Nabili, M., Davari, B. & Vahedi, M. Evaluation of bacterial and fungal contamination in the health care workers' hands and rings in the intensive care unit. J. Prev. Med. Hyg. 52, 215–218 (2011).
6. De la Rosa-Zamboni, D. et al. Everybody hands-on to avoid ESKAPE: Effect of sustained hand hygiene compliance on healthcare-associated infections and multidrug resistance in a paediatric hospital. J. Med. Microbiol. 67, 1761–1771 (2018).
7. Fagernes, M. & Fagermoen, M. S. Self-reported behavior and attitudes toward ring wearing during clinical care: A survey among Norwegian care personnel. Nurs. Sci. Res. Nord. Ctries. 30, 26–31 (2010).
8. Saxena, S., Singh, T., Agarwal, H., Mehta, G. & Dutta, R. Bacterial colonization of rings and cell phones carried by health-care providers: Are these mobile bacterial zoos in the hospital? Trop. Doctor 41, 116–118 (2011).
9. Surase, P., Nataraj, G., Kuyare, S. & Mehta, P. The ever increasing reservoirs of infection in the health care environment: time for a sixth moment of hygiene. J. Assoc. Physicians India 64, 31–36 (2016).
10. Trick, W. E. et al. Impact of ring wearing on hand contamination and comparison of hand hygiene agents in a hospital. Clin. Infect. Dis. 36, 1383–1390 (2003).
11. Fagernes, M., Lingaas, E. & Bjark, P. Impact of a single plain finger ring on the bacterial load on the hands of healthcare workers. Infect. Control Hosp. Epidemiol. 28, 1191–1195 (2007).
12. Fagernes, M. & Lingaas, E. Impact of finger rings on transmission of bacteria during hand contact. Infect. Control Hosp. Epidemiol. 30, 427–432 (2009).
13. Greenshield, K., Chavez, J., Nial, K. J. & Baldwin, K. Examining bacteria on skin and jewelry since the implementation of hand sanitizer in hospitals. Am. J. Infect. Control 48, 1402–1403 (2020).
14. Brazil. Regulatory Standard No. 32 - Safety and health at work in health establishments. https://www.gov.br/trabalho-e-emprego/pt-br/acesso-a-informacao/participacao-social/conselhos-e-orgaos-colegiados/comissao-tripartite-paritaria-permanente/arquivos/normas-regulamentadoras/nr-32-atualizada-2022-2.pdf (2005).
15. CDC. Guideline for hand hygiene in health-care settings: Recommendations of the Healthcare Infection Control Practices Advisory Committee and the HICPAC/SHEA/APIC/IDSA hand hygiene task force. MMWR 51, 1–44 (2002).
16. Cimon, K. & Featherstone, R. Jewellery and Nail Polish Worn by Health Care Workers and the Risk of Infection Transmission: A Review of Clinical Evidence and Guidelines (Canadian Agency for Drugs and Technologies in Health, 2017).
17. ASTM International. ASTM E1174-13 Standard test method for evaluation of the effectiveness of healthcare personnel handwash formulations. https://www.astm.org/Standards/E1174.htm (2021).
18. Sullivan, K. M., Dean, A. & Soe, M. M. OpenEpi: A web-based epidemiologic and statistical calculator for public health. Public Health Rep. 124, 471–474 (2009).
19. Noguchi, K., Gel, Y. R., Brunner, E. & Konietschke, F. nparLD: An R software package for the nonparametric analysis of longitudinal data in factorial experiments. J. Stat. Softw. 50, 1–23 (2012).
20. Vandenbos, F. et al. Assessing the wearing of jewellery by French healthcare professionals. Med. Mal. Infect. 41, 192–196 (2011).
21. Cavalheiro, A. C., Trentino, J. P., Alves, F. C. & Puggina, A. C. Regulatory Standard 32 ban on adornments and professional self-concept of nursing professionals. Rev. Bras. Med. Trab. 17, 219–227 (2019).
22. Huis, A. et al. Impact of a team and leaders-directed strategy to improve nurses' adherence to hand hygiene guidelines: A cluster randomised trial. Int. J. Nurs. Stud. 50, 467–474 (2013).
23. Marziale, M. H. P., Galon, T., Cassioloto, F. L. & Girão, F. B. Implementation of Regulatory Standard 32 and the control of occupational accidents. Acta Paul. Enferm. 25, 859–866 (2012).
24. Nicolai, P., Aldam, C. H. & Allen, P. W. Increased awareness of glove perforation in major joint replacement: A prospective, randomised study of Regent Biogel Reveal gloves. J. Bone Joint Surg. Br. 79, 371–373 (1997).
25. Cabrera, E. M. A., Rosa, S. R., Vargas, M. D. M. O., Flores, C. N. H. & Costa, H. E. M. A single plain ring is not associated with increased bacterial load on hands: An experimental study among healthcare worker students undertaking mock surgery. Infect. Dis. Health 29, 51–60 (2024).
26. Hansen, K. N., Korniewicz, D. M., Hexter, D. A., Kornilow, J. R. & Kelen, G. D. Loss of glove integrity during emergency department procedures. Ann. Emerg. Med. 31, 65–72 (1998).
27. Wongworawat, M. D. & Jones, S. G. Influence of rings on the efficacy of hand sanitization and residual bacterial contamination. Infect. Control Hosp. Epidemiol. 28, 351–353 (2007).
28. Ramón-Cantón, C., Boada-Sanmartín, N. & Pagespetit-Casas, L. Evaluation of a hand hygiene technique in healthcare workers. Rev. Calid. Asist. 26, 376–379 (2011).
29. Yildirim, I. et al. A prospective comparative study of the relationship between different types of ring and microbial hand colonization among pediatric intensive care unit nurses. Int. J. Nurs. Stud. 45, 1572–1576 (2008).
30. Fagernes, M. & Lingaas, E. Factors interfering with the microflora on hands: A regression analysis of samples from 465 healthcare workers. J. Adv. Nurs. 67, 297–307 (2011).
31. Hoffman, P. N., Cooke, E. M., McCarville, M. R. & Emmerson, A. M. Micro-organisms isolated from skin under wedding rings worn by hospital staff. Br. Med. J. (Clin. Res. Ed.) 290, 206–207 (1985).
32. Belela-Anacleto, A. S. C., Peterlini, M. A. S. & Pedreira, M. L. G. Hand hygiene as a caring practice: A reflection on professional responsibility. Rev. Bras. Enferm. 70, 442–445 (2017).
33. Hautemaniere, A. et al. Factors determining poor practice in alcoholic gel hand rub technique in hospital workers. J. Infect. Public Health 3, 25–34 (2010).
34. Fracarolli, I. F. L. & Marziale, M. H. P. Microbiological characteristics of the hands and rings of health workers: integrative review. Ciencia Enferm. 25, 1–10 (2019).
35. Breidablik, H. J. et al. Effects of hand disinfection with alcohol hand rub, ozonized water, or soap and water: Time for reconsideration? J. Hosp. Infect. 105, 213–215 (2020).
36. Loveday, H. P. et al. Epic-3: National evidence-based guidelines for preventing healthcare-associated infections in NHS hospitals in England. J. Hosp. Infect. 86, 1–70 (2014).
37. Ndegwa, L. et al. Evaluation of a program to improve hand hygiene in Kenyan hospitals through production and promotion of alcohol-based handrub, 2012–2014. Antimicrob. Resist. Infect. Control 8, 2 (2019).
38. Siddharta, A. et al. Virucidal activity of World Health Organization-recommended formulations against enveloped viruses, including Zika, Ebola, and emerging coronaviruses. J. Infect. Dis. 215, 902–906 (2017).
39. Hillier, M. D. Using effective hand hygiene practice to prevent and control infection. Nurs. Stand. 35, 45–50 (2020).
40. Kampf, G. & Hollingsworth, A. Comprehensive bactericidal activity of an ethanol-based hand gel in 15 seconds. Ann. Clin. Microbiol. Antimicrob. 7, 2 (2008).


Acknowledgements

The present work was carried out with the support of the Coordination for the Improvement of Higher Education Personnel—Brazil (CAPES)—Financing Code 001.


Author information

Authors and Affiliations

Nursing Department, State University of Londrina, Londrina, PR, Brazil

Isabela Fernanda Larios Fracarolli

Human Exposome and Infectious Diseases Network (HEID), Ribeirão Preto College of Nursing, University of São Paulo, Ribeirão Preto, SP, Brazil

Evandro Watanabe, Viviane de Cássia Oliveira, Marinila Buzanelo Machado, Felipe Lazarini Bim, Lucas Lazarini Bim & Denise de Andrade

Department of Restorative Dentistry, Ribeirão Preto School of Dentistry, University of São Paulo, Avenida do Café, s/n. Campus Universitário, Monte Alegre, Ribeirão Preto, SP, 14040-904, Brazil

Evandro Watanabe

Department of Dental Materials and Prosthodontics, Ribeirão Preto School of Dentistry, University of São Paulo, Ribeirão Preto, SP, Brazil

Viviane de Cássia Oliveira

Department of General and Specialized Nursing, Ribeirão Preto College of Nursing, PAHO/WHO Collaborating Centre for Nursing Research Development, University of São Paulo, Ribeirão Preto, SP, Brazil

Denise de Andrade & Maria Helena Palucci Marziale


Contributions

I.F.L.F. and M.H.P.M. conceived the original idea. I.F.L.F. performed the experiments and managed and reported the data under the supervision of E.W. D.A. was involved in developing ideas and hypotheses for the research and the manuscript. V.C.O. provided critical feedback and helped shape the research, data analysis and manuscript writing. M.B.M., F.L.B. and L.L.B. contributed to performing the experiments and interpreting the results. All authors thoroughly reviewed and approved the final manuscript.

Corresponding author

Correspondence to Evandro Watanabe .

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .


About this article

Cite this article

Fracarolli, I.F.L., Watanabe, E., Oliveira, V.d. et al. The implications of healthcare professionals wearing jewelry on patient care biosafety: observational insights and experimental approaches. Sci Rep 14 , 18601 (2024). https://doi.org/10.1038/s41598-024-69711-x

Download citation

Received : 07 February 2024

Accepted : 07 August 2024

Published : 10 August 2024

DOI : https://doi.org/10.1038/s41598-024-69711-x


Keywords

  • Hand hygiene
  • Health workers
  • Hospital infection


bioRxiv

Results of the Protein Engineering Tournament: An Open Science Benchmark for Protein Modeling and Design


The grand challenge of protein engineering is the development of computational models to characterize and generate protein sequences for arbitrary functions. Progress is limited by lack of 1) benchmarking opportunities, 2) large protein function datasets, and 3) access to experimental protein characterization. We introduce the Protein Engineering Tournament—a fully-remote competition designed to foster the development and evaluation of computational approaches in protein engineering. The tournament consists of an in silico round, predicting biophysical properties from protein sequences, followed by an in vitro round where novel protein sequences are designed, expressed and characterized using automated methods. Upon completion, all datasets, experimental protocols, and methods are made publicly available. We detail the structure and outcomes of a pilot Tournament involving seven protein design teams, powered by six multi-objective datasets, with experimental characterization by our partner, International Flavors and Fragrances. Forthcoming Protein Engineering Tournaments aim to mobilize the scientific community towards transparent evaluation of progress in the field.
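
As a rough sketch of how an in silico round of this kind can be scored, the snippet below ranks teams by the Spearman rank correlation between predicted and experimentally measured property values on a held-out set. The tournament's actual metrics, data formats, and team identities are documented in the repository linked below; everything in this example (team names, measured values, predictions) is invented for illustration.

```python
# Illustrative leaderboard for an in silico prediction round.
# All team names and values are hypothetical, not tournament data.
from scipy.stats import spearmanr

# Hypothetical held-out measurements (e.g., melting temperatures, in C)
# and two teams' predictions for the same sequences.
measured = [62.1, 55.3, 71.8, 48.9, 66.4, 59.0]
predictions = {
    "team_a": [60.0, 57.2, 70.1, 50.3, 64.8, 58.1],
    "team_b": [65.5, 52.0, 69.9, 55.4, 61.2, 63.3],
}

# Rank correlation rewards getting the ordering right and is robust
# to monotone miscalibration of a model's absolute predictions.
scores = {}
for team, preds in predictions.items():
    rho, _pvalue = spearmanr(measured, preds)
    scores[team] = rho

# Print the leaderboard, best first.
for team, rho in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{team}: Spearman rho = {rho:.3f}")
```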


Competing Interest Statement

The authors have declared no competing interest.

https://github.com/the-protein-engineering-tournament/pet-pilot-2023




Subject Area

  • Bioengineering

IMAGES

  1. The scientific method is a process for experimentation
  2. Experimental Designs (Part-1): Principle and Concept : Plantlet
  3. The Scientific Method For Experimental Design
  4. Experimental Design Steps
  5. PPT
  6. PPT

COMMENTS

  1. Guide to Experimental Design

    Experimental design is the process of planning an experiment to test a hypothesis. The choices you make affect the validity of your results.

  2. The scientific method and experimental design

A hypothesis is a possible, testable explanation for a scientific question.

  3. Experimental Design

    Experimental design is a process of planning and conducting scientific experiments to investigate a hypothesis or research question. It involves carefully designing an experiment that can test the hypothesis, and controlling for other variables that may influence the results. Experimental design typically includes identifying the variables that ...

  4. Introduction to experimental design (video)

    Introduction to experimental design. Scientific progress hinges on well-designed experiments. Most experiments start with a testable hypothesis. To avoid errors, researchers may randomly divide subjects into control and experimental groups. Both groups should receive a treatment, like a pill (real or placebo), to counteract the placebo effect.

  5. The scientific method (article)

    The scientific method. At the core of biology and other sciences lies a problem-solving approach called the scientific method. The scientific method has five basic steps, plus one feedback step: Make an observation. Ask a question. Form a hypothesis, or testable explanation. Make a prediction based on the hypothesis.

  6. Scientific method

    Scientific method, mathematical and experimental technique employed in the sciences. More specifically, it is the technique used in the construction and testing of a scientific hypothesis. The scientific method is applied broadly across the sciences.

  7. What Are The Steps Of The Scientific Method?

    The scientific method is a step-by-step process used by researchers and scientists to determine if there is a relationship between two or more variables. Psychologists use this method to conduct psychological research, gather data, process information, and describe behaviors.

  8. Scientific method

    Learn about the scientific method, the process of empirical research and experimentation that guides scientific inquiry and discovery.

  9. Scientific Method

    The study of scientific method is the attempt to discern the activities by which that success is achieved. Among the activities often identified as characteristic of science are systematic observation and experimentation, inductive and deductive reasoning, and the formation and testing of hypotheses and theories.

  10. Steps of the Scientific Method

    The six steps of the scientific method include: 1) asking a question about something you observe, 2) doing background research to learn what is already known about the topic, 3) constructing a hypothesis, 4) experimenting to test the hypothesis, 5) analyzing the data from the experiment and drawing conclusions, and 6) communicating the results ...

  11. Experimental Design

    Experimental Design A well-designed experiment should provide robust and unbiased results. Scientific rigor, as defined by the NIH, is the strict application of the scientific method to ensure robust and unbiased experimental design, methodology, analysis, interpretation and reporting of results.

  12. Experimental Design: Definition and Types

    What is Experimental Design? An experimental design is a detailed plan for collecting and using data to identify causal relationships. Through careful planning, the design of experiments allows your data collection efforts to have a reasonable chance of detecting effects and testing hypotheses that answer your research questions.

  13. The Scientific Method

    The Scientific Method is a process used to design and perform experiments. It's important to minimize experimental errors and bias, and increase confidence in the accuracy of your results. In the previous sections, we talked about how to pick a good topic and specific question to investigate.

  14. Experimental Design

    Experimental design is one aspect of a scientific method. A well-designed, properly conducted experiment aims to control variables in order to isolate and manipulate causal effects and thereby maximize internal validity, support causal inferences, and guarantee reliable results. Traditionally employed in the natural sciences, experimental ...

  15. Experimental Design in Science

    The experimental design is a set of procedures that are designed to test a hypothesis. The process has five steps: define variables, formulate a hypothesis, design an experiment, assign subjects ...

  16. The scientific method (video)

    The scientific method is a logical approach to understanding the world. It starts with an observation, followed by a question. A testable explanation or hypothesis is then created. An experiment is designed to test the hypothesis, and based on the results, the hypothesis is refined.

  17. Experimental Design: Types, Examples & Methods

    Experimental design refers to how participants are allocated to different groups in an experiment. Types of design include repeated measures, independent groups, and matched pairs designs.

  18. PDF Scientific Method

    The Scientific Method Exploring Experimental Design Unit Overview OBJECTIVE Students will identify and apply the steps of the scientific method.

  19. PDF The Scientific Method and Experimental Design

The Scientific Method and Experimental Design. Advanced course (Vertiefungskurs) 2020 "Molecular Evolutionary Biology", Julián Torres-Dowdall. Covers a short history of (western) philosophy of science, why we need a philosophy of science, and issues in philosophy of biology.

  20. PDF Microsoft Word

    A poorly used design may not generate any useful information for meaningful analysis. A wisely designed experiment can provide factual evidence which can easily be analyzed and understood by the researcher. Obviously methods of experimental design are at least as important as methods of data analysis in a research program.

  21. Experimental Research

    Experimental research is a systematic and scientific approach to the scientific method where the scientist manipulates variables.

  22. Experimental Design in Science: Definition and Method

    Visit Study.com for thousands more videos like this one. You'll get full access to our interactive quizzes and transcripts and can find out how to use our vi...

  23. 13. Experimental design

    Key Takeaways. Experimental designs are useful for establishing causality, but some types of experimental design do this better than others. Experiments help researchers isolate the effect of the independent variable on the dependent variable by controlling for the effect of extraneous variables.; Experiments use a control/comparison group and an experimental group to test the effects of ...

  24. Frequently asked questions about how science works

    Although the Scientific Method captures the core logic of science (testing ideas with evidence), it misrepresents many other aspects of the true process of science — the dynamic, nonlinear, and creative ways in which science is actually done. In fact, ... To learn more about experimental design, ...

  25. Controlled experiments (article)

    Introduction Biologists and other scientists use the scientific method to ask questions about the natural world. The scientific method begins with an observation, which leads the scientist to ask a question. They then come up with a hypothesis, a testable explanation that addresses the question.

  26. Deciphering RNA splicing logic with interpretable machine learning

    Machine learning methods, particularly neural networks trained on large datasets, are transforming how scientists approach scientific discovery and experimental design.

  27. The implications of healthcare professionals wearing jewelry ...

    The use of jewelry among healthcare professionals poses a risk of cross contamination due to potential bacterial accumulation and spread. Through a mixed-method design, this study first analyzed ...

  28. Results of the Protein Engineering Tournament: An Open Science

    Upon completion, all datasets, experimental protocols, and methods are made publicly available. We detail the structure and outcomes of a pilot Tournament involving seven protein design teams, powered by six multi-objective datasets, with experimental characterization by our partner, International Flavors and Fragrances.