The simplest way to understand a variable is as any characteristic or attribute that can experience change or vary over time or context – hence the name “variable”. For example, the dosage of a particular medicine could be classified as a variable, as the amount can vary (i.e., a higher dose or a lower dose). Similarly, gender, age or ethnicity could be considered demographic variables, because each person varies in these respects.
Within research, especially scientific research, variables form the foundation of studies, as researchers are often interested in how one variable impacts another, and the relationships between different variables.
As you can see, variables are often used to explain relationships between different elements and phenomena. In scientific studies, especially experimental studies, the objective is often to understand the causal relationships between variables. In other words, the role of cause and effect between variables. This is achieved by manipulating certain variables while controlling others – and then observing the outcome. But, we’ll get into that a little later…
Variables can be a little intimidating for new researchers because there are a wide variety of variables, and oftentimes, there are multiple labels for the same thing. To lay a firm foundation, we’ll first look at the three main types of variables, namely independent variables, dependent variables and control variables.
Simply put, the independent variable is the “cause” in the relationship between two (or more) variables. In other words, when the independent variable changes, it has an impact on another variable.
It’s useful to know that independent variables can go by a few different names, including explanatory variables (because they explain an event or outcome) and predictor variables (because they predict the value of another variable). Terminology aside though, the most important takeaway is that independent variables are assumed to be the “cause” in any cause-effect relationship. As you can imagine, these types of variables are of major interest to researchers, as many studies seek to understand the causal factors behind a phenomenon.
While the independent variable is the “cause”, the dependent variable is the “effect” – or rather, the affected variable. In other words, the dependent variable is the variable that is assumed to change as a result of a change in the independent variable.
In scientific studies, researchers will typically pay very close attention to the dependent variable (or variables), carefully measuring any changes in response to hypothesised independent variables. This can be tricky in practice, as it’s not always easy to reliably measure specific phenomena or outcomes – or to be certain that the actual cause of the change is in fact the independent variable.
As the adage goes, correlation is not causation . In other words, just because two variables have a relationship doesn’t mean that it’s a causal relationship – they may just happen to vary together. For example, you could find a correlation between the number of people who own a certain brand of car and the number of people who have a certain type of job. Just because the number of people who own that brand of car and the number of people who have that type of job is correlated, it doesn’t mean that owning that brand of car causes someone to have that type of job or vice versa. The correlation could, for example, be caused by another factor such as income level or age group, which would affect both car ownership and job type.
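To make this concrete, here’s a short simulation (with made-up numbers) in which a lurking factor such as income drives both car ownership and job type, producing a strong correlation between two variables that have no causal link to each other:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# Lurking variable (hidden from the analyst): e.g. income level.
income = rng.normal(50, 10, n)

# Two outcomes that both depend on income, but not on each other.
car_ownership = income + rng.normal(0, 5, n)
job_score = income + rng.normal(0, 5, n)

# The two outcomes correlate strongly...
r_spurious = np.corrcoef(car_ownership, job_score)[0, 1]

# ...but once income is held constant (partial correlation via residuals),
# the association largely disappears.
def residuals(y, x):
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

r_partial = np.corrcoef(residuals(car_ownership, income),
                        residuals(job_score, income))[0, 1]

print(round(r_spurious, 2))  # strong positive correlation
print(round(r_partial, 2))   # near zero once income is "controlled for"
```

The raw correlation is entirely an artifact of the shared cause, which is exactly why correlation alone can’t establish causation.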
To confidently establish a causal relationship between an independent variable and a dependent variable (i.e., X causes Y), you’ll typically need an experimental design, where you have complete control over the environment and the variables of interest. But even so, this doesn’t always translate into the “real world”. Simply put, what happens in the lab sometimes stays in the lab!
As an alternative to pure experimental research, correlational or “quasi-experimental” research (where the researcher cannot manipulate or change variables) can be done on a much larger scale more easily, allowing one to understand specific relationships in the real world. These types of studies also assume some causality between independent and dependent variables, but it’s not always clear. So, if you go this route, you need to be cautious in terms of how you describe the impact and causality between variables and be sure to acknowledge any limitations in your own research.
In an experimental design, a control variable (or controlled variable) is a variable that is intentionally held constant to ensure it doesn’t have an influence on any other variables. As a result, this variable remains unchanged throughout the course of the study. In other words, it’s a variable that’s not allowed to vary – tough life 🙂
As we mentioned earlier, one of the major challenges in identifying and measuring causal relationships is that it’s difficult to isolate the impact of variables other than the independent variable. Simply put, there’s always a risk that there are factors beyond the ones you’re specifically looking at that might be impacting the results of your study. So, to minimise the risk of this, researchers will attempt (as best possible) to hold other variables constant . These factors are then considered control variables.
Which specific variables need to be controlled for will vary tremendously depending on the research project at hand, so there’s no generic list of control variables to consult. As a researcher, you’ll need to think carefully about all the factors that could vary within your research context and then consider how you’ll go about controlling them. A good starting point is to look at previous studies similar to yours and pay close attention to which variables they controlled for.
Of course, you won’t always be able to control every possible variable, and so, in many cases, you’ll just have to acknowledge their potential impact and account for them in the conclusions you draw. Every study has its limitations , so don’t get fixated or discouraged by troublesome variables. Nevertheless, always think carefully about the factors beyond what you’re focusing on – don’t make assumptions!
As we mentioned, independent, dependent and control variables are the most common variables you’ll come across in your research, but they’re certainly not the only ones you need to be aware of. Next, we’ll look at a few “secondary” variables that you need to keep in mind as you design your research.
Let’s jump into it…
A moderating variable is a variable that influences the strength or direction of the relationship between an independent variable and a dependent variable. In other words, moderating variables affect how much (or how little) the IV affects the DV, or whether the IV has a positive or negative relationship with the DV (i.e., moves in the same or opposite direction).
For example, in a study about the effects of sleep deprivation on academic performance, gender could be used as a moderating variable to see if there are any differences in how men and women respond to a lack of sleep. In such a case, one may find that gender has an influence on how much students’ scores suffer when they’re deprived of sleep.
It’s important to note that while moderators can have an influence on outcomes , they don’t necessarily cause them ; rather they modify or “moderate” existing relationships between other variables. This means that it’s possible for two different groups with similar characteristics, but different levels of moderation, to experience very different results from the same experiment or study design.
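As a sketch of how moderation is typically modelled, the example below generates hypothetical sleep-deprivation data where the slope differs between two groups, and recovers the moderation effect as an interaction term in a regression (all coefficients are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2_000

sleep_loss = rng.uniform(0, 8, n)   # hours of sleep deprivation (IV)
group = rng.integers(0, 2, n)       # hypothetical moderator, coded 0/1

# Assumed true model: the slope of sleep_loss differs by group (-2 vs -5).
score = 80 - 2 * sleep_loss - 3 * group * sleep_loss + rng.normal(0, 2, n)

# A design matrix with an interaction column captures the moderation.
X = np.column_stack([np.ones(n), sleep_loss, group, group * sleep_loss])
beta, *_ = np.linalg.lstsq(X, score, rcond=None)

intercept, slope, group_shift, interaction = beta
print(round(slope, 1))        # slope of sleep loss for group 0
print(round(interaction, 1))  # extra slope for group 1: the moderation effect
```

The interaction coefficient is the moderation: it tells you how much the effect of the IV on the DV changes from one level of the moderator to the other.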
Mediating variables are often used to explain the relationship between the independent and dependent variable(s). For example, if you were researching the effects of age on job satisfaction, then education level could be considered a mediating variable, as it may explain why older people have higher job satisfaction than younger people – they may have more experience or better qualifications, which lead to greater job satisfaction.
Mediating variables also help researchers understand how different factors interact with each other to influence outcomes. For instance, if you wanted to study the effect of stress on academic performance, then coping strategies might act as a mediating factor by influencing both stress levels and academic performance simultaneously. For example, students who use effective coping strategies might be less stressed but also perform better academically due to their improved mental state.
In addition, mediating variables can provide insight into causal relationships between two variables by helping researchers determine whether changes in one factor directly cause changes in another – or whether there is an indirect relationship between them mediated by some third factor(s). For instance, if you wanted to investigate the impact of parental involvement on student achievement, you would need to consider family dynamics as a potential mediator, since it could influence both parental involvement and student achievement simultaneously.
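One common way to quantify mediation is the product-of-coefficients approach: estimate the path from IV to mediator (a), the path from mediator to DV while controlling for the IV (b), and take a times b as the indirect effect. Here’s a minimal sketch with synthetic data (the path coefficients are assumptions, not results from any real study):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Hypothetical causal chain: x -> m -> y, plus a direct x -> y path.
x = rng.normal(0, 1, n)
m = 0.6 * x + rng.normal(0, 1, n)              # mediator
y = 0.3 * x + 0.5 * m + rng.normal(0, 1, n)    # outcome

def ols(y, *regressors):
    """OLS slopes of y on the given regressors (intercept fitted, not returned)."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    return np.linalg.lstsq(X, y, rcond=None)[0][1:]

a = ols(m, x)[0]            # path from IV to mediator
b, direct = ols(y, m, x)    # mediator -> DV (controlling IV), and the direct path
total = ols(y, x)[0]        # IV -> DV ignoring the mediator
indirect = a * b            # the part of the effect carried by the mediator

print(round(total, 3), round(direct + indirect, 3))  # total = direct + indirect
```

For linear models fitted by least squares, the total effect decomposes exactly into the direct effect plus the indirect (mediated) effect.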
A confounding variable (also known as a third variable or lurking variable) is an extraneous factor that can influence the relationship between two variables being studied. Specifically, for a variable to be considered a confounding variable, it needs to meet two criteria: it must be related to the independent variable, and it must influence the dependent variable.
Some common examples of confounding variables include demographic factors such as gender, ethnicity, socioeconomic status, age, education level, and health status. In addition to these, there are also environmental factors to consider. For example, air pollution could confound the impact of the variables of interest in a study investigating health outcomes.
Naturally, it’s important to identify as many confounding variables as possible when conducting your research, as they can heavily distort the results and lead you to draw incorrect conclusions . So, always think carefully about what factors may have a confounding effect on your variables of interest and try to manage these as best you can.
Latent variables are unobservable factors that can influence the behaviour of individuals and explain certain outcomes within a study. They’re also known as hidden or underlying variables, and what makes them rather tricky is that they can’t be directly observed or measured. Instead, latent variables must be inferred from other observable data points such as responses to surveys or experiments.
For example, in a study of mental health, the variable “resilience” could be considered a latent variable. It can’t be directly measured, but it can be inferred from measures of mental health symptoms, stress, and coping mechanisms. The same applies to a lot of concepts we encounter every day.
One way in which we overcome the challenge of measuring the immeasurable is latent variable models (LVMs). An LVM is a type of statistical model that describes the relationship between observed variables and one or more unobserved (latent) variables. These models allow researchers to uncover patterns in their data that may not have been visible before, and those patterns can then inform new hypotheses about cause-and-effect relationships among the variables. Powerful stuff, we say!
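As an illustration of the idea (not of any particular LVM implementation), the sketch below generates a hidden “resilience” factor, observes only three noisy survey indicators of it, and then infers factor scores from the first principal component of those indicators:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1_000

# Unobservable latent factor -- the analyst never sees this directly.
resilience = rng.normal(0, 1, n)

# Three observable survey measures, each loading on the latent factor plus noise.
# The loadings (0.8, 0.7, 0.9) are invented for illustration.
indicators = np.column_stack([
    0.8 * resilience + rng.normal(0, 0.6, n),
    0.7 * resilience + rng.normal(0, 0.6, n),
    0.9 * resilience + rng.normal(0, 0.6, n),
])

# A simple one-factor recovery via the first principal component.
centered = indicators - indicators.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
factor_scores = centered @ vt[0]

# The inferred scores track the unobserved latent variable closely.
# abs() handles the arbitrary sign of the principal component.
r = abs(np.corrcoef(factor_scores, resilience)[0, 1])
print(round(r, 2))
```

Even though resilience itself was never measured, the scores inferred from its observable indicators correlate strongly with it, which is exactly the trick LVMs exploit.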
In the world of scientific research, there’s no shortage of variable types, some of which have multiple names and some of which overlap with each other. In this post, we’ve covered some of the popular ones, but remember that this is not an exhaustive list.
To recap, we’ve explored independent, dependent and control variables, as well as moderating, mediating, confounding and latent variables.
If you’re still feeling a bit lost and need a helping hand with your research project, check out our 1-on-1 coaching service, where we guide you through each step of the research journey. Also, be sure to check out our free dissertation writing course and our collection of free, fully-editable chapter templates.
This post was based on one of our popular Research Bootcamps. If you're working on a research project, you'll definitely want to check this out...
Statistics By Jim
Making statistics intuitive
By Jim Frost
In this post, learn the definitions of independent and dependent variables, how to identify each type, how they differ between different types of studies, and see examples of them in use.
Independent variables (IVs) are the ones that you include in the model to explain or predict changes in the dependent variable. The name helps you understand their role in statistical analysis. These variables are independent. In this context, independent indicates that they stand alone and other variables in the model do not influence them. The researchers are not seeking to understand what causes the independent variables to change.
Independent variables are also known as predictors, factors, treatment variables, explanatory variables, input variables, x-variables, and right-hand variables—because they appear on the right side of the equals sign in a regression equation. In notation, statisticians commonly denote them using Xs. On graphs, analysts place independent variables on the horizontal, or X, axis.
In machine learning, independent variables are known as features.
For example, in a plant growth study, the independent variables might be soil moisture (continuous) and type of fertilizer (categorical).
Statistical models will estimate effect sizes for the independent variables.
Related post: Effect Sizes in Statistics
The nature of independent variables changes based on the type of experiment or study:
Controlled experiments: Researchers systematically control and set the values of the independent variables. In randomized experiments, relationships between independent and dependent variables tend to be causal. The independent variables cause changes in the dependent variable.
Observational studies: Researchers do not set the values of the explanatory variables but instead observe them in their natural environment. When the independent and dependent variables are correlated, those relationships might not be causal.
When you include one independent variable in a regression model, you are performing simple regression. For more than one independent variable, it is multiple regression. Despite the different names, it’s really the same analysis with the same interpretations and assumptions.
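To see that it really is the same machinery, the sketch below fits a simple regression and a multiple regression with one shared least-squares routine, using invented plant-growth data:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500

# Invented plant-growth data with two candidate predictors.
moisture = rng.uniform(0, 10, n)
sunlight = rng.uniform(0, 12, n)
growth = 2.0 * moisture + 1.5 * sunlight + rng.normal(0, 1, n)

def fit(y, *ivs):
    """Least squares with an intercept: works for one IV or many."""
    X = np.column_stack([np.ones(len(y)), *ivs])
    return np.linalg.lstsq(X, y, rcond=None)[0]

simple = fit(growth, moisture)              # simple regression: one IV
multiple = fit(growth, moisture, sunlight)  # multiple regression: two IVs

print(np.round(multiple[1:], 2))  # estimated effects of moisture and sunlight
```

Whether you pass one predictor or several, the estimation, interpretation, and assumptions are the same; only the width of the design matrix changes.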
Determining which IVs to include in a statistical model is known as model specification. That process involves in-depth research and many subject-area, theoretical, and statistical considerations. At its most basic level, you’ll want to include the predictors you are specifically assessing in your study and confounding variables that will bias your results if you don’t add them—particularly for observational studies.
For more information about choosing independent variables, read my post about Specifying the Correct Regression Model .
Related posts: Randomized Experiments, Observational Studies, Covariates, and Confounding Variables
The dependent variable (DV) is what you want to use the model to explain or predict. The values of this variable depend on other variables. It is the outcome that you’re studying. It’s also known as the response variable, outcome variable, and left-hand variable. Statisticians commonly denote them using a Y. Traditionally, graphs place dependent variables on the vertical, or Y, axis.
For example, in the plant growth study example, a measure of plant growth is the dependent variable. That is the outcome of the experiment, and we want to determine what affects it.
If you’re reading a study’s write-up, how do you distinguish independent variables from dependent variables? Here are some tips!
How statisticians discuss independent variables changes depending on the field of study and type of experiment.
In randomized experiments, look for the following descriptions to identify the independent variables:
In observational studies, independent variables are a bit different. While the researchers likely want to establish causation, that’s harder to do with this type of study, so they often won’t use the word “cause.” They also don’t set the values of the predictors. Some independent variables are the experiment’s focus, while others help keep the experimental results valid.
Here’s how to recognize independent variables in observational studies:
Regardless of the study type, if you see an estimated effect size, it is an independent variable.
Dependent variables are the outcome. The IVs explain the variability in, or cause changes to, the DV. Focus on the “depends” aspect. The value of the dependent variable depends on the IVs. If Y depends on X, then Y is the dependent variable. This aspect applies to both randomized experiments and observational studies.
In an observational study about the effects of smoking, the researchers observe the subjects’ smoking status (smoker/non-smoker) and their lung cancer rates. It’s an observational study because they cannot randomly assign subjects to either the smoking or non-smoking group. In this study, the researchers want to know whether lung cancer rates depend on smoking status. Therefore, the lung cancer rate is the dependent variable.
In a randomized COVID-19 vaccine experiment, the researchers randomly assign subjects to the treatment or control group. They want to determine whether COVID-19 infection rates depend on vaccination status. Hence, the infection rate is the DV.
Note that a variable can be an independent variable in one study but a dependent variable in another. It depends on the context.
For example, one study might assess how the amount of exercise (IV) affects health (DV). However, another study might study the factors (IVs) that influence how much someone exercises (DV). The amount of exercise is an independent variable in one study but a dependent variable in the other!
Regression analysis and ANOVA mathematically describe the relationships between each independent variable and the dependent variable. Typically, you want to determine how changes in one or more predictors associate with changes in the dependent variable. These analyses estimate an effect size for each independent variable.
Suppose researchers study the relationship between wattage, several types of filaments, and the output from a light bulb. In this study, light output is the dependent variable because it depends on the other two variables. Wattage (continuous) and filament type (categorical) are the independent variables.
After performing the regression analysis, the researchers will understand the nature of the relationship between these variables. How much does the light output increase on average for each additional watt? Does the mean light output differ by filament types? They will also learn whether these effects are statistically significant.
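A rough sketch of that light bulb analysis, with invented effect sizes: dummy-code the filament type, fit by least squares, and read off the per-watt effect and the filament offsets:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 600

watts = rng.uniform(40, 100, n)
filament = rng.integers(0, 3, n)        # three hypothetical filament types

# Assumed true effects: 15 lumens per watt, plus a per-filament offset.
offsets = np.array([0.0, 80.0, 150.0])
lumens = 15 * watts + offsets[filament] + rng.normal(0, 20, n)

# Dummy-code the categorical IV (type 0 is the reference level).
dummies = np.eye(3)[filament][:, 1:]
X = np.column_stack([np.ones(n), watts, dummies])
beta, *_ = np.linalg.lstsq(X, lumens, rcond=None)

per_watt, type1_vs_0, type2_vs_0 = beta[1], beta[2], beta[3]
print(round(per_watt, 1))   # average lumens gained per additional watt
print(round(type1_vs_0), round(type2_vs_0))  # mean differences vs filament 0
```

The continuous IV yields a slope (light output per watt), while each dummy coefficient is the mean difference in output between that filament type and the reference type.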
Related post: When to Use Regression Analysis
As I mentioned earlier, graphs traditionally display the independent variables on the horizontal X-axis and the dependent variable on the vertical Y-axis. The type of graph depends on the nature of the variables. Here are a couple of examples.
Suppose you run an experiment to determine whether various teaching methods affect learning outcomes. Teaching method is a categorical predictor that defines the experimental groups. To display this type of data, you can use a boxplot, as shown below.
The groups are along the horizontal axis, while the dependent variable, learning outcomes, is on the vertical. From the graph, method 4 has the best results. A one-way ANOVA will tell you whether these results are statistically significant. Learn more about interpreting boxplots .
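For the curious, here’s what that one-way ANOVA boils down to, computed by hand on hypothetical learning-outcome scores (the group means are made up to echo the graph, with method 4 on top):

```python
import numpy as np

rng = np.random.default_rng(11)

# Hypothetical learning-outcome scores for four teaching methods.
groups = [
    rng.normal(70, 8, 30),   # method 1
    rng.normal(72, 8, 30),   # method 2
    rng.normal(69, 8, 30),   # method 3
    rng.normal(80, 8, 30),   # method 4 (assumed best, as in the boxplot)
]

# One-way ANOVA F statistic: between-group vs within-group variability.
all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()

ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)

F = (ss_between / df_between) / (ss_within / df_within)
print(round(F, 1))  # a large F suggests the group means genuinely differ
```

A large F ratio means the differences between group means are big relative to the scatter within groups, which is what "statistically significant" amounts to here once compared against the F distribution.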
Now, imagine that you are studying people’s height and weight. Specifically, do height increases cause weight to increase? Consequently, height is the independent variable on the horizontal axis, and weight is the dependent variable on the vertical axis. You can use a scatterplot to display this type of data.
It appears that as height increases, weight tends to increase. Regression analysis will tell you if these results are statistically significant. Learn more about interpreting scatterplots .
April 2, 2024 at 2:05 am
Hi again Jim
Thanks so much for taking an interest in New Zealand’s Equity Index.
Rather than me trying to explain what our Ministry of Education has done, here is a link to a fairly short paper. Scroll down to page 4 of this (if you have the inclination) – https://fyi.org.nz/request/21253/response/80708/attach/4/1301098%20Response%20and%20Appendix.pdf
The Equity Index is used to allocate only 4% of total school funding. The most advantaged 5% of schools get no “equity funding” and the other 95% get a share of the equity funding pool based on their index score. We are talking a maximum of around $1,000NZD per child per year for the most disadvantaged schools. The average amount is around $200-$300 per child per year.
My concern is that I thought the dependent variable is the thing you want to explain or predict using one or more independent variables. Choosing the form of dependent variable that gets a good fit seems to be answering the question “what can we predict well?” rather than “how do we best predict the factor of interest?” The factor is educational achievement and I think this should have been decided upon using theory rather than experimentation with the data.
As it turns out, the Ministry has chosen a measure of educational achievement that puts a heavy weight on achieving an “excellence” rating on a qualification and a much lower weight on simply gaining a qualification. My reading is that they have taken what our universities do when looking at which students to admit.
It doesn’t seem likely to me that a heavy weighting on excellent achievement is appropriate for targeting extra funding to schools with a lot of under-achieving students.
However, my stats knowledge isn’t extensive and it’s definitely rusty, so your thoughts are most helpful.
Regards Kathy Spencer
April 1, 2024 at 4:08 pm
Hi Jim, Great website, thank you.
I have been looking at New Zealand’s Equity Index which is used to allocate a small amount of extra funding to schools attended by children from disadvantaged backgrounds. The Index uses 37 socioeconomic measures relating to a child’s and their parents’ backgrounds that are found to be associated with educational achievement.
I was a bit surprised to read how they had decided on the dependent variable to be used as the measure of educational achievement, or dependent variable. Part of the process was as follows- “Each measure was tested to see the degree to which it could be predicted by the socioeconomic factors selected for the Equity Index.”
Any comment?
Many thanks Kathy Spencer
April 1, 2024 at 9:20 pm
That’s a very complex study and I don’t know much about it. So, that limits what I can say about it. But I’ll give you a few thoughts that come to mind.
This method is common in educational and social research, particularly when the goal is to understand or mitigate the impact of socioeconomic disparities on educational outcomes.
There are the usual concerns about not confusing correlation with causation. However, because this program seems to quantify barriers and then provide extra funding based on the index, I don’t think that’s a problem. They’re not attempting to adjust the socioeconomic measures so no worries about whether they’re directly causal or not.
I might have a small concern about cherry picking the model that happens to maximize the R-squared. Chasing the R-squared rather than having theory drive model selection is often problematic. Chasing the best fit increases the likelihood that the model fits this specific dataset best by random chance rather than being truly the best. If so, it won’t perform as well outside the dataset used to fit the model. Hopefully, they validated the predictive ability of the model using other data.
However, I’m not sure if the extra funding is determined by the model? I don’t know if the index value is calculated separately outside the candidate models and then fed into the various models. Or does the choice of model affect how the index value is calculated? If it’s the former, then the funding doesn’t depend on a potentially cherry picked model. If the latter, it does.
So, I’m not really clear on the purpose of the model. I’m guessing they just want to validate their Equity Index. And maximizing the R-squared doesn’t really say it’s the best Index but it does at least show that it likely has some merit. I’d be curious how they took the 37 measures and combined them into one index. So, I have more questions than answers. I don’t mean that in a critical sense. Just that I know almost nothing about this program.
I’m curious, what was the outcome they picked? How high was the R-squared? And what were your concerns?
February 5, 2024 at 5:04 pm
Thank you for this insightful blog. Is it valid to use a dependent variable delivered from the mean of independent variables in multiple regression if you want to evaluate the influence of each unique independent variable on the dependent variables?
February 5, 2024 at 11:11 pm
It’s difficult to answer your question because I’m not sure what you mean that the DV is “delivered from the mean of IVs.” If you mean that multiple IVs explain changes in the DV’s mean, yes, that’s the standard use for multiple regression.
If you mean something else, please explain in further detail. Thanks!
February 6, 2024 at 6:32 am
What I meant is: the DV values used in the multiple regression are calculated as the average of the IVs. For instance:
From 3 IVs (X1, X2, X3), Y is derived as:
Y = (Sum of all IVs) / (3)
Then the resulting Y is used as the DV along with the initial IVs to compute the multiple regression.
February 6, 2024 at 2:17 pm
There are a couple of reasons why you shouldn’t do that.
For starters, Y-hat (the predicted value of the regression equation) is the mean of the DV given specific values of the IV. However, that mean is calculated by using the regression coefficients and constant in the regression equation. You don’t calculate the DV mean as the sum of the IVs divided by the number of IVs. Perhaps given a very specific subject-area context, using this approach might seem to make sense but there are other problems.
A critical problem is that the Y is now calculated using the IVs. Instead, the DVs should be measured outcomes and not calculated from IVs. This violates regression assumptions and produces questionable results.
Additionally, it complicates the interpretation. Because the DV is calculated from the IV, you know the regression analysis will find a relationship between them. But you have no idea if that relationship exists in the real world. This complication occurs because your results are based on forcing the DV to equal a function of the IVs and do not reflect real-world outcomes.
In short, DVs should be real-world outcomes that you measure! And be sure to keep your IVs and DV independent. Let the regression analysis estimate the regression equation from your data that contains measured DVs. Don’t use a function to force the DV to equal some function of the IVs because that’s the opposite direction of how regression works!
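Jim’s point can be demonstrated directly: if the DV is computed as the average of the IVs, the regression’s R-squared is guaranteed to be essentially 1 regardless of the data, because the relationship was built in rather than discovered:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200

# Three unrelated IVs.
X1, X2, X3 = rng.normal(size=(3, n))

# The problematic construction: the DV is *computed* from the IVs.
Y = (X1 + X2 + X3) / 3

X = np.column_stack([np.ones(n), X1, X2, X3])
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)

residual_ss = ((Y - X @ beta) ** 2).sum()
total_ss = ((Y - Y.mean()) ** 2).sum()
r_squared = 1 - residual_ss / total_ss

print(round(r_squared, 6))  # essentially 1: the "fit" reflects the construction,
                            # not any real-world relationship
```

The perfect fit tells you nothing about the world; it only confirms that Y was defined as a function of the Xs.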
I hope that helps!
February 13, 2022 at 12:31 pm
Thanks a lot for creating this excellent blog. This is my go-to resource for Statistics.
I had been pondering over a question for some time; it would be great if you could shed some light on this.
In linear and non-linear regression, should the distribution of independent and dependent variables be unskewed? When is there a need to transform the data (say, Box-Cox transformation), and do we transform the independent variables as well?
October 28, 2021 at 12:55 pm
If I use an independent variable (X) and it displays a low p-value (<.05), why is it that if I introduce another independent variable to the regression, the coefficient and p-value of X from the first regression change to look insignificant? The second variable that I introduced has a low p-value in the regression.
October 29, 2021 at 11:22 pm
Keep in mind that the significance of each IV is calculated after accounting for the variance of all the other variables in the model, assuming you’re using the standard adjusted sums of squares rather than sequential sums of squares. The sums of squares (SS) is a measure of how much dependent variable variability each IV accounts for. In the illustration below, I’ll assume you’re using the standard adjusted SS.
So, let’s say that originally you have X1 in the model along with some other IVs. Your model estimates the significance of X1 after assessing the variability that the other IVs account for and finds that X1 is significant. Now, you add X2 to the model in addition to X1 and the other IVs. Now, when assessing X1, the model accounts for the variability of the IVs including the newly added X2. And apparently X2 explains a good portion of the variability. X1 is no longer able to account for that variability, which causes it to not be statistically significant.
In other words, X2 explains some of the variability that X1 previously explained. Because X1 no longer explains it, it is no longer significant.
Additionally, the significance of IVs is more likely to change when you add or remove IVs that are correlated. Correlated IVs are known as multicollinearity, which can be a problem when you have too much of it. Given the change in significance, I’d check your model for multicollinearity just to be safe! Click the link to read a post I wrote about that!
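One quick way to check, sketched below with synthetic data, is the variance inflation factor (VIF): regress each IV on the others and compute 1 / (1 - R²). Values above the common rule-of-thumb thresholds of 5 or 10 signal troublesome multicollinearity:

```python
import numpy as np

rng = np.random.default_rng(8)
n = 300

x1 = rng.normal(0, 1, n)
x2 = x1 + 0.2 * rng.normal(0, 1, n)   # nearly a copy of x1: severe multicollinearity

def vif(target, other):
    """Variance inflation factor of one IV given another IV."""
    X = np.column_stack([np.ones(len(target)), other])
    resid = target - X @ np.linalg.lstsq(X, target, rcond=None)[0]
    r_sq = 1 - resid.var() / target.var()
    return 1 / (1 - r_sq)

print(round(vif(x1, x2), 1))  # far above the rule-of-thumb cutoff of 5
```

A high VIF means the IVs overlap so much that the model struggles to attribute the shared variability to either one, which is exactly why coefficients and p-values can swing when a correlated IV is added.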
Independent and dependent variables are important for both math and science. If you don't understand what these two variables are and how they differ, you'll struggle to analyze an experiment or plot equations. Fortunately, we make learning these concepts easy!
In this guide, we break down what independent and dependent variables are , give examples of the variables in actual experiments, explain how to properly graph them, provide a quiz to test your skills, and discuss the one other important variable you need to know.
A variable is something you're trying to measure. It can be practically anything, such as objects, amounts of time, feelings, events, or ideas. If you're studying how people feel about different television shows, the variables in that experiment are television shows and feelings. If you're studying how different types of fertilizer affect how tall plants grow, the variables are type of fertilizer and plant height.
There are two key variables in every experiment: the independent variable and the dependent variable.
Independent variable: What the scientist changes or what changes on its own.
Dependent variable: What is being studied/measured.
The independent variable (sometimes known as the manipulated variable) is the variable whose change isn't affected by any other variable in the experiment. Either the scientist has to change the independent variable herself or it changes on its own; nothing else in the experiment affects or changes it. Two examples of common independent variables are age and time. There's nothing you or anything else can do to speed up or slow down time or increase or decrease age. They're independent of everything else.
The dependent variable (sometimes known as the responding variable) is what is being studied and measured in the experiment. It's what changes as a result of the changes to the independent variable. An example of a dependent variable is how tall you are at different ages. The dependent variable (height) depends on the independent variable (age).
An easy way to think of independent and dependent variables is, when you're conducting an experiment, the independent variable is what you change, and the dependent variable is what changes because of that. You can also think of the independent variable as the cause and the dependent variable as the effect.
It can be a lot easier to understand the differences between these two variables with examples, so let's look at some sample experiments below.
Below are overviews of three experiments, each with their independent and dependent variables identified.
Experiment 1: You want to figure out which brand of microwave popcorn pops the most kernels so you can get the most value for your money. You test different brands of popcorn to see which bag pops the most popcorn kernels.
Experiment 2: You want to see which type of fertilizer helps plants grow fastest, so you add a different brand of fertilizer to each plant and see how tall they grow.
Experiment 3: You're interested in how rising sea temperatures impact algae life, so you design an experiment that measures the number of algae in a sample of water taken from a specific ocean site under varying temperatures.
For each of the independent variables above, it's clear that they can't be changed by other variables in the experiment. You have to be the one to change the popcorn and fertilizer brands in Experiments 1 and 2, and the ocean temperature in Experiment 3 cannot be significantly changed by other factors. Changes to each of these independent variables cause the dependent variables to change in the experiments.
Independent and dependent variables always go in the same places on a graph. This makes it easy to quickly see which variable is independent and which is dependent when looking at a graph or chart. The independent variable always goes on the x-axis, or the horizontal axis. The dependent variable goes on the y-axis, or vertical axis.
Here's an example:
As you can see, this is a graph showing how the number of hours a student studies affects the score she got on an exam. From the graph, it looks like studying up to six hours helped her raise her score, but as she studied more than that her score dropped slightly.
The amount of time studied is the independent variable, because it's what she changed, so it's on the x-axis. The score she got on the exam is the dependent variable, because it's what changed as a result of the independent variable, and it's on the y-axis. It's common to put the units in parentheses next to the axis titles, which this graph does.
There are different ways to title a graph, but a common way is "[Independent Variable] vs. [Dependent Variable]" like this graph. Using a standard title like that also makes it easy for others to see what your independent and dependent variables are.
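These graphing conventions can be sketched with matplotlib. This is a minimal illustration, and the study-hours and score numbers below are made up to mirror the example:

```python
# Hypothetical study-time vs. exam-score data (illustrative numbers only).
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt

hours_studied = [1, 2, 3, 4, 5, 6, 7, 8]        # independent variable
exam_scores = [62, 68, 75, 81, 86, 90, 88, 87]  # dependent variable

fig, ax = plt.subplots()
ax.plot(hours_studied, exam_scores, marker="o")
ax.set_xlabel("Hours Studied (hours)")  # IV on the x-axis, units in parentheses
ax.set_ylabel("Exam Score (%)")         # DV on the y-axis
ax.set_title("Hours Studied vs. Exam Score")  # "[IV] vs. [DV]" title convention
fig.savefig("study_time_vs_score.png")
```

Putting the units in parentheses next to each axis label follows the same convention the example graph uses.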
Independent and dependent variables are the two most important variables to know and understand when conducting or studying an experiment, but there is one other type of variable that you should be aware of: constant variables.
Constant variables (also known as "constants") are simple to understand: they're what stay the same during the experiment. Most experiments have only one independent variable and one dependent variable, but they will all have multiple constant variables.
For example, in Experiment 2 above, some of the constant variables would be the type of plant being grown, the amount of fertilizer each plant is given, the amount of water each plant is given, when each plant is given fertilizer and water, the amount of sunlight the plants receive, the size of the container each plant is grown in, and more. The scientist is changing the type of fertilizer each plant gets which in turn changes how much each plant grows, but every other part of the experiment stays the same.
In experiments, you have to test one independent variable at a time in order to accurately understand how it impacts the dependent variable. Constant variables are important because they ensure that the dependent variable is changing because, and only because, of the independent variable so you can accurately measure the relationship between the dependent and independent variables.
If you didn't have any constant variables, you wouldn't be able to tell if the independent variable was what was really affecting the dependent variable. For example, in the example above, if there were no constants and you used different amounts of water, different types of plants, different amounts of fertilizer and put the plants in windows that got different amounts of sun, you wouldn't be able to say how fertilizer type affected plant growth because there would be so many other factors potentially affecting how the plants grew.
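The role of constants can be sketched in code. Below is a toy simulation of the fertilizer example; the growth model, brand names, and "boost" numbers are all invented for illustration. Because water and sunlight are held constant, any difference in growth can only come from the fertilizer (plus random noise):

```python
import random
random.seed(0)  # fixed seed so the toy results are repeatable

def plant_growth(fertilizer_boost, water_ml, sunlight_hours):
    # Hypothetical growth model: growth responds to all three inputs,
    # plus a little random noise per plant.
    return (2.0 * fertilizer_boost + 0.01 * water_ml
            + 0.5 * sunlight_hours + random.gauss(0, 0.2))

# Constant variables: every plant gets the same water and sunlight.
WATER_ML = 250
SUN_HOURS = 6

# Independent variable: the fertilizer brand (invented boost values).
fertilizers = {"Brand A": 1.0, "Brand B": 2.5, "Brand C": 1.8}

# Dependent variable: observed growth for each brand.
results = {brand: plant_growth(boost, WATER_ML, SUN_HOURS)
           for brand, boost in fertilizers.items()}
best = max(results, key=results.get)
```

If `WATER_ML` or `SUN_HOURS` varied from plant to plant, the comparison between brands would be confounded, which is exactly the problem described above.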
If you're still having a hard time understanding the relationship between independent and dependent variable, it might help to see them in action. Here are three experiments you can try at home.
One simple way to explore independent and dependent variables is to construct a biology experiment with seeds. Try growing some sunflowers and see how different factors affect their growth. For example, say you have ten sunflower seedlings, and you decide to give each a different amount of water each day to see if that affects their growth. The independent variable here would be the amount of water you give the plants, and the dependent variable is how tall the sunflowers grow.
Explore a wide range of chemical reactions with this chemistry kit . It includes 100+ ideas for experiments—pick one that interests you and analyze what the different variables are in the experiment!
Build and test a range of simple and complex machines with this K'nex kit . How does increasing a vehicle's mass affect its velocity? Can you lift more with a fixed or movable pulley? Remember, the independent variable is what you control/change, and the dependent variable is what changes because of that.
Can you identify the independent and dependent variables for each of the four scenarios below? The answers are at the bottom of the guide for you to check your work.
Scenario 1: You buy your dog multiple brands of food to see which one is her favorite.
Scenario 2: Your friends invite you to a party, and you decide to attend, but you're worried that staying out too long will affect how well you do on your geometry test tomorrow morning.
Scenario 3: Your dentist appointment will take 30 minutes from start to finish, but that doesn't include waiting in the lounge before you're called in. The total amount of time you spend in the dentist's office is the amount of time you wait before your appointment, plus the 30 minutes of the actual appointment.
Scenario 4: You regularly babysit your little cousin who always throws a tantrum when he's asked to eat his vegetables. Over the course of the week, you ask him to eat vegetables four times.
Knowing the independent variable definition and dependent variable definition is key to understanding how experiments work. The independent variable is what you change, and the dependent variable is what changes as a result of that. You can also think of the independent variable as the cause and the dependent variable as the effect.
When graphing these variables, the independent variable should go on the x-axis (the horizontal axis), and the dependent variable goes on the y-axis (vertical axis).
Constant variables are also important to understand. They are what stay the same throughout the experiment so you can accurately measure the impact of the independent variable on the dependent variable.
Independent and dependent variables are commonly taught in high school science classes.
Quiz Answers
1: Independent: dog food brands; Dependent: how much your dog eats
2: Independent: how long you spend at the party; Dependent: your exam score
3: Independent: Amount of time you spend waiting; Dependent: Total time you're at the dentist (the 30 minutes of appointment time is the constant)
4: Independent: Number of times your cousin is asked to eat vegetables; Dependent: number of tantrums
Christine graduated from Michigan State University with degrees in Environmental Biology and Geography and received her Master's from Duke University. In high school she scored in the 99th percentile on the SAT and was named a National Merit Finalist. She has taught English and biology in several countries.
Dave Cornell (PhD)
Dr. Cornell has worked in education for more than 20 years. His work has involved designing teacher certification for Trinity College in London and in-service training for state governments in the United States. He has trained kindergarten teachers in 8 countries and helped businessmen and women open baby centers and kindergartens in 3 countries.
Chris Drew (PhD)
This article was peer-reviewed and edited by Chris Drew (PhD). The review process on Helpful Professor involves having a PhD level expert fact check, edit, and contribute to articles. Reviewers ensure all content reflects expert academic consensus and is backed up with reference to academic studies. Dr. Drew has published over 20 academic articles in scholarly journals. He is the former editor of the Journal of Learning Development in Higher Education and holds a PhD in Education from ACU.
An independent variable (IV) is what is manipulated in a scientific experiment to determine its effect on the dependent variable (DV).
By varying the level of the independent variable and observing associated changes in the dependent variable, a researcher can conclude whether the independent variable affects the dependent variable or not.
This can provide very valuable information when studying just about any subject.
Because the researcher controls the level of the independent variable, it can be determined if the independent variable has a causal effect on the dependent variable.
The term causation is vitally important. Scientists want to know what causes changes in the dependent variable. The only way to do that is to manipulate the independent variable and observe any changes in the dependent variable.
The independent variable and dependent variable are used in a very specific type of scientific study called the experiment.
Although there are many variations of the experiment, generally speaking, it involves either the presence or absence of the independent variable and the observation of what happens to the dependent variable.
The research participants are randomly assigned to either receive the independent variable (called the treatment condition), or not receive the independent variable (called the control condition).
Other variations of an experiment might include having multiple levels of the independent variable.
If the independent variable affects the dependent variable, then it should be possible to observe changes in the dependent variable based on the presence or absence of the independent variable.
Of course, there are a lot of issues to consider when conducting an experiment, but these are the basic principles.
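As a rough sketch of these principles, here is a toy experiment in Python. The participant labels, the 5-unit treatment effect, and the outcome model are all invented for illustration, but the structure mirrors the description above: random assignment to treatment or control, then observation of the dependent variable in each group:

```python
import random
random.seed(42)  # fixed seed so the toy results are repeatable

# 20 hypothetical participants, shuffled for random assignment.
participants = [f"P{i:02d}" for i in range(1, 21)]
random.shuffle(participants)
treatment = participants[:10]  # receive the independent variable
control = participants[10:]    # do not receive it

def observe_dv(in_treatment):
    # Hypothetical outcome: the treatment shifts the dependent variable
    # up by 5 units on average, with individual random noise.
    return 50 + (5 if in_treatment else 0) + random.gauss(0, 1)

treatment_scores = [observe_dv(True) for _ in treatment]
control_scores = [observe_dv(False) for _ in control]

treatment_mean = sum(treatment_scores) / len(treatment_scores)
control_mean = sum(control_scores) / len(control_scores)
```

Because assignment to the two conditions is random, a clear difference between `treatment_mean` and `control_mean` can be attributed to the independent variable rather than to pre-existing differences between the groups.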
These concepts should not be confused with predictor and outcome variables.
1. Gatorade and Improved Athletic Performance
A sports medicine researcher has been hired by Gatorade to test the effects of its sports drink on athletic performance. The company wants to claim that when an athlete drinks Gatorade, their performance will improve.
If they can back up that claim with hard scientific data, that would be great for sales.
So, the researcher goes to a nearby university and randomly selects both male and female athletes from several sports: track and field, volleyball, basketball, and football. Each athlete will run on a treadmill for one hour while their heart rate is tracked.
All of the athletes are given the exact same amount of liquid to consume 30 minutes before and during their run. Half are given Gatorade, and the other half are given water, but no one knows which they are given because both liquids have been colored to look identical.
In this example, the independent variable is Gatorade, and the dependent variable is heart rate.
A hospital is investigating the effectiveness of a new type of chemotherapy on cancer. The researchers identified 120 patients with relatively similar types of cancerous tumors in both size and stage of progression.
The patients are randomly assigned to one of three groups: one group receives no chemotherapy, one group receives a low dose of chemotherapy, and one group receives a high dose of chemotherapy.
Each group receives chemotherapy treatment three times a week for two months, except for the no-treatment group. At the end of two months, the doctors measure the size of each patient’s tumor.
In this study, despite the ethical issues (remember this is just a hypothetical example), the independent variable is chemotherapy, and the dependent variable is tumor size.
A well-known fast-food corporation wants to know if the color of the interior of their restaurants will affect how fast people eat. Of course, they would prefer that consumers enter and exit quickly to increase sales volume and profit.
So, they rent space in a large shopping mall and create three different simulated restaurant interiors of different colors. One room is painted mostly white with red trim and seats; one room is painted mostly white with blue trim and seats; and one room is painted mostly white with off-white trim and seats.
Next, they randomly select shoppers on Saturdays and Sundays to eat for free in one of the three rooms. Each shopper is given a box of the same food and drink items and sent to one of the rooms. The researchers record how much time elapses from the moment they enter the room to the moment they leave.
The independent variable is the color of the room, and the dependent variable is the amount of time spent in the room eating.
A large multinational cosmetics company wants to know if the color of a woman’s hair affects the level of perceived attractiveness in males. So, they use Photoshop to manipulate the same image of a female by altering the color of her hair: blonde, brunette, red, and brown.
Next, they randomly select university males to enter their testing facilities. Each participant sits in front of a computer screen and responds to questions on a survey. At the end of the survey, the screen shows one of the photos of the female.
At the same time, software on the computer that utilizes the computer’s camera is measuring each male’s pupil dilation. The researchers believe that larger dilation indicates greater perceived attractiveness.
The independent variable is hair color, and the dependent variable is pupil dilation.
After many claims that listening to Mozart will make you smarter, a group of education specialists decides to put it to the test. So, first, they go to a nearby school in a middle-class neighborhood.
During the first three months of the academic year, they randomly select some 5th-grade classrooms to listen to Mozart during their lessons and exams. Other 5th-grade classrooms will not listen to any music during their lessons and exams.
The researchers then compare the scores of the exams between the two groups of classrooms.
Although there are a lot of obvious limitations to this hypothetical, it is the first step.
The independent variable is Mozart, and the dependent variable is exam scores.
A company that specializes in essential oils wants to examine the effects of lavender on sleep quality. They hire a sleep research lab to conduct the study. The researchers at the lab have their usual test volunteers sleep in individual rooms every night for one week.
The conditions of each room are all exactly the same, except that half of the rooms have lavender released into the rooms and half do not. While the study participants are sleeping, their heart rates and amount of time spent in deep sleep are recorded with high-tech equipment.
At the end of the study, the researchers compare the total amount of time spent in deep sleep of the lavender-room participants with the no lavender-room participants.
The independent variable in this sleep study is lavender, and the dependent variable is the total amount of time spent in deep sleep.
A group of teachers is interested in which teaching method will work best for developing critical thinking skills.
So, they train a group of teachers in three different teaching styles: teacher-centered, where the teacher tells the students all about critical thinking; student-centered, where the students practice critical thinking and receive teacher feedback; and AI-assisted teaching, where the teacher uses a special software program to teach critical thinking.
At the end of three months, all the students take the same test that assesses critical thinking skills. The teachers then compare the scores of each of the three groups of students.
The independent variable is the teaching method, and the dependent variable is performance on the critical thinking test.
A chemicals company has developed three different versions of their concrete mix. Each version contains a different blend of specially developed chemicals. The company wants to know which version is the strongest.
So, they create three bridge molds that are identical in every way. They fill each mold with one of the different concrete mixtures. Next, they test the strength of each bridge by placing progressively more weight on its center until the bridge collapses.
In this study, the independent variable is the concrete mixture, and the dependent variable is the amount of weight at collapse.
People in the pizza business know that the crust is key. Many companies, large and small, will keep their recipe a top secret. Before rolling out a new type of crust, the company decides to conduct some research on consumer preferences.
The company has prepared three versions of their crust that vary in crunchiness: a little crunchy, very crunchy, and super crunchy. They already have a pool of consumers that fit their customer profile, and they often use them for testing.
Each participant sits in a booth and takes a bite of one version of the crust. They then indicate how much they liked it by pressing one of 5 buttons: didn’t like at all, liked, somewhat liked, liked very much, loved it.
The independent variable is the level of crust crunchiness, and the dependent variable is how much it was liked.
A large food company is considering entering the health and nutrition sector. Their R&D food scientists have developed a protein supplement that is designed to help build muscle mass for people that work out regularly.
The company approaches several gyms near its headquarters. They enlist the cooperation of over 120 gym rats that work out 5 days a week. Their muscle mass is measured, and only those with a lower level are selected for the study, leaving a total of 80 study participants.
They randomly assign half of the participants to take the recommended dosage of their supplement every day for three months after each workout. The other half takes the same amount of a placebo: something that looks the same but actually does nothing to the body.
At the end of three months, the muscle mass of all participants is measured.
The independent variable is the supplement, and the dependent variable is muscle mass.
In the early days of airbags , automobile companies conducted a great deal of testing. At first, many people in the industry didn’t think airbags would be effective at all. Fortunately, there was a way to test this theory objectively.
In a representative example: Several crash cars were outfitted with an airbag, and an equal number were not. All crash cars were of the same make, year, and model. Then the crash experts rammed each car into a crash wall at the same speed. Sensors on the crash dummy skulls allowed for a scientific analysis of how much damage a human skull would incur.
The amount of skull damage of dummies in cars with airbags was then compared with those without airbags.
The independent variable was the airbag and the dependent variable was the amount of skull damage.
Some people take vitamins every day. A group of health scientists decides to conduct a study to determine if taking vitamins improves health.
They randomly select 1,000 people that are relatively similar in terms of their physical health. The key word here is “similar.”
Because the scientists have an unlimited budget (and because this is a hypothetical example), all of the participants have the same meals delivered to their homes (breakfast, lunch, and dinner) every day for one year.
In addition, the scientists randomly assign half of the participants to take a set of vitamins, supplied by the researchers every day for 1 year. The other half do not take the vitamins.
At the end of one year, the health of all participants is assessed, using blood pressure and cholesterol level as the key measurements.
In this highly unrealistic study, the independent variable is vitamins, and the dependent variable is health, as measured by blood pressure and cholesterol levels.
Does practicing meditation reduce stress? If you have ever wondered if this is true or not, then you are in luck because there is a way to know one way or the other.
All we have to do is find 90 people that are similar in age, stress levels, diet and exercise, and as many other factors as we can think of.
Next, we randomly assign each person to either practice meditation every day, three days a week, or not at all. After three months, we measure the stress levels of each person and compare the groups.
How should we measure stress? Well, there are a lot of ways. We could measure blood pressure, or the amount of the stress hormone cortisol in their blood, or by using a paper and pencil measure such as a questionnaire that asks them how much stress they feel.
In this study, the independent variable is meditation and the dependent variable is the amount of stress (however it is measured).
When video games started to become increasingly graphic, it was a huge concern in many countries in the world. Educators, social scientists, and parents were shocked at how graphic games were becoming.
Since then, there have been hundreds of studies conducted by psychologists and other researchers. A lot of those studies used an experimental design that involved males of various ages randomly assigned to play a graphic or non-graphic video game.
Afterward, their level of aggression was measured via a wide range of methods, including direct observations of their behavior, their actions when given the opportunity to be aggressive, or a variety of other measures.
So many studies have used so many different ways of measuring aggression.
In these experimental studies, the independent variable was graphic video games, and the dependent variable was observed level of aggression.
Car pollution is a concern for a lot of reasons. In addition to being bad for the environment, car exhaust may cause damage to the brain and impair cognitive performance.
One way to examine this possibility would be to conduct an animal study. The research would look something like this: laboratory rats would be raised in three different rooms that varied in the degree of car exhaust circulating in the room: no exhaust, little exhaust, or a lot of exhaust.
After a certain period of time, perhaps several months, the effects on cognitive performance could be measured.
One common way of assessing cognitive performance in laboratory rats is by measuring the amount of time it takes to run a maze successfully. It would also be possible to examine the physical effects of car exhaust on the brain by conducting an autopsy.
In this animal study, the independent variable would be car exhaust and the dependent variable would be amount of time to run a maze.
The experiment is an incredibly valuable way to answer scientific questions regarding the cause and effect of certain variables. By manipulating the level of an independent variable and observing corresponding changes in a dependent variable, scientists can gain an understanding of many phenomena.
For example, scientists can learn if graphic video games make people more aggressive, if meditation reduces stress, if Gatorade improves athletic performance, and even if certain medical treatments can cure cancer.
The determination of causality is the key benefit of manipulating the independent variable and then observing changes in the dependent variable. Other research methodologies can reveal factors that are related to or associated with the dependent variable, but only when the independent variable is controlled by the researcher can causality be determined.
What’s the definition of a dependent variable?
A dependent variable is what changes as a result of the independent variable manipulation in experiments. It’s what you’re interested in measuring, and it “depends” on your independent variable.
In statistics, dependent variables are also called response variables (they respond to a change in another variable) or outcome variables (they represent the outcome you want to measure).
Attrition refers to participants leaving a study. It always happens to some extent—for example, in randomized controlled trials for medical research.
Differential attrition occurs when attrition or dropout rates differ systematically between the intervention and the control group. As a result, the characteristics of the participants who drop out differ from the characteristics of those who stay in the study. Because of this, study results may be biased.
Action research is conducted in order to solve a particular issue immediately, while case studies are often conducted over a longer period of time and focus more on observing and analyzing a particular ongoing phenomenon.
Action research is focused on solving a problem or informing individual and community-based knowledge in a way that impacts teaching, learning, and other related processes. It is less focused on contributing theoretical input, instead producing actionable input.
Action research is particularly popular with educators as a form of systematic inquiry because it prioritizes reflection and bridges the gap between theory and practice. Educators are able to simultaneously investigate an issue as they solve it, and the method is very iterative and flexible.
A cycle of inquiry is another name for action research. It is usually visualized in a spiral shape following a series of steps, such as “planning → acting → observing → reflecting.”
To make quantitative observations , you need to use instruments that are capable of measuring the quantity you want to observe. For example, you might use a ruler to measure the length of an object or a thermometer to measure its temperature.
Criterion validity and construct validity are both types of measurement validity. In other words, they both show you how accurately a method measures something.
While construct validity is the degree to which a test or other measurement method measures what it claims to measure, criterion validity is the degree to which a test can predictively (in the future) or concurrently (in the present) measure something.
Construct validity is often considered the overarching type of measurement validity. You need to have face validity, content validity, and criterion validity in order to achieve construct validity.
Convergent validity and discriminant validity are both subtypes of construct validity. Together, they help you evaluate whether a test measures the concept it was designed to measure.
You need to assess both in order to demonstrate construct validity. Neither one alone is sufficient for establishing construct validity.
Content validity shows you how accurately a test or other measurement method taps into the various aspects of the specific construct you are researching.
In other words, it helps you answer the question: “does the test measure all aspects of the construct I want to measure?” If it does, then the test has high content validity.
The higher the content validity, the more accurate the measurement of the construct.
If the test fails to include parts of the construct, or irrelevant parts are included, the validity of the instrument is threatened, which brings your results into question.
Face validity and content validity are similar in that they both evaluate how suitable the content of a test is. The difference is that face validity is subjective, and assesses content at surface level.
When a test has strong face validity, anyone would agree that the test’s questions appear to measure what they are intended to measure.
For example, looking at a 4th grade math test consisting of problems in which students have to add and multiply, most people would agree that it has strong face validity (i.e., it looks like a math test).
On the other hand, content validity evaluates how well a test represents all the aspects of a topic. Assessing content validity is more systematic and relies on expert evaluation of each question, analyzing whether each one covers the aspects that the test was designed to cover.
A 4th grade math test would have high content validity if it covered all the skills taught in that grade. Experts (in this case, math teachers) would have to evaluate the content validity by comparing the test to the learning objectives.
Snowball sampling is a non-probability sampling method. Unlike probability sampling (which involves some form of random selection), the initial individuals selected to be studied are the ones who recruit new participants.
Because not every member of the target population has an equal chance of being recruited into the sample, selection in snowball sampling is non-random.
Snowball sampling is a non-probability sampling method, where there is not an equal chance for every member of the population to be included in the sample.
This means that you cannot use inferential statistics and make generalizations (often the goal of quantitative research). As such, a snowball sample is not representative of the target population and is usually a better fit for qualitative research.
Snowball sampling relies on the use of referrals. Here, the researcher recruits one or more initial participants, who then recruit the next ones.
Participants share similar characteristics and/or know each other. Because of this, not every member of the population has an equal chance of being included in the sample, giving rise to sampling bias.
Snowball sampling is best used in the following cases:
The reproducibility and replicability of a study can be ensured by writing a transparent, detailed method section and using clear, unambiguous language.
Reproducibility and replicability are related terms.
Stratified sampling and quota sampling both involve dividing the population into subgroups and selecting units from each subgroup. The purpose in both cases is to select a representative sample and/or to allow comparisons between subgroups.
The main difference is that in stratified sampling, you draw a random sample from each subgroup (probability sampling). In quota sampling, you select a predetermined number or proportion of units in a non-random manner (non-probability sampling).
Purposive and convenience sampling are both sampling methods that are typically used in qualitative data collection.
A convenience sample is drawn from a source that is conveniently accessible to the researcher. Convenience sampling does not select participants based on any particular characteristics. On the other hand, purposive sampling focuses on selecting participants possessing characteristics associated with the research study.
The findings of studies based on either convenience or purposive sampling can only be generalized to the (sub)population from which the sample is drawn, and not to the entire population.
Random sampling or probability sampling is based on random selection. This means that each unit has an equal chance (i.e., equal probability) of being included in the sample.
On the other hand, convenience sampling is non-random: you recruit whoever happens to be available (for example, by stopping people on the street), so not everyone has an equal chance of being selected, depending on the place, time, or day you are collecting your data.
Convenience sampling and quota sampling are both non-probability sampling methods. They both use non-random criteria like availability, geographical proximity, or expert knowledge to recruit study participants.
However, in convenience sampling, you continue to sample units or cases until you reach the required sample size.
In quota sampling, you first need to divide your population of interest into subgroups (strata) and estimate their proportions (quota) in the population. Then you can start your data collection, using convenience sampling to recruit participants, until the proportions in each subgroup coincide with the estimated proportions in the population.
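This fill-until-quota loop can be sketched in Python. The subgroup labels, quota numbers, and the `next_convenient_participant` helper are all hypothetical stand-ins for real recruitment, not part of any standard library:

```python
import random

# Hypothetical target quotas: desired counts per subgroup in a sample of 100,
# based on estimated population proportions (e.g., 60% urban, 40% rural).
quotas = {"urban": 60, "rural": 40}
counts = {"urban": 0, "rural": 0}
sample = []

def next_convenient_participant():
    # Stand-in for convenience recruitment (e.g., whoever walks by);
    # here we simply simulate a random arrival.
    return {"id": random.randrange(10_000), "area": random.choice(["urban", "rural"])}

while any(counts[g] < quotas[g] for g in quotas):
    person = next_convenient_participant()
    group = person["area"]
    if counts[group] < quotas[group]:   # accept only while the quota is open
        sample.append(person)
        counts[group] += 1
    # otherwise skip: that subgroup's quota is already filled
```

When the loop ends, the counts per subgroup match the quotas exactly, which is what distinguishes this from plain convenience sampling.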
A sampling frame is a list of every member in the entire population. It is important that the sampling frame is as complete as possible, so that your sample accurately reflects your population.
Stratified and cluster sampling may look similar, but bear in mind that groups created in cluster sampling are heterogeneous, so the individual characteristics in the cluster vary. In contrast, groups created in stratified sampling are homogeneous, as units share characteristics.
Relatedly, in cluster sampling you randomly select entire groups and include all units of each group in your sample. However, in stratified sampling, you select some units of all groups and include them in your sample. In this way, both methods can ensure that your sample is representative of the target population.
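The contrast can be sketched with a toy population; the neighborhood labels and group sizes here are made up purely for illustration:

```python
import random

random.seed(0)

# Toy population: units grouped by neighborhood (hypothetical labels).
population = {
    "north": list(range(0, 100)),
    "south": list(range(100, 200)),
    "east":  list(range(200, 300)),
}

# Stratified sampling: draw a random sample *within every* group.
stratified_sample = []
for units in population.values():
    stratified_sample += random.sample(units, 10)   # some units from all groups

# Cluster sampling: randomly select *entire* groups and take all their units.
chosen_clusters = random.sample(list(population), 1)
cluster_sample = [u for g in chosen_clusters for u in population[g]]

# Stratified: 30 units spread across all 3 groups.
# Cluster: 100 units, all from one randomly chosen group.
```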
A systematic review is secondary research because it uses existing research. You don’t collect new data yourself.
The key difference between observational studies and experimental designs is that a well-done observational study does not influence the responses of participants, while experiments do have some sort of treatment condition applied to at least some participants by random assignment.
An observational study is a great choice for you if your research question is based purely on observations. If there are ethical, logistical, or practical concerns that prevent you from conducting a traditional experiment, an observational study may be a good choice. In an observational study, there is no interference or manipulation of the research subjects and no control or treatment groups.
It’s often best to ask a variety of people to review your measurements. You can ask experts, such as other researchers, or laypeople, such as potential participants, to judge the face validity of tests.
While experts have a deep understanding of research methods, the people you’re studying can provide you with valuable insights you may have missed otherwise.
Face validity is important because it’s a simple first step to measuring the overall validity of a test or technique. It’s a relatively intuitive, quick, and easy way to start checking whether a new measure seems useful at first glance.
Good face validity means that anyone who reviews your measure says that it seems to be measuring what it’s supposed to. With poor face validity, someone reviewing your measure may be left confused about what you’re measuring and why you’re using this method.
Face validity is about whether a test appears to measure what it’s supposed to measure. This type of validity is concerned with whether a measure seems relevant and appropriate for what it’s assessing only on the surface.
Statistical analyses are often applied to test validity with data from your measures. You test convergent validity and discriminant validity with correlations to see if results from your test are positively or negatively related to those of other established tests.
You can also use regression analyses to assess whether your measure is actually predictive of outcomes that you expect it to predict theoretically. A regression analysis that supports your expectations strengthens your claim of construct validity.
When designing or evaluating a measure, construct validity helps you ensure you’re actually measuring the construct you’re interested in. If you don’t have construct validity, you may inadvertently measure unrelated or distinct constructs and lose precision in your research.
Construct validity is often considered the overarching type of measurement validity, because it covers all of the other types. You need to have face validity, content validity, and criterion validity to achieve construct validity.
Construct validity is about how well a test measures the concept it was designed to evaluate. It’s one of four main types of measurement validity; the others are face validity, content validity, and criterion validity.
There are two subtypes of construct validity.
Naturalistic observation is a valuable tool because of its flexibility, external validity, and suitability for topics that can’t be studied in a lab setting.
The downsides of naturalistic observation include its lack of scientific control, ethical considerations, and potential for bias from observers and subjects.
Naturalistic observation is a qualitative research method where you record the behaviors of your research subjects in real world settings. You avoid interfering or influencing anything in a naturalistic observation.
You can think of naturalistic observation as “people watching” with a purpose.
An independent variable is the variable you manipulate, control, or vary in an experimental study to explore its effects. It’s called “independent” because it’s not influenced by any other variables in the study.
Independent variables are also called:
As a rule of thumb, questions related to thoughts, beliefs, and feelings work well in focus groups. Take your time formulating strong questions, paying special attention to phrasing. Be careful to avoid leading questions, which can bias your responses.
Overall, your focus group questions should be:
A structured interview is a data collection method that relies on asking questions in a set order to collect data on a topic. Structured interviews are often quantitative in nature and are best used when:
More flexible interview options include semi-structured interviews, unstructured interviews, and focus groups.
Social desirability bias is the tendency for interview participants to give responses that will be viewed favorably by the interviewer or other participants. It occurs in all types of interviews and surveys, but is most common in semi-structured interviews, unstructured interviews, and focus groups.
Social desirability bias can be mitigated by ensuring participants feel at ease and comfortable sharing their views. Make sure to pay attention to your own body language and any physical or verbal cues, such as nodding or widening your eyes.
This type of bias can also occur in observations if the participants know they’re being observed. They might alter their behavior accordingly.
The interviewer effect is a type of bias that emerges when a characteristic of an interviewer (race, age, gender identity, etc.) influences the responses given by the interviewee.
There is a risk of an interviewer effect in all types of interviews, but it can be mitigated by writing high-quality interview questions.
A semi-structured interview is a blend of structured and unstructured types of interviews. Semi-structured interviews are best used when:
An unstructured interview is the most flexible type of interview, but it is not always the best fit for your research topic.
Unstructured interviews are best used when:
The four most common types of interviews are:
Deductive reasoning is commonly used in scientific research, and it’s especially associated with quantitative research.
In research, you might have come across something called the hypothetico-deductive method. It’s the scientific method of testing hypotheses to check whether your predictions are substantiated by real-world data.
Deductive reasoning is a logical approach where you progress from general ideas to specific conclusions. It’s often contrasted with inductive reasoning, where you start with specific observations and form general conclusions.
Deductive reasoning is also called deductive logic.
There are many different types of inductive reasoning that people use formally or informally.
Here are a few common types:
Inductive reasoning is a bottom-up approach, while deductive reasoning is top-down.
Inductive reasoning takes you from the specific to the general, while in deductive reasoning, you make inferences by going from general premises to specific conclusions.
In inductive research, you start by making observations or gathering data. Then, you take a broad scan of your data and search for patterns. Finally, you make general conclusions that you might incorporate into theories.
Inductive reasoning is a method of drawing conclusions by going from the specific to the general. It’s usually contrasted with deductive reasoning, where you proceed from general information to specific conclusions.
Inductive reasoning is also called inductive logic or bottom-up reasoning.
A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question.
A hypothesis is not just a guess — it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations and statistical analysis of data).
Triangulation can help:
But triangulation can also pose problems:
There are four main types of triangulation:
Many academic fields use peer review, largely to determine whether a manuscript is suitable for publication. Peer review enhances the credibility of the published manuscript.
However, peer review is also common in non-academic settings. The United Nations, the European Union, and many individual nations use peer review to evaluate grant applications. It is also widely used in medical and health-related fields as a teaching or quality-of-care measure.
Peer assessment is often used in the classroom as a pedagogical tool. Both receiving feedback and providing it are thought to enhance the learning process, helping students think critically and collaboratively.
Peer review can stop obviously problematic, falsified, or otherwise untrustworthy research from being published. It also represents an excellent opportunity to get feedback from renowned experts in your field. It acts as a first defense, helping you ensure your argument is clear and that there are no gaps, vague terms, or unanswered questions for readers who weren’t involved in the research process.
Peer-reviewed articles are considered a highly credible source due to the stringent review process they go through before publication.
In general, the peer review process follows these steps:
Exploratory research is often used when the issue you’re studying is new or when the data collection process is challenging for some reason.
You can use exploratory research if you have a general idea or a specific question that you want to study but there is no preexisting knowledge or paradigm with which to study it.
Exploratory research is a methodological approach that explores research questions that have not previously been studied in depth. It is often used when the issue you’re studying is new, or when the data collection process is challenging in some way.
Explanatory research is used to investigate how or why a phenomenon occurs. Therefore, this type of research is often one of the first stages in the research process, serving as a jumping-off point for future research.
Exploratory research aims to explore the main aspects of an under-researched problem, while explanatory research aims to explain the causes and consequences of a well-defined problem.
Explanatory research is a research method used to investigate how or why something occurs when only a small amount of information is available pertaining to that topic. It can help you increase your understanding of a given topic.
Clean data are valid, accurate, complete, consistent, unique, and uniform. Dirty data include inconsistencies and errors.
Dirty data can come from any part of the research process, including poor research design, inappropriate measurement materials, or flawed data entry.
Data cleaning takes place between data collection and data analyses. But you can use some methods even before collecting data.
For clean data, you should start by designing measures that collect valid data. Data validation at the time of data entry or collection helps you minimize the amount of data cleaning you’ll need to do.
After data collection, you can use data standardization and data transformation to clean your data. You’ll also deal with any missing values, outliers, and duplicate values.
Every dataset requires different techniques to clean dirty data, but you need to address these issues in a systematic way. You focus on finding and resolving data points that don’t agree or fit with the rest of your dataset.
These data might include missing values, outliers, duplicates, incorrectly formatted entries, or irrelevant values. You’ll start with screening and diagnosing your data. Then, you’ll often standardize and accept or remove data to make your dataset consistent and valid.
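A minimal sketch of that screen-and-resolve workflow on made-up survey records; the field names and the plausible weight range are assumptions for illustration only:

```python
# Toy records with the usual problems: duplicates, missing values,
# inconsistent formatting, and an implausible outlier.
records = [
    {"id": 1, "weight_kg": "70"},
    {"id": 2, "weight_kg": " 82.5 "},
    {"id": 2, "weight_kg": " 82.5 "},   # duplicate entry
    {"id": 3, "weight_kg": None},       # missing value
    {"id": 4, "weight_kg": "700"},      # implausible outlier
]

cleaned, seen_ids = [], set()
for rec in records:
    if rec["id"] in seen_ids:           # remove duplicate values
        continue
    if rec["weight_kg"] is None:        # handle missing values (here: drop)
        continue
    value = float(str(rec["weight_kg"]).strip())   # standardize the format
    if not 30 <= value <= 250:          # drop out-of-range outliers (assumed bounds)
        continue
    cleaned.append({"id": rec["id"], "weight_kg": value})
    seen_ids.add(rec["id"])

# cleaned keeps only ids 1 and 2, with numeric, consistently formatted weights
```

Whether you drop, impute, or merely flag problem values depends on your study; dropping here is just the simplest choice for the sketch.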
Data cleaning is necessary for valid and appropriate analyses. Dirty data contain inconsistencies or errors, but cleaning your data helps you minimize or resolve these.
Without data cleaning, you could end up with a Type I or II error in your conclusion. These types of erroneous conclusions can have important practical consequences, leading to misplaced investments or missed opportunities.
Data cleaning involves spotting and resolving potential data inconsistencies or errors to improve your data quality. An error is any value (e.g., recorded weight) that doesn’t reflect the true value (e.g., actual weight) of something that’s being measured.
In this process, you review, analyze, detect, modify, or remove “dirty” data to make your dataset “clean.” Data cleaning is also called data cleansing or data scrubbing.
Research misconduct means making up or falsifying data, manipulating data analyses, or misrepresenting results in research reports. It’s a form of academic fraud.
These actions are committed intentionally and can have serious consequences; research misconduct is not a simple mistake or a point of disagreement but a serious ethical failure.
Anonymity means you don’t know who the participants are, while confidentiality means you know who they are but remove identifying information from your research report. Both are important ethical considerations.
You can only guarantee anonymity by not collecting any personally identifying information—for example, names, phone numbers, email addresses, IP addresses, physical characteristics, photos, or videos.
You can keep data confidential by using aggregate information in your research report, so that you only refer to groups of participants rather than individuals.
Research ethics matter for scientific integrity, human rights and dignity, and collaboration between science and society. These principles make sure that participation in studies is voluntary, informed, and safe.
Ethical considerations in research are a set of principles that guide your research designs and practices. These principles include voluntary participation, informed consent, anonymity, confidentiality, potential for harm, and results communication.
Scientists and researchers must always adhere to a certain code of conduct when collecting data from others.
These considerations protect the rights of research participants, enhance research validity , and maintain scientific integrity.
In multistage sampling, you can use probability or non-probability sampling methods.
For a probability sample, you have to conduct probability sampling at every stage.
You can mix it up by using simple random sampling, systematic sampling, or stratified sampling to select units at different stages, depending on what is applicable and relevant to your study.
Multistage sampling can simplify data collection when you have large, geographically spread samples, and you can obtain a probability sample without a complete sampling frame.
But multistage sampling may not lead to a representative sample, and larger samples are needed for multistage samples to achieve the statistical properties of simple random samples.
These are four of the most common mixed methods designs:
Triangulation in research means using multiple datasets, methods, theories and/or investigators to address a research question. It’s a research strategy that can help you enhance the validity and credibility of your findings.
Triangulation is mainly used in qualitative research, but it’s also commonly applied in quantitative research. Mixed methods research always uses triangulation.
In multistage sampling, or multistage cluster sampling, you draw a sample from a population using smaller and smaller groups at each stage.
This method is often used to collect data from a large, geographically spread group of people in national surveys, for example. You take advantage of hierarchical groupings (e.g., from state to city to neighborhood) to create a sample that’s less expensive and time-consuming to collect data from.
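Under hypothetical state-to-city-to-household groupings (the names and sizes are invented), a three-stage draw might be sketched like this:

```python
import random

random.seed(1)

# Hypothetical hierarchy: state -> city -> households (tiny toy numbers).
states = {
    f"state{s}": {
        f"city{s}-{c}": [f"hh{s}-{c}-{h}" for h in range(20)]
        for c in range(5)
    }
    for s in range(4)
}

# Stage 1: randomly select 2 states.
stage1 = random.sample(list(states), 2)

# Stage 2: within each selected state, randomly select 2 cities.
stage2 = {s: random.sample(list(states[s]), 2) for s in stage1}

# Stage 3: within each selected city, randomly select 5 households.
sample = [
    hh
    for s in stage1
    for city in stage2[s]
    for hh in random.sample(states[s][city], 5)
]

# 2 states x 2 cities x 5 households = 20 sampled units,
# with no need for a complete list of every household in the country.
```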
No, the steepness or slope of the line isn’t related to the correlation coefficient value. The correlation coefficient only tells you how closely your data fit on a line, so two datasets with the same correlation coefficient can have very different slopes.
To find the slope of the line, you’ll need to perform a regression analysis.
Correlation coefficients always range between -1 and 1.
The sign of the coefficient tells you the direction of the relationship: a positive value means the variables change together in the same direction, while a negative value means they change together in opposite directions.
The absolute value of a number is equal to the number without its sign. The absolute value of a correlation coefficient tells you the magnitude of the correlation: the greater the absolute value, the stronger the correlation.
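A small stdlib sketch of Pearson's r illustrates these points: the sign gives the direction, the absolute value the strength, and rescaling the data changes the slope but not r. The data values are arbitrary examples:

```python
import math

def pearson_r(x, y):
    # Pearson's r: the covariance of x and y divided by the product
    # of their standard deviations.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]             # y = 2x: regression slope 2
y_steep = [20, 40, 60, 80, 100]  # y = 20x: a 10x steeper slope

pearson_r(x, y)        # 1.0 — a perfect positive correlation
pearson_r(x, y_steep)  # also 1.0 — same r despite the very different slope
pearson_r(x, y[::-1])  # -1.0 — the negative sign flips the direction
```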
These are the assumptions your data must meet if you want to use Pearson’s r :
Quantitative research designs can be divided into two main categories:
Qualitative research designs tend to be more flexible. Common types of qualitative design include case study, ethnography, and grounded theory designs.
A well-planned research design helps ensure that your methods match your research aims, that you collect high-quality data, and that you use the right kind of analysis, based on credible sources, to answer your questions. This allows you to draw valid, trustworthy conclusions.
The priorities of a research design can vary depending on the field, but you usually have to specify:
A research design is a strategy for answering your research question. It defines your overall approach and determines how you will collect and analyze data.
Questionnaires can be self-administered or researcher-administered.
Self-administered questionnaires can be delivered online or in paper-and-pen formats, in person or through mail. All questions are standardized so that all respondents receive the same questions with identical wording.
Researcher-administered questionnaires are interviews that take place by phone, in-person, or online between researchers and respondents. You can gain deeper insights by clarifying questions for respondents or asking follow-up questions.
You can organize the questions logically, with a clear progression from simple to complex, or randomly between respondents. A logical flow helps respondents process the questionnaire more easily and quickly, but it may lead to bias. Randomization can minimize the bias from order effects.
Closed-ended, or restricted-choice, questions offer respondents a fixed set of choices to select from. These questions are easier to answer quickly.
Open-ended or long-form questions allow respondents to answer in their own words. Because there are no restrictions on their choices, respondents can answer in ways that researchers may not have otherwise considered.
A questionnaire is a data collection tool or instrument, while a survey is an overarching research method that involves collecting and analyzing data from people using questionnaires.
The third variable and directionality problems are two main reasons why correlation isn’t causation.
The third variable problem means that a confounding variable affects both variables to make them seem causally related when they are not.
The directionality problem is when two variables correlate and might actually have a causal relationship, but it’s impossible to conclude which variable causes changes in the other.
Correlation describes an association between variables: when one variable changes, so does the other. A correlation is a statistical indicator of the relationship between variables.
Causation means that changes in one variable bring about changes in the other (i.e., there is a cause-and-effect relationship between variables). The two variables are correlated with each other, and there’s also a causal link between them.
While causation and correlation can exist simultaneously, correlation does not imply causation. In other words, correlation is simply a relationship where A relates to B, but A doesn’t necessarily cause B to happen (or vice versa). Mistaking correlation for causation is a common error and can lead to the false cause fallacy.
Controlled experiments establish causality, whereas correlational studies only show associations between variables.
In general, correlational research is high in external validity while experimental research is high in internal validity.
A correlation is usually tested for two variables at a time, but you can test correlations between three or more variables.
A correlation coefficient is a single number that describes the strength and direction of the relationship between your variables.
Different types of correlation coefficients might be appropriate for your data based on their levels of measurement and distributions. The Pearson product-moment correlation coefficient (Pearson’s r) is commonly used to assess a linear relationship between two quantitative variables.
A correlational research design investigates relationships between two variables (or more) without the researcher controlling or manipulating any of them. It’s a non-experimental type of quantitative research.
A correlation reflects the strength and/or direction of the association between two or more variables.
Random error is almost always present in scientific studies, even in highly controlled settings. While you can’t eradicate it completely, you can reduce random error by taking repeated measurements, using a large sample, and controlling extraneous variables.
You can avoid systematic error through careful design of your sampling, data collection, and analysis procedures. For example, use triangulation to measure your variables using multiple methods; regularly calibrate instruments or procedures; use random sampling and random assignment; and apply masking (blinding) where possible.
Systematic error is generally a bigger problem in research.
With random error, multiple measurements will tend to cluster around the true value. When you’re collecting data from a large sample, the errors in different directions will cancel each other out.
Systematic errors are much more problematic because they can skew your data away from the true value. This can lead you to false conclusions (Type I and II errors) about the relationship between the variables you’re studying.
Random and systematic error are two types of measurement error.
Random error is a chance difference between the observed and true values of something (e.g., a researcher misreading a weighing scale records an incorrect measurement).
Systematic error is a consistent or proportional difference between the observed and true values of something (e.g., a miscalibrated scale consistently records weights as higher than they actually are).
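The difference can be illustrated with a small simulation; the true value, the noise level, and the 2 kg bias are all arbitrary choices for the sketch:

```python
import random

random.seed(42)
true_weight = 70.0

# Random error: zero-mean noise scattered around the true value.
random_readings = [true_weight + random.gauss(0, 0.5) for _ in range(10_000)]

# Systematic error: a miscalibrated scale that always adds 2 kg.
biased_readings = [r + 2.0 for r in random_readings]

mean_random = sum(random_readings) / len(random_readings)
mean_biased = sum(biased_readings) / len(biased_readings)

# mean_random lands very close to 70.0: the random errors cancel out.
# mean_biased stays about 2 kg too high: averaging cannot remove the bias.
```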
On graphs, the explanatory variable is conventionally placed on the x-axis, while the response variable is placed on the y-axis.
The term “explanatory variable” is sometimes preferred over “independent variable” because, in real world contexts, independent variables are often influenced by other variables. This means they aren’t totally independent.
Multiple independent variables may also be correlated with each other, so “explanatory variables” is a more appropriate term.
The difference between explanatory and response variables is simple:
In a controlled experiment, all extraneous variables are held constant so that they can’t influence the results. Controlled experiments require:
Depending on your study topic, there are various other methods of controlling variables.
There are 4 main types of extraneous variables:
An extraneous variable is any variable that you’re not investigating that can potentially affect the dependent variable of your research study.
A confounding variable is a type of extraneous variable that not only affects the dependent variable, but is also related to the independent variable.
In a factorial design, multiple independent variables are tested.
If you test two variables, each level of one independent variable is combined with each level of the other independent variable to create different conditions.
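For example, with two hypothetical independent variables (the variable names and levels are invented), the conditions can be enumerated by crossing the levels:

```python
# Two hypothetical independent variables with their levels.
dosages = ["low", "high"]
timings = ["morning", "noon", "evening"]

# Cross every level of one independent variable with every level of the
# other: a 2 x 3 factorial design yields 6 conditions.
conditions = [(d, t) for d in dosages for t in timings]
# e.g. ('low', 'morning'), ('low', 'noon'), ..., ('high', 'evening')
```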
Within-subjects designs have many potential threats to internal validity, but they are also very statistically powerful.
Advantages:
Disadvantages:
While a between-subjects design has fewer threats to internal validity, it also requires more participants for high statistical power than a within-subjects design.
Yes. Between-subjects and within-subjects designs can be combined in a single study when you have two or more independent variables (a factorial design). In a mixed factorial design, one variable is altered between subjects and another is altered within subjects.
In a between-subjects design, every participant experiences only one condition, and researchers assess group differences between participants in various conditions.
In a within-subjects design, each participant experiences all conditions, and researchers test the same participants repeatedly for differences between conditions.
The word “between” means that you’re comparing different conditions between groups, while the word “within” means you’re comparing different conditions within the same group.
Random assignment is used in experiments with a between-groups or independent measures design. In this research design, there’s usually a control group and one or more experimental groups. Random assignment helps ensure that the groups are comparable.
In general, you should always use random assignment in this type of experimental design when it is ethically possible and makes sense for your study topic.
To implement random assignment, assign a unique number to every member of your study’s sample.
Then, you can use a random number generator or a lottery method to randomly assign each number to a control or experimental group. You can also do so manually, by flipping a coin or rolling a die to randomly assign participants to groups.
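A shuffle-and-split version of this procedure can be sketched as follows; the sample size and fifty-fifty split are arbitrary choices:

```python
import random

random.seed(7)

# Step 1: assign a unique number to every member of the sample.
participants = list(range(1, 21))   # 20 hypothetical participant IDs

# Step 2: shuffle and split — after shuffling, each participant has an
# equal chance of landing in either group.
random.shuffle(participants)
control = participants[:10]
experimental = participants[10:]

# Every participant ends up in exactly one group.
```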
Random selection, or random sampling, is a way of selecting members of a population for your study’s sample.
In contrast, random assignment is a way of sorting the sample into control and experimental groups.
Random sampling enhances the external validity or generalizability of your results, while random assignment improves the internal validity of your study.
In experimental research, random assignment is a way of placing participants from your sample into different groups using randomization. With this method, every member of the sample has a known or equal chance of being placed in a control group or an experimental group.
“Controlling for a variable” means measuring extraneous variables and accounting for them statistically to remove their effects on other variables.
Researchers often model control variable data along with independent and dependent variable data in regression analyses and ANCOVAs. That way, you can isolate the control variable’s effects from the relationship between the variables of interest.
Control variables help you establish a correlational or causal relationship between variables by enhancing internal validity.
If you don’t control relevant extraneous variables, they may influence the outcomes of your study, and you may not be able to demonstrate that your results are really an effect of your independent variable.
A control variable is any variable that’s held constant in a research study. It’s not a variable of interest in the study, but it’s controlled because it could influence the outcomes.
Including mediators and moderators in your research helps you go beyond studying a simple relationship between two variables for a fuller picture of the real world. They are important to consider when studying complex correlational or causal relationships.
Mediators are part of the causal pathway of an effect, and they tell you how or why an effect takes place. Moderators usually help you judge the external validity of your study by identifying the limitations of when the relationship between variables holds.
If something is a mediating variable, it is caused by the independent variable and in turn influences the dependent variable, forming part of the causal chain between them.
A confounder is a third variable that affects variables of interest and makes them seem related when they are not. In contrast, a mediator is the mechanism of a relationship between two variables: it explains the process by which they are related.
A mediator variable explains the process through which two variables are related, while a moderator variable affects the strength and direction of that relationship.
There are three key steps in systematic sampling:
Systematic sampling is a probability sampling method where researchers select members of the population at a regular interval – for example, by selecting every 15th person on a list of the population. If the population is in a random order, this can imitate the benefits of simple random sampling.
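The every-15th-person example can be sketched as a list slice with a random starting point; the population list here is a made-up stand-in.

```python
import random

population = [f"person_{i}" for i in range(1, 301)]  # 300 people, in random order
interval = 15

start = random.randrange(interval)    # random starting point between 0 and 14
sample = population[start::interval]  # then every 15th person on the list
```

With 300 people and an interval of 15, the sample always contains 20 members regardless of the starting point.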
Yes, you can create a stratified sample using multiple characteristics, but you must ensure that every participant in your study belongs to one and only one subgroup. In this case, you multiply the numbers of subgroups for each characteristic to get the total number of groups.
For example, if you were stratifying by location with three subgroups (urban, rural, or suburban) and marital status with five subgroups (single, divorced, widowed, married, or partnered), you would have 3 x 5 = 15 subgroups.
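Enumerating those subgroups is a Cartesian product: each participant belongs to exactly one (location, marital status) combination.

```python
from itertools import product

# The two stratification characteristics from the example above.
locations = ["urban", "rural", "suburban"]
marital_statuses = ["single", "divorced", "widowed", "married", "partnered"]

subgroups = list(product(locations, marital_statuses))  # 3 x 5 = 15 subgroups
```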
You should use stratified sampling when your sample can be divided into mutually exclusive and exhaustive subgroups that you believe will take on different mean values for the variable that you’re studying.
Using stratified sampling will allow you to obtain more precise (with lower variance) statistical estimates of whatever you are trying to measure.
For example, say you want to investigate how income differs based on educational attainment, but you know that this relationship can vary based on race. Using stratified sampling, you can ensure you obtain a large enough sample from each racial group, allowing you to draw more precise conclusions.
In stratified sampling, researchers divide subjects into subgroups called strata based on characteristics that they share (e.g., race, gender, educational attainment).
Once divided, each subgroup is randomly sampled using another probability sampling method.
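The two steps above can be sketched as follows; the strata labels and the 10% sampling fraction are illustrative assumptions, not from the text.

```python
import random
from collections import defaultdict

# Hypothetical participants tagged with a stratum (educational attainment).
population = [(f"person_{i}", random.choice(["high school", "bachelor", "graduate"]))
              for i in range(300)]

# Step 1: divide subjects into strata based on a shared characteristic.
strata = defaultdict(list)
for person, education in population:
    strata[education].append(person)

# Step 2: randomly sample within each stratum (simple random sampling here).
sample = []
for members in strata.values():
    k = max(1, len(members) // 10)  # roughly a 10% sample per stratum
    sample.extend(random.sample(members, k))
```

Sampling within every stratum guarantees that each subgroup is represented in the final sample.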
Cluster sampling is more time- and cost-efficient than other probability sampling methods, particularly when it comes to large samples spread across a wide geographical area.
However, it provides less statistical certainty than other methods, such as simple random sampling, because it is difficult to ensure that your clusters properly represent the population as a whole.
There are three types of cluster sampling: single-stage, double-stage and multi-stage clustering. In all three types, you first divide the population into clusters, then randomly select clusters for use in your sample.
Cluster sampling is a probability sampling method in which you divide a population into clusters, such as districts or schools, and then randomly select some of these clusters as your sample.
The clusters should ideally each be mini-representations of the population as a whole.
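Single-stage cluster sampling can be sketched like this; the schools and class sizes are made-up stand-ins for real clusters.

```python
import random

# Hypothetical population organized into clusters (e.g. schools).
clusters = {f"school_{i}": [f"school_{i}_student_{j}" for j in range(30)]
            for i in range(20)}

# Randomly select whole clusters, then include every member of each one.
chosen = random.sample(list(clusters), k=5)
sample = [member for name in chosen for member in clusters[name]]
```

In double- or multi-stage variants you would sample again *within* each chosen cluster instead of taking every member.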
If properly implemented, simple random sampling is usually the best sampling method for ensuring both internal and external validity. However, it can sometimes be impractical and expensive to implement, depending on the size of the population to be studied.
If you have a list of every member of the population and the ability to reach whichever members are selected, you can use simple random sampling.
The American Community Survey is an example of simple random sampling. In order to collect detailed data on the population of the US, Census Bureau officials randomly select 3.5 million households per year and use a variety of methods to convince them to fill out the survey.
Simple random sampling is a type of probability sampling in which the researcher randomly selects a subset of participants from a population. Each member of the population has an equal chance of being selected. Data is then collected from as large a percentage as possible of this random subset.
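Selecting such a subset is a one-liner with the standard library; the population here is hypothetical.

```python
import random

# Every member of this made-up population has an equal chance of selection.
population = [f"member_{i}" for i in range(10_000)]
sample = random.sample(population, k=500)  # selection without replacement
```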
Quasi-experimental design is most useful in situations where it would be unethical or impractical to run a true experiment.
Quasi-experiments have lower internal validity than true experiments, but they often have higher external validity as they can use real-world interventions instead of artificial laboratory settings.
A quasi-experiment is a type of research design that attempts to establish a cause-and-effect relationship. The main difference with a true experiment is that the groups are not randomly assigned.
Blinding is important to reduce research bias (e.g., observer bias, demand characteristics) and ensure a study’s internal validity.
If participants know whether they are in a control or treatment group, they may adjust their behavior in ways that affect the outcome that researchers are trying to measure. If the people administering the treatment are aware of group assignment, they may treat participants differently and thus directly or indirectly influence the final results.
Blinding means hiding who is assigned to the treatment group and who is assigned to the control group in an experiment.
A true experiment (a.k.a. a controlled experiment) always includes at least one control group that doesn’t receive the experimental treatment.
However, some experiments use a within-subjects design to test treatments without a control group. In these designs, you usually compare one group’s outcomes before and after a treatment (instead of comparing outcomes between different groups).
For strong internal validity, it’s usually best to include a control group if possible. Without a control group, it’s harder to be certain that the outcome was caused by the experimental treatment and not by other variables.
An experimental group, also known as a treatment group, receives the treatment whose effect researchers wish to study, whereas a control group does not. They should be identical in all other ways.
Individual Likert-type questions are generally considered ordinal data, because the items have clear rank order, but don’t have an even distribution.
Overall Likert scale scores are sometimes treated as interval data. These scores are considered to have directionality and even spacing between them.
The type of data determines what statistical tests you should use to analyze your data.
A Likert scale is a rating scale that quantitatively assesses opinions, attitudes, or behaviors. It is made up of 4 or more questions that measure a single attitude or trait when response scores are combined.
To use a Likert scale in a survey, you present participants with Likert-type questions or statements, and a continuum of items, usually with 5 or 7 possible responses, to capture their degree of agreement.
In scientific research, concepts are the abstract ideas or phenomena that are being studied (e.g., educational achievement). Variables are properties or characteristics of the concept (e.g., performance at school), while indicators are ways of measuring or quantifying variables (e.g., yearly grade reports).
The process of turning abstract concepts into measurable variables and indicators is called operationalization.
There are various approaches to qualitative data analysis, but they all share five steps in common:
The specifics of each step depend on the focus of the analysis. Some common approaches include textual analysis, thematic analysis, and discourse analysis.
There are five common approaches to qualitative research:
Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses, by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.
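The "could it have arisen by chance" logic can be sketched with a permutation test; the two groups' scores below are invented for illustration.

```python
import random

treatment = [5.1, 5.8, 6.2, 5.9, 6.5]
control = [4.8, 5.0, 4.6, 5.2, 4.9]
observed = sum(treatment) / 5 - sum(control) / 5  # observed mean difference

# How often does randomly reshuffling the group labels produce a
# difference at least as large as the observed one?
pooled = treatment + control
extreme = 0
trials = 10_000
for _ in range(trials):
    random.shuffle(pooled)
    diff = sum(pooled[:5]) / 5 - sum(pooled[5:]) / 5
    if diff >= observed:
        extreme += 1
p_value = extreme / trials  # a small p-value means "unlikely to arise by chance"
```

If the labels were irrelevant, shuffles would often match the observed difference; here they rarely do, so the pattern is unlikely to be chance alone.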
Operationalization means turning abstract conceptual ideas into measurable observations.
For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioral avoidance of crowded places, or physical anxiety symptoms in social situations.
Before collecting data, it’s important to consider how you will operationalize the variables that you want to measure.
When conducting research, collecting original data has significant advantages:
However, there are also some drawbacks: data collection can be time-consuming, labor-intensive and expensive. In some cases, it’s more efficient to use secondary data that has already been collected by someone else, but the data might be less reliable.
Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organizations.
There are several methods you can use to decrease the impact of confounding variables on your research: restriction, matching, statistical control and randomization.
In restriction, you restrict your sample by only including certain subjects that have the same values of potential confounding variables.
In matching, you match each of the subjects in your treatment group with a counterpart in the comparison group. The matched subjects have the same values on any potential confounding variables, and only differ in the independent variable.
In statistical control, you include potential confounders as variables in your regression.
In randomization, you randomly assign the treatment (or independent variable) in your study to a sufficiently large number of subjects, which allows you to control for all potential confounding variables.
A confounding variable is closely related to both the independent and dependent variables in a study. An independent variable represents the supposed cause, while the dependent variable is the supposed effect. A confounding variable is a third variable that influences both the independent and dependent variables.
Failing to account for confounding variables can cause you to wrongly estimate the relationship between your independent and dependent variables.
To ensure the internal validity of your research, you must consider the impact of confounding variables. If you fail to account for them, you might over- or underestimate the causal relationship between your independent and dependent variables, or even find a causal relationship where none exists.
Yes, but including more than one of either type requires multiple research questions.
For example, if you are interested in the effect of a diet on health, you can use multiple measures of health: blood sugar, blood pressure, weight, pulse, and many more. Each of these is its own dependent variable with its own research question.
You could also choose to look at the effect of exercise levels as well as diet, or even the additional effect of the two combined. Each of these is a separate independent variable.
To ensure the internal validity of an experiment, you should only change one independent variable at a time.
No. The value of a dependent variable depends on an independent variable, so a variable cannot be both independent and dependent at the same time. It must be either the cause or the effect, not both!
You want to find out how blood sugar levels are affected by drinking diet soda and regular soda, so you conduct an experiment.
Determining cause and effect is one of the most important parts of scientific research. It’s essential to know which is the cause – the independent variable – and which is the effect – the dependent variable.
In non-probability sampling, the sample is selected based on non-random criteria, and not every member of the population has a chance of being included.
Common non-probability sampling methods include convenience sampling, voluntary response sampling, purposive sampling, snowball sampling, and quota sampling.
Probability sampling means that every member of the target population has a known chance of being included in the sample.
Probability sampling methods include simple random sampling, systematic sampling, stratified sampling, and cluster sampling.
Using careful research design and sampling procedures can help you avoid sampling bias. Oversampling can be used to correct undercoverage bias.
Some common types of sampling bias include self-selection bias, nonresponse bias, undercoverage bias, survivorship bias, pre-screening or advertising bias, and healthy user bias.
Sampling bias is a threat to external validity – it limits the generalizability of your findings to a broader group of people.
A sampling error is the difference between a population parameter and a sample statistic.
A statistic refers to measures about the sample, while a parameter refers to measures about the population.
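The parameter/statistic distinction can be made concrete with a made-up population of 1,000 exam scores:

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible
population = [random.gauss(70, 10) for _ in range(1000)]
parameter = sum(population) / len(population)  # population mean (a parameter)

sample = random.sample(population, k=100)
statistic = sum(sample) / len(sample)          # sample mean (a statistic)

sampling_error = statistic - parameter         # sampling error, as defined above
```

The sample mean will rarely equal the population mean exactly; that gap is the sampling error.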
Populations are used when a research question requires data from every member of the population. This is usually only feasible when the population is small and easily accessible.
Samples are used to make inferences about populations. Samples are easier to collect data from because they are practical, cost-effective, convenient, and manageable.
There are seven threats to external validity: selection bias, history, experimenter effect, Hawthorne effect, testing effect, aptitude-treatment interaction, and situation effect.
The two types of external validity are population validity (whether you can generalize to other groups of people) and ecological validity (whether you can generalize to other situations and settings).
The external validity of a study is the extent to which you can generalize your findings to different groups of people, situations, and measures.
Cross-sectional studies cannot establish a cause-and-effect relationship or analyze behavior over a period of time. To investigate cause and effect, you need to do a longitudinal study or an experimental study.
Cross-sectional studies are less expensive and time-consuming than many other types of study. They can provide useful insights into a population’s characteristics and identify correlations for further research.
Sometimes only cross-sectional data is available for analysis; other times your research question may only require a cross-sectional study to answer it.
Longitudinal studies can last anywhere from weeks to decades, although they tend to be at least a year long.
The 1970 British Cohort Study, which has collected data on the lives of 17,000 Brits since their births in 1970, is one well-known example of a longitudinal study.
Longitudinal studies are better to establish the correct sequence of events, identify changes over time, and provide insight into cause-and-effect relationships, but they also tend to be more expensive and time-consuming than other types of studies.
Longitudinal studies and cross-sectional studies are two different types of research design. In a cross-sectional study you collect data from a population at a specific point in time; in a longitudinal study you repeatedly collect data from the same sample over an extended period of time.
Longitudinal study | Cross-sectional study |
---|---|
Repeated observations | Observations at a single point in time |
Observes the same sample multiple times | Observes different samples (a “cross-section”) in the population |
Follows changes in participants over time | Provides snapshot of society at a given point |
There are eight threats to internal validity: history, maturation, instrumentation, testing, selection bias, regression to the mean, social interaction, and attrition.
Internal validity is the extent to which you can be confident that a cause-and-effect relationship established in a study cannot be explained by other factors.
In mixed methods research, you use both qualitative and quantitative data collection and analysis methods to answer your research question.
The research methods you use depend on the type of data you need to answer your research question.
A confounding variable, also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.
A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.
In your research design, it’s important to identify potential confounding variables and plan how you will reduce their impact.
Discrete and continuous variables are two types of quantitative variables: discrete variables represent counts (e.g. the number of objects in a collection), while continuous variables represent measurable amounts (e.g. water volume or weight).
Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).
Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).
You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results.
You can think of independent and dependent variables in terms of cause and effect: an independent variable is the variable you think is the cause, while a dependent variable is the effect.
In an experiment, you manipulate the independent variable and measure the outcome in the dependent variable. For example, in an experiment about the effect of nutrients on crop growth, the independent variable is the amount of nutrients added to the crop field, while the dependent variable is the biomass of the crops at harvest time.
Defining your variables, and deciding how you will manipulate and measure them, is an important part of experimental design.
Experimental design means planning a set of procedures to investigate a relationship between variables. To design a controlled experiment, you need:
When designing the experiment, you decide:
Experimental design is essential to the internal and external validity of your experiment.
Internal validity is the degree of confidence that the causal relationship you are testing is not influenced by other factors or variables.
External validity is the extent to which your results can be generalized to other contexts.
The validity of your experiment depends on your experimental design.
Reliability and validity are both about how well a method measures something: reliability refers to the consistency of a measure (whether the results can be reproduced under the same conditions), while validity refers to its accuracy (whether the results really measure what they are supposed to measure).
If you are doing experimental research, you also have to consider the internal and external validity of your experiment.
A sample is a subset of individuals from a larger population. Sampling means selecting the group that you will actually collect data from in your research. For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.
In statistics, sampling allows you to test a hypothesis about the characteristics of a population.
Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.
Quantitative methods allow you to systematically measure variables and test hypotheses. Qualitative methods allow you to explore concepts and experiences in more detail.
Methodology refers to the overarching strategy and rationale of your research project. It involves studying the methods used in your field and the theories or principles behind them, in order to develop an approach that matches your objectives.
Methods are the specific tools and procedures you use to collect and analyze data (for example, experiments, surveys, and statistical tests).
In shorter scientific papers, where the aim is to report the findings of a specific study, you might simply describe what you did in a methods section.
In a longer or more complex research project, such as a thesis or dissertation, you will probably include a methodology section, where you explain your approach to answering the research questions and cite relevant sources to support your choice of methods.
Dependent Variable: The variable that depends on other factors that are measured. These variables are expected to change as a result of an experimental manipulation of the independent variable or variables. It is the presumed effect.
Independent Variable: The variable that is stable and unaffected by the other variables you are trying to measure. It refers to the condition of an experiment that is systematically manipulated by the investigator. It is the presumed cause.
Cramer, Duncan and Dennis Howitt. The SAGE Dictionary of Statistics . London: SAGE, 2004; Penslar, Robin Levin and Joan P. Porter. Institutional Review Board Guidebook: Introduction . Washington, DC: United States Department of Health and Human Services, 2010; "What are Dependent and Independent Variables?" Graphic Tutorial.
Don't feel bad if you are confused about which is the dependent variable and which is the independent variable in social and behavioral sciences research. However, it's important that you learn the difference, because framing a study using these variables is a common approach to organizing the elements of a social sciences research study in order to discover relevant and meaningful results. Specifically, it is important for these two reasons:
A variable in research simply refers to a person, place, thing, or phenomenon that you are trying to measure in some way. The best way to understand the difference between a dependent and independent variable is that the meaning of each is implied by what the words tell us about the variable you are using. You can do this with a simple exercise from the website, Graphic Tutorial. Take the sentence, "The [independent variable] causes a change in [dependent variable] and it is not possible that [dependent variable] could cause a change in [independent variable]." Insert the names of variables you are using in the sentence in the way that makes the most sense. This will help you identify each type of variable. If you're still not sure, consult with your professor before you begin to write.
Fan, Shihe. "Independent Variable." In Encyclopedia of Research Design. Neil J. Salkind, editor. (Thousand Oaks, CA: SAGE, 2010), pp. 592-594; "What are Dependent and Independent Variables?" Graphic Tutorial; Salkind, Neil J. "Dependent Variable." In Encyclopedia of Research Design , Neil J. Salkind, editor. (Thousand Oaks, CA: SAGE, 2010), pp. 348-349;
The process of examining a research problem in the social and behavioral sciences is often framed around methods of analysis that compare, contrast, correlate, average, or integrate relationships between or among variables. Techniques include associations, sampling, random selection, and blind selection. Designation of the dependent and independent variable involves unpacking the research problem in a way that identifies a general cause and effect and classifying these variables as either independent or dependent.
The variables should be outlined in the introduction of your paper and explained in more detail in the methods section. There are no rules about the structure and style for writing about independent or dependent variables but, as with any academic writing, clarity and being succinct is most important.
After you have described the research problem and its significance in relation to prior research, explain why you have chosen to examine the problem using a method of analysis that investigates the relationships between or among independent and dependent variables. State what it is about the research problem that lends itself to this type of analysis. For example, if you are investigating the relationship between corporate environmental sustainability efforts [the independent variable] and dependent variables associated with measuring employee satisfaction at work using a survey instrument, you would first identify each variable and then provide background information about the variables. What is meant by "environmental sustainability"? Are you looking at a particular company [e.g., General Motors] or are you investigating an industry [e.g., the meat packing industry]? Why is employee satisfaction in the workplace important? How does a company make their employees aware of sustainability efforts and why would a company even care that its employees know about these efforts?
Identify each variable for the reader and define each. In the introduction, this information can be presented in a paragraph or two when you describe how you are going to study the research problem. In the methods section, you build on the literature review of prior studies about the research problem to describe in detail background about each variable, breaking each down for measurement and analysis. For example, what activities do you examine that reflect a company's commitment to environmental sustainability? Levels of employee satisfaction can be measured by a survey that asks about things like volunteerism or a desire to stay at the company for a long time.
The structure and writing style of describing the variables and their application to analyzing the research problem should be stated and unpacked in such a way that the reader obtains a clear understanding of the relationships between the variables and why they are important. This is also important so that the study can be replicated in the future using the same variables but applied in a different way.
Fan, Shihe. "Independent Variable." In Encyclopedia of Research Design. Neil J. Salkind, editor. (Thousand Oaks, CA: SAGE, 2010), pp. 592-594; "What are Dependent and Independent Variables?" Graphic Tutorial; “Case Example for Independent and Dependent Variables.” ORI Curriculum Examples. U.S. Department of Health and Human Services, Office of Research Integrity; Salkind, Neil J. "Dependent Variable." In Encyclopedia of Research Design , Neil J. Salkind, editor. (Thousand Oaks, CA: SAGE, 2010), pp. 348-349; “Independent Variables and Dependent Variables.” Karl L. Wuensch, Department of Psychology, East Carolina University [posted email exchange]; “Variables.” Elements of Research. Dr. Camille Nebeker, San Diego State University.
Published on 4 May 2022 by Pritha Bhandari . Revised on 17 October 2022.
In research, variables are any characteristics that can take on different values, such as height, age, temperature, or test scores.
Researchers often manipulate or measure independent and dependent variables in studies to test cause-and-effect relationships.
Your independent variable is the temperature of the room. You vary the room temperature by making it cooler for half the participants, and warmer for the other half.
What is an independent variable?
Types of independent variables
What is a dependent variable?
Identifying independent vs dependent variables
Independent and dependent variables in research
Visualising independent and dependent variables
Frequently asked questions about independent and dependent variables
An independent variable is the variable you manipulate or vary in an experimental study to explore its effects. It’s called ‘independent’ because it’s not influenced by any other variables in the study.
Independent variables are also called explanatory variables (because they explain an event or outcome) or predictor variables (because they can be used to predict the value of a dependent variable).
These terms are especially used in statistics, where you estimate the extent to which an independent variable change can explain or predict changes in the dependent variable.
There are two main types of independent variables.
In experiments, you manipulate independent variables directly to see how they affect your dependent variable. The independent variable is usually applied at different levels to see how the outcomes differ.
You can apply just two levels in order to find out if an independent variable has an effect at all.
You can also apply multiple levels to find out how the independent variable affects the dependent variable.
You have three independent variable levels, and each group gets a different level of treatment.
You randomly assign your patients to one of the three groups:
A true experiment requires you to randomly assign different levels of an independent variable to your participants.
Random assignment helps you control participant characteristics, so that they don’t affect your experimental results. This helps you to have confidence that your dependent variable results come solely from the independent variable manipulation.
Subject variables are characteristics that vary across participants, and they can’t be manipulated by researchers. For example, gender identity, ethnicity, race, income, and education are all important subject variables that social researchers treat as independent variables.
It’s not possible to randomly assign these to participants, since these are characteristics of already existing groups. Instead, you can create a research design where you compare the outcomes of groups of participants with different characteristics. This is a quasi-experimental design because there’s no random assignment.
Your independent variable is a subject variable, namely the gender identity of the participants. You have three groups: men, women, and other.
Your dependent variable is the brain activity response to hearing infant cries. You record brain activity with fMRI scans when participants hear infant cries without their awareness.
A dependent variable is the variable that changes as a result of the independent variable manipulation. It’s the outcome you’re interested in measuring, and it ‘depends’ on your independent variable.
In statistics, dependent variables are also called response variables (because they respond to a change in another variable) or outcome variables (because they represent the outcome you want to measure).
The dependent variable is what you record after you’ve manipulated the independent variable. You use this measurement data to check whether and to what extent your independent variable influences the dependent variable by conducting statistical analyses.
Based on your findings, you can estimate the degree to which your independent variable variation drives changes in your dependent variable. You can also predict how much your dependent variable will change as a result of variation in the independent variable.
Distinguishing between independent and dependent variables can be tricky when designing a complex study or reading an academic paper.
A dependent variable from one study can be the independent variable in another study, so it’s important to pay attention to research design.
Here are some tips for identifying each variable type.
Use this list of questions to check whether you’re dealing with an independent variable:
Check whether you’re dealing with a dependent variable:
Independent and dependent variables are generally used in experimental and quasi-experimental research.
Here are some examples of research questions and corresponding independent and dependent variables.
Research question | Independent variable | Dependent variable(s) |
---|---|---|
Do tomatoes grow fastest under fluorescent, incandescent, or natural light? | Type of light the tomato plant is grown under | Rate of growth of the tomato plant |
What is the effect of intermittent fasting on blood sugar levels? | Presence or absence of intermittent fasting | Blood sugar levels |
Is medical marijuana effective for pain reduction in people with chronic pain? | Use of medical marijuana | Frequency and intensity of pain |
To what extent does remote working increase job satisfaction? | Type of work environment (remote or in office) | Job satisfaction |
For experimental data, you analyse your results by generating descriptive statistics and visualising your findings. Then, you select an appropriate statistical test to test your hypothesis.
The type of test is determined by:
You’ll often use t tests or ANOVAs to analyse your data and answer your research questions.
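As a minimal sketch of the t test case (the blood sugar values below are invented for illustration, not from any study, and scipy is assumed to be available), comparing a dependent variable between two levels of an independent variable looks like this:

```python
from scipy import stats

# Invented fasting blood sugar values (mg/dL) for two groups of participants.
# Independent variable: group membership; dependent variable: blood sugar.
fasting_group = [92, 88, 95, 90, 87, 93, 89, 91]
control_group = [104, 99, 107, 102, 100, 105, 98, 103]

# Independent-samples t test on the two group means
t_stat, p_value = stats.ttest_ind(fasting_group, control_group)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

A small p-value would suggest that the difference in the dependent variable between the two groups is unlikely to be due to chance alone.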
In quantitative research, it’s good practice to use charts or graphs to visualise the results of studies. Generally, the independent variable goes on the x-axis (horizontal) and the dependent variable on the y-axis (vertical).
The type of visualisation you use depends on the variable types in your research questions:
To inspect your data, you place your independent variable of treatment level on the x-axis and the dependent variable of blood pressure on the y-axis.
You plot bars for each treatment group before and after the treatment to show the difference in blood pressure.
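A minimal sketch of this kind of chart in Python (using matplotlib, with invented blood pressure numbers; the group names are only placeholders) might look like:

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen, no display needed
import matplotlib.pyplot as plt
import numpy as np

# Invented blood pressure readings (mmHg) per treatment group
groups = ["Placebo", "Low dose", "High dose"]
before = [141, 140, 142]
after = [139, 131, 124]

x = np.arange(len(groups))
fig, ax = plt.subplots()
ax.bar(x - 0.2, before, width=0.4, label="Before treatment")
ax.bar(x + 0.2, after, width=0.4, label="After treatment")
ax.set_xticks(x)
ax.set_xticklabels(groups)
ax.set_xlabel("Treatment level (independent variable)")   # IV on the x-axis
ax.set_ylabel("Blood pressure, mmHg (dependent variable)")  # DV on the y-axis
ax.legend()
fig.savefig("blood_pressure.png")
```

Note that the independent variable defines the bar positions on the horizontal axis, while the measured outcome determines the bar heights on the vertical axis.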
An independent variable is the variable you manipulate, control, or vary in an experimental study to explore its effects. It’s called ‘independent’ because it’s not influenced by any other variables in the study.
A dependent variable is what changes as a result of the independent variable manipulation in experiments . It’s what you’re interested in measuring, and it ‘depends’ on your independent variable.
In statistics, dependent variables are also called response variables or outcome variables.
Determining cause and effect is one of the most important parts of scientific research. It’s essential to know which is the cause – the independent variable – and which is the effect – the dependent variable.
You want to find out how blood sugar levels are affected by drinking diet cola and regular cola, so you conduct an experiment.
Yes, but including more than one of either type requires multiple research questions .
For example, if you are interested in the effect of a diet on health, you can use multiple measures of health: blood sugar, blood pressure, weight, pulse, and many more. Each of these is its own dependent variable with its own research question.
You could also choose to look at the effect of exercise levels as well as diet, or even the additional effect of the two combined. Each of these is a separate independent variable .
To ensure the internal validity of an experiment , you should only change one independent variable at a time.
No. The value of a dependent variable depends on an independent variable, so a variable cannot be both independent and dependent at the same time. It must be either the cause or the effect, not both.
Bhandari, P. (2022, October 17). Independent vs Dependent Variables | Definition & Examples. Scribbr. Retrieved 3 September 2024, from https://www.scribbr.co.uk/research-methods/independent-vs-dependent-variables/
Independent vs. Dependent Variables
The two main variables in a scientific experiment are the independent and dependent variables. An independent variable is changed or controlled in a scientific experiment to test the effects on another variable. This variable being tested and measured is called the dependent variable.
As its name suggests, the dependent variable is "dependent" on the independent variable. As the experimenter changes the independent variable, the effect on the dependent variable is observed and recorded.
Let's say a scientist wants to see if the brightness of light has any effect on a moth's attraction to the light. The brightness of the light is controlled by the scientist. This would be the independent variable . How the moth reacts to the different light levels (such as its distance to the light source) would be the dependent variable .
As another example, say you want to know whether eating breakfast affects student test scores. The factor under the experimenter's control is the presence or absence of breakfast, so you know it is the independent variable. The experiment measures test scores of students who ate breakfast versus those who did not. Theoretically, the test results depend on breakfast, so the test results are the dependent variable. Note that test scores are the dependent variable even if it turns out there is no relationship between scores and breakfast.
For another experiment, a scientist wants to determine whether one drug is more effective than another at controlling high blood pressure. The independent variable is the drug, while the patient's blood pressure is the dependent variable. In some ways, this experiment resembles the one with breakfast and test scores. However, when comparing two different treatments, such as drug A and drug B, it's usual to add a control condition: a placebo that contains the same inactive ingredients as the drugs makes it possible to tell whether either drug actually affects blood pressure.
The independent and dependent variables in an experiment may be viewed in terms of cause and effect. If the independent variable is changed, then an effect is seen, or measured, in the dependent variable. Remember, the values of both variables may change in an experiment and are recorded. The difference is that the value of the independent variable is controlled by the experimenter, while the value of the dependent variable only changes in response to the independent variable.
When results are plotted in graphs, the convention is to use the independent variable as the x-axis and the dependent variable as the y-axis. The DRY MIX acronym can help keep the variables straight:
D is the dependent variable
R is the responding variable
Y is the axis on which the dependent or responding variable is graphed (the vertical axis)
M is the manipulated variable, or the one that is changed in an experiment
I is the independent variable
X is the axis on which the independent or manipulated variable is graphed (the horizontal axis)
Scientific Reports volume 14, Article number: 20377 (2024)
Portable X-ray Fluorescence probe (pXRF) is a tool used to measure many elements quickly and efficiently in soil with minimal sample preparation. Although this sensing technique has been widely used to determine total elemental concentrations, it has not been calibrated for plant-available nutrient predictions. We evaluated the potential of using pXRF for fast plant-available nutrient quantification. Two experiments were conducted in soils treated with two types of biochars to obtain a practical range of soil pH (5.5 − 8.0) and organic carbon (2.0 − 5.5%). Biochars applied were derived from switchgrass (SGB) and poultry litter (PLB). The first experiment received biochars at application rates up to 8% (w/w) and had no plants. The second experiment had up to 4% of SGB or PLB planted with ryegrass ( Lolium perenne ). Linear regression (LR), polynomial regression (PolR), power regression (PowR), and stepwise multiple linear regression (SMLR) were the models tested. Regardless of the extraction method, phosphorus (P) showed a strong relationship between pXRF and several laboratory extraction methods; however, K prediction via pXRF was sensitive to the plant factor. The optimum soil available-P corresponding to the maximum P uptake in plant tissues can be assessed with pXRF. The LR was inconsistent for calcium (Ca), sulfur (S), and copper (Cu) and non-significant for magnesium (Mg), iron (Fe), and zinc (Zn). Our results showed that pXRF is applicable to estimate P availability in soils receiving organic amendments. More evaluations are needed with diverse soil types to confirm the findings before using pXRF for fertilizer recommendation.
Introduction.
Plant-available nutrient concentration in the soil is one of the important factors for fertilizer recommendations, and it is traditionally obtained through time-consuming and costly laboratory methods using wet chemistry (WC). As a proximal sensor, portable X-ray fluorescence (pXRF) spectrometry can be a potential tool in precision agriculture, quantifying several elements in a matter of seconds to minutes 1 , 2 . A drawback is that the elemental concentrations obtained by this technique are total amounts, which include free ions in soil solution, elements in the structure of soil minerals and organic matter, and nutrients strongly adsorbed or fixed to clay-sized particles 3 . Fortunately, with the assistance of mathematical and statistical techniques, the total concentrations of elements determined by pXRF can be used to predict soil exchangeable/available nutrient concentrations 4 . For this, prediction models can be developed and validated to reduce the cost and time required by traditional WC laboratory analyses of nutrient concentrations 3 , 5 . Hence, it becomes important to evaluate the precision of measurements and the accuracy of prediction models using pXRF 1 , 6 .
Some of the current literature is inconsistent regarding the prediction of soil nutrient availability using pXRF. For example, XRF analysis of dry spectra has been shown to reliably predict exchangeable K 7 . XRF has also been reported to accurately predict exchangeable Ca and available P 3 , although the same study reported that available K was not successfully predicted. To date, research on predicting the availability of soil micronutrients (Cu, Fe, Mn, Zn, etc.) using pXRF is nearly non-existent. Accordingly, more efficient methods for soil analysis using pXRF are needed to complement precision nutrient management through more accurate and faster fertilizer recommendations.
One of the reasons behind the lack of good relationships between pXRF measurements and plant-available nutrients from WC might be the traditional extraction method used by some laboratories. Therefore, it is essential to explore more than one conventional method for a single nutrient and to assess the relationships between those available/exchangeable amounts and the total amounts from pXRF. This can identify which method is best predicted by pXRF for determining soil nutrient availability. Among the traditional methods used to assess nutrient availability in soils are (1) Mehlich-3 (M3), to determine phosphorus (P), potassium (K), calcium (Ca), magnesium (Mg), sulfur (S), copper (Cu), iron (Fe), manganese (Mn), and zinc (Zn); (2) monocalcium phosphate (MCP) for S; (3) ammonium acetate for exchangeable Ca, Mg, and K; (4) DTPA for most of the micronutrients; and (5) methods developed specifically for soil available P, such as Olsen, Bray, and water extraction.
Hence, various methods are employed to extract available phosphorus (P), potassium (K), calcium (Ca), magnesium (Mg), and micronutrients from soil, each with its specific principles and applications. The Mehlich-3 (M3) extraction method is widely used for its effectiveness in extracting P, K, Ca, Mg, and micronutrients simultaneously, making it suitable for multi-nutrient analysis. Bray and Olsen's methods are specific to P extraction, with Bray suitable for acidic soils and Olsen for alkaline soils. Water-extractable phosphorus (WEP) offers a rapid assessment of soluble P but may not capture all plant-available forms. DTPA extraction is common for assessing micronutrient availability, while ammonium acetate is used for K, Ca, and Mg in soils with low cation exchange capacity (CEC). Selection depends on soil type, pH, nutrient of interest, and analysis goals, emphasizing the need for method choice based on soil characteristics for accurate nutrient management.
Biochar's application as a soil amendment proves invaluable when assessing the reliability of pXRF measurements in predicting nutrient availability. This is primarily because the incorporation of biochar into the soil provides a diverse range of soil pH levels and organic carbon (OC) content, mirroring the natural conditions of many agricultural soils 8 . Moreover, broad literature investigating biochar’s positive effects on soil properties has commonly found that increased soil nutrient levels are one of the main benefits 9 . As an example, the application of poultry-based biochars has been shown to increase both total and plant available P and K concentrations in amended soils 10 , 11 .
We undertook two comprehensive studies to assess the viability of utilizing X-ray fluorescence (XRF) as a predictive tool for plant-available nutrient estimation. In one of these studies, we introduced ryegrass cultivation to capture potential alterations in nutrient dynamics that could influence the accuracy of portable XRF measurements. We evaluated the prediction of exchangeable P, K, Ca, Mg, S, Cu, Fe, and Zn from pXRF elemental data by identifying the best possible regression model. To the best of the authors' knowledge, no work has compared the prediction of soil available nutrients by pXRF with several traditional methods, such as those previously mentioned, by evaluating relationships with and without plants grown in the trials. Furthermore, the utilization of distinct biochars tailored to produce a practical range of soil OC and pH levels exemplifies the uniqueness of this research and justifies the assessment of several extraction methods for comparison. Finally, the biochar application promotes an exhaustive evaluation of the pXRF technique, since amending the soil with organic material makes the analyzed matrix even more heterogeneous. We hypothesize that XRF can estimate plant-available nutrients in soil with accuracy comparable to traditional soil testing methods, regardless of the presence of plants or the application of distinct biochars.
Biochar production and characterization.
The biochars were produced from two feedstocks, switchgrass ( Panicum virgatum ) and poultry litter, using high-temperature slow pyrolysis (700 °C). The details of the switchgrass-derived biochar (SGB) and poultry litter-derived biochar (PLB) were described in Antonangelo and Zhang 12 . The coarse biochar materials were evenly and gently ground with a mortar and pestle before being sieved through 1- and 0.25-mm sieves for physicochemical analyses and experimentation, respectively 13 . The SGB and PLB characterization, including major physicochemical properties, can be found in Table 1 and in Antonangelo et al. 14 . In the experiments, the two distinct biochars were applied separately at elevated rates, as will be elaborated upon later.
The soil chosen for our experiment constituted a composite sample derived from merging three individual subsamples (equal volumes) to reduce heterogeneity, all obtained from a depth range of 0–15 cm. These subsamples were collected from locations proximate to chat piles (debris from milling operations), once used for agricultural purposes, and soil yards within the Tar Creek region, situated in Picher, Ottawa County, Oklahoma, a site characterized by significant metal contamination. However, the contaminated sites have undergone various remediation measures and have been repurposed for feed and food production or recreational use, provided they meet appropriate guidelines set forth by the EPA under the Comprehensive Environmental Response, Compensation, and Liability Act (CERCLA) and the Resource Conservation and Recovery Act (RCRA). The initial characteristics of both the subsamples and the resulting composite sample can be found in Supplementary Table S1 . Notably, our study leveraged the same portable X-ray fluorescence (pXRF) instrument that had previously been validated and calibrated for measuring heavy metal concentrations 15 . This approach allowed us to broaden the scope of analysis to encompass soil nutrient availability, thereby serving a dual purpose: the assessment of metal contaminants and the prediction of nutrient accessibility within the soil. This information holds particular relevance for potential land use scenarios, such as phytoremediation initiatives or agricultural cultivation in these areas.
In the first experiment (no plants), 200 g of 2-mm sieved soil was mixed with 0.0 (control), 1.0, 2.0, 4.0, and 8.0% (w/w) of 0.25-mm sieved SGB or PLB in plastic containers. The mixtures were then incubated and monitored for 10 weeks, keeping the experimental units at 75% of field capacity before sampling and analyses. The pots were arranged in a laboratory environment in a completely randomized design (CRD) with 3 replications ( n = 3).
In a second trial, ryegrass was cultivated in a pot experiment. 1.2 kg of the dried and sieved soil was mixed with 0.0 (control), 0.5, 1.0, 2.0, and 4.0% (w/w) of the same 0.25-mm sieved SGB and PLB in plastic pots. The 8.0% treatment used in the 1st experiment was excluded because a preliminary experiment showed ryegrass growth was negatively impacted at this rate of biochar application 12 . Each pot was supplied with a uniform quantity of nitrogen (N) determined from the soil test specifically for grass production, with the N contribution from the biochars subtracted. Given that the biochars already supplied sufficient phosphorus (P) and potassium (K), only the control group received extra P and K supplementation. The soil + biochar mixture was incubated for 30 days at 75% of field capacity before ryegrass sowing (each pot was sown at a rate of 30 kg ha −1 as estimated from the pot surface area). Ryegrass growth was monitored for 75 days in an environmentally controlled growth chamber. The pots were arranged in a CRD with 3 replications ( n = 3). Pots were rotated once a week to eliminate spatial variability in the growth chamber. Procedures for the pot experiment were detailed in Antonangelo and Zhang 12 .
Across experiments, biochars added the following amounts of major macro- and micro-nutrients (in kg ha ‒1 ) for SGB (0.5‒8%): P (20‒320), K (40‒640), Ca (80‒1280), Mg (30‒480), S (4‒64), Cu (0.23‒3.68), Fe (1.11‒17.8), and Zn (0.54‒8.64); and for PLB (0.5‒8%): P (400‒6400), K (80‒12,800), Ca (500‒8000), Mg (200‒3200), S (100‒1600), Cu (2.53‒40.5), Fe (69‒1104), and Zn (14.8‒236).
Soil analysis at the Soil, Water, and Forage Analytical Laboratory (SWFAL) at Oklahoma State University was conducted using conventional WC techniques, following well-established procedures documented in the scientific literature. Laboratory results were maintained through standards, blank samples, and internal and external checks. Blank samples were used to verify each analysis, while internal reference samples were employed every 10 to 20 samples. If a check failed, the entire batch was reanalyzed. All results were double-checked for accuracy and reviewed for issues.
Drying and grinding can chemically and physically homogenize soil samples and affect the accuracy of prediction models; however, we used the same procedures of sample preparation in the two experiments to ensure the same conditions for comparison purposes between pXRF and traditional WC methods. The "traditional methods" refer to well-established and conventional techniques that have been commonly used and accepted in the field of soil science. These methods are typically recognized for their reliability and have a history of being employed as standard practices for nutrient extraction and analysis.
In the 1 st experiment (no plants), soil samples were dried, sieved to a 2mm size, and analyzed for pH, soil OC, and extractable P, K, Ca, Mg, S, Fe, Cu, and Zn. The soil pH was determined in deionized water (DI) with a 1:1 soil-to-water ratio 16 . The OC was determined with dry combustion 17 using a LECO 828 carbon and nitrogen analyzer (St. Joseph, MI). Plant-available P, K, Ca, and Mg were extracted by shaking 2 g of soil in 20 mL of Mehlich 3 (M3) solution (0.001 M EDTA, 0.015 M NH 4 F, 0.2 M CH 3 COOH, 0.25 M NH 4 NO 3 , 0.013 M HNO 3 ; pH buffered to 2.5) for 5 min 18 and quantified by inductively coupled plasma atomic emission spectroscopy (ICP-AES). Plant-available S was extracted by shaking 10 g of soil in 25 mL of 0.008 M MCP ([Ca(H 2 PO 4 ) 2 .H 2 O]) for 30 min 19 and determined with an ICP-AES. Phytoavailable micronutrients (Fe, Cu, and Zn) were analyzed by adding 20 mL of DTPA (0.005 M DTPA, 0.01 M CaCl 2 .2H 2 O, 0.2 M Sorbitol, 0.11 M triethanolamine, 0.05 M HCl; pH buffered to 7.3) to 10 g of soil, shaken for 2 h 20 , and quantified with ICP-AES.
In the second experiment, we employed a variety of traditional methods to enhance the global relevance of our study. It must be emphasized that, on a global scale, soil nutrient extraction methods carry inherent uncertainty, influenced by various factors. Within a lab, precision can be improved through standardization, calibration, and skilled technicians, but slight variations may still occur due to sample homogenization and instrument precision. Between labs, uncertainty grows due to equipment differences, calibration variations, and technician expertise. To reduce inter-laboratory disparities and enhance soil nutrient analysis accuracy, standardized protocols, and reference materials are essential.
Additionally, we introduced a plant factor to investigate its potential influence on pXRF measurements, as well as to assess the correlation between soil nutrient levels obtained via pXRF and those acquired through various extraction methods, concerning nutrient concentrations in plant tissues. In both experiments, we treated the soils with elevated doses of biochar, thereby creating a practical range of soil pH and organic carbon (OC) levels suitable for agricultural applications, as illustrated in Fig. 1 .
Boxplot of soil pH and organic carbon (OC) after switchgrass- and poultry litter-derived biochar application rates. The squared symbols denote the average. Boxes span the 25th to 75th data percentile, whiskers represent 1.5 × the interquartile range, and horizontal lines denote the median.
At the end of the pot experiment, the dried and sieved soil samples were analyzed by the same methods described above. In addition, plant available P was also determined by Bray-1, Olsen, and Water Extractable Phosphorus (WEP); Ca, Mg, and K by ammonium acetate (NH 4 Ac); and S, Zn, Cu, and Fe by M3 (determined in the same filtrate used for M3-P, -K, -Ca, and -Mg determination). The Olsen-P and Bray-P consist of the phosphate (PO 4 -P) extraction using a 0.5 N NaHCO 3 solution adjusted to pH 8.5 21 and a dilute acid solution of hydrochloric acid containing ammonium fluoride 22 , respectively. Ammonium molybdate and antimony potassium tartrate were added so that the orthophosphate ion reacted to form a complex. The complex was reduced with ascorbic acid to form a blue-colored complex that absorbs light at 880 nm in a spectrometer, instrument used for Olsen-P and Bray-P colorimetric determination. The absorbance is proportional to the concentration of orthophosphate in the sample. For the WEP extraction, DI water was used 23 : 2.0 g of soil was weighed and placed in 50 mL centrifuge tubes, then 20 mL of DI water was added to create soil-to-water extraction ratios of 1:10 (w:v). After the addition of DI water, tubes were shaken on an end-over-end shaker for 1-h, centrifuged at 5000 rpm for 5 min, and filtered through a 0.45 µm glass filter paper before determination by an ICP-AES. For K, Ca, and Mg extracted with ammonium acetate, the NH 4 Ac solution was buffered at pH 7.0, and 20 mL was added to a 2 g of dried soil into a 50 mL container, then the mixture was slowly shaken in a reciprocating shaker for 1 h, filtered and analyzed 24 by an ICP-AES.
All X-ray fluorescence (XRF) measurements were performed with a portable XRF (pXRF), the TRACER 5i Portable/Handheld XRF Spectrometer (Bruker, Kennewick, WA, USA). A full description of the instrument can be found in Zhang et al. 15 . The ‘Soil Nutrient and Metal’ calibration provided by the manufacturer was used for all measurements. As shown in Fig. 2 , the pXRF was mounted upside down on a stand; samples were packed in a sample holder and placed on the window, and readings were taken in duplicate at 50 kV and 39 µA 15 . Samples were analyzed using 2 phase scans of 30 s each, totaling 60 s per reading.
(a) The sample is carefully positioned to ensure it fully occupies the portable X-ray fluorescence (pXRF) sample holder, specifically designed for 32 mm XRF sample cups. (b) Subsequently, the sample holder, housing the specimen earmarked for analysis, is securely inserted into the inverted mount of the pXRF device. Within this configuration, pXRF measurements are conducted and automatically recorded on a connected computer system. This graphical representation has been adapted from the research conducted by Zhang et al. (2021).
A certified reference material (CRM), ‘Metal-rich reference sediment’ (SdAR-M2, International Association of Geoanalysts, Keyworth, Nottingham, UK), was included in the determination of the total elemental concentrations by pXRF. Before analysis, the accuracy of the equipment was checked using the CRM. The average ± standard deviation (SD) for the concentrations of the elements of interest (P, K, Ca, Mg, S, Fe, Cu, and Zn) of both actual samples and CRM were calculated along with the difference between reported values of the CRM and CRM measurements (Supplementary Table S2 ). For multivariate modeling, the elements selected were those obtained for all samples by pXRF: P, K, S, Ca, Mg, Fe, Cu, Zn, Al, Si, Ti, V, Cr, Mn, Ni, As, Rb, Sr, Zr, Ba, Pb. Descriptive statistics of all other elements (excluding the elements of interest) are presented in Supplementary Table S3 , and their accuracy is in Supplementary Table S4 .
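The arithmetic behind such an accuracy check can be sketched as follows (the readings and certified value are invented for illustration, not the actual SdAR-M2 data):

```python
import numpy as np

# Invented duplicate pXRF readings of a certified reference material (mg/kg)
# and an invented certified value -- not the actual SdAR-M2 numbers.
crm_readings = np.array([812.0, 798.0, 805.0, 820.0])
certified = 810.0

mean = crm_readings.mean()
sd = crm_readings.std(ddof=1)          # sample standard deviation
recovery = 100.0 * mean / certified    # percent recovery vs certified value
diff = mean - certified                # difference from the reported value
print(f"measured {mean:.1f} ± {sd:.1f} mg/kg, recovery {recovery:.1f}%")
```

Recovery close to 100% (and a small difference relative to the certified value) indicates the instrument readings can be trusted for the element in question.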
The K-edge absorption energies and the pXRF limit of detection (LOD) for the elements of interest are summarized in Table 2 . Ranges in soil nutrient concentrations determined by the traditional methods and pXRF are shown in Table 3 .
Ryegrass shoots were harvested, washed with DI water, and oven-dried at 105 °C until constant weight. Dried plant materials were ground using a mechanical grinder to pass through a 1-mm screen and then analyzed for P, K, Ca, Mg, S, Fe, Cu, and Zn using EPA method 3050B 25 ‒ digestion by concentrated HNO 3 and H 2 O 2 . To determine nutrient concentrations in plant tissues, 0.5 g of ground plant material was predigested for 1 h with 10 mL of trace-metal-grade HNO 3 in a HotBlock™ Environmental Express block digester (Environmental Express, 2345A Charleston Regional Parkway, Charleston, SC 29492, United States), and the digests were then heated to 115 °C for 2 h and diluted with DI water to 50 mL 26 . Finally, the digested samples were filtered, and P, K, Ca, Mg, S, Fe, Cu, and Zn were determined by ICP-AES.
Data analysis was conducted by following established procedures from the existing literature, drawing from studies that have previously calibrated and modeled predictions for soil nutrient availability using portable X-ray fluorescence (pXRF) technology.
Linear regression (LR), 2nd-degree polynomial regression (PolR), power regression (PowR), and stepwise multiple linear regression (SMLR) were applied to predict soil nutrient concentrations. LR, PolR, and PowR were generated considering the elemental concentrations obtained by pXRF as independent variables and the available concentrations as dependent variables 3 . To assess the best model, the coefficient of determination (R 2 ), root mean square error (RMSE), and Akaike Information Criterion ( AICc ) were evaluated.
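How such models can be compared on R 2 , RMSE, and AICc is sketched below with simulated data standing in for the pXRF and extraction measurements (this is not the study's actual fitting code; the relationship and noise level are invented):

```python
import numpy as np

def fit_and_score(x, y, degree):
    """Fit a polynomial of the given degree; return R^2, RMSE, and AICc."""
    coeffs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coeffs, x)
    n = len(y)
    k = degree + 2                      # (degree+1) coefficients + error variance
    rss = float(np.sum(resid ** 2))
    r2 = 1.0 - rss / float(np.sum((y - y.mean()) ** 2))
    rmse = np.sqrt(rss / n)
    aic = n * np.log(rss / n) + 2 * k
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)   # small-sample correction
    return r2, rmse, aicc

# Simulated stand-ins: x ~ total element by pXRF, y ~ extractable concentration
rng = np.random.default_rng(0)
x = np.linspace(10, 100, 27)            # n = 27, as in this study's dataset
y = 0.4 * x + 5 + rng.normal(0, 3, x.size)

results = {name: fit_and_score(x, y, deg) for name, deg in [("LR", 1), ("PolR", 2)]}
for name, (r2, rmse, aicc) in results.items():
    print(f"{name}: R2 = {r2:.3f}, RMSE = {rmse:.2f}, AICc = {aicc:.1f}")
```

Because a higher-degree polynomial always fits at least as well in-sample, AICc's penalty on extra parameters is what lets the simpler LR win when the added flexibility buys little.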
For the 1st experiment, the rates of both biochars were tested separately, and their results were combined to verify the relationship between increased rates of biochar application and soil nutrient increment (Supplementary Table S5 ). Since the results did not exhibit a clear trend, analyses of covariance (ANCOVA) between independent and dependent variables were performed for both experiments in JMP 15. According to the results (Table 4 ), it was reasonable to combine the data from soils treated with either SGB or PLB to evaluate the regression models. Although multiple regression models were tested, we plotted only the significant linear relationships (LR), since these are easiest to use in practice.
Pearson, Spearman, and Kendall correlation coefficients were calculated between biochar rates and results from pXRF with traditional WC methods when appropriate.
For the stepwise multiple linear regression (SMLR), all 21 soil elements provided by pXRF measurements were used as independent variables to predict soil nutrient availability. This was to account for any elemental interference that can occur due to overlapping spectra. The SMLR was generated with the whole dataset of measurements in JMP Pro 15 using the backward method 27 as described in Pelegrino et al. 3 . Initially, the model contains all variables; then, the least statistically significant ones are removed, and the remaining variables comprise the final SMLR model. In this study, variable removal was based on the AICc because it is more appropriate for finding the best model for predicting future observations 28 .
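The backward-elimination idea can be sketched as follows (simulated data; this illustrates AICc-based backward selection in general, not JMP's exact algorithm, and the element names are placeholders):

```python
import numpy as np

def aicc(rss, n, k):
    """AICc for a Gaussian model with k estimated parameters."""
    aic = n * np.log(rss / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)

def ols_rss(D, y):
    """Residual sum of squares from an ordinary least-squares fit."""
    beta, *_ = np.linalg.lstsq(D, y, rcond=None)
    return float(np.sum((y - D @ beta) ** 2))

def backward_stepwise(X, y, names):
    """Drop one predictor at a time as long as AICc improves."""
    n = len(y)
    keep = list(range(X.shape[1]))
    def score(cols):
        D = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
        return aicc(ols_rss(D, y), n, len(cols) + 2)  # slopes + intercept + variance
    best = score(keep)
    while len(keep) > 1:
        s, c = min((score([k for k in keep if k != d]), d) for d in keep)
        if s < best:
            best, keep = s, [k for k in keep if k != c]
        else:
            break
    return [names[c] for c in keep]

# Simulated data: only the first two "elements" truly drive availability
rng = np.random.default_rng(1)
X = rng.normal(size=(27, 5))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.5, 27)
selected = backward_stepwise(X, y, ["P", "K", "Ca", "Fe", "Zn"])
print(selected)
```

With n = 27 observations, the AICc penalty grows quickly with each retained parameter, which is why it tends to prune predictors that contribute little beyond noise.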
The stepwise regression process used statistical criteria to automatically select variables, indicating that multicollinearity, if present, did not significantly affect the model's performance. Post-modeling diagnostics, such as correlation assessments and variance inflation factor calculations, confirmed the independence of these variables, supporting the conclusion that multicollinearity was not a substantial concern. Therefore, the simultaneous inclusion of nutrients and XRF-derived elements in the SMLR model is scientifically justified, considering their distinct contributions and the absence of multicollinearity issues.
Non-linear segmented models were fitted only for nutrients that showed a meaningful relationship between pXRF and traditional methods in the 2nd experiment. As in Stammer and Mallarino 29 , the soil-test and pXRF P and K concentrations corresponding to maximum P and K concentrations in ryegrass shoots were determined by fitting segmented linear–plateau (LP) response models using the NLIN (non-linear) procedure of SAS version 9.4. Models were accepted only when the NLIN convergence criterion was successfully met and the model was significant at least at p < 0.05.
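A linear–plateau fit of this general form can be sketched in Python (illustrative data only; the study used SAS PROC NLIN, so this scipy-based version is merely an analogous implementation):

```python
import numpy as np
from scipy.optimize import curve_fit

def linear_plateau(x, a, b, x0):
    """Linear increase up to the join point x0, then a constant plateau."""
    return np.where(x < x0, a + b * x, a + b * x0)

# Illustrative data: shoot uptake rises with soil-test value, then plateaus at 40
x = np.array([5, 10, 15, 20, 25, 30, 35, 40, 50, 60, 70, 80], dtype=float)
y = np.where(x < 40, 2.0 + 0.5 * x, 22.0)   # plateau value = 2 + 0.5 * 40

popt, _ = curve_fit(linear_plateau, x, y, p0=[1.0, 0.4, 30.0])
a, b, x0 = popt
print(f"estimated join point (critical soil-test value) ~ {x0:.1f}")
```

The fitted join point x0 is the quantity of agronomic interest here: beyond it, further increases in the soil-test value no longer raise the modeled plant response.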
All data analysis was performed using the whole dataset of measurements ( n = 27) and graphs were created using Excel.
Relationships in biochar amended soils without plants (1st experiment).
A strong linear relationship was obtained between pXRF and traditional WC methods for P, K, S, Ca, and Cu (Fig. 3 ). The LR was better for those nutrients when compared to the power regression (PowR, Table 5 ). Although PolR was slightly better in terms of AICc and RMSE, the differences were negligible, which makes adopting LR reliable and better in practice. Furthermore, caution is needed when applying the PolR model, as it must be determined whether there is empirical or theoretical support for a non-linear relationship between plant-available nutrient concentration and total element levels. Specifically, it must be established whether nutrients increase with total elements up to a certain point and then decline as total elements rise; this is vital for accurate model predictions in the context of nutrient–plant interactions. Unfortunately, no relationships were found for Mg, Fe, and Zn when evaluating LR, PolR, and PowR (Table 5 ). On the other hand, the SMLR provided better results for all nutrients and significant results for Mg, Fe, and Zn, given the drastic change in all statistical parameters evaluated ( AICc , RMSE, and R 2 ) (Table 5 and Supplementary Fig. S1 ). However, adopting SMLR throughout might not be feasible in practice if different commercial pXRF instruments, each with its own calibration method, LOD, and set of analyzed elements 30 , are used, unless a clear and consistent trend is observed globally. In our experiment, the variables impacting the prediction of soil available nutrients via pXRF are presented in Table 6 . This will be further discussed for comparison and to elucidate similar trends between the two experiments.
Linear relationship between soil nutrients determined by pXRF and traditional methods (1st experiment). *** : p < .001. RMSE: root mean square error.
In contrast to the work reported by Andrade et al. 31 , soil Zn availability was not predicted by pXRF in our experiment, although the positive Cu prediction agreed with the results demonstrated by those authors. Pierangeli et al. 32 also achieved successful prediction of available Cu (R 2 = 0.80) using pXRF, although those authors compared pXRF results with those from Mehlich-1 extraction. The inability to predict available Fe was also observed by Andrade et al. 31 . The area from which soils were collected in our experiment has a parent material rich in Zn 8 , such as sphalerite, smithsonite 33 , and hemimorphite 34 , which might have confounded the relationship between available Zn from DTPA and total Zn from pXRF. The prediction of available micronutrients using pXRF was further investigated in experiment 2 to verify the consistency of the results.
Overall, our results were similar to those obtained by Pelegrino et al. 3 for P, K, and Ca, even though those authors performed their experiment in acidic tropical soils, which is not the case for ours. Our exchangeable-K predictions were consistent and reliable in the first trial, which disagrees with the observations of Pelegrino et al. 3 ; however, those authors worked on agricultural lands where several crops had been cultivated over the years, which might have influenced their results. The consistency of our soil incubation results was further evaluated in the second experiment, where plants were grown.
Soil nutrient availability.
Overall, LR models could be established only for P and K, regardless of the extraction method used to determine their availability with traditional WC methods; the LR model was not significant for S, Ca, Cu, Mg, Fe, and Zn (Fig. 4 ). For P and K, the Pearson r-values ranged from 0.87 to 0.98 and from 0.69 to 0.96 (data not shown), respectively, across both experiments (p < 0.001) when comparing the several traditional methods and pXRF measurements. As in the 1st experiment, the differences among the statistical parameters of the LR, PolR, and PowR models were negligible (except for K in some cases), which makes LR models more advantageous, mainly for predicting soil P availability (Table 5 and Fig. 4 ).
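The Pearson correlations reported above can be computed as in the sketch below; the paired values are simulated stand-ins for pXRF-P and Mehlich-3 P, not the study's data.

```python
import numpy as np
from scipy import stats

# simulated paired measurements for n = 27 samples (mg/kg)
rng = np.random.default_rng(7)
m3_p = rng.uniform(20, 200, 27)               # stand-in for Mehlich-3 P
pxrf_p = 1.1 * m3_p + rng.normal(0, 10, 27)   # stand-in for pXRF total P

# two-sided test of the null hypothesis of zero correlation
r, p_value = stats.pearsonr(pxrf_p, m3_p)
```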
Linear relationship between soil nutrients determined by pXRF and several traditional methods (2nd experiment). ***: p < 0.001. RMSE: root mean square error. Please note that the control sample has a P concentration of 25 mg kg −1 , which is lower than the critical threshold and below the LOD of the XRF analysis. Approximately two-thirds of the samples are closely clustered around the control sample. Although the overall range spans up to 200 mg kg −1 , many of these samples exhibit P concentrations above the LOD, indicating detectable variations in P content among them.
In the 2nd experiment, the linear relationship between P-pXRF and available P from several extraction methods was as strong as in the 1st experiment (Figs. 3 and 4 ). The reason that results comparing pXRF-P with acidic extraction methods, such as M3 and Bray, offer a close 1:1 relationship (given the slopes in Figs. 3 and 4 ) is likely that the strong acidity of such extraction solutions overestimates available P by dissolving precipitated complexes of calcium and magnesium phosphates formed at higher pHs. This is supported by the highly positive correlations (r-values) among P, Ca, and Mg in both experiments (Supplementary Fig. S2 ).
In contrast, the work of Pelegrino et al. 3 revealed that soil available-P prediction with pXRF was poor although significant; the authors attributed this finding to the presence of iron (Fe) and aluminum (Al) oxides in the acidic tropical soils where their study was carried out 35 , which is not the case in our experiment. Those fixed forms of P in the soil are not plant-available and are not detected by conventional extraction methods, although they are detected by the XRF technique. However, soil has a finite capacity to adsorb P onto clay complexes; upon reaching saturation, comparisons between total and available P become more accurate. On the other hand, results similar to those of Pelegrino et al. 3 regarding soil K availability were observed in our 2nd experiment, most likely for the same reasons. The inability of the regression models to establish good relationships between total K concentrations determined by pXRF and soil available K from traditional WC methods is attributed to the fact that total K concentrations include several forms of K in the soil. In addition to the available concentrations, fixed K associated with hydroxy-interlayered vermiculite and structural K 36 associated mainly with the crystalline structures of muscovite 37 , microcline, orthoclase, and biotite (Supplementary Table S6 ) also exist in soils. Those non-available K forms might become available in the presence of plants 37 ; however, such availability is not detected with conventional extraction methods. This will be further discussed in the next subsection.
The prediction of soil available Ca, S, and Cu was inconsistent, and results varied from one experiment to another. In the 2nd experiment, none of the common regression models (LR, PolR, PowR) exhibited a significant fit, unlike in the 1st trial (Table 5 ), and the variation was greater for S and Cu than for Ca. However, SMLR was successful for those nutrients, and a common variable, Al, affecting the prediction of soil available Ca was observed in both experiments when using M3 (Tables 5 and 6 ). Hornblende, augite, and anorthite are primary minerals containing Ca, Al, and Si that are relatively resistant to weathering (Supplementary Table S6 ); thus, they do not provide readily available Ca in the soil, although they are easily measured with a pXRF.
As in the 1st experiment, SMLR succeeded in predicting the availability of Mg, Fe, and Zn (Supplementary Fig. S3 ), nutrients unsuccessfully fitted by the common regression models (LR, PolR, and PowR) (Tables 5 and 6 ). It is assumed that those elements are mostly found in non-available soil fractions measured by pXRF, and Mg and Fe are omnipresent components of those mineral fractions. This is further evidenced by evaluating the variables most affecting Mg and Fe prediction in both experiments. When comparing the same extraction solution, Al, Si, and Ca are the elements commonly affecting the prediction of available Mg and Fe (Table 6 ) and are components of Mg/Fe-bearing minerals such as biotite, hornblende, augite, olivine (Supplementary Table S6 ), and dolomite, the last being an Mg-based mineral found in the study area 34 .
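A minimal backward-elimination sketch of SMLR is shown below, assuming p-value-based removal at α = 0.05 (the paper does not specify its selection criterion). The predictors ("Al", "Si", and an inert variable) and the response are synthetic, chosen only to show the mechanics.

```python
import numpy as np
from scipy import stats

def backward_stepwise(X, y, names, alpha=0.05):
    """Repeatedly drop the least significant predictor until all p < alpha."""
    keep = list(range(X.shape[1]))
    while keep:
        Xc = np.column_stack([np.ones(len(y)), X[:, keep]])
        beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
        resid = y - Xc @ beta
        dof = len(y) - Xc.shape[1]
        s2 = (resid @ resid) / dof                      # error variance
        se = np.sqrt(np.diag(s2 * np.linalg.inv(Xc.T @ Xc)))
        p = 2 * stats.t.sf(np.abs(beta / se), dof)      # two-sided t-test
        worst = int(np.argmax(p[1:]))                   # skip the intercept
        if p[1:][worst] > alpha:
            keep.pop(worst)
        else:
            break
    return [names[i] for i in keep]

# synthetic pXRF predictors: Al and Si drive y, "inert" does not
rng = np.random.default_rng(1)
n = 27
al = rng.normal(50, 10, n)
si = rng.normal(200, 30, n)
inert = rng.normal(0, 1, n)
y = 2 * al + 0.5 * si + rng.normal(0, 5, n)

selected = backward_stepwise(np.column_stack([al, si, inert]),
                             y, ["Al", "Si", "inert"])
```

With real data, the retained predictor names would correspond to the influential elements listed in Table 6.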
In the work of Dasgupta et al. 38 , the agroclimatic zone also appeared influential in predicting available K, Mg, Zn, and Fe with a pXRF. In contrast, Mancini et al. 39 found good predictions of exchangeable/available Ca and Mg; however, those authors used a combined/fused application of pXRF + Vis–NIR + NixPro™. Similarly, available-Fe prediction was successfully achieved only with a combined application of pXRF and magnetic susceptibility 32 . On the other hand, the exchangeable-K prediction of Dasgupta et al. 38 was as poor as ours in the second experiment, with R 2 values less than 0.5 (Fig. 4 ).
Soil available Mg could not be predicted with pXRF in either experiment, which might be a consequence of the complex and heterogeneous soil matrix making Mg-pXRF measurements very sensitive. Portable XRF faces limitations with light elements such as Mg 40 because of Mg's low atomic number (Z = 12), which results in reduced sensitivity and detectability in pXRF instruments. The detection limit in pXRF is affected by atomic number, with lower values yielding weaker X-ray signals. Additionally, matrix effects, caused by complex sample compositions or the presence of other elements, can hinder accurate Mg quantification by influencing X-ray absorption and scattering. Furthermore, the X-ray properties of light elements like Mg are less favorable for pXRF, leading to weaker emissions and potential difficulties in detection and quantification. Spectral interferences from neighboring elements in the X-ray spectrum can further complicate Mg analysis by hindering accurate differentiation and quantification.
The inconsistency of exchangeable Ca from one experiment to another is also attributed to the heterogeneity of the soil matrix, mainly because the plant factor was added. This is further supported by the work of Benedet et al. 6 , who reported a reliable relationship between Ca/Mg determined by traditional WC-based laboratory methods and pXRF; however, they evaluated limestone and lime-based materials, which offer a more homogeneous matrix for pXRF measurements.
Portable X-ray fluorescence spectrometry is a valuable analytical technique for elemental analysis in soil and environmental science. When assessing biochar and soil matrices separately, as well as in combination, several factors can influence XRF measurements. Biochar, a carbonaceous material resulting from organic matter pyrolysis, may exhibit low mineral content, potentially resulting in weaker XRF signals for mineral elements such as Si, Ca, and Fe. Soil, on the other hand, is a complex matrix containing diverse mineral compositions, depending on type and location, often with high concentrations of elements like Si, Al, Fe, and Ca. Elemental interference can occur due to overlapping spectra in the matrices, and variations in density, homogeneity, particle size, and sample preparation can impact measurement accuracy. When biochar and soil are combined, mixing ratios, homogenization, sample heterogeneity, and the need for specialized calibration standards become crucial considerations. In conclusion, successful XRF analysis of biochar, soil, or their mixtures necessitates careful attention to these factors, meticulous sample preparation, and calibration to mitigate matrix effects and ensure the reliability of results in diverse environmental contexts.
It is crucial to highlight that in our experiment, SGB contains a notably high carbon (C) content when compared to PLB, which, conversely, exhibits a significantly higher ash content. Consequently, the enrichment of soil nutrients stemming from the latter biochar aligns more effectively with the escalating rates of biochar application. In this context, it is essential to underscore the substantial influence of carbon presence within a sample matrix on pXRF spectroscopy outcomes. Carbon's X-ray absorption and scattering can attenuate signal intensity and result in spectral overlap with other elements, potentially leading to calibration discrepancies. This underscores the need for specialized procedures to ensure accurate analysis of carbon-rich samples. Employing proper sample preparation techniques becomes imperative in mitigating these challenges and guaranteeing dependable XRF results when dealing with samples bearing significant carbon content.
Given that P was the nutrient presenting consistent results in both experiments, it was chosen as the primary factor for predicting the maximum responsive concentration in the soil. In contrast, K showed some promise compared to other nutrients, albeit not as consistently as P. This selection was based on establishing a relationship between P&K-pXRF measurements in the soil and the P&K contents in plant tissues, as depicted in Figs. 5 and 6 .
The trends of phosphorus (P) uptake in ryegrass shoots (P Tissue ) as a function of pXRF‒P and available P in the soil from several extraction methods. Significant fits to the linear-with-upper-plateau statistical model were obtained. ***: p < 0.001. Values followed by ± are standard deviations. WEP: water-extractable phosphorus.
The trends of potassium (K) uptake in ryegrass shoots (K Tissue ) as a function of pXRF‒K and soil available K from two extraction methods. Significant fits to the linear-with-upper-plateau statistical model were obtained. ***: p < 0.001. Values followed by ± are standard deviations. NS: non-significant (p > 0.05).
The segmented model was able to predict the maximum responsive P concentration in the soil for P uptake by plants using a pXRF, and so did the conventional extraction methods, since their relationships with pXRF measurements were equally successful (Figs. 4 and 5 ). Concentrations of P from pXRF were also highly correlated (p < 0.001) with their concentrations in ryegrass tissues (r = 0.74) (data not shown). Therefore, in the case of P, not only does a pXRF device allow us to confidently determine 100% sufficiency for ryegrass cultivation, but the robust linear regression constructed (Fig. 4 ), comparing results from traditional methods with pXRF measurements, demonstrates its effectiveness even at elevated P concentrations. This highlights that pXRF technology extends its applicability beyond the 100% sufficiency threshold for cash crops. Consequently, pXRF can play a vital role in interpreting the dynamics of P buildup and accumulation in soil, which holds significant environmental implications.
Table 7 presents the STP values calculated from the equations of the LR, PolR, and PowR models when using the joint point ( njoint ) obtained by pXRF, and vice versa. When comparing the 'njoint (pXRF)' and the 'predicted value (pXRF)', LR gives the closest values to those obtained by pXRF in the segmented model, except for WEP (Table 7 ). This reinforces the reliability of using LR to predict soil available P with pXRF for fertilizer recommendation. Interestingly, the opposite is observed when comparing 'predicted value (TM)' and 'njoint (TM)', since PowR exhibited the closest values to the joint point obtained with the traditional methods (Table 7 ). This also makes sense, since segmented models are non-linear, as are PowR models.
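The cross-prediction in Table 7 amounts to plugging one scale's joint point into the fitted calibration for the other. A sketch with assumed, purely illustrative slope, intercept, and njoint values (not the paper's fitted coefficients):

```python
# hypothetical LR calibration: available-P (TM) = m * pXRF-P + c
m, c = 0.85, -12.0   # illustrative slope and intercept, not Table 7 values

njoint_pxrf = 160.0  # illustrative joint point on the pXRF scale (mg/kg)
stp_from_pxrf = m * njoint_pxrf + c         # 'predicted value (TM)' analogue

njoint_tm = 120.0    # illustrative joint point from a traditional method
pxrf_from_tm = (njoint_tm - c) / m          # inverse mapping back to pXRF
```

For PolR or PowR the same substitution is made into the quadratic or power equation instead of the linear one.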
The non-linear segmented model did not fit pXRF‒K vs. K Tissue as it did pXRF‒P vs. P Tissue . This might be a consequence of the weak linear relationships between pXRF‒K and M3‒K and NH 4 Ac‒K in the 2nd experiment (Fig. 4 ). However, the pXRF‒K × M3‒K relationship in the 1st experiment was good (Fig. 3 ); thus, the plant factor (2nd experiment) seems to influence soil available K. The total K from pXRF includes other K fractions, such as non-exchangeable K, which might be extractable by M3 and NH 4 Ac and may become available in the presence of plants 37 . Although there is no significant trend in the pXRF-K × K Tissue relationship (Fig. 6 ), a gradual increase can be observed visually, especially for PLB, but not a plateau point ( njoint ), since the fit was not significant. On the other hand, an njoint point is observed for the traditional methods (Fig. 6 ), since they are designed to extract only exchangeable or other forms of K readily available to plants.
The significant influence of silicon (Si) and aluminum (Al), the primary components of K-containing minerals, on predicting available K through pXRF measurements is evident in both experiments (Supplementary Table S6 and Table 6 ). It is important to note that K fractions associated with mineral forms, as measured by pXRF, exhibit different dynamics of plant availability compared with readily available forms. This explains the lack of correlation between pXRF‒K × K Tissue and the weak linear relationships between pXRF‒K and M3‒K and NH 4 Ac‒K in the second experiment. Consequently, the statistical parameters RMSE and R 2 from PolR and PowR, respectively, show more significant improvements over LR (Table 5 ) in the second experiment, when plant factors are considered, reinforcing the earlier discussion.
In conclusion, the values for a potential joint point ( njoint ) derived from pXRF measurements range from 5875 to 6112 mg K kg −1 , depending on the extraction method employed (Table 7 ). These values exceed those calculated using LR (5476 mg K kg −1 ), demonstrating that the availability of previously unavailable K does not adhere to a linear relationship during plant cultivation because K forms can be indiscriminately measured by pXRF.
This work evaluated the applicability of a pXRF instrument to predict the available concentrations of macro- and micro-nutrients in a soil amended with various rates of biochars derived from two different feedstocks. The soils studied, with and without plants, encompassed wide ranges of pH and OC and well represented common agricultural fields. Overall, pXRF sensing results were highly correlated with plant-available P from conventional WC methods, but the technique failed to predict secondary nutrients or micronutrients accurately. More research is needed before XRF can be recommended as a routine tool for nutrient management. In this study, P showed a strong relationship between total amounts measured by pXRF and plant-available forms from several different extraction methods. Distinct biochar outcomes were integrated into the regression models, with SGB serving as a link between the P levels observed in the control and the higher P levels achieved with PLB. Nevertheless, special consideration should be given to very low to low P levels in alternative agricultural scenarios to accurately evaluate XRF as a test for available P.
Potassium prediction could be affected by the presence of plants grown in the soil. Despite rigorous analysis and attempts at data transformation, the K dataset could not be made to exhibit the robustness observed in the P dataset, which consistently demonstrated efficacy across all experimental conditions explored in this study. The linear relationship was inconsistent for Ca, S, and Cu between the two experiments; therefore, predictions for those nutrients with pXRF might be influenced by some effect originating from ryegrass cultivation. The pXRF could not linearly predict the availability of Mg, Fe, and Zn.
It is pertinent to note that the limited accuracy in predicting the presence of various soil nutrients, including Mg, S, Zn, and Cu, could be attributed to the inability of a portable X-ray fluorescence spectrometer (pXRF) to detect these elements when analyzing PLB, as outlined in Table 1 . The difference in the ability of a pXRF to detect certain elements in PLB compared to SGB could be attributed to the distinct elemental compositions of these two biochars, not their quantities since PLB exhibited a much higher content of those elements in comparison to SGB (Table 1 ). We assume that PLB, being derived from a different source, may have a different elemental profile, possibly containing elements in forms or concentrations that are less conducive to pXRF analysis, including the chemical matrix in which elements are bound and their oxidation states. Differences in the organic and inorganic constituents of these biochars can lead to varying detection capabilities by the instrument, resulting in differential accuracy in predicting the presence of Mg, S, Zn, and Cu. Further investigation is necessary to precisely determine the specific reasons behind the variability in pXRF detection performance between contrasting biochars.
Several studies have attempted to assess the predictive capabilities of portable X-ray fluorescence (pXRF) in soil nutrient analysis, yielding somewhat inconsistent results. However, there has been a limited exploration of pXRF utility in quantifying elemental concentrations in biochars for assessing their potential as fertilizers. Our research represents a pioneering effort in bridging these two contrasting matrices—biochar and soil. Notably, our study focused on evaluating two distinct biochars with unique properties. This innovative approach opens avenues for future research endeavors that could involve broader scale pXRF measurements, encompassing a wider array of soils and a more extensive range of biochars. It also paves the way for investigations involving various matrix combinations in agricultural soils subjected to organic amendments. It is highly encouraged to further evaluate the efficacy of a portable XRF in situ to speed up the process of fertilizer recommendation and the accuracy of nutrient management.
All research on cultivated plants, including the collection of plant material, complied with relevant institutional, national, and international guidelines and legislation.
Raw data are available from the corresponding author upon reasonable request.
LR: Linear regression
PLB: Poultry litter biochar
PolR: Polynomial regression
PowR: Power regression
SGB: Switchgrass biochar
SMLR: Stepwise multiple linear regression
Faria, Á. J. et al. Rapid elemental prediction of heterogeneous tropical soils from PXRF DATA: A comparison of models via linear regressions and machine learning algorithms. Soil Res. 61 , 598–615 (2023).
Silva, S. H. et al. PXRF in tropical soils: Methodology, applications, achievements and challenges. Adv. Agron. https://doi.org/10.1016/bs.agron.2020.12.001 (2021).
Pelegrino, M. H. et al. Prediction of soil nutrient content via pXRF spectrometry and its spatial variation in a highly variable tropical area. Precision Agricult. 23 , 18–34 (2021).
Rawal, A. et al. Determination of base saturation percentage in agricultural soils via portable X-ray fluorescence spectrometer. Geoderma 338 , 375–382 (2019).
Benedet, L. et al. Rapid soil fertility prediction using X-ray fluorescence data and machine learning algorithms. CATENA 197 , 105003 (2021).
Benedet, L. et al. Clean quality control of agricultural and non-agricultural lime by rapid and accurate assessment of calcium and magnesium contents via proximal sensors. Environ. Rese. 221 , 115300 (2023).
Nawar, S., Richard, F., Kassim, A. M., Tekin, Y. & Mouazen, A. M. Fusion of gamma-rays and portable X-ray fluorescence spectral data to measure extractable potassium in soils. Soil Tillage Res. 223 , 105472 (2022).
Antonangelo, J. A., Zhang, H. & Sitienei, I. Biochar amendment of a metal contaminated soil partially immobilized Zn, Pb, and Cd and reduced ryegrass uptake. Front. Environ. Sci. https://doi.org/10.3389/fenvs.2023.1170427 (2023).
Borges, B. M. M. N. et al. Chemical and spectroscopic evaluations supporting superior P availability after biochar-P fertilizer application. Soil Tillage Res. 223 , 105487 (2022).
Hass, A. et al. Chicken manure biochar as liming and nutrient source for acid Appalachian soil. J. Environ. Q. 41 , 1096–1106 (2012).
Novak, J. M., Johnson, M. G. & Spokas, K. A. Concentration and release of phosphorus and potassium from lignocellulosic- and manure-based biochars for fertilizer reuse. Front. Sustain. Food Syst. 2 , 1–9 (2018).
Antonangelo, J. A. & Zhang, H. Heavy metal phytoavailability in a contaminated soil of northeastern Oklahoma as affected by biochar amendment. Environ. Sci. Pollut. Res. 26 , 33582–33593 (2019).
Antonangelo, J. A., Zhang, H., Sun, X. & Kumar, A. Physicochemical properties and morphology of biochars as affected by feedstock sources and pyrolysis temperatures. Biochar 1 , 325–336 (2019).
Antonangelo, J. A., Souza, J. L., Whitaker, A., Arnall, B. & Zhang, H. Evaluation of mehlich-3 as a multi-element extractant of micronutrients and sulfur in a soil–ryegrass system amended with varying biochar rates from two feedstocks. Land 11 , 1979 (2022).
Zhang, H., Antonangelo, J. & Penn, C. Development of a rapid field testing method for metals in horizontal directional drilling residuals with XRF sensor. Sci. Rep. https://doi.org/10.1038/s41598-021-83584-4 (2021).
Burt, R. Soil Survey Laboratory methods manual (Scientific Publishers, 2004).
Nelson, D. W. & Sommers, L. E. Total carbon, organic carbon, and organic matter. SSSA Book Series https://doi.org/10.2136/sssabookser5.3.c34 (2018).
Mehlich, A. Mehlich 3 soil test extractant: A modification of Mehlich 2 extractant. Commun. Soil Sci. Plant Anal. 15 , 1409–1416 (1984).
Brown, J. R. Recommended chemical soil test procedures for the North Central Region (University of Missouri-Columbia, 1998).
Lindsay, W. L. & Norvell, W. A. Development of a DTPA soil test for zinc, iron, manganese, and copper. Soil Sci. Soc. Am. J. 42 , 421–428 (1978).
Olsen, S. R., Watanabe, F. S. & Cole, C. V. Effect of sodium bicarbonate on the solubility of phosphorus in calcareous soils. Soil Sci. 89 , 288–291 (1960).
Bray, R. H. & Kurtz, L. T. Determination of total, organic, and available forms of phosphorus in soils. Soil Sci. 59 , 39–46 (1945).
Roswall, T. et al. Hotspots of legacy phosphorus in agricultural landscapes: Revisiting water-extractable phosphorus pools in soils. Water 13 , 1006 (2021).
Normandin, V., Kotuby-Amacher, J. & Miller, R. O. Modification of the ammonium acetate extractant for the determination of exchangeable cations in calcareous soils. Commun. Soil Sci. Plant Anal. 29 , 1785–1791 (1998).
Church, C., Spargo, J. & Fishel, S. Strong acid extraction methods for “total phosphorus” in soils: EPA method 3050B and EPA method 3051. Agricult. Environ. Lett. 2 , 160037 (2017).
Jones, J. B. & Case, V. W. Sampling, handling, and analyzing plant tissue samples. SSSA Book Series https://doi.org/10.2136/sssabookser3.3ed.c15 (2018).
Wang, Q., Koval, J. J., Mills, C. A. & Lee, K.-I.D. Determination of the selection statistics and best significance level in backward stepwise logistic regression. Commun. Stat. Simul. Comput. 37 , 62–72 (2007).
Brewer, M. J., Butler, A. & Cooksley, S. L. The relative performance of AIC, AICc and BIC in the presence of unobserved heterogeneity. Methods Ecol. Evol. 7 , 679–692 (2016).
Stammer, A. J. & Mallarino, A. P. Plant tissue analysis to assess phosphorus and potassium nutritional status of corn and soybean. Soil Sci. Soc. Am. J. 82 , 260–270 (2018).
Weindorf, D. C. & Chakraborty, S. Portable X-ray fluorescence spectrometry analysis of soils. Soil Sci. Soc. Am. J. 84 , 1384–1392 (2020).
Andrade, R. et al. Micronutrients prediction via PXRF Spectrometry in Brazil: Influence of weathering degree. Geoderma Reg. 27 , e00431 (2021).
Pierangeli, L. M. et al. Combining proximal and remote sensors in spatial prediction of five micronutrients and soil texture in a case study at farmland scale in southeastern Brazil. Agronomy 12 , 2699 (2022).
Beattie, R. E. et al. Quantitative analysis of the extent of heavy-metal contamination in soils near Picher, Oklahoma, within the Tar Creek Superfund Site. Chemosphere 172 , 89–95 (2017).
Schaider, L. A., Senn, D. B., Brabander, D. J., McCarthy, K. D. & Shine, J. P. Characterization of zinc, lead, and cadmium in mine waste: Implications for transport, exposure, and bioavailability. Environ. Sci. Technol. 41 , 4164–4171 (2007).
Resende, M. Mineralogia de Solos Brasileiros interpretação E APLICAÇÕES (UFLA, 2005).
Meena, V. S., Maurya, B. R. & Verma, J. P. Does a rhizospheric microorganism enhance K+ availability in agricultural soils?. Microbiol. Res. 169 , 337–347 (2014).
Firmano, R. F. et al. Potassium reserves in the clay fraction of a tropical soil fertilized for three decades. Clays Clay Min. 68 , 237–249 (2020).
Dasgupta, S. et al. Influence of auxiliary soil variables to improve PXRF-based soil fertility evaluation in India. Geoderma Reg. 30 , e00557 (2022).
Mancini, M. et al. Proximal sensor data fusion for Brazilian soil properties prediction: Exchangeable/available macronutrients, aluminum, and potential acidity. Geoderma Reg. 30 (2022).
Marguí, E., Queralt, I. & de Almeida, E. X-ray fluorescence spectrometry for Environmental Analysis: Basic principles, instrumentation, applications and recent trends. Chemosphere 303 , 135006 (2022).
This study was funded by the Oklahoma Agricultural Experiment Station and Washington State University CAHNRS.
Authors and affiliations.
Department of Crop and Soil Sciences, Washington State University, Pullman, WA, USA
Joao Antonangelo
Plant and Soil Sciences Department, Oklahoma State University, Stillwater, OK, USA
Hailin Zhang
JA and HZ contributed to the conception and design of the study. JA organized the database. JA performed the statistical analysis. JA and HZ wrote the first draft of the manuscript. JA and HZ wrote sections of the manuscript. JA contributed to the methodology. JA contributed to visualization. JA contributed to the data validation and data curation. HZ contributed to the resources. JA and HZ contributed to writing the first draft, reviewing, and editing. HZ contributed to supervision, project administration, and funding acquisition. JA and HZ contributed to the manuscript revision, and read, and approved the submitted version.
Correspondence to Joao Antonangelo.
Competing interests.
The authors declare no competing interests.
Publisher's note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information.
Rights and permissions.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .
Cite this article.
Antonangelo, J., Zhang, H. Assessment of portable X-ray fluorescence (pXRF) for plant-available nutrient prediction in biochar-amended soils. Sci Rep 14 , 20377 (2024). https://doi.org/10.1038/s41598-024-71381-8
Received : 08 March 2024
Accepted : 27 August 2024
Published : 02 September 2024
DOI : https://doi.org/10.1038/s41598-024-71381-8
IMAGES
VIDEO
COMMENTS
The independent variable is the cause. Its value is independent of other variables in your study. The dependent variable is the effect. Its value depends on changes in the independent variable. Example: Independent and dependent variables. You design a study to test whether changes in room temperature have an effect on math test scores.
Here are several examples of independent and dependent variables in experiments: In a study to determine whether how long a student sleeps affects test scores, the independent variable is the length of time spent sleeping while the dependent variable is the test score. You want to know which brand of fertilizer is best for your plants.
A dependent variable is the variable being tested in a scientific experiment. The dependent variable is "dependent" on the independent variable. As the experimenter changes the independent variable, the change in the dependent variable is observed and recorded. When you take data in an experiment, the dependent variable is the one being measured.
The dependent variable is the variable that is being measured or tested in an experiment. This is different from the independent variable, which is a variable that stands on its own. For example, in a study looking at how tutoring impacts test scores, the dependent variable would be the participants' test scores, since that is what is being measured.
A dependent variable is the variable that is tested and measured in a scientific experiment. It is sometimes called the responding variable. The dependent variable gets its name from the fact that it depends on the independent variable: as the experimenter manipulates the independent variable, a change in the dependent variable is observed and recorded.
Definition: A dependent variable is a variable in a study or experiment that is being measured or observed and is affected by the independent variable. In other words, it is the variable that researchers are interested in understanding, predicting, or explaining based on the changes made to the independent variable.
To be a true dependent variable detective, let's revisit its definition: a dependent variable is what we measure in an experiment and what changes in response to the independent variable. It's like the echo to a shout, the reaction to an action. In the dance of variables, the dependent variable is the one that responds to the independent variable's lead.
For example, say you want to know whether the amount you eat changes from day to day. You can set this up as an experiment in which you record food ingested over time: you add up all the calories you eat during a day, or you measure the mass of food per day. To get meaningful data, you carry out the project for a month.
In research, a variable is any characteristic, number, or quantity that can be measured or counted in experimental investigations. One is called the dependent variable, and the other is the independent variable. In research, the independent variable is manipulated to observe its effect, while the dependent variable is the measured outcome.
The independent variable is the one you control, while the dependent variable depends on the independent variable and is the one you measure. The independent and dependent variables are the two main types of variables in a science experiment. A variable is anything you can observe, measure, and record. This includes measurements, colors, sounds ...
A variable is considered dependent if it depends on an independent variable. Dependent variables are studied under the supposition or demand that they depend, by some law or rule (e.g., by a mathematical function), on the values of other variables. Independent variables, in turn, are not seen as depending on any other variable in the scope of the experiment in question.
In an experiment, there are two main variables: the independent variable, which the experimenter changes or controls so that they can observe its effects on the dependent variable; and the dependent variable, which is measured in the experiment and is "dependent" on the independent variable.
While the independent variable is the “cause”, the dependent variable is the “effect” – or rather, the affected variable. In other words, the dependent variable is the variable that is assumed to change as a result of a change in the independent variable. Keeping with the previous example, let's look at some dependent variables in action.
Independent variables are also known as predictors, factors, treatment variables, explanatory variables, input variables, x-variables, and right-hand variables—because they appear on the right side of the equals sign in a regression equation. In notation, statisticians commonly denote them using Xs. On graphs, analysts place independent variables on the horizontal, or X, axis.
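The regression framing above can be made concrete with a small ordinary-least-squares fit, written in plain Python so the roles stay visible: the x-values are the independent (predictor) variable on the right-hand side of the equation, and the fitted line predicts the dependent variable. The data points are invented for illustration.

```python
# Hypothetical data: hours studied (independent, X) vs. exam score (dependent, Y)
xs = [1, 2, 3, 4, 5, 6]
ys = [52, 57, 61, 68, 71, 77]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least squares: slope = cov(x, y) / var(x)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

# The dependent variable is predicted FROM the independent variable:
#   y_hat = intercept + slope * x   (x on the right-hand side, as in the notation above)
print(f"score = {intercept:.1f} + {slope:.2f} * hours")
```

Notice that nothing in the fit constrains the x-values: they are taken as given, which is exactly what "independent" means in this notation.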
The dependent variable (sometimes known as the responding variable) is what is being studied and measured in the experiment. It's what changes as a result of the changes to the independent variable. An example of a dependent variable is how tall you are at different ages. The dependent variable (height) depends on the independent variable (age).
Definition of Independent and Dependent Variables. The independent variable and dependent variable are used in a very specific type of scientific study called the experiment. Although there are many variations of the experiment, generally speaking, it involves either the presence or absence of the independent variable and the observation of what happens to the dependent variable.
In experiments that test cause and effect, two types of variables come into play. One is an independent variable and the other is a dependent variable, and together they play an integral role in research design.
In an experiment, you manipulate the independent variable and measure the outcome in the dependent variable. For example, in an experiment about the effect of nutrients on crop growth: The independent variable is the amount of nutrients added to the crop field. The dependent variable is the biomass of the crops at harvest time.
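The crop example can be sketched as a simple grouping of measurements: biomass records (the dependent variable) are bucketed by nutrient dose (the independent variable) and averaged. All plot readings below are made up for illustration.

```python
# Hypothetical field records: (kg of nutrients added, crop biomass at harvest in kg)
plots = [
    (0, 410), (0, 395), (0, 402),
    (50, 480), (50, 472), (50, 491),
    (100, 530), (100, 544), (100, 521),
]

# Group the dependent variable (biomass) by the independent variable (dose)
biomass_by_dose = {}
for dose, biomass in plots:
    biomass_by_dose.setdefault(dose, []).append(biomass)

mean_biomass = {dose: sum(vals) / len(vals) for dose, vals in biomass_by_dose.items()}
for dose in sorted(mean_biomass):
    print(f"{dose:>3} kg nutrients -> mean biomass {mean_biomass[dose]:.0f} kg")
```

Comparing the group means is the simplest way to read off whether manipulating the independent variable moved the dependent one.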
The dependent variable is the variable that depends on other factors and is the one that is measured. The independent variable, by contrast, refers to the condition of an experiment that is systematically manipulated by the investigator; it is the presumed cause. The best way to understand the difference between a dependent and an independent variable is that the meaning of each is implied by its name.
The independent variable is the drug, while the patient's blood pressure is the dependent variable. In some ways, this resembles the earlier test-score examples. However, when comparing two different treatments, such as drug A and drug B, it's usual to add a third condition, called the control.
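The drug comparison can be sketched the same way, with a placebo arm serving as the control against which each treatment's blood-pressure change is judged. Every number here is invented for illustration only.

```python
# Hypothetical change in systolic blood pressure (mmHg) after treatment;
# negative values mean the pressure dropped.
trial_arms = {
    "placebo (control)": [-1, 0, -2, 1, -1],
    "drug A": [-9, -11, -8, -10, -12],
    "drug B": [-5, -4, -6, -5, -3],
}

def mean(xs):
    return sum(xs) / len(xs)

control_effect = mean(trial_arms["placebo (control)"])
for arm, changes in trial_arms.items():
    # Subtracting the control arm's change isolates the drug's own contribution
    print(f"{arm}: {mean(changes) - control_effect:+.1f} mmHg vs control")
```

The placebo arm anchors the comparison: without it, a drop in blood pressure in both drug arms could not be separated from changes that would have happened anyway.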