Statistical testing, a process that can make or break your research, evaluates the evidence provided by data against a hypothesis, with the choice of test depending on the nature of your study. The process typically involves data collection, data cleaning, data analysis and data interpretation. Several statistical tests are available, such as the independent t-test, ANOVA, the chi-square test and multiple regression. Among these, multiple regression is one of the most popular and commonly used analyses.
Multiple regression, an extension of simple linear regression, is used to predict the value of a dependent variable based on the values of two or more other (independent) variables. The relationship modelled can be linear or nonlinear and, in rare cases, a dependent variable may be adequately explained by a single variable, in which case simple regression is sufficient. This method consists of the following stages (a short code sketch follows the list):
1) Analysing the correlation and directionality of the data
2) Estimating the model
3) Evaluating the usefulness and validity of the model.
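
To make these stages concrete, here is a minimal sketch using Python and statsmodels. The data frame and variable names (revision_time, iq_score, lectures_attended, exam_score) are synthetic and purely illustrative, not taken from any real study.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Illustrative synthetic data; in practice this would be your own data set.
rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    "revision_time": rng.uniform(1, 40, n),        # hours spent revising
    "iq_score": rng.normal(100, 15, n),            # intelligence score
    "lectures_attended": rng.integers(0, 30, n),   # attendance count
})
df["exam_score"] = (
    0.8 * df["revision_time"]
    + 0.3 * df["iq_score"]
    + 0.5 * df["lectures_attended"]
    + rng.normal(0, 5, n)
)

# Stage 1: inspect pairwise correlations and their direction.
print(df.corr())

# Stage 2: estimate the model with ordinary least squares.
X = sm.add_constant(df[["revision_time", "iq_score", "lectures_attended"]])
model = sm.OLS(df["exam_score"], X).fit()

# Stage 3: inspect coefficients, R-squared and p-values to judge
# the usefulness and validity of the model.
print(model.summary())
```

The later sketches in this post reuse the `df`, `X` and `model` objects defined here.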
The main advantage of multiple regression is that it can account for several predictor variables at once, which generally improves the accuracy of the model. Other common uses include 1) analysing possible causal relationships, 2) forecasting an effect, and 3) trend forecasting. But before performing the analysis, one has to ensure that its assumptions are met.
Let’s have a look at the assumptions.
- Assumption 1: The dependent variable must be measured on a continuous scale. Examples of variables that meet this criterion include revision time, intelligence, exam performance, and weight.
- Assumption 2: The model must include two or more independent variables, which can be either continuous or categorical.
- Assumption 3: Observations must be independent of one another, which can easily be checked using the Durbin-Watson statistic (sketched after this list).
- Assumption 4: There should be a linear relationship between (1) the dependent variable and each of the independent variables and (2) the dependent variable and the independent variables collectively. To check the linear relationship, it is advisable to create scatter plots and partial regression plots, as illustrated after this list.
- Assumption 5: The data needs to display homoscedasticity, meaning the variance of the residuals stays roughly constant across all values of the predicted dependent variable (see the residuals-versus-fitted sketch below).
- Assumption 6: The data should not show multicollinearity. Technically, multicollinearity occurs when two or more independent variables are highly correlated with each other (a VIF sketch follows the list).
- Assumption 7: There should be no significant outliers, high-leverage points or highly influential points. These are different classifications of unusual observations, each of which can distort the fitted regression line (see the influence-diagnostics sketch below).
- Assumption 8: The residuals (errors) should be approximately normally distributed. Two common ways to check this assumption are a histogram of the residuals and a Normal P-P plot (see the final sketch below).
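
Continuing from the fitting sketch above, here is how the Durbin-Watson check for independence of observations (Assumption 3) might look; values near 2 suggest the residuals are uncorrelated.

```python
from statsmodels.stats.stattools import durbin_watson

# `model` is the fitted OLS result from the earlier sketch.
dw = durbin_watson(model.resid)
print(f"Durbin-Watson statistic: {dw:.2f}")
# Values close to 2 suggest the residuals are uncorrelated;
# values approaching 0 or 4 point to positive or negative autocorrelation.
```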
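For the linearity checks in Assumption 4, one possible sketch draws simple scatter plots of the dependent variable against each predictor and then partial regression plots, reusing the `df` and `model` objects from the fitting example.

```python
import matplotlib.pyplot as plt
import statsmodels.api as sm

# Scatter plots of the dependent variable against each predictor.
fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, col in zip(axes, ["revision_time", "iq_score", "lectures_attended"]):
    ax.scatter(df[col], df["exam_score"], alpha=0.5)
    ax.set_xlabel(col)
    ax.set_ylabel("exam_score")
plt.tight_layout()
plt.show()

# Partial regression plots: each predictor against the dependent variable,
# holding the other predictors constant.
fig = plt.figure(figsize=(12, 8))
sm.graphics.plot_partregress_grid(
    model,
    exog_idx=["revision_time", "iq_score", "lectures_attended"],
    fig=fig,
)
plt.show()
```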
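For homoscedasticity (Assumption 5), a residuals-versus-fitted-values plot gives a visual check, and the Breusch-Pagan test (one of several possible formal tests) gives a numeric one; both reuse the fitted model from above.

```python
import matplotlib.pyplot as plt
from statsmodels.stats.diagnostic import het_breuschpagan

# Residuals vs. fitted values: the vertical spread should look roughly constant.
plt.scatter(model.fittedvalues, model.resid, alpha=0.5)
plt.axhline(0, linestyle="--")
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.show()

# Breusch-Pagan test: a small p-value suggests heteroscedasticity.
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(model.resid, model.model.exog)
print(f"Breusch-Pagan p-value: {lm_pvalue:.3f}")
```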
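For multicollinearity (Assumption 6), variance inflation factors (VIFs) are a common way to quantify how strongly each predictor is correlated with the others; this sketch reuses the design matrix `X` built in the fitting example.

```python
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

# One VIF per column of the design matrix (including the constant).
vif = pd.Series(
    [variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
    index=X.columns,
)
print(vif)
# A common rule of thumb treats VIF values above 5-10 as a sign of
# problematic multicollinearity (the constant term can be ignored).
```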
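For outliers, leverage and influence (Assumption 7), statsmodels exposes studentized residuals, hat values and Cook's distance through its influence diagnostics; the cut-offs below are common rules of thumb rather than fixed rules.

```python
import numpy as np

# Influence diagnostics from the fitted OLS model in the earlier sketch.
influence = model.get_influence()

# Studentized residuals flag potential outliers (|value| > 3 is a common cut-off).
studentized = influence.resid_studentized_external

# Leverage (hat values) flags observations with unusual predictor combinations.
leverage = influence.hat_matrix_diag

# Cook's distance combines residual size and leverage into an overall
# measure of how much each observation influences the fitted line.
cooks_d, _ = influence.cooks_distance

n_obs, k = model.model.exog.shape
flagged = np.where(
    (np.abs(studentized) > 3)
    | (leverage > 2 * k / n_obs)
    | (cooks_d > 4 / n_obs)
)[0]
print("Observations worth inspecting:", flagged)
```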
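For the normality of residuals (Assumption 8), a histogram and a probability plot are the usual visual checks; statsmodels provides a Q-Q plot, which is used here as a close analogue of SPSS's Normal P-P plot.

```python
import matplotlib.pyplot as plt
import statsmodels.api as sm

# Histogram of residuals: it should look roughly bell-shaped.
plt.hist(model.resid, bins=20, edgecolor="black")
plt.xlabel("Residual")
plt.ylabel("Frequency")
plt.show()

# Q-Q plot: points should fall close to the reference line
# if the residuals are approximately normal.
sm.qqplot(model.resid, line="s")
plt.show()
```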
If you still have any queries or are not confident enough to conduct this analysis, consider seeking help from a statistician.