Excel supports many methods of data analysis, such as descriptive statistics and regression analysis. We also use a correlation matrix and graphs, including regression lines, bar charts, and trend lines, for comparison. The details of each aspect are given in the following sections. The analysis shows that the section data is more volatile than the course data. Both variables appear to satisfy the properties of the classical linear regression model and are normally distributed with constant mean and constant variance. The regression results show that the two variables are not related, because the estimates are highly insignificant. To find the appropriate relationship, we will have to add relevant variables to the model, since a multiple linear regression model can help remove the specification bias.
Several analyses can be run on the given project data, which describes the demand for tutoring at the MCOB tutoring center during 2019. Covariance analysis appears to be the most suitable starting point. Covariance measures how much two random variables fluctuate together. It resembles the variance, which describes how a single variable varies. The results given in the table show that the covariance between the two variables is about 443444.4. The positive sign means that course and section tend to move in the same direction. Because covariance depends on the units of the variables, its size alone does not show how strong the relationship is.
Table # 1: Covariance between course and section
- Variable | Course | Section
- Course | 1088490 |
- Section | 443444.4 | 1219263
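Since covariance is expressed in the product of the two variables' units, one common way to judge the strength of the co-movement is to rescale it into a correlation coefficient. The following is a minimal sketch, assuming the diagonal entries of the covariance table are the variances of course and section; the function name `correlation_from_covariance` is illustrative and not part of the original analysis.

```python
import math

# Values copied from the covariance table above (Excel covariance output).
var_course = 1088490.0
var_section = 1219263.0
cov_course_section = 443444.4

def correlation_from_covariance(cov_xy: float, var_x: float, var_y: float) -> float:
    """Pearson correlation: r = cov(x, y) / (sd(x) * sd(y))."""
    return cov_xy / math.sqrt(var_x * var_y)

r = correlation_from_covariance(cov_course_section, var_course, var_section)
print(f"correlation = {r:.3f}")
```

With the figures in the table, the result is about 0.38. Unlike the raw covariance, this value is unit-free and always lies between -1 and 1.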
Descriptive statistics are the main analysis required before any regression and are considered a key part of any analysis. The descriptive statistics of course and section follow.
Table # 2: Descriptive statistics of course and section
- Statistic | Course | Section
- Mean | 3652.484 | 22057.59
- Standard Error | 38.45429 | 40.66836
- Median | 3304 | 22114
- Mode | 3302 | 22114
- Standard Deviation | 1044.655 | 1104.803
- Sample Variance | 1091304 | 1220590
- Kurtosis | 1.168211 | -0.58067
- Skewness | 1.264636 | 0.674953
- Range | 5075 | 4275
- Minimum | 1301 | 20586
- Maximum | 6376 | 24861
- Sum | 2695533 | 16278505
- Count | 738 | 738
As the results above show, the mean of the course is about 3652 and the mean of the section is 22057.59, so the section mean is higher than the course mean. Moreover, the cross-comparison of both variables across time, subjects, and dates is given below. The standard deviation shows the dispersion of the data. It is higher for the section than for the course.
The standard error of the course is 38.45, while the standard error of the section is 40.67, and the standard deviation of the course is close to that of the section. The maximum value of the course is 6376 and the maximum value of the section is 24861. From this comparison, we can see that the section values are larger in magnitude than the course values.
Regression analysis is mostly used to check the impact of the independent variables on the dependent (outcome) variable.
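Several rows of Table 2 can be derived from the others, which gives a quick way to check the figures quoted here. The sketch below is only an illustration: it assumes the standard error is the sample standard deviation divided by the square root of the count, and the helper names `standard_error` and `value_range` are made up for this example.

```python
import math

# Values copied from Table 2 (Excel descriptive statistics output).
table2 = {
    "course":  {"mean": 3652.484, "sd": 1044.655, "min": 1301,  "max": 6376,  "n": 738},
    "section": {"mean": 22057.59, "sd": 1104.803, "min": 20586, "max": 24861, "n": 738},
}

def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return sd / math.sqrt(n)

def value_range(minimum: float, maximum: float) -> float:
    """Range: maximum minus minimum."""
    return maximum - minimum

for name, row in table2.items():
    se = standard_error(row["sd"], row["n"])
    rng = value_range(row["min"], row["max"])
    total = row["mean"] * row["n"]  # should be close to the Sum row
    print(f"{name}: SE={se:.2f}, range={rng}, mean*n={total:.0f}")
```

The recomputed standard errors, ranges, and sums can then be compared against the corresponding rows of the table.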
With one independent variable, neither the R square nor the adjusted R square is high. We will check for its fluctuations with time. Because the R square is low, the regression fits poorly, which suggests that the variables are not strongly related to each other. A very low R square indicates that the variable added to the model explains little of the variation in the dependent variable. Hence, in this scenario, the adjusted R square will be lower still. It decreases for two reasons: the loss of degrees of freedom and the addition of an irrelevant variable.
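The effect of degrees of freedom on the adjusted R square can be illustrated with the standard formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the number of observations and k the number of regressors. This is a minimal sketch: n is taken from the Count row of Table 2, while the R square value of 0.01 is purely hypothetical and is not taken from the regression output.

```python
def adjusted_r_squared(r_squared: float, n: int, k: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than regressors plus one")
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - k - 1)

n = 738    # Count from Table 2
r2 = 0.01  # hypothetical R^2, NOT taken from the source's regression output

# One regressor, as in the simple regression discussed above.
one_regressor = adjusted_r_squared(r2, n, k=1)

# An irrelevant extra regressor that leaves R^2 unchanged still costs
# one degree of freedom, so the adjusted value falls further.
two_regressors = adjusted_r_squared(r2, n, k=2)

print(f"k=1: {one_regressor:.5f}, k=2: {two_regressors:.5f}")
```

In this illustration the adjusted value is below the raw R square for one regressor and drops again when a second, uninformative regressor is added.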
Based on the above analysis, it is concluded that the section data is more volatile than the course data. The regression results show that both the coefficient and the intercept are insignificant. There is no evidence of a relationship between the two variables, so the regression is not meaningful. As the ordinary least squares estimate is not statistically significant, it can be concluded that the regressor has no significant impact on the dependent variable at any conventional level of significance. Moreover, when the model does not give good estimates, this suggests some specification bias, which may be due to missing variables, a wrong functional form, and so on. The graphs also indicate that the data is stationary, with mean and variance constant over time. Normally distributed errors with constant variance are among the main assumptions of the classical linear regression model.
Usually, the investigator seeks to verify the causal impact of one variable on another. To remove the problem present in our model, we should include independent variables that are related to the dependent variable and uncorrelated with the error term.