http://bit.ly/31RDXAJ

This is for my understanding and not intended for public use.

Regression is the less sensational cousin of deep neural networks, but it works best in large enterprises where decisions demand interpretability.

Regression Analysis

• Regression analysis is a predictive modelling technique that investigates the relationship between a dependent variable and one or more independent variables.
• Used for forecasting, time series modelling, and finding causal effect relationships between variables.
• Fits a line to the data in such a way that the total distance of the data points from the line is minimal.
• Helps identify significant relationships between variables and the strength of their impact.

Types of regression techniques

Techniques are commonly distinguished by:

• Number of independent variables
• Shape of the regression line
• Type of dependent variable

Common Terminologies

Regression coefficient : A regression coefficient in multiple regression is the slope of the linear relationship between the criterion variable and the part of a predictor variable that is independent of all other predictor variables.
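To make this definition concrete, here is a minimal sketch (made-up noise-free data and NumPy, not from the original notes) of recovering multiple-regression coefficients by least squares; each slope is the effect of its predictor with the other predictor held constant:

```python
import numpy as np

# Hypothetical data: two predictors and a response generated as
# y = 1 + 2*x1 + 3*x2 (no noise, so the fit recovers the coefficients exactly).
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
y = 1 + 2 * x1 + 3 * x2

# Design matrix with an intercept column.
X = np.column_stack([np.ones_like(x1), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# coef[1] is the slope for x1 holding x2 constant; coef[2] likewise for x2.
print(np.round(coef, 6))  # ≈ [1. 2. 3.]
```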

Linear Regression

• The dependent variable is continuous, the independent variables can be continuous or discrete, and the nature of the regression line is linear.
• Uses a best-fit straight line.

Y = a + b*X + e, where Y is the dependent variable, X the independent variable, a the intercept, b the slope, and e the error term.

• Calculates the best-fit straight line by minimizing the sum of the squared vertical deviations from each data point to the line.
• There must be a linear relationship between the independent and dependent variables.
• Multiple regression suffers from:
• Multicollinearity : occurs when the independent variables in a model are correlated. If the correlation between variables is high enough, it can cause problems when you fit the model.
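The least-squares idea above can be sketched with the closed-form estimates for a single predictor; the data here are hypothetical:

```python
import numpy as np

# Sketch of Y = a + b*X + e fitted by minimizing the sum of squared
# vertical deviations (ordinary least squares); data are made up.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 4.3, 5.9, 8.2, 9.9])

# Closed-form OLS estimates for one predictor:
#   b = sum((X - mean(X)) * (Y - mean(Y))) / sum((X - mean(X))^2)
#   a = mean(Y) - b * mean(X)
b = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
a = Y.mean() - b * X.mean()

residuals = Y - (a + b * X)
print(round(a, 4), round(b, 4))  # → 0.23 1.95
print(round(float(np.sum(residuals ** 2)), 4))  # minimized squared error
```

Any other line through these points would give a larger residual sum of squares, which is exactly what "best fit" means here.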

The interpretation of a regression coefficient is that it represents the mean change in the dependent variable for each 1-unit change in an independent variable when you hold all of the other independent variables constant.

However, when independent variables are correlated, changes in one variable are associated with shifts in another, so it becomes difficult to hold the others constant.

In multicollinearity, even though the least squares estimates are unbiased, their variances are large, so the estimates can fall far from the true values.

• Structural multicollinearity : This type occurs when we create a model term using other terms. In other words, it’s a byproduct of the model that we specify rather than being present in the data itself. For example, if you square term X to model curvature, clearly there is a correlation between X and X².
• Data multicollinearity : This type of multicollinearity is present in the data itself rather than being an artifact of our model. Observational experiments are more likely to exhibit this kind of multicollinearity.
https://statisticsbyjim.com/regression/multicollinearity-in-regression-analysis/
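A common diagnostic for multicollinearity is the variance inflation factor (VIF): regress each predictor on the others and compute 1 / (1 − R²). Below is a minimal sketch; the data (with x3 built as a near-linear combination of x1 and x2) are illustrative assumptions, not from the notes:

```python
import numpy as np

# Hypothetical data: x3 is almost a linear combination of x1 and x2,
# so the predictors are collinear and their VIFs are inflated.
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 0.8 * x1 + 0.6 * x2 + rng.normal(scale=0.1, size=n)
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """VIF of column j: regress it on the remaining columns plus an intercept."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    return 1 / (1 - r2)

for j in range(X.shape[1]):
    print(f"VIF x{j + 1}: {vif(X, j):.1f}")  # collinear columns show large VIFs
```

A rule of thumb often quoted is that VIFs above roughly 5-10 signal problematic multicollinearity; the exact cutoff is a judgment call.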
• Autocorrelation : Autocorrelation, also known as serial correlation, is the correlation of a signal with a delayed copy of itself as a function of delay. Informally, it is the similarity between observations as a function of the time lag between them.
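A minimal sketch of sample autocorrelation as a function of lag, using a made-up AR(1)-style series with strong persistence:

```python
import numpy as np

# Hypothetical series: each value is 0.9 times the previous one plus noise,
# so nearby observations are similar and autocorrelation decays with lag.
rng = np.random.default_rng(2)
n = 500
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.9 * x[t - 1] + rng.normal()

def autocorr(x, lag):
    """Sample autocorrelation of x at the given positive lag."""
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

for lag in (1, 5, 20):
    print(f"lag {lag}: {autocorr(x, lag):.3f}")  # shrinks as the lag grows
```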