
# What is R-Squared and Adjusted R-Squared?

Sulochana Kamshetty · 5 months ago

## Table of Contents

• Introduction
• What is R-squared (R2)
• SST: The Total Sum of Squares
• Limitations of R-squared
• Difference Between R-Squared and Adjusted R-Squared
• Conclusion

# Introduction

When I look at the movements of the stock market, I get nervous! Thankfully, mathematicians came up with great ideas that make it easy to learn about stock speculation and movement. If you really want to make decisions about the stock market, the first thing you need is basic knowledge of statistics and of error metrics, which we will cover in this article.
Before diving into these concepts, you should be clear about one important topic, "Linear Regression", which you can learn from the content provided by InsideAIML.

# What is R-squared (R2)

The R-squared coefficient of determination is a statistical tool that measures how much of the movement of a security or fund can be explained by the performance of a particular benchmark index.
R-squared (R2) is a statistical measure that represents the proportion of the variance in a dependent variable that is explained by an independent variable or variables in a regression model.
Whereas correlation explains the strength of the relationship between an independent and a dependent variable, R-squared explains to what extent the variance of one variable explains the variance of the second variable. So, if the R2 of a model is 0.50, then approximately half of the observed variation can be explained by the model's inputs.
The higher the R-squared, the more of the variation is explained by our input variables, and the better the model is said to be.

Recommended blog for you: Linear Regression vs Logistic Regression

Note:
Instead of working through multiple parameters, the easiest way to check the accuracy of a model is to look at the R-squared value directly in the statistical summary generated for the project.
Let's understand these points about R-squared with the example of an S&P portfolio's performance.
• The R-squared value does not quantify the performance of the portfolio itself; instead, it shows how closely the performance of the portfolio is correlated with that of the benchmark index.
• The R-squared value is measured on a scale from 0 to 100: the higher the value, the more closely the portfolio's performance tracks the benchmark.
• An R-squared value of 100 indicates that all movements of the portfolio can be accounted for by movements in the index.
• An R-squared value between 70 and 100 indicates a strong correlation between the portfolio's actual rate of return and the benchmark, while a value below 70 indicates only an average correlation.
• For a security, an R-squared value of 90 means that 90% of the movements in its market price can be explained by movements in the index. This lets investors compare the portfolio against the index and predict how the portfolio might behave in the future relative to the market.
• For example, suppose Mr. X is looking for potential stocks for an S&P portfolio and finds shares A, B, and C with R-squared values of 85, 95, and 45 respectively. Mr. X would exclude stock C (45), since only a stock with a high R-squared value makes the benchmark a useful technical indicator on its price chart.
The regression summary provides two R-squared values, namely Multiple R-squared and Adjusted R-squared.
The Multiple R-squared is calculated as follows:
Multiple R-squared = 1 - SSE/SST
where:
1. SSE is the sum of squares of the residuals.
A residual is the difference between the predicted value and the actual value, and can be accessed through the model's residuals.

# SST: The Total Sum of Squares

It is calculated by summing the squares of the differences between the actual values and their mean.
For example, let's say the actual values are 5, 6, 7, and 8, and a model predicts the outcomes as 4.5, 6.3, 7.2, and 7.9. Then,
SSE can be calculated as:
SSE = (5 - 4.5)^2 + (6 - 6.3)^2 + (7 - 7.2)^2 + (8 - 7.9)^2
Squaring the differences between the actual and predicted values gives:
0.25, 0.09, 0.04, and 0.01, so SSE = 0.39.
SST can be calculated as:
mean = (5 + 6 + 7 + 8) / 4 = 6.5
SST = (5 - 6.5)^2 + (6 - 6.5)^2 + (7 - 6.5)^2 + (8 - 6.5)^2
which gives 2.25, 0.25, 0.25, and 2.25 respectively, so SST = 5.0.
Multiple R-squared = 1 - 0.39/5.0 = 0.922.
So those were the basic concepts of these error metrics; now let's work with Adjusted R-squared.
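The worked example above can be reproduced in a few lines of Python (a sketch; the values come from the example):

```python
# Recompute the worked example: SSE, SST, and Multiple R-squared.
actual = [5.0, 6.0, 7.0, 8.0]
predicted = [4.5, 6.3, 7.2, 7.9]

# SSE: sum of squared residuals (actual minus predicted)
sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # 0.25 + 0.09 + 0.04 + 0.01

# SST: sum of squared deviations of the actual values from their mean
mean = sum(actual) / len(actual)  # 6.5
sst = sum((a - mean) ** 2 for a in actual)  # 2.25 + 0.25 + 0.25 + 2.25

multiple_r_squared = 1 - sse / sst
print(round(sse, 2), round(sst, 2), round(multiple_r_squared, 3))  # 0.39 5.0 0.922
```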

# Limitations of R-squared

• However, the problem with R-squared is that it stays the same or increases with the addition of more variables, even if they have no relationship with the output variable.
• Therefore, if you are building a linear regression model with many variables, it is always recommended to use Adjusted R-squared to judge the goodness of fit of the model. If you have only one input variable, R-squared and Adjusted R-squared will be exactly the same.
• Typically, the more non-significant variables are added to the model, the larger the gap between R-squared and Adjusted R-squared becomes. The Adjusted R-squared value is similar to the Multiple R-squared value, but it accounts for the number of variables. This means that Multiple R-squared will always increase when a new variable is added to the prediction model, but if the variable is a non-significant one, the Adjusted R-squared value will decrease.
Adjusted R-squared is calculated as:
Adjusted R-squared = 1 - [(1 - R2)(n - 1) / (n - k - 1)]
Here,
• n represents the number of data points in our dataset,
• k represents the number of independent variables, and
• R2 represents the R-squared value determined by the model.
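As a sketch, the adjustment 1 - (1 - R2)(n - 1)/(n - k - 1) can be wrapped in a small helper (the function name and the sample numbers are my own, for illustration only):

```python
# Adjusted R-squared penalizes R-squared for the number of predictors k,
# given n observations. Name and inputs are illustrative.
def adjusted_r_squared(r_squared, n, k):
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Same R-squared, but more predictors -> lower adjusted value.
print(round(adjusted_r_squared(0.922, 30, 2), 4))   # 0.9162
print(round(adjusted_r_squared(0.922, 30, 10), 4))  # 0.8809
```

Note how adding eight extra predictors, with no improvement in R-squared, pulls the adjusted value down.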

# Difference Between R-Squared and Adjusted R-Squared

## R-Squared

• R-Squared shows how well terms (data points) fit a curve or line.
• An R-squared value of 1 means that it is a perfect prediction model.

## Adjusted R-Squared

• Adjusted R-Squared is a special form of R-Squared, the coefficient of determination.
• Adjusted R-Squared is used to judge the goodness of fit of a model.
• Adjusted R-Squared also indicates how well terms fit a curve or line, but adjusts for the number of terms in the model.