Mean Squared Logarithmic Error Loss

Kajal Pawar

16 days ago

In the previous article, we saw what Mean Squared Error (MSE) is and how it works behind the scenes. In this article, we extend that discussion to a variation of MSE known as Mean Squared Logarithmic Error (MSLE) and see how it works.
Sometimes we encounter regression problems in which the target value has a wide spread, and when predicting a very large value we may not want to punish the model as heavily as mean squared error would.
To compute MSLE, we first take the natural logarithm of each actual and predicted value (after adding 1) and then calculate the mean squared error of these log-transformed values. Hence the name Mean Squared Logarithmic Error (MSLE).
It can also be interpreted as a measure of the ratio between the actual and predicted values.

MSLE Mathematical Formula

MSLE measures the relative difference between the actual and predicted values by comparing them after a log transformation.
The formula for MSLE is:
$$\mathrm{MSLE} = \frac{1}{n} \sum_{i=1}^{n} \left( \log(y_i + 1) - \log(\hat{y}_i + 1) \right)^2$$
where y is the actual value, ŷ is the predicted value, and n is the number of samples.
This can also be interpreted as the ratio between the true and predicted values and can be written as:
$$\mathrm{MSLE} = \frac{1}{n} \sum_{i=1}^{n} \left( \log \frac{y_i + 1}{\hat{y}_i + 1} \right)^2$$
Note: 1 is added to both y and ŷ for mathematical convenience, since log(0) is not defined and either y or ŷ can be 0.
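The definition above can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function names `msle` and `msle_ratio` and the sample values are made up for this example and are not part of any library. It computes the loss both as a difference of logs and as the log of a ratio, which should agree.

```python
import numpy as np

def msle(y_true, y_pred):
    """Mean squared logarithmic error using natural logs and the +1 offset."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # np.log1p(x) computes log(1 + x), so zero values are handled safely
    return np.mean((np.log1p(y_true) - np.log1p(y_pred)) ** 2)

def msle_ratio(y_true, y_pred):
    """The same quantity written as the squared log of (y + 1) / (y_hat + 1)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.log((y_true + 1.0) / (y_pred + 1.0)) ** 2)

# Made-up values for illustration, including a zero target
y_true = [0.0, 3.0, 10.0]
y_pred = [1.0, 2.0, 12.0]

print(msle(y_true, y_pred), msle_ratio(y_true, y_pred))
```

Both functions should print the same number up to floating-point rounding, and the zero target causes no problem because of the +1 offset.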
    

Some of the benefits of using MSLE are mentioned below:

MSLE only cares about the relative (percentage) difference between the actual and predicted values, because it compares their logarithms.
MSLE tries to treat small differences between small actual and predicted values approximately the same as big differences between large actual and predicted values.
For example:
[Figure: Example of big and small differences of MSE loss and MSLE loss]
When the true value is 40 and the predicted value is 30, the MSE is 100 and the MSLE is approximately 0.0782.
Similarly, when the actual value is 4000 and the predicted value is 3000, the MSE is 1000000 and the MSLE is approximately 0.0827.
The MSE values of the two cases differ enormously, while their MSLE values are almost equal. In other words, MSLE treats small differences between small actual and predicted values approximately the same as big differences between large actual and predicted values.
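The comparison above can be reproduced with a short script. This is a minimal sketch, assuming natural logarithms and the +1 offset; the helper names `squared_error` and `squared_log_error` are hypothetical and chosen only for this example.

```python
import numpy as np

def squared_error(y, y_hat):
    """Squared error for a single (actual, predicted) pair."""
    return (y - y_hat) ** 2

def squared_log_error(y, y_hat):
    """Squared logarithmic error for a single pair (natural log, +1 offset)."""
    return (np.log1p(y) - np.log1p(y_hat)) ** 2

# The two cases discussed above: same relative error, very different scale
cases = [(40.0, 30.0), (4000.0, 3000.0)]

for y, y_hat in cases:
    print(f"true={y:g} pred={y_hat:g} "
          f"MSE={squared_error(y, y_hat):g} "
          f"MSLE={squared_log_error(y, y_hat):.4f}")
```

The squared error grows by a factor of 10,000 between the two cases, while the squared logarithmic error stays close to 0.08 in both.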
MSLE penalizes underestimates more than overestimates
For the same absolute error, MSLE assigns a larger penalty when the prediction is below the actual value than when it is above it.
For example:
[Figure: MSLE penalizes underestimates more than overestimates]
In the above example, both cases have the same true value of 20 but different predicted values, 10 and 30 respectively.
In case 1 the prediction underestimates the true value by 10, and in case 2 it overestimates it by 10.
The MSE is the same in both cases, 100. The MSLE values differ, however: approximately 0.4181 and 0.1517 with the natural logarithm (0.07886 and 0.02861 with base-10 logarithms).
The gap between the two values is large, so we can say that MSLE penalizes the underestimate more than the overestimate.
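The asymmetry can be checked numerically. The sketch below is illustrative only: `squared_log_error` is a hypothetical helper, and the `log` parameter is added here just to show how the choice of logarithm base changes the numbers without changing the conclusion.

```python
import numpy as np

def squared_log_error(y, y_hat, log=np.log):
    """Squared log error for one pair; `log` selects the logarithm base."""
    return (log(y + 1.0) - log(y_hat + 1.0)) ** 2

y = 20.0

# Natural logarithm
under = squared_log_error(y, 10.0)   # prediction 10 below the true value
over = squared_log_error(y, 30.0)    # prediction 10 above the true value

# Base-10 logarithm, for comparison
under10 = squared_log_error(y, 10.0, log=np.log10)
over10 = squared_log_error(y, 30.0, log=np.log10)

print(f"natural log: under={under:.5f} over={over:.5f}")
print(f"base-10 log: under={under10:.5f} over={over10:.5f}")
```

With either base, the underestimate receives roughly 2.8 times the penalty of the overestimate, even though both predictions are off by 10.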
MSLE relaxes the punishing effect of large differences when the predicted values are large.
MSLE as a loss measure may be more appropriate when the model directly predicts an unscaled quantity.

Use of Mean squared logarithmic error

MSLE, or its square root RMSLE (Root Mean Squared Logarithmic Error), is usually used when you don't want to heavily penalize large differences between the predicted and actual values when both are big numbers.
Example: You want to predict how many visitors a restaurant will receive in the future. The number of future visitors is a continuous value, so this is a regression problem, and MSLE can be used as the loss function.
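As a rough illustration of the restaurant example, the sketch below evaluates MSLE and RMSLE on a handful of visitor counts using scikit-learn's `mean_squared_log_error`. The visitor numbers are made up for this example and do not come from any real dataset.

```python
import numpy as np
from sklearn.metrics import mean_squared_log_error

# Hypothetical daily visitor counts (made up for illustration)
actual_visitors = np.array([12, 0, 45, 230, 18])
predicted_visitors = np.array([10, 2, 50, 180, 25])

# scikit-learn applies log(1 + x) to both arrays before averaging squared errors
msle = mean_squared_log_error(actual_visitors, predicted_visitors)
rmsle = np.sqrt(msle)

# Manual computation of the same quantity, for comparison
manual = np.mean((np.log1p(actual_visitors) - np.log1p(predicted_visitors)) ** 2)

print(f"MSLE={msle:.4f} RMSLE={rmsle:.4f} manual={manual:.4f}")
```

Note that `mean_squared_log_error` raises an error if any target or prediction is negative, which is one reason this loss suits count-like targets such as visitor numbers.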

Implementation of MSLE using Python

We can implement Mean Squared Logarithmic error on any regression problem as follows:
# mlp for regression with msle loss function
from sklearn.datasets import make_regression
from sklearn.preprocessing import StandardScaler
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
from matplotlib import pyplot

# generate regression dataset
X, y = make_regression(n_samples=1000, n_features=20, noise=0.1, random_state=1)
# standardize dataset
X = StandardScaler().fit_transform(X)
y = StandardScaler().fit_transform(y.reshape(len(y),1))[:,0]
# split into train and test
n_train = 500
trainX, testX = X[:n_train, :], X[n_train:, :]
trainy, testy = y[:n_train], y[n_train:]
# define model
model = Sequential()
model.add(Dense(25, input_dim=20, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(1, activation='linear'))
# recent Keras versions use `learning_rate`; the old `lr` argument is deprecated
opt = SGD(learning_rate=0.01, momentum=0.9)
model.compile(loss='mean_squared_logarithmic_error', optimizer=opt, metrics=['mse'])

# fit model
history = model.fit(trainX, trainy, validation_data=(testX, testy), epochs=100, verbose=0)
# evaluate the model
_, train_mse = model.evaluate(trainX, trainy, verbose=0)
_, test_mse = model.evaluate(testX, testy, verbose=0)
print('Train: %.3f, Test: %.3f' % (train_mse, test_mse))

# plot loss during training
pyplot.subplot(211)
pyplot.title('Loss')
pyplot.plot(history.history['loss'], label='train')
pyplot.plot(history.history['val_loss'], label='test')
pyplot.legend()

# plot mse during training
pyplot.subplot(212) 
pyplot.title('Mean Squared Error') 
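# the history key is 'mse' because metrics=['mse'] was passed to compile();
# some older Keras releases recorded it as 'mean_squared_error' instead
pyplot.plot(history.history['mse'], label='train')
pyplot.plot(history.history['val_mse'], label='test')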
pyplot.legend()
pyplot.show()

Output:

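Running the example prints the mean squared error of the model on the train and test datasets (MSE is tracked as a metric while MSLE is used as the loss). Your exact numbers may vary because of the random weight initialization.
Train: 0.175, Test: 0.190
It then plots the loss and the mean squared error on the train and test datasets over the training epochs, as shown below:
[Figure: Plot of training and testing loss]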
After reading this article, you should now understand the importance of Mean Squared Logarithmic Error (MSLE). For more blogs and courses on data science, machine learning, artificial intelligence and new technologies, do visit us at InsideAIML.
Thanks for reading…
    