Why is my F1 score different when I calculate it manually vs. the value returned by sklearn.metrics?

By Ani99, 2 years ago

import math

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix, f1_score, classification_report

dataset = pd.read_csv('diabetes-data.csv')

# These columns cannot physically be zero, so treat 0 as missing
# and impute it with the column mean.
zero_not_accepted = ['Glucose', 'BloodPressure', 'SkinThickness', 'BMI', 'Insulin']

for column in zero_not_accepted:
    dataset[column] = dataset[column].replace(0, np.nan)
    mean = int(dataset[column].mean(skipna=True))
    dataset[column] = dataset[column].replace(np.nan, mean)

# First 8 columns are the features, the last column is the label
X = dataset.iloc[:, 0:8]
y = dataset.iloc[:, 8]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0, test_size=0.2)

print(X_test)

# Standardize the features (fit the scaler on the training set only)
sc_X = StandardScaler()
X_train = sc_X.fit_transform(X_train)
X_test = sc_X.transform(X_test)

classifier = KNeighborsClassifier(n_neighbors=11, p=2, metric="euclidean")

# Rule of thumb for choosing k: sqrt of the test-set size (~12.4), rounded to an odd 11
math.sqrt(len(y_test))

classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)

My final confusion matrix is:

[[94 13]
 [15 32]]

This is where it gets confusing: if I calculate the F1 score manually, I get 0.8704, but in Python f1_score(y_test, y_pred) returns 0.6956. Can anyone please explain what the issue is?
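
For reference, here is a minimal sketch of how both numbers can be reproduced from the confusion matrix above (the f1_from_counts helper is just for illustration; sklearn's confusion_matrix puts true labels on the rows and predicted labels on the columns, in label order [0, 1]):

cm = [[94, 13],
      [15, 32]]   # rows = true label, columns = predicted label

def f1_from_counts(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Class 0 as the positive class: TP=94, FP=15, FN=13
print(f1_from_counts(cm[0][0], cm[1][0], cm[0][1]))   # ~0.8704, the value I get manually

# Class 1 as the positive class: TP=32, FP=13, FN=15
# (for binary labels, f1_score uses pos_label=1 by default)
print(f1_from_counts(cm[1][1], cm[0][1], cm[1][0]))   # ~0.6957, the value f1_score returned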

Additional information: I tried printing classification_report(y_test, y_pred) and this is the output:

Classification Report:

               precision    recall  f1-score   support

           0       0.86      0.88      0.87       107
           1       0.71      0.68      0.70        47

    accuracy                           0.82       154
   macro avg       0.79      0.78      0.78       154
weighted avg       0.82      0.82      0.82       154
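
If it helps, I believe the per-class and averaged rows of the report correspond to the pos_label / average parameters of sklearn.metrics.f1_score, roughly like this (using the y_test and y_pred from the code above; the values in the comments are what I expect given the confusion matrix):

print(f1_score(y_test, y_pred, pos_label=0))          # per-class F1 for label 0 (~0.87)
print(f1_score(y_test, y_pred, pos_label=1))          # per-class F1 for label 1 (~0.70), the default
print(f1_score(y_test, y_pred, average='macro'))      # unweighted mean of the two (~0.78)
print(f1_score(y_test, y_pred, average='weighted'))   # mean weighted by support (~0.82)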


