Multiclass ROC curve #98
This also affects the precision-recall curve.
We support multi-class for both (https://sklearn-evaluation.readthedocs.io/en/latest/api/plot.html#sklearn_evaluation.plot.precision_recall). I remember implementing it, but the docs don't have examples, so either I never actually implemented it or the input format you have is incorrect. In any case, we should address this. Can you paste the full traceback?
Full stack trace:
Regardless of the error, there should be a concrete usage example in the user guide (better docs).
Yep, we're missing an example here. The problem is the format, so we need both an example and to one-hot encode the user's input in case they pass one like yours (and possibly show a warning saying we did the one-hot implicitly).
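A minimal sketch of what the implicit one-hot encoding plus warning suggested above could look like. The helper name and signature are hypothetical, not sklearn-evaluation's actual API:

```python
import warnings

import numpy as np


def one_hot_with_warning(y, n_classes):
    """Hypothetical helper: if ``y`` is a 1-D vector of integer class
    labels, one-hot encode it and warn that the conversion was implicit;
    2-D inputs are returned unchanged."""
    y = np.asarray(y)
    if y.ndim == 1:
        warnings.warn(
            "input was one-hot encoded implicitly; "
            "pass a 2-D array to avoid this"
        )
        encoded = np.zeros((y.size, n_classes))
        encoded[np.arange(y.size), y] = 1
        return encoded
    return y
```

For example, `one_hot_with_warning([0, 2, 1], 3)` returns a 3x3 matrix with a single 1 per row and emits the warning.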
This user story has 3 action items:
@edublancas I'm not sure I completely understood this. I tried to skip these checks by hardcoding some variables, but eventually this fails here (y_score is in the wrong shape). Based on this, I tried to one-hot encode the input. Please let me know your thoughts on this.
Please open a PR so I can run your code.
@edublancas I just used the same example @idomic used, but with a one-hot encoded prediction array.

Current example:

```python
import pandas as pd
import numpy as np
import seaborn as sns
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn import tree
from sklearn_evaluation import plot

df = sns.load_dataset('penguins')
df.dropna(inplace=True)

Y = df.species.map({'Adelie': 0, 'Chinstrap': 1, 'Gentoo': 2})
df.drop('species', inplace=True, axis=1)

se = pd.get_dummies(df['sex'], drop_first=True)
df = pd.concat([df, se], axis=1)
df.drop('sex', axis=1, inplace=True)

le = LabelEncoder()
df['island'] = le.fit_transform(df['island'])
X = df

X_train, X_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.3, random_state=40)

dtc = tree.DecisionTreeClassifier()
dt_model = dtc.fit(X_train, y_train)
predictions = dt_model.predict(X_test)

# one-hot encode the predictions and run plot.ROC
predictions_one_hot = np.zeros((predictions.size, predictions.max() + 1))
predictions_one_hot[np.arange(predictions.size), predictions] = 1

plot.ROC.from_raw_data(y_test, predictions_one_hot)
```

I'm not sure how to implement this in the library, since I don't know when and whether I should one-hot encode the input.
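As a side note, the manual `np.zeros`/`np.arange` one-hot step in the snippet above can also be done with scikit-learn's `label_binarize` (assuming integer labels 0 through 2, as in the penguins example):

```python
import numpy as np
from sklearn.preprocessing import label_binarize

predictions = np.array([0, 2, 1, 2])

# produces the same 0/1 matrix as the manual np.zeros / np.arange approach
one_hot = label_binarize(predictions, classes=[0, 1, 2])
```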
Ok, so what we want is this format:

this one:

and this one (I didn't remember this one; these are the output probabilities per class):

to produce the same output. If I understand correctly, the first one breaks because it falls under this check. Let me know if this clarifies things @yafimvo
@edublancas yes, I wasn't sure if this check is valid or not.
I see you built a one-hot encoding function, and I just found sklearn has one (which I think is the same one that we have in our codebase). Then, we can change this check to a more generic version that verifies whether the number of unique values is 2.
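The "more generic version" of the check could be as simple as counting unique values. A sketch (the helper name is hypothetical):

```python
import numpy as np


def is_binary_target(y):
    # generic check: a target is binary iff it contains exactly
    # two unique values, regardless of dtype or encoding
    return len(np.unique(y)) == 2
```

This works for integer, float, and string labels alike, which is what makes it more generic than testing for a specific pair of values.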
Let's also document in the docstring that all three formats are valid.
I just realized I made a mistake when describing the issue. ROC takes two inputs: y_test and y_score. For y_test, we should accept (let's call this format "classes"):

and (let's call this format "one-hot encoded classes"):

However, for y_score, the only valid format is (let's call this "scores"):

since ROC needs the raw scores (0-1) for plotting. I think we should also do some validation. For example, if the user passes "scores" to y_test, we should throw an error. And if the user passes "classes" or "one-hot encoded classes" as y_score, we should also throw an error (and tell the user how they can generate the scores).
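A rough sketch of the validation described above. The format-detection heuristic and the function names are assumptions, not the library's actual implementation, and the heuristic is inherently ambiguous for degenerate score matrices (e.g. a probability matrix containing only 0s and 1s looks like one-hot labels):

```python
import numpy as np


def detect_format(y):
    """Hypothetical detector for the three formats discussed:
    'classes' (1-D labels), 'one-hot encoded classes', or 'scores'."""
    y = np.asarray(y)
    if y.ndim == 1:
        return "classes"
    # 2-D, only 0s and 1s, exactly one 1 per row -> one-hot
    if np.isin(y, (0, 1)).all() and (y.sum(axis=1) == 1).all():
        return "one-hot encoded classes"
    return "scores"


def validate_roc_input(y_test, y_score):
    # raise on the invalid combinations described in the comment above
    if detect_format(y_test) == "scores":
        raise ValueError("y_test must contain classes, not scores")
    if detect_format(y_score) != "scores":
        raise ValueError("y_score must contain raw scores; "
                         "hard class labels are not valid here")
```

For example, `validate_roc_input([0, 1], [[1, 0], [0, 1]])` would raise, while passing a genuine probability matrix as y_score would not.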
@edublancas I pushed a PR yesterday that allows passing all these inputs without the errors you mentioned. Should I change it?
Yes, please change it. Passing classes or one-hot encoded classes as y_score to ROC is a methodological mistake; that's why we're seeing these weird ROC curves. The methods you implemented are still valuable; we can use them in other plots, so please don't remove them, and ensure they're documented. For example, we can use them in the confusion matrix, since there we don't require scores, but predictions. One question: how are you converting scores to the binary format? (Which threshold are you using?)
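For reference, two common conventions for turning scores into hard labels (not necessarily what the PR does): per-row argmax for multiclass scores, which needs no explicit threshold, and a 0.5 cutoff for binary scores:

```python
import numpy as np

# multiclass: take the argmax per row; no explicit threshold is needed
scores = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.3, 0.6]])
hard_labels = scores.argmax(axis=1)

# binary: a fixed cutoff (0.5 here) is the usual convention
binary_scores = np.array([0.4, 0.9])
binary_labels = (binary_scores >= 0.5).astype(int)
```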
I used
The ROC curve is commonly used to compare the performance of models. It is typically applied to binary classification, but it can also be extended to multiclass classification via averaging methods.
When running

```python
plot.roc(y_test, prediction1)
```

on more than 2 classes, it fails with:

```
ValueError: multiclass format is not supported
```

We should add support for it.
I tried it with the ROC curve, but this issue also applies to the PR curve:

```python
plot.precision_recall(y_test, prediction1)
```
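For readers hitting this error: the snippets above pass hard predictions, but the ROC plot needs per-class probabilities (the "scores" format discussed in this thread). A self-contained sketch with plain scikit-learn, using `load_iris` as a stand-in for the penguins data; the resulting `y_score` is the kind of input `plot.ROC.from_raw_data(y_test, y_score)` should receive as its second argument:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# per-class probabilities: the "scores" format ROC plotting expects
y_score = clf.predict_proba(X_test)

# one-vs-rest: binarize y_test and compute one ROC curve per class
y_test_bin = label_binarize(y_test, classes=[0, 1, 2])
aucs = [auc(*roc_curve(y_test_bin[:, k], y_score[:, k])[:2])
        for k in range(3)]
```

Each entry of `aucs` is the one-vs-rest AUC for one class; averaging them gives the macro-averaged score mentioned in the issue description.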