# How to Plot ROC Curve in Python

The Receiver Operating Characteristic (ROC) curve is a graphical plot for evaluating the performance of binary classification models such as logistic regression, support vector machines, etc.

The ROC curve visualizes the trade-off between sensitivity (true positive rate) and the false positive rate (1 - specificity) across all possible threshold values.

A model with good predictability will have an ROC curve that extends towards the upper-left corner of the plot (high true positive rate and low false positive rate). A perfect prediction model will have an ROC curve that passes through the point with `true positive rate (TPR)` = 1 and `false positive rate (FPR)` = 0.
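
To make the two rates concrete, here is a minimal sketch that computes TPR and FPR at a few thresholds. The helper `tpr_fpr()` and the toy labels and scores are hypothetical and only serve as an illustration; they are not part of the breast cancer example.

```python
import numpy as np

def tpr_fpr(y_true, y_score, threshold):
    """Return (TPR, FPR) when scores >= threshold are predicted as positive."""
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(y_score) >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
    fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives
    fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives
    tn = np.sum((y_pred == 0) & (y_true == 0))  # true negatives
    return tp / (tp + fn), fp / (fp + tn)

# toy labels and scores, made up for illustration
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

for threshold in (0.3, 0.5, 0.9):
    tpr, fpr = tpr_fpr(y_true, y_score, threshold)
    print(f"threshold={threshold}: TPR={tpr:.2f}, FPR={fpr:.2f}")
```

Lowering the threshold raises both rates, and each threshold contributes one point to the ROC curve.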

In addition, the ROC curve summarises the model predictability based on the area under the ROC curve (AUC). AUC ranges from 0 to 1, and a model with a higher AUC (close to 1) has higher predictability.
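
As a rough illustration of what "area under the curve" means, the sketch below integrates the ROC points with the trapezoidal rule and compares the result with `roc_auc_score()` from `sklearn`. The toy labels and scores are made up for illustration.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# toy labels and scores, made up for illustration
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

fpr, tpr, _ = roc_curve(y_true, y_score)

# trapezoidal rule: width of each FPR step times the mean TPR over that step
area = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2)

print(area, roc_auc_score(y_true, y_score))
```

For this toy data both values are 0.75.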

In Python, the ROC curve can be plotted using the `roc()` function from the `bioinfokit` package.

We will use logistic regression as an example to plot the ROC curve in Python.

## Getting the dataset

We will use the sample breast cancer dataset for fitting the logistic regression model.

This sample breast cancer dataset includes four features (predictors) and a binary outcome [patient is healthy (0) or has cancer (1)].

```
# import package
import pandas as pd
# load dataset
df = pd.read_csv("https://reneshbedre.github.io/assets/posts/logit/breast_cancer_sample_2.csv")
# view first few rows
# Classification is the outcome with two levels: cancer (1) or healthy (0)
df.head(2)
# output
#    Age        BMI  Insulin  Leptin  Classification
# 0   48  23.500000    2.707  8.8071               0
# 1   83  20.690495    3.115  8.8438               0
```

## Train-Test split

Split the dataset into training and test datasets. We will use the `train_test_split()` function from the `sklearn` package to assign 70% of the data to the training dataset and 30% to the test dataset.

The training dataset will be used for training the model and the test dataset will be used for prediction.

```
# import package
from sklearn.model_selection import train_test_split
# split into training and testing
df_train, df_test = train_test_split(df, train_size = 0.7, random_state = 0)
```
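
With a small dataset, it can be worth checking the split sizes and the class balance in each part. The sketch below does this on a synthetic data frame (a hypothetical stand-in, not the breast cancer sample). It also passes `stratify`, an optional argument that the split above does not use, to keep the class proportions similar in both parts.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# synthetic stand-in data: 100 rows, 40% positives (made up for illustration)
toy = pd.DataFrame({
    "Age": np.arange(100),
    "Classification": [1] * 40 + [0] * 60,
})

# same 70/30 split, with stratification on the outcome (assumption, optional)
toy_train, toy_test = train_test_split(
    toy, train_size=0.7, random_state=0, stratify=toy["Classification"]
)

# number of rows in each part
print(len(toy_train), len(toy_test))
# fraction of positive cases in each part
print(toy_train["Classification"].mean(), toy_test["Classification"].mean())
```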

## Fit the logistic regression model and perform prediction

Fit the logistic regression model using the training dataset and calculate the predicted probabilities for the test dataset,

```
# import package
from sklearn.linear_model import LogisticRegression
# get X and y
X_train = df_train[["Age", "BMI", "Insulin", "Leptin"]]
y_train = df_train["Classification"]
# fit the model
fit = LogisticRegression(random_state = 0).fit(X_train, y_train)
# perform prediction
# get X and y
X_test = df_test[["Age", "BMI", "Insulin", "Leptin"]]
y_test = df_test["Classification"]
# calculate predicted probabilities
pred_probs = fit.predict_proba(X_test)[:, 1]
```
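
The `[:, 1]` index works because `predict_proba()` returns one column per class, ordered as in the model's `classes_` attribute. The sketch below checks this on synthetic data generated with `make_classification()` (a hypothetical stand-in for the breast cancer sample).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# synthetic binary data with four features (made up for illustration)
X, y = make_classification(n_samples=200, n_features=4, n_informative=2,
                           n_redundant=0, random_state=0)

model = LogisticRegression(random_state=0).fit(X, y)
probs = model.predict_proba(X)

print(model.classes_)  # class labels, in column order
print(probs.shape)     # one row per sample, one column per class

# look up the column of the positive class (1) instead of hard-coding it
pos_col = list(model.classes_).index(1)
pred_probs = probs[:, pos_col]
```

Each row of `probs` sums to 1, so the column for class 1 is the predicted probability of the positive outcome.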

## Plot ROC curve

We will use the `roc()` function from the `bioinfokit` package to plot the ROC curve. The ROC plot requires TPR (sensitivity) and FPR (1 - specificity) values.

Calculate the FPR, TPR, and AUC for the ROC plot,

```
# import package
from sklearn.metrics import roc_curve, roc_auc_score
# calculate FPR and TPR and AUC
fpr, tpr, thresholds = roc_curve(y_true = y_test, y_score = pred_probs)
auc = roc_auc_score(y_true = y_test, y_score = pred_probs)
```
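
To see what `roc_curve()` returns, it can help to put the three arrays side by side. The sketch below does this for toy labels and scores (made up for illustration); the name `roc_table` is hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import roc_curve

# toy labels and scores, made up for illustration
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

fpr, tpr, thresholds = roc_curve(y_true, y_score)

# one row per threshold; each row is one point on the ROC curve
roc_table = pd.DataFrame({"threshold": thresholds, "fpr": fpr, "tpr": tpr})
print(roc_table)
```

The thresholds are sorted in decreasing order. The first row is an artificial threshold above every score, so that no sample is predicted as positive and the curve starts at FPR = 0 and TPR = 0.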

Now, plot the ROC curve,

```
# import package
from bioinfokit.visuz import stat
# plot ROC
stat.roc(fpr = fpr, tpr = tpr, auc = auc, shade_auc = True, per_class = True,
    legendpos = 'upper center', legendanchor = (0.5, 1.08), legendcols = 3)
```
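
If `bioinfokit` is not available, a similar plot can be sketched directly with `matplotlib`. This is an alternative to `stat.roc()`, not what it does internally. The toy labels and scores and the output file name `roc_matplotlib.png` are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: the figure is written to a file
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

# toy labels and scores, made up for illustration
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

fpr, tpr, _ = roc_curve(y_true, y_score)
auc = roc_auc_score(y_true, y_score)

fig, ax = plt.subplots(figsize=(5, 5))
ax.plot(fpr, tpr, label=f"ROC (AUC = {auc:.2f})")
# diagonal reference line: expected curve for random guessing
ax.plot([0, 1], [0, 1], linestyle="--", color="grey", label="Chance")
# shade the area under the ROC curve
ax.fill_between(fpr, tpr, alpha=0.2)
ax.set_xlabel("False positive rate (1 - specificity)")
ax.set_ylabel("True positive rate (sensitivity)")
ax.legend(loc="lower right")
fig.savefig("roc_matplotlib.png", dpi=150)
```

In the tutorial workflow, `fpr`, `tpr`, and `auc` would come from `y_test` and `pred_probs` instead of the toy data.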

Based on the ROC curve and AUC (0.56), the model has poor predictability: the AUC is only slightly above 0.5, the value expected from random guessing. The model will not perform well in distinguishing healthy patients from cancer patients.

This work is licensed under a Creative Commons Attribution 4.0 International License
