# Logistic Regression In ML

Logistic regression is a method for solving classification problems, where we work with discrete categories.

**Some examples of classification problems:**

- Spam Emails versus not Spam Emails
- Loan Default (Yes/No)
- Disease Diagnosis (Yes/No)

All the above examples are of Binary Classification.

## Logistic Regression for Binary Classification

The convention for binary classification is to label the two classes 0 and 1.

- Logistic regression outputs probabilities.
- If the probability P is greater than 0.5, the sample is labelled 1.
- If the probability P is less than 0.5, the sample is labelled 0.

### Probability Threshold

By default, the logistic regression probability threshold is 0.5.

```python
class sklearn.linear_model.LogisticRegression(penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None, solver='warn', max_iter=100, multi_class='warn', verbose=0, warm_start=False, n_jobs=None, l1_ratio=None)
```
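As a minimal sketch of the default threshold behaviour (the data below is synthetic and assumed purely for illustration):

```python
# Fit a model on synthetic, well-separated data and apply
# the default 0.5 probability threshold by hand.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[1.0], [2.0], [3.0], [8.0], [9.0], [10.0]])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression()
model.fit(X, y)

proba = model.predict_proba(X)[:, 1]   # P(class = 1) for each sample
labels = (proba > 0.5).astype(int)     # the default 0.5 threshold
```

`model.predict(X)` applies the same 0.5 cutoff internally, so `labels` matches its output here.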

### Regularization Parameter: C

C controls the inverse of the regularization strength.

- A large C (weak regularization) can lead to an overfit model.
- A small C (strong regularization) can lead to an underfit model.
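A quick sketch of this effect (the data and C values here are assumed for illustration): stronger regularization shrinks the learned coefficients toward zero.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

weak_reg = LogisticRegression(C=100.0).fit(X, y)   # large C -> weak regularization
strong_reg = LogisticRegression(C=0.01).fit(X, y)  # small C -> strong regularization

# The strongly regularized model has a smaller coefficient magnitude.
print(abs(weak_reg.coef_[0][0]), abs(strong_reg.coef_[0][0]))
```

In practice, C is usually tuned with cross-validation (e.g. `GridSearchCV` or `LogisticRegressionCV`) rather than set by hand.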

### Penalty hyperparameter

In addition to C, logistic regression has a '**penalty**' hyperparameter that specifies whether to use 'l1' or 'l2' regularization.
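A short sketch of setting the penalty (the tiny dataset is assumed for illustration). Note that in scikit-learn the 'l1' penalty requires a solver that supports it, such as 'liblinear' or 'saga':

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

l2_model = LogisticRegression(penalty='l2').fit(X, y)                     # the default
l1_model = LogisticRegression(penalty='l1', solver='liblinear').fit(X, y) # L1 needs liblinear/saga
```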

### Using a Logistic Regression Model instead of a Linear Regression Model

- We can't fit a normal linear regression model to binary groups; it won't lead to a good fit.
- Instead, we transform the linear regression output into a logistic regression curve.

### Sigmoid Function

The sigmoid function (also called the logistic function) takes in any value and outputs a value between 0 and 1.

We can take our linear regression solution and place it inside a sigmoid function.

This results in a probability between 0 and 1 of belonging to class 1.

We can set the cutoff point at 0.5: anything below it is labelled class 0, and anything above it class 1.

So we use the logistic function to output a value between 0 and 1, and based on this probability we assign a class.
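The transformation described above can be sketched directly, since the sigmoid has a closed form, sigmoid(z) = 1 / (1 + e^(-z)):

```python
import math

def sigmoid(z):
    """Map any real number z to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# A linear model's output z = w*x + b, fed through the sigmoid,
# becomes a probability of belonging to class 1.
print(sigmoid(0))   # 0.5 -- exactly at the decision boundary
```

Large negative inputs map close to 0 and large positive inputs map close to 1, which is what lets the 0.5 cutoff separate the two classes.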

## Model Evaluation

After training a logistic regression model on training data, you evaluate its performance on test data.

You can use a confusion matrix to evaluate classification models.

**For example**, imagine a test for the presence of a disease:

No = negative test = False = 0

Yes = positive test = True = 1

Basic terminology:

- True Positives (TP)
- True Negatives (TN)
- False Positives (FP)
- False Negatives (FN)

Accuracy:

- Overall, how often is the model correct?
- For example, with 165 test samples where TP + TN = 150: Accuracy = (TP + TN) / Total = 150/165 ≈ 0.91

Misclassification Rate (Error Rate):

- Overall, how often is the model wrong?
- Error = (FP + FN) / Total = 15/165 ≈ 0.09

We define two types of error in the confusion matrix:

- Type I error (False Positive)
- Type II error (False Negative)
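The arithmetic above can be checked with a short worked example. The individual counts below (TP = 100, TN = 50, FP = 10, FN = 5) are assumed so that they match the 165-sample totals used above:

```python
# Hypothetical confusion-matrix counts (assumed for illustration).
TP, TN, FP, FN = 100, 50, 10, 5
total = TP + TN + FP + FN        # 165

accuracy = (TP + TN) / total     # 150/165 ≈ 0.91
error_rate = (FP + FN) / total   # 15/165  ≈ 0.09

print(round(accuracy, 2), round(error_rate, 2))
```

Note that accuracy and error rate always sum to 1, since every prediction is either correct or wrong.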

## Code

Find the code here