Paras Patidar

Logistic Regression In ML

Logistic regression as a method for classification problems.


Some examples of classification problems:

  • Spam Emails versus not Spam Emails
  • Loan Default (Yes/No)
  • Disease diagnosis

All of the above are examples of binary classification.

Logistic regression allows us to solve classification problems where we work with discrete categories.

Logistic Regression for Binary Classification

The convention for binary classification is to have two classes, 0 and 1.

  • Logistic regression outputs probabilities.
  • If the probability 'P' is greater than 0.5, the data is labelled as 1.
  • If the probability 'P' is less than 0.5, the data is labelled as 0.

Probability Threshold

By default, the probability threshold for logistic regression is 0.5.

class sklearn.linear_model.LogisticRegression(penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None, solver='warn', max_iter=100, multi_class='warn', verbose=0, warm_start=False, n_jobs=None, l1_ratio=None)
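
As a minimal sketch of the API shown above (the dataset here is synthetic and only for illustration), you can fit a model, look at its predicted probabilities, and check that the default 0.5 threshold is what produces the class labels:

# Minimal sketch: fit LogisticRegression and apply the default 0.5 threshold
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical toy binary-classification data
X, y = make_classification(n_samples=500, n_features=4, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = LogisticRegression()            # default settings: penalty='l2', C=1.0
model.fit(X_train, y_train)

# predict_proba returns P(class 0) and P(class 1) for each sample
probs = model.predict_proba(X_test)[:, 1]

# predict() applies the 0.5 threshold internally
labels = (probs > 0.5).astype(int)
print((labels == model.predict(X_test)).all())   # should print True: same labelling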

Regularization Parameter : C

C is the inverse of the regularization strength.

  • A large C (weaker regularization) can lead to an overfit model
  • A small C (stronger regularization) can lead to an underfit model, as illustrated in the sketch below
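
As a rough, hypothetical illustration (the dataset below is synthetic, not from this article), smaller values of C shrink the learned coefficients more aggressively, while larger values let them grow:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

for C in (100.0, 1.0, 0.01):
    clf = LogisticRegression(C=C, max_iter=1000).fit(X, y)
    # Smaller C -> stronger regularization -> smaller coefficients (risk of underfitting)
    # Larger C  -> weaker regularization  -> larger coefficients  (risk of overfitting)
    print(f"C={C:>6}: mean |coef| = {np.abs(clf.coef_).mean():.3f}")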

Penalty hyperparameter

In addition to C, logistic regression has a 'penalty' hyperparameter which specifies whether to use 'L1' or 'L2' regularization.
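
A short sketch of switching the penalty (note that 'l1' requires a compatible solver such as 'liblinear' or 'saga'; the data below is synthetic):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=20, n_informative=5, random_state=0)

# L2 regularization shrinks all coefficients towards zero
l2_model = LogisticRegression(penalty='l2', C=1.0).fit(X, y)

# L1 regularization can drive some coefficients exactly to zero (feature selection)
l1_model = LogisticRegression(penalty='l1', C=1.0, solver='liblinear').fit(X, y)

print("non-zero coefficients (L2):", (l2_model.coef_ != 0).sum())
print("non-zero coefficients (L1):", (l1_model.coef_ != 0).sum())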

Using a Logistic Regression Model Instead of a Linear Regression Model

  • We can't use a normal linear regression model on binary groups; it won't lead to a good fit.
  • Instead, we can transform our linear regression into a logistic regression curve.

Sigmoid Function

The sigmoid function (also called the logistic function), σ(z) = 1 / (1 + e^(−z)), takes in any value and outputs a value between 0 and 1.

We can take our linear regression solution and place it into the sigmoid function.

This results in a probability between 0 and 1 of belonging to class 1.

We can set the cutoff point at 0.5: anything below it will be labelled as class 0, and anything above it as class 1.

So, we use the logistic function to output a value ranging from 0 to 1, and based on this probability we assign a class.
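
As a minimal sketch of the idea (a hand-written sigmoid, not the scikit-learn implementation, with made-up linear outputs):

import numpy as np

def sigmoid(z):
    """Map any real value z to a probability between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical outputs of the linear part (b0 + b1*x) for five samples
linear_output = np.array([-4.0, -1.0, 0.0, 2.0, 5.0])

# Pass them through the sigmoid to get the probability of class 1
probabilities = sigmoid(linear_output)

# Apply the 0.5 cutoff to assign classes
classes = (probabilities > 0.5).astype(int)

print(probabilities)   # approx. [0.018 0.269 0.5   0.881 0.993]
print(classes)         # [0 0 0 1 1]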

Model Evaluation

After you have trained a logistic regression model on some training data, you evaluate its performance on test data.

You can use a confusion matrix to evaluate classification models.

For example, imagine a test for the presence of a disease:

No = Negative test = False = 0

Yes = Positive test = True = 1

Basic terminology:

  • True Positives (TP)
  • True Negatives (TN)
  • False Positives (FP)
  • False Negatives (FN)

Accuracy

  • Overall, how often is the model correct?
  • Accuracy = (TP+TN)/Total = 150/165 = 0.91

Misclassification Rate (Error Rate):

  • Overall, how often is the model wrong?
  • Error = (FP+FN)/Total = 15/165 = 0.09

We define two types of error in the confusion matrix:

  • Type I error (False Positive)
  • Type II error (False Negative)
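
As a sketch of computing these quantities with scikit-learn (the labels below are hypothetical, chosen only so the counts reproduce the 165-sample example used above):

import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score

# Hypothetical labels giving TN=50, FP=10, FN=5, TP=100 (165 samples total)
y_true = np.concatenate([np.zeros(50), np.zeros(10), np.ones(5), np.ones(100)])
y_pred = np.concatenate([np.zeros(50), np.ones(10),  np.zeros(5), np.ones(100)])

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
total = tn + fp + fn + tp

accuracy = (tp + tn) / total        # 150 / 165 ≈ 0.91
error_rate = (fp + fn) / total      # 15 / 165  ≈ 0.09

print(confusion_matrix(y_true, y_pred))
print("Accuracy:", round(accuracy, 2))
print("Error rate:", round(error_rate, 2))
print("Same as accuracy_score:", accuracy_score(y_true, y_pred))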

Code:

Find the code here


Check out the article on Linear Regression in ML.