Naive Bayes clearly explained with maths and scikit-learn | by Yoann Mocquin | Mar, 2024


Solving the iris dataset with a Gaussian approach in scikit-learn.

In this post, we’ll delve into a particular family of classifiers called naive Bayes classifiers. These are methods that rely on Bayes’ theorem and the naive assumption that every pair of features is conditionally independent given a class label. If this doesn’t make sense to you, keep reading!

As a toy example, we’ll use the well-known iris dataset (CC BY 4.0 license) and a particular type of naive Bayes classifier called the Gaussian Naive Bayes classifier. Remember that the iris dataset consists of 4 numerical features, and the target can be any of three types of iris flower (setosa, versicolor, virginica).
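As a preview of where we’re heading, here is a minimal scikit-learn sketch that fits this classifier on the iris data (the split size and random_state are arbitrary choices of mine, so your exact accuracy may differ):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Load the 4 numerical features and the 3-class target
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

clf = GaussianNB()                 # Gaussian Naive Bayes classifier
clf.fit(X_train, y_train)          # estimates per-class feature means and variances
print(clf.score(X_test, y_test))   # accuracy on held-out samples
```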

We’ll decompose the method into the following steps (a from-scratch sketch of the whole pipeline follows the list):

All images by author.
  1. Reviewing Bayes’ theorem: this theorem provides the mathematical formula that allows us to estimate the probability that a given sample belongs to any class.
  2. We can create a classifier, a tool that returns a predicted class for an input sample, by evaluating the probability that this sample belongs to a class, for all classes.
  3. Using the chain rule and the conditional independence hypothesis, we can simplify the probability formula.
  4. Then, to be able to compute the probabilities, we use another assumption: that the feature distributions are Gaussian.
  5. Using a training set, we can estimate the parameters of those Gaussian distributions.
  6. Finally, we have all the tools we need to predict a class for a new sample.
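To make these steps concrete, here is a minimal from-scratch sketch of the whole pipeline, assuming NumPy, SciPy and scikit-learn are available. It is deliberately simplified: scikit-learn’s GaussianNB additionally smooths the variances and works with log-probabilities to avoid numerical underflow.

```python
import numpy as np
from scipy.stats import norm
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

classes = np.unique(y_train)
# Step 5: estimate one Gaussian (mean, std) per class and per feature,
# plus the class priors P(class), from the training set.
means = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
stds = np.array([X_train[y_train == c].std(axis=0) for c in classes])
priors = np.array([np.mean(y_train == c) for c in classes])

def predict(x):
    # Steps 2-4: P(class | x) is proportional to
    # P(class) * product over features i of P(x_i | class),
    # where each per-feature likelihood is a Gaussian density.
    likelihoods = norm.pdf(x, loc=means, scale=stds).prod(axis=1)
    return classes[np.argmax(priors * likelihoods)]

# Step 6: predict a class for every new (test) sample.
preds = np.array([predict(x) for x in X_test])
print((preds == y_test).mean())  # accuracy of the from-scratch classifier
```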

I have lots of new posts like this one coming; remember to subscribe!

Bayes’ theorem is a probability theorem that states the following:
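P(A|B) = P(B|A) × P(A) / P(B)

where: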

  • P(A|B) is the conditional probability that A is true (or A happens) given (or knowing) that B is true (or B happened). It is also called the posterior probability of A given B (posterior: we update the probability that A is true once we know B is true).
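In our classification problem, A plays the role of “the sample belongs to class c” and B plays the role of “the sample has features x”, so the quantity we are after is the posterior P(c|x) = P(x|c) × P(c) / P(x), evaluated for each of the three iris classes.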
