Maximum Likelihood Estimation VS Maximum A Posteriori Estimation
Introduction
In conventional, non-probabilistic machine learning, maximum likelihood estimation (MLE) is one of the most common methods for optimizing a model. In probabilistic machine learning, we more often see maximum a posteriori estimation (MAP) used for the same purpose.
In this blog post, I would like to discuss the connections between the MLE and MAP methods.
Bayes’ Theorem
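Bayes’ theorem is stated mathematically as the following equation.
$$
P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B)}
$$
where $P(A \mid B)$ is the posterior probability, $P(B \mid A)$ is the likelihood probability, $P(A)$ is the prior probability, and $P(B)$ is the marginal probability of $B$.
If $A$ is a discrete variable,
$$
P(B) = \sum_{A} P(B \mid A) P(A)
$$
If $A$ is a continuous variable,
$$
P(B) = \int P(B \mid A) P(A) \, dA
$$
Notice that $P(B)$ does not depend on the value of $A$, so it is a constant with respect to $A$.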
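As a small numerical illustration of the discrete case, here is a minimal sketch. The two coins, their prior probabilities, and their heads probabilities are hypothetical values chosen only for this example; $A$ is which coin was picked and $B$ is the event that a single toss lands heads.

```python
# Hypothetical setup: A is which coin was picked, B is the event "heads".
prior = {"fair": 0.5, "biased": 0.5}        # P(A), assumed for illustration
likelihood = {"fair": 0.5, "biased": 0.9}   # P(B = heads | A), assumed

# Marginal probability: P(B) = sum over A of P(B | A) P(A)
evidence = sum(likelihood[a] * prior[a] for a in prior)

# Posterior: P(A | B) = P(B | A) P(A) / P(B)
posterior = {a: likelihood[a] * prior[a] / evidence for a in prior}

for a, p in posterior.items():
    print(f"P(A = {a} | B = heads) = {p:.4f}")
```

Because `evidence` is the same number for every value of $A$, dividing by it rescales the products $P(B \mid A) P(A)$ so that they sum to one, but it does not change which value of $A$ has the largest posterior.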
Maximum Likelihood Estimation (MLE)
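Maximum likelihood estimation, as its name states, maximizes the likelihood probability $P(B \mid A)$ in Bayes’ theorem with respect to the variable $A$, given that the variable $B$ is observed.
Mathematically, maximum likelihood estimation can be expressed as
$$
A^{\ast}_{\text{MLE}} = \underset{A}{\operatorname{argmax}} \, P(B \mid A)
$$
It is equivalent to optimizing in the log domain, since $P(B \mid A) \geq 0$ and, wherever $P(B \mid A) \neq 0$, the logarithm is a strictly increasing function:
$$
A^{\ast}_{\text{MLE}} = \underset{A}{\operatorname{argmax}} \, \log P(B \mid A)
$$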
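The equivalence between maximizing the likelihood and maximizing its logarithm can be checked numerically. The following is a minimal sketch under assumptions not in the original post: the observation $B$ is a hypothetical "7 heads in 10 coin tosses", $A$ is the heads probability, and the maximization is a plain grid search rather than an analytical solution.

```python
import math

# Hypothetical observation B: 7 heads in 10 tosses of a coin.
heads, tosses = 7, 10

# Candidate values of A (the heads probability) strictly inside (0, 1),
# so the likelihood is never zero and its logarithm is always defined.
grid = [i / 1000 for i in range(1, 1000)]

def likelihood(theta):
    # P(B | A = theta), dropping the binomial coefficient (constant in theta)
    return theta ** heads * (1 - theta) ** (tosses - heads)

def log_likelihood(theta):
    # log P(B | A = theta), up to the same additive constant
    return heads * math.log(theta) + (tosses - heads) * math.log(1 - theta)

theta_mle = max(grid, key=likelihood)          # argmax of the likelihood
theta_mle_log = max(grid, key=log_likelihood)  # argmax of the log-likelihood

print(theta_mle, theta_mle_log)
```

Both searches should land on the same grid point, which for this model also matches the closed-form estimate `heads / tosses`. In practice the log domain is usually preferred because a product of many small probabilities can underflow, while a sum of logarithms does not.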
Maximum A Posteriori Estimation (MAP)
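Maximum a posteriori estimation, as its name states, maximizes the posterior probability $P(A \mid B)$ in Bayes’ theorem with respect to the variable $A$, given that the variable $B$ is observed.
Mathematically, maximum a posteriori estimation can be expressed as
$$
A^{\ast}_{\text{MAP}} = \underset{A}{\operatorname{argmax}} \, P(A \mid B)
$$
It is equivalent to optimizing in the log domain, since $P(A \mid B) \geq 0$ and, wherever $P(A \mid B) \neq 0$, the logarithm is a strictly increasing function:
$$
A^{\ast}_{\text{MAP}} = \underset{A}{\operatorname{argmax}} \, \log P(A \mid B)
$$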
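For a discrete variable $A$, the MAP estimate can be found by computing the whole posterior and taking its largest entry. The sketch below assumes a hypothetical setup that is not in the original post: $A$ is one of three coins with made-up prior probabilities, and $B$ is "7 heads in 10 tosses".

```python
import math

# Hypothetical setup: A is one of three coins, B is "7 heads in 10 tosses".
heads, tosses = 7, 10
p_heads = {"fair": 0.5, "slightly_biased": 0.6, "very_biased": 0.9}  # assumed
prior = {"fair": 0.8, "slightly_biased": 0.15, "very_biased": 0.05}  # P(A), assumed

def likelihood(a):
    # Binomial likelihood P(B | A = a)
    theta = p_heads[a]
    return math.comb(tosses, heads) * theta ** heads * (1 - theta) ** (tosses - heads)

# Posterior P(A | B) via Bayes' theorem
evidence = sum(likelihood(a) * prior[a] for a in prior)
posterior = {a: likelihood(a) * prior[a] / evidence for a in prior}

# MAP estimate in the probability domain and in the log domain
a_map = max(posterior, key=posterior.get)
a_map_log = max(posterior, key=lambda a: math.log(posterior[a]))

print(a_map, a_map_log)
```

With these particular numbers the strong prior on the fair coin outweighs the data, so the MAP estimate is the fair coin even though the slightly biased coin explains the observation better on its own.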
MLE and MAP Relationship
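By applying Bayes’ theorem, we have
$$
P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B)} \propto P(B \mid A) P(A)
$$
Therefore, maximum a posteriori estimation can be expanded as
$$
\begin{aligned}
A^{\ast}_{\text{MAP}} &= \underset{A}{\operatorname{argmax}} \, \log P(A \mid B) \\
&= \underset{A}{\operatorname{argmax}} \, \log \frac{P(B \mid A) P(A)}{P(B)} \\
&= \underset{A}{\operatorname{argmax}} \, \big( \log P(B \mid A) + \log P(A) - \log P(B) \big) \\
&= \underset{A}{\operatorname{argmax}} \, \big( \log P(B \mid A) + \log P(A) \big)
\end{aligned}
$$
The term $\log P(B)$ is dropped in the last step because it is a constant with respect to $A$.
If the prior probability $P(A)$ is a uniform distribution, then $\log P(A)$ is also a constant with respect to $A$, and we have
$$
A^{\ast}_{\text{MAP}} = \underset{A}{\operatorname{argmax}} \, \log P(B \mid A) = A^{\ast}_{\text{MLE}}
$$
Therefore, we can conclude that maximum likelihood estimation is a special case of maximum a posteriori estimation in which the prior probability is a uniform distribution.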
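The special-case relationship can be illustrated with a small grid search. This is a sketch under assumptions of my own choosing: the observation is a hypothetical "7 heads in 10 tosses", the prior over the heads probability is a Beta(a, b) distribution, and Beta(1, 1) plays the role of the uniform prior. The helper names are illustrative only.

```python
import math

# Hypothetical observation B: 7 heads in 10 tosses.
heads, tosses = 7, 10
grid = [i / 1000 for i in range(1, 1000)]  # candidate values of A in (0, 1)

def log_likelihood(theta):
    # log P(B | A = theta), up to an additive constant
    return heads * math.log(theta) + (tosses - heads) * math.log(1 - theta)

def log_beta_prior(theta, a, b):
    # log of the Beta(a, b) density, up to an additive constant;
    # Beta(1, 1) is the uniform distribution on (0, 1)
    return (a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta)

def map_estimate(a, b):
    # argmax over A of log P(B | A) + log P(A)
    return max(grid, key=lambda t: log_likelihood(t) + log_beta_prior(t, a, b))

theta_mle = max(grid, key=log_likelihood)
theta_map_uniform = map_estimate(1, 1)      # uniform prior
theta_map_informative = map_estimate(5, 5)  # prior concentrated around 0.5

print(theta_mle, theta_map_uniform, theta_map_informative)
```

With the uniform prior the log-prior term is identically zero, so the MAP search reduces to the MLE search. With the Beta(5, 5) prior the estimate is pulled from the MLE toward 0.5, close to the closed-form value (7 + 4) / (10 + 8).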
Which One to Use
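Whether to use maximum likelihood estimation or maximum a posteriori estimation in optimization depends on the use case. If we know the probability distributions for both the likelihood probability $P(B \mid A)$ and the prior probability $P(A)$, we can use maximum a posteriori estimation.
However, in many practical optimization problems, we actually don’t know the distribution for the prior probability $P(A)$. In that case, maximum a posteriori estimation cannot be applied, and we can only use maximum likelihood estimation.
For example, suppose we are going to find the optimal parameters for a model. In the model, we have parameter variables $\theta$ and data variables $X$. Given the observed training data, we would like to find the most probable parameters,
$$
\theta^{\ast} = \underset{\theta}{\operatorname{argmax}} \, P(\theta \mid X) = \underset{\theta}{\operatorname{argmax}} \, \big( \log P(X \mid \theta) + \log P(\theta) \big)
$$
As discussed previously, in many models, especially conventional machine learning and deep learning models, we usually don’t know the distribution of $P(\theta)$, so we cannot perform maximum a posteriori estimation. Instead, we perform maximum likelihood estimation,
$$
\theta^{\ast} = \underset{\theta}{\operatorname{argmax}} \, \log P(X \mid \theta)
$$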
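To make the parameter-fitting view concrete, here is a minimal sketch of maximum likelihood estimation for a one-parameter model. Everything specific in it is an assumption for illustration: the data values are made up, the observations are assumed to be i.i.d. Gaussian with known standard deviation `sigma`, and the only parameter $\theta$ is the mean. No prior over $\theta$ is used anywhere.

```python
import math

# Hypothetical training data x_1, ..., x_N
data = [2.1, 1.9, 2.4, 2.0, 1.6]
sigma = 1.0  # assumed known standard deviation

def log_likelihood(theta):
    # log P(X | theta) for i.i.d. Gaussian observations with mean theta
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2)
        - (x - theta) ** 2 / (2 * sigma ** 2)
        for x in data
    )

# Gradient ascent on the log-likelihood; no prior term is involved.
theta = 0.0
learning_rate = 0.1
for _ in range(200):
    grad = sum(x - theta for x in data) / sigma ** 2  # d/dtheta of log-likelihood
    theta += learning_rate * grad

sample_mean = sum(data) / len(data)
print(theta, sample_mean)
```

For this model the maximum likelihood estimate of the mean is the sample mean, so the gradient ascent result should agree with `sample_mean` up to numerical precision. Minimizing a negative log-likelihood loss in a conventional model is the same idea with the sign flipped.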
This is why we often see maximum likelihood estimation, rather than maximum a posteriori estimation, in conventional non-probabilistic machine learning and deep learning models.
https://leimao.github.io/blog/Maximum-Likelihood-Estimation-VS-Maximum-A-Posteriori-Estimation/