Jensen's Inequality


July 5th 2024

Jensen's inequality is a fundamental result in probability theory and statistics that relates the value of a convex function of an expectation to the expectation of the convex function. It states that for a convex function f and a random variable X, the following inequality holds:

f(\mathbb{E}[X]) \leq \mathbb{E}[f(X)]

In other words, the value of the convex function evaluated at the expected value of X is less than or equal to the expected value of the convex function evaluated at X.
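To make the statement concrete, consider the convex function f(x) = x^2 and a random variable X that takes the values 0 and 2 with probability 1/2 each (a constructed toy example):

\mathbb{E}[X] = \tfrac{1}{2}(0) + \tfrac{1}{2}(2) = 1

f(\mathbb{E}[X]) = 1^2 = 1 \leq 2 = \tfrac{1}{2}(0^2) + \tfrac{1}{2}(2^2) = \mathbb{E}[f(X)]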

Mathematical Explanation

Let f be a convex function and X a random variable with probability density function p(x). The expectation of X is given by:

\mathbb{E}[X] = \int x \, p(x) \, dx

Jensen's inequality states that:

f(\mathbb{E}[X]) \leq \mathbb{E}[f(X)]

which can be written as:

f\left(\int x \, p(x) \, dx\right) \leq \int f(x) \, p(x) \, dx
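As a quick numerical check of this integral form (a minimal sketch; it assumes f(x) = x^2 and the standard exponential density p(x) = e^{-x}, neither of which appears in the main demonstration below), both sides can be evaluated with scipy.integrate.quad:

import numpy as np
from scipy.integrate import quad

p = lambda x: np.exp(-x)  # standard exponential density on [0, inf)
f = lambda x: x**2        # a convex function

# Left-hand side argument: E[X]; exact value is 1
E_X, _ = quad(lambda x: x * p(x), 0, np.inf)

# Right-hand side: E[f(X)]; exact value is E[X^2] = 2
E_f_X, _ = quad(lambda x: f(x) * p(x), 0, np.inf)

print(f(E_X), "<=", E_f_X)  # 1.0 <= 2.0, so the inequality holds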

The inequality holds for any convex function f. A function f is said to be convex if for any two points x_1 and x_2 in its domain and any \lambda \in [0, 1], the following inequality holds:

f(\lambda x_1 + (1 - \lambda) x_2) \leq \lambda f(x_1) + (1 - \lambda) f(x_2)

Intuitively, this means that the line segment connecting any two points on the graph of a convex function lies above or on the graph.
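The defining inequality is easy to spot-check numerically. The sketch below draws random pairs of points and random weights \lambda for f(x) = -\ln(x), the convex function used in the demonstration that follows (the ranges and sample size are arbitrary choices):

import numpy as np

rng = np.random.default_rng(42)
f = lambda x: -np.log(x)

# Draw random positive point pairs and interpolation weights
x1 = rng.uniform(0.1, 10.0, size=1000)
x2 = rng.uniform(0.1, 10.0, size=1000)
lam = rng.uniform(0.0, 1.0, size=1000)

# Convexity: f(lam*x1 + (1-lam)*x2) <= lam*f(x1) + (1-lam)*f(x2)
lhs = f(lam * x1 + (1 - lam) * x2)
rhs = lam * f(x1) + (1 - lam) * f(x2)
print(np.all(lhs <= rhs))  # True: every chord lies above the graph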

Python Code Demonstration

Let's demonstrate Jensen's inequality using Python code. We'll use the negative natural logarithm, f(x) = -\ln(x), which is a convex function, applied to samples drawn from an exponential distribution.

import numpy as np

# Define a convex function (negative natural logarithm)
def convex_func(x):
    # The small epsilon guards against log(0) for samples at or near zero
    return -np.log(x + 1e-8)

# Generate a random variable X (positive values)
X = np.random.exponential(scale=1.0, size=10000)

# Compute the expectation of X
E_X = np.mean(X)

# Compute the value of the convex function at the expectation of X
f_E_X = convex_func(E_X)

# Compute the expectation of the convex function applied to X
E_f_X = np.mean(convex_func(X))

# Print the results
print("Expectation of X: ", E_X)
print("Convex function evaluated at the expectation of X: ", f_E_X)
print("Expectation of the convex function applied to X: ", E_f_X)
print("Jensen's inequality holds: ", f_E_X <= E_f_X)

In this code:

  1. We define a convex function convex_func as the negative natural logarithm function.
  2. We generate a random variable X from an exponential distribution using np.random.exponential().
  3. We compute the expectation (mean) of X using np.mean() and store it in E_X.
  4. We evaluate the convex function at the expectation of X and store the result in f_E_X.
  5. We apply the convex function to each element of X and compute the expectation (mean) of the resulting values, storing it in E_f_X.
  6. Finally, we print the results and check if Jensen's inequality holds by verifying that f_E_X is less than or equal to E_f_X.

When you run this code, you should see output similar to the following:

Expectation of X:  0.9979895841828075
Convex function evaluated at the expectation of X:  0.0020124293955647247
Expectation of the convex function applied to X:  0.5770358684540039
Jensen's inequality holds:  True

The output confirms that Jensen's inequality holds for the negative logarithm and the exponentially distributed sample X: the value of the convex function evaluated at the expectation of X is indeed less than or equal to the expectation of the convex function applied to X. (For a standard exponential X, \mathbb{E}[-\ln X] is in fact the Euler–Mascheroni constant \gamma \approx 0.5772, which matches the simulated value.)
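The same comparison goes through for other convex functions. Here is a quick extension (a sketch; the three functions are illustrative choices, and the sample is regenerated so the snippet stands alone):

import numpy as np

X = np.random.exponential(scale=1.0, size=10000)
E_X = np.mean(X)

# Jensen's inequality should hold for any convex f; try a few
for name, f in [("x^2", lambda x: x**2),
                ("exp(x/2)", lambda x: np.exp(x / 2)),
                ("|x - 1|", lambda x: np.abs(x - 1))]:
    print(name, f(E_X) <= np.mean(f(X)))  # expect True for each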

Jensen's inequality has important applications in various fields, including probability theory, statistics, information theory, and machine learning. It provides a useful bound on the expectation of convex functions and is often used in proofs and derivations involving expectations and inequalities.
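As one concrete machine-learning flavored illustration (a sketch, not part of the derivation above): applying the inequality to the convex function -\ln gives \mathbb{E}[-\ln X] \geq -\ln \mathbb{E}[X], or equivalently \mathbb{E}[\ln X] \leq \ln \mathbb{E}[X], the direction of the bound used when a logarithm is moved outside an expectation, for example in evidence-lower-bound (ELBO) style derivations:

import numpy as np

X = np.random.exponential(scale=1.0, size=10000)

# E[log X] <= log E[X]: the flipped form of Jensen for the
# concave log (the epsilon guard mirrors the code above)
print(np.mean(np.log(X + 1e-8)), "<=", np.log(np.mean(X)))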