After this post, I decided to organize my understanding of convolution in this series.

Probability Theory

We shall define some concepts and theorems:

A function X: \Omega \to S is said to be a measurable map from the measurable space (\Omega, \mathcal{F}) to (S,\mathcal{S}) if

\displaystyle X^{-1}(B) \equiv \{\omega: X(\omega) \in B\} \in \mathcal{F}

for all B \in \mathcal{S}. If (S,\mathcal{S}) = (\mathbb{R}^d,\mathcal{R}^d) with d>1, then X is called a random vector. If (S,\mathcal{S}) = (\mathbb{R},\mathcal{R}), then X is called a random variable. Here we take the standard properties of integration of measurable functions from measure theory as given, and we define the following:

If X \geq 0 is a random variable on (\Omega, \mathcal{F},P), then we define its expected value to be

\displaystyle EX=\int XdP

It is not hard to show that if f:(\mathbb{R},\mathcal{R}) \to (\mathbb{R},\mathcal{R}) is measurable, then f(X) is also a random variable. From this point on, we need to know how to compute expected values via a change of variables.

Theorem 1:

Let X be a random vector on (\mathbb{R}^d,\mathcal{R}^d) with distribution \mu, i.e., \mu (A) = P(X \in A). If f is a measurable function from (\mathbb{R}^d,\mathcal{R}^d) to (\mathbb{R},\mathcal{R}) with f \geq 0 or E|f(X)|< \infty, then

\displaystyle Ef(X)=\int_{\mathbb{R}^d}f(y) \mu(dy)
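As a quick numerical sanity check (my own illustration, not from the original post), take \mu to be a discrete distribution putting mass p_i on value v_i, so that the right-hand side reduces to the finite sum \sum_i f(v_i)\mu(\{v_i\}), and compare it with a Monte Carlo estimate of Ef(X):

```python
import random

# Sketch of Theorem 1 with a discrete distribution mu:
# mu puts mass p_i on value v_i, so ∫ f(y) mu(dy) = Σ f(v_i) * p_i.
values = [0, 1, 2]
probs = [0.5, 0.3, 0.2]

def f(y):
    return y * y

# Right-hand side: ∫ f(y) mu(dy) as a finite sum over the atoms of mu
rhs = sum(f(v) * p for v, p in zip(values, probs))

# Left-hand side: E f(X), estimated by averaging f over samples of X
random.seed(1)
samples = random.choices(values, probs, k=200_000)
lhs = sum(f(x) for x in samples) / len(samples)

print(abs(lhs - rhs) < 0.05)  # True: the two sides agree up to Monte Carlo error
```

The values, probabilities, and sample size here are arbitrary choices for illustration; the agreement is what the theorem guarantees.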

This theorem can be proved by considering the integral in four stages: indicator functions → simple functions → nonnegative functions → integrable functions. Using a similar technique, we can get the following corollary.

Corollary 1:

Suppose that the probability measure \mu has a density f, i.e.,

\displaystyle \mu(A) = \int_Af(x)dx

for all A \in \mathcal{R}. For any g with g \geq 0 or \int |g(x)|\mu(dx)< \infty, we have

\displaystyle \int g(x)\mu(dx) = \int g(x)f(x)dx
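To make this concrete, here is a hedged numerical sketch (my own example, not from the post): let \mu be the Exp(1) law, with density f(x) = e^{-x} on [0,\infty), and take g(x) = x. Then \int g(x)\mu(dx) is the mean of an Exp(1) random variable, which is 1, and the corollary says it equals \int g(x)f(x)dx:

```python
import math

# Corollary 1 with mu = Exp(1), density f(x) = e^{-x}, and g(x) = x.
# Then ∫ g(x) mu(dx) = mean of Exp(1) = 1, which should equal ∫ g(x) f(x) dx.
def f(x):
    return math.exp(-x)

def g(x):
    return x

# Midpoint Riemann sum for ∫_0^∞ g(x) f(x) dx, truncated at x = 50
dx = 0.001
integral = sum(g((i + 0.5) * dx) * f((i + 0.5) * dx) * dx for i in range(50_000))

print(abs(integral - 1.0) < 1e-3)  # True: matches the known mean of Exp(1)
```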

We will use this corollary later. For now, since we will be working with two or more random variables, one version of the well-known Fubini’s theorem will be stated here.

Fubini’s Theorem:

Suppose that (X \times Y, \mathcal{A} \times \mathcal{B}, \mu \times \nu) is the product of the two \sigma-finite measure spaces (X, \mathcal{A}, \mu) and (Y,\mathcal{B},\nu). If f\geq 0 or \int |f| \,d(\mu \times \nu) < \infty, then

\displaystyle \int_X \int_Y f(x,y)\nu(dy) \mu(dx) = \int_{X \times Y} f \,d(\mu \times \nu) = \int_Y \int_X f(x,y)\mu(dx) \nu(dy)
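A small numerical illustration (my sketch, with \mu = \nu taken as Lebesgue measure on [0,1] and the arbitrary choice f(x,y) = x e^{-xy}): the two iterated integrals agree, and both approximate the exact value \int_0^1 (1-e^{-x})dx = 1/e.

```python
import math

# Fubini check: approximate both iterated integrals of f(x, y) = x*exp(-x*y)
# over the unit square with midpoint Riemann sums; they should agree,
# and both should approximate 1/e.
def f(x, y):
    return x * math.exp(-x * y)

n = 400
h = 1.0 / n
grid = [(i + 0.5) * h for i in range(n)]

# Integrate over y first, then x ...
y_first = sum(sum(f(x, y) * h for y in grid) * h for x in grid)
# ... and over x first, then y
x_first = sum(sum(f(x, y) * h for x in grid) * h for y in grid)

print(abs(y_first - x_first) < 1e-9)   # True: same answer either way
print(abs(y_first - 1 / math.e) < 1e-3)  # True: both ≈ 1/e
```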

We now combine Theorem 1 with Fubini's Theorem to state the following theorem.

Theorem 2:

Suppose X and Y are independent and have distributions \mu and \nu. If h: \mathbb{R}^2 \to \mathbb{R} is a measurable function with h \geq 0 or E|h(X,Y)|<\infty, then

\displaystyle Eh(X,Y)=\int \int h(x,y)\mu(dx)\nu(dy)
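As a hedged sanity check of this identity (my own example): take X and Y independent Uniform(0,1) and h(x,y) = \max(x,y). The double integral over the unit square equals 2/3, and a Monte Carlo estimate of Eh(X,Y) should agree.

```python
import random

# Theorem 2 check: X, Y independent Uniform(0,1), h(x, y) = max(x, y).
# E h(X, Y) should equal the double integral of max(x, y) over the unit square (= 2/3).
random.seed(0)
n = 200_000
mc = sum(max(random.random(), random.random()) for _ in range(n)) / n

# Right-hand side: midpoint Riemann sum for ∫∫ max(x, y) dx dy
m = 400
step = 1.0 / m
grid = [(i + 0.5) * step for i in range(m)]
integral = sum(max(x, y) for x in grid for y in grid) * step * step

print(abs(mc - integral) < 0.01)  # True: E h(X, Y) matches the double integral
```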

Now we can investigate the concept of convolution.

Theorem 3:

If X and Y are independent, F(x) = P(X \leq x) and G(y) = P(Y \leq y), then

\displaystyle P(X+Y \leq z)=\int F(z-y)dG(y)

Remark that the notation dG(y) is exactly \nu(dy), where \nu is the probability measure with distribution function G. That is, it means “integrate with respect to the measure \nu with distribution function G.” Here, the integral

\displaystyle P(X+Y \leq z)=\int F(z-y)dG(y)

is called the convolution of F and G, denoted by F*G(z). To make it more applicable, the following corollary will be introduced.

Corollary 2:

Suppose that X with density f and Y with distribution function G are independent. Then X+Y has density

\displaystyle h(x)=\int f(x-y)dG(y)

When Y has density g, the last formula can be written as

\displaystyle h(x)=\int f(x-y)g(y)dy

The first part of this corollary can be proved by Theorem 3 and Fubini's Theorem, and the second part can be proved with the help of Corollary 1.
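Here is a concrete sketch of Corollary 2 (my illustration, not from the post): take X and Y independent Uniform(0,1), so f = g = 1 on [0,1]. The convolution h(x)=\int f(x-y)g(y)dy is then the triangular density of X+Y: h(x) = x on [0,1] and h(x) = 2-x on [1,2].

```python
# Corollary 2 check: X, Y independent Uniform(0,1). The convolution
# h(x) = ∫ f(x - y) g(y) dy should be the triangular density of X + Y.
def f(t):
    # density of Uniform(0,1)
    return 1.0 if 0.0 <= t <= 1.0 else 0.0

g = f  # Y has the same density as X

def h(x, steps=10_000):
    # midpoint Riemann sum for ∫ f(x - y) g(y) dy over y in [0, 1]
    dy = 1.0 / steps
    return sum(f(x - (j + 0.5) * dy) * g((j + 0.5) * dy) for j in range(steps)) * dy

print(abs(h(0.5) - 0.5) < 1e-3)  # True: triangular density at x = 0.5
print(abs(h(1.5) - 0.5) < 1e-3)  # True: and at x = 1.5
print(abs(h(1.0) - 1.0) < 1e-3)  # True: peak at x = 1.0
```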

At this point, the formula h(x)=\int f(x-y)g(y)dy looks much more familiar. Yes, outside the probabilistic context, it is exactly the convolution from Fourier analysis. To see how this concept can be approached from different angles, check out the other two volumes of this series (still in progress).

Most of the content above is based on Probability: Theory and Examples by Rick Durrett, which is a comprehensive textbook for graduate probability theory. I also learned a lot from the class MAT235A taught by Professor Janko Gravner. Thanks for his kind teaching!
