
# Inequalities for Analysis

## Hölder-type inequalities

Hölder-type inequalities provide a convenient way to switch between different $L^p$-norms. Let’s recall some definitions in measure theory. For those who only care about finite cases, feel free to skip the preliminary.

### Preliminary

Here, we informally define some basic notions.

A triple $(\Omega,\Sigma,\mu)$ is called a measure space if $\Omega$ is the universe, $\Sigma$ is a $\sigma$-algebra over $\Omega$, and $\mu$ is a measure on $(\Omega,\Sigma)$.

If $\mu$ is a probability measure, then we call $(\Omega,\Sigma,\mu)$ a probability space.

Let $(\Omega,\Sigma,\mu)$ be a measure space and $f$ be a $\Sigma$-measurable function. For any $p\geq1$, define $$\|f\|_p:=\left(\int_{\Omega}|f|^p\,d\mu\right)^{1/p}.$$ Also, define the infinity norm as $$\|f\|_{\infty}:=\inf\{\alpha:\mu\{|f|>\alpha\}=0\}.$$

Note that for a finite space, one can simply think of the infinity norm as the maximum absolute value.

For any $p\in[1,\infty]$, define $L^{p}(\mu):=\{f:f\text{ is }\Sigma\text{-measurable and }\|f\|_p<\infty\}$.
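On a finite space with the counting measure, these norms are just finite sums. A minimal Python sketch (the function name is my own):

```python
import math

def lp_norm(f, p):
    """L^p norm of a function given as a list of values,
    under the counting measure on a finite space."""
    if p == math.inf:
        # On a finite space, the essential sup is simply the maximum.
        return max(abs(x) for x in f)
    return sum(abs(x) ** p for x in f) ** (1 / p)

f = [3.0, -4.0]
print(lp_norm(f, 1))         # 7.0
print(lp_norm(f, 2))         # 5.0
print(lp_norm(f, math.inf))  # 4.0
```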

### Cauchy-Schwarz inequality

Let $f,g\in L^2(\mu)$, we have $$\|f\cdot g\|_1\leq\|f\|_2\cdot\|g\|_2.$$ Equality holds when $g=c\cdot f$ for some $c\in\mathbb{R}$. Note that $|\langle f,g\rangle|\leq\|f\cdot g\|_1$, so this also bounds the inner product.

The idea is to decompose $g$ into two parts: one parallel to $f$ and one perpendicular to $f$. Namely, $g=c\cdot f+h$ such that $\langle f,h\rangle=0$ (assume $\|f\|_2\neq0$; otherwise the inequality is trivial). Observe that $$\|g\|_2^2 = c^2\|f\|_2^2+\|h\|_2^2\geq c^2\|f\|_2^2.$$ That is, $|c|\leq\frac{\|g\|_2}{\|f\|_2}$. Thus, \begin{align} |\langle f,g\rangle| &= |\langle f,c\cdot f+h\rangle|\\
&= |c|\cdot\|f\|_2^2\\
&\leq \|f\|_2\cdot\|g\|_2. \end{align} Applying this bound to $|f|$ and $|g|$ gives $\|f\cdot g\|_1=\langle|f|,|g|\rangle\leq\|f\|_2\cdot\|g\|_2$.
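The inequality is easy to verify numerically; a small Python sketch under the counting measure (all names here are my own):

```python
import math
import random

random.seed(0)
f = [random.uniform(-1, 1) for _ in range(100)]
g = [random.uniform(-1, 1) for _ in range(100)]

# ||f·g||_1 <= ||f||_2 · ||g||_2
lhs = sum(abs(a * b) for a, b in zip(f, g))
rhs = math.sqrt(sum(a * a for a in f)) * math.sqrt(sum(b * b for b in g))
assert lhs <= rhs

# Equality when g is a scalar multiple of f:
c = 2.5
g2 = [c * a for a in f]
lhs2 = sum(abs(a * b) for a, b in zip(f, g2))
rhs2 = math.sqrt(sum(a * a for a in f)) * math.sqrt(sum(b * b for b in g2))
assert math.isclose(lhs2, rhs2)
```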

### Hölder’s inequality

Let $p,q\in[1,\infty]$ such that $\frac{1}{p}+\frac{1}{q}=1$. For any $f\in L^p(\mu)$ and $g\in L^q(\mu)$, we have $$\|f\cdot g\|_1\leq\|f\|_p\cdot\|g\|_q.$$ Moreover, we say $p$ is the Hölder conjugate of $q$ and vice versa.

We need the following lemma to prove Hölder’s inequality.

(Young's inequality) Let $a,b\geq0$ and let $(p,q)$ be a Hölder conjugate pair with $p,q<\infty$. We have $$ab\leq \frac{a^p}{p} + \frac{b^q}{q}.$$ Equality holds when $a^p=b^q$.

The idea is based on the concavity of the logarithm function. As the inequality obviously holds when at least one of $a$ or $b$ is 0, assume $a,b>0$. By concavity, \begin{align} \log\left(\frac{a^p}{p}+\frac{b^q}{q}\right)&\geq\frac{1}{p}\log(a^p)+\frac{1}{q}\log(b^q)\\
&=\log(ab). \end{align} Exponentiating both sides yields the lemma.

With Young's inequality, Hölder's inequality follows from some simple manipulation: normalize so that $\|f\|_p=\|g\|_q=1$, apply the lemma pointwise with $a=|f|$ and $b=|g|$, and integrate.
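Both the lemma and Hölder's inequality are easy to spot-check numerically; a sketch under the counting measure (names are my own):

```python
import math
import random

random.seed(1)
p, q = 3.0, 1.5  # 1/3 + 1/1.5 = 1, so (p, q) is a Hölder conjugate pair

# Young's inequality: ab <= a^p/p + b^q/q
for _ in range(1000):
    a, b = random.uniform(0, 10), random.uniform(0, 10)
    assert a * b <= a**p / p + b**q / q + 1e-9  # tolerance for float error

# Hölder's inequality: ||f·g||_1 <= ||f||_p · ||g||_q
f = [random.uniform(-1, 1) for _ in range(50)]
g = [random.uniform(-1, 1) for _ in range(50)]
lhs = sum(abs(x * y) for x, y in zip(f, g))
rhs = sum(abs(x)**p for x in f)**(1/p) * sum(abs(y)**q for y in g)**(1/q)
assert lhs <= rhs
```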

### Variants

Here, we list several variants of Hölder's inequality. Note that we omit some details, such as $f\in L^p$, which should be clear from context.

| Name | Inequality |
| --- | --- |
| Generalization | If $\sum_{k\in[n]}\frac{1}{p_k}=\frac{1}{r}$, then $\Vert\prod_{k\in[n]}f_k\Vert_r\leq\prod_{k\in[n]}\Vert f_k\Vert_{p_k}$. |
| Interpolation | If $\sum_{k\in[n]}\theta_k=1$ and $\sum_{k\in[n]}\frac{\theta_k}{p_k}=\frac{1}{r}$, then $\Vert f\Vert_r\leq\prod_{k\in[n]}\Vert f\Vert_{p_k}^{\theta_k}$. |
| Extremal | If $\frac{1}{p}+\frac{1}{q}=1$, then $\Vert f\Vert_p = \max\{\Vert f\cdot g\Vert_1: \Vert g\Vert_q\leq1 \}$. |
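As a numeric sanity check of the interpolation variant, take $p_1=1$, $p_2=3$, $\theta_1=\theta_2=1/2$, so $\frac{1/2}{1}+\frac{1/2}{3}=\frac{2}{3}=\frac{1}{r}$ gives $r=3/2$. A sketch under the counting measure (names are my own):

```python
import random

random.seed(2)
f = [random.uniform(-1, 1) for _ in range(50)]

def norm(f, p):
    """l^p norm under the counting measure."""
    return sum(abs(x) ** p for x in f) ** (1 / p)

# theta_1/p_1 + theta_2/p_2 = 0.5/1 + 0.5/3 = 2/3 = 1/r, so r = 1.5
lhs = norm(f, 1.5)
rhs = norm(f, 1.0) ** 0.5 * norm(f, 3.0) ** 0.5
assert lhs <= rhs
```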

## Error analysis

Let $0<\delta<1$. For any $k\in\mathbb{N}$, we have $k\delta(1-\delta)^{k-1}\leq1$.

For $k\leq1/\delta$, both $k\delta$ and $(1-\delta)^{k-1}$ are at most 1, so the inequality trivially holds. Consider the case where $k>1/\delta$. Observe that $(1-\delta)^{k-1}\leq e^{-(k-1)\delta}$, so it suffices to show $-(k-1)\delta\leq\ln\frac{1}{k\delta}$. At $k=1/\delta$, the left-hand side is $-(1-\delta)\leq0$ while the right-hand side is $\ln1=0$. Comparing the derivatives of $-(k-1)\delta$ and $\ln\frac{1}{k\delta}$, \begin{align} \frac{d}{dk}\left(-(k-1)\delta\right) &= -\delta,\\
\frac{d}{dk}\ln\frac{1}{k\delta} &= -\frac{1}{k}. \end{align} That is, for $k>1/\delta$ we have $\frac{d}{dk}\left(-(k-1)\delta\right)\leq\frac{d}{dk}\ln\frac{1}{k\delta}<0$, so the left-hand side remains below the right-hand side. Thus, we conclude that the inequality holds for all $k$.

### Natural logarithm

Let $-1<x<1$, $$\ln(1+x) = x-\frac{x^2}{2}+\frac{x^3}{3}-\cdots.$$ With the Taylor expansion above, we have the following useful first-order and second-order approximations for the natural logarithm. For any $0\leq\epsilon<1$, we have $$\epsilon-\frac{\epsilon^2}{2}\leq\ln(1+\epsilon)\leq\epsilon.$$ Also, when $-1/2\leq\epsilon\leq0$, we have $$\epsilon-\epsilon^2\leq\ln(1+\epsilon)\leq\epsilon.$$ Let us prove by picture as follows. (Thanks Wei-Cheng Lee for spotting an error in an earlier version of this lemma.)

## Convexity

### Log sum inequalities

Let $a_1,\dots,a_n$ and $b_1,\dots,b_n$ be non-negative numbers, we have $$\sum_{i\in[n]}a_i\log\frac{a_i}{b_i}\geq\left(\sum_{i\in[n]}a_i\right)\log\frac{\sum_{i\in[n]}a_i}{\sum_{i\in[n]}b_i}.$$

### Jensen's inequality

Let $f$ be a convex function on $\mathbb{R}^d$, $x_1,\dots,x_n\in\mathbb{R}^d$, and $0\leq a_1,\dots,a_n$ such that $\sum_{i\in[n]}a_i=1$. Then, $$\sum_{i\in[n]}a_if(x_i)\geq f\left(\sum_{i\in[n]}a_i x_i\right).$$

Let $(\Omega,\mu)$ be a probability space and $g$ be a real-valued, $\mu$-integrable function. Suppose $f$ is a convex function on $\mathbb{R}$, then $$\int_{\Omega}f\circ g\ d\mu\geq f\left(\int_{\Omega}g\ d\mu\right).$$

## Gaussian tail bounds

Here, let $\varphi(x)=\frac{1}{\sqrt{2\pi}}e^{-x^2/2}$ be the pdf of the standard normal distribution and $\Phi(x)=\int_{-\infty}^x\varphi(t)dt$ be its cdf. In the following, we estimate the value of $\int_x^{\infty}\varphi(t)dt=1-\Phi(x)$ for $x>0$.

### Upper bound

$$\int_x^{\infty}\varphi(t)dt\leq\frac{1}{x}\varphi(x).$$

Note that for $t\geq x$, $\frac{t}{x}\geq1$. Thus, we have \begin{align} \int_x^{\infty}\varphi(t)dt&\leq\int_x^{\infty}\frac{t}{x}\varphi(t)dt\\
&=\frac{1}{x}\cdot\frac{1}{\sqrt{2\pi}}\left(-e^{-t^2/2}\right)\Big|_x^{\infty}\\
&=\frac{1}{x}\varphi(x). \end{align}
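A quick numeric sanity check of this upper bound, computing $1-\Phi(x)$ via `math.erfc` (a sketch; function names are my own):

```python
import math

def phi(x):
    """pdf of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def gauss_tail(x):
    """1 - Phi(x), via the complementary error function."""
    return math.erfc(x / math.sqrt(2)) / 2

for x in [0.5, 1.0, 2.0, 5.0]:
    assert gauss_tail(x) <= phi(x) / x
```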

### Lower bound

$$\int_x^{\infty}\varphi(t)dt\geq(\frac{1}{x}-\frac{1}{x^3})\varphi(x).$$

By integration by parts, we have \begin{align} \int_x^{\infty}\varphi(t)dt &= \int_x^{\infty}\frac{1}{\sqrt{2\pi}}\frac{1}{t}\cdot te^{-t^2/2}dt\\
&=\int_x^{\infty}\frac{1}{\sqrt{2\pi}}\frac{1}{t}d(-e^{-t^2/2})\\
&=\frac{-e^{-t^2/2}}{t\sqrt{2\pi}}|_x^{\infty}-\int_x^{\infty}\frac{e^{-t^2/2}}{t^2\sqrt{2\pi}}dt\\
&=\frac{e^{-x^2/2}}{x\sqrt{2\pi}}-\int_x^{\infty}\frac{te^{-t^2/2}}{t^3\sqrt{2\pi}}dt\\
&\geq\frac{e^{-x^2/2}}{x\sqrt{2\pi}}-\frac{1}{x^3\sqrt{2\pi}}\int_x^{\infty}te^{-t^2/2}dt\\
&=(\frac{1}{x}-\frac{1}{x^3})\varphi(x). \end{align}
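Both tail bounds can be checked together numerically, again using `math.erfc` for $1-\Phi$ (a sketch; function names are my own):

```python
import math

def phi(x):
    """pdf of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def gauss_tail(x):
    """1 - Phi(x), via the complementary error function."""
    return math.erfc(x / math.sqrt(2)) / 2

# (1/x - 1/x^3) phi(x) <= 1 - Phi(x) <= phi(x)/x for x > 0
for x in [0.5, 1.0, 2.0, 5.0]:
    assert (1 / x - 1 / x**3) * phi(x) <= gauss_tail(x) <= phi(x) / x
```

Note that the lower bound is only informative for $x>1$, where $\frac{1}{x}-\frac{1}{x^3}>0$; for large $x$ the two bounds pinch the tail tightly.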