
Pseudorandom Generators - From weaker assumptions


These notes are mainly based on the book by Vadhan. The subsection about cryptographic PRGs follows the lecture notes by Trevisan.

The goal of this note

In the previous note, we saw by a probabilistic argument that there exists a perfect PRG whose seed length is logarithmic in the output length. We then digressed to cryptographic PRGs, showed the non-existence of perfect cryptographic PRGs, and discussed the equivalence of cryptographic PRGs with other cryptographic primitives such as OWFs and OWPs.

In this note, we turn back to derandomization PRGs, i.e., we aim for $(m,1/m)$-pseudorandomness, and see how to achieve this goal under weaker assumptions such as the existence of hard functions.

Hard functions

Intuitively, hard functions are boolean functions that cannot be approximated by small circuits. There are two common definitions for hard functions (be careful about the order of the quantifiers):

• worst-case hard: for any small (randomized) circuit, there exists an input on which the circuit disagrees with the function non-negligibly.
• average-case hard: for any small (randomized) circuit, the circuit disagrees with the function non-negligibly on most inputs.

Formally, we have the following definitions.

Let $f:\bit^n\rightarrow\bit$, $s\in\bbN$, and $\epsilon>0$. We say $f$ is $(s,\epsilon)$-worst-case hard if for any (randomized) circuit $C$ of size at most $s$, there exists $x\in\bit^n$ such that $\bbP[f(x)\neq C(x)]> \epsilon$, where the probability is over the internal randomness of $C$. If we do not specify $\epsilon$, we take $\epsilon$ to be some constant, say $1/3$.

Let $f:\bit^n\rightarrow\bit$, $s\in\bbN$, and $\epsilon>0$. We say $f$ is $(s,\epsilon)$-average-case hard if for any (randomized) circuit $C$ of size at most $s$, $\bbP_{x\leftarrow U_n}[f(x)\neq C(x)]> \frac{1}{2}-\epsilon$.

• Here we allow randomized circuits since, by an averaging argument, there exists a realization of the randomness such that when we hard-wire this realization into the circuit, the disagreement between the circuit and the function is preserved.
• In the definition of average-case hard functions, we use the threshold $\frac{1}{2}-\epsilon$ because an error rate of $\frac{1}{2}$ is exactly what random guessing achieves. That is, average-case hardness says that every small circuit does little better than randomly guessing the value of the function.
• Note that the existence of an average-case hard function is a stronger assumption than the existence of a worst-case hard function, because the former implies the latter. Here, we say an assumption is stronger if it is less likely to hold.
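To make the average-case condition concrete, here is a small Python sketch (all names hypothetical) that measures the agreement between a boolean function and a candidate circuit by brute force over all inputs; average-case hardness asserts that no small circuit achieves agreement much above $1/2$. Parity is used only as a stand-in — parity is in fact easy for circuits.

```python
from itertools import product

def agreement(f, C, n):
    """Fraction of inputs x in {0,1}^n on which circuit C agrees with f."""
    inputs = list(product([0, 1], repeat=n))
    return sum(f(x) == C(x) for x in inputs) / len(inputs)

def parity(x):
    """Toy stand-in for a hard function (parity itself is not circuit-hard)."""
    return sum(x) % 2

def guess_zero(x):
    """Toy 'small circuit': ignore the input and always output 0."""
    return 0

# guess_zero agrees with parity on exactly half of the inputs,
# i.e., it has no advantage over random guessing.
print(agreement(parity, guess_zero, 4))  # 0.5
```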

In the construction of PRGs and other pseudorandom objects, we usually rely on one of the following assumptions about hard functions.

Suppose there exists a function in $\E$ that is $(s,1/3)$-worst-case hard, where $s(n)=n^{\omega(1)}$, $2^{n^{\Omega(1)}}$, or $2^{\Omega(n)}$.

Suppose there exists a function in $\E$ that is $(s,\epsilon)$-average-case hard for some (or for any) $\epsilon>0$, where $s(n)=n^{\omega(1)}$, $2^{n^{\Omega(1)}}$, or $2^{\Omega(n)}$.

We require the hard function to lie in $\E$ because, in the construction of PRGs and other pseudorandom objects, the function will be evaluated, and we want this evaluation to be efficient.

PRGs from average-case hard functions

In this section, we are going to prove the following theorem, which says that the existence of average-case hard functions implies the existence of derandomization PRGs.

Suppose there exists a boolean function $f:\bit^{\ell}\rightarrow\bit$ in $\E$ that is $(s(\ell),1/3)$-average-case hard. Then there exists a mildly explicit $(m,1/m)$-PRG whose seed length shrinks as the hardness $s$ grows; in particular, $s(\ell)=2^{\Omega(\ell)}$ yields seed length $O(\log m)$.

Concatenation and composition

Before we prove the main theorem, let’s look at several toy examples to get some feeling for constructing PRGs from average-case hard functions.

Suppose $f_{\ell}:\bit^{\ell}\rightarrow\bit$ is $(s(\ell),\epsilon)$-average-case hard and $f_{\ell}\in\E$. Then $G_{\ell+1}:\bit^{\ell}\rightarrow\bit^{\ell+1}$ defined by $G_{\ell+1}(x)=(x,f_{\ell}(x))$ is a mildly explicit $(s(\ell),\epsilon)$-PRG.

The idea is to prove by contradiction. Concretely, supposing there exists a next-bit predictor for $G_{\ell+1}$, we are going to construct a circuit that approximates $f_{\ell}$ well.

Suppose $G_{\ell+1}$ is not $(s(\ell),\epsilon)$-pseudorandom. Then, by the equivalence of pseudorandomness and next-bit prediction, $G_{\ell+1}$ is $(s(\ell),\epsilon/(\ell+1))$-next-bit predictable. Furthermore, as the first $\ell$ bits of $G_{\ell+1}(U_{\ell})$ are uniformly distributed, all the advantage must come from the last bit, so $G_{\ell+1}$ is actually $(s(\ell),\epsilon)$-next-bit predictable at the $(\ell+1)$-th bit. That is, there exists a circuit $P$ of size at most $s(\ell)$ such that $$\bbP[P((G_{\ell+1}(U_{\ell}))_{1,\dots,\ell})=(G_{\ell+1}(U_{\ell}))_{\ell+1}]=\bbP[P(U_{\ell})=f_{\ell}(U_{\ell})]>\frac{1}{2}+\epsilon.$$

That is, $P$ agrees with $f_{\ell}$ on more than a $\frac{1}{2}+\epsilon$ fraction of the inputs, which contradicts the $(s(\ell),\epsilon)$-average-case hardness of $f_{\ell}$. As a result, we conclude that $G_{\ell+1}$ is $(s(\ell),\epsilon)$-pseudorandom.
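The one-bit-stretch construction from the lemma can be sketched in Python as follows (a toy sketch; parity stands in for the hard function even though parity itself is easy for circuits):

```python
def concat_prg(f, x):
    """The lemma's generator: G(x) = (x, f(x)), stretching ell bits to ell + 1."""
    return tuple(x) + (f(x),)

def parity(x):
    """Stand-in for a hard function (parity itself is not circuit-hard)."""
    return sum(x) % 2

# Stretch: 3 seed bits in, 4 output bits out.
seed = (1, 0, 1)
print(concat_prg(parity, seed))  # (1, 0, 1, 0)
```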

The above concatenation yields a PRG that produces one additional bit, with pseudorandomness matching the average-case hardness of the function. To produce more bits, it’s natural to compose the same concatenation more than once as follows.

Suppose $f_{\ell}:\bit^{\ell}\rightarrow\bit$ is $(s(\ell),\epsilon_1)$-average-case hard and $f_{\ell}\in\E$. Let $X_{\ell}$ be $(t(\ell),\epsilon_2)$-pseudorandom; then $(X_{\ell},f_{\ell}(X_{\ell}))$ is $(p(\ell),\epsilon_1+\epsilon_2)$-pseudorandom, where $p(\ell)=\min\{t(\ell)-s(\ell),s(\ell)\}$.

Let’s prove by contradiction. Concretely, suppose $(X_{\ell},f_{\ell}(X_{\ell}))$ and $U_{\ell+1}$ are $(p(\ell),\epsilon_1+\epsilon_2)$-computationally distinguishable; we are going to construct either a small circuit that approximates $f_{\ell}$ well or a small circuit that distinguishes $X_{\ell}$ from $U_{\ell}$.

That is, suppose there exists a circuit $C$ of size at most $p(\ell)$ such that $$|\bbP[C(X_{\ell},f_{\ell}(X_{\ell}))=1]-\bbP[C(U_{\ell+1})=1]|>\epsilon_1+\epsilon_2.$$

WLOG (negating $C$ if necessary, which lets us remove the $\card{\cdot}$), assume $$\bbP[C(X_{\ell},f_{\ell}(X_{\ell}))=1]-\bbP[C(U_{\ell+1})=1]>\epsilon_1+\epsilon_2.$$

Consider the hybrid $C(U_{\ell},f_{\ell}(U_{\ell}))$; by the triangle inequality, at least one of the following holds:

• $\bbP[C(U_{\ell},f_{\ell}(U_{\ell}))=1]-\bbP[C(U_{\ell+1})=1]>\epsilon_1$
• $\bbP[C(X_{\ell},f_{\ell}(X_{\ell}))=1]-\bbP[C(U_{\ell},f_{\ell}(U_{\ell}))=1]>\epsilon_2$

Suppose the first case holds; then we construct an approximator for $f_{\ell}$ as follows.

1. On input $x$, pick $b\in\bit$ uniformly at random.
2. If $C(x,b)=1$, output $b$.
3. Otherwise, output $1-b$.
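A minimal Python sketch of this randomized approximator, assuming a distinguisher `C` that takes the $\ell$-bit prefix and a candidate last bit (all names hypothetical):

```python
import random

def approximator(C, x, rng=random):
    """Guess f(x): pick a random bit b, keep it if C accepts (x, b), else flip it."""
    b = rng.randint(0, 1)
    return b if C(x, b) == 1 else 1 - b

def parity(x):
    """Stand-in for the hard function."""
    return sum(x) % 2

# Sanity check with a "perfect" distinguisher accepting exactly when b = parity(x):
# the approximator then recovers parity(x) no matter which b was drawn.
perfect_C = lambda x, b: 1 if b == parity(x) else 0
print(approximator(perfect_C, (1, 1, 0)))  # 0
```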

By the same analysis as for the next-bit predictor, the above approximator agrees with $f_{\ell}$ with probability greater than $\frac{1}{2}+\epsilon_1$. Also, the size of the approximator is essentially that of $C$, i.e., at most $s(\ell)$, which contradicts the $(s(\ell),\epsilon_1)$-average-case hardness of $f_{\ell}$.

Suppose the second case holds; then construct a distinguisher for $X_{\ell}$ and $U_{\ell}$ as follows: $$D(x) = C(x,f_{\ell}(x)).$$ Note that the size of $D$ is the size of $C$ plus the size of a circuit computing $f_{\ell}$; bounding the latter by $s(\ell)$, the size of $D$ is at most $\big(t(\ell)-s(\ell)\big) + s(\ell)=t(\ell)$, which contradicts the $(t(\ell),\epsilon_2)$-pseudorandomness of $X_{\ell}$.
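In code, the reduction in the second case is a one-liner (sketch; `C` and `parity` are placeholder names):

```python
def make_distinguisher(C, f):
    """Turn a distinguisher C for (X, f(X)) vs (U, f(U)) into one for X vs U
    by computing f on the sample and appending it as the last input bit."""
    return lambda x: C(tuple(x) + (f(x),))

def parity(x):
    """Stand-in for the hard function."""
    return sum(x) % 2

C = lambda y: y[-1]               # hypothetical distinguisher reading the last bit
D = make_distinguisher(C, parity)
print(D((1, 0)))  # 1
```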

Note that, from the second lemma, the performance of composing the concatenation more than once is not so good. The reason is that we cannot (or at least I don’t know how to) handle pseudorandomness and average-case hardness simultaneously!

Nisan-Wigderson PRGs

From the previous subsection, one can see that naive concatenation and composition cannot easily yield good PRGs from average-case hard functions. However, from the first concatenation lemma, one can see that outputting the value of the hard function yields optimal pseudorandomness (the pseudorandomness equals the average-case hardness). What if we apply the hard function more than once, on different parts of the input!? Intuitively, as long as the parts do not overlap too much, the output should still be sufficiently pseudorandom. Namely, a distinguisher for this output can be transformed into an approximator of the hard function.
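This intuition is exactly the Nisan-Wigderson construction: pick sets $S_1,\dots,S_m$ of seed positions with small pairwise intersections (a combinatorial design) and output $f$ evaluated on each restriction. A toy Python sketch, where the design is hand-made and parity again stands in for the hard function:

```python
def nw_generator(f, design, seed):
    """Nisan-Wigderson-style generator (sketch).

    design: list of index sets S_1, ..., S_m over the seed positions,
            pairwise intersecting in few coordinates.
    Output bit i is f applied to the seed restricted to S_i.
    """
    return tuple(f(tuple(seed[j] for j in S)) for S in design)

def parity(x):
    """Stand-in for the hard function."""
    return sum(x) % 2

# A tiny hand-made design: f takes 4 bits, and any two sets share one index.
design = [(0, 1, 2, 3), (0, 4, 5, 6), (1, 4, 7, 8)]
seed = (1, 0, 1, 1, 0, 1, 0, 0, 1)
print(nw_generator(parity, design, seed))  # (1, 0, 1)
```

The whole point of the small intersections is that, when we fix the bits outside one set $S_i$, the other output bits become functions of only a few of $S_i$'s bits, so a distinguisher can be turned into a small approximator of $f$.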