The Polynomial Hierarchy, Random Oracles, and Boolean Circuits
Introduction
In 2015, Rossman, Servedio, and Tan resolved a 30-year-old open problem posed by Håstad in 1986 by showing that the polynomial-time hierarchy is infinite relative to a random oracle. Before this work, it was only known that there exist oracles relative to which the polynomial-time hierarchy is infinite. On the other hand, the first few levels of the polynomial-time hierarchy were already known to be separated under random oracles: the state of the art before RST was due to O'Donnell and Wimmer in 2007, who showed that $\Sigma_3 \not\subseteq \Sigma_2$ under random oracles.
By Furst, Saxe, and Sipser [FSS81], it has been known that showing the polynomial-time hierarchy to be infinite is equivalent to proving a good enough separation between $\mathsf{AC}^0$ circuits of distinct depths. Concretely, for any $k \in \mathbb{N}$, the goal is to find a function $f$ such that $f$ has a polynomial-size (in fact, linear-size) depth-$(k+1)$ circuit, but no quasipolynomial-size depth-$k$ circuit approximates $f$ well. A theorem of this kind is called a depth hierarchy theorem.
In [RST15], Rossman, Servedio, and Tan show the following depth hierarchy theorem.
For any $k \le \frac{c \log n}{\log\log n}$, where $c > 0$ is some constant, there exists a function $f$ such that (1) there exists a linear-size depth-$(k+1)$ circuit that computes $f$, and (2) any depth-$k$ circuit $C$ of size $\exp(n^{1/k})$ agrees with $f$ on at most a $\left(\frac{1}{2} + n^{-\Omega(k)}\right)$ fraction of inputs.
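For concreteness, here is the theorem instantiated at $k = 2$ (a direct plug-in of the parameters above; the constants hidden in the $\Omega(\cdot)$ notation are left unspecified): there is an $f$ computed by a linear-size depth-$3$ circuit such that every depth-$2$ circuit $C$ of size at most $\exp(n^{1/2})$ satisfies
\[
\Pr_{x \sim \{0,1\}^n}\bigl[ C(x) = f(x) \bigr] \;\le\; \frac{1}{2} + n^{-\Omega(1)},
\]
that is, such circuits do essentially no better than random guessing on $f$.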
Relativized complexity and circuit lower bounds
Relativized complexity
The goal of relativized complexity is to understand the relations between complexity classes by relativizing them with respect to oracles. The motivation comes from the obstacles to proving such relations directly: from relativized results, one can at least get some idea of how the complexity classes might relate.
One of the most famous relativized results is due to Baker, Gill, and Solovay in 1975, who showed that there exist oracles $A$ and $B$ such that $\mathsf{P}^A = \mathsf{NP}^A$ but $\mathsf{P}^B \neq \mathsf{NP}^B$. In some sense, this result indicates that techniques that relativize cannot be used to resolve the relation between $\mathsf{P}$ and $\mathsf{NP}$. Later, people started considering relativization under random oracles, that is, determining the relation between complexity classes with respect to a random oracle. By Kolmogorov's 0-1 law, when the oracle is picked at random, the event that two complexity classes coincide has probability either 0 or 1 (the event is unaffected by changing finitely many bits of the oracle, since a machine can hard-code finitely many answers, so it lies in the tail $\sigma$-algebra). For instance, we have $\Pr_O[\mathsf{P}^O \neq \mathsf{NP}^O] = 1$. However, it was soon realized that relations under random oracles do not imply the unrelativized relations, since $\mathsf{IP} = \mathsf{PSPACE}$ but $\Pr_O[\mathsf{IP}^O \neq \mathsf{PSPACE}^O] = 1$. Nevertheless, relativized results, whether worst-case (there exists an oracle) or average-case (under a random oracle), are still interesting to theorists; after all, they are usually much easier to obtain than the unrelativized relations.
Circuit lower bounds come into play
A surprising connection between relativized complexity and circuit lower bounds was first discussed by Furst, Saxe, and Sipser [FSS81]. Specifically, they provided a framework giving sufficient conditions for separations in the relativized polynomial-time hierarchy. Here is an example. To show $\mathsf{NP} \not\subseteq \mathsf{P}$ with respect to some oracle, it suffices to show the following: there exists a function $f$ such that
- $f$ has a polynomial-size DNF formula.
- No decision tree of poly-logarithmic depth (and hence quasipolynomial size) computes $f$.
That is, a decision tree lower bound for a function $f$, together with a small DNF for $f$, implies $\mathsf{NP} \not\subseteq \mathsf{P}$ relative to some oracle. Here are some intuitions.
- (Oracle Turing machine) By the Cook-Levin theorem, one can think of a Turing machine as a logical checking process. For $\mathsf{P}$, it checks whether the input satisfies certain logical relations, and the checking is efficient: it can be done in time polynomial in $n$. As for an oracle Turing machine, once we fix an input string $x$, one can think of it as a logical checking process on the oracle itself, specifically a process on the truth table of the oracle. Concretely, we can write this process as a decision tree $T_x$ whose input is the truth table of the oracle. Note that the oracle Turing machine runs in time polynomial in $n$, say $m = \mathrm{poly}(n)$, so the length of the queries it makes to the oracle is also at most $\mathrm{poly}(n)$; thus we only need to look at the truth table restricted to inputs of bit length at most $\mathrm{poly}(n)$, which has length $N = 2^{\mathrm{poly}(n)}$. As a result, the depth of $T_x$ is at most $m = \mathrm{poly}(n) = \mathrm{polylog}(N)$ and its size is at most $2^m = 2^{\mathrm{polylog}(N)}$.
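To make this correspondence concrete, here is a minimal Python sketch under toy assumptions: the machine below is a made-up example that makes two adaptive oracle queries, and `run_branch`/`decision_tree` are hypothetical helper names, but the structure is exactly the translation described above (fix $x$, treat the oracle's truth table as the input, and record the query/answer paths).

```python
# A toy illustration (not from [FSS81]): fixing the input x turns an oracle
# Turing machine into a decision tree T_x over the oracle's truth table.
from itertools import product

def machine(x, query):
    """A hypothetical poly(n)-time oracle machine: on input x it adaptively
    queries the oracle at a few positions and outputs an accept/reject bit."""
    pos = len(x) % 4                 # first query position depends only on x
    b = query(pos)                   # adaptive: the next position depends on the answer
    pos2 = pos + 1 if b else pos + 2
    return query(pos2) ^ b           # final decision

def run_branch(x, oracle_tt):
    """Run the machine against a concrete oracle truth table and record the
    branch of T_x it follows: the sequence of (queried position, answer)."""
    path = []
    def query(i):
        path.append((i, oracle_tt[i]))
        return oracle_tt[i]
    return machine(x, query), path

def decision_tree(x, num_queries=2):
    """Recover the whole tree T_x by enumerating every possible sequence of
    oracle answers. Its depth equals the number of queries the machine makes,
    i.e., poly(n), which is polylog in the truth-table length N = 2^{poly(n)}."""
    tree = {}
    for answers in product([0, 1], repeat=num_queries):
        ans = iter(answers)
        path = []
        def query(i):
            b = next(ans)
            path.append((i, b))
            return b
        tree[tuple(path)] = machine(x, query)
    return tree

# One concrete oracle (truth table on 8 points): the branch taken, and the full tree T_x.
oracle_tt = [0, 1, 1, 0, 1, 0, 0, 1]
print(run_branch("10110", oracle_tt))   # -> (0, [(1, 1), (2, 1)])
print(decision_tree("10110"))
```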
- (Language) Suppose there exists a function $f$ that has a DNF of poly-logarithmic width and quasipolynomial size but every decision tree of depth