Back to Boolean Circuits
Back to notes

# Parity is not in $\AC{0}$

60%

## Overview

In 1984, Furst, Saxe, and Sipser showed that the parity function does not lie in $\AC{0}$. Later on in 1986, Hastad showed an substantial improvement in his Ph.D. thesis.

We are going to prove the following theorem in several ways and discuss the implication as well as influence of this theorem.

For sufficiently large $n$ and any constant $d\geq2$, the parity function on $n$ bits cannot be computed by depth $d$ size $2^{n^{o(1/d)}}$ circuits with NOT gates, and unbounded fan-in AND, OR gates.

Parity is not in $\AC{0}$.

In fact, the above theorem can be much strengthened due to Hastad as follows.

For sufficiently large $n$ and any constant $d\geq2$, the parity function on $n$ bits cannot be computed by depth $d$ size $2^{n^{1/(d-1))}}$ circuits with NOT gates, and unbounded fan-in AND, OR gates.

## Random restriction and switching lemma

The idea of random restriction is to randomly choose some input bits of a circuit and fix them to some random values. The hope is that after such random restriction, the circuit will becomes something simpler and easier to analyze. For example, in Hastad’s switching lemma, which we will see in a minute, after hitting a random restriction to a DNF or CNF, the resulting (restricted) function turns out to have a small decision tree with high probability.

In the rest of this section, we first formally define the mathematical formulation of random restriction. Then give an overview/review on decision tree, DNF, and CNF. Finally, we state and prove Hastad’s switching lemma and use it to complete the proof for our main theorem showing parity function not in $\AC{0}$.

### Random restriction

As a disclaimer, there are many different forms of random restrictions on different circuit models. Here we focus on one of the simplest and basic one. In particular, all the randomness here are chosen uniformly over the domain while in some applications beyond the scope of this notes people might consider biased random restriction.

Let $f:{0,1}^n\rightarrow{0,1}$ be an $n$-bit boolean function. For every integer $1\leq m\leq n$, a random $m$-restriction can be sampled as follows.

1. Randomly choose a subset $S\subseteq[n]$ of size $\card{S}=n-m$.
2. Randomly choose a string in $v\in{0,1}^{n-m}$.
3. Set the variables chosen by $S$ to value $v$.

We denote a restriction as $\rho$ and denote the resulting restricted function as $f\vert_\rho$. Note that if $\rho$ is a $m$-restriction then $f\vert_\rho$ is an $m$-bit function.

### Decision tree, DNF, and CNF

Decision tree (DT) is one of the simplest and natural computational models. For boolean function, a decision tree is simply a binary tree with internal vertices associated by a variable $x_i$ and the two out-going edges correspond to $x_i$ being set to $0$ or $1$. The leaves of a decision tree are indexed by $0$ or $1$. The boolean function computed by a decision tree $T$ is naturally defined as starting from the root and going downward according to the value of to variables and finally outputting the value in the leaf one ends at. See the following for an example.

In decision tree, we are usually interested in the depth (i.e., the longest distance from the root to a leaf) of the tree as an important complexity measure. It is not difficult to see that any decision tree that computes the parity function of $n$ bits requires depth to be at least $n$.

A CNF (Conjunctive Normal Form) is of the form $C_1\wedge\cdots\wedge C_s$ where each $C_i=(\ell_{i_1}\vee\cdots\vee\ell_{i_w})$, $\ell_{i_j}$ is a literal (i.e., of the form $x$ or $\neg x$ for some variable $x$), and $w$ is called the width of this CNF. Similarly, a DNF (Disjunctive Normal Form) is of the form $C_1\vee\cdots\vee C_s$ where each $C_i=(\ell_{i_1}\wedge\cdots\wedge\ell_{i_w})$. The important complexity measures of CNF and DNF are the size (i.e., the number of clauses) and the width.

Some basic but important observations about decision tree, CNF, and DNF:

• Every decision tree of depth $d$ can be turned into a CNF (also DNF) of width at most $d$ and size at most $2^d$. Can you see how?
• Every CNF (also DNF) of width $w$ and size $s$ can be turned into a decision tree of depth at most $w$. In particular, one can turn it into a canonical decision tree that has special structure as depicted in the following figure.

[TODO: add a figure]

### Hastad’s switching lemma

In 1986, Hastad proved the following switching lemma.

Let $f:\bit^n\rightarrow\bit$ be a function that can be expressed by a DNF or CNF of width at most $w$. Let $1\leq m\leq n/5$ and let $\rho$ be a random $m$-restriction. Then, for any $d\geq2$, $$\mathbb{P}_{\rho}[f\vert_{\rho} \text{ is not expressible as a }\text{DT of depth }d]\leq\left(\frac{mw}{n}\right)^{d}.$$

[TODO: complete the proof]

Now, we show how to use switching lemma to prove our main theorem. The overall strategy is to use a sequence of random restrictions $\rho_0,\rho_1,\dots,\rho_{d-2}$ with properly chosen size and show that for every $AC{0}$ circuit $C$ of size $s$ and depth $d$, $$\Pr_{\rho_0,\dots,\rho_{d-2}}[C\vert_{\rho_0}\cdots \vert_{\rho_{d-2}}\text{ is not expressible as a }\text{DT of depth }m]<1$$ where $m$ is the number of free variables left after the restrictions $\rho_0,\rho_1,\dots,\rho_{d-2}$.

### Step 1: Turn a circuit into the general position

Let’s fix an arbitrary $AC{0}$ circuit $C$ of size $s$ and depth $d$. We turn $C$ into the general position described as follows.

• The fan-out of every gates is 1 so that $C$ is a tree.
• All the NOT gates lie in first layer of $C$.
• The AND and OR gates are switching between layers.
• The input/bottom layer consists of OR gates with fan-in at most $10\log s$.

Note that for the first three bullets, we can them with at most polynomial blow-up in size and no blow-up in depth. For the last bullet, in general we cannot do so. The trick is that we first apply a random $(n/2)$-restriction $\rho_0$ so that with probability at least $1-0.1$ each bottom OR gate with fan-in more than $10\log s$ is fixed by the restriction.

### Step 2: Reduce the depth by random restriction

In this step, we use switching lemma together with union bound and probabilistic method to reduce the depth of the circuit. Concretely, we want to inductively show the following: After the $i$th iteration for $i=1,\dots d-1$.

• The circuit has depth $d-i$.
• The gates in the bottom layer have fan-in at most $k_i$ where $k_i$ is a constant set later.
• The gates in the layer above the bottom layer can be expressed with a
• There are $n_i-\sqrt{n_i}$ inputs being fixed and $\sqrt{n_i}$ left undetermined where $n_0=n$ and $n_i=n^{1/2^{i}}$.

We achieve the above setting by the following probabilistic argument.

1. By induction, we know that at the beginning of the $i$th iteration, the gates in the layer above the bottom layer can be expressed as a $(k_{i-1})$-CNF (or $(k_{i-1})$-DNF). Thus, we can apply Hastad’s switching lemma.
2. For a single gate in the layer above the bottom layer, the probability of cannot be expressed as a $k_i$-DNF (or $k_i$-CNF) is at most $$(\frac{\sqrt{n_i}k_{i-1}^{10}}{n_i})^{k_i/2}.$$

Take $k_i=10b2^{i+2}$, the probability is upper bounded by $1/(10n^{n^b})$. As a result, by union bound and probabilistic argument, we know that there exists a restriction of size $n_i-\sqrt{n_i}$ such that every gates in the layer above the bottom layer can be expressed as a $k_i$-DNF (or $k_i$-DNF). As a result, we can combine the layer above the bottom layer with the layer above into a single layer.

### Step 3: Final step

After $d-2$ iteration, the circuit has the following properties.

• Having $n_{d-2}=n^{1/2^{d-2}}$ input bits remain unfixed.
• The circuit can be expressed as a $k_{d-2}$-CNF (or $k_{d-2}$-DNF).

As a result, the output of the circuit can be determined by $k_{d-2}$ value which is a constant with respect to the remaining unfixed input bits. As the parity function cannot be determined until the last bit is fixed, we know that $C$ cannot compute the parity function.