
Markov Chains

Read the handout ``Finite Markov Chains.''

Definition 3.1   A stochastic matrix is a matrix with nonnegative entries and row sums of $ 1$. A doubly stochastic matrix is a matrix $ A$ such that $ A$ and $ A^{tr}$ (transpose) are both stochastic. A stochastic matrix is ergodic if it (precisely, the digraph of its nonzero entries) is strongly connected and aperiodic.
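The defining properties are easy to test numerically. A minimal sketch (the matrix $T$ below is a hypothetical example, not from the notes; numpy assumed):

```python
import numpy as np

# A hypothetical 3-state example: the walk on a triangle.
T = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])

def is_stochastic(A):
    """Nonnegative entries, each row summing to 1."""
    return bool((A >= 0).all()) and np.allclose(A.sum(axis=1), 1.0)

def is_doubly_stochastic(A):
    """Both A and its transpose are stochastic."""
    return is_stochastic(A) and is_stochastic(A.T)

assert is_stochastic(T)
assert is_doubly_stochastic(T)   # columns also sum to 1
```

Checking ergodicity (strong connectivity and aperiodicity of the digraph of nonzero entries) takes more work and is omitted here.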


\begin{exercise}
An eigenvalue of a stochastic matrix has norm at most $1$.
\end{exercise}


\begin{exercise}
If a stochastic matrix is strongly connected with period $d$ and $\omega$ is a
$d$-th root of unity, then whenever $\lambda$ is an eigenvalue, so is $\omega\lambda$.
\end{exercise}


\begin{exx}
If a stochastic matrix is ergodic, then every eigenvalue other than $1$ has absolute value strictly less
than $1$. (This is almost equivalent to the following theorem.)
\end{exx}

Theorem 3.2 (Perron-Frobenius)   If $ T$ is an ergodic stochastic matrix, then $ T^\infty=\lim_{k\to\infty}T^k$ exists.

Observation 3.3   Since $ TT^\infty=T^\infty$, every column $ x$ of $ T^\infty$ satisfies $ Tx=x$; since the eigenvalue $ 1$ has geometric multiplicity $ 1$, each such column is a multiple of the all $ 1$s vector. Thus all rows of $ T^\infty$ are equal; the common row is the unique stationary distribution $ \pi$.
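The observation is easy to see numerically: raising an ergodic $T$ to a high power produces a matrix whose rows all agree, and that common row is stationary. A sketch on a hypothetical ergodic chain (numpy assumed):

```python
import numpy as np

# A hypothetical ergodic stochastic matrix (strongly connected, aperiodic).
T = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.4, 0.1, 0.5]])

# By Perron-Frobenius, T^k converges; k = 100 is far past convergence here.
Tk = np.linalg.matrix_power(T, 100)

# All rows of T^infty agree: each row is the stationary distribution pi.
pi = Tk[0]
assert np.allclose(Tk, np.tile(pi, (3, 1)))

# pi is stationary (pi T = pi) and is a probability distribution.
assert np.allclose(pi @ T, pi)
assert np.isclose(pi.sum(), 1.0)
```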


\begin{exercise}
If a Markov chain is ergodic, then for every initial distribution $q_0$,
$\lim_{k\to\infty}q_0T^k=\pi$, the stationary distribution.
\end{exercise}


\begin{exercise}
If $T$\ is ergodic and doubly stochastic, then the stationary distribution
is uniform.
\end{exercise}

One way to guarantee that $ T$ is doubly stochastic is to require $ T=T^{tr}$. The transition matrix of a random walk on a Cayley graph is always doubly stochastic. Taking the generating set $ S$ to be closed under inverses makes $ T=T^{tr}$, but we do that only so that we can invoke the Landau-Odlyzko theorem.

Theorem 3.4 (Landau-Odlyzko)   A random walk on a regular undirected graph of degree $ \Delta$, diameter $ d$ and $ n$ vertices has an eigenvalue gap

$\displaystyle \gamma=1-\max_{\lambda \ne 1}\vert\lambda\vert\ge\frac{c}{n\Delta d}$

for some absolute constant $ c>0$.

The eigenvalue gap tells us about the rate of convergence to the stationary distribution. This is true in general, but easier to see when $ T$ is symmetric. Then the spectral theorem gives us an orthonormal basis $ e_1,\ldots,e_n$ of eigenvectors for $ T$, with $ e_iT=\lambda_i e_i$. We can choose $ e_1=(1,\ldots,1)/\sqrt n$, the eigenvector for $ \lambda_1=1$. Then for $ C$ the rotation that sends these eigenvectors to the standard basis, $ C^{-1}TC$ is diagonal, with the eigenvalues as its entries. This ``separation of coordinates'' makes raising the matrix to powers easy: $ C^{-1}T^kC=(C^{-1}TC)^k$, a diagonal matrix with entries $ \lambda_i^k$. Since $ \vert\lambda_i\vert<1$ for $ i\ge 2$, this decays exponentially to the diagonal matrix with entries $ (1,0,\dots,0)$. Conjugating by the inverse of $ C$ then gives $ T^\infty$.
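The diagonalization step can be sketched numerically. Below, a hypothetical symmetric (hence doubly stochastic) chain, the lazy walk on a triangle; `eigh` supplies the orthonormal eigenbasis promised by the spectral theorem (numpy assumed):

```python
import numpy as np

# A hypothetical symmetric transition matrix: lazy random walk on a triangle.
T = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])

# Spectral theorem: eigh returns eigenvalues and an orthonormal eigenbasis.
lam, C = np.linalg.eigh(T)          # columns of C are eigenvectors

# Conjugating by C diagonalizes T, with the eigenvalues as entries.
D = C.T @ T @ C
assert np.allclose(D, np.diag(lam))

# Raising to the k-th power just raises the diagonal entries: lambda_i^k.
k = 50
assert np.allclose(np.linalg.matrix_power(T, k), C @ np.diag(lam**k) @ C.T)
```

Since $C$ is orthogonal, $C^{-1}=C^{tr}$, which is why `C.T` plays the role of $C^{-1}$ above.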

To measure convergence of $ T^k$ to $ T^\infty$, we need a measure of the size of a matrix and apply it to $ T^k-T^\infty$, which conjugated by $ C$ gives a diagonal matrix with entries $ (0,\lambda_2,\ldots,\lambda_n)$.

Definition 3.5   The operator norm or matrix norm $ \vert A\vert$ of a matrix $ A$ is $ {\displaystyle \max_{x \ne 0}} \frac{\Vert xA\Vert}{\Vert x\Vert}$, where the norm $ \Vert x\Vert$ of vectors is the $ \ell^2$-norm: $ \sqrt{\sum x_i^2}$. The Frobenius norm $ \Vert A\Vert _F$ is the $ \ell^2$ norm on the entries: $ \sqrt{\sum a_{ij}^2}$.
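Both norms are available in numpy: with the $\ell^2$ vector norm, the operator norm is the largest singular value. A sketch on a hypothetical matrix, also checking the inequalities $\Vert A\Vert\le\Vert A\Vert_F\le\sqrt n\,\Vert A\Vert$ and their sharpness:

```python
import numpy as np

# Hypothetical example matrix.
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
n = A.shape[0]

op_norm   = np.linalg.norm(A, 2)      # operator norm = largest singular value
frob_norm = np.linalg.norm(A, 'fro')  # sqrt of the sum of squared entries

# ||A|| <= ||A||_F <= sqrt(n) ||A||.
assert op_norm <= frob_norm <= np.sqrt(n) * op_norm

# Sharpness: the all-1s matrix J has ||J|| = ||J||_F = n, while the
# identity has ||I|| = 1 and ||I||_F = sqrt(n).
J, I = np.ones((n, n)), np.eye(n)
assert np.isclose(np.linalg.norm(J, 2), np.linalg.norm(J, 'fro'))
assert np.isclose(np.linalg.norm(I, 'fro'), np.sqrt(n) * np.linalg.norm(I, 2))
```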


\begin{exercise}
$\Vert A\Vert\le \Vert A\Vert _F\le \sqrt n\,\Vert A\Vert$. The all $1$s
matrix and the identity matrix show that these are sharp.
\end{exercise}


\begin{exercise}
If $A=A^{tr}$\ then $\Vert A\Vert=\max_{\lambda}\vert\lambda\vert$, the maximum
over the eigenvalues $\lambda$ of $A$. \ {\em Hint.} Use the
spectral theorem.
\end{exercise}


\begin{exercise}
If $C$\ is an orthogonal matrix, $\Vert AC\Vert=\Vert CA\Vert=\Vert A\Vert$.
\end{exercise}

Thus $ \Vert T^k-T^\infty\Vert=\Vert C^{-1}(T^k-T^\infty)C\Vert=\max_{\lambda\ne1}\vert\lambda\vert^k=
(1-\gamma)^k<e^{-\gamma k}$. So for $ \Vert T^k-T^\infty\Vert<\varepsilon$ it suffices to have $ k\ge \frac{1}{\gamma}\ln\frac{1}{\varepsilon}$.
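The sufficiency of $k\ge\frac1\gamma\ln\frac1\varepsilon$ can be checked numerically. A sketch on the hypothetical lazy triangle walk, where $T^\infty$ is the uniform matrix since $T$ is doubly stochastic (numpy assumed):

```python
import numpy as np

# Hypothetical symmetric, doubly stochastic chain: lazy walk on a triangle.
T = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])
n = T.shape[0]
T_inf = np.full((n, n), 1.0 / n)    # doubly stochastic => pi is uniform

# Eigenvalue gap: 1 minus the largest |lambda| over eigenvalues != 1.
lam = np.linalg.eigvalsh(T)
gamma = 1.0 - max(abs(l) for l in lam if not np.isclose(l, 1.0))

# k >= (1/gamma) ln(1/eps) should force ||T^k - T^inf|| < eps.
eps = 1e-6
k = int(np.ceil(np.log(1.0 / eps) / gamma))
deviation = np.linalg.norm(np.linalg.matrix_power(T, k) - T_inf, 2)
assert deviation < eps
```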

Now we need to relate this bound on the operator norm to what we care about: the deviation of the distribution from uniform, $ \sum_{j=1}^n\left\vert p_{ij}^{(k)}-\frac1n\right\vert$.

Set $ \varepsilon=\frac{\delta}n$ and $ A=T^k-T^\infty$. If $ \Vert A\Vert<\varepsilon$, then $ \Vert A\Vert _F^2<n\varepsilon^2$, so for all $ i$ and $ j$, $ \vert A_{ij}\vert\le\Vert A\Vert _F<\sqrt n\,\varepsilon<n\varepsilon=\delta$. So to be within $ \delta$ of the uniform distribution, we need $ \Vert A\Vert<\varepsilon=\frac{\delta}n$, which we can achieve with $ k>\frac{1}{\gamma}\ln\frac 1\varepsilon=O\left(\frac{\ln n}{\gamma}\right)$ (since $ \delta$ is a constant). By Landau-Odlyzko, $ \gamma>\frac{c}{N\Delta d}>cn^{-7}$ ($ N=n^3$, $ \Delta\le n$, $ d\le n^3$), so we can let $ k$ be $ O(n^7\log n)$, which leads to a similar diameter, and we have proved

Theorem 3.6   If $ S_n=\langle S\rangle$ and one of the generators has degree less than $ .3n$ then the diameter of the Cayley graph is $ O(n^7\log^cn)$.

Theorem 3.7   Every chain of subgroups in $ S_n$ has length less than $ 2n$.


\begin{exercise}
Use the above theorem to show that a minimal (irredundant) set of generators of
$S_n$\ has fewer than $2n$\ elements and thus $\Delta<2n$.
\end{exercise}


Varsha Dani 2003-08-15