\magnification = 1200
\hfuzz=10pt
\hsize=4.8in
\vsize=7.3in
\baselineskip=18pt
\hoffset=0.35in
\voffset=0.1in
\parindent=3pt
\def\vsni{\vskip 0.2cm}
\def\v{\par\noindent}
\def\di{\displaystyle}
\def\R{I\!\!R}
\def\PS{I\!\!\!P}
\def\P{\rm \PS}
\def\ES{I\!\!\!E}
\def\E{\rm \ES}
\def\C{I\!\!\!\!C}
\def\N{I\!\!N}
\def\Q{I\!\!\!\!Q}
\def\Z{I\!\!\!\!Z}
\def\ui{[0,1]}
\def\O{\Omega}
\def\CO{{\cal O}}
\def\o{\omega}
\def\t{\theta}
\def\z{\zeta}
\def\l{\lambda}
\def\tz{\tilde{\zeta}}
\def\vt{{\tilde v}}
\def\zg{\N_{0}}
\def\limsup{\mathop{\overline{\rm lim}}}
\def\liminf{\mathop{\underline{\rm lim}}}
\def\ut{{\tilde u}}
\def\dz{\zeta^{\prime}}
\def\S{\Sigma}
\def\s{\sigma}
\def\sp{\sigma^{\prime}}
\def\op{\omega^{\prime}}
\def\a{\alpha}
\def\b{\beta}
\def\k{\kappa}
\def\hf{\hat{f}}
\def\hg{\hat{g}}
\def\hphi{\hat{\xi}}
\def\hpi{\hat{\pi}}
\def\hx{\hat{x}}
\def\hV{\hat {V}}
\def\hW{\hat {W}}
\def\eps{\epsilon}
\def\sp{\sigma^{\prime}}
\def\A{{\cal A}}
\def\L{\cal L}
\def\I{{\cal I}}
\def\ress{r_{\rm ess}}
\def\ess{\rm ess}
\def\Ft{{\cal F}_{\t}}
\def\F{{\cal F}}
\def\df{f^{\prime}}
\def\dhf{\hf^{\prime}}
\def\dhg{\hg^{\prime}}
\def\ddf{f^{\prime \prime}}
\def\dphi{\xi^{\prime}}
\def\dpsi{\psi^{\prime}}
\def\dg{g^{\prime}}
\centerline{\bf ON THE RATE OF CONVERGENCE TO EQUILIBRIUM}
\centerline{\bf FOR COUNTABLE ERGODIC MARKOV CHAINS}
\vglue 0.2cm
\vglue 0.2cm\centerline{Stefano Isola}
\vglue 0.4cm
\centerline{\it Dipartimento di Matematica, Universit\`a degli Studi di
Bologna,}
\centerline{\it piazza di Porta S.Donato 5, I-40127 Bologna, Italy.}
\centerline{\it e-mail: isola@dm.unibo.it}
\vskip 3cm
{\bf Abstract.} Using elementary methods, we prove
that for a countable Markov
chain $P$ of ergodic degree $\ell \geq 1$ the rate of convergence
towards the stationary distribution is $o(n^{-(\ell-1)})$,
provided the initial distribution satisfies certain
conditions of asymptotic decay. An example, modelling
temporal intermittency in dynamical system theory,
is worked out
in detail, illustrating the relationships between convergence
behaviour, analytic properties of the generating functions
associated to transition probabilities
and spectral properties of the Markov operator
$P$ on the Banach space
$\ell_1$. Explicit conditions allowing one to obtain
either optimal bounds or
the actual asymptotics for the rate of
convergence are also discussed.
\vskip 0.2cm
\noindent
AMS 1991 Subject Classification: Primary 60J10
\vskip 0.2cm
\noindent
Keywords: Countable Markov chains, convergence to equilibrium,
ergodic degree, intermittent dynamical systems.
\vfill \eject
\noindent
{\bf 1. Preliminaries and statement of the results.}
\vsni
\noindent
Let $S$ be a countable set and $P:S\times S \to [0,1]$
be a transition probability matrix.
With no loss of generality we may set $S=\N$.
We shall assume that
$P$ governs an irreducible, recurrent and aperiodic Markov chain
$X=(x_n)_0^{\infty}$ with state space
$S$.
To be more precise, we set $\zg :=\N \cup \{0\}$ and
let $\O$ denote the subset of $S^{\zg}$
given by all sequences $\o=(\o_i)_{i\in \zg}$ which satisfy for any
integer $i$: $p_{\o_i\o_{i+1}}\equiv P(\o_i,\o_{i+1})>0$.
Let moreover
${\bf P}_{\nu}$ be the
probability measure with initial distribution ${\nu}$
(that of $x_0$) on $\O$, i.e.
$$
{\bf P}_{\nu}\{x_n(\o)=j\}=\sum \nu_i p^{n}_{ij}=
(\nu P^n)_j
$$
where $p^{n}_{ij}\equiv P^{(n)}(i,j)$.
In particular, if $\nu =\delta_i$, where $i$ is some
reference state chosen from the outset, we have
$$
{\bf P}_{i}\{x_n(\o)=j\}={\bf P}\{x_n(\o)=j|x_0(\o)=i\}=p^{n}_{ij}.
$$
Our sample space will be $\O$ equipped with the
restriction of the product $\s$-field and with probability
measure ${\bf P}_{\nu}$ for some initial distribution
$\nu$. We shall denote by
${\bf E}_{\nu}$ the expectation w.r.t. ${\bf P}_{\nu}$.
Finally, for any $n\in \zg$
we let $x_n$ be the projection on the
$n^{\rm th}$ coordinate, i.e. $x_n(\o)=\o_n$.
\noindent
It is well known that $P$ has a (unique) stationary
probability distribution if and only if
it is ergodic or, equivalently, positive recurrent
(see e.g. [C], [KSK]).
This means that for some
(and hence for all) $i\in S$, the
${\bf P}_i$-expectation $m_{i}$
of $\min\{n\in \N : x_n(\o)=i\}$, the time of the first return to $i$,
is finite.
In this case, the distribution ${\bf \pi}$ on $S$ given by
${\bf \pi} = (\pi_i)_1^{\infty}=(1/m_{i})_1^{\infty}$ is a solution
to ${\bf \pi}={\bf \pi} P$ and thus defines a
stationary distribution.
\noindent
In the sequel we identify sequences
${\bf \nu} =(\nu_i)_1^{\infty} \in \ell_1(S)$,
the corresponding row vectors
${\bf \nu}=(\nu_1,\nu_2,\dots)$, and finite signed measures on $S$, and
define
$$
\Vert {\bf \nu} \Vert = \sum_{i=1}^{\infty}|\nu_i|.
$$
A signed measure $\nu$ satisfying $\sum \nu_l =1$ will be called
a signed distribution in the sequel.
Similarly, we shall identify sequences
${\bf u}=(u_i)_1^{\infty} \in \ell_{\infty}(S)$, the corresponding
column vectors ${\bf u}=(u_1,u_2,\dots)^t$, and bounded
functions on $S$.
\noindent
We introduce the classical {\it taboo} quantities:
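$$\eqalign{
f^n_{ij}&={\bf P}\{x_l(\o) \neq j,\ 0<l<n,\ x_n(\o)=j\,|\,x_0(\o)=i\},\cr
{}_kp^n_{ij}&={\bf P}\{x_l(\o) \neq k,\ 0<l<n,\ x_n(\o)=j\,|\,x_0(\o)=i\},\cr
f^*_{ij}&=\sum_{n=1}^{\infty}f^n_{ij},\qquad
{}_ip^*_{il}=\sum_{n=1}^{\infty}{}_ip^n_{il}={\pi_l\over \pi_i},\cr}
\eqno(1.1)
$$
for $n\geq 1$, with ${}_kp^0_{ij}=0$ (the last identity holds when
$P$ is ergodic). They are related to the $n$-step transition
probabilities by the first entrance decomposition
$$
p^n_{ij}=\sum_{k=1}^{n}f^k_{ij}\,p^{n-k}_{jj},\qquad n\geq 1.\eqno(1.2)
$$
Having fixed a reference state $i$, we also introduce the time of the
first visit to $i$,
$$
r_0^{(i)}(\o)=\min\{n\geq 0:\ x_n(\o)=i\},\eqno(1.3)
$$
and the times between successive visits to $i$,
$$
r_k^{(i)}(\o)=\min\{n\geq 1:\
x_{r_0^{(i)}+\cdots +r_{k-1}^{(i)}+n}(\o)=i\},\qquad k\geq 1,\eqno(1.4)
$$
so that $r_k^{(i)}>0$ for $k\geq 1$.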
Moreover,
the state $i$
being recurrent, $r_1^{(i)},r_2^{(i)},\dots$
are i.i.d. random variables under the probability ${\bf P}_i$.
Their common distribution is given by
$$
{\bf P}_i\{r_k^{(i)}(\o)=n\}=f^n_{ii}\qquad k\geq 1.\eqno(1.5)
$$
On the other hand, having fixed an initial distribution
$\nu$ and a reference state $i$, the random variable $r_0^{(i)}$
(the delay
in the embedded renewal process) is distributed according to
${\bf P}_{\nu}$. More specifically,
$$
{\bf P}_{\nu}\{r_0^{(i)}=n\}=\nu_i \, \delta_{n0}+
\sum_{l\neq i}\nu_lf^n_{li}.
$$
For $\ell \in \N_0$, $i,j\in S$ (and $k\geq 1$), we set
$$
M_{ij}^{(\ell)}:={\bf E}_i(|r_k^{(j)}|^{\ell})=
\sum_{n=1}^{\infty} n^{\ell}f^n_{ij}.\eqno(1.6)
$$
Notice that $m_i\equiv M_{ii}^{(1)}$.
Given a signed distribution
$\nu$ on $S$, we also set,
$$
{\hat M}_{\nu i}^{(\ell)}:={\bf E}_{\nu}(|r_0^{(i)}|^{\ell})=
\sum_{l\neq i}\nu_l\sum_{n=1}^{\infty} n^{\ell}f^n_{li}
=\sum_{l\neq i}\nu_l \, M_{li}^{(\ell)}.
\eqno(1.7)
$$
\vsni
\noindent
{\bf Lemma 1.} {\sl If $\ell \geq 1$, then
$M_{ii}^{(\ell)}<\infty$ if and only if
${\hat M}_{\pi i}^{(\ell-1)}<\infty$.}
\vsni
\noindent
{\sl Proof.}
We have
$$
\eqalign{
{\hat M}_{\pi i}^{(\ell-1)}&=
\sum_{l\neq i}\pi_l\sum_{n=1}^{\infty} n^{\ell-1}f^n_{li}
=\pi_i\sum_{l\neq i} { }_ip^*_{il}
\sum_{n=1}^{\infty} n^{\ell-1}f^n_{li}\cr
&=\pi_i\sum_{n=1}^{\infty}n^{\ell-1}\sum_{m=0}^{\infty}
\sum_{l\neq i} { }_ip^m_{il} { }_ip^n_{li}
=\pi_i\sum_{n=1}^{\infty}n^{\ell-1}\sum_{m=0}^{\infty}
{ }_ip^{n+m}_{ii}\cr
&=\pi_i\sum_{n=1}^{\infty}n^{\ell-1}\sum_{m=n}^{\infty}
f^{m}_{ii}
=\pi_i\sum_{n=1}^{\infty}n^{\ell}f^{n}_{ii}=\pi_i \, M_{ii}^{(\ell)}\cr
}
$$
where we have used the last identity in (1.1) along with the properties
${ }_ip^0_{il}=0$ and
${ }_ip^{n+m}_{ii}=\sum_{l\neq i} { }_ip^m_{il} { }_ip^n_{li}$.
\hfill$\diamondsuit$
\vsni
\noindent
Furthermore, it is well known that, for a recurrent chain,
if $M_{ii}^{(\ell)}<\infty$ for some state
$i$ then $M_{jj}^{(\ell)}<\infty$ for all
$j\in S$ (see, e.g., [C], Chap. I.11, Cor. 1).
We now state the following definition
(see [KSK] for related notions),
\vsni
\noindent
{\bf Definition 1.} {\it The recurrent chain $P$ has
{\rm ergodic degree} $\ell$ if
$M_{ii}^{(\ell)}<\infty$ but $M_{ii}^{(\ell+1)}=\infty$
for some (and hence for all) $i\in S$.}
\vsni
\noindent
Notice that $M_{ii}^{(0)}=1$ so that the degree is always defined.
Thus, null chains have ergodic degree $0$ whereas ergodic chains
have ergodic degree at least $1$. If $M_{ii}^{(\ell)}<\infty$
for every $\ell$, one says that $P$ has infinite ergodic degree.
This happens for instance if the coefficients $f_{ii}^n$ decay
geometrically with $n$.
In this case the corresponding
chain is accordingly called geometrically ergodic.
We refer to [FMM] for related convergence results
in the geometrically ergodic case.
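\vsni
\noindent
As a simple illustration of Definition 1 (a sketch under the
assumption, made here only for the sake of example, that
$f^n_{ii}=c\, n^{-\alpha}$ for all $n\geq 1$, with $\alpha >2$ and
$c$ a normalizing constant), we have
$$
M_{ii}^{(k)}=c\sum_{n=1}^{\infty}n^{k-\alpha}<\infty
\quad\hbox{if and only if}\quad k<\alpha -1,
$$
so that the ergodic degree is the largest integer $\ell$ with
$\ell <\alpha -1$. For instance, $\alpha =7/2$ gives $\ell =2$, whereas
$\alpha =3$ gives $\ell =1$, since in that case
$M_{ii}^{(2)}=c\sum n^{-1}=\infty$.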
\noindent
Finally, the preceding
observations motivate the next definition.
\vsni
\noindent
{\bf Definition 2.} {\it
Given an ergodic chain $P$ with state space $S$
and a signed distribution
$\nu$ on $S$,
we say that $\nu$ has $P$-order $\ell$, if
${\hat M}_{\nu i}^{(\ell)}<\infty$ but
${\hat M}_{\nu i}^{(\ell+1)}=\infty$
for some (and hence for all) $i\in S$. }
\vsni
\noindent
In particular, from
Lemma 1 it follows that if $P$ has ergodic degree $\ell$ then
$\pi$ has $P$-order $\ell-1$, and vice versa. We now state the main
result of this Section.
\vsni
\noindent
{\bf Theorem 1.} {\sl Suppose $P$ has ergodic degree
$\ell \geq 1$. Then, for any initial signed distribution
${\bf \nu}$ of $P$-order at least $\ell -1$, we have}
$$
||{\bf \nu} P^n-{\bf \pi}|| =o(n^{-(\ell-1)}).
$$
\vsni
\noindent
{\bf Remark.}
In the next Section we shall prove
Theorem 1 in several Lemmas
using elementary arguments.
Some partial results
in our direction were obtained in [F1].
Related results can be obtained
using the coupling method (see [L]).
Another method, making use of matrix-valued
analytic functions and allowing one to sharpen
the above result under suitable conditions, is proposed
in Section 3 for a specific example.
\vsni
\noindent
We let $\tau$ be the shift transformation on $\O$, that is
$x_k\circ \tau (\o)=\o_{k+1}$.
With $P$ and ${\bf \pi}$ one can
define a $\tau$-invariant Markov random field $\mu = \mu(P,{\bf \pi})$
supported by $\O$ as follows:
$$
\mu(\{ x_k(\o)=\xi_0,\dots ,x_{k+n}(\o)=\xi_{n}\})=
\pi_{\xi_0}\prod_{j=1}^np_{\xi_{j-1}\xi_j}\eqno(1.8)
$$
We then have the following,
\vsni
\noindent
{\bf Corollary 1.} {\sl Suppose $M_{ii}^{(\ell)}<\infty$
for some $\ell \geq 1$. Then, for any pair of
bounded vectors ${\bf u}, {\bf v} : S \to \R$,}
$$
|\, \mu({\bf u}(x_n){\bf v}(x_0))-\mu({\bf u}(x_0))\,\mu({\bf
v}(x_0))\,|
= \, o(n^{-(\ell-1)})
$$
\vsni
\noindent
{\sl Acknowledgements}: I would like to thank Lai-Sang Young
for interesting conversations at the origin of this research.
\vskip 0.5cm
\noindent
{\bf 2. Proofs of the main results.}
\vsni
\noindent
First, we recall the following result, which will be used
several times in the sequel and whose proof can be
found in ([C], Chap. I.5, Lemma A):
\vsni
\noindent
{\bf Lemma 2.} {\sl Let $\{a_n\}_{n\geq 0}$ be a
sequence of nonnegative numbers not all vanishing.
If
$$\lim_{n\to \infty} {a_n\over \sum_{m=0}^na_m}=0
$$
then, whenever the sequence $\{b_n\}_{n\geq 0}$ of real
numbers has a limit, we have}
$$
\lim_{n\to \infty}{\sum_{m=0}^na_mb_{n-m}\over \sum_{m=0}^na_m}
=\lim_{n\to \infty} b_n.
$$
\noindent
The following three Lemmas are stated
under the hypotheses of Theorem 1.
\vsni
{\bf Lemma 3.} {\sl For each $i\in S$, we have}
$$
|p_{ii}^n-\pi_i|=o(n^{-(\ell-1)}).
$$
{\sl Proof.}
We introduce the generating functions
$$
P_{ij}(z)=\sum_{n=0}^{\infty}p^n_{ij}\, z^n,\qquad
F_{ij}(z)=\sum_{n=0}^{\infty}f^n_{ij}\, z^n\eqno(2.1)
$$
and from (1.2) we get the relations (we set $f^n_{ij}=0$ for $n=0$)
$$
P_{ii}(z)={1\over 1-F_{ii}(z)}, \quad
P_{ij}(z)=F_{ij}(z)P_{jj}(z),\quad i\neq j.\eqno(2.2)
$$
We first show that the function $P_{ii}(z)$ is
analytic in $|z|<1$ and converges at every point of the
unit circle except $z=1$.
Indeed, recurrence of the state $i$ implies $F_{ii}(1)=1$,
so that $|F_{ii}(z)|<1$ for $|z|<1$ because $f^n_{ii}\geq 0$.
Moreover, $|F_{ii}(z)|<1$ also for $|z|=1$, $z\neq 1$.
This follows from the fact that, since the chain is
aperiodic, ${\rm g.c.d.}\{n, f^n_{ii}\neq 0\}=1$.
Now set
$$
D_{ii}(z)=\sum_{n=0}^{\infty}d_n^{(i)} z^n, \;\;\;
d_n^{(i)} :=\sum_{k>n}f^k_{ii}
$$
and notice that $D_{ii}(z)$ converges absolutely in $|z|\leq 1$ and
has no zeros on $|z|=1$. In addition
$\sum_{n=0}^{\infty}d_n^{(i)}=m_i$. It then follows
(see [Z], p.140) that the function
$$
{1\over D_{ii}(z)}=(1-z)P_{ii}(z)
$$
has a power series expansion which converges absolutely in the closed
unit disk and, moreover, its value at $z=1$ is
$m_i^{-1}=\pi_i$. As a
consequence of Abel's theorem
(see [T], p.229) we then find
$$
\lim_{n\to \infty}p^n_{ii} = \pi_i.
$$
We now study the rate of convergence to zero of
$$
\mu_n^{(i)} := m_i\, p^n_{ii}-1
$$
as $n\to \infty$.
To this end, we first observe that $\mu_n^{(i)}$ is the coefficient
of $z^n$ in
$$
H_{ii}(z):=m_i\, P_{ii}(z)-{1\over 1-z} = {E_{ii}(z)\over D_{ii}(z)}
$$
where
$$
E_{ii}(z)=\sum_{n=0}^{\infty}e_n^{(i)} z^n, \;\;\; e_n^{(i)}
:=\sum_{k>n}d_k^{(i)}.
$$
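The identity $H_{ii}(z)=E_{ii}(z)/D_{ii}(z)$ can be checked directly
(a short verification, for $|z|<1$, using only the definitions above).
Since $d_0^{(i)}=\sum_{k>0}f^k_{ii}=1$ and
$d_{n-1}^{(i)}-d_n^{(i)}=f^n_{ii}$, we have
$(1-z)D_{ii}(z)=1-F_{ii}(z)$, that is
$P_{ii}(z)=1/((1-z)D_{ii}(z))$. Moreover, as $m_i=\sum_{n\geq 0}d_n^{(i)}$,
$$\eqalign{
m_i-D_{ii}(z)&=\sum_{n=1}^{\infty}d_n^{(i)}(1-z^n)
=(1-z)\sum_{n=1}^{\infty}d_n^{(i)}\sum_{k=0}^{n-1}z^k\cr
&=(1-z)\sum_{k=0}^{\infty}z^k\sum_{n>k}d_n^{(i)}
=(1-z)E_{ii}(z),\cr}
$$
so that
$$
m_i\,P_{ii}(z)-{1\over 1-z}={m_i-D_{ii}(z)\over (1-z)D_{ii}(z)}
={E_{ii}(z)\over D_{ii}(z)}.
$$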
Also observe that
$$
\partial F_{ii}(1)=
\sum_{n=1}^{\infty}n f^n_{ii}=M_{ii}^{(1)}= m_i=D_{ii}(1),
$$
where the notation $\partial^m G(z_0)={d^mG(z)/ dz^m}\big|_{z=z_0}$
has been used.
Let us write $H_{ii}(z)$ as
$$
H_{ii}(z)=P_{ii}(z)\cdot \bigl\{\, (1-z)E_{ii}(z)\, \bigr\}=
P_{ii}(z)\cdot \bigl\{\,m_i-D_{ii}(z)\, \bigr\}.
$$
It then follows that under the only
assumption that $M_{ii}^{(1)} <\infty$ the power series of the
term within braces
converges absolutely for $z=1$ and its value is $0$.
This fact and the above discussion entail
that $\mu_n^{(i)} \to 0$ as $n\to \infty$.
Next, assume $M_{ii}^{(2)}<\infty$. In this case we have
$$
\partial^2 F_{ii}(1)=\sum_{n=2}^{\infty}n(n-1) f^n_{ii}=
M_{ii}^{(2)}-M_{ii}^{(1)}=2\,\partial D_{ii}(1)=
2\,E_{ii}(1).
$$
Then notice that
$n\, \mu_n^{(i)}$ is the coefficient of $z^{n-1}$ in
$$
\partial H_{ii}(z) = P_{ii}(z)\cdot \left\{\, E_{ii}(z)-
m_i\,{\partial D_{ii}(z)\over D_{ii}(z)}\, \right\}
$$
and by the above identities
the power series of the term within curly brackets converges
absolutely for $z=1$ and
its value is $0$.
We then have that
$n\, \mu_n^{(i)}$ tends to zero.
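\noindent
The expression for $\partial H_{ii}(z)$ used above can be verified as
follows (a sketch, valid for $|z|<1$). From
$P_{ii}(z)=1/((1-z)D_{ii}(z))$ we get
$$
\partial P_{ii}(z)=P_{ii}(z)\left\{{1\over 1-z}-
{\partial D_{ii}(z)\over D_{ii}(z)}\right\},
\qquad {1\over (1-z)^2}=P_{ii}(z)\,{D_{ii}(z)\over 1-z},
$$
and therefore
$$
\partial H_{ii}(z)=m_i\,\partial P_{ii}(z)-{1\over (1-z)^2}
=P_{ii}(z)\left\{{m_i-D_{ii}(z)\over 1-z}-
m_i\,{\partial D_{ii}(z)\over D_{ii}(z)}\right\},
$$
which is the stated formula because $m_i-D_{ii}(z)=(1-z)E_{ii}(z)$.
At $z=1$ the term within braces equals
$E_{ii}(1)-\partial D_{ii}(1)=0$, since $D_{ii}(1)=m_i$.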
\noindent More generally, if $M_{ii}^{(\ell)}<\infty$,
one has the identities
$$
\eqalign{
\partial^{\ell} F_{ii}(1) &=
\sum_{n={\ell}}^{\infty}n(n-1)\dots (n-{\ell}+1) f^n_{ii}\cr
&=
\sum_{j=1}^{\ell}S_{\ell}^{(j)}M_{ii}^{(j)}={\ell}\,
\partial^{\ell-1} D_{ii}(1)=
{\ell}({\ell}-1)\, \partial^{\ell-2} E_{ii}(1)
\cr }
$$
where the $S_{\ell}^{(j)}$ are Stirling numbers of the first kind
defined by the relation
$$
x(x-1)\dots (x-{\ell}+1)=\sum_{j=1}^{\ell}S_{\ell}^{(j)}x^j.
$$
For $k\geq 0$, set
$$
J_k(z):={1\over P_{ii}(z)}\cdot\partial^k H_{ii}(z)
$$
so that
$$
J_{k+1}(z)=\partial J_k(z) + J_k(z)\cdot
\partial\log P_{ii}(z).
$$
Using these relations
and proceeding inductively in $\ell$
we find that if $M_{ii}^{(\ell)}<\infty$
then the power series of
$$
{1\over P_{ii}(z)}\cdot\partial^{{\ell}-1}H_{ii}(z)
$$
converges absolutely for $z=1$ and its value is $0$.
The above and the fact that
$n(n-1)\dots (n-{\ell}+2)\mu_n^{(i)}$ is the coefficient of
$z^{n-{\ell}+1}$ in $\partial^{{\ell}-1}H_{ii}(z)$
then complete the proof.
\hfill$\diamondsuit$
\vsni
{\bf Lemma 4.} {\sl For all $j\in S$,}
$$
|p_{ij}^n-\pi_j|=o(n^{-(\ell-1)})
$$
\noindent
{\sl Proof.}
First, using Lemma 2, it follows from (1.2) and
$\sum_{n}f^n_{ij}=1$ that
$p_{ij}^n\to\pi_j$ as $n\to \infty$.
Moreover, as already remarked,
the condition $M_{ii}^{(\ell)}<\infty$
implies that $\sum_{n=1}^{\infty} n^{\ell}f^n_{ij}<\infty$,
for all pairs (distinct or not) $i,j\in S$.
This implies
$\sum_{k=n}^{\infty}f^k_{ij}=o(n^{-\ell})$.
Putting together these observations,
Lemma 2 and Lemma 3, along with the inequality
$$
|p_{ij}^n-\pi_j|\leq \sum_{k=1}^{n}f^k_{ij}|p_{jj}^{n-k}-\pi_j|
+\pi_j\sum_{k=n+1}^{\infty}f^k_{ij},
$$
we then have that the rate of convergence is the same
as in Lemma 3. \hfill$\diamondsuit$
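\vsni
\noindent
For completeness, here is a sketch of how the inequality used in the
proof of Lemma 4 can be obtained, assuming that (1.2) is the first
entrance decomposition
$p^n_{ij}=\sum_{k=1}^{n}f^k_{ij}\,p^{n-k}_{jj}$. Since
$\sum_{k\geq 1}f^k_{ij}=1$, we can write
$$
p^n_{ij}-\pi_j=\sum_{k=1}^{n}f^k_{ij}\,(p^{n-k}_{jj}-\pi_j)
-\pi_j\sum_{k=n+1}^{\infty}f^k_{ij},
$$
and the triangle inequality gives the claim. Moreover, the tail is
controlled by the moment of order $\ell$:
$$
\sum_{k=n}^{\infty}f^k_{ij}\leq n^{-\ell}
\sum_{k=n}^{\infty}k^{\ell}f^k_{ij}=o(n^{-\ell}),
$$
the last sum being the remainder of a convergent series.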
\vsni
\noindent
These properties entail that $P^n$ tends to the matrix
whose rows are $(\pi_1,\pi_2,\dots)$.
In addition, we have the following
\vsni
{\bf Lemma 5.}
$$
\sum_{j\in S}|p_{ij}^n-\pi_j|=o(n^{-(\ell-1)}).
$$
\noindent
{\sl Proof.}
This follows
by first noting that the proofs of Lemmas 3 and 4
actually imply that $|{p_{ij}^n/ \pi_j}-1|=o(n^{-(\ell-1)})$
and then using the fact that ${ \pi}$ is a probability vector.
\hfill$\diamondsuit$
\vsni
\noindent
{\sl Proof of Theorem 1.}
Lemma 5 implies that $||{\bf \delta}_iP^n-{\bf
\pi}||=o(n^{-(\ell-1)})$.
Putting ${\bf \nu} = \sum \nu_l \, \delta_l$ and using the fact
that ${\bf \nu}$ is normalized, i.e. $\sum \nu_l=1$, we write
$$\eqalign{
{\nu} P^n-{\pi}=\sum_l &\, \nu_l \,\, ({\delta}_iP^n-{\pi})+
\sum_{l \neq i}\nu_l \, ({\delta}_lP^n-{\delta}_iP^n)\cr
&=({\delta}_iP^n-{\pi})+
\sum_{l \neq i}\nu_l \, ({\delta}_lP^n-{\delta}_iP^n)\cr }
$$
The $\ell_1$-norm of the first term in the r.h.s.
is then estimated by
Lemma 5. For the second term we have
$\Vert {\delta}_lP^n-{\delta}_iP^n \Vert =
\sum_j | p_{lj}^n-p_{ij}^n|$. Using the decompositions
$p_{lj}^n={}_ip_{lj}^n +\sum_{k=1}^{n-1}f_{li}^kp_{ij}^{n-k}$
([C], Chap. I.9, Thm. 1) and $p_{ij}^n=\sum_{k=1}^{n-1}f_{li}^k\,
p_{ij}^{n}+
\sum_{k=n}^{\infty}f_{li}^k\, p_{ij}^{n}$,
and noting that $\sum_j{}_ip_{lj}^n=\sum_{k\geq n}f_{li}^k$
and $\sum_jp_{ij}^{n}=1$, we obtain
$$
\Vert {\delta}_lP^n-{\delta}_iP^n \Vert
\leq
2\sum_{k= n}^{\infty}f_{li}^k+
\sum_{k=1}^{n-1}f_{li}^k\sum_j|p_{ij}^{n-k}-p_{ij}^n|.
$$
Now, using Lemma 2, Lemma 5 and $f^*_{li}=1$, one readily
shows that the second term above gives a contribution
$o(n^{-(\ell -1)})$ to $\Vert {\nu} P^n-{\pi}\Vert$.
The same holds true for the first term, since
the assumption that ${\bf \nu}$
has $P$-order at least $\ell -1$ (w.r.t. the reference state
$i$) implies that
$$
\sum_{l \neq i}\nu_l \,\sum_{k\geq n}f_{li}^k =o(n^{-(\ell-1)}).
\eqno{\diamondsuit}
$$
\vsni
\noindent
{\bf Remark 1.} As we made no use of the fact that
$M_{ii}^{(\ell+1)}=\infty$, the statement of Theorem 1 actually
holds under the only assumption that $M_{ii}^{(\ell)}<\infty$.
We shall see later an example where
the assumption
on the ergodic degree
can be made effective to yield
optimal bounds on the rate
of convergence.
\vsni
\noindent
{\bf Remark 2.} The proof given
above brings out the meaning
of the condition on the $P$-order of the initial distribution
$\nu$. This is related
to the fact that the behaviour of
$|p_{ij}^n-\pi_j|$ and hence of
$||{\bf \delta}_iP^n-{\bf \pi}||$ is
necessarily not uniform in the
departing state index $i$. Indeed, according to the
above discussion, such uniformity would
imply the existence of two positive constants $C_1, C_2$
and an integer $n_0$,
which {\sl do not depend} on $i$ and $l$,
such that, for all $n\geq n_0$
$$
C_1 \leq {\sum_{k\geq n}f_{li}^k \over
\sum_{k\geq n}f_{il}^k} \leq C_2.
$$
This, in turn, would imply that the ratio
$M_{li}^{(1)}/M_{il}^{(1)}$ satisfies a similar bound.
On the other hand it is well known
that
$\lim_{l\to \infty}{(M_{li}^{(1)}/ M_{il}^{(1)})}=0$,
for all $i\in S$ ([C], Chap. I.11, Thm. 6; see also [H]).
\vsni
\noindent
{\sl Proof of Corollary 1.}
For any pair ${\bf u}\in \ell_{\infty}(S)$, ${\bf \rho} \in \ell_1(S)$
we define
${\overline {\rho {\bf u }}}=
(\rho(1)u(1),\rho(2)u(2),\dots)$ and
$ \rho\cdot{\bf u} = \sum_{i\in S}
\rho(i)u(i)$. Thus ${\overline {\rho{\bf u}}}\cdot {\bf 1} =
{\bf \rho}
\cdot {\bf u}$,
and the unit column vector ${\bf 1}=(1,1,\dots)^t$ satisfies
$P{\bf 1}={\bf 1}$. For definiteness and without loss of generality,
suppose that $\mu({\bf u})\, \mu({\bf v})\neq 0$.
Then we have
$$
\eqalign{
|\, \mu({\bf u}(x_n){\bf v}(x_0))-
\mu({\bf u}(x_0))\,\mu({\bf v}(x_0))\,| &=
|\,{\overline {{\bf \pi} {\bf v}} } \, P^n\cdot {\bf u} -
({\pi} \cdot {\bf v})({\pi} \cdot {\bf u}) \, | \cr
&=|\,({\overline {{\bf \pi} {\bf v}}}\, P^n-
{\bf \pi} \, ({\overline {{\bf \pi} {\bf v}}}\cdot {\bf 1})\,)
\cdot {\bf u}\, |\cr
&\leq \Vert {\bf u}\Vert_{\infty}\, \Vert {\bf v}\Vert_{\infty}\,
\Vert \,\nu\,P^n-{\bf \pi}\, \Vert\cr
}
$$
where $\nu$ denotes the normalized $\ell_1$ row vector
${\overline {{\pi} {\bf v}}}/ ({\pi} \cdot {\bf v})$.
The result now follows putting together Lemma 1 and Theorem 1.
\hfill$\diamondsuit$
\vskip 0.5cm
\noindent
{\bf 3. Convergence vs analytic
and spectral properties.}
{\bf An example.}
\vsni
\noindent
As we have seen, the dependence
on the departing state $i$ of the behaviour of
$\Vert \delta_i P^n -\pi \Vert$,
although not explicitly indicated in Lemma 5,
is what makes
our assumptions on the $P$-order of the initial
distribution $\nu$ necessary.
\noindent
Moreover, from our discussion it follows that
the rate of convergence
to zero of $\Vert \delta_i P^n -\pi \Vert$ is connected
with the analytic properties of the generating functions
$P_{ij}(z)$ in the vicinity of the singular point $z=1$.
\noindent
If we now consider $P$ as a bounded linear
Markov operator acting on the Banach space $\ell_1(S)$,
its adjoint $P^*$ is
represented by the transposed matrix acting on the dual space
$\ell_1^*=\ell_{\infty}$. The resolvent
$R_{\lambda}(P):=(\lambda I - P)^{-1}$ admits,
for $|\lambda|> \Vert P\Vert$, the expansion
$$
\lambda \, R_{\lambda}(P) = I +
\sum_{n=1}^{\infty}\left({P\over \lambda}\right)^n
$$
which shows that $P_{ij}(z)$ is the
$(i,j)$-element of $\lambda \, R_{\lambda}(P)$,
with the identification $z=1/\lambda$.
This, in turn, indicates that the convergence properties
of $\Vert \delta_i P^n -\pi \Vert$, the analytic properties
of the functions $P_{ij}(z)$, and the spectral properties
of $P$ in $\ell_1(S)$ are intimately connected items.
In particular, the dependence of the first two on the
state index $i$ plays an important role in
determining the nature of the latter,
as we shall see in the following example\footnote{$^{1}$}{
We shall adopt the convention that a matrix
$(t_{ij})$ representing an operator $T$ acts from the right, that is
through the equations $(Tx)_j=\sum_{i\in S}x_i\,t_{ij}$.}.
\vsni
\noindent
{\sl Example.}
Suppose that $S=\N$ and the transition matrix is
$$
P=\pmatrix{p_1&p_2&p_3&\ldots \cr
1 &0 &0 &\ldots \cr
0 &1 &0 &\ldots \cr
0 &0 &1 &\ldots \cr
\vdots &\vdots &\vdots&\ddots \cr}
$$
The space $\O$ is then given by all sequences $\o$ satisfying
the following condition: given $\o_i$ then either $\o_{i-1}=\o_i+1$
or $\o_{i-1}=1$.
We shall assume that the probability vector $p=(p_1,p_2,\dots)$
has the property
${\rm g.c.d.}\{n: p_n>0\}=1$.
It then follows that the corresponding
chain is aperiodic and recurrent.
Let the coefficients $d_n$ be defined by
$d_n:=\sum_{i>n}p_i$, ($n\geq 0$).
The steady-state equation is $\pi_n=\sum_{i\in S}\pi_i\, p_{in}$
and is formally solved by $\pi_n=\pi_1\, d_{n-1}$, ($n\geq 1$).
We also have $f^n_{11}=p_n$.
Consequently, the chain is positive-recurrent
if and only if $\sum d_n < \infty$,
null-recurrent in the opposite case.
In the former case, we have
$\pi_1=(\sum_{n=1}^{\infty}np_n)^{-1}=
(\sum_{n=0}^{\infty}d_n)^{-1}$.
Notice that the two probability vectors $\pi$ and $p$ coincide
if and only if $p_n=2^{-n}$. On the other hand, if
$p_n\sim n^{-(\ell +2)}\, L(n)$ with $L(n)$
slowly varying at infinity,
then the chain
has ergodic degree $\ell$.
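\vsni
\noindent
The formal solution of the steady-state equation can be checked in
two lines. For this chain $p_{1n}=p_n$ and $p_{n+1,n}=1$ are the only
non-zero entries of the $n$-th column of $P$, so that
$$
\pi_n=\pi_1\,p_n+\pi_{n+1},\quad\hbox{that is}\quad
\pi_{n+1}=\pi_1\Bigl(1-\sum_{k=1}^{n}p_k\Bigr)=\pi_1\,d_n ,
$$
and the normalization $\sum_{n\geq 1}\pi_n=\pi_1\sum_{n\geq 0}d_n=1$
fixes $\pi_1$. As a concrete instance, for $p_n=2^{-n}$ one finds
$d_n=2^{-n}$, $\sum_{n\geq 0}d_n=2$, $\pi_1=1/2$ and
$\pi_n=2^{-n}=p_n$.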
\vsni
\noindent
{\bf Remark.}
This example has interesting applications
in modelling temporal intermittency in dynamical systems
theory (see, e.g., [W]). A brief discussion
on the consequences of the results stated below
in the context of intermittent interval maps
is given in the Appendix at the end of the paper.
\vsni
\noindent
{\bf Proposition 1.} {\sl Suppose that the chain $P$ defined above
has finite ergodic degree $\ell \geq 1$. Then,
\item{I.} The generating functions $P_{ij}(z)$
defined in (2.1) are analytic in
the open unit disk with no singularities on the unit circle
besides a non-polar singular point at $z=1$;
\item{II.} The spectrum $\sigma (P)$ of
the Markov operator $P$ acting on
$\ell_1(\N)$ coincides with the closed unit disk and
decomposes as follows:
$\sigma_p (P)=\{\lambda : |\lambda|<1\}\cup \{1\}$ and
$\sigma_c (P)=\{\lambda : |\lambda|=1, \lambda \neq 1\}$;
\item{III.} for any
bounded vector ${\bf u}$ and any initial
distribution $\nu=(\nu_i)_1^{\infty} \in \ell_1(S)$ s.t.
$\nu_i = { O}(\pi_i)$,
the quantity
$({\bf \nu} P^n-{\bf \pi})\cdot {\bf u}$
decays as $o(n^{-(\ell-1)})$.
\noindent
Assume furthermore that
$p_n\sim n^{-(\ell +2)}\, L(n)$ with $L(n)$
slowly varying at infinity. Then,
for $n$ large enough,
$$
|\,({\bf \nu} P^n-{\bf \pi})\cdot {\bf u}\, |\leq
C \, n^{-\ell}\, L(n),
$$
where
$C=({\pi}\cdot {\bf u})\,({ \nu}\cdot {\bf 1})/(\ell (\ell+1)m_1)$.
Under the additional hypotheses that
$u_i = o (1)$ and $\nu_i=o (\pi_i)$
we have
$$
({\bf \nu} P^n-{\bf \pi})\cdot {\bf u}\sim
C \, n^{-\ell}\, L(n)\quad \hbox{as}\quad n\to \infty.
$$
}
\noindent
The proof of Proposition 1 will follow from the
discussion given hereafter.
\vsni
\noindent
{\sl I. Generating functions and their analytic properties.}
\noindent
First, it is easy to check that
all entries of the
first $n$ rows of $P^n$ are positive, the $i$-th row of
$P$ being the $(i+n-1)$-th of $P^n$. More specifically, one sees
inductively
that for $n>1$, $i>1$, $j\in \N$,
$$
P^{n}(i,j)=P^{n-1}(i-1,j).\eqno(3.1)
$$
For the generating functions
of the $P^{n}(i,j)$'s we then obtain the relations
$$\eqalign{
P_{ij}(z) &= \delta_{ij}+ z^{i-1}P_{1j}(z),\quad j\geq i>1 \cr
P_{i1}(z) &= z^{i-1}P_{11}(z),\quad i\geq 1 \cr
P_{ij}(z) &= z^{i-j} + z^{i-1}P_{1j}(z),\quad i>j>1.\cr }
\eqno(3.2)
$$
It then suffices to study the behaviour of the
entries of the first row.
They satisfy the recurrence relations (recall that
$P^0(i,j)=\delta_{ij}$),
$$
P^{n}(1,j)=P^{n-1}(1,1)\, P(1,j) + P^{n-1}(1,j+1), \qquad j\geq 1.
$$
This yields
$$
P^{n}(1,j)=\sum_{k=1}^{n-1}P(1,k)\ P^{n-k}(1,j) + P(1,j+n-1).
\eqno(3.3)
$$
Putting $j=1$ and recalling that $P(1,k)=p_k=f^k_{11}$
one gets a particular case of equation (1.2),
$$
P^{n}(1,1)=\sum_{k=1}^nP(1,k)\ P^{n-k}(1,1).
$$
We then find for the generating function of the $P^{n}(1,1)$'s,
$$
P_{11}(z) = {1\over 1-\sum_{n=1}^{\infty}p_nz^n}
={1\over (1-z)D(z)}\eqno(3.4)
$$
where $D(z)=\sum_{n=0}^{\infty}d_nz^n$. More generally,
we get for $j>1$
$$
P_{1j}(z) = {z^{1-j}\, P_j(z)\, P_{11}(z)}\eqno(3.5)
$$
where $P_j(z)=\sum_{n=j}^{\infty}p_{n}z^n$.
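\noindent
Formula (3.5) can be derived from (3.3) as follows (a sketch for
$j>1$ and $|z|<1$). Multiplying (3.3) by $z^n$, summing over
$n\geq 1$ and using $P^0(1,j)=0$, we get
$$\eqalign{
P_{1j}(z)&=\Bigl(\sum_{k=1}^{\infty}p_kz^k\Bigr)P_{1j}(z)
+\sum_{n=1}^{\infty}p_{j+n-1}z^n\cr
&=\Bigl(\sum_{k=1}^{\infty}p_kz^k\Bigr)P_{1j}(z)+z^{1-j}P_j(z),\cr}
$$
whence $P_{1j}(z)=z^{1-j}P_j(z)/(1-\sum_{k\geq 1}p_kz^k)=
z^{1-j}P_j(z)\,P_{11}(z)$ by (3.4).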
Finally, using (1.1)-(1.2) along with (3.2), (3.4) and (3.5) we obtain
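$$\eqalign{
F_{ij}(z) &= z^{i-j},\qquad\qquad\qquad \quad i> j, \cr
F_{ij}(z) &={z^{i-j}P_j(z)\over 1-\sum_{0<n<j}p_nz^n},
\qquad i\leq j.\cr }
\eqno(3.6)
$$
Since the series $\sum p_nz^n$ converges absolutely for $|z|\leq 1$,
it follows from (3.2), (3.4) and (3.5) that the functions
$P_{ij}(z)$ can have a singularity at a point $z=e^{i\theta}$,
$0<\theta <2\pi$, of the unit circle only if
$\sum_{n=1}^{\infty}p_ne^{in\theta}=1$, that is only if
$\cos (n\theta)=1$ for every $n$ such that $p_n>0$.
But this is impossible because of our assumption
on the probability vector $p$. Therefore the functions
$P_{ij}(z)$ are analytic in the open unit disk, with only
one singularity on the unit circle, at $z=1$.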
\noindent
The
precise nature of the latter depends on
the asymptotic behaviour of the coefficients $p_n$.
For example, if $p_n$ decreases exponentially so does $d_n$
and hence $P_{ij}(z)$ can be extended analytically to an open
domain containing the unit disk, with the exception of a
simple pole at $z=1$.
More generally, one sees that
if the chain has finite ergodic degree, that is
$\sum n^{\ell}p_n<\infty$ but $\sum n^{\ell+1}p_n=\infty$,
for some $\ell \geq 0$,
then $z=1$ is a non-polar singular point
for $P_{ij}(z)$. To see this, define the family
of formal tail sequences $\{d^{(m)}_n\}$, $m\geq 1$, as
$d^{(1)}_n=d_n$ and $d^{(m)}_n =\sum_{l>n}d^{(m-1)}_l$
for $m>1$,
and formal power series $D^{(m)}(z)=\sum d^{(m)}_n z^n$,
so that, in particular, $D^{(1)}(z)=D(z)$. It is easy to
see that if $P$ is ergodic with ergodic degree $\ell \geq 1$ then
$\{d^{(m)}_n\}$ is defined for $m\leq \ell +1$,
$D^{(\ell)}(z)$ is absolutely convergent at $z=1$, with
$D^{(\ell)}(1)=d^{(\ell)}_0+d^{(\ell +1)}_0$, but $D^{(\ell+1)}(z)$
diverges at $z=1$.
\noindent
Now, if $\ell =0$ then
clearly $(1-z)P_{11}(z)\to 0$ but $P_{11}(z)\to \infty$ as
$z\uparrow 1$. If, instead, $\ell \geq 1$, we define
$$
H^{(\ell)}(z)= P_{11}(z)\,
(D^{(\ell)}(1)-D^{(\ell)}(z))=
{D^{(\ell+1)}(z)\over D^{(1)}(z)}.
$$
Again, we have $(1-z)H^{(\ell)}(z)\to 0$ but $H^{(\ell)}(z)\to \infty$
as
$z\uparrow 1$. This shows that for each of these functions, but
in particular for $P_{11}(z)$, the point $z=1$ is a non-polar
singular point. Due to (3.2) and (3.5),
similar statements hold for $P_{ij}(z)$ with arbitrary $i,j$.
\vsni
\noindent
{\sl II. Spectral properties of $P:\ell_1(\N)\to \ell_1(\N)$.}
\noindent
From (3.1)-(3.2) we have that
the rate of convergence of $P^n(i,j)$ to $\pi_j$ is not uniform
in the departing state $i$
(see also Remark 2 after the proof of Theorem 1).
We are now going to see how this fact
reflects in the nature of the spectrum of $P$ in $\ell_1$.
In particular, the eigenvalue $1$ is
not isolated, even in the case where
the $p_n$'s are exponentially decreasing.
\noindent
We study the structure of the
spectrum of $P$ using the method of generating functions
(see, e.g., [VJ]).
The vector equation $y = Px$,
with $y=(y_1,y_2,\dots)$ and $x=(x_1,x_2,\dots)$,
takes the form
$$
Y(w)={X(w) - x_1w(1-w)D(w) \over w}
$$
where
$Y(w)=\sum_{n=1}^{\infty}y_nw^n$ and
$X(w)=\sum_{n=1}^{\infty}x_nw^n$.
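\noindent
To see where this expression comes from (a brief check, with the
convention of footnote 1 that the matrix acts from the right), note
that $y_j=\sum_i x_i\,p_{ij}=x_1\,p_j+x_{j+1}$, so that
$$
Y(w)=x_1\sum_{j=1}^{\infty}p_jw^j+{X(w)-x_1w\over w}
={X(w)-x_1w\,\bigl(1-\sum_{j\geq 1}p_jw^j\bigr)\over w},
$$
and $1-\sum_{j\geq 1}p_jw^j=(1-w)D(w)$ by (3.4).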
In a similar fashion, the formal
solutions of the equations
$$
(\lambda I - P)x=0, \quad (\lambda I - P^*){\bf x}=0
\quad\hbox{and}\quad (\lambda I - P)x=y
$$
can be written as
$$
X(w)={x_1w(1-w)D(w)\over 1-\lambda w},\eqno(3.7)
$$
$$
X(w)=p\cdot {\bf x}\left({w\over \lambda -w}\right)\eqno(3.8)
$$
and
$$
X(w)={x_1w(1-w)D(w) -wY(w)\over 1-\lambda w},\eqno(3.9)
$$
respectively, where $p\cdot {\bf x} = \sum_{n\geq 1}x_np_n$.
The equation $1-\lambda w=0$ (and its reciprocal $\lambda -w=0$)
entails that the boundary of $\sigma(P)$ (and of $\sigma(P^*)$)
is the unit circle.
\noindent
Let us first consider the point $\lambda =1$. The formal
expressions in (3.7) and (3.8) become
$$
X(w) = x_1 w D(w) \quad\hbox{and}\quad
X(w)=p\cdot {\bf x}\left({w\over 1-w}\right).
$$
The latter has the solution $X(w)=w/(1-w)$ which is the generating
function of the unit vector in $\ell_{\infty}$. On the other hand,
the former is the generating function of an $\ell_1$-vector
if and only if $D(1)<\infty$.
Hence, we have that in the positive-recurrent
case $1$ lies in $\sigma_p(P)$ (for the null-recurrent chain
it lies in $\sigma_r(P)$).
\noindent
More generally, one sees that
the open unit disc $\{\lambda : |\lambda |<1\}$
is always in the point spectrum, whereas any $\lambda$ s.t.
$|\lambda|=1$, $\lambda \neq 1$ lies in $\sigma_c(P)$.
Indeed, take $\lambda = e^{i\theta}$ with $0 < \theta < 2\pi$.
Then the equation in (3.8) gives for the coefficients $x_n$
the relation $x_n = e^{-in\theta}\, p\cdot {\bf x}$.
Multiplying by $p_n$ and summing over $n$ we then get
$1=\sum_n p_n e^{-in\theta}$, which is impossible because
${\rm g.c.d.}\{n: p_n>0\}=1$.
Finally, one can easily check using (3.9) that
any point $\lambda$ with $|\lambda|>1$ is in the resolvent set.
Indeed, one can always produce a choice of $x_1$ in such a way
that $x_1w_0(1-w_0)D(w_0) -w_0Y(w_0)=0$ where $w_0$ satisfies
$1-\lambda w_0=0$, thus letting (3.9) be the generating function
of an $\ell_1$-vector.
\vsni
\noindent
{\sl III. Convergence properties.}
\noindent
Next, we discuss the convergence properties of this
chain under the hypothesis that it is positive-recurrent.
In particular we shall prove statement III of Proposition 1
(the first part of which is a version of Theorem 1),
using a different, more direct, method, which appears to be
of independent interest, for it may be extended to some
more general (i.e. non-markovian) mixing Gibbs
random fields [I]. In addition, this method allows one
to obtain optimal bounds for the rate of convergence.
\noindent
For $z\in \C$, consider the matrix $L_z$ given by
$$
L_z=\pmatrix{p_1z&p_2z&p_3z&\ldots \cr
p_1z^2 &p_2z^2 &p_3z^2 &\ldots \cr
p_1z^3 &p_2z^3 &p_3z^3 &\ldots \cr
\vdots &\vdots &\vdots \cr}
$$
For $z=1$ the matrix $L_z$ can be viewed as the transition
matrix of the process $r_0^{(1)},r_1^{(1)},\dots$
given by the sequence of times between returns to the
state $1$ (see (1.4)).
The vector equation $y = L_z x$, takes the
generating function form
$Y(w)=p\cdot {\bf 1}_w\, X(z)$ where
${\bf 1}_w:=(w,w^2,w^3,\dots)^t$.
We then have that, for $z$ in the open unit disk, $L_z$ defines
a bounded operator-valued analytic function acting on
$\ell_1(S)$, with bounded continuous extension to the boundary
$|z|=1$. We also have that, for each $z$ in the closed unit disk,
$\sigma_p(L_z)=\{\lambda_z\}$ and the
eigenvalue $\lambda_z$ is simple. To see this, it suffices to observe
that
the eigen-equation $(\lambda_z I -L_z)x=0$ takes
the form $p\cdot {\bf 1}_w\,X(z)=\lambda_z\, X(w)$ which has the unique
solution
$\lambda_z=p\cdot {\bf 1}_z$ and
$X(w)={\rm const.}\,p\cdot {\bf 1}_w$, that is
$x={\rm const.}\,p$.
The dual equation $(\lambda_z I -L_z^*){\bf x}=0$ gives
the eigenvector $X(w)=zw/(1-zw)$, that is
${\bf x}={\bf 1}_z$.
\noindent
There is a simple algebraic relation between the matrices $L_z$
and $P$: let $Q$ be the transient chain given by the matrix
$$
Q=\pmatrix{0 &0 &0 &\ldots \cr
1 &0 &0 &\ldots \cr
0 &1 &0 &\ldots \cr
0 &0 &1 &\ldots \cr
\vdots &\vdots &\vdots&\ddots \cr}
$$
An easy calculation shows that
$$
(I-zQ)(I-L_z)=(I-zP).\eqno(3.10)
$$
This relation entails that if $u$ is an eigenvector of
$P$ with eigenvalue $1/z$, then
$v=u(I-zQ)$ is an eigenvector
of $L_z$ with eigenvalue $1$. On the other hand we already know that
$P$, when acting on $\ell_1$, has spectral radius equal to $1$ and
no eigenvalues on the unit circle other than possibly $1$.
The choice $z=1$ gives $u=\pi$ and
$v={\bf \pi}(I-Q)=\pi_1\,p$, as expected.
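\noindent
Both (3.10) and the inversion of $I-L_z$ used below can be checked by
hand. The following sketch assumes, consistently with the form of the
map (A.1), that the transition matrix of the Example is $P(1,j)=p_j$
and $P(i,i-1)=1$ for $i\geq 2$, that is $P=Q+{\bf e}_1\, p$ with
${\bf e}_1:=(1,0,0,\dots)^t$ (a notation introduced only for this check).
Since $L_z={\bf 1}_z\, p$ has rank one and
$zQ\,{\bf 1}_z={\bf 1}_z-z\,{\bf e}_1$, we get
$$\eqalign{
(I-zQ)(I-L_z)&=I-zQ-L_z+zQ\,{\bf 1}_z\, p\cr
&=I-zQ-z\,{\bf e}_1\, p=I-zP.\cr }
$$
Moreover $L_z^2=(p\cdot {\bf 1}_z)\, L_z=\lambda_z\, L_z$, so that,
whenever $\lambda_z\neq 1$,
$$
(I-L_z)^{-1}=I+{L_z\over 1-\lambda_z},\qquad
{\bf \nu}\,(I-L_z)^{-1}={\bf \nu}+
{({\bf \nu}\cdot {\bf 1}_z)\, p\over 1-\lambda_z}.
$$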
\noindent
Let now ${\bf u}:S\to \R$ be a bounded vector and ${\bf \nu}$
an initial distribution on $S$,
which will be assumed to decay not slower than $\pi$
at infinity. The latter condition is equivalent to the assumption
made in Theorem 1: if $P$ has ergodic degree
$\ell \geq 1$ then $\pi$ ($\nu$) has $P$-order (at least) $\ell -1$.
In fact, $\nu$ can be viewed as the distribution of
the random variable $1+\min\{n: n\geq 0,\, x_n=1\}$.
\noindent
We then consider the following generating function,
$$
S(z)=\sum_{n=0}^{\infty}z^n \,
({\bf \nu} P^n-{\bf \pi})\cdot {\bf u}.
$$
Using (3.10) and ${\bf \nu} \, L_z = ({\bf \nu}\cdot {\bf 1}_z) \, p$
we have, for $|z|<1$,
$$
\eqalign{
\sum_{n=0}^{\infty}z^n \, {\bf \nu} P^n \cdot {\bf u}&=
{\bf \nu} (I-zP)^{-1}\cdot {\bf u}\cr
&={\bf \nu} (I-L_z)^{-1}(I-zQ)^{-1}\cdot {\bf u}\cr
&={({\bf \nu}\cdot {\bf 1}_z) \, p\,
(I-zQ)^{-1}\cdot {\bf u}\over 1-\lambda_z}
+{\bf \nu}\,(I-zQ)^{-1}\cdot {\bf u}
\cr
&=
{({\bf \nu}\cdot {\bf 1}_z) \,
( m_1\,{\bf \pi}_z\cdot {\bf u}) \over 1-\lambda_z}
+{\bf \nu}_z\cdot {\bf u} \cr }
$$
where $m_1=\pi_1^{-1}=D(1)$, ${\bf \nu}_z={\bf \nu}\,(I-zQ)^{-1}$ and
${\bf \pi}_z=\pi_1 p\,(I-zQ)^{-1}$
(in particular ${\bf \pi}_z|_{z=1} \equiv \pi$). Therefore
$$\eqalign{
S(z)&=
{ (m_1\,{\bf \pi}_z\cdot {\bf u})\, ({\bf \nu}\cdot {\bf 1}_z)
\over 1-\lambda_z}-
{ ({\bf \pi}\cdot {\bf u})\, ({\bf \nu}\cdot {\bf 1})\over 1-z}
+{\bf \nu}_z\cdot {\bf u}\cr
&={(m_1\, {\bf \pi}\cdot {\bf u})\, ({\bf \nu}\cdot {\bf 1})\over
1-\lambda_z}-
{({\bf \pi}\cdot {\bf u})\,({\bf \nu}\cdot {\bf 1})\over 1-z}+R(z)\cr
&=({\bf \pi}\cdot {\bf u})\,({\bf \nu}\cdot {\bf 1})\, H(z) + R(z) \cr }
$$
where we have used the notation
$$
H(z)= {m_1 \over 1-\lambda_z}-{1\over 1-z}=
{\sum_{n=0}^{\infty}e_nz^n\over \sum_{n=0}^{\infty}d_nz^n},
\quad\hbox{with}\quad e_n = \sum_{k>n}d_k,
$$
and
$$
R(z) = {(m_1\,{\bf \pi}_z\cdot {\bf u})\, ({\bf \nu}\cdot {\bf 1}_z)
-(m_1\, {\bf \pi}\cdot {\bf u})\, ({\bf \nu}\cdot {\bf 1}) \over
1-\lambda_z}
+{\bf \nu}_z\cdot {\bf u}.
$$
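\noindent
The expression given for $H(z)$ can be verified directly. Writing
$D(z)=\sum_{n=0}^{\infty}d_nz^n$ with $d_n=\sum_{k>n}p_k$ (so that
$d_0=1$ and $d_{n-1}-d_n=p_n$), one finds
$$
(1-z)\,D(z)=d_0-\sum_{n=1}^{\infty}(d_{n-1}-d_n)\,z^n=1-\lambda_z ,
$$
and, since $m_1=D(1)$,
$$
{m_1-D(z)\over 1-z}=\sum_{k=1}^{\infty}d_k\,{1-z^k\over 1-z}
=\sum_{n=0}^{\infty}z^n\sum_{k>n}d_k=\sum_{n=0}^{\infty}e_nz^n .
$$
Dividing the second identity by $D(z)$ and using the first one
gives the stated ratio of power series.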
Reasoning as in the proof of Lemma 3, one then shows that
if $M_{11}^{(\ell)}<\infty$ then the coefficients of $H(z)$
decay as $o(n^{-(\ell-1)})$. It remains to examine the
behaviour of $R(z)$. We first rewrite it
as
$$
R(z)={(m_1{\bf \pi}\cdot {\bf u})\,
({\bf \nu}\cdot {\bf 1}_z -{\bf \nu}\cdot {\bf 1})
\over 1-\lambda_z}+{({\bf \nu}\cdot {\bf 1}_z) \, (m_1{\bf \pi}_z\cdot
{\bf u}-
m_1{\bf \pi}\cdot {\bf u})\over 1-\lambda_z}
+{\bf \nu}_z\cdot {\bf u}.
$$
We have
$$
{{\bf \nu}\cdot {\bf 1} -{\bf \nu}\cdot {\bf 1}_z
\over 1-\lambda_z}={\sum_{n=0}^{\infty}\eta_nz^n\over
\sum_{n=0}^{\infty}d_nz^n},
\quad\hbox{with}\quad\eta_n = \sum_{k>n}\nu_k.
$$
Moreover, a straightforward calculation yields
$$
m_1{\bf \pi}_z\cdot {\bf u}=
\sum_{n=0}^{\infty}z^n\left(\sum_{k=1}^{\infty}u_k\,p_{n+k}\right)
$$
and therefore
$$
{ m_1{\bf \pi}\cdot {\bf u}-
m_1{\bf \pi}_z\cdot {\bf u}\over 1-\lambda_z}=
{\sum_{n=0}^{\infty}\xi_nz^n\over \sum_{n=0}^{\infty}d_nz^n},
\quad\hbox{with}\quad\xi_n = \sum_{k=1}^{\infty}u_k\, d_{k+n}.
$$
In a similar way, we get
$$
{\bf \nu}_z\cdot {\bf u}=
\sum_{n=0}^{\infty}\gamma_n\, z^n=
{\sum_{n=0}^{\infty}\left(\sum_{l=0}^n\gamma_l\,
d_{n-l}\right)z^n\over \sum_{n=0}^{\infty}d_nz^n},
\quad\hbox{with}\quad\gamma_n =\sum_{k=1}^{\infty}u_k\,\nu_{n+k}.
$$
Written in this form, the term ${\bf \nu}_z\cdot {\bf u}$ can be
compared with $H(z)$, which has the same denominator.
Lemma 2 then entails that the coefficients
of the power series in the numerator behave asymptotically as
$m_1\, \gamma_n$.
In the same way, the coefficients
of the product $\sum_{n=0}^{\infty}\xi_nz^n \cdot
\sum_{n=0}^{\infty}\nu_nz^n$ have the same behaviour as
$({\bf \nu} \cdot {\bf 1})\, \xi_n$.
Moreover we have
$$
|\xi_n| \leq \Vert u\Vert_{\infty}\sum_{k>n}d_k=
\Vert u\Vert_{\infty}\,e_n,\qquad
|\gamma_n| \leq \Vert u\Vert_{\infty}\sum_{k>n}\nu_k=
\Vert u\Vert_{\infty}\,\eta_n.
$$
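\noindent
It may help to collect the pieces obtained so far in a single identity.
With the ad hoc notation (used only in this remark)
$D(z)=\sum d_nz^n$, $E(z)=\sum e_nz^n$, $T(z)=\sum \eta_nz^n$,
$\Xi(z)=\sum \xi_nz^n$, $\Gamma(z)=\sum \gamma_nz^n$,
and recalling that ${\bf \nu}\cdot {\bf 1}=1$, the formulas above give
$$
D(z)\,S(z)=({\bf \pi}\cdot {\bf u})\,\bigl[E(z)-m_1\,T(z)\bigr]
-({\bf \nu}\cdot {\bf 1}_z)\,\Xi(z)+D(z)\,\Gamma(z).
$$
As a consistency check, take ${\bf \nu}={\bf \pi}$ and
${\bf u}={\bf 1}$. Then $\xi_n=e_n$,
$\eta_n=\gamma_n=\pi_1(d_n+e_n)$ and
${\bf \pi}\cdot {\bf 1}_z=\pi_1\, z\, D(z)$, so that the right hand
side equals $-D(z)+\pi_1\,D(z)\,[(1-z)E(z)+D(z)]$. This vanishes
because $(1-z)E(z)=m_1-D(z)$ (a consequence of $e_{n-1}-e_n=d_n$),
in agreement with ${\bf \pi}P^n={\bf \pi}$.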
Finally, comparing all the above terms,
we have found that under our assumptions
on the distribution ${\bf \nu}$ and the vector ${\bf u}$,
and under the hypothesis that $M_{11}^{(\ell)}<\infty$,
the quantity $({\bf \nu} P^n-{\bf \pi})\cdot {\bf u}$
decays as $o(n^{-(\ell-1)})$.
\noindent
We conclude by showing how
this bound can be sharpened under the hypothesis that
$p_n\sim n^{-(\ell +2)}\, L(n)$ with $L(n)$
slowly varying at infinity. We start by showing
that the coefficients of the power series of
$H(z)$ are asymptotically equal to
$\ell^{-1}(\ell +1)^{-1}D(1)^{-1}n^{-\ell}\, L(n)$. To see this,
suppose first that
$\ell =1$. Then $D(1)<\infty$ and we have
$d_n \sim (\ell +1)^{-1}n^{-(\ell +1)}\, L(n)$,
$e_n \sim \ell^{-1}(\ell +1)^{-1}n^{-\ell}\, L(n)$.
Notice that the power series $\sum_{n=0}^{\infty}e_nz^n$
is divergent at $z=1$.
Lemma 2, or, alternatively, a repeated application of a Tauberian
theorem for
power series (see, e.g., [F2], Chap. XIII.5, Thm. 5) then
yields the claim. More generally, if
$\ell > 1$ we have (with the notation of I) that the numbers
$\{d^{(m)}_n\}$ are defined for $m\leq \ell +1$,
$D^{(\ell)}(1)=d^{(\ell +1)}_0$, but $D^{(\ell+1)}(z)$
diverges at $z=1$. We can then write
$$\eqalign{
H(z)\cdot \sum_{n=0}^{\infty} d_nz^n =
(z&-1)^{\ell-1} \, D^{(\ell+1)}(z)\, + \cr
&+ \, \sum_{m=3}^{\ell +1}(z-1)^{m-3}
\left(d^{(m-1)}_0+d^{(m)}_0\right) \cr }
$$
and the claim follows as above.
Finally, under the additional hypotheses that
$u_i = o (1)$ and $\nu_i=o (\pi_i)$ we have
$\xi_n = o (e_n)$ and $\eta_n = o (e_n)$.
This rules out possible cancellations among
the various coefficients introduced above and yields
the asymptotic behaviour claimed
in Proposition 1.
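\noindent
The asymptotics of $d_n$ and $e_n$ used above are instances of a
standard fact on tails of regularly varying sequences, recalled here
without proof: if $a>1$ and $L$ is slowly varying at infinity, then
$$
\sum_{k>n}k^{-a}\,L(k)\sim {n^{-(a-1)}\over a-1}\,L(n)
\quad\hbox{as}\quad n\to \infty .
$$
Taking $a=\ell +2$ gives the behaviour of $d_n=\sum_{k>n}p_k$, and
applying the same statement with $a=\ell +1$ to $e_n=\sum_{k>n}d_k$
gives that of $e_n$. For instance, if $\ell =1$ and $L\equiv 1$ one
gets $d_n\sim n^{-2}/2$ and $e_n\sim n^{-1}/2$.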
\vfill \eject
\noindent
{\bf Appendix.} $\;$ {\sl Denumerable Markov chains and
intermittent interval maps.}
\vsni
\noindent
The example discussed in Section 3 is isomorphic (mod $0$)
to the iteration process of the
piecewise affine `intermittent' map
$f:[0,1]\to [0,1]$ given by
$$
f (x) =\cases{ (x-d_1)/ \alpha_1, &if $d_1\leq x\leq d_0$ \cr
d_{i-1} + (x-d_i)/ \alpha_i, &if $d_i\leq x< d_{i-1},
\,\, i\geq 2$ \cr
}\eqno(A.1)
$$
\noindent
Here the numbers $d_i=\sum_{l>i} p_l$
are supposed to be all distinct,
and $\alpha_i=p_i/p_{i-1}$, $i\geq 1$ (with $p_0=1$).
In what follows we shall always assume that $\sum d_i < \infty$.
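\noindent
As a quick check of (A.1), recall that $d_0=\sum_{l\geq 1}p_l=1$,
that the interval $[d_i,d_{i-1}]$ has length $p_i$, and that the slope
of $f$ on it is $1/\alpha_i=p_{i-1}/p_i$. Hence
$$\eqalign{
f([d_1,d_0])&=[0,\,p_1/\alpha_1]=[0,1],\cr
f([d_i,d_{i-1}])&=[d_{i-1},\,d_{i-1}+p_{i-1}]=[d_{i-1},d_{i-2}],
\qquad i\geq 2.\cr }
$$
In other words each interval with $i\geq 2$ is mapped affinely onto
the one with index $i-1$, whereas the interval with index $1$ is
mapped onto the whole of $[0,1]$ and lands in the interval with
index $j$ with Lebesgue probability $p_j$.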
\noindent
This map is called `intermittent' because,
if $\limsup \alpha_i =1$, then $f$ can be viewed as a
piecewise affine approximation of a smooth transformation
of $[0,1]$ which is expanding everywhere but at the
fixed point in the origin,
where the derivative is equal to one (see [W], [LSV], [M]).
\noindent
The partition ${\cal I}$ of $[0,1]$
into the intervals $I_i=[d_{i},d_{i-1}]$, $i\geq 1$ is a Markov
partition for $f$. Let $\O$, $\pi$ and $P$ be as in the Example
of Section 3.
One then sees that the map $\phi : \O \to [0,1]$ defined by:
$\phi (\o)=x$ according to $f^j(x)\in I_{\o_j}$, $j\geq 0$,
is a bijection between $\O$ and the residual set of points
in $(0,1]$ which are not preimages of $1$ w.r.t. the map $f$.
Clearly $\phi$ conjugates $f$ with the shift $\tau$
on $\O$. Moreover, let $\mu$ be the $\tau$-invariant
Markov probability measure on $\O$
defined in (1.8) (with $\pi$ and $P$ as above).
Then $\rho =\mu \circ \phi^{-1}$ is $f$-invariant and
it is easy to see that
$P(i,j)=\rho\, (I_i\cap f^{-1}I_j)/\, \rho(I_i)$.
Finally, if $f$ is the
piecewise affine approximation of a smooth transformation
of $[0,1]$ having a tangency at $x=0^+$ of order $1+1/\eta$,
with $\eta > 0$, then
$p_i\sim i^{-(1+\eta)}$ and hence
$\alpha_i \sim 1 - (1+\eta)/ i$. Thus, in order to
have $\sum d_i < \infty$ it is necessary that $\eta > 1$,
and the corresponding Markov chain $P$ has
ergodic degree $\ell =\eta - 1$.
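\noindent
These exponents can be recovered by a heuristic computation, sketched
here under the assumption (not made explicit above) that the smooth
map behaves like $x+c\,x^{1+1/\eta}$ near the origin, for some
constant $c>0$, and that the $d_i$ are successive preimages
accumulating at $0$. Then $d_{i-1}-d_i\approx c\, d_i^{1+1/\eta}$ and,
treating $i$ as a continuous variable,
$$
{d\over di}\, d_i^{-1/\eta}\approx {c\over \eta},
\quad\hbox{so that}\quad
d_i\approx \Bigl({\eta\over c\, i}\Bigr)^{\eta},\qquad
p_i=d_{i-1}-d_i\approx c\,\Bigl({\eta\over c\, i}\Bigr)^{1+\eta}.
$$
Consequently $\alpha_i=p_i/p_{i-1}\approx (1-1/i)^{1+\eta}
=1-(1+\eta)/i+O(i^{-2})$, and $\sum d_i$ converges if and only if
$\eta >1$.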
\noindent
It turns out that an effective tool
for studying the statistical properties of
the iterates of a (non-invertible) transformation $f$ on $[0,1]$
is given by the Perron-Frobenius operator
$M:L^1([0,1],dx)\to L^1([0,1],dx)$, which is
defined by the validity of (see, e.g., [LM])
$$
\int_0^1\, u\circ f^n(x) \, v(x) \, dx
=\int_0^1\, u(x) \, M^nv(x) \, dx\eqno(A.2)
$$
for all $u\in L^{\infty}$ and $v\in L^1$.
For $f$ as in (A.1), $n=1$ and $d_i\leq x< d_{i-1}$,
(A.2) yields
$$
(M\, v)(x) = \alpha_{i+1}\, v\, (d_{i+1}+\alpha_{i+1}(x-d_i))+
\alpha_1\, v\, (d_1+\alpha_1x).\eqno(A.3)
$$
Let us note that the space $\ell_1(S,p)$ of
vectors $u:\N\to \R$ such that
$$
\Vert u \Vert_{1,p}:=\sum_{i\in S}|u_i|\, p_i < \infty
$$
is left invariant by the operator $M$,
which takes on the matrix representation
$$
M(i,j)={p_i\over p_j}\, P(i,j),\qquad i,j\geq 1.\eqno(A.4)
$$
The eigenequation $M\, h = h$ has a solution $h\in \ell_1(S,p)$
given by
$h_i = h_1\, p_1\, d_{i-1}/ p_{i}$.
Moreover, one easily checks that the vector $p$
satisfies $M^*p=p$. Therefore,
recalling that $\pi_i = \pi_1 \, d_{i-1}$, and
putting $h_1= \pi_1/ p_1$, we get
$\pi_i = h_i\, p_i$.
One then sees that the vector $h \in \ell_1(S,p)$ corresponds to the
(locally constant) density
of the absolutely continuous $f$-invariant probability measure
$\rho (dx) = h(x)\, dx$, with $h\in L^1([0,1],dx)$ and $h(x)\equiv h_i$
for
$d_{i}\leq x < d_{i-1}$. Observe that $\rho (I_i)=\pi_i$.
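\noindent
These statements are easily checked from (A.3). If $v\equiv v_i$ on
each $[d_i,d_{i-1})$, then both arguments of $v$ in (A.3) stay in a
single element of the partition and, since
$\alpha_{i+1}=p_{i+1}/p_i$ and $\alpha_1=p_1$,
$$
(M\, v)_i={p_{i+1}\over p_i}\, v_{i+1}+p_1\, v_1,\qquad i\geq 1.
$$
Inserting $h_i=h_1\, p_1\, d_{i-1}/p_i$, the right hand side becomes
$h_1\, p_1\,(d_i+p_i)/p_i=h_i$, because $d_i+p_i=d_{i-1}$.
Similarly
$\sum_i p_i\,(M\, v)_i=\sum_i p_{i+1}\, v_{i+1}+p_1\, v_1
=\sum_i p_i\, v_i$, which is the relation $M^*p=p$, and the choice
$h_1=\pi_1/p_1$ gives
$\sum_i h_i\, p_i=\pi_1\sum_i d_{i-1}=\pi_1\, D(1)=1$.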
Now, using (A.2) we find
$$
\rho (u\circ f^n \, v ) - \rho (u)\, \rho(v)=
\int_0^1\, u (x)\,
[\, (M^n\, v h)(x) - \rho (v)\, h(x)] \, dx\eqno(A.5)
$$
Suppose that $u$ and $v$ are bounded $L^1$-functions
taking constant values $u_i$ and $v_i$ on the elements $I_i$ of
the Markov partition ${\cal I}$. We shall denote
by ${\bf u}=(u_i)_{i=1}^{\infty}$ and
${\bf v}=(v_i)_{i=1}^{\infty}$ the
corresponding vectors in $\ell_{\infty}(\N)$.
Using (A.4), (A.5) and the above observations
we get (the notation is as in the proof of
Corollary 1),
$$
\rho (u\circ f^n \, v ) - \rho (u)\, \rho (v)
=({\overline {\pi {\bf v}} } \,
P^n-({\pi}\cdot {\bf v})\, \pi) \cdot {\bf u}.\eqno(A.6)
$$
Now set $u_{\infty}=\limsup u_i$, $v_{\infty}=\limsup v_i$,
and suppose that $u_{\infty}\neq 0$ or $v_{\infty}\neq 0$.
Then, setting
${\hat {\bf u}}={\bf u}-u_{\infty}{\bf 1}$ and
${\hat {\bf v}}={\bf v}-v_{\infty}{\bf 1}$
we have that
$(\pi \cdot {\hat {\bf u}}) (\pi \cdot {\hat {\bf v}})\neq 0$
provided $\pi \cdot {\bf u}\neq u_{\infty}$ and
$\pi \cdot {\bf v}\neq v_{\infty}$. Moreover
$\limsup {\hat u}_i=0$ and $\limsup {\hat v}_i=0$.
On the other hand we plainly have
$({\overline {\pi {\hat {\bf v}}} } \,
P^n-({\pi}\cdot {\hat {\bf v}})\, \pi) \cdot {\hat {\bf u}}=
({\overline {\pi {\bf v}} } \,
P^n-({\pi}\cdot {\bf v})\, \pi) \cdot {\bf u}$.
We then see that the conditions
$\rho (u)\equiv \pi \cdot {\bf u}\neq u_{\infty}$ and
$\rho (v)\equiv \pi \cdot {\bf v}\neq v_{\infty}$
are equivalent to the conditions
$u_i=o (1)$ and $\nu_i=o (\pi_i)$
(along with $(\pi \cdot {\bf u}) (\nu \cdot {\bf 1})\neq 0$)
assumed in the last statement of Proposition 1,
with
the identification
$\nu = {\overline {{\pi} {\bf v}} } /{\pi}\cdot {\bf v}$.
\noindent
The following result is now a direct
consequence of Proposition 1
(see [LSV], [M] for related results):
\vsni
\noindent
{\bf Corollary 2.} {\sl Let $f:[0,1]\to [0,1]$ be as in
(A.1) and assume that $\alpha_i \sim 1 - {(1+\eta)/ i}$
for some $\eta >1$.
Then, for any pair of bounded $L^1$-functions $u,v:[0,1]\to \R$,
locally constant on the Markov partition ${\cal I}$, there is a
positive constant $C=C(u,v)$ such that, for $n$ large enough,
$$
|\,\rho (u\circ f^n \, v ) - \rho (u)\, \rho (v)\, | \leq
C\, n^{-(\eta-1)}.
$$
Assume furthermore that $\rho (u)\neq u_{\infty}$ and
$\rho (v)\neq v_{\infty}$. Then we have}
$$
\rho (u\circ f^n \, v ) - \rho (u)\, \rho (v) \sim
C\, n^{-(\eta-1)}\quad\hbox{as}\quad n\to \infty.
$$
We conclude with a final remark.
From the proof of Proposition 1 it follows that
if the conditions $\rho (u)\neq u_{\infty}$ and
$\rho (v)\neq v_{\infty}$
are violated, then cancellations may take place
to accelerate the convergence rate.
A trivial example is obtained by taking $u,v$ constant on $[0,1]$.
Conversely, one may argue as follows:
take $a>0$ and let $t_a(x)$ be the first entrance time into
the set $[a,1]$.
When an orbit falls in a small (compared to $a$) neighbourhood
of $0$ it stays there for a time which can be
arbitrarily large before reaching again $[a,1]$.
More precisely, from the above discussion one readily finds that,
under the assumptions of Corollary 2,
$$
\rho \{x\in [0,1] \, :\, t_a(x)>n\}\sim C(a) \,
n^{-(\eta-1)}\quad\hbox{as}\quad
n\to \infty.
$$
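\noindent
For instance, when $a=d_k$ for some $k\geq 1$ the law of $t_a$ is
explicit. Assume the convention $t_a(x)=0$ for $x\geq a$ (the text
does not specify it). Each point of $I_i$ with $i>k$ reaches
$I_k\subset [a,1]$ after exactly $i-k$ iterations, so that, up to
endpoints, $t_a=i-k$ on $I_i$ and
$$
\rho \{x\in [0,1] \, :\, t_a(x)>n\}=\sum_{i>n+k}\pi_i
=\pi_1\sum_{j\geq n+k}d_j=\pi_1\,(d_{n+k}+e_{n+k}),
$$
with $e_n=\sum_{k>n}d_k$ as in Section 3. Under the assumptions of
Corollary 2 the term $e_{n+k}$ dominates and decays like
$n^{-(\eta -1)}$, up to the constant (or slowly varying factor)
implicit in $p_i\sim i^{-(1+\eta)}$.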
Thus, if the conditions $\rho (u)\neq u_{\infty}$ and
$\rho (v)\neq v_{\infty}$ are satisfied, namely if
the averages of the test functions differ from their limiting
values at the origin, then
the term $\rho (u\circ f^n \, v )$ cannot approach its asymptotic
value $\rho (u)\, \rho (v)$ at a rate faster
than that given by the statistics
of first entrance times given above.
\vfill \eject
\noindent
{\bf References.}
\vsni
\noindent
\item{[C]} K L Chung: {\it Markov chains with stationary
transition probabilities},
Springer-Verlag Berlin Heidelberg New York 1967.
\item{[F1]} W Feller:
Fluctuation theory of recurrent events,
{\sl TAMS} {\bf 67} (1949), 99-119.
\item{[F2]} W Feller:
{\it An Introduction to Probability Theory and Its
Applications}, Volume 2,
J.Wiley and Sons, New York 1970.
\item{[FMM]} G Fayolle, V A Malyshev and M V Menshikov:
{\it Topics in the constructive theory of countable Markov
chains},
Cambridge University Press, Cambridge 1992.
\item{[H]} T E Harris:
First passage and recurrence distribution,
{\sl TAMS} {\bf 73} (1952), 471-486.
\item{[I]} S Isola: Dynamical zeta functions and correlation
functions for intermittent interval maps, Preprint 1996.
\item{[KSK]} J G Kemeny, J L Snell and A W Knapp:
{\it Denumerable Markov Chains},
Springer-Verlag, 1976.
\item{[L]} T Lindvall: {\it Lectures on the Coupling Method},
John Wiley and Sons, Inc. 1992.
\item{[LM]} A Lasota and M C Mackey:
{\it Probabilistic properties of deterministic systems},
Cambridge University Press, 1985.
\item{[LSV]} A Lambert, S Siboni and S Vaienti: {Statistical
properties of a non-uniformly hyperbolic map of the interval},
{\sl J. Stat. Phys.} {\bf 72} (1993), 1305-1330.
\item{[M]} M Mori: {On the intermittency of a piecewise
linear map}, {\sl Tokyo J. Math.} {\bf 16} (1993), 411-428.
\item{[T]} E C Titchmarsh: {\it The Theory of Functions},
2nd ed., Oxford University Press, 1939.
\item{[VJ]} D Vere-Jones: On the spectra of some linear operators
associated with queueing systems, {\sl Z. Wahrsch.} {\bf 2} (1963),
12-21.
\item{[W]} X J Wang:
{Statistical physics of temporal intermittency},
{\sl Phys. Rev.} {\bf A40} (1989), 6647.
\item{[Z]} A Zygmund: {\it Trigonometrical series},
Cambridge at the University Press, 1959.
\end